\documentclass[12pt,a4paper,english]{article}
\usepackage[T1]{fontenc}
\usepackage{isolatin1}
\usepackage{anysize}
\usepackage{times}
\usepackage[english, german]{babel}
%\usepackage{tipa}
\usepackage{latexsym}
\usepackage{marvosym}
\usepackage{graphics}
\usepackage{epsfig}
\usepackage{pifont}
\usepackage{wasysym}
\usepackage{color}
%\usepackage{geometry}                               
%\geometry{left=3.3cm, right=3cm}

\pagestyle{plain}

\begin{document}
\begin{titlepage}
\includegraphics{sv4264264.eps}\hfill
\includegraphics{sv4264265.eps}
\begin{center}
\vspace*{5cm}

\Huge{\bf SmartKom}
\\[1.5ex]
\LARGE{Transliteration Manual}

\vspace*{5cm}

\normalsize{\bf Written by:}
\\[1.5ex]
\normalsize{Susen Rabold}
\\[1ex]
\normalsize{Sonja Biersack}
\\[2.5ex]
\small{Institut für Phonetik und sprachliche Kommunikation}
\\[0.5ex]
\small{Ludwig-Maximilians-Universität München}
\\[1.0ex]
\small{August 2002}
\end{center}

\end{titlepage}


%\pagestyle{headings}
%\originalTeX
\selectlanguage{english}
\tableofcontents

\newpage


\section{Preamble}

These transliteration conventions are derived from the earlier
Verbmobil transliteration conventions. They were revised and extended
for the SmartKom project during the data workshop held from 11 to
13 October 1999 at the Ludwig-Maximilians-Universität München.
\\[1.5ex]
The following sections explain how a transcript is built up and how
its labels are to be read without losing information.

\newpage

\section{General Information}



\subsection{Spontaneous Speech in SmartKom}

As part of the SmartKom project, subjects are recorded while using a
self-explanatory, user-adaptive man-machine interface. This interface
is designed to interpret voice and gesture input as well as the
user's facial expressions. Within the man-machine dialogues the
subjects handle various tasks in various scenarios, such as asking for
directions or for today's cinema programme. The recordings yield video
data on the one hand and multi-channel recordings of spontaneous
speech on the other, which serve as a basis for research and
development in the fields of speech recognition, speech synthesis and
user-interface models.



\subsection{Transliteration of Spontaneous Speech}

The spontaneous speech data is to be made available for all SmartKom
partners by means of an orthographic transliteration. Thus, project
assistants listen to the recordings of the dialogues and transcribe
them on the word level. The transliteration conventions have been
developed in order to label phenomena of spontaneous speech such as
false starts, corrections, repetitions, reductions and filled pauses.
They also indicate technical artefacts like recording interruptions or
microphone noises, as well as speaker interference.


\subsection{Transliteration Categories}


The following simplified categories have to be taken into
consideration when transcribing the turns of a dialogue:

\begin{enumerate}
\item lexical units

\item syntactic-semantic formation

\item nonverbal articulatory production

\item noises

\item pauses

\item acoustic interference

\item comments

\item special comments

\item prosody
\end{enumerate}


\subsection{Basic Requirements}


The basic requirements are the same as those of Verbmobil:

\begin{enumerate}

\item[a)] automatic processing

\begin{itemize}

\item consistent file structure

\item informative and consistent turn names

\item consistent transliterations

\item explicit symbolisation

\item ASCII symbols

\item parsable transliteration conventions
\end{itemize}

\item[b)] textual requirements

\begin{itemize}
\item written record of all audible elements of the dialogue

\item syntactic-semantic labels

\item labelling of speaker and noise interference
  
\item labelling of certain word categories (proper names, numbers) and
  failures

\item maintenance of readability
\end{itemize}



\item[c)] the process of transliteration

\begin{itemize}
\item straightforward usability of all conventions
  
\item simple and easy-to-understand conventions, even for non-experts
\end{itemize}
\end{enumerate}


\subsection{Limits of the Transliteration Process}

An annotation on the level of a broad transcription does not provide:
%
\begin{itemize}
\item a phonological transliteration

\item a phonetic transcription
 
\item a time correlation of the speech signal
\end{itemize}
%
For labelling nonverbal productions, transcribers can only choose from
a fixed set of categories. A pronunciation or event that is
particularly noticeable can be referred to by a so-called
pronunciation comment or local comment. Pronunciation comments are an
attempt to mirror the phonetic characteristics of pronunciation
variation and disfluencies by means of orthography.



\subsection{Use of Conventions}

The transliteration lexicon is the key to understanding and
processing completed transliterations. Even more importantly, it
serves as a reference for the staff transcribing the dialogues.
Although the target language is German, the lexicon can also be used
as a reference for other Western languages. {\bf Dictionary
  entries that apply exclusively to German are indicated by
  **german**}.
\\[1.5ex]
The lexicon itself describes the structure of a transliteration file
and the conventions for the transliteration of the individual turn
elements, and it provides tables of the respective symbols.
\\[1.5ex]
Dictionary entries for the turn elements include:
\begin{itemize}
\item the name of the marker, symbol or event
\item the symbol
\item some examples
\item a definition
\item details of usage
\end{itemize}
The examples are taken either directly from SmartKom transliterations
or alternatively from Verbmobil transliterations. In a few cases the
examples have been constructed.


\section{Transliteration Manual}


\subsection{The Structure of a Transliteration File}

\subsubsection{How to build up a Transcription File} 

A transcription file consists of a {\bf header} and the {\bf
  transcribed dialogue}, which in turn is split into several turns. The
turns show whether the subject or the machine is speaking. Header
and transcript are separated by a {\bf line that contains only a
  semicolon}. A blank line is also inserted between consecutive
dialogue turns. The transcript ends with the tag {\bf EOF;}, which is
again preceded by a blank line.

\subsubsection{Header}

Example:
\\[1.0ex]
; DVD: 1.0 \\
; Version: 1.0 \\
; Dialog: \\
; ATMO: Demovideo \\
; ENC: Tex \\
; zuletzt bearbeitet am: \\
; VPK: AAA \\
; Offtalk: keiner|wenig|viel \\
; bearbeitet von: \\
; Tonqualität:
\\[1.5ex]
Each line of the header starts with a {\bf semicolon} followed by {\bf
  one blank space}. The categories and the corresponding entries
describe both the technical and the organisational specifications
of the dialogue. Thus, {\bf DVD} refers to the version number of
the relevant DVD and {\bf Version} to the version number of the latest
transliteration. {\bf Dialog} indicates the name of the dialogue and
at the same time the name of the directory in which the corresponding
signal file can be found on the DVD. {\bf ENC} refers to the encoding
used (`TEX' {\it or} `ISO'), {\bf VPK} to the subject's identification
code and {\bf ATMO} to the type of background sound used. The entry in
{\bf Offtalk} states whether the dialogue contains Off-Talk and, if
so, how much (`keiner' {\it or} `wenig' {\it or} `viel', i.e.\ none,
little or much).
The remaining lines consist of comments that relate to the dialogue as
a whole: when the transliteration was last edited (`zuletzt bearbeitet
am'), who worked on it (`bearbeitet von') and what the sound quality
was like (`Tonqualität').
Comments on the subject are also possible here (e.g.\ concerning his
or her dialect, possible speech defects, etc.). Ratings of the sound
quality are naturally subjective; nonetheless, the
transcribers are instructed to choose from only three categories
(loud, normal, quiet). If there is no specification, the default value
is `normal'.
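\\[1.5ex]
The header layout described above lends itself to simple line-based
processing. The following Python sketch is an illustration only and
not part of the SmartKom tools: the function name
\texttt{parse\_header} is hypothetical, and the sketch assumes that
every header line has the form `; key: value' and that the header ends
at the line containing only a semicolon. An empty sound quality entry
is replaced by the default value `normal'.
\begin{verbatim}
```python
# "Tonqualitaet" with a-umlaut, written as an escape to stay ASCII-only.
SOUND_QUALITY_KEY = "Tonqualit\u00e4t"


def parse_header(lines):
    """Collect '; key: value' header lines into a dict.

    Parsing stops at the separator line that contains only a semicolon.
    """
    header = {}
    for line in lines:
        if line.strip() == ";":
            break  # separator between header and transcript
        if not line.startswith("; "):
            raise ValueError(f"not a header line: {line!r}")
        key, _, value = line[2:].partition(":")
        header[key.strip()] = value.strip()
    # Assumption taken from the text: a missing rating means 'normal'.
    if not header.get(SOUND_QUALITY_KEY):
        header[SOUND_QUALITY_KEY] = "normal"
    return header
```
\end{verbatim}
Called with the lines of the example header, the function would map
\texttt{VPK} to \texttt{AAA} and leave the empty entries as empty
strings, apart from the sound quality default.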


\subsubsection{Transliteration}

The entire dialogue is put down orthographically.
%The transliteration part contains the whole dialogue
Each speech act is treated as a separate turn and each of these turns is
transliterated in accordance with the conventions. It should be noted
that the information system is always, i.e.\ in each dialogue,
labelled with the same code {\bf SMA}.

\subsubsection{Turns}

A turn starts with the turn name, followed by a blank space. After that
it is possible to specify with a {\bf marker} in which {\bf language}
the respective turn is spoken. This might become necessary if the
dialogue is multilingual. The turn as a whole can also be specified as
{\bf Off-Talk} (speech not relevant to the SmartKom system). Here
again turn name and marker are separated by a blank space.  Within the
turn body all audible events, syntactic-semantic markers and comments
are separated by blank spaces. At the end of each line within a turn a
line break is inserted, and each new line begins with a blank
space. The last turn element in a turn is followed by a line break
without any blank spaces.


\subsubsection{Turn Name}

Each turn starts with its turn name. This name works as an identifier
if a single transliterated turn has to be searched for. With the name
of the turn a connection between the signal file and the database is
made.
\\[1.5ex]
%Example:
%\\[1.5ex]
\texttt{w001\_pkd\_001\_AAA: !KEYComputer , Wetter .}
\\[1.5ex]
The name of the signal file is the same as the one on the DVD, only
without the extension. Each name is composed of four components:

\begin{enumerate}
  
\item The {\bf dialogue directory name}, which begins with a
  single-character token for the type of recording. Here we
  differentiate between biometric recordings (b) and wizard recordings
  (w). This token is followed by a three-digit session number and a
  two-character code identifying the scenario. The last two tokens
  are separated by an underscore.
  
\item The {\bf type of microphone} used, e.g.\ `d' for the microphone
  in the room where the subject is recorded and `w' for the
  microphone of the wizard.
  
\item The {\bf number of the turn}, a three-digit number that
  increases by one with each new turn. This token is again preceded
  by an underscore.
  
\item The {\bf speaker code} by which every single speaker is labelled
  and thus easily identifiable in the database. This code consists of
  three capital letters (no umlauts).

\end{enumerate}

\noindent
The turn's name is completed by a colon and a blank space (or a line
break and a blank space).

\begin{center}
\begin{tabular}{|l|l|l|l|}
\hline
dialogue name & type of microphone & turn number & speaker code\\
\hline
\hline
\texttt{w001\_pk} & \texttt{d }(= directional microphone) & \texttt{\_001} &
\texttt{\_AAA}\\
\hline
\end{tabular}
\end{center}
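\noindent
The four components can be checked mechanically. The Python sketch
below is a hypothetical helper, not an official SmartKom tool. Based on
the examples in this manual (\texttt{pk}, \texttt{pt}, \texttt{d},
\texttt{w}) it assumes that the scenario code consists of two
lower-case letters and the microphone type of one lower-case letter;
the name \texttt{parse\_turn\_name} is made up for illustration.
\begin{verbatim}
```python
import re

# recording type, session, scenario, microphone, turn number, speaker
TURN_NAME = re.compile(
    r"(?P<recording>[bw])(?P<session>\d{3})_(?P<scenario>[a-z]{2})"
    r"(?P<microphone>[a-z])_(?P<turn>\d{3})_(?P<speaker>[A-Z]{3})"
)


def parse_turn_name(name):
    """Split a turn name (without the trailing colon) into its parts."""
    match = TURN_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"malformed turn name: {name!r}")
    parts = match.groupdict()
    parts["session"] = int(parts["session"])
    parts["turn"] = int(parts["turn"])
    return parts
```
\end{verbatim}
For \texttt{w001\_pkd\_001\_AAA} this yields a wizard recording,
session 1, scenario \texttt{pk}, microphone \texttt{d}, turn 1 and
speaker \texttt{AAA}. Because the speaker code is matched with
\texttt{[A-Z]}, umlauts are rejected.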


\subsubsection{Indication of Language in a Multilingual Dialogue}

The language code follows the standard of the so-called
ISO-language-codes:\\
http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt
\\[1.5ex]
\definecolor{lightgray}{gray}{.90}

\begin{tabular}{llll}
\colorbox{lightgray}{Symbol} & \multicolumn{3}{l}{<*tXX> with XX standing for the following variables}\\
& EN & = & English\\
& DE & = & German (in non-German dialogues or multilingual dialogues)\\
& FR & = & French\\
& IT & = & Italian\\
& ES & = & Spanish\\
& JA & = & Japanese\\
& & etc.\ &
\end{tabular}

\vspace{1.5ex}
\noindent
In a multilingual dialogue a language code is placed after each turn name
according to the language used:
%
\begin{verbatim}
w001_ptd_001_AAA: <*tEN> good morning , ~John . how are you ?

w001_ptd_002_AAB: <*tDE> guten Tag , Herr ~Miller . danke , es
geht mir gut .
\end{verbatim}
%
%
\texttt{<*tXX>} is followed and preceded by a blank space. Note that
the transliteration of the dialogue follows the orthography of the
respective language used.


\subsubsection{Labelling Off-Talk}

A turn can also be entirely labelled as Off-Talk.
\\[1.5ex]
\colorbox{lightgray}{Symbol} \texttt{<*tOOT>}, \texttt{<*tROT>}
\\[1ex]
<*tROT> is chosen if the subject reads aloud from the screen,
so-called `Read Off-Talk'. <*tOOT> is chosen in all other cases of
speech that is not relevant to the SmartKom system (`Other Off-Talk'):
%
\begin{verbatim}
w425_pkd_001_AAA: <*tROT> <Ger"ausch> Kino ~Schlo"s ,
~Hauptstra"se und Kino<Z> <P> ~Hoelldobler .
\end{verbatim}
%
<*tOOT> and <*tROT> are preceded and followed by a blank space
(alternatively by a line break and a blank space).
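\\[1.5ex]
A turn can be separated into turn name, optional turn-level marker and
turn body with a single regular expression. The following Python
sketch is illustrative only; the names \texttt{split\_turn} and
\texttt{is\_off\_talk\_turn} are hypothetical. It assumes that the
marker, if present, follows the turn name on the same line and is
separated by single blank spaces; the alternative with a line break is
not handled.
\begin{verbatim}
```python
import re

# Optional turn-level marker directly after the turn name: a language
# code such as <*tEN>, or <*tOOT> / <*tROT> for Off-Talk turns.
TURN = re.compile(
    r"(?P<name>\S+): (?:<\*t(?P<marker>[A-Z]{2,3})> )?(?P<body>.*)",
    re.DOTALL,
)


def split_turn(turn):
    """Return (turn name, marker or None, turn body)."""
    match = TURN.fullmatch(turn)
    if match is None:
        raise ValueError(f"not a turn: {turn!r}")
    return match.group("name"), match.group("marker"), match.group("body")


def is_off_talk_turn(marker):
    """True if the whole turn is labelled as Off-Talk."""
    return marker in ("OOT", "ROT")
```
\end{verbatim}
Any other two-letter marker returned by \texttt{split\_turn} can be
read as the language code of the turn.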


\subsubsection{Turn Body}

The turn body itself contains all audible events such as lexical units
and noise, and in addition syntactic labels and
comments.
\\[1.5ex]
Three points have to be considered concerning the format of the turn
body:
\begin{enumerate}
\item ASCII code:\\
  The transliteration is encoded in {\bf 7-bit ASCII}. Letters like
  umlauts and `ß' are substituted by {\bf \TeX\ notations}. For the delivery
  format the \TeX\ notation is again converted into ISO.
  
\item Separation of elements:\\
  Turn elements are preceded and followed by blank spaces. The only
  exception can be found with line breaks, as these follow immediately
  after the respective turn element.
  
\item Word separation:\\
  Words are {\bf not} split across lines; a word that does not fit is
  always moved to the next line as a whole.
\end{enumerate}
%
If a turn contains a lexical unit, it must be concluded with a
punctuation mark such as a period or a question mark, or with the turn
break label <*T>t. After the last punctuation mark, only technical
noise or non-verbal articulatory production may be transliterated. The
turn break label, however, must be the final turn element.
If a turn contains no lexical unit, no punctuation marks are used.
Nevertheless, the auditory impression may justify a turn break
label.
\\[1.5ex]
A turn must contain at least \underline{one} of the following
elements:
\begin{itemize}
\item a lexical unit 

\item a hesitation

\item the symbol for incomprehensible utterances: <\%>
\end{itemize}
 
\subsubsection{Global Comment}

 \colorbox{lightgray}{Symbol} \texttt{<;...>}
\\[1.5ex]
A global comment always refers to a larger part of the turn or to the
whole turn. Thus, it is more concise to put {\bf one comment} at the
end of the turn than to label each word separately. A global comment
may concern a technical occurrence, irregularities in the subject's
speech, or anything else that is unusual compared to the
previous or following turns. The global comment is placed directly
after a turn and starts with a semicolon.

\begin{verbatim}
w001_ptd_001_AAA: hallo , Herr ~Meier , wir m"ussen einen Termin
ausmachen .
;Sprecher ist heiser.
\end{verbatim}

\subsection{The Transcription}

\subsubsection{Lexical Units}

Lexical units are
\begin{itemize}
\item words (according to the dictionary, e.g.\ the Duden for German)
\item interjections
\item reduced forms of words
\item compound words (a typical German feature)
\item special words (like numbers or names which are labelled as such)
\item words with articulatory features (like lengthening, reduction, etc.)
\end{itemize}
%
The lexical units are transcribed orthographically. German
transcriptions follow the spelling of the Duden (please note that we
still conform to the `old' German spelling from before 1995).
Capitalisation is not sensitive to punctuation marks like periods or
question marks. Thus, each sentence starts with a lower case letter
unless the first word is a proper name or a noun.
\\[1.5ex]
Dialect speech is transcribed like other speech but can, if necessary,
be labelled with a pronunciation comment and/or a global comment in
which the dialect of the turn is noted.


\subsubsection{Interjections}

Expressions of surprise, affirmation and negation as well as discourse
particles are examples of interjections.  They are transcribed without
labels as most of them are listed in the Duden.

\begin{verbatim}
w001_ptd_001_AAA: oh , das pa"st mir gar nicht . mm .
\end{verbatim}
%
Note:\\
The negation {\bf m'm} is transcribed {\bf mm} to distinguish it from
the affirmation {\bf mhm}.


\subsubsection{Reduction}

**german**
\\[1.5ex]
\colorbox{lightgray}{Symbol} \texttt{'}
\\[1.5ex]
Reductions that can be found in the transcript are elisions,
assimilations and deletions. An apostrophe is used when the {\bf final
  e} of a word (e.g.\ in \texttt{hab'}, \texttt{wollt'},
\texttt{heut'}) is not pronounced. The same applies to a {\bf final t}
that is not pronounced in \texttt{is'}, \texttt{nich'} and
\texttt{jetz'}. The apostrophe also substitutes the {\bf initial ei}
in the German indefinite articles ein (\texttt{'n}), eine
(\texttt{'ne}), einem (\texttt{'nem}), einer (\texttt{'ner}) etc.\ and
the {\bf initial e} in the reduced pronoun es (\texttt{'s}). The
lexical units mentioned above are always transcribed like this if
reduced. If they occur together, two apostrophes may appear in a row.

\begin{verbatim}
w001_ptd_001_AAA: wie w"ar' 's mit Kino . 
\end{verbatim}


\subsubsection{Compounds}

**german**
\\[1.5ex]
\colorbox{lightgray}{Symbol} \texttt{-} or \texttt{-{}-}
\\[1.5ex]
A peculiarity of German is the compounding of words (mostly nouns)
without using a dash between the word components. In SmartKom we use
dashes in compounds that are made up of more than two words as well as
in some abbreviations. In order to avoid deviant spellings within the
SmartKom corpus we compiled our own glossary.

\begin{verbatim}
w001_ptd_001_AAA: wir wollen die Acht-Uhr-Maschine in die $U-$S-$A
erreichen .
w001_ptd_001_AAA: wieviel kostet die Hin-- und R"uckfahrt ?
\end{verbatim}
%
Note:\\
If we come across a compound that is disconnected (because the subject
is breathing or hesitating while pronouncing it), the dash is placed
after the first part of the compound. A double dash is used if the
common component of two compounds is mentioned only once.


\subsubsection{Spelling}

\colorbox{lightgray}{Symbol} \texttt{\$}
\\[1.5ex]
The spelling label is used when the subject spells a name letter by
letter, e.g.\ for referring to the orthography or in abbreviations
like USA. Each single letter is preceded by a \$-label. The letters
are always uppercase and separated by a comma or a dash, the latter mostly
in abbreviations.

\begin{verbatim}
w001_ptd_001_AAA: mein Name ist ~Meier , $M , $E , $I , $E , $R .
w001_ptd_001_AAA: ich nehme in den $U-$S-$A nicht gerne die
$U-Bahn .
\end{verbatim}


\subsubsection{Acronyms}
**german**
\\[1.5ex]
\colorbox{lightgray}{Symbol} \texttt{\&}
\\[1.5ex]
%
Acronyms are official substitutes for particular words. They are
labelled with \texttt{\&}. The label only has to be placed once, at the
beginning of the acronym. Acronyms can be spelt letter by letter, in
which case the label \texttt{\$} has to be used as well (as seen
before), or they can be pronounced like a word.

\begin{verbatim}
w001_ptd_001_AAA: im &$Z-$D-$F gab es gestern einen interessanten
Bericht "uber die &OPEC .
\end{verbatim}


\subsubsection{Proper Nouns}

\colorbox{lightgray}{Symbol} \texttt{$\sim$}
\\[1.5ex]
%
Names that cannot be translated into another language are marked as
proper nouns, e.g.\ surnames and first names of people, names of
streets, hotels and restaurants, company names, names of institutions,
local places, national holidays etc. Accordingly, names are not
labelled as proper nouns if they exist in more than one language and
can thus be translated. This category includes names of
international holidays, names of countries and continents, the names
of the seven seas etc. If a proper noun consists of several words
that are separated by blank spaces in regular orthography, the parts
of the name are linked by plus signs.

\begin{verbatim}
w001_ptd_001_AAA: ich m"ochte gerne in dem Restaurant
~Zur+blauen+Traube einen Tisch f"ur ~Kirchweih reservieren .
das liegt doch in der ~Richard-Strau"s-Stra"se ?
w001_ptd_001_AAA: wei"st du noch , als wir letztes Jahr von
England nach Amerika "uber den Atlantik gesegelt sind ?
\end{verbatim}
%
Note:\\
In German orthography the parts of multi-word street names are always
connected by dashes.


\subsubsection{Numbers}

\colorbox{lightgray}{Symbol} \texttt{\#}
\\[1.5ex]
Numbers are numerals, combinations of numbers and ordinal numbers. In
all these cases a \texttt{\#} precedes the number. Two-digit
numbers are labelled as one number. All numbers are written as words.

\begin{verbatim}
w001_ptd_001_AAA: ich suche die ~Bernd-Meier-Stra"se #hundert 
#f"unfundzwanzig .
w001_ptd_001_AAA: bitte reserviere f"ur die Acht-Uhr-Maschine 
morgen einen Platz in der #zw"olften Reihe .
\end{verbatim}
%
Note:\\
A number is not labelled if it is part of a compound.


\subsubsection{Neologisms}

\colorbox{lightgray}{Symbol} \texttt{*}
\\[1.5ex]
`Neologism' is a term referring to a word that has been made up by the
speaker and does not appear in a regular dictionary. It could be slang
or a slip of the tongue. A neologism is marked with an asterisk at the
beginning of the respective word.

\begin{verbatim}
w001_ptd_001_AAA: kannst du mir *weiterhebbeln ?
\end{verbatim}
%
Note:\\
Minor slips of the tongue are transcribed correctly followed by a
pronunciation comment. Absurd or impossible combinations of letters
are labelled as not identifiable (see 3.2.1.3.).


\subsubsection{Foreign Words}

\colorbox{lightgray}{Symbol} \texttt{<*XX>}
\\[1.5ex]
Foreign words are words that stem from a language other than the one
used in most of the dialogue (see also 3.1.7.). In these cases a <*XX>
is attached to the beginning of the word. As always, there is no space
between the label and the word.

\begin{verbatim}
w001_ptd_001_AAA: wo liegt das Restaurant ~<*IT>Milano ?
\end{verbatim} 


\subsubsection{Command Words}

\colorbox{lightgray}{Symbol} \texttt{!KEY}
\\[1.5ex]
Command words are words that speakers use to operate the system by
means of meta language. These words are always labelled with a
\texttt{!KEY} with no space between the label and the word.

\begin{verbatim}
w001_ptd_001_AAA: hallo !KEYSmartakus .
\end{verbatim}
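\noindent
The word-level prefix labels introduced so far (\texttt{\$},
\texttt{\&}, \texttt{$\sim$}, \texttt{\#}, \texttt{*}, \texttt{<*XX>}
and \texttt{!KEY}) are all attached to the beginning of a word without
a blank space, so they can be stripped off one after the other. The
following Python sketch is a hypothetical illustration of this idea;
the function name \texttt{word\_labels} and the category names are
assumptions. Only leading labels are removed, so spelling labels
inside an abbreviation remain in the returned word.
\begin{verbatim}
```python
import re

# (pattern, category) pairs for labels at the beginning of a word.
PREFIXES = [
    (re.compile(r"!KEY"), "command word"),
    (re.compile(r"<\*[A-Z]{2}>"), "foreign word"),
    (re.compile(r"~"), "proper noun"),
    (re.compile(r"&"), "acronym"),
    (re.compile(r"\$"), "spelling"),
    (re.compile(r"#"), "number"),
    (re.compile(r"\*"), "neologism"),
]


def word_labels(token):
    """Return (list of categories, remaining word) for one token."""
    labels = []
    rest = token
    found = True
    while found:
        found = False
        for pattern, category in PREFIXES:
            match = pattern.match(rest)
            if match:
                labels.append(category)
                rest = rest[match.end():]
                found = True
                break
    return labels, rest
```
\end{verbatim}
For \texttt{$\sim$<*IT>Milano} the sketch reports a proper noun and a
foreign word and returns \texttt{Milano} as the remaining word.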


\subsubsection{Lengthening}

\colorbox{lightgray}{Symbol} \texttt{<Z>}
\\[1.5ex]
When sounds within or at the end of a lexical unit are lengthened, we
use \texttt{<Z>} (with Z standing for the German word `Zögerung',
i.e.\ hesitation). It is used for prefinal lengthening, for plosives
that have a longer closure phase and for an aspiration phase that is
stronger or longer than normal. This label is directly added to the
letter representing the sound affected.

\begin{verbatim}
w001_ptd_001_AAA: wann f"angt der Fi<Z>lm de<Z>nn an ?
w001_ptd_001_AAA: f"ur welche Uhrzeit<Z> gilt<Z> das ?
\end{verbatim}


\subsubsection{Not or Hardly Identifiable Words}

\colorbox{lightgray}{Symbol} \texttt{<\%>} or \texttt{\%}
\\[1.5ex]
This label can be used if it is impossible to understand a part of
what has been said within the man-machine dialogue. Words that are not
identifiable can either be completely incomprehensible or may be
partially understood but not with certainty. For this reason we use
the label <\%> in place of a word we cannot identify at all, and the
similar label \%, attached to the end of the word, if we can
understand a word partially but not well enough to identify it beyond
doubt.

\begin{verbatim}
w001_ptd_001_AAA: heute will ich <%> sehen .
w001_ptd_001_AAA: heute abend% will ich einen Film sehen .
\end{verbatim}


\subsubsection{Truncated Words}
% WAHLWEISE AUCH NOCH TRUNCATED UTTERANCES


\colorbox{lightgray}{Symbol} \texttt{=}
\\[1.5ex]
Truncated words occur when the speaker has begun to articulate a word
but does not finish it. In other words, the item is terminated at a
point where some of the component sounds have already been produced,
while the rest has been cut off before being articulated. The equals
sign serves as the label here. It is also used in a series of
stutters in which parts of a word are repeated but the word as a whole
is still not completely pronounced. Again, the label is directly added
to the word.

\begin{verbatim}
w001_ptd_001_AAA: meine +/fra=/+ Frage lautet .
w001_ptd_001_AAA: k"onntest du mir hel= <*T>t
\end{verbatim}
%
Note:\\
This label appears most of the time together with other labels such as
false start (see 3.2.19), repetition (see 3.2.18) and turn break (see
3.2.16).


\subsubsection{Articulatory Interruptions}

\colorbox{lightgray}{Symbol} \_
\\[1.5ex]
%
Lexical items can be interrupted by various phenomena such as pauses,
breathing, hesitations, slips of the tongue, mispronunciations etc.
To mark the event, we add an underscore followed by a blank space at
the point of interruption. Then we insert the interrupting element and
finally conclude with the remaining part of the interrupted word, which
is preceded by another blank space and an underscore.

\begin{verbatim}
w001_ptd_001_AAA: was kommt heute a_ <A> _bend im Fernsehen ?
\end{verbatim}
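\noindent
For a word-level analysis the two halves of an interrupted word can be
joined again. The Python sketch below is a hypothetical illustration
(the function name \texttt{rejoin\_interrupted} is not part of any
SmartKom tool). It assumes that the turn body has already been split
at blank spaces, and it places the interrupting elements after the
rejoined word.
\begin{verbatim}
```python
def rejoin_interrupted(tokens):
    """Join 'a_ <A> _bend' into 'abend', keeping '<A>' after the word."""
    result = []
    pending = None  # first part of an interrupted word, if any
    inserted = []   # elements found between the two parts
    for token in tokens:
        if pending is None:
            if token.endswith("_") and len(token) > 1:
                pending = token[:-1]
                inserted = []
            else:
                result.append(token)
        elif token.startswith("_") and len(token) > 1:
            result.append(pending + token[1:])
            result.extend(inserted)
            pending = None
        else:
            inserted.append(token)
    if pending is not None:
        # No second part was found: keep the tokens unchanged.
        result.append(pending + "_")
        result.extend(inserted)
    return result
```
\end{verbatim}
Applied to the tokens of the example above, the sketch would return
\texttt{abend} followed by \texttt{<A>} in place of the three original
tokens.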


\subsubsection{Technical Interruptions}

\begin{tabular}{lcl}
\colorbox{lightgray}{Symbol} & <*T> & technical interruption within a turn
\\
& <*T>t & turn break 
\\
& <T\_> & beginning of turn / word missing
\\
& <\_T> & end of word missing
\end{tabular}

\vspace{1.5ex}
\noindent
Technical interruptions are caused by a temporarily broken or missing
section of the audio signal, which may be due to technical faults or
other errors. There are four distinguishable types of
technical interruption:

\begin{enumerate}
\item{\bf <T\_>} is used when the beginning of a turn is missing. In
  this case it is attached to the beginning of the first lexical item
  that occurs, again without a blank space and regardless of whether
  this item seems to be complete or fragmentary.
\item{\bf <*T>}   is used when larger parts of an utterance are cut off.
It's a substitute for the missing turn elements.
\item{\bf <*T>t} is used when the end of a turn is missing.
\item{\bf <\_T>} is used when the last part of a word is cut off.
\end{enumerate} 

\begin{verbatim}
w001_ptd_001_AAA: <T_>im Kino ?
w001_ptd_001_AAA: was l"auft <*T> Kino ?
w001_ptd_001_AAA: was l"auft heute abend im Ki= <*T>t
w001_ptd_001_AAA: was l<_T> <*T> <T_>end im Kino ?
\end{verbatim}


\subsubsection{Pronunciation Comments}

\colorbox{lightgray}{Symbol} \texttt{<!n...>}
\\[1.5ex]
The pronunciation comment indicates that the subject shows an unusual
pronunciation at this point (such as slang, a hint of a foreign accent
or dialect, word contractions, assimilations or
mispronunciations). Thus, pronunciation comments show the deviation
between the actual pronunciation and the `canonical form'. The number
after the exclamation mark of the label states how many words the
comment refers to; in the case of contractions this is the number of
contracted words. The label follows the lexical item,
separated by a blank space.

\begin{verbatim}
w001_ptd_001_AAA: das <!1 des> will <!1 wui> ich <!1 i> aber
<!1 aba> nicht <!1 net> sehen .
w001_ptd_001_AAA: kommst du <!2 koms'u> mal <!1 ma'> schneller
in die G"ange .
\end{verbatim}
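\noindent
Because the number in the label states how many preceding words a
comment refers to, canonical and actual forms can be paired
automatically. The following Python sketch is a hypothetical
illustration; the function name \texttt{pronunciation\_variants} is an
assumption, and the sketch expects the number in every label to be at
least one.
\begin{verbatim}
```python
import re

# <!n form>: n preceding words were pronounced as 'form'.
COMMENT = re.compile(r"<!(\d+) ([^>]*)>")


def pronunciation_variants(body):
    """Return (canonical words, pronounced form) pairs of a turn body."""
    pairs = []
    for match in COMMENT.finditer(body):
        count = int(match.group(1))
        # Words before the comment, with earlier comments removed.
        words = COMMENT.sub(" ", body[:match.start()]).split()
        pairs.append((" ".join(words[-count:]), match.group(2)))
    return pairs
```
\end{verbatim}
For the second example turn above, the sketch pairs `kommst du' with
\texttt{koms'u} and `mal' with \texttt{ma'}.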


\subsubsection{Repetition or Correction}

\colorbox{lightgray}{Symbol} \texttt{+/.../+}
\\[1.5ex]
There is a tendency in spontaneous speech to stutter and also to
correct such disfluencies. The label +/.../+ is used when the speaker
repeats a word or a phrase, or when he or she substitutes a new word
for the one first started but continues with the same word class.

\begin{verbatim}
w001_ptd_001_AAA: ich h"atte also gerne +/das/+ +/das/+ das
Restaurant gesehen .
\end{verbatim}


\subsubsection{False Start}

\colorbox{lightgray}{Symbol} \texttt{-/.../-}
\\[1.5ex]
A false start is characterised by the subject beginning an utterance,
breaking it off before completion and continuing with an
entirely new thought. The label is placed in the same way as the
repetition/correction label.

\begin{verbatim}
w001_ptd_001_AAA: -/heute abend will/- was kommt denn morgen im
Kino ?
\end{verbatim}


\subsubsection{Breathing}

\colorbox{lightgray}{Symbol} \texttt{<A>}
\\[1.5ex]
Breathing (in German `Atmung'), i.e.\ inhalation or exhalation, often
occurs at prosodic or syntactic boundaries. In the transcript only
breathing that is clearly audible is indicated. If a punctuation mark
and the breathing label coincide, the punctuation mark is put first.

\begin{verbatim}
w001_ptd_001_AAA: bitte zeige <A> mir das Kinoprogramm .
\end{verbatim}


\subsubsection{Filled Pause}

\begin{tabular}{ll}
\colorbox{lightgray}{Symbol} & <"ah> \\
& <"ahm> \\
& <hm> \\
&  <h"as>
\end{tabular}
\vspace{1.5ex}

\noindent
In spontaneous speech filled pauses are defined as pauses that are
filled with some vocalisation. A filled pause may occur when a
speaker thinks about something. The speaker actually interrupts his
or her speech while continuing to articulate. This articulation is,
however, neither a word nor part of a word and should thus not be
treated as such. As a consequence a punctuation mark cannot follow a
filled pause; it has to come first. Nevertheless a filled pause can
make up a turn of its own.

\begin{verbatim}
w001_ptd_001_AAA: <hm>
w001_ptd_001_AAA: welches <"ah> Restaurant kannst du mir 
empfehlen .
\end{verbatim}

\noindent
In German transcripts the label <h"as> is used when the subject
produces hesitations like `brrt', `pf', `puh' and the like.




\subsubsection{Empty Pause}

\colorbox{lightgray}{Symbol} \texttt{<P>}
\\[1.5ex]
Empty pauses can be defined as temporary, unfilled gaps in speech.
They can be overlaid by a speaker interference (see 3.2.25), but
cannot themselves actively overlay other speech. Just as with the
filled pause labels, punctuation marks always come first. Empty pauses
at the beginning or at the end of a turn are not transcribed.


\begin{verbatim}
w001_ptd_001_AAA: hallo , <P> kannst du mir bitte sagen wo ich
das Schlo"s in  <P> ~Heidelberg finde .
\end{verbatim}
%
Note:\\
In the event of an extremely long empty pause the label <PP> may be
used.

\subsubsection{Human Noises}

\begin{tabular}{ll}
\colorbox{lightgray}{Symbol} & \texttt{<Ger"ausch>}\\
& \texttt{<Lachen>}
\end{tabular}

\vspace{1.5ex}
\noindent
Speakers also produce sounds that have no real meaning, such as
laughing, coughing, swallowing etc. Laughing is labelled <Lachen>
(German for laughter); the other human noises are labelled
<Ger"ausch> (German for noise). If one of these noises continues for a
longer period of time without being interrupted (a speaker laughing,
for example), a single label is sufficient. As usual, punctuation
marks come first.

\begin{verbatim}
w001_ptd_001_AAA: das wollte ich aber nicht wissen . <Lachen>
w001_ptd_001_AAA: <Ger"ausch> in welcher Stra"se liegt das ?
\end{verbatim}


\subsubsection{Technical Noises}

\colorbox{lightgray}{Symbol} \texttt{<\#>}
\\[1.5ex]
Noises that can't be attributed to the speaker are technical noises.
These might be caused by the recording instruments, by dropping things
or by people moving around while recording.

\begin{verbatim}
w001_ptd_001_AAA: hallo <#> !KEYSmartakus .
\end{verbatim}

\subsubsection{Speaker Interference}

\begin{tabular}{lcl}
\colorbox{lightgray}{Symbol} & ..n@  & (passive speaker interference of lexical units)\\
& @n..  & (active speaker interference of lexical units)\\
& ..n@> & (passive speaker interference of other events)\\
& <@n.. & (active speaker interference of other events)
\end{tabular}

\vspace{1.5ex}
\noindent
Speaker interference occurs when the subject and the system speak at
the same time or when noises occur while the subject (or the system)
speaks. From the point of view of a speaker an interference may be
either passive or active, depending on whether the speaker is the one
who has been interrupted or the one who has interrupted. In either
case the labelling indicates both the turn components passively
affected by the interference and the turn components actively causing
it.

Several speaker interferences within one dialogue are quite usual.
This is why interferences are numbered consecutively.

\begin{verbatim}
w001_ptd_001_AAA: ich m"ochte gerne in der N"ahe des 
~Marktplatz1@ par=1@ <*T>t
w001_ptw_002_SMA: @1bitte @1w"ahle @1ein Parkhaus aus .
...

...
w001_ptw_015_SMA: hier sehen Sie die Sehensw"urdigkeiten 
<#>3@> von3@ ~Heidelberg3@ .
w001_ptd_016_AAA: @3ja , <@3<"ahm> @3genau @3das wollte 
ich von dir gezeigt bekommen .
...
\end{verbatim}
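
The marker shapes listed above can be recognised mechanically. The
following Python sketch is not part of the SmartKom tools; the names
\texttt{interference} and \texttt{group\_by\_number} are made up for
illustration. It assumes that tokens are separated by blanks and that
an unmarked token never ends in a digit followed by \texttt{@}.

\begin{verbatim}
import re

# Active markers open a token: "@" + n + word, or "<@" + n + event.
ACTIVE = re.compile(r'^<?@(?P<n>\d+)(?P<unit>.+)$')
# Passive markers close a token: word + n + "@", or event + n + "@>".
PASSIVE = re.compile(r'^(?P<unit>.+?)(?P<n>\d+)@>?$')

def interference(token):
    """Return (number, role, unit) for a marked token, else None."""
    m = ACTIVE.match(token)
    if m:
        return int(m.group('n')), 'active', m.group('unit')
    m = PASSIVE.match(token)
    if m:
        return int(m.group('n')), 'passive', m.group('unit')
    return None

def group_by_number(turn_text):
    """Collect the marked units of one turn by interference number."""
    groups = {}
    for tok in turn_text.split():
        hit = interference(tok)
        if hit:
            n, role, unit = hit
            groups.setdefault(n, []).append((role, unit))
    return groups
\end{verbatim}

Applied to the system turn of the second example, such a function
would collect the three passive components of interference 3; the
matching active components come from the following subject turn.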


\subsubsection{Noise Interference}

\colorbox{lightgray}{Symbol} \texttt{<:<...> ...:>}
\\[1.5ex]
An utterance may overlap with one or more noises, either background
noises or noises produced by a speaker. A noise, human or non-human,
may occur before, after or during a turn element. Only when a noise
occurs during a word are the brackets \texttt{<:} and \texttt{:>} used
to enclose both the noise and the word.

\begin{verbatim}
w001_ptd_001_AAA: ich werde mir das <:<Ger"ausch> 
Schlo"s:> anschauen .
...
w001_ptd_001_AAA: was gibt es so <:<#> in<Z>:> ~Heidelberg .
w001_ptd_002_SMA: die <:<#> Sehensw"urdigkeiten:> von ~Heidelberg .
...
\end{verbatim}
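
A reader of the transcripts may want to pull out every word that is
overlaid by a noise. The Python sketch below shows one way to do this
with a regular expression. The name \texttt{noise\_spans} is
hypothetical, and the sketch assumes that the noise label and the word
are separated by blanks and that a span does not contain a line break.

\begin{verbatim}
import re

# "<:" + noise label + blank(s) + overlaid word + ":>"
NOISE_SPAN = re.compile(r'<:(<[^>]+>)\s+(.+?):>')

def noise_spans(text):
    """Return (noise label, overlaid word) pairs found in text."""
    return NOISE_SPAN.findall(text)
\end{verbatim}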


\subsubsection{Off-Talk}

\begin{tabular}{ll}
\colorbox{lightgray}{Symbol} & \texttt{<OOT>} \\
& \texttt{<ROT>}
\end{tabular}

\vspace{1.5ex}
\noindent
As seen before (3.1.8.), Off-Talk is labelled if the subject reads
aloud (Read Off-Talk, \texttt{<ROT>}) or uses speech that is not
relevant to the SmartKom system (Other Off-Talk, \texttt{<OOT>}). If
this applies only to single words or phrases, and not to the whole
turn, the label is attached directly to the end of each word spoken
in Off-Talk.

\begin{verbatim}
w001_ptd_001_AAA: zeig mir die italienischen Restaurants hier .
w001_ptw_002_SMA: hier siehst du die italienischen Restaurants 
von ~Heidelberg .
w001_ptd_003_AAA: mhm . ~<*IT>Milano<ROT> , ~<*IT>Roma<ROT> .
welches<OOT> von<OOT> den<OOT> beiden<OOT> ist<OOT> jetz'<OOT> 
besser<OOT> ? <P> !KEYSmartakus , welches Restaurant kannst du 
empfehlen ?
\end{verbatim}
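
Because the Off-Talk label sits at the end of each affected word, the
words addressed to the system can be separated from the Off-Talk words
token by token. The following Python sketch illustrates this; the name
\texttt{split\_off\_talk} is an assumption made for this example, and
tokens are assumed to be separated by blanks.

\begin{verbatim}
import re

# A word followed directly by <OOT> or <ROT>.
OFF_TALK = re.compile(r'^(?P<word>.+)<(?P<kind>OOT|ROT)>$')

def split_off_talk(turn_text):
    """Return (tokens for the system, [(word, kind)] in Off-Talk)."""
    system, off = [], []
    for tok in turn_text.split():
        m = OFF_TALK.match(tok)
        if m:
            off.append((m.group('word'), m.group('kind')))
        else:
            system.append(tok)
    return system, off
\end{verbatim}

Punctuation marks and labels such as \texttt{<P>} carry no Off-Talk
label themselves, so this sketch leaves them with the remaining tokens.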


\subsubsection{Punctuation Marks}

\begin{tabular}{ll}
\colorbox{lightgray}{Symbol} & \texttt{,} \\
& \texttt{?} \\
& \texttt{.}
\end{tabular}

\vspace{1.5ex}
\noindent
Spontaneous speech is difficult to punctuate correctly. Transcribers
will have to rely on their hearing and intuition as native speakers as
well as on their grammatical knowledge in order to structure
utterances and sentences in a logical way.

\begin{verbatim}
w001_ptd_001_AAA: okay , das h"ort sich gut an .
w001_ptd_001_AAA: kannst du mir helfen ?
\end{verbatim}
%
Note:\\
Only commas, question marks and periods are used. The exclamation
mark is part of some labels (see 3.2.17) and therefore cannot serve as
a regular punctuation mark. Punctuation marks and lexical items are
separated by blank spaces; only after the last period or question
mark is no blank space added.
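
The spacing rules in this note can be checked automatically. The
Python sketch below is only an illustration under simplifying
assumptions: the name \texttt{punctuation\_problems} and the wording of
the messages are made up, and the check looks at blank-separated
tokens only.

\begin{verbatim}
MARKS = {',', '?', '.'}

def punctuation_problems(turn_text):
    """Return a list of messages; an empty list means no problem."""
    problems = []
    if turn_text.endswith(' '):
        problems.append('blank after final mark')
    for tok in turn_text.split():
        if tok in MARKS:
            continue  # a correctly separated punctuation mark
        if tok == '!' or (len(tok) > 1 and tok.endswith('!')):
            problems.append('exclamation mark as punctuation: ' + tok)
        elif tok[-1] in MARKS:
            problems.append('no blank before mark: ' + tok)
    return problems
\end{verbatim}

A token such as \texttt{!KEYSmartakus} passes, because the exclamation
mark opens a label there and does not close a word.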


\subsection{Prosodic Labelling}

\subsubsection{Phrase Boundaries}

\begin{center}
\begin{tabular}{|l|l|}
\hline
\texttt{[B2]} &  weak phrase boundary\\
\hline
\texttt{[B3 rise]} & strong phrase boundary with slightly rising intonation\\
\hline
\texttt{[B3 cont]} & strong phrase boundary with neither rising nor falling
intonation\\
\hline
\texttt{[B3 fall]} & strong phrase boundary with slightly falling intonation\\
\hline
\texttt{[B9]} & irregular phrase boundary\\
\hline
\end{tabular}
\end{center}

\vspace{1.5ex}
\noindent
Speech is subdivided by intonational patterns; in other words,
utterances are grouped into prosodic units while speaking. To show
these units, we label the phrase boundaries. We distinguish three
types of boundaries, one of which is divided into three
subcategories.

\begin{enumerate}
\item The intermediary phrase boundary [B2]:\\
  A [B2] is a weak boundary that can be found within a prosodic
  phrase. It resembles a very short interruption.
  
\begin{verbatim}
w001_ptd_001_AAA: kannst du den [B2] Videorekorder 
programmieren ?
\end{verbatim}
  
\item The phrase boundary [B3]:\\
  A [B3] is a strong boundary within the speech flow. This boundary is
  divided further into three subcategories. The subject can finish the
  phrase with rising intonation, with falling intonation, or with
  intonation that shows hardly any movement, which we call `cont' (short
  for continuous).
  
\begin{verbatim}
w001_ptd_001_AAA: kannst du den [B2] Videorekorder 
programmieren [B3 fall] ? hallo [B3 rise] ?
\end{verbatim}
%
  Note:

  [B3 rise], [B3 fall] or [B3 cont] are placed independently of
  punctuation. Thus, a period does not necessarily generate a [B3
  fall] and a question mark not necessarily a [B3 rise].
  
\item The irregular phrase boundary [B9]:\\
  This label is used when we come across a false start, a repetition
  or a correction. Turn breaks are also labelled with [B9].

\begin{verbatim}
w001_ptd_001_AAA: wie lang brauchs= [B9] <*T>t
w001_ptd_001_AAA: was +/machen s=/+ [B9] machst 
du da [B3 rise] ?
\end{verbatim}
 
\end{enumerate}
%
In general, a phrase boundary label is placed after the last word of
the phrase, separated by a blank space and always in front of the
punctuation mark. Boundary labels can then be followed by other
labels, such as those for noise or breathing. As a rule, a boundary
label appears at the end of each phrase or sentence.
\\[1.5ex]
\noindent
Note:\\
\large{\textcircled{\Lightning}} \normalsize{{\it There is no prosodic
    labelling for the output of the SmartKom information system!}}
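
The placement rule for boundary labels (after the last word, in front
of the punctuation mark) can be tested with a few lines of Python.
This is a sketch only: the name \texttt{misplaced\_boundaries} is
hypothetical, and the check reports nothing more than boundary labels
that directly follow a punctuation mark.

\begin{verbatim}
import re

MARKS = {',', '?', '.'}
# A boundary label such as "[B3 fall]" counts as one token,
# although it contains a blank.
TOKEN = re.compile(r'\[B(?:2|3 (?:rise|cont|fall)|9)\]|\S+')

def misplaced_boundaries(turn_text):
    """Return (mark, label) pairs where a label follows a mark."""
    toks = TOKEN.findall(turn_text)
    return [(prev, tok) for prev, tok in zip(toks, toks[1:])
            if tok.startswith('[B') and prev in MARKS]
\end{verbatim}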


\subsubsection{Accents}

\begin{center}
\begin{tabular}{|l|l|}
\hline
\texttt{[NA]} & secondary stress\\
\hline
\texttt{[PA]} & primary stress\\
\hline
\texttt{[EK]} & emphasis/contrast\\
\hline
\end{tabular}
\end{center}
%
\vspace{1.5ex} Accentuated words are marked with stress labels. We
distinguish between different accentuation patterns. An accent label
is placed after the respective word or, if the word already carries a
pronunciation comment, after that comment. Like boundary labelling,
accent labelling should be considered in every single phrase, no
matter how short it may be.

\begin{enumerate}
\item Primary Stress [PA]:\\
  This label marks the most strongly accentuated word of the
  phrase. The marked word normally also carries the most important
  information for the listener. Usually, the label is assigned
  only once per phrase.

\begin{verbatim}
w001_ptd_001_AAA: kannst du den [B2] Videorekorder [PA] 
programmieren [B3 fall] ?
\end{verbatim}
  
\item Secondary Stress [NA]:\\
  All other words accentuated within a phrase are labelled with [NA].
  Here, it's quite usual to label more than one word per phrase.

\begin{verbatim}
w001_ptd_001_AAA: kannst [NA] du den [B2] Videorekorder [PA]
programmieren [NA] [B3 fall] ?
\end{verbatim}
  
\item Contrastive Stress [EK]:\\
  When a word is stressed more heavily than usual, this may be the
  result of the speaker trying to emphasize or contrast something. In
  very obvious cases we may use [EK] instead of [PA]. Such strong
  accentuations are rare, though.

\end{enumerate}
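
Since a primary stress is usually assigned only once per phrase,
counting the labels per phrase is a simple plausibility check. The
Python sketch below assumes that \texttt{[B3 ...]} and \texttt{[B9]}
close a phrase while \texttt{[B2]} does not, and it counts
\texttt{[EK]} together with \texttt{[PA]} because it may replace it.
The name \texttt{primary\_counts} is made up for this example.

\begin{verbatim}
import re

# Labels that are taken to close a phrase in this sketch.
PHRASE_END = re.compile(r'\[B3 (?:rise|cont|fall)\]|\[B9\]')

def primary_counts(turn_text):
    """Return the number of [PA]/[EK] labels for each phrase."""
    phrases = [p for p in PHRASE_END.split(turn_text)
               if p.strip(' ,.?')]  # skip punctuation-only pieces
    return [p.count('[PA]') + p.count('[EK]') for p in phrases]
\end{verbatim}

A count other than 1 is not necessarily an error; it only points to a
phrase that a transcriber may want to look at again.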



\section{List of all Labels}

\subsection{Lexical Labels}

\begin{center}
\begin{tabular}{|l|l|}
\hline
\texttt{<*tXX>} & multilingual turn (EN, DE, ...)\\
\hline
\texttt{<*tOOT>} & Other Off-Talk turn\\
\hline
\texttt{<*tROT>} & Read Off-Talk turn\\
\hline
\texttt{<;...>} & global comment\\
\hline
\texttt{..'..} & reduction\\
\hline
\texttt{..-..(---)} & compound\\
\hline
\texttt{\$} & spelling\\
\hline
\texttt{\&} & acronym\\
\hline
\texttt{\textasciitilde} & proper noun\\
\hline
\texttt{\#} & number\\
\hline
\texttt{*} & neologism\\
\hline
\texttt{*XX} & foreign word\\
\hline
\texttt{!KEY} & command word\\
\hline
\texttt{<Z>} & lengthening\\
\hline
\texttt{<\%> \%} & not/hardly identifiable word\\
\hline
\texttt{=} & truncated word\\
\hline
\texttt{\_} & articulatory interruption\\
\hline
\texttt{<*T>} & technical interruption within a turn\\
\hline
\texttt{<*T>t} & turn break\\
\hline
\texttt{T\_} & beginning of turn/word missing\\
\hline
\texttt{\_T} & end of word missing\\
\hline
\texttt{<!n...>} & pronunciation comment\\
\hline
\texttt{+/.../+} & repetition or correction\\
\hline
\texttt{-/.../-} & false start\\
\hline
\texttt{<A>} & breathing\\
\hline
\texttt{<"ah> <"ahm> <hm> <h"as>} & filled pause\\
\hline
\texttt{<P>} & empty pause\\
\hline
\texttt{<Ger"ausch> <Lachen>} & human noise\\
\hline
\texttt{<\#>} & technical noise\\
\hline
\texttt{..n@} & passive speaker interference of a lexical unit\\
\hline
\texttt{@n..} & active speaker interference of a lexical unit\\
\hline
\texttt{..n@>} & passive speaker interference of other events\\
\hline
\texttt{<@n..} & active speaker interference of other events\\
\hline
\texttt{<:<...> ...:>} & noise interference\\
\hline
\texttt{<OOT>} & Other Off-Talk\\
\hline
\texttt{<ROT>} & Read Off-Talk\\
\hline
\texttt{. ? ,} & punctuation marks\\
\hline
\end{tabular}
\end{center}

\subsection{Prosodic Labels}

\begin{center}
\begin{tabular}{|l|l|}
\hline
\texttt{[B2]} & weak phrase boundary\\
\hline
\texttt{[B3 rise]} &  strong phrase boundary with rising intonation\\
\hline
\texttt{[B3 cont]} & strong phrase boundary with hardly any movement\\
\hline 
\texttt{[B3 fall]} & strong phrase boundary with falling intonation\\
\hline
\texttt{[B9]} & irregular phrase boundary\\
\hline
\end{tabular}
\end{center}

\newpage

\section{Literature}

A. Batliner: M specified: A revision of the syntactic-prosodic
labelling system for large spontaneous speech databases. Verbmobil-Memo
124, F.-A.-University Erlangen-Nuremberg. August 1997.
\\[1.5ex]
A. Batliner, S. Burger, A. Kießling: Außergrammatische Phaenomene in
der Spontansprache: Gegenstandsbereich, Beschreibung,
Merkmalsinventar. Verbmobil-Report, Nr. 5. Muenchen, Erlangen. Februar
1994.
\\[1.5ex]
A. Batliner, A. Kießling, S. Burger, E. Noeth: Filled Pauses in
Spontaneous Speech. Verbmobil-Report, Nr. 88. Muenchen, Erlangen. Juli 1995.
\\[1.5ex]
S. Burger: Transliterationslexikon. Verbmobil Technisches Dokument,
Nr. 36. Muenchen. Oktober 1995.
\\[1.5ex]
S. Burger: Transliteration spontansprachlicher Daten -Lexikon der
Transliterationskonventionen- VERBMOBIL II. Verbmobil Technisches
Dokument, Nr. 56. Muenchen. April 1997.
\\[1.5ex]
S. Burger, V. Maclaren, T. Jones: The Verbmobil Transliteration
Manual. Carnegie Mellon. April 2000.
\\[1.5ex]
S. Rabold, D. Oppermann, N. Beringer: Transliteration
spontansprachlicher Daten -Lexikon der Transliterationskonventionen-
SmartKom, Technisches Dokument, Nr. 2. Muenchen. Dezember 2001.
\\[1.5ex]
K. Kohler, G. Lex, M. Pätzold, M. Scheffers, A. Simpson, W. Thon:
Handbuch zur Datenaufnahme und Transliteration in TP14 von Verbmobil
-3.0. Verbmobil Technisches Dokument, Nr. 11. Kiel. September 1994.


\end{document}