- ... \today1
- This document is updated frequently. Check www.bas.uni-muenchen.de/Forschung/BITS/TP1/Cookbook for the latest version.
- ...
Processing1.1
- www.bas.uni-muenchen.de/Forschung/BITS
- ... Munich1.2
- www.bas.uni-muenchen.de/Bas
- ...
speaking1.3
- Aside from the speech signal, these time signals may
include: laryngographic signal, electropalatographic signal, coordinate
parameters derived from EMA (Electro-Magnetic Articulography), X-ray
movie (cineradiography), coordinate
parameters derived from X-ray microbeam, air flow, nuclear magnetic
resonance imaging, ultrasound imaging etc. In this cookbook we do not
give specific instructions on how to use the special recording hardware
for the listed signals, because this is far beyond the scope of
this book.
- ...
Page2.1
- www.icp.grenet.fr/ELRA/org/reasons.php3
- ...
data2.2
- Aside, of course, from the natural concern that you would not
want your data to be destroyed or stolen by intruders in your computer
system!
- ... technology2.3
- And, ironically, the speakers of a biometric speech corpus might be
the ones most vulnerable to break-ins, depending on the technology
used.
- ... ID2.4
- See also section
, p.
- ...
ELRA's2.5
- European Language Resources Association,
www.icp.grenet.fr/ELRA/home.html
- ...
Nijmegen2.6
-
www.spex.nl
- ... (LDC)2.7
- www.ldc.upenn.edu
- ... ELRAs2.8
- For example, for commercial organizations the
yearly ELRA fee is EUR 1,500 while the yearly LDC fee is $20,000.
- ... Germany2.9
- www.bas.uni-muenchen.de/Bas
- ... basis2.10
- This does not mean that you earn no
royalties for your corpus, only that BAS does not seek to make a
profit from distributing your corpus.
- ...
(IMDI)3.1
- www.mpi.nl/ISLE/index.html
- ...
resources3.2
- In this case the term
`speech data' is not restricted to speech corpora as described in
this cookbook. It also refers to text corpora, terminology
databases and lexica.
- ...
corpus.3.3
- For information about meta data file formats
see section
(p.
).
- ...3.33.4
- Other meta data initiatives are Dublin Core, which
defines a very small set of descriptors for language resources; MPEG-7, which
is an attempt to define a classification system for any type of content
of relevance to the home entertainment industry; and OLAC (Open Language Archives Community).
- ...noise3.5
- Note that background noise might be played back
artificially during the recording and in that case will be easy to
describe.
- ... speakers.3.6
- Noise
events and cross talk may
be subject to annotation techniques (see chapter
).
- ...recorded3.7
- This
does not necessarily match the specifications of the signals in
the final speech corpus because signals may be altered in the
post-processing (chapter
). For instance,
signals are very often recorded with a 48 kHz sampling frequency
and then filtered and down-sampled to a lower sampling frequency
in the post-processing.
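The filter-then-down-sample step mentioned in this footnote can be sketched as follows. This is only an illustration in plain Python, not the tool chain of any particular corpus production: the function names `lowpass_fir` and `downsample`, the Hamming-windowed sinc design, the 101 filter taps and the 0.45 cutoff margin are all assumptions made for the example.

```python
import math


def lowpass_fir(cutoff, num_taps=101):
    """Hamming-windowed sinc low-pass filter (hypothetical helper).

    cutoff is given as a fraction of the sampling rate (0 < cutoff < 0.5);
    num_taps should be odd so that the filter has a centre tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at 0 Hz


def downsample(samples, factor, num_taps=101):
    """Low-pass filter, then keep every factor-th sample."""
    # Cut off slightly below the new Nyquist frequency (0.5 / factor of the old rate).
    taps = lowpass_fir(0.45 / factor, num_taps)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half  # filter is centred on sample i
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out
```

For example, `downsample(samples, 3)` would turn a 48 kHz recording into a 16 kHz signal. For real corpus work an established signal processing tool is the safer choice; the sketch only shows why the filtering has to come before the sample dropping.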
- ...
birth3.8
- Do not use the age of the speaker at the time of
recording, because you might record the same speaker in a
different corpus/release later and want to re-use the speaker
profile information.
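Storing the date of birth loses nothing, since the age at any recording date can be derived from it. A minimal sketch, assuming the speaker profile holds a full date of birth; the function name `age_at` is hypothetical:

```python
from datetime import date


def age_at(birth: date, recording: date) -> int:
    """Age in completed years on the day of the recording."""
    years = recording.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the recording year.
    if (recording.month, recording.day) < (birth.month, birth.day):
        years -= 1
    return years
```

The same speaker profile can then be re-used for a later corpus or release by passing the new recording date.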
- ... digits4.1
- German has two word forms for the
digit `2': zwei and zwo.
- ... corpus4.2
- Of course this only makes sense
if you're not interested in these adaptive effects!
- ... followed.4.3
- Please note that the annotation phase is in
most cases necessary anyway!
- ... world'.4.4
- Therefore some authors call them `real world
recordings'.
- ...
EMA4.5
- EMA = Electro-Magnetic Articulography
- ... setup.4.6
- If not,
think about it: it does not increase the effort
significantly, but will increase the value of your corpus.
- ... SAM4.7
- See for instance [2], Part IV, C.
- ... SPHERE4.8
- www.nist.gov/speech
- ... RIFF4.9
- ccrma-www.stanford.edu/CCRMA/Courses/422/projects/WaveFormat
- ... SHORTEN4.10
- www.hornig.net/shorten.html
- ... Format4.11
- www.icp.inpg.fr/Relator/standsam.html or
[2], Part IV, C.
- ... Format)4.12
- www.mpi.nl/DOBES/tools/Eudico-Annotation-Tool.pdf
- ... Format)4.13
- www.bas.uni-muenchen.de/Bas/BasFormatseng.html
- ...
formats.4.14
- At BAS there exists a public domain tool par2ags.pl to
transform BPF into Bird's
annotation graph file format (XML).
- ... signal4.15
- See
section
for details.
- ...IMDI4.16
- www.mpi.nl/ISLE/
- ... X-SAMPA4.17
- www.phon.ucl.ac.uk/home/sampa/home.htm or [2], Part IV, B.
- ... them.4.18
- There is no such thing as ``the pronunciation of a word''. Your
lexicon will always contain word forms where the most likely
pronunciation is debatable. For a more detailed discussion of dictionary
contents please refer to chapter
.
- ...
paper5.1
- Often this may result in unwanted background noise like
paper rustling, page turning etc.
- ... adjust5.2
- Typically a threshold and two
timing parameters for speech and silence: when the signal is higher than
the threshold for more than T1, speech has started; when the signal
stays under the threshold for longer than T2, speech has ended.
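The rule described in this footnote can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the algorithm of any specific recording software: the function name `detect_speech` is hypothetical, the input is assumed to be one signal level per frame, and T1 and T2 are counted in frames.

```python
def detect_speech(levels, threshold, t1, t2):
    """Return a list of (start, end) frame indices; end is exclusive.

    levels    : one signal level per frame (for example frame energy)
    threshold : level above which a frame counts as possible speech
    t1        : speech starts after more than t1 consecutive frames above threshold
    t2        : speech ends after more than t2 consecutive frames at or below threshold
    """
    segments = []
    in_speech = False
    run = 0       # length of the current run that may trigger a state change
    start = None
    for i, level in enumerate(levels):
        if not in_speech:
            if level > threshold:
                run += 1
                if run > t1:
                    in_speech = True
                    start = i - run + 1  # first frame of the run
                    run = 0
            else:
                run = 0
        else:
            if level <= threshold:
                run += 1
                if run > t2:
                    segments.append((start, i - run + 1))
                    in_speech = False
                    run = 0
            else:
                run = 0
    if in_speech:
        # Recording ended while still in speech.
        segments.append((start, len(levels)))
    return segments
```

Short noise bursts (shorter than T1) and short pauses within an utterance (shorter than T2) are ignored, which is the purpose of the two timing parameters.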
- ... (EU)5.3
- MIME
types: audio/x-alaw-basic or audio/x-ulaw-basic
- ...
`echo-canceling'5.4
- `Echo' in this context means that the signal
sent to an analog telephone is heard, with a certain time delay, in
the channel coming back from that telephone.
- ...
Clippings5.5
- Samples that have the
maximum value of your sample format, for instance +32767 for 16-bit samples.
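A simple check for such samples might look like the sketch below. The function name `count_clipped` is hypothetical, the samples are assumed to be signed integers, and counting the most negative value (-32768 for 16 bit) as clipped as well is an assumption that goes beyond the definition given in this footnote.

```python
def count_clipped(samples, bits=16):
    """Count samples that sit at the limits of a signed integer sample format."""
    max_val = 2 ** (bits - 1) - 1   # +32767 for 16 bit
    min_val = -(2 ** (bits - 1))    # -32768 for 16 bit (assumption: also treated as clipped)
    return sum(1 for s in samples if s >= max_val or s <= min_val)
```

A recording with more than a handful of such samples is a candidate for re-recording at a lower input level.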
- ... significantly.5.6
- You may mark the position of furniture by taping
markers on the floor.
- ... laptop.5.7
- In most cases you
can hear the spinning up of the hard drive quite clearly.
- ... only5.8
- For most
laptops adaptors for 12 V DC are available.
- ... machine'5.9
- The application that is
simulated by the WOZ experiment. For example, if you need a camera in a
simple command and control recording, tell the speakers that the machine
uses a camera to detect the point in time when they start speaking.
- ... button5.10
- This also increases the `machine-likeness' of
the simulation because these pre-recorded answers sound exactly the
same every time.
- ... system.5.11
- In the SmartKom WOZ
experiments we asked speakers to solve a simple task with the help of the
`virtual machine'. However, one of the speakers finished the task very
quickly and spent the rest of the recording time testing the system with
a kind of Turing test: he repeatedly asked the system to meet
him at the cinema and maybe later to have dinner together. Fortunately,
our wizard kept a straight face (straight voice) and kept on hitting
the button saying `Sorry Sir, I did not understand. Could you please
state your question again?' again and again and again...
- ... staff.5.12
- Believe us: the bugs are
there!
- ... developments6.1
- For instance to strengthen the
recruitment efforts in areas that are not covered yet.
- ... entirely6.2
- In fact this has been done in
BAS corpus productions but we do not recommend it. The reason for
this is that the cost of online monitoring is often higher than that of an
annotation after the recording. Furthermore, a post-recording
annotation may find errors that go unnoticed even in a monitored
recording, and additional characteristics in the speech signal may
be annotated that the supervisor was not able to detect (for
instance a disturbance of the signal caused by a malfunctioning
recording device). Finally, we think that a speech corpus of monitored
speech utterances is in most cases not what the users of speech corpora
really need: the scientist or developer of a speech application has to
cope with errors in the spoken language input. Therefore such errors should not be
omitted from the corpus but rather labeled. Only corpora for speech
synthesis might justify a monitored recording technique.
- ...
accordingly.6.3
- See for instance [8] for a description
of data pipelining in the SmartKom project.
- ... lucky6.4
- For instance, if you
work in a company and can recruit its own employees as speakers.
- ... collection6.5
- Please also
note the sections about recruitment in chapter
, p.
, and chapter
, p.
.
- ... it6.6
- If you're working for a
non-profit organization this is much easier than
if you work for a company.
- ... EUR6.7
- All numbers taken from the year 2002.
- ...
arbitrary7.1
- Except for filtering before down-sampling!
- ... files7.2
- Correct with regard
to your specified terminology, see
section
, p.
.
- ... recording7.3
- This process
is sometimes also referred to as segmentation, but we prefer the term
editing to distinguish it from the segmentation of speech into
linguistic units.
- ... speak7.4
- Of course in this case no real spontaneous
conversation is possible, because each partner always has to wait for the other
partner to finish.
- ...
file7.5
- Using a simple conference call and an ISDN card.
- ... package7.6
- www.phon.ucl.ac.uk/resource/sfs/
- ... down-sampling7.7
- Nyquist or Shannon Theorem
- ...sox7.8
- www.spies.com/Sox/
- ... platforms7.9
- Our
most sincere regrets.
- ... PC7.10
- www.cygwin.com/
- ... symbolic8.1
- Symbolic
in the sense of discrete or categorical.
- ... Format8.2
- www.bas.uni-muenchen.de/Bas/BasFormatseng.html
- ... Bird8.3
- See for instance [1].
- ...
LDC8.4
- agtk.sourceforge.net/
- ... layers8.5
- In some cases the latter is called a transliteration to distinguish it from a simple orthographic
transcript. Beware: some authors do not even use the term transcript for
the orthographic representation at all, because they reserve this term
for the phonemic or phonetic representation.
- ... file.8.6
- If your corpus is not
segmented into signal files of the size of an utterance or less,
consider incorporating such a segmentation into
the transcription. For example, the first edition of the Verbmobil II speech corpus
consisted of signal files of approx. 10 minutes length that
contained the speech of one dialogue partner over a whole dialogue. In
the transcript this long recording was segmented into dialogue turns,
numbered throughout the dialogue starting with `000'. Furthermore, the
transcribers were asked to mark up the beginning and end of each turn on the
time scale, resulting in a rough segmentation of the recording which
simplifies the later usage of the corpus considerably.
- ... brief8.7
- That is: does not need a lot
of redundant typing.
- ...
transcript8.8
- Most of these are more thoroughly explained in
section
.
- ...Senia19978.9
- www.speechdat.org/speechdat/deliverables/public/SD132V24.PDF
- ...
(BPF)8.10
- www.bas.uni-muenchen.de/Bas/BasFormatseng.html
- ... units8.11
- For instance in the Verbmobil
projects we found that segmenting a dialogue into turns may be
achieved in 5 times real-time while a phonemic segmentation required 800
times real-time.
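To see what these real-time factors mean for planning, the working time can be estimated from the recording length. A minimal sketch; the function name `annotation_hours` and the 10-minute example recording are assumptions for illustration, only the factors 5 and 800 come from the footnote:

```python
def annotation_hours(recording_minutes, realtime_factor):
    """Estimated working time in hours for a given real-time factor."""
    return recording_minutes * realtime_factor / 60


# A hypothetical 10-minute dialogue:
turn_segmentation = annotation_hours(10, 5)        # a little under one hour
phonemic_segmentation = annotation_hours(10, 800)  # more than 130 hours
```

The difference of more than two orders of magnitude is why the choice of segmentation units dominates the annotation budget.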
- ... writing8.12
- Oct 2002
- ... spontaneous8.13
- For instance the XWaves Aligner or the
HTK package,
a public domain software package developed by the University of
Cambridge,
htk.eng.cam.ac.uk/
- ... set8.14
- See for instance work
that has been done at the IMS Stuttgart,
www.ims.uni-stuttgart.de.
- ...
Modeling8.15
- Again XWaves Aligner or HTK.
- ...
MAUS8.16
- See for instance [5] or
www.bas.uni-muenchen.de/Forschung/Verbmobil/VM14.1eng.html
- ...
categories8.17
- See www.icsi.berkely.edu/~steveng
- ...
corpus8.18
- www.bas.uni-muenchen.de/Bas/BasKorporaeng.html
- ... Corpus8.19
- A corpus of telephone dialogue
recordings available at LDC, www.ldc.upenn.edu
- ... (BAS)8.20
- Contact the author Chr. Draxler,
draxler@bas.uni-muenchen.de, for more information regarding
WWWTranscribe.
- ... Currently8.21
- Oct 2002.
- ... distribution8.22
- See
www.bas.uni-muenchen.de/Forschung/BITS for updated information about the availability of WWWTranscribe.
- ...
Amsterdam8.23
- www.praat.org/
- ... files8.24
- including the NIST SPHERE
format, which is rather rare
- ... platforms8.25
- We have tested Windows,
Linux and Macintosh.
- ... tool8.26
- See lands.let.kun.nl/cgn/ehome.htm
- ... transcript.8.27
- For instance the tool HResults from the
HTK package htk.eng.cam.ac.uk
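A word-level comparison of two transcripts, in the spirit of such a tool, can be sketched as below. This is not HTK code and does not reproduce the alignment weights of HResults: it uses a plain minimum-edit-distance alignment, and the function names `align_counts` and `word_accuracy` are hypothetical. The accuracy is computed as (N - S - D - I) / N, where N is the number of reference words and S, D, I are the numbers of substitutions, deletions and insertions.

```python
def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) for one
    minimum-edit-distance alignment of two word lists."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum number of edits to turn ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back one optimal path and count the edit types.
    hits = subs = dels = inss = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            inss += 1
            j -= 1
    return hits, subs, dels, inss


def word_accuracy(ref, hyp):
    """Word accuracy in percent, relative to a non-empty reference word list."""
    hits, subs, dels, inss = align_counts(ref, hyp)
    return 100.0 * (len(ref) - subs - dels - inss) / len(ref)
```

Calling `word_accuracy("eins zwei drei vier".split(), "eins zwo drei".split())` counts one substitution and one deletion against four reference words.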
- ...
Wells9.1
- www.phon.ucl.ac.uk/home/sampa/home.htm or [Eagles1997], Part IV, C.
- ...
dictionary9.2
- For instance for German the official `Ausspracheduden'; for American English the `Webster'; for British English the `Oxford Dictionary'.
- ... pronunciation9.3
- For
instance for Spanish where there is a formal relationship between
orthographic and phonemic form.
- ... spelling9.4
- However,
for most languages there exists at least one dictionary that is widely
accepted as a standard. For instance in German the official
`Duden'; for American English the `Webster'; for British
English the `Oxford Dictionary'.
- ... files9.5
- This includes the consistent
usage of special characters like the German Umlauts.
- ... count9.6
- Sometimes referred to as empirical
pronunciation variants.
- ... HTK9.7
- HTK = Hidden Markov Model Toolkit : a public domain software package developed by the University of Cambridge, htk.eng.cam.ac.uk/
- ... sentences9.8
- A proper transliteration
should not contain any of these!
- ...
ELDA9.9
- www.icp.grenet.fr/ELRA/home.html
- ...
LDC9.10
- www.ldc.upenn.edu
- ... BAS9.11
- www.bas.uni-muenchen.de/Bas
- ... Web10.1
- Avoid formats that require
non-public-domain software to read, such as Word, StarOffice, etc.
- ...
page10.2
- www.bas.uni-muenchen.de/Bas/BasFormatseng.html
- ... authors11.1
- See
www.bas.uni-muenchen.de/Forschung/BITS/TP2/Cookbook/ for a downloadable version of this
document.
- ...
Netherlands11.2
- www.spex.nl
- ... Germany11.3
- www.bas.uni-muenchen.de/Bas
- ... Linguistics11.4
- See
www.ims.uni-stuttgart.de/phonetic/joerg/worldwide/lingphon.html
for some links to such institutions.
- ...
X%11.5
- Also, in this special example it must be stated in the
specifications whether the distribution is with regard to the number of speakers
or with regard to the
amount of material recorded by the speakers.
- ... server12.1
- 2002: We've had good experiences
with the following configurations:
- CD-R 8x, Linux, 100 MBit network
- DVD-R 2x, Macintosh, local
- DVD+RW 2x, NT, Linux server, 100 MBit network
- ...
platforms.12.2
- Although VFAT is defined for a maximum size of
125GB we found that older Linux kernels (
2.4.18) will only handle partitions of a maximum
size of 65GB. You can circumvent this problem by adding more than
one partition to the hard drive.
- ... signals12.3
- In general, signals with high sampling rates tend to contain
more redundancy than signals with lower sampling rates and are therefore easier to
compress.
- ...
speech12.4
- For instance Tony Robinson's shorten.
- ... mode12.5
- Never use a lossy (non-lossless) compression
algorithm on your speech data. Don't even think about it!
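Whether a compression step is really lossless can be verified with a round-trip test: compress, decompress, and compare bit for bit. The sketch below is an illustration only. It uses Python's general-purpose `zlib` as a stand-in for a lossless codec (it is not a speech-specific compressor), and the helper names `survives_roundtrip` and `drop_low_bits` are hypothetical.

```python
import array
import zlib


def survives_roundtrip(pcm_bytes, compress, decompress):
    """True if decompress(compress(x)) returns x bit for bit."""
    return decompress(compress(pcm_bytes)) == pcm_bytes


def drop_low_bits(pcm_bytes):
    """A toy lossy 'codec': zero the 4 least significant bits of every
    16-bit sample. Only here to show what a failed check looks like."""
    samples = array.array("h", pcm_bytes)
    return array.array("h", [s & ~0xF for s in samples]).tobytes()


pcm = array.array("h", [0, 1, -1, 1234, -1234, 32767, -32768]).tobytes()
lossless_ok = survives_roundtrip(pcm, zlib.compress, zlib.decompress)
lossy_ok = survives_roundtrip(pcm, drop_low_bits, lambda b: b)
```

Running such a check on a few signal files before compressing a whole corpus is cheap insurance.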
- ... CDROM12.6
- The latter has of course
the disadvantage that a mixed set of media has to be
distributed.
- ... format15.1
- For a detailed description of the SmartKom
transliteration format refer to www.bas.uni-muenchen.de/Forschungsprojekte/SmartKom
- ...
automatically15.2
- which is partly done in the BAS Partitur Format
of Verbmobil or SmartKom.
- ... interjections15.3
- That is:
everything that is not marked in any way is either a normal word or an
interjection. All other cases are tagged individually.
- ... sentences16.1
- A proper transliteration
should not contain any of these!
- ...
ELDA16.2
- www.icp.grenet.fr/ELRA/home.html
- ...
LDC16.3
- www.ldc.upenn.edu
- ... BAS16.4
- www.bas.uni-muenchen.de/Bas
- ... signals18.1
- See
www.bas.uni-muenchen.de/Bas/BasGermanPronunciation/ for an updated version of this document