... \today1
This document is updated frequently. Check www.bas.uni-muenchen.de/Forschung/BITS/TP1/Cookbook for the latest version.
... Processing1.1
www.bas.uni-muenchen.de/Forschung/BITS
... Munich1.2
www.bas.uni-muenchen.de/Bas
... speaking1.3
Aside from the speech signal, these time signals may be: laryngographic signal, electropalatographic signal, coordinate parameters derived from EMA (Electro-Magnetic Articulography), X-ray movie (cineradiography), coordinate parameters derived from X-ray microbeam, air flow, nuclear magnetic resonance imaging, ultrasound imaging etc. This cookbook gives no specific instructions on how to use special recording hardware for the listed signals, because that would be far beyond its scope.
... Page2.1
www.icp.grenet.fr/ELRA/org/reasons.php3
... data2.2
Aside, of course, from the natural concern that you do not want your data to be destroyed or stolen by intruders in your computer system!
... technology2.3
Ironically, the speakers of a biometric speech corpus might be the ones most vulnerable to a break-in, depending on the technology used.
... ID2.4
See also section [*], p. [*]
... ELRA's2.5
European Language Resources Association, www.icp.grenet.fr/ELRA/home.html
... Nijmegen2.6
www.spex.nl
... (LDC)2.7
www.ldc.upenn.edu
... ELRAs2.8
For example, for commercial organizations the yearly ELRA fee is EUR 1,500 while the yearly LDC fee is $20,000.
... Germany2.9
www.bas.uni-muenchen.de/Bas
... basis2.10
This does not mean that you earn no royalties for your corpus, only that BAS does not seek to make a profit by distributing it.
... (IMDI)3.1
www.mpi.nl/ISLE/index.html
... resources3.2
In this case the term `speech data' is not restricted to speech corpora as described in this cookbook. It also refers to text corpora, terminology databases and lexica.
... corpus.3.3
For information about meta data file formats see section [*] (p. [*]).
...3.33.4
Other meta data initiatives are Dublin Core, which defines a very small set of descriptors for language resources; MPEG-7, which is an attempt to define a classification system for any type of content of relevance to the home entertainment industry; and OLAC (Open Language Archives Community).
...noise3.5
Note that background noise might be played back artificially during the recording and in that case will be easy to describe.
... speakers.3.6
Noise events and cross talk may be subject to annotation techniques (see chapter [*]).
...recorded3.7
This does not necessarily match the specifications of the signals in the final speech corpus, because signals may be altered in the post-processing (chapter [*]). For instance, signals are very often recorded with a 48 kHz sampling frequency and then filtered and down-sampled to a lower sampling frequency in the post-processing.
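As an illustration of the filter-then-down-sample step mentioned in this footnote, here is a minimal sketch in plain Python. The function names lowpass_fir and downsample, the Hamming-windowed sinc design, the default of 101 taps (assumed odd) and the 0.9 safety margin on the cut-off are all assumptions made for this example and are not taken from the cookbook. In practice a dedicated signal processing tool would normally be used.

```python
import math

def lowpass_fir(cutoff, num_taps=101):
    """Hamming-windowed sinc low-pass filter.

    cutoff is given as a fraction of the sampling rate (0 < cutoff < 0.5).
    num_taps is assumed to be odd so that the filter is symmetric around
    a centre tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at 0 Hz

def downsample(samples, factor, num_taps=101):
    """Low-pass filter, then keep every factor-th sample.

    The cut-off is placed slightly below the new Nyquist frequency
    (margin of 0.9 is an assumption of this sketch). Samples outside
    the signal are treated as zero.
    """
    taps = lowpass_fir(0.9 * 0.5 / factor, num_taps)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out
```

With these assumptions, downsample(samples, 3) would take a recording sampled at 48 kHz down to 16 kHz; the first and last few output samples are affected by the zero padding at the signal edges.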
... birth3.8
Do not use the age of the speaker at the time of recording, because you might record the same speaker in a different corpus/release later and want to re-use the speaker profile information.
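Storing the date of birth still lets you derive the speaker's age for any particular recording date. A minimal sketch follows; the function name age_at and the dates used with it are hypothetical and serve only as an illustration.

```python
from datetime import date

def age_at(birth, recording):
    """Return the age in completed years on the recording date."""
    years = recording.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the recording year.
    if (recording.month, recording.day) < (birth.month, birth.day):
        years -= 1
    return years
```

The same speaker profile can then be re-used for a later corpus by calling age_at with the new recording date.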
... digits4.1
German has two word forms for the digit `2': zwei and zwo.
... corpus4.2
Of course this only makes sense if you're not interested in these adaptive effects!
... followed.4.3
Please note that the annotation phase is in most cases necessary anyway!
... world'.4.4
Therefore some authors call them `real world recordings'.
... EMA4.5
EMA = Electro-Magnetic Articulography
... setup.4.6
If not, think about it: it does not increase the effort significantly, but it will increase the value of your corpus.
... SAM4.7
See for instance [2], Part IV, C.
... SPHERE4.8
www.nist.gov/speech
... RIFF4.9
ccrma-www.stanford.edu/CCRMA/Courses/422/projects/WaveFormat
... SHORTEN4.10
www.hornig.net/shorten.html
... Format4.11
www.icp.inpg.fr/Relator/standsam.html or [2], Part IV, C.
... Format)4.12
www.mpi.nl/DOBES/tools/Eudico-Annotation-Tool.pdf
... Format)4.13
www.bas.uni-muenchen.de/Bas/BasFormatseng.html
... formats.4.14
BAS provides a public domain tool, par2ags.pl, that transforms BPF into Bird's annotation graph file format (XML).
... signal4.15
See section [*] for details.
...IMDI4.16
www.mpi.nl/ISLE/
... X-SAMPA4.17
www.phon.ucl.ac.uk/home/sampa/home.htm or [2], Part IV, B.
... them.4.18
There is no such thing as `the pronunciation of a word'. Your lexicon will always contain word forms where the most likely pronunciation is debatable. For a more detailed discussion of dictionary contents please refer to chapter [*].
... paper5.1
Often this may result in unwanted background noise such as paper rustle, page turning etc.
... adjust5.2
Typically a threshold and two timing parameters for speech and silence: when the signal is higher than the threshold for more than T1, speech is starting; when the signal stays under the threshold for longer than T2, speech has ended.
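The threshold rule in this footnote can be sketched as follows. This is only an illustration under stated assumptions: the function name detect_speech is hypothetical, the signal is assumed to be given as a list of per-frame levels (for instance short-term energy), and T1 and T2 are expressed as frame counts (t1, t2) rather than as times.

```python
def detect_speech(levels, threshold, t1, t2):
    """Return (start, end) frame indices of the first speech segment, or None.

    start is the first frame of a run that stays above the threshold for
    more than t1 frames; end is the first frame of the following run that
    stays at or below the threshold for more than t2 frames (exclusive end).
    """
    start = None
    above = 0  # consecutive frames above the threshold
    below = 0  # consecutive frames at or below the threshold
    for i, level in enumerate(levels):
        if level > threshold:
            above += 1
            below = 0
            if start is None and above > t1:
                start = i - above + 1
        else:
            below += 1
            above = 0
            if start is not None and below > t2:
                return (start, i - below + 1)
    if start is not None:
        return (start, len(levels))  # speech runs until the end of the input
    return None
```

Short pauses of at most t2 frames inside an utterance do not end the segment, and bursts of at most t1 frames do not start one, which is the purpose of the two timing parameters.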
... (EU)5.3
MIME types: audio/x-alaw-basic or audio/x-ulaw-basic
... `echo-canceling'5.4
`Echo' in this context means that the signal sent to an analog telephone will be heard, with a certain time delay, in the channel coming from the analog telephone.
... Clippings5.5
Samples that have the maximum value of your sample format, for instance +32767 for 16-bit samples.
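A simple check for such samples might look like the sketch below. The function name count_clipped is hypothetical, the samples are assumed to be signed integers, and counting the most negative value as well as the maximum is an assumption of this example that goes beyond the definition given in the footnote.

```python
def count_clipped(samples, bits=16):
    """Count samples that sit at the limits of a signed integer sample format."""
    full_scale_max = 2 ** (bits - 1) - 1   # +32767 for 16 bit
    full_scale_min = -(2 ** (bits - 1))    # -32768 for 16 bit (assumption: also counted)
    return sum(1 for s in samples
               if s >= full_scale_max or s <= full_scale_min)
```

A non-zero count does not prove that the recording level was too high, but several such samples in a row are a strong hint.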
... significantly.5.6
You may mark the position of furniture by taping markers on the floor.
... laptop.5.7
In most cases you can hear the spinning up of the hard drive quite clearly.
... only5.8
Adaptors for 12 V DC are available for most laptops.
... machine'5.9
The application that is simulated by the WOZ experiment. For example, if you need a camera in a simple command and control recording, tell the speakers that the machine uses a camera to detect the point in time when they start speaking.
... button5.10
This also increases the `machine-likeness' of the simulation because these pre-recorded answers sound exactly the same every time.
... system.5.11
In the SmartKom WOZ experiments we asked speakers to solve a simple task with the help of the `virtual machine'. However, one of the speakers finished the task very quickly and spent the rest of the recording time testing the system with a kind of von Neumann test: he repeatedly asked the system to meet him at the cinema and maybe later to have dinner together. Fortunately, our wizard kept a straight face (straight voice) and kept on hitting the button saying `Sorry Sir, I did not understand. Could you please state your question again?' again and again and again...
... staff.5.12
Believe us: the bugs are there!
... developments6.1
For instance to strengthen the recruitment efforts in areas that are not covered yet.
... entirely6.2
In fact this has been done in BAS corpus productions, but we do not recommend it. The reason is that the cost of online monitoring is often higher than that of annotation after the recording. Furthermore, a post-recording annotation may find errors that go unnoticed even in a monitored recording, and additional characteristics of the speech signal may be annotated that the supervisor was not able to detect (for instance a disturbance of the signal caused by a malfunctioning recording device). Finally, we think that a speech corpus of monitored speech utterances is in most cases not what the users of speech corpora really need: the scientist or developer of a speech application has to cope with errors in the spoken language input. Such errors should therefore not be omitted from the corpus but rather labeled. Only corpora for speech synthesis might justify a monitored recording technique.
... accordingly.6.3
See for instance [8] for a description of data pipelining in the SmartKom project.
... lucky6.4
For instance, if you work in a company and recruit its own employees as speakers.
... collection6.5
Please also note the sections about recruitment in chapter [*], p. [*], and chapter [*], p. [*].
... it6.6
If you're working for a non-profit organization this is much easier than if you work for a company.
... EUR6.7
All numbers taken from the year 2002.
... arbitrary7.1
Except for filtering before down-sampling!
... files7.2
Correct with regard to your specified terminology, see section [*], p. [*].
... recording7.3
This process is sometimes also referred to as segmentation, but we prefer the term editing to distinguish it from the segmentation of speech into linguistic units.
... speak7.4
Of course in this case no truly spontaneous conversation is possible, because each partner always has to wait for the other to finish.
... file7.5
Using a simple conference call and an ISDN card.
... package7.6
www.phon.ucl.ac.uk/resource/sfs/
... down-sampling7.7
Nyquist or Shannon theorem: the signal must not contain frequency components at or above half the (new) sampling frequency.
...sox7.8
www.spies.com/Sox/
... platforms7.9
Our most sincere regrets.
... PC7.10
www.cygwin.com/
... symbolic8.1
Symbolic in the sense of discrete or categorical.
... Format8.2
www.bas.uni-muenchen.de/Bas/BasFormatseng.html
... Bird8.3
See for instance [1].
... LDC8.4
agtk.sourceforge.net/
... layers8.5
In some cases the latter is called a transliteration to distinguish it from a simple orthographic transcript. Beware: some authors do not even use the term transcript for the orthographic representation at all, because they reserve this term for the phonemic or phonetic representation.
... file.8.6
If your corpus is not segmented into signal files of the size of an utterance or less, consider incorporating such a segmentation into the transcription. For example, the first edition of the Verbmobil II speech corpus consisted of signal files of approx. 10 minutes in length that contained the speech of one dialogue partner over a whole dialogue. In the transcript this long recording was segmented into dialogue turns, which were numbered throughout the dialogue starting with `000'. Furthermore, the transcribers were asked to mark up the beginning and end of each turn on the time scale, resulting in a rough segmentation of the recording which simplifies the later usage of the corpus considerably.
... brief8.7
That is, it does not require a lot of redundant typing.
... transcript8.8
Most of these are more thoroughly explained in section [*].
...Senia19978.9
www.speechdat.org/speechdat/deliverables/public/SD132V24.PDF
... (BPF)8.10
www.bas.uni-muenchen.de/Bas/BasFormatseng.html
... units8.11
For instance, in the Verbmobil projects we found that segmenting a dialogue into turns can be achieved in 5 times real-time, while a phonemic segmentation required 800 times real-time.
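To see what these real-time factors mean for planning, the working time can be estimated from the duration of the recorded material. The helper below is a hypothetical illustration; only the factors 5 and 800 come from the footnote.

```python
def annotation_hours(recording_minutes, real_time_factor):
    """Estimated working time in hours for a given amount of recorded speech."""
    return recording_minutes * real_time_factor / 60.0
```

For one hour of recorded dialogue this gives about 5 hours of work for turn segmentation and about 800 hours for phonemic segmentation.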
... writing8.12
Oct 2002
... spontaneous8.13
For instance the XWaves Aligner or the HTK package, a public domain software package developed by the University of Cambridge, htk.eng.cam.ac.uk/
... set8.14
See for instance work that has been done at the IMS Stuttgart, www.ims.uni-stuttgart.de.
... Modeling8.15
Again XWaves Aligner or HTK.
... MAUS8.16
See for instance [5] or www.bas.uni-muenchen.de/Forschung/Verbmobil/VM14.1eng.html
... categories8.17
See www.icsi.berkely.edu/~steveng
... corpus8.18
www.bas.uni-muenchen.de/Bas/BasKorporaeng.html
... Corpus8.19
A corpus of telephone dialogue recordings available at LDC, www.ldc.upenn.edu
... (BAS)8.20
Contact the author Chr. Draxler, draxler@bas.uni-muenchen.de, for more information regarding WWWTranscribe.
... Currently8.21
Oct 2002.
... distribution8.22
See www.bas.uni-muenchen.de/Forschung/BITS for updated information about the availability of WWWTranscribe.
... Amsterdam8.23
www.praat.org/
... files8.24
including the NIST SPHERE format, support for which is rather rare
... platforms8.25
We have tested Windows, Linux and Macintosh.
... tool8.26
See lands.let.kun.nl/cgn/ehome.htm
... transcript.8.27
For instance the tool HResults from the HTK package htk.eng.cam.ac.uk
... Wells9.1
www.phon.ucl.ac.uk/home/sampa/home.htm or Eagles (1997), Part IV, C.
... dictionary9.2
For instance for German the official `Ausspracheduden'; for American English the `Webster'; for British English the `Oxford Dictionary'.
... pronunciation9.3
For instance for Spanish where there is a formal relationship between orthographic and phonemic form.
... spelling9.4
However, for most languages there exists at least one dictionary that is widely accepted as a standard. For instance in German the official `Duden'; for American English the `Webster'; for British English the `Oxford Dictionary'.
... files9.5
This includes the consistent usage of special characters like the German Umlauts.
... count9.6
Sometimes referred to as empirical pronunciation variants.
... HTK9.7
HTK = Hidden Markov Model Toolkit: a public domain software package developed by the University of Cambridge, htk.eng.cam.ac.uk/
... sentences9.8
A proper transliteration should not contain any of these!
... ELDA9.9
www.icp.grenet.fr/ELRA/home.html
... LDC9.10
www.ldc.upenn.edu
... BAS9.11
www.bas.uni-muenchen.de/Bas
... Web10.1
Avoid formats that require non-public-domain software to read, such as Word, StarOffice, etc.
... page10.2
www.bas.uni-muenchen.de/Bas/BasFormatseng.html
... authors11.1
See www.bas.uni-muenchen.de/Forschung/BITS/TP2/Cookbook/ for a downloadable version of this document.
... Netherlands11.2
www.spex.nl
... Germany11.3
www.bas.uni-muenchen.de/Bas
... Linguistics11.4
See www.ims.uni-stuttgart.de/phonetic/joerg/worldwide/lingphon.html for some links to such institutions.
... X\%11.5
Also, in this particular example the specifications must state whether the distribution refers to the number of speakers or to the amount of material recorded by the speakers.
... server12.1
2002: We've had good experiences with the following configurations:
- CD-R 8x, Linux, 100 MBit network
- DVD-R 2x, Macintosh, local
- DVD+RW 2x, NT, Linux server, 100 MBit network
... platforms.12.2
Although VFAT is defined for a maximum size of 125 GB, we found that older Linux kernels (< 2.4.18) will only handle partitions of a maximum size of 65 GB. You can circumvent this problem by adding more than one partition to the hard drive.
... signals12.3
In general, signals with high sampling rates tend to contain more redundancy than signals with lower sampling rates and are therefore easier to compress.
... speech12.4
For instance, Tony Robinson's shorten.
... mode12.5
Never use a lossy compression algorithm on your speech data. Don't even think about it!
... CDROM12.6
The latter of course has the disadvantage that a mixed set of media has to be distributed.
... format15.1
For a detailed description of the SmartKom transliteration format refer to www.bas.uni-muenchen.de/Forschungsprojekte/SmartKom
... automatically15.2
which is partly done in the BAS Partitur Format of Verbmobil or SmartKom.
... interjections15.3
That is: everything that is not marked in any way is either a normal word or an interjection. All other cases are tagged individually.
... sentences16.1
A proper transliteration should not contain any of these!
... ELDA16.2
www.icp.grenet.fr/ELRA/home.html
... LDC16.3
www.ldc.upenn.edu
... BAS16.4
www.bas.uni-muenchen.de/Bas
... signals18.1
See www.bas.uni-muenchen.de/Bas/BasGermanPronunciation/ for an updated version of this document