Data collection

In the first four-year phase (1993-1996), our main task was the creation of a new database of spontaneously spoken German. The Institut für Phonetik und digitale Sprachverarbeitung (IPDS) in Kiel and the Institut für Kommunikation und Phonetik (IKP) in Bonn were also involved in this task.

The corpus contains recorded dialogs between two German speakers working on a given task: the scheduling of several business appointments (meetings, dinners, business trips, etc.). After two

An example of such a dialog can be found here.

Up-to-date information about the latest VERBMOBIL CDROMs can be found here.

The dialogs are recorded at all three participating institutes, and additional recordings are provided by the University of Karlsruhe. The recordings are collected in Munich and distributed on CDROM. The orthographic transcription ('transliteration') of each dialog is created by phonetically trained listeners and includes noise classification and all effects of spontaneous speech (e.g. hesitations, interruptions, ...). A handbook of data collection and transliteration in TP14 of VERBMOBIL was produced by IPDS Kiel (VERBMOBIL The new version of the conventions for transliteration can be found here. During the first phase, 9 CDROMs, each containing about 500 MB of data, were distributed. Of a total of 1954 appointments, 763 were recorded at our site.

The corresponding transliterations are distributed on the official VERBMOBIL ftp server in Saarbrücken.

Furthermore, some special recordings were made in Munich and Bonn:

This database and the corresponding transliterations form the basis for the development of the speech recognizer and for research on dialog handling, semantics and syntax of spontaneous speech within the VERBMOBIL project.

Automatic Speech Verification

As a second task, a fully automatic segmentation system based on Hidden Markov Models was developed. In a first alpha version, the system showed the expected errors at the points where speech recognition systems typically fail. These results were made available to other partners with the aim of improving weak points of standard speech recognition algorithms. The alpha version was distributed to several VERBMOBIL partners for automatic segmentation tasks (mainly for speech synthesis). A source code version was exported to the Institut für Kommunikation und Phonetik (IKP) in Bonn.
The next step in this task will be the development of a segmentation algorithm that uses a rule-based system to recognize the most common word variations in German. This includes the development of a graph-oriented representation for variations of whole utterances and a Viterbi search over such structures. Another point of interest is the re-evaluation of the output by a rule-based system that incorporates expert knowledge about manual transcription.
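
To illustrate the idea of a Viterbi search over a graph of pronunciation variants, the following is a minimal sketch, not the project's actual implementation. It assumes that each graph node carries one phone label, that every node may be repeated over several frames, and that frame-wise log-likelihoods per phone are already available (in a real system they would come from the HMM acoustic models). The function names, the toy graph for the word "haben" (canonical form versus a reduced variant) and all score values are hypothetical and chosen only for illustration.

```python
NEG_INF = float("-inf")


def viterbi_align(successors, labels, start_nodes, end_nodes, frame_scores):
    """Find the best path through a pronunciation-variant graph.

    successors:   dict node -> list of following nodes (acyclic graph)
    labels:       dict node -> phone label
    frame_scores: list (one entry per frame) of dict phone -> log-likelihood
    Returns (total log score, list with one node per frame).
    """
    nodes = list(labels)
    # Every node may follow itself (the phone lasts more than one frame).
    preds = {n: [n] for n in nodes}
    for n, succs in successors.items():
        for s in succs:
            preds[s].append(n)

    n_frames = len(frame_scores)
    score = [{n: NEG_INF for n in nodes} for _ in range(n_frames)]
    back = [{n: None for n in nodes} for _ in range(n_frames)]

    for n in start_nodes:
        score[0][n] = frame_scores[0][labels[n]]

    for t in range(1, n_frames):
        for n in nodes:
            best = max(preds[n], key=lambda p: score[t - 1][p])
            if score[t - 1][best] > NEG_INF:
                score[t][n] = score[t - 1][best] + frame_scores[t][labels[n]]
                back[t][n] = best

    last = max(end_nodes, key=lambda n: score[n_frames - 1][n])
    if score[n_frames - 1][last] == NEG_INF:
        raise ValueError("no complete path fits into the given frames")

    # Trace the best path back from the last frame to the first.
    path = [last]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return score[n_frames - 1][last], path


def to_segments(path, labels):
    """Collapse a frame-wise node path into (phone, first_frame, last_frame)."""
    segments = []
    for t, node in enumerate(path):
        if segments and segments[-1][0] == node:
            segments[-1][2] = t
        else:
            segments.append([node, t, t])
    return [(labels[n], a, b) for n, a, b in segments]


# Hypothetical variant graph for "haben": h a: b @ n  or reduced  h a: b m
labels = {0: "h", 1: "a:", 2: "b", 3: "@", 4: "n", 5: "m"}
successors = {0: [1], 1: [2], 2: [3, 5], 3: [4]}

# Invented scores: -1.0 for the phone favoured in a frame, -5.0 otherwise.
favoured = ["h", "a:", "a:", "b", "m", "m"]
phones = set(labels.values())
frame_scores = [{p: (-1.0 if p == f else -5.0) for p in phones} for f in favoured]

best_score, best_path = viterbi_align(successors, labels, [0], [4, 5], frame_scores)
segments = to_segments(best_path, labels)
print(best_score, segments)
```

With these invented scores the search selects the reduced variant, because the last two frames fit "m" better than the "@ n" sequence of the canonical form. The resulting segment boundaries are frame indices; a rule-based component as described above could then re-evaluate them.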


Florian Schiel