The Munich Automatic Segmentation System MAUS


Same page in German

Contact: Florian Schiel


What is meant by SEGMENTATION?

The aim of the phonetic sciences is to analyze the correlation between linguistic categories (e.g. word, syllable, phone) and the corresponding signals (e.g. acoustic signal, spectrum, articulatory signal, neuronal signals). Usually, categories are mapped onto the corresponding sections of the signal according to the aim of the analysis in question. This results in a partition of the signal into segments, known as a segmentation.
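To make the notion of a partition concrete, the following is a minimal sketch of how a segmentation could be represented and checked. It is not taken from MAUS: the `Segment` type, the `is_partition` helper, the labels and the time values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One labelled stretch of the signal (hypothetical illustration type)."""
    label: str    # linguistic category, e.g. a phone symbol
    start: float  # start time in seconds
    end: float    # end time in seconds


def is_partition(segments: List[Segment], duration: float, eps: float = 1e-9) -> bool:
    """Return True if the segments cover [0, duration] without gaps or overlaps."""
    if not segments:
        return False
    ordered = sorted(segments, key=lambda s: s.start)
    # The first segment must start at 0 and the last must end at the signal end.
    if abs(ordered[0].start) > eps or abs(ordered[-1].end - duration) > eps:
        return False
    # Every segment must have a positive length.
    if any(seg.end <= seg.start for seg in ordered):
        return False
    # Each segment must begin exactly where the previous one ends.
    return all(abs(a.end - b.start) <= eps for a, b in zip(ordered, ordered[1:]))


# Invented example values: three phone-like segments over a 0.40 s signal.
example = [
    Segment("m", 0.00, 0.08),
    Segment("aU", 0.08, 0.27),
    Segment("s", 0.27, 0.40),
]
print(is_partition(example, 0.40))
```

A check of this kind only verifies the formal property (no gaps, no overlaps); whether the boundaries are placed sensibly still depends on the aim of the analysis.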

Because such an analysis is subjective (it depends on the observer and on the hypothesis under investigation, so different studies require different descriptions), segmentations are produced manually according to the aspects relevant to the analysis. These data are produced carefully and are of the highest possible reliability, which is an absolute prerequisite for experimental phonetic work.

Good training and considerable experience are needed to produce careful and reliable manual segmentations, and even then the work is time-consuming. Therefore, high-quality segmentations can only be produced for a small amount of data.

In digital speech processing, especially in automatic speech recognition (ASR), a large amount of segmented data is needed. Producing manual segmentations for this purpose is uneconomical. Therefore, automatic procedures are developed that segment a large amount of data in a relatively short time. This is only possible at the cost of lower segmentation quality. On the one hand, the loss can be traced back to the imprecise analysis of the acoustic signal (the computer cannot find categories and depends on signal-inherent criteria or on insufficiently analysed empirical correlations); on the other hand, it can be traced back to the absence of a hypothesis grounded in an experimental background.
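One common way to put a number on the difference between an automatic and a manual segmentation is to count how many manual boundaries have an automatic boundary nearby. The sketch below illustrates that idea only; the function name, the 20 ms tolerance and the boundary times are assumptions made for this example and are not taken from MAUS or its evaluation.

```python
from typing import Sequence


def boundary_agreement(
    manual: Sequence[float],
    automatic: Sequence[float],
    tolerance: float = 0.02,  # assumed tolerance in seconds (20 ms)
) -> float:
    """Fraction of manual boundaries that have an automatic boundary within tolerance."""
    if not manual:
        raise ValueError("manual boundary list is empty")
    hits = sum(
        1
        for m in manual
        if any(abs(m - a) <= tolerance for a in automatic)
    )
    return hits / len(manual)


# Invented boundary times in seconds, for illustration only.
manual = [0.08, 0.27, 0.40]
automatic = [0.09, 0.31, 0.40]
print(boundary_agreement(manual, automatic))
```

A measure like this captures only the timing of boundaries; it says nothing about whether the labels assigned to the segments agree.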

In return, a large amount of segmented material can be offered for research and development in technical speech processing that takes phonetic information into account. Conversely, successes in the field of speech processing lead to decisive improvements in automatic segmentation.


Short Description of MAUS

Input: speech signal and related orthographic representation

Output: fully automatically produced segmentation on the phonemic level

technical implementation:

steps of processing (for German):


Download of MAUS

The MAUS download package comprises a number of scripts and binaries to be run under Linux. It can also be run under Windows using Cygwin (not contained in the package).
The main scripts of the package are: and other tools to convert and display S&Ls.
Furthermore, parameter sets and acoustic models for German (and other languages) are included in the package.
The algorithm that automatically learns the statistical rule set from data is not included.

Download


Publications on MAUS

Verbmobil Memos:

Conference papers:

Dissertations:

