Bavarian Archive for Speech Signals



BAS - General Information

  • Duration: founded on 1 January 1995
  • Financing: funding from the Bavarian State, the University of Munich and cooperation projects
  • Staff:
    • Dr. Florian Schiel
    • Dr. Phil Hoole
    • Dr. Christoph Draxler
  • Address:

    Bavarian Archive for Speech Signals
    c/o Institut für Phonetik, Universität München
    Schellingstr. 3 / II
    80799 München
    Tel.: +49-89-2180-2758
    Fax: +49-89-2800362
    Email: bas@phonetik.uni-muenchen.de


The Bavarian Archive for Speech Signals (BAS) is a public institution hosted by the University of Munich. This institution was founded with the aim of making corpora of current spoken German available to both the basic research and the speech technology communities via a maximally comprehensive digital speech-signal database. The speech material will be structured in a manner allowing flexible and precise access, with acoustic-phonetic and linguistic-phonetic evaluation forming an integral part of it.

Tasks

The last few years have seen an abrupt increase in the demand for large speech-signal data collections, both from academic investigators carrying out basic research and from engineers in industry working in the new integrated field of speech and information technology. There are many reasons for this, but the main one is the breakneck pace of hardware and software development in speech signal processing. The growing number of techniques for acoustic-phonetic signal processing, together with the growing amount of speech data that can be handled and processed efficiently, generates a demand not only for linguistically interesting text material (which of course emerges automatically from the modern printing industry) but also for reliably acquired and phonetically evaluated spoken language material.

A number of national and international initiatives (such as BDSON, PHONDAT, LDC, SPEX or COCOSDA) have, it is true, already resulted in the collection and distribution of large speech corpora. However, these corpora exhibit a variety of formats, corresponding to the variety of aims pursued. For German, a central institution that could carry out such tasks with a long-term perspective was clearly lacking. BAS will take on this role in Germany: it will collect and maintain distributable corpora of spoken German and make them available in standardized form.

In addition, BAS will develop its own procedures for automatic labelling and segmentation, making the results available with the distributed speech corpora.

BAS was entrusted by the Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie (BMBF) with the task of maintaining both existing and future databases set up within BMBF-funded projects, and of exporting them (after any restrictions on availability have expired) within the EU as well as to the Linguistic Data Consortium (LDC). Imported databases are to be converted by BAS to a standardized form, enabling them to be exploited in all BMBF-funded speech projects for a fraction of the cost and effort usually incurred.

Aims

The first aim of BAS will be to satisfy the immediate demand for spoken language data recorded under controlled conditions of the kind required for speech technology development in German. This will include the development of new techniques for efficient handling of, and access to, very large quantities of phonetic data, independent of the location and the nature of the storage. In addition to typical application-oriented corpora such as Polyphone, this first aim will concentrate on establishing a representative database of publicly spoken German.

The second aim is the long-term development of a (more or less) Complete Phonetic Theory (CPT) of spoken German. For this endeavour, the central category will no longer be the speech sound but rather the word as the lexically given unit. The great variability characterizing the pronunciation of words in running speech, as opposed to citation form, will be systematically documented and related to the communicative information content.

External cooperations

The Leibniz-Rechenzentrum München (LRZ), which is connected to the site via a fiber-optic data link, provides the Archive with mass storage and network support within the framework of the TERABACK project.

The BAS is keen to cooperate with all institutions in the German-speaking area that are interested in contributing to the common goal. Most of the projects will be financed by interested partners in industry, by public grants or by European projects.

The BAS produces speech resources through either public funding or industrial cooperation. Speech resources funded exclusively by public money are available to everybody without restrictions immediately after release. Industrial partners that have contributed significantly to the production of a resource are granted a period of one year after release in which to exploit the data exclusively. After that period the resource is distributed via the BAS, either unrestricted or under license.

Staff

Christoph Draxler studied Computer Science at the Technical University of Munich, Germany. He earned his PhD in the field of databases from the University of Zurich, Switzerland, in 1991. Since 1991 he has been working at the Institut für Phonetik und Sprachliche Kommunikation, mainly within the PhonDat and VERBMOBIL projects. His main interests include logic programming, databases and multimedia applications.

Phil Hoole

Florian Schiel received his Dipl.-Ing. and Dr.-Ing. degrees from the Technical University of Munich in 1990 and 1993 respectively, both in electrical engineering. Since 1993 he has been with the Institute of Phonetics, University of Munich, participating in the VERBMOBIL project. His main interests are speaker adaptation, German phonetics, computational phonology and the automatic analysis of very large speech corpora.