Bavarian Archive for Speech Signals


This page was last updated 2021-05-01

Please note that selected corpora of this catalogue, and other corpora not listed here, may be downloaded free of charge by academic users from the CLARIN Repository (some of them are marked with (*) in the following).

Speech Corpora

(If not stated otherwise, the language of the corpora is German.)

Entire Catalog

Presently the following corpora are available on CD-R, DVD-R, hard disk or online. Note that a subset of these corpora is also accessible online for members of academic institutions and for licensees of BAS resources in the BAS CLARIN Repository (tagged with (*) in the following list).

The following speech corpora are accessible exclusively via the BAS CLARIN Repository; commercial usage is possible in some cases (inquiries via

Some audio files of the available corpora.

The TED corpus is currently distributed by ELDA. BAS will therefore disseminate further copies of the corpus only once this first edition has run out.

For further questions or orders please contact

Corpora for commercial usage

Most speech corpora of BAS are available for commercial usage. Under commercial usage we subsume any development of speech technology on the basis of the BAS data and the commercial exploitation of products that were developed on the basis of the BAS data. Commercial usage does not include the direct exploitation of the data; that is, no BAS data may be distributed to third parties under any circumstances. Some BAS corpora require a special license fee for commercial usage; see the corpus pages for details.

Corpora of read speech

The following corpora contain read speech, some of them recorded as a dictation task:

Corpora of spontaneous speech

The following corpora contain recorded spontaneous speech:

Corpora of accented/dialectal/alcoholized speech

The following corpora contain speech with classified (foreign) accent / dialect:

Corpora with telephone speech

The following BAS corpora contain speech recorded via public telephone lines (traditional and cellular, GSM):

Corpora with high quality speech

'High quality speech' denotes recordings made with a sampling frequency of at least 16 kHz and in a controlled environment (studio). The following BAS corpora contain high quality speech:

Processing and Evaluation

Before distribution the BAS corpora are evaluated for certain formal properties (BAS Revalidation). These properties include:

After passing this formal evaluation, the corpora are stored as 'master volumes' in our archive. They are linked to a central documentation and software server. If there is an order, the volumes are copied to CD-ROM and distributed (press on demand), or online access is granted via the BAS CLARIN Repository.

In a second step the signals are analysed in more detail. An automatic segmentation into phonemes and words is carried out (MAUS), deviations from the canonical word form are detected, and other features are extracted. All results from further analysis are stored in the BAS Partitur Format (BPF).
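As an illustration of how such segmentation results might be read back, the following is a minimal Python sketch that extracts a phonetic segmentation tier from BPF text. The page itself does not describe the file layout, so everything here is an assumption to be checked against the BPF documentation: that a header entry `SAM:` gives the sampling rate, and that each `MAU:` line holds a start sample, a duration in samples, a word index (-1 for segments outside any word) and a label. The function name `parse_mau_tier` and the sample text are hypothetical.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start_s: float     # segment start in seconds
    duration_s: float  # segment duration in seconds
    word_index: int    # index of the linked word (-1: no word, e.g. a pause)
    label: str         # phonetic label


def parse_mau_tier(bpf_text: str) -> List[Segment]:
    """Collect MAU segments from BPF text (assumed layout, see lead-in)."""
    sample_rate = None
    segments = []
    for line in bpf_text.splitlines():
        # Split at the first colon only, since labels may contain colons.
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key == "SAM" and fields:
            sample_rate = int(fields[0])
        elif key == "MAU" and len(fields) >= 4:
            if sample_rate is None:
                raise ValueError("SAM entry must precede MAU lines")
            start, duration, word_index = (int(f) for f in fields[:3])
            segments.append(
                Segment(
                    start / sample_rate,
                    duration / sample_rate,
                    word_index,
                    " ".join(fields[3:]),
                )
            )
    return segments


# Hypothetical example content, not taken from any BAS corpus.
EXAMPLE = """LHD: Partitur 1.3
SAM: 16000
LBD:
ORT: 0 hallo
MAU: 0 1599 -1 <p:>
MAU: 1600 799 0 h
MAU: 2400 1599 0 a
"""

if __name__ == "__main__":
    for seg in parse_mau_tier(EXAMPLE):
        print(seg)
```

Converting sample counts to seconds at read time keeps downstream code independent of the sampling rate of a particular corpus.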

In a sub-project of the German BITS project (TP8) all available BAS corpora have been re-validated against public guidelines. The results of this re-validation will be published on the BITS webserver.
Within the CLARIN initiative these validation guidelines must be followed before a corpus is published in the BAS CLARIN Repository.

File Formats and Software

Most of the disseminated speech corpora of BAS contain signal files in RIFF WAVE and NIST SPHERE formats. Some corpora contain SAM annotation formats.
A description of the formats used in BAS corpora can be found
Of course all formats are described in detail in the accompanying corpus documentation (you can access most of these on-line by looking up the WWW page of the corpus).
Last but not least, each BAS corpus includes a small collection of software and ANSI C functions for accessing the signal files.
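For readers who prefer a scripting language to the supplied C functions, the following minimal Python sketch reads the text header of a NIST SPHERE file. It is not part of the BAS software. It assumes the usual SPHERE layout: a first line `NIST_1A`, a second line giving the header size in bytes, then `name -type value` entries terminated by `end_head`. The function name `parse_sphere_header` and the example header are hypothetical.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Return the header fields of a NIST SPHERE file as a dictionary."""
    if data[:8] != b"NIST_1A\n":
        raise ValueError("not a NIST SPHERE file")
    # The second 8-byte line holds the total header size, e.g. "   1024\n".
    header_size = int(data[8:16].decode("ascii").strip())
    header = data[16:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in header.split("\n"):
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, field_type, value = parts
        if field_type == "-i":      # integer field
            fields[name] = int(value)
        elif field_type == "-r":    # real-valued field
            fields[name] = float(value)
        else:                       # string field, e.g. "-s2"
            fields[name] = value
    return fields


# Hypothetical header for demonstration, padded to the declared size.
EXAMPLE_HEADER = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 16000\n"
    b"channel_count -i 1\n"
    b"sample_n_bytes -i 2\n"
    b"sample_byte_format -s2 01\n"
    b"end_head\n"
).ljust(1024, b" ")

if __name__ == "__main__":
    info = parse_sphere_header(EXAMPLE_HEADER)
    print(info["sample_rate"], info["channel_count"])
```

With a real file, the first 1024 bytes read in binary mode would normally be enough to pass to the function; the sample data follow directly after the header.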


The following section gives some of the most common uses of BAS speech corpora.

Automatic speech recognition

To initialize statistically based speech recognition applications, phonetically labelled and segmented corpora are needed.
The following corpora may be used for this purpose:
For embedded training without segmentation (after bootstrapping):

Human - machine interaction (HMI)

Speech synthesis

For PSOLA synthesis all corpora with segmental information may be used (corpora with automatically segmented speech are given in brackets):

Speaker recognition, verification, adaptation, paralinguistic classification

PD1 and SI100 contain a variety of speakers of both sexes and different ages.

Empirical phonetic investigations

All BAS corpora with manual segmentations may be used. Since manually segmented data are naturally scarce, it may be wise to use automatically segmented data as well (given in brackets):

Prosodic investigations

Foreign accents / speaker characteristics

Dialectal variation

Copyright © 1995-2016 Bayerisches Archiv für Sprachsignale, Universität München
This page and all other pages with the initial 'BAS' or 'Bas' in the filename may be copied, printed and distributed to other parties, under the condition that the pages are distributed as shown here. Parts of pages or extended pages may not be distributed further without permission of the BAS.

Florian Schiel