Last update of this page: 2018-10-11
Munich AUtomatic Segmentation System (MAUS) -
Available
The final aim of the MAUS project is the fully automatic
annotation of arbitrary speech utterances. In its final design, MAUS will
produce the following output:
using only the following input:
A short description of MAUS can be found
here or
here (German only).
Please also refer to our
BAS publications.
MAUS is available both as a freeware package and as a web service.
The MAUS project is closely connected to the development of the Partitur Format, which offers a simple and well-structured way to represent categorical information about speech signals.
The development of MAUS was partly funded by the German Verbmobil Project.
On the other hand, very large, well-structured linguistic lexica exist, but they
do not contain any information about pronunciation.
A common workaround is to use automatic grapheme-to-phoneme
converters, which do not work perfectly and are usually not
freely available.
The first aim of the BAS PHONOLEX Project
is the development
of a list of canonical pronunciations that covers approx. 95 % of written German
(including all possible inflections).
This task is carried out in close cooperation with the University of
Saarbrücken (Prof. Uszkoreit), the University of Bonn
(Dr. Stock) and the University of Leipzig (Dr. Quasthoff).
Presently the following SC corpora are available or planned:
BAS will subdivide some of these SC corpora into
training and test corpora, so that users can refer to the same
partitions in publications.
If you, as a member of the speech community, have further
proposals for the SC corpora, do not hesitate to contact us at
the following email address:
BAS provides an extended edition of these corpora. This edition contains
the cut signal files as before and, in addition, the
orthographic transliteration, a so-called 'proposed transcription' (the
former term 'canonical form' can no longer be used for spontaneous speech)
and, if feasible, a first automatic phonological segmentation.
Results from other Verbmobil partners, such as prosodic and syntactic
information, may be included as well.
BAS will provide an updated and extended edition of the SmartKom corpora.
Unfinished, with partly usable results; please contact bas@bas.uni-muenchen.de if you are interested in this sort of data.
In several projects this corpus will be extended with adolescent speakers.
The aim is to reach more than 1000 speakers and to use web-based recording
techniques such as SpeechRecorder.
BAS PHONOLEX Lexicon -
Available
Almost every kind of speech processing needs some 'canonical' or
empirical definition of the pronunciation of single words. A
computer-readable pronunciation lexicon that contains all inflections
of a sufficient number of German words is currently not available.
Furthermore, there is no resource for spontaneous speech, which contains
many 'non-regular' words and non-lexical speech events that
speech researchers nevertheless have to cope with.
Currently (Version 2.6) the lexical list covers more than 1,600,000
entries; further details about contents, format and availability can be found
here.
Strange Corpora - SC
The Strange Corpora are a series of smaller corpora. Each of them
documents a known problem in the field of speech science and
engineering. Using these corpora, scientists and engineers can
test their solutions and applications and compare their performance.
Status: available
Status: available
Status: work in progress, estimated availability unknown.
Status: planned
Status: planned
Status: planned
Status: available (SI1000)
Status: planned
Status: planned
Status: available
BAS Edition of Verbmobil Corpora - VM -
Completed
During the first year after publication, the
Verbmobil Corpora
(spontaneous dialog recordings) are reserved for the exclusive use
of the official VM partners. After that period the corpora
are distributed by BAS
as well as by the European Language Resources
Association (ELRA).
Articulatory data - EMA
A large amount of EMA (electromagnetic articulography) data has been
recorded from speakers reading the
SI1000 Corpus; it will be published as
a separate corpus. The corpus will contain the speech signal as well as
the geometric parameters of the vocal tract.
Estimated availability: End of 2000.
Spicos Training Data - SPICOS
The training corpus used in the SPICOS Project is still one of the major
corpora in German that can be used for bootstrapping speech recognition
algorithms. It contains the speech of 12 speakers, each speaking 100 to 400
phoneme-balanced German sentences. The data were fully transcribed
in IPA.
BAS is planning to publish this corpus again after a careful validation and
filtering of the original data. The edition will contain the full IPA
annotation for phonetic science as well as a SAM-PA annotation for technical
usage.
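The relation between the two annotation layers can be illustrated with a minimal sketch that maps a few IPA symbols common in German to their SAM-PA equivalents. The table below is deliberately incomplete, and the function name `ipa_to_sampa` is an assumption made for this illustration; it is not the conversion procedure used by BAS.

```python
# Partial IPA -> SAM-PA table for symbols common in German transcriptions.
# Symbols that are identical in both alphabets (p, t, k, a, n, ...) pass through.
IPA_TO_SAMPA = {
    "ʃ": "S", "ç": "C", "ŋ": "N", "ʁ": "R", "ʔ": "?", "ɡ": "g",
    "ə": "@", "ɐ": "6", "ɛ": "E", "ɔ": "O", "ɪ": "I", "ʊ": "U",
    "ʏ": "Y", "ø": "2", "œ": "9", "ː": ":",
}


def ipa_to_sampa(ipa: str) -> str:
    """Convert an IPA string to SAM-PA symbol by symbol (hypothetical helper)."""
    return "".join(IPA_TO_SAMPA.get(symbol, symbol) for symbol in ipa)


if __name__ == "__main__":
    print(ipa_to_sampa("ʃøːn"))  # IPA transcription of "schön"
    print(ipa_to_sampa("ɪç"))    # IPA transcription of "ich"
```

A symbol-by-symbol lookup is sufficient here because every IPA symbol in the table corresponds to exactly one SAM-PA character; diacritics and affricate ligatures would need additional handling.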
BAS Edition of SmartKom Corpora - SK -
Available
The SmartKom
Corpora (multimodal WOZ (Wizard-of-Oz) dialogue recordings)
will be distributed
to the scientific community after one year of exclusive
use by the SmartKom consortium (starting 09/2003).
Munich Automatic Speaker Verification - MASV -
Available
MASV stands for Munich Automatic Speaker
Verification. It is an experimental environment for setting up and testing
speaker verification systems based on HMMs or GMMs.
It depends on the HTK tools
(version 3.1 or greater), Matlab (version 5 or greater) and Perl
(version 5 or greater). The Perl scripts control the training and testing
of speaker models; the Matlab part provides various score
normalization schemes and a GUI for exploring the performance of a
speaker verification system. MASV is published under the GNU General
Public License in the hope of helping others get started with
HMM-based speaker verification. The key features are:
A more detailed description can be found in the manual which can be
downloaded from the MASV
website.
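The page does not say which score normalization schemes MASV implements. As an illustration of what such a scheme does, the following minimal sketch shows zero normalization (z-norm), a widely used technique in speaker verification. It is written in Python rather than Matlab, and the function name and the example scores are assumptions made for this sketch.

```python
from statistics import mean, stdev


def znorm(raw_score: float, impostor_scores: list[float]) -> float:
    """Z-norm: rescale a raw score using impostor scores of the same speaker model.

    impostor_scores are scores the claimed speaker's model assigned to
    utterances from other speakers (hypothetical input for this sketch).
    """
    if len(impostor_scores) < 2:
        raise ValueError("need at least two impostor scores")
    mu = mean(impostor_scores)
    sigma = stdev(impostor_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("impostor scores have zero variance")
    return (raw_score - mu) / sigma


if __name__ == "__main__":
    print(znorm(4.0, [-2.0, 0.0, 2.0]))
```

After normalization the scores of different speaker models share a common scale, so a single decision threshold can be applied to all of them.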
ASR Benchmark for spontaneous German (Verbmobil) -
Available
Based on the Verbmobil corpora
we define training, development and test sets, as well as lexica and language models.
We report the baseline word accuracy of a monophone HTK recognizer as well
as that of a cross-word triphone recognizer.
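Word accuracy is commonly defined as (N - S - D - I) / N, where N is the number of reference words and S, D and I are the numbers of substitutions, deletions and insertions. The sketch below computes it with a unit-cost edit distance. This is a simplification: HTK's own scoring tool aligns with different penalty weights, so its counts can differ slightly. The function names are chosen for this illustration only.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Minimum number of substitutions, deletions and insertions (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion of ref_word
                curr[j - 1] + 1,                       # insertion of hyp_word
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy in percent; can be negative if there are many insertions."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 100.0 * (len(ref) - edit_distance(ref, hyp)) / len(ref)


if __name__ == "__main__":
    print(word_accuracy("wir treffen uns morgen", "wir treffen und morgen"))
```

Unlike the plain percentage of correct words, this measure also penalizes insertions, which is why it is the usual figure for comparing recognizers on a common benchmark.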
ASR Benchmark for telephone speech, German (SpeechDat) -
Available
Based on the German SpeechDat
corpora SpeechDat II and SpeechDat Mobil
we define training, development and test sets, as well as lexica and language models.
We report the baseline word accuracy of a monophone HTK recognizer
for the fixed network and the results of applying the baseline training to
GSM speech. We also report on adaptation techniques to overcome the
observed drop in performance.
ASR Address Recognizer, German (GEO1)
The aim of this project is to build an experimental ASR system
that recognizes German addresses over the telephone network. The recognizer will
be HTK-based and use the German SpeechDat corpus as acoustic training material and
the GEO1 database as pronunciation base.
Alcohol Language Corpus (ALC)
Available
The aim of this project is to create a multi-style speech corpus
of intoxicated speakers for investigating the effects of alcohol intoxication
on speech. The corpus contains speech of the same speakers under
sober and intoxicated conditions. It comprises a variety of speaking styles,
ranging from simple digit strings through read speech, tongue twisters,
application commands (elicited by situational prompting) and monologues to
real conversational speech. ALC aims at a total of 150 speakers
(75/75 female/male). The degree of intoxication is monitored by
breath and blood samples. This project is being conducted in close cooperation
with the Institute of Legal Medicine, University of Munich, and the
Association against Alcohol and Drugs in Traffic (B.A.D.S.), Germany.
CLARIN-D Webservices and Webinterface -
Available
Within the CLARIN-D projects funded by the German BMB+F, BAS developed
a series of speech tools based on REST calls (G2P, MAUS, CHUNKER, etc.) and a
user-friendly web interface for processing speech data interactively with these web services.
CLARIN-D Repository for Speech Resources -
Available
Within the CLARIN-D projects funded by the German BMB+F, the BAS archive of speech
resources has been transformed into a CLARIN center of type B, including this
repository.
Major industrial BAS Cooperations
SpeechDat - SD -
Completed
Presently BAS is engaged in the production of the German SpeechDat
corpus (telephone speech) as a subcontractor of Siemens, Munich.
Whether this corpus will be disseminated by BAS or
ELRA, as a whole or in part, is uncertain at the moment.
The first (1000 speakers) and second (4000 speakers) project phases have been
completed successfully. The project
SpeechDat Car (another
600 recordings in a moving car) has also been completed.
Extension of Regional Variants of German - RVG-J, Ph@tt Sessionz
Available
In cooperation with AT&T Lucent, the first
corpus of German dialects was produced in the 1990s
(RVG1).
The recordings were made with four different
microphones in parallel (from low-cost to studio quality) in a normal office
environment. The recorded items cover diphone-balanced sentences, single
digits, connected digit strings, telephone numbers, computer commands
and one minute of spontaneous speech. The 500 speakers were recorded at
different locations using mobile recording equipment.
Speaker Verification over the Telephone - VERIDAT -
Completed
In cooperation with the German Telecom we developed a corpus for
speaker verification over the telephone network. Since this corpus
will not be distributed publicly via either BAS or ELRA, please contact BAS
if you are interested in a bilateral user agreement.
Moving vehicle data - AUTO -
Completed
Currently several speech data collections in moving automobiles
are under way in close cooperation with several industrial partners.
The data are recorded from several speakers, in different dialectal regions
of Germany and in different car models.
No distribution via BAS planned.
BMW - TUMMIC -
Completed
The acronym TUMMIC stands for "Thoroughly User-Oriented Man-Machine
Interface in Cars". Several institutes of the Technical University of
Munich (Institute of Ergonomics, Faculty of Augmented Reality,
Institute for Human-Machine Communication and the Chair of Software and
Systems Engineering), the IPSK of the LMU and the Institute for
Psychology from Regensburg collaborate closely in this project.
Together, they develop a concept for the operation of assistance and
information systems in cars.
Florian Schiel