Bavarian Archive for Speech Signals
Pronunciation Lexicon PHONOLEX

Same page in German

This page was last updated 2013-09-19


PHONOLEX is the result of a close cooperation between the DFKI Saarbrücken, Computational Linguistics Lab, the Universität Leipzig (UL) and the Bavarian Archive for Speech Signals (BAS) in Munich. It comprises a simple list of word forms (inflected words) with the following entries:


PHONOLEX is currently built as a simple ASCII file (file phonolex) and as an XML version (file phonolex_xml). The entries are sorted in ASCII order ('NL' is new line).
file       ->  item 'NL'
               [ item 'NL' 
               ... ]

item       ->  orthography

orthography  ->  German orthography with LaTeX umlauts

info  ->  TAB-separated list of key:string

canonic_pronunciation  ->  word_form

empirical_pronunciation_list  ->
                word_form TAB counter TAB corpus TAB type
                [ ... ]

word_form  ->  string of extended SAM-PA

counter  ->  Integer

corpus  ->  String

type  ->  String
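
The two TAB-separated structures in the grammar above (the info list and one row of the empirical pronunciation list) can be read with a few lines of Python. This is only a minimal sketch under stated assumptions, not the official BAS tooling: the function and class names are hypothetical, each info element is assumed to be split at its first colon, and one empirical pronunciation is assumed to occupy exactly four TAB-separated fields. How the fields of a complete item are delimited is not specified above, so the sketch does not attempt to parse whole items.

```python
from typing import NamedTuple


class EmpiricalPronunciation(NamedTuple):
    """One row of empirical_pronunciation_list (names are hypothetical)."""
    word_form: str  # string of extended SAM-PA
    counter: int    # Integer
    corpus: str     # String
    type_: str      # String ("type" in the grammar)


def parse_info(field_text: str) -> dict[str, str]:
    """Split a TAB-separated list of key:string pairs into a dict.

    Assumption: the key ends at the first colon; the rest is the string.
    """
    info: dict[str, str] = {}
    for pair in field_text.rstrip("\n").split("\t"):
        if not pair:
            continue  # skip empty elements, e.g. from a trailing TAB
        key, _, value = pair.partition(":")
        info[key] = value
    return info


def parse_empirical(row: str) -> EmpiricalPronunciation:
    """Parse 'word_form TAB counter TAB corpus TAB type'."""
    fields = row.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 TAB-separated fields, got {len(fields)}")
    word_form, counter, corpus, type_ = fields
    return EmpiricalPronunciation(word_form, int(counter), corpus, type_)


if __name__ == "__main__":
    # The sample values below are invented for illustration only.
    print(parse_info("pos:NN\tsource:demo"))
    print(parse_empirical("ha:b@n\t12\tDEMO\tspontaneous\n"))
```

Because the word form is taken as a whole field, colons inside the SAM-PA string (vowel length marks) are left untouched; only the info elements are split at a colon.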

Aside from the basic PHONOLEX ASCII list, there are an XML version and two excerpts that may come in handy:

Known Bugs

(plenty, and hopefully decreasing; see the German page)

Current Corpus Documentation

Source Table



A copy of the current PHONOLEX version may be obtained from BAS. The purchase of a user licence is necessary.
The user licence entitles the holder to use the PHONOLEX list for commercial, scientific and educational purposes (depending on the licence). Furthermore, the owner of a user licence will receive upgrades to later versions of PHONOLEX free of charge.
The user licence does not entitle the user to redistribute the list to third parties in any form, whether in part, modified or extended.
Furthermore the user agrees to report any errors found in the list to the BAS. This way we hope to achieve improvements in the future.
All rights stay with DFKI and BAS.
By purchasing the user licence, the user accepts the above conditions.


PHONOLEX - Delivery via CDROM, Update-Service
Scientific Licence EUR 1030.25
Scientific Licence ELRA members EUR 631.45
Commercial Licence EUR 6081.82
Commercial Licence ELRA members EUR 3423.10
Please send orders or questions to the following address:


The signed user agreement must be faxed or sent by mail prior to or together with the order.

Copyright © 1996-2011 Bayerisches Archiv für Sprachsignale, Universität München, Deutsches Forschungszentrum für künstliche Intelligenz, Saarbrücken, Universität Leipzig.
This page and all other pages with the initial 'BAS' or 'Bas' in the filename may be copied, printed and distributed to other parties, under the condition that the pages are distributed as shown here. Parts of pages or extended pages may not be distributed further without permission of the BAS.

Florian Schiel