PH@TTSESSIONZ GERMAN DATABASE OF ADOLESCENT SPEECH

Ph@ttSessionz SPEECH DATABASE COLLECTION
Version 2.0.1

Copyright(C) 2014 by
Institute of Phonetics and Speech Processing
University of Munich, Munich Germany

Compiled by:
Chr. Draxler
Department of Phonetics and Speech Communication
University of Munich
Schellingstr. 3/II
D 80799 Munich
+49/89/2180 2807 tel
+49/89/2180 5790 fax
draxler@phonetik.uni-muenchen.de


1) OVERVIEW

The Ph@ttSessionz speech database Version 2.0 contains recordings of
1019 adolescent speakers of German (mostly from the age range 12-20).
The recordings were performed via the WWW in public schools (Gymnasium)
in 46 locations in Germany (and one in Austria).

The speech material recorded is a superset of the German SpeechDat-II
and RVG-I corpora. A session consists of up to 138 recording items,
with both read and non-scripted speech.

The read speech material comprises isolated digits, digit sequences,
numbers, time and date expressions, spellings, person, company and
geographical names, and phonetically rich sentences.

The non-scripted speech consists of short and long text production
items. The short text production items are questions on the current
date or prompts for descriptions, e.g. on how to get from home to the
train station, or the speaker's clothes; for the long items the speaker
was asked to talk about the last holidays, or the favorite subject at
school, etc.

ITEMCODE  | DESCRIPTION                  | SESSIONS | FILES
----------+------------------------------+----------+------
01-12     | single digit                 |     1019 | 12118
13-30     | number                       |     1019 | 18177
31-42     | command                      |     1019 | 12109
43-72     | phonetically rich sentence   |     1019 | 30273
73-85     | telephone number             |     1019 | 13125
B1-B3     | digit string, all digits     |     1018 |  3017
C1-C3     | digit string, credit card    |     1019 |  3021
C4-C6     | digit string, PIN code       |     1019 |  3028
D1-D3     | date expression              |     1019 |  3025
L0,L9     | spelling, arbitrary sequence |     1016 |  1896
L1-L5     | spelling, geographical name  |     1019 |  5047
L6-L8     | spelling, person name        |     1019 |  3032
O1-O3     | geographical name            |     1019 |  3016
O4-O6     | company name                 |     1019 |  3015
O7-O9     | person name                  |     1019 |  3029
P00-P10   | phonetics test sentence      |      274 |  2971
T1-T3     | time expression              |     1019 |  3018
X1-X5     | text production, short       |     1018 |  4555
Y1,Y3,Y4  | text production, long        |      987 |  2588

NOTE 1: items P00-P10 were added later during the project and hence
        have not been recorded at every site.


2) DATABASE ORGANIZATION

The database is organized in recording sessions. Each session
corresponds to a directory, and each recording is stored in two
separate audio files in WAV format, one for each recording channel.

The file nomenclature is

    NNNN"/AAA"NNNNII[I]"_"C".WAV"

with NNNN the four digit session code, II[I] a two- or three-character
item code, and C the recording channel.

The file hierarchy on the distribution media is as follows

/- README.TXT
/- COPYRIGHT.TXT
/- DOC ---+- SAMPALEX.PDF
          +- at3031_english.pdf
          +- 060628_MPre_UG_EN01.pdf
          +- Opus54_DB_E.pdf
          +- TRANSCRIPTION.PDF
/- TABLE -+- CONTENTS.TBL
          +- LEXICON.TBL
          +- METADATA.TBL
          +- PH110TRN.TBL
          +- PH110TST.TBL
/- SOURCE +- DEFTEST.PL
/- DATA --+- 2028 -+- AAA202801_0.par
          |        +- AAA202801_0.TextGrid
          |        +- AAA202801_0.wav
          |        +- AAA202801_1.wav
          |        +...
          :
          +- 4866 -+- AAA4866Y4_0.par
                   +- AAA4866Y4_0.TextGrid
                   +- AAA4866Y4_0.wav
                   +- AAA4866Y4_1.wav

NOTE 2: the file name extension mappings are
    .PDF        Adobe Portable Document Format
    .PL         Perl script
    .TXT        UTF-8 plain text file with Unix line breaks (line feed)
    .TBL        tab-delimited UTF-8 table file with Unix line breaks
                (line feed)
    .PAR        BAS Partitur format
    .TextGrid   Praat TextGrid time-aligned annotation file

The following directories contain documentation and related
information:

DOC :
    SAMPALEX.PDF              German SAM-PA table
    TRANSCRIPTION.PDF         validation and transcription handbook
    at3031_english.pdf        Audio-Technica AT3031 data sheet
    060628_MPre_UG_EN01.pdf   M-Audio mobile pre user guide
    Opus54_DB_E.pdf           Beyerdynamic opus 54 data sheet
    HTML                      Description of the recording procedure

SOURCE : contains the following Unix formatted ISO 8859-1 files
    DEFTEST.PL                Perl script to define training and test
                              sets

TABLE : contains the following UTF-8 encoded plain text files
    CONTENTS.TBL   the prompts and annotations file with the following
                   tab-delimited fields
                       SESSION ITEMCODE DESCRIPTION PROMPT
                       SEGMENT_BEGIN SEGMENT_END ANNOTATION
                   SEGMENT_BEGIN and SEGMENT_END are given in
                   milliseconds

    LEXICON.TBL    the lexicon file with the following tab-delimited
                   fields. The lexicon file contains all regular words
                   plus the noise markers plus word fragments or
                   truncated words.
                       ORTHOGRAPHY FREQUENCY SAM-PRONUNCIATION
                   SAM-PRONUNCIATION: automatically generated and
                   manually checked German SAMPA symbols with the
                   following exceptions:
                   - without symbols of affricates
                   - vowels with and without vowel length mark (':')

    METADATA.TBL   the speaker information file with the following
                   tab-separated fields
                       SESSION ZIPCODE CITY REC_DATE REC_TIME SEX
                       DIALECT SMOKER HEIGHT WEIGHT AGE COUNT
                   This file is used to generate the training and test
                   sets.

    PH110TRN.TBL   819 session numbers for training set
    PH110TST.TBL   200 session numbers for test set

DATA : contains directories corresponding to recording sessions.
    A session consists of pairs of audio files (WAV format), one for
    each channel,
    + segment files in BAS Partitur format,
    + segment files in TextGrid format.
    These segmentations were created automatically using the MAUS
    automatic segmentation system with the orthographic transliteration
    provided in the CONTENTS.TBL file.

HTML :
    manual_files      Description files in HTML format
    manual_eng.html   Starting page of the instructions
    manual_eng.zip    Compressed archive file of the description


3) SIGNAL QUALITY

The following recording equipment was used:

    1) Beyerdynamic opus54 close-talk microphone
    2) Audio-Technica AT3031 table microphone
    3) M-Audio mobile pre USB A/D converter

The recording quality is 22.05 kHz 16 bit, and the files are in mono
.wav format. The file suffix "_0" identifies the close-talk microphone
channel, the suffix "_1" the table top microphone.


4) INSTRUCTIONS TO RECORDING STAFF AND SPEAKERS

A description of the instructions to the recording staff is given in
file DOC/HTML/manual_eng.html


5) REFERENCES

Papers on Ph@ttSessionz and the tools used (SpeechRecorder and
WebTranscribe) were published at numerous international conferences:

@inproceedings{Draxler_2006_b,
    Address = {St. Petersburg},
    Author = {Chr. Draxler},
    Booktitle = {Proc. of Specom},
    Title = {Web-Based Speech Data Collection and Annotation},
    Year = {2006}}

@inproceedings{Draxler_Jaensch_2006,
    Address = {Genova},
    Author = {Chr. Draxler and K. J{\"a}nsch},
    Booktitle = {Proc. of LREC},
    Title = {Speech Recordings in Public Schools in {Germany} - the Perfect Show Case for Web-based Recordings and Annotation},
    Year = {2006}}

@inproceedings{Draxler_2006_a,
    Address = {Pittsburgh, PA},
    Author = {Chr. Draxler},
    Booktitle = {Proc. of Interspeech},
    Title = {Exploring the Unknown -- Collecting 1000 speakers over the Internet for the Ph@ttSessionz Database of Adolescent Speakers},
    Year = {2006}}

@inproceedings{Draxler_2005,
    Address = {Karlsbad, Czech Republic},
    Author = {Chr. Draxler},
    Booktitle = {Proceedings of TSD 2005},
    Title = {WebTranscribe -- An Extensible Web-based Speech Annotation Framework},
    Year = {2005}}

@inproceedings{DraxlerJaensch2005,
    Author = {Chr. Draxler and K. J{\"a}nsch},
    Booktitle = {Proceedings of DAGA 2005},
    Title = {{SpeechRecorder} -- Mehrkanal Sprachaufnahmen {\"u}ber das {WWW}},
    Year = {2005}}

@inproceedings{Steffen_et_al_2005,
    Author = {A. Steffen and Chr. Draxler and A. Baumann and S. Schmidt},
    Booktitle = {Proceedings of DAGA 2005},
    Title = {{Ph@ttSessionz}: {A}ufbau einer {D}atenbank mit {J}ugendsprache},
    Year = {2005}}

@inproceedings{Draxler_Steffen_2005,
    Address = {Lisbon, Portugal},
    Author = {Chr. Draxler and A. Steffen},
    Booktitle = {Proceedings of Interspeech 2005},
    Title = {Ph@ttSessionz: Recording 1000 Adolescent Speakers in Schools in Germany},
    Year = {2005}}

@inproceedings{DraxlerJaensch2004,
    Address = {Lisbon},
    Author = {Chr. Draxler and K. J{\"a}nsch},
    Booktitle = {Proceedings of 4th Intl. Conference on Language Resources and Evaluation},
    Pages = {559-562},
    Title = {SpeechRecorder -- a Universal Platform Independent Multi-Channel AudioRecording Software},
    Year = {2004}}

@inproceedings{Draxler1998,
    Address = {Granada},
    Author = {Chr. Draxler},
    Booktitle = {Proceedings of LREC},
    Title = {{WWWSigTranscribe} -- A {J}ava {E}xtension of the {WWWTranscribe} {T}oolbox},
    Year = {1998}}

@inproceedings{Draxler1997,
    Address = {Rhodos},
    Author = {Chr. Draxler},
    Booktitle = {Proc. of {Eurospeech}},
    Title = {{WWWTranscribe} -- A {Modular} {T}ranscription {S}ystem {B}ased on the {W}orld {W}ide {W}eb},
    Year = {1997}}


6) HISTORY

....-..-.. : Version 1.0.0 First edition
           : subset
2015-08-01 : Version 2.0.0 First complete edition published in
             BAS CLARIN Repo
2016-06-17 : Version 2.0.1 Available as emuDB
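
NOTE 3: the following Python sketch is not part of the distribution. It
only illustrates how the file nomenclature from section 2 and the
channel suffixes from section 3 might be decoded. The names NAME_RE,
Recording and parse_name are hypothetical, and the pattern assumes that
item codes consist of digits and upper-case letters only, as in the
item table of section 1.

```python
import re
from typing import NamedTuple, Optional

# Channel suffix meaning as stated in section 3.
CHANNELS = {"0": "close-talk microphone", "1": "table top microphone"}

# "AAA" + four digit session code + two or three character item code
# + "_" + channel + "." + extension.
# Assumption: item codes use digits and upper-case letters only.
NAME_RE = re.compile(r"^AAA(\d{4})([0-9A-Z]{2,3})_([01])\.(\w+)$")


class Recording(NamedTuple):
    session: str    # four digit session code, e.g. "2028"
    item: str       # item code, e.g. "01", "Y4" or "P00"
    channel: str    # "0" or "1"
    extension: str  # e.g. "wav", "par", "TextGrid"


def parse_name(file_name: str) -> Optional[Recording]:
    """Split a file name from a DATA session directory into its parts.

    Returns None for names that do not follow the nomenclature.
    """
    match = NAME_RE.match(file_name)
    if match is None:
        return None
    return Recording(*match.groups())


if __name__ == "__main__":
    for name in ("AAA202801_0.wav", "AAA4866Y4_1.wav", "README.TXT"):
        rec = parse_name(name)
        if rec is None:
            print(name, "-> not a recording file")
        else:
            print(name, "->", rec, CHANNELS[rec.channel])
```

The session code is returned as a string so that it can be compared
directly with the SESSION field of the TABLE files without making any
assumption about leading zeros.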