        PH@TTSESSIONZ GERMAN DATABASE OF ADOLESCENT SPEECH
                    
                          Ph@ttSessionz
                      
                    SPEECH DATABASE COLLECTION
                          
                          
                          Version 2.0.1
                    Copyright(C) 2014 by
            Institute of Phonetics and Speech Processing
                    University of Munich, Munich
                             Germany

Compiled by: Chr. Draxler
             Department of Phonetics and Speech Communication
             University of Munich
             Schellingstr. 3/II
             D 80799 Munich

             +49/89/2180 2807 tel
             +49/89/2180 5790 fax

             draxler@phonetik.uni-muenchen.de


1) OVERVIEW

The Ph@ttSessionz speech database Version 2.0 contains recordings of 1019 adolescent speakers of German (mostly from the age range 12-20). The recordings were performed via the WWW in public schools (Gymnasium) in 46 locations in Germany (and one in Austria). The speech material recorded is a superset of the German SpeechDat-II and RVG-I corpora.

A session consists of up to 138 recording items, with both read and non-scripted speech. The read speech material comprises isolated digits, digit sequences, numbers, time and date expressions, spellings, person, company and geographical names, and phonetically rich sentences. The non-scripted speech consists of short and long text production items. The short text production items are questions on the current date or prompts for descriptions, e.g. on how to get from home to the train station, or the speaker's clothes; for the long items the speaker was asked to talk about the last holidays, or the favorite subject at school, etc. 

 ITEMCODE |         DESCRIPTION          | SESSIONS | FILES
----------+------------------------------+----------+-------
 01-12    | single digit                 |   1019   | 12118
 13-30    | number                       |   1019   | 18177
 31-42    | command                      |   1019   | 12109
 43-72    | phonetically rich sentence   |   1019   | 30273
 73-85    | telephone number             |   1019   | 13125
 B1-B3    | digit string, all digits     |   1018   |  3017
 C1-C3    | digit string, credit card    |   1019   |  3021
 C4-C6    | digit string, PIN code       |   1019   |  3028
 D1-D3    | date expression              |   1019   |  3025
 L0,L9    | spelling, arbitrary sequence |   1016   |  1896
 L1-L5    | spelling, geographical name  |   1019   |  5047
 L6-L8    | spelling, person name        |   1019   |  3032
 O1-O3    | geographical name            |   1019   |  3016
 O4-O6    | company name                 |   1019   |  3015
 O7-O9    | person name                  |   1019   |  3029
 P00-P10  | phonetics test sentence      |   274    |  2971
 T1-T3    | time expression              |   1019   |  3018
 X1-X5    | text production, short       |   1018   |  4555
 Y1,Y3,Y4 | text production, long        |   987    |  2588

NOTE 1: items P00-P10 were added later during the project and hence have not been recorded at every site.
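
The item code ranges in the table above can be expanded into individual item codes, e.g. to look up the description of a given recording. The following is a minimal sketch in Python; the names ITEM_GROUPS, expand_item_codes and DESCRIPTION_BY_ITEM are illustrative and not part of the distribution.

```python
# Item code ranges and descriptions, copied from the table in this README.
ITEM_GROUPS = {
    "01-12": "single digit",
    "13-30": "number",
    "31-42": "command",
    "43-72": "phonetically rich sentence",
    "73-85": "telephone number",
    "B1-B3": "digit string, all digits",
    "C1-C3": "digit string, credit card",
    "C4-C6": "digit string, PIN code",
    "D1-D3": "date expression",
    "L0,L9": "spelling, arbitrary sequence",
    "L1-L5": "spelling, geographical name",
    "L6-L8": "spelling, person name",
    "O1-O3": "geographical name",
    "O4-O6": "company name",
    "O7-O9": "person name",
    "P00-P10": "phonetics test sentence",
    "T1-T3": "time expression",
    "X1-X5": "text production, short",
    "Y1,Y3,Y4": "text production, long",
}


def expand_item_codes(spec):
    """Expand a spec such as '01-12', 'L0,L9' or 'P00-P10' into single codes."""
    codes = []
    for part in spec.split(","):
        if "-" not in part:
            codes.append(part)
            continue
        first, last = part.split("-")
        # An optional leading letter is followed by a zero-padded number.
        prefix = first.rstrip("0123456789")
        width = len(first) - len(prefix)
        low, high = int(first[len(prefix):]), int(last[len(prefix):])
        codes.extend(f"{prefix}{n:0{width}d}" for n in range(low, high + 1))
    return codes


# Map every single item code to the description of its group.
DESCRIPTION_BY_ITEM = {
    code: description
    for spec, description in ITEM_GROUPS.items()
    for code in expand_item_codes(spec)
}
```

Expanding all rows of the table yields 138 distinct item codes, which agrees with the maximum number of recording items per session stated above.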


2) DATABASE ORGANIZATION

The database is organized by recording session. Each session corresponds to a directory, and each recording is stored as two separate audio files in WAV format, one for each recording channel.

The file nomenclature is

NNNN"/AAA"NNNNII[I]"_"C".WAV"

where NNNN is the four-digit session code, II[I] is a two- or three-character item code, and C is the recording channel. Quoted parts are literal.
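
The nomenclature above can be checked mechanically. The following Python sketch splits a file name into its session code, item code and channel. The regular expression is an assumption derived from the pattern and the example names in this README (channel digit 0 or 1, matching ignoring case); Recording and parse_file_name are illustrative names, not part of the distribution.

```python
import re
from typing import NamedTuple, Optional

# "AAA" + four-digit session + two- or three-character item + "_" + channel.
# The extension is matched loosely so that .wav, .par and .TextGrid all parse.
FILE_NAME = re.compile(
    r"^AAA(?P<session>\d{4})(?P<item>[0-9A-Z]{2,3})_(?P<channel>[01])"
    r"\.(?P<extension>[A-Za-z]+)$",
    re.IGNORECASE,
)


class Recording(NamedTuple):
    session: str
    item: str
    channel: int
    extension: str


def parse_file_name(name: str) -> Optional[Recording]:
    """Return the parts of a file name, or None if it does not match."""
    base = name.rsplit("/", 1)[-1]  # drop the session directory, if present
    match = FILE_NAME.match(base)
    if match is None:
        return None
    return Recording(
        session=match["session"],
        item=match["item"].upper(),
        channel=int(match["channel"]),
        extension=match["extension"],
    )
```

For example, parse_file_name("2028/AAA202801_0.wav") gives session "2028", item "01" and channel 0, while a name such as "README.TXT" gives None.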

The file hierarchy on the distribution media is as follows:

/- README.TXT
/- COPYRIGHT.TXT
/- DOC ---+- SAMPALEX.PDF
          +- at3031_english.pdf
          +- 060628_MPre_UG_EN01.pdf
          +- Opus54_DB_E.pdf
          +- TRANSCRIPTION.PDF
/- TABLE -+- CONTENTS.TBL
          +- LEXICON.TBL
          +- METADATA.TBL
          +- PH110TRN.TBL
          +- PH110TST.TBL
/- SOURCE +- DEFTEST.PL
/- DATA --+- 2028 -+- AAA202801_0.par
          |        +- AAA202801_0.TextGrid
          |        +- AAA202801_0.wav
          |        +- AAA202801_1.wav
          |        +...
          :
          +- 4866 -+- AAA4866Y4_0.par
                   +- AAA4866Y4_0.TextGrid
                   +- AAA4866Y4_0.wav
                   +- AAA4866Y4_1.wav

NOTE 2: the file name extension mappings are

.PDF     Adobe Portable Document Format
.PL      perl script
.TXT     UTF-8 plain text file with Unix line breaks (line feed)
.TBL     tab-delimited UTF-8 table file with Unix line breaks (line feed)
.PAR     BAS partitur format 
.TextGrid Praat TextGrid time-aligned annotation files


The following directories contain documentation and related information:

DOC    : SAMPALEX.PDF            German SAM-PA table
         TRANSCRIPTION.PDF       validation and transcription handbook
         at3031_english.pdf      Audio-Technica AT3031 data sheet
         060628_MPre_UG_EN01.pdf M-Audio mobile pre user guide
         Opus54_DB_E.pdf         Beyerdynamic opus 54 data sheet
         HTML                    Description of the recording procedure
         
SOURCE : contains the following Unix-formatted ISO 8859-1 file

         DEFTEST.PL   perl script to define training and test sets
         
		 
TABLE  : contains the following UTF-8 encoded plain text files

         CONTENTS.TBL the prompts and annotations file with
                      tab-delimited fields
                      
                      SESSION ITEMCODE DESCRIPTION PROMPT SEGMENT_BEGIN SEGMENT_END ANNOTATION
                      
                      SEGMENT_BEGIN and SEGMENT_END are given in milliseconds
                      
         LEXICON.TBL  the lexicon file. It contains all regular words,
                      the noise markers, and word fragments or truncated
                      words, with the following tab-delimited fields

                      ORTHOGRAPHY FREQUENCY SAM-PRONUNCIATION

                             SAM-PRONUNCIATION: automatically generated and manually checked
                                                German SAM-PA symbols, with the following exceptions:
                                                - without symbols for affricates
                                                - vowels with and without vowel length mark (':')

         METADATA.TBL the speaker information file with the following
                      tab-delimited fields

                      SESSION	ZIPCODE	CITY	REC_DATE	REC_TIME	SEX	DIALECT	SMOKER	HEIGHT	WEIGHT	AGE	COUNT

                      This file is used to generate the training and test
                      sets.

         PH110TRN.TBL 819 session numbers for training set

         PH110TST.TBL 200 session numbers for test set
         
DATA   : contains directories corresponding to recording sessions.
         A session consists of pairs of audio files (WAV format), one file per channel,

         + segment files in BAS Partitur format,
         + segment files in TextGrid format.

         These segmentations were created automatically with the MAUS automatic segmentation
         system from the orthographic transliteration provided in the CONTENTS.TBL file.

HTML   : subdirectory of DOC with the following contents

         manual_files      Description files in HTML format

         manual_eng.html   Starting page of instruction
         
         manual_eng.zip    Compressed archive file of description
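
The tab-delimited files in TABLE described above can be read with standard tools. The following Python sketch reads CONTENTS.TBL rows and a session list such as PH110TRN.TBL. It rests on assumptions this README does not confirm: a row that repeats the field names is treated as a header and skipped, segment times are numeric, and the session list files hold one session number per line. The function names are illustrative.

```python
import csv

# Field order of CONTENTS.TBL as documented in this README.
CONTENTS_FIELDS = ["SESSION", "ITEMCODE", "DESCRIPTION", "PROMPT",
                   "SEGMENT_BEGIN", "SEGMENT_END", "ANNOTATION"]


def read_contents(stream):
    """Yield one dict per CONTENTS.TBL row, with the segment duration added."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row == CONTENTS_FIELDS:
            continue  # skip empty lines and an optional header row (assumed)
        if len(row) != len(CONTENTS_FIELDS):
            raise ValueError(f"unexpected number of fields: {row!r}")
        record = dict(zip(CONTENTS_FIELDS, row))
        # SEGMENT_BEGIN and SEGMENT_END are given in milliseconds.
        begin_ms = float(record["SEGMENT_BEGIN"])
        end_ms = float(record["SEGMENT_END"])
        record["DURATION_S"] = (end_ms - begin_ms) / 1000.0
        yield record


def read_session_list(stream):
    """Return the set of session numbers, assuming one number per line."""
    return {line.strip() for line in stream if line.strip()}


# Usage (paths relative to the distribution root):
#   with open("TABLE/PH110TRN.TBL", encoding="utf-8") as f:
#       training = read_session_list(f)
#   with open("TABLE/CONTENTS.TBL", encoding="utf-8", newline="") as f:
#       rows = [r for r in read_contents(f) if r["SESSION"] in training]
```

Session numbers are kept as strings so that they can be compared directly with the directory names under DATA.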
         

3) SIGNAL QUALITY

The following recording equipment was used:

1) Beyerdynamic opus 54 close-talk microphone
2) Audio-Technica AT3031 table-top microphone
3) M-Audio mobile pre USB A/D converter

The recordings are sampled at 22.05 kHz with 16 bit resolution, and each file is a mono .wav file. The file suffix "_0" identifies the close-talk microphone channel, the suffix "_1" the table-top microphone channel.
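
The signal format stated above can be verified per file with the Python standard library. The sketch below reads the WAV header and compares it with the documented values (22050 Hz, 2 bytes per sample, 1 channel); it also builds the two channel file names of a recording, assuming the lowercase .wav extension shown in the file hierarchy. All function names are illustrative.

```python
import wave
from pathlib import Path

# Format documented in this README: 22.05 kHz, 16 bit, mono.
EXPECTED = {"sample_rate": 22050, "sample_width_bytes": 2, "channels": 1}


def wav_format(path):
    """Read sample rate, sample width, channel count and duration of a WAV file."""
    with wave.open(str(path), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_documented_format(path):
    """True if the file header agrees with the documented signal format."""
    info = wav_format(path)
    return all(info[key] == value for key, value in EXPECTED.items())


def channel_paths(data_dir, session, item):
    """Return the paths of the close-talk (_0) and table-top (_1) files."""
    stem = f"AAA{session}{item}"
    session_dir = Path(data_dir) / session
    return session_dir / f"{stem}_0.wav", session_dir / f"{stem}_1.wav"
```

For example, channel_paths("DATA", "2028", "01") returns the paths DATA/2028/AAA202801_0.wav and DATA/2028/AAA202801_1.wav.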


4) INSTRUCTIONS TO RECORDING STAFF AND SPEAKERS

The instructions to the recording staff are described in the file
DOC/HTML/manual_eng.html


5) REFERENCES

Papers on Ph@ttSessionz and the tools used (SpeechRecorder and WebTranscribe) were published at numerous international conferences:


@inproceedings{Draxler_2006_b,
	Address = {St. Petersburg},
	Author = {Chr. Draxler},
	Booktitle = {Proc. of Specom},
	Title = {Web-Based Speech Data Collection and Annotation},
	Year = {2006}}


@inproceedings{Draxler_Jaensch_2006,
	Address = {Genova},
	Author = {Chr. Draxler and K. J{\"a}nsch},
	Booktitle = {Proc. of LREC},
	Title = {Speech Recordings in Public Schools in {Germany} - the Perfect Show Case for Web-based Recordings and Annotation},
	Year = {2006}}


@inproceedings{Draxler_2006_a,
	Address = {Pittsburgh, PA},
	Author = {Chr. Draxler},
	Booktitle = {Proc. of Interspeech},
	Title = {Exploring the Unknown -- Collecting 1000 speakers over the Internet for the Ph@ttSessionz Database of Adolescent Speakers},
	Year = {2006}}


@inproceedings{Draxler_2005,
	Address = {Karlsbad, Czech Republic},
	Author = {Chr. Draxler},
	Booktitle = {Proceedings of TSD 2005},
	Title = {WebTranscribe -- An Extensible Web-based Speech Annotation Framework},
	Year = {2005}}


@inproceedings{DraxlerJaensch2005,
	Author = {Chr. Draxler and K. J{\"a}nsch},
	Booktitle = {Proceedings of DAGA 2005},
	Title = {{SpeechRecorder} -- Mehrkanal Sprachaufnahmen {\"u}ber das {WWW}},
	Year = {2005}}


@inproceedings{Steffen_et_al_2005,
	Author = {A. Steffen and Chr. Draxler and A. Baumann and S. Schmidt},
	Booktitle = {Proceedings of DAGA 2005},
	Title = {{Ph@ttSessionz}: {A}ufbau einer {D}atenbank mit {J}ugendsprache},
	Year = {2005}}


@inproceedings{Draxler_Steffen_2005,
	Address = {Lisbon, Portugal},
	Author = {Chr. Draxler and A. Steffen},
	Booktitle = {Proceedings of Interspeech 2005},
	Title = {Ph@ttSessionz: Recording 1000 Adolescent Speakers in Schools in Germany},
	Year = {2005}}


@inproceedings{DraxlerJaensch2004,
	Address = {Lisbon},
	Author = {Chr. Draxler and K. J{\"a}nsch},
	Booktitle = {Proceedings of the 4th Intl. Conference on Language Resources and Evaluation},
	Pages = {559-562},
	Title = {SpeechRecorder -- a Universal Platform Independent Multi-Channel Audio Recording Software},
	Year = {2004}}


@inproceedings{Draxler1998,
	Address = {Granada},
	Author = {Chr. Draxler},
	Booktitle = {Proceedings of LREC},
	Title = {{WWWSigTranscribe} -- A {J}ava {E}xtension of the {WWWTranscribe} {T}oolbox},
	Year = {1998}}


@inproceedings{Draxler1997,
	Address = {Rhodos},
	Author = {Chr. Draxler},
	Booktitle = {Proc. of {Eurospeech}},
	Title = {{WWWTranscribe} -- A {Modular} {T}ranscription {S}ystem {B}ased on the {W}orld {W}ide {W}eb},
	Year = {1997}}

6) HISTORY

....-..-.. : Version 1.0.0 First edition : subset
2015-08-01 : Version 2.0.0 First complete edition
             published in BAS CLARIN Repo
2016-06-17 : Version 2.0.1 Available as emuDB
