Availability and prices for the Verbmobil Dialog Speech Corpus

F. Schiel
13.06.01 / 24.10.14

Background

The Verbmobil Corpora I and II were funded entirely by the German Ministry of Sciences and Education (BMBF). All data produced by the partners during the project were therefore transferred to the Bavarian Archive for Speech Signals (BAS) for further maintenance and distribution to the scientific public. There is no commercial license on these data. The sole distribution agencies are ELRA and BAS.

Contents

The Verbmobil Corpora consist of recordings of non-prompted dialogs between two or more human partners:

- two partners
- two partners + one interpreter
- two partners + two interpreters
- one partner and VM system (WOZ + EMOTION)

The domain is scheduling business trips (VMI) and, in addition, negotiating the means of travel and accommodation (VMII).

The language pairs of the corpus are:

- German - German
- American English - A. English
- Japanese - Japanese
- 'Denglish' - 'Denglish' ('Denglish' is English spoken by native Germans)
- German - A. English
- German - Japanese

Volumes and amount of data

VMI
  German        1 2 3 4 5 7 12 14                      8 volumes   ~ 4GB
  English       6 8 13                                 3 volumes   ~ 1,5GB
  Japanese      16 17 18 19                            4 volumes   ~ 3,5GB

VMII
  German        15 20 21 22 24 29 30 38 39 48 49 53    12 volumes  ~ 4,8GB
  English       23 28 31 42 43 50                      6 volumes   ~ 2,4GB
  Japanese      25 26 27 33 34 35 44 45 60 61 62       11 volumes  ~ 4,4GB
  Multilingual  32 46 47 51 52 55 56 57 58 59          10 volumes  ~ 4GB
  Emotional     63 64 65                               3 volumes   ~ 2GB

Any combination of volumes may be ordered from BAS on CDROM (one volume per CDROM) or DVD5 (approx. 9 volumes per DVD). The price is calculated by multiplying the number of volumes by the base CDROM fee of BAS (at the time of writing, May 2001, this is EU 255 for non-members and EU 127 for members), regardless of the medium (CDROM or DVD5), and then deducting a 30% discount.

Example: the full English part of VM on DVD5: 9 volumes = EU 2.296,- less 30%

Numbers for the German VM corpus (VMI+VMII):

SET    WORDS   TURNS  DIALOGS  LEX   VOLUMES
---------------------------------------------------------------------
DEV    14721   694    51       1638  VM14.1 VM15.1 VM2.1 VM20.1 VM21.1
                                     VM22.1 VM24.1 VM29.1 VM30.1 VM4.1
TEST   13792   691    56       1410  VM14.1 VM2.1 VM21.1 VM22.1 VM24.1
                                     VM29.1 VM30.1 VM4.1
TRAIN  461664  25495  942      9297  VM1.1 VM12.1 VM14.1 VM15.1 VM2.1
                                     VM20.1 VM21.1 VM22.1 VM24.1 VM29.1
                                     VM3.1 VM30.1 VM38.1 VM39.1 VM4.1
                                     VM48.1 VM49.1 VM5.1 VM53.1 VM7.1

Material other than speech

Additional material is also available from BAS:

VM Bonus : Verbmobil material and data that have not been published on the regular VM volumes, e.g. multiple phonetic/prosodic labelings, original treebanks, treebanks of VMI, Japanese treebanks, additional documentation, reports, techdocs, publications related to VM, translations of transcripts.

VM LEX : This CD contains the original Verbmobil Lexicon Database of the University of Bielefeld. This corpus was not validated by the BAS and is distributed 'as is'.

VM EMOTION (63 64 65) : Recordings of emotional stipulated speech with special annotation

More information (including the definition of training, development and test sets) can be found at www.bas.uni-muenchen.de/Bas/BasKorporaeng.html#SpontKorpora
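
The price rule given under "Volumes and amount of data" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names verbmobil_price and dvd5_count are hypothetical, the fees are the rounded May 2001 figures quoted above, the 30% discount is assumed to apply to the total, and no rounding of the final amount is performed because the text does not specify one. With the rounded fee of EU 255, nine volumes come to EU 2295, one euro less than the EU 2.296,- quoted in the example; the text does not explain the difference.

```python
import math

# Base CDROM fee per volume in EU, as quoted for May 2001 (may be outdated).
BASE_FEE_EU = {"non-member": 255, "member": 127}
DISCOUNT_PERCENT = 30  # discount stated for Verbmobil volumes

# Volume numbers of the English part, copied from the volume listing.
ENGLISH_VOLUMES = {
    "VMI": [6, 8, 13],
    "VMII": [23, 28, 31, 42, 43, 50],
}


def verbmobil_price(num_volumes, customer="non-member"):
    """Return (gross, net) in EU; the medium (CDROM or DVD5) does not matter."""
    if num_volumes < 0:
        raise ValueError("num_volumes must not be negative")
    gross = num_volumes * BASE_FEE_EU[customer]
    # Assumption: the discount is applied once to the total, without rounding.
    net = gross * (100 - DISCOUNT_PERCENT) / 100
    return gross, net


def dvd5_count(num_volumes, volumes_per_dvd=9):
    """Estimate the number of DVD5 media; 9 volumes per DVD is approximate."""
    return math.ceil(num_volumes / volumes_per_dvd)


if __name__ == "__main__":
    n = sum(len(vols) for vols in ENGLISH_VOLUMES.values())
    gross, net = verbmobil_price(n)
    print(f"{n} volumes: EU {gross} gross, EU {net} after discount, "
          f"{dvd5_count(n)} DVD5")
```

Integer percentages are used for the discount so that the arithmetic does not pick up floating-point noise from a factor such as 0.7.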