``The SmartKom multi-modal corpus was produced in the years 1999 - 2003 at the Bavarian Archive for Speech Signals (BAS) located at the University of Munich (LMU). The corpus was 100% funded by the German Ministry for Education and Science and is therefore freely available for all kinds of usage except re-distribution to third parties.
The primary aim of the corpus was the empirical study of Human - Computer interaction (HCI) in a number of different tasks (domains) and technical setups (scenarios).''
(from the corpus documentation)
In the SmartKom data collection, subjects were recorded while using a self-explanatory, user-adaptive man-machine interface (MMI). The MMI was simulated using a Wizard-of-Oz setup (WOZ, see section ); it interprets speech and gesture input and analyses the facial expression of the user. The total corpus consists of a number of speech channels, four video channels, the output of a graphic tablet or finger point detector, and a separate multi-modal biometric data collection. The resulting video data and the multi-channel recordings of spontaneous speech serve as a basis for research and development of speech recognition, gesture recognition and the user model of SmartKom.
In the following, only the speech part of the WOZ data collection is described.