Meta Data

For speech recordings, the term meta data refers not to the recorded speech itself but to data about the recordings. The emphasis lies on the word data: meta data does not include the documentation of a speech corpus. Rather, it consists of categorized, machine-readable data that can be used to classify the speech data contained in the corpus.

Consequently, meta data consists of codes rather than free text, with the exception of free comments. When you specify the meta data for a speech corpus, it is therefore important to specify not only the type of each item but also the set of values it may take.
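
As an illustration, a meta data specification that fixes both the type and the permitted values of each item might be sketched as follows. This is only a minimal sketch: the field names (sex, age, native_language, comment), the code sets, and the helper names FieldSpec and validate are hypothetical and not taken from any particular corpus standard.

```python
from dataclasses import dataclass
from typing import Optional

# NOTE: all field names and code sets below are hypothetical examples,
# not part of any established meta data scheme.


@dataclass(frozen=True)
class FieldSpec:
    name: str
    py_type: type
    # Closed set of permitted codes; None means no closed set (free comment).
    allowed: Optional[frozenset] = None
    required: bool = True


SPEAKER_SPEC = [
    FieldSpec("sex", str, frozenset({"f", "m", "x"})),
    FieldSpec("age", int, frozenset(range(0, 121))),
    FieldSpec("native_language", str, frozenset({"deu", "eng", "fra"})),
    # Free comments are the one exception to coded values.
    FieldSpec("comment", str, required=False),
]


def validate(record: dict, spec: list) -> list:
    """Return a list of problems found in record; an empty list means valid."""
    problems = []
    for field in spec:
        if field.name not in record:
            if field.required:
                problems.append(f"{field.name}: missing")
            continue
        value = record[field.name]
        if not isinstance(value, field.py_type):
            problems.append(f"{field.name}: expected {field.py_type.__name__}")
        elif field.allowed is not None and value not in field.allowed:
            problems.append(f"{field.name}: {value!r} is not a permitted value")
    return problems


good = {"sex": "f", "age": 34, "native_language": "deu", "comment": "slight cold"}
bad = {"sex": "female", "age": "34", "native_language": "deu"}
good_problems = validate(good, SPEAKER_SPEC)
bad_problems = validate(bad, SPEAKER_SPEC)
```

Because every coded item carries a closed value set, a record such as bad above can be rejected automatically: "female" is not one of the permitted codes for sex, and the age is given as a string instead of an integer. Only the comment item is left as unrestricted text.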


