Bavarian Archive for Speech Signals

Last update of this page: 07/28/03

BAS Validation of external resources

We distinguish between the terms evaluation and validation of a speech resource.
Evaluation denotes a general judgement about the usability of the resource with respect to a certain task. For example, evaluating a speech corpus for the task of speech recognition over the telephone network would involve a basic test with a standard recognizer (e.g. HTK) to show that the data are usable for this task. So far we have carried out such evaluations only for speech resources that are distributed by BAS; no external resources have been evaluated.
Validation denotes the formal check of a resource against its specifications. This includes checking the formats, documentation and structure, as well as the completeness, the labeling, the tagging etc. Most of the resources listed in the BAS catalogue have been validated, either externally or by BAS. Presently we are in the process of re-validating all BAS speech corpora using a standardized protocol established in the BITS project:

Validation Guidelines
Re-validation Protocols
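
To illustrate what a single formal check of the kind described above (completeness) might look like, here is a minimal Python sketch. It is not taken from the BAS validation protocol: the function name, the file extensions and the file names are assumptions chosen purely for illustration. It pairs signal files with label files by base name and reports any gaps.

```python
from pathlib import PurePath


def check_completeness(file_names, signal_ext=".wav", label_ext=".par"):
    """Pair signal files with label files by base name and report gaps.

    The extensions are hypothetical defaults; a real validation would take
    them from the specification of the resource being checked.
    """
    signals = {
        PurePath(name).stem
        for name in file_names
        if PurePath(name).suffix.lower() == signal_ext
    }
    labels = {
        PurePath(name).stem
        for name in file_names
        if PurePath(name).suffix.lower() == label_ext
    }
    return {
        # Signal files for which no label file exists.
        "missing_labels": sorted(signals - labels),
        # Label files for which no signal file exists.
        "orphan_labels": sorted(labels - signals),
    }


# Hypothetical file listing of a small corpus.
report = check_completeness(["s001.wav", "s001.par", "s002.wav", "s003.par"])
print(report)
```

A real validation would cover many more aspects (formats, documentation, structure, labeling, tagging); this sketch only shows how one of them can be turned into a mechanical test.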

Aside from the internal validation of speech resources in the BAS catalogue, we also validate external resources that were not produced at BAS. In most cases such a validation is commissioned by the producer or the producer's principal to ensure that the resource meets a certain quality standard. Standardized validation procedures exist only for certain special types of resources (e.g. SpeechDat). The validation procedure is therefore negotiated case by case between BAS and the producer or principal. In the simplest case, the resource is validated against the BAS validation protocol.

Example of a BAS validation report:

Validation report for the CGN Database, release 3

Florian Schiel