Check for Reference Completeness
To decide whether a reference is satisfactory for a
successful validation, check the following points:
- Browse through the speech corpus and compile a 'survey list' in
  which you note down
  - the names of all directories or directory groups
  - file types (usually by the extension), e.g. *.wav, *.par,
    *.ags, ...
  Then check if everything found in the corpus is described in the
  specification. If not, the specification is incomplete.
- Check for basic metadata that the specification must mention for a
  validation to be performed, such as:
  - Speakers: number and profile requirements, e.g. gender
    distribution, age distribution, regional distribution, ...
  - Data description: formats, signal specification such as sampling
    rate, word length, S/N ratio, ...
  - Contents: spoken prompts, domains, word distribution,
    word / utterance count per speaker / domain etc.
  - Annotation: formats, numbers, procedures, labeling and
    segmentation tolerance, ...
If in doubt whether an item of the specification is essential, put
yourself in the position of a potential user of the SLR (spoken language
resource) and decide whether that user would need the item or not.
The above procedure gives you only hints as to what to look for. In
some cases speech corpora are produced with such a special purpose that
some of the above listed items may not be important, but others may well
be.
Therefore we recommend communicating extensively with the producer /
client in this phase.
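
The survey-list step described above can be partly automated. The
following is a minimal sketch, not a prescribed tool: it walks a corpus
directory, collects the directory names and file extensions it finds, and
reports those that do not appear in a list taken from the specification.
The function names, the example path and the example extensions are
hypothetical; the set of documented items still has to be compiled by
hand from the specification.

```python
import os
from collections import Counter


def build_survey_list(corpus_root):
    """Return (directories, extensions) found below corpus_root.

    directories: set of directory paths relative to corpus_root
    extensions:  Counter mapping lower-cased file extension -> file count
    """
    directories = set()
    extensions = Counter()
    for dirpath, _dirnames, filenames in os.walk(corpus_root):
        rel = os.path.relpath(dirpath, corpus_root)
        if rel != ".":
            # Use forward slashes so the list reads the same on every platform.
            directories.add(rel.replace(os.sep, "/"))
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            extensions[ext or "(no extension)"] += 1
    return directories, extensions


def undocumented(found, documented):
    """Return the sorted items that were found but are not documented."""
    return sorted(set(found) - set(documented))


# Hypothetical usage (path and documented extensions are assumptions):
#   dirs, exts = build_survey_list("/data/my_corpus")
#   print(undocumented(exts, {".wav", ".par"}))
```

Any item reported by undocumented() is only a candidate gap in the
specification; whether it matters is still a judgment for the validator
and, where necessary, the producer / client.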
Angela Baumann
2004-06-03