How to get the Reference
Typically in the first working step you will browse through the
corpus contents and other data provided by the producer / client and
find one of the following situations:
- You are provided with a complete specification. Then use it as the
reference. Go to the next chapter.
- You are provided with an incomplete (or even non-existent)
  specification. For example, some properties of the speech corpus, such as
  the lexicon, are not specified, but you find them in the corpus or in the
  documentation. Now you have two choices:
  - You are able to fill the gaps in the specification by referring to the
    documentation of the corpus. Then produce an extended specification
    based on that documentation and proceed with the next chapter.
  - You are not able to fill the gaps in the specification by referring
    to the documentation of the corpus. Contact the producer or the client
    and try to clarify the unspecified points [3.1]. Produce an extended
    specification based on their answers and write a first chapter for the
    final validation report listing the items missing from the
    documentation. Then proceed with the next chapter.
- You are not provided with a specification and the corpus does not
have any documentation [3.2]. Contact the producer / client and state
clearly that a validation is not possible in that case.
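The cases above can be summarized as a small decision procedure. The following Python sketch is only an illustration; the names `Action` and `choose_reference_action` and the four yes/no inputs are hypothetical and do not come from any existing validation tool.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the outcomes described in the list above.
    USE_SPEC = "use the specification as the reference"
    EXTEND_FROM_DOCS = "extend the specification from the corpus documentation"
    ASK_PRODUCER = "contact producer / client, extend the specification, report missing items"
    REFUSE = "state that a validation is not possible"


def choose_reference_action(has_spec, spec_complete, has_docs, docs_fill_gaps):
    """Map the situation found in the first working step to the next action."""
    if has_spec and spec_complete:
        return Action.USE_SPEC
    if not has_spec and not has_docs:
        # Neither specification nor documentation: no reference can be built.
        return Action.REFUSE
    if has_docs and docs_fill_gaps:
        return Action.EXTEND_FROM_DOCS
    # Gaps remain that the documentation cannot close.
    return Action.ASK_PRODUCER


# Example: incomplete specification, documentation covers the gaps.
print(choose_reference_action(True, False, True, True).value)
```

In practice the four inputs are judgments made by the validator while browsing the corpus; the sketch only fixes the order in which they are considered.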
Beware: It is not wise to use the term correctness in all
contexts. For instance, the reference of a speech corpus
should not contain phrases like:
``85% of the phonemic segmentations are correct.''
The problem with the concept of phonemic segmentation (as well as with
other linguistic annotations) is that it is hard to agree on what is
correct and what is not. Therefore we recommend formulating
specifications on items like the phonemic segmentation more cautiously,
e.g.:
``The interlabeler agreement in the phonemic segmentation is at least
85%.''
The same is probably true for the transcript, the prosodic
labeling, all kinds of segment boundaries, etc.
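To illustrate how a statement such as "interlabeler agreement of at least 85%" might be checked, here is a minimal Python sketch. It rests on assumptions that are not part of this document: two labelers have marked the same sequence of segments, agreement is counted per boundary, and two boundaries agree if they differ by no more than a tolerance (20 ms is an arbitrary choice here). The function name `boundary_agreement` and the sample data are hypothetical; a real specification must define its own agreement measure.

```python
def boundary_agreement(labeler_a, labeler_b, tolerance_ms=20):
    """Share of boundaries on which two labelers agree within tolerance_ms.

    Assumes both lists hold boundary times in milliseconds for the same
    segments, in the same order.
    """
    if len(labeler_a) != len(labeler_b):
        raise ValueError("both labelers must mark the same number of boundaries")
    if not labeler_a:
        raise ValueError("no boundaries to compare")
    agreeing = sum(
        1 for a, b in zip(labeler_a, labeler_b) if abs(a - b) <= tolerance_ms
    )
    return agreeing / len(labeler_a)


# Invented sample data: the second boundary differs by 40 ms.
a = [100, 250, 410, 580]
b = [110, 290, 400, 580]
rate = boundary_agreement(a, b)
meets_spec = rate >= 0.85
print(rate, meets_spec)
```

With this measure the validator reports an observed agreement rate rather than a claim about which of the two labelings is correct.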
Angela Baumann
2004-06-03