The spoken content of a speech corpus is the second major feature
that determines the possible usage of the resource. This feature is
not fully independent of other specifications, such as the
speaking style. There are four main
approaches to defining the spoken content of a corpus: by vocabulary,
by domain, by task or by phonological distribution.
These approaches may be combined in some cases.