Phonological Distribution

In some cases, most often in scientific work or in connection with speech synthesis, the contents of a speech corpus have to be specified not in terms of vocabulary but in terms of phonological units such as phonemes, syllables, or morphemes.

For instance, a general-purpose speech recognition system will require a minimum number of repetitions of every possible phoneme, in various contexts, from each speaker.
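A requirement of this kind can be checked mechanically once the prompt material has been transcribed phonemically. The following is a minimal sketch in Python, not a prescribed procedure. It assumes that each utterance is available as a pair of speaker ID and phoneme list; the function names, the phoneme inventory and the threshold are hypothetical and chosen for illustration only. For simplicity it counts repetitions only and ignores the contexts in which a phoneme occurs.

```python
from collections import Counter, defaultdict


def phoneme_counts_per_speaker(utterances):
    """Count phoneme occurrences separately for each speaker.

    utterances: iterable of (speaker_id, list_of_phoneme_symbols) pairs
    (an assumed input format, not one defined by the text above).
    """
    counts = defaultdict(Counter)
    for speaker, phonemes in utterances:
        counts[speaker].update(phonemes)
    return counts


def underrepresented(counts, inventory, min_count):
    """Return {speaker: {phoneme: count}} for every phoneme of the
    inventory that a speaker produced fewer than min_count times."""
    gaps = {}
    for speaker, counter in counts.items():
        # A Counter returns 0 for symbols it has never seen.
        missing = {p: counter[p] for p in inventory if counter[p] < min_count}
        if missing:
            gaps[speaker] = missing
    return gaps
```

Calling underrepresented() on the planned prompt list before recording shows which phonemes still need additional material for which speaker.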

Similarly, a corpus for concatenative speech synthesis will require every diphone combination to be uttered by the same speaker in a minimum of 20 different left and right contexts.
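The diphone requirement can be sketched in the same way. The Python example below is an illustration under stated assumptions, not a definitive method: a diphone is taken to be a pair of adjacent phonemes, its context is taken to be the pair of the preceding and the following phoneme, and "different contexts" is read as distinct (left, right) pairs. The boundary symbol "#" and all function names are hypothetical. All utterances passed in are assumed to come from one speaker.

```python
from collections import defaultdict

BOUNDARY = "#"  # hypothetical symbol for the utterance start and end


def diphone_contexts(utterances):
    """Map each diphone to the set of (left, right) contexts it occurs in.

    utterances: iterable of phoneme lists from a single speaker
    (an assumed input format).
    """
    contexts = defaultdict(set)
    for phonemes in utterances:
        padded = [BOUNDARY] + list(phonemes) + [BOUNDARY]
        # i indexes the first phoneme of the diphone; the padding
        # guarantees that a left and a right neighbour always exist.
        for i in range(1, len(padded) - 2):
            diphone = (padded[i], padded[i + 1])
            contexts[diphone].add((padded[i - 1], padded[i + 2]))
    return contexts


def diphone_coverage_gaps(contexts, required_diphones, min_contexts=20):
    """Return {diphone: number_of_contexts} for every required diphone
    that was seen in fewer than min_contexts distinct contexts."""
    gaps = {}
    for diphone in required_diphones:
        seen = len(contexts.get(diphone, ()))
        if seen < min_contexts:
            gaps[diphone] = seen
    return gaps
```

If the requirement is instead meant as 20 left contexts and 20 right contexts counted separately, the two elements of each context pair would have to be collected in two separate sets.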

