Number of Speakers

The number of speakers is one of the most important characteristics of a spoken language corpus. Speech corpora can be roughly divided into the following three classes ([2], pp. 107-109):

  1. Speech corpora with 1 to 5 speakers are often used in the development of speech synthesis systems or for basic research, e.g. where invasive measurements must be made.
  2. Speech corpora with about 5 to 50 speakers are often used in experimental factorial research. If a factorial experimental design is planned, both the number of speakers and the number of repetitions of the speech phenomena under investigation should be large enough to allow meaningful statistical analysis.
  3. Speech corpora with more than 50 speakers are necessary to adequately train and test speech recognition or speaker verification systems.
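
Note that a small number of speakers does not necessarily mean a small corpus! A corpus recorded for speech synthesis, for example, may contain many hours of speech from a single speaker.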
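
The three classes above can be sketched as a small lookup, for example when sorting corpus descriptions by their intended use. This is only an illustration: the function name `corpus_class` and the returned labels are hypothetical, and the class boundaries are approximate in the text, so assigning exactly 5 speakers to the first class is an assumption.

```python
def corpus_class(num_speakers: int) -> str:
    """Return the rough corpus class for a given number of speakers.

    The thresholds follow the three classes listed above. The source
    gives them only approximately ("about 5 to 50"), so the exact
    cut-offs used here are an assumption.
    """
    if num_speakers < 1:
        raise ValueError("a speech corpus needs at least one speaker")
    if num_speakers <= 5:
        # Class 1: speech synthesis development, basic research
        return "speech synthesis / basic research"
    if num_speakers <= 50:
        # Class 2: experimental factorial research
        return "experimental factorial research"
    # Class 3: training and testing recognition or verification systems
    return "speech recognition / speaker verification"


if __name__ == "__main__":
    for n in (1, 12, 300):
        print(n, "speakers:", corpus_class(n))
```

The function classifies by speaker count only; it says nothing about the total amount of recorded speech, which is a separate property of a corpus.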

