Segments vs. Points-in-Time

Speech events either cover a certain time span (a segment) or occur at a single point in time. Segmental events include, for instance, phonetic features (e.g. voiced), phones, syllables, morphs, words, dialog acts and dialog turns. Events that occupy only a single point in time include glottal pulses, bursts, energy peaks or valleys, fundamental frequency peaks or lows, voice onsets, accents and syllable nuclei.
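The distinction can be made concrete as a data structure. The following is a minimal sketch in Python, not a format prescribed by any particular corpus or tool: the class names Segment and PointEvent, their fields, and the convention that times are given in seconds from the start of the recording are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointEvent:
    """An event at a single point in time, e.g. a burst or a glottal pulse."""
    label: str
    time: float  # assumed unit: seconds from the start of the recording


@dataclass(frozen=True)
class Segment:
    """An event that covers a time span, e.g. a phone, a word or a turn."""
    label: str
    start: float  # assumed unit: seconds from the start of the recording
    end: float    # must not precede start

    def __post_init__(self):
        # Reject segments whose boundaries are in the wrong order.
        if self.end < self.start:
            raise ValueError("segment end precedes segment start")

    @property
    def duration(self) -> float:
        return self.end - self.start

    def contains(self, point: PointEvent) -> bool:
        # Half-open interval: a point on the end boundary belongs
        # to the following segment, not to this one.
        return self.start <= point.time < self.end


# Hypothetical example values, chosen only to show the usage.
word = Segment("word", 0.5, 0.75)
burst = PointEvent("burst", 0.625)
```

A segment needs two time values and therefore has a duration, while a point event needs only one. Treating segments as half-open intervals is one possible design choice; it ensures that a point event falling exactly on a shared boundary is assigned to only one of two adjacent segments.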

In most speech corpora you will encounter segmentations into turns, dialog acts or words, and on a much smaller scale also segmentations into phones and prosodic categories. As a rule of thumb, the effort for segmentation and labeling increases dramatically as the labeled units get smaller, roughly in inverse proportion to their size [8.11].

