The scope of this cookbook does not allow us to cover all possible
segmentation and labeling procedures. Developing a segmentation scheme
probably requires even more time and effort than developing the
transcription scheme discussed in the previous sections. We therefore
strongly recommend that you select an existing scheme and follow its
recommendations. The following (incomplete) list of projects that
involved segmentation and labeling schemes
may give you some direction:
- Kiel Corpus of Read/Spontaneous Speech
In this project a moderate amount of read speech from the PhonDat corpus
and non-prompted speech from the Verbmobil I corpus was
segmented and labeled into phonetic/phonemic units together with a
selection of prosodic markers. The formats used in this project are
called S1 and S2 and were developed in the PhonDat
projects. Since these formats are very hard to parse, we do not recommend
using them for segmentation and labeling.
See www.ipds.uni-kiel.de/forschung/kielcorpus.en.html for details.
- BAS Verbmobil I
The Verbmobil I corpus distributed by BAS contains several segmentations and
labelings: manual phonetic/phonemic segmentation,
automatic phonetic/phonemic segmentation using the MAUS method, prosodic
segmentation and labeling in GToBI, and word segmentation.
See www.bas.uni-muenchen.de/Bas/BasKorporaeng.html#VMI for details.
- Segmentation of the Switchboard Corpus
Parts of the Switchboard Corpus have been segmented and
labeled into syllable units by Steve Greenberg and his group. For
details see www.icsi.berkeley.edu/real/stp/
BITS Projekt-Account
2004-06-01