The format in which these data are stored must be included in the corpus specification. Because no widely accepted standard exists, and because speech corpora very often contain a new form of annotation that has never been applied before, many corpora use proprietary formats that were defined only for that particular occasion. Some of these formats have become commonly accepted and have been reused in other collections.
In this section we list some more or less standardized file formats for annotation data and outline their respective properties. These properties can be described by the following criteria:
Pros:
Pros:
Pros:
Recommendation: Comparable to SAM but without the header part.