Format Conversion

Most likely you will have to convert your final signal files into a standard format as required by your corpus specification (see section [*], p. [*] for a detailed description of the most common speech file formats). If your target file format is a standard one, you will probably use the general speech file conversion tool sox mentioned above. Sox has the great advantage of being a command-line tool, which makes it easy to include in your scripts. Again, we recommend that you add some simple checks to your checklists to ensure that no data has been lost in the conversion.
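One possible form of such a check is sketched here, assuming the conversion target is an uncompressed PCM WAV file and the original samples are still available as raw bytes. The function name wav_matches_raw and its parameters are hypothetical and not part of any tool mentioned in this text; the sketch uses only the wave module from the Python standard library and compares both the header fields and the sample data.

```python
import wave


def wav_matches_raw(wav_path, raw_bytes, channels, sample_width, sample_rate):
    """Return True if the WAV file holds exactly the given raw PCM samples.

    Hypothetical helper: assumes uncompressed PCM, with sample_width in bytes
    and sample_rate in Hz.
    """
    with wave.open(wav_path, "rb") as wav:
        # Check the header fields first: a wrong sampling rate or channel
        # count is a conversion error even if the sample bytes are intact.
        header_ok = (
            wav.getnchannels() == channels
            and wav.getsampwidth() == sample_width
            and wav.getframerate() == sample_rate
        )
        # Read every frame so that truncated files are detected as well.
        frames = wav.readframes(wav.getnframes())
    return header_ok and frames == raw_bytes
```

A check of this kind can be run on every converted file, or on a random sample of files if the corpus is large.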

If you want to use a compression technique such as shorten or gzip, you should place this conversion step at the very end of your data pipeline, because your annotation tools almost certainly will not work with compressed input data (see section [*] for a discussion of compression techniques).
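A final compression step with a built-in round-trip check might look like the following minimal sketch. It covers gzip only, using the gzip module from the Python standard library; shorten is not handled here. The function name gzip_and_verify and the path arguments are hypothetical.

```python
import gzip
import shutil


def gzip_and_verify(src_path, dst_path):
    """Compress src_path to dst_path and confirm the round trip is lossless.

    Hypothetical helper: returns True if decompressing dst_path yields
    exactly the bytes of src_path. The source file is left in place.
    """
    # Stream the file through gzip so large signal files are not
    # held in memory during compression.
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Decompress again and compare with the original bytes.
    # (This comparison reads both files fully into memory.)
    with open(src_path, "rb") as src, gzip.open(dst_path, "rb") as packed:
        return src.read() == packed.read()
```

Leaving the uncompressed file in place until the check has passed means that a failed compression run cannot destroy data.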

