Post-processing comprises all processing steps between the recorded raw signal data and the final distributed corpus. Not all of the following steps may be necessary for your corpus collection; however, some of them are (marked with a *): file transfer from recording device to computer, file name assignment*, filtering, cutting, synchronization, re-sampling, format conversion*, special conversion for annotation and automatic error detection*. Please note that some of these steps may be applied after or between the annotation steps described in the next chapter, depending on the structure of your data pipeline.
We consider this chapter highly relevant for the prospective producer of a new speech corpus, because the costs and manpower needed for post-processing are often neglected or at least grossly underestimated. Please review this chapter carefully before you calculate the overall costs of your corpus production, and take into account all the post-processing steps that your individual corpus production requires.
Although the order of the processing steps is in principle arbitrary, the most effective order is given in the following description.
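When planning costs, it can help to write the processing steps down as an ordered checklist and to verify that none of the steps marked with a * has been forgotten. The following Python sketch is only an illustration under stated assumptions: the names `Step`, `STEPS` and `plan` are hypothetical, the order is taken from the listing at the start of this chapter, and "special conversion for annotation" and "automatic error detection" are read as two separate steps, of which only the latter is treated as mandatory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    mandatory: bool  # True for steps marked with a * in the chapter listing


# Order follows the listing in this chapter (an assumption of this sketch).
STEPS = [
    Step("file transfer", False),
    Step("file name assignment", True),
    Step("filtering", False),
    Step("cutting", False),
    Step("synchronization", False),
    Step("re-sampling", False),
    Step("format conversion", True),
    # Assumption: the last list item is read as two separate steps.
    Step("special conversion for annotation", False),
    Step("automatic error detection", True),
]


def plan(selected):
    """Return the selected step names in listing order.

    Raises ValueError if a name is unknown or a mandatory step is missing.
    """
    selected = set(selected)
    known = {step.name for step in STEPS}
    unknown = selected - known
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    missing = [s.name for s in STEPS if s.mandatory and s.name not in selected]
    if missing:
        raise ValueError(f"mandatory steps missing: {missing}")
    return [s.name for s in STEPS if s.name in selected]


if __name__ == "__main__":
    chosen = {
        "cutting",
        "format conversion",
        "file name assignment",
        "automatic error detection",
    }
    for number, name in enumerate(plan(chosen), start=1):
        print(number, name)
```

A plan that omits a starred step is rejected with an error, so the omission is noticed before the costs are calculated rather than during production.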