Next: Pre-Validation Up: Collection Previous: Collection   Contents


Ongoing Documentation, Logging

Ongoing documentation, or logging, is essential to the later usability of the speech corpus. Every process of the data collection must be documented so that users of the corpus understand all aspects that may matter when they later work with the data.

There are basically two ways to do the logging during the speech data collection: on paper or online.

Logging on paper is easy and can be done anywhere without computer hardware. However, in most cases the written data must later be transferred into machine-readable form, which adds cost. Online logging is therefore preferable, either with a customized editor or by entering data into a database system via a Web server.

Practically all modern database systems allow data to be accessed and entered via a Web interface. The advantage of this method is that data from different processes can easily be linked. For instance, you might use the same database system to schedule your recording sessions and to enter the required metadata about recordings and speakers. Take care that the basic rules of data protection are observed. See also section [*] (p. [*]).
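As a rough illustration of how scheduling and metadata can share one database, the following minimal sketch uses Python's built-in sqlite3 module. The table names, column names, status values and helper functions are hypothetical and chosen only for this example; they are not the obligatory fields discussed in this section, and a real collection would probably sit behind a Web front end rather than a local file.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the obligatory fields named in this document.
SCHEMA = """
CREATE TABLE speaker (
    speaker_id INTEGER PRIMARY KEY,
    pseudonym  TEXT NOT NULL          -- store no real names (data protection)
);
CREATE TABLE session (
    session_id   INTEGER PRIMARY KEY,
    speaker_id   INTEGER NOT NULL REFERENCES speaker(speaker_id),
    scheduled_at TEXT NOT NULL,       -- ISO 8601 timestamp
    status       TEXT NOT NULL DEFAULT 'scheduled'
);
CREATE TABLE recording (
    recording_id INTEGER PRIMARY KEY,
    session_id   INTEGER NOT NULL REFERENCES session(session_id),
    file_name    TEXT NOT NULL,
    note         TEXT
);
"""


def open_log(path=":memory:"):
    """Open (or create) the logging database and build the schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # reject unknown session ids
    conn.executescript(SCHEMA)
    return conn


def log_recording(conn, session_id, file_name, note=None):
    """Log one recording and mark its scheduled session as recorded."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO recording (session_id, file_name, note) "
            "VALUES (?, ?, ?)",
            (session_id, file_name, note),
        )
        conn.execute(
            "UPDATE session SET status = 'recorded' WHERE session_id = ?",
            (session_id,),
        )
    return cur.lastrowid


def recordings_for_speaker(conn, pseudonym):
    """Link recordings back to the speaker via the scheduling table."""
    return conn.execute(
        "SELECT r.file_name, s.scheduled_at "
        "FROM recording r "
        "JOIN session s ON s.session_id = r.session_id "
        "JOIN speaker p ON p.speaker_id = s.speaker_id "
        "WHERE p.pseudonym = ? "
        "ORDER BY r.recording_id",
        (pseudonym,),
    ).fetchall()
```

Because the recording table refers to the session table, and the session table to the speaker table, a single query can connect a recorded file to the speaker and to the scheduled time. Storing only a pseudonym in the speaker table is one possible way to respect data protection; whether it is sufficient depends on the applicable rules.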

The following list gives the obligatory data to be logged (marked with one *) and other possible data of interest logged during the collection phase:

If you are working on a large data collection with many staff members or project partners at different locations, you might also consider an automated Web information system in which interested parties can monitor the progress of the collection and react to certain developments (footnote 6.1).
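The kind of progress summary such an information system might publish can be sketched in a few lines of Python. The function name, the status labels and the idea of a fixed target number of sessions are assumptions made for this example only; a real system would read the statuses from the logging database and render the result on a Web page.

```python
from collections import Counter


def progress_report(session_statuses, target_sessions):
    """Summarise collection progress as plain text.

    session_statuses: iterable of status labels, one per session
                      (hypothetical labels such as 'recorded').
    target_sessions:  planned total number of sessions (assumed known).
    """
    counts = Counter(session_statuses)
    done = counts.get("recorded", 0)
    percent = 100.0 * done / target_sessions if target_sessions else 0.0
    lines = [f"recorded: {done}/{target_sessions} ({percent:.1f}%)"]
    # List every other status alphabetically so unusual developments
    # (for example cancellations) are visible to all partners.
    for status in sorted(counts):
        if status != "recorded":
            lines.append(f"{status}: {counts[status]}")
    return "\n".join(lines)
```

For example, `progress_report(["recorded", "recorded", "scheduled", "cancelled"], 8)` reports two of eight sessions recorded, followed by one cancelled and one scheduled session.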


BITS Projekt-Account 2004-06-01