Apart from the coding scheme itself, you also have to decide on the contents of the dictionary. The minimum content as described in section will be a simple table containing a consistent orthographic representation and the most likely, or canonical, pronunciation. Since "canonical pronunciation" is not a well-defined term for most languages, make sure you come up with a working definition that can be applied consistently while the dictionary is being created. In some cases you can refer to a standard dictionary [9.2] or, even better, to a standardized set of pronunciation rules [9.3]. If this is not possible, provide a minimal rule set for problematic cases, to be used by your staff while they work on the pronunciations. Include this rule set in your documentation of the corpus.
If you are working on a German speech corpus, you may use the BAS rule set for manual transcription as given in appendix .
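The minimal table described above can be checked mechanically for consistency. The following Python sketch is only an illustration under assumed conventions, not a format prescribed here: it assumes one entry per line, with the orthographic form and the pronunciation separated by a tab, and phoneme symbols separated by spaces. The sample entries, the transcriptions, and the function names load_lexicon and find_conflicts are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: orthography <TAB> pronunciation, one entry per line.
# The transcriptions are illustrative only; "Haus" deliberately appears twice
# with different pronunciations to show what the check reports.
SAMPLE = "Haus\th aU s\nMaus\tm aU s\nHaus\th aU z\n"


def load_lexicon(text):
    """Map each orthographic form to the set of pronunciations listed for it."""
    entries = defaultdict(set)
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 tab-separated fields")
        orth = fields[0].strip()
        # Normalize whitespace between phoneme symbols.
        pron = " ".join(fields[1].split())
        entries[orth].add(pron)
    return dict(entries)


def find_conflicts(entries):
    """Return words that have more than one pronunciation in the table."""
    return sorted(word for word, prons in entries.items() if len(prons) > 1)


lexicon = load_lexicon(SAMPLE)
conflicts = find_conflicts(lexicon)
print(conflicts)
```

Words reported by such a check are candidates for the minimal rule set: each one needs a decision on which pronunciation counts as canonical, and that decision belongs in the corpus documentation.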