BAS Partitur Format files (BPF)

TNNN_SD.par

The BAS Partitur Format is a simple but effective way to represent symbolic (discrete) labels (categories together with their time information) aligned to a physical signal. Developed for the multi-tier annotation of speech signals, it was also used in the SmartKom project to summarize and align all the annotation layers (tiers) associated with different (time-synchronized) signals.

The main (and up-to-date) documentation can be found at www.bas.uni-muenchen.de/Bas/BasFormatseng.html (a copy of this page at the time of distribution can be found in 'doc/pardoc/index.html'). This README.PAR contains only some general remarks about the usage of BPF within SmartKom. For the details of the individual tiers please refer to the main documentation. For a more general description of the concept please refer to 'doc/papers/Schiel-02-LREC-WS.ps'.

BPF uses two ways to align the different information tiers of a recording: the physical time scale, and a discrete numbering of events that can be associated with most of the tiers, namely the word order (numbered beginning with 0).

For the SmartKom corpus this has the following consequences:

- For each recording there are two BPF files, e.g.
    w157_mn_ADA.par
  contains the BPF of the speaker 'ADA' and
    w157_mn_SMA.par
  contains the BPF of the SmartKom system as they communicate in the recording session 'w157_mn' (the SmartKom system always has the ID 'SMA').
- Each BPF file contains all events and annotations for the whole session; time is measured from the beginning of the time-synchronized signal files. Unless stated otherwise, the unit of time is 1/16000 sec.
- The following tiers are given in a SK BPF file:
    TRS : basic transliteration broken into word units
    TRN : beginning and duration of turns, numbered as in the transliteration format
    ORT : orthography as used in the pronunciation dictionary
    KAN : canonic pronunciation as given in the pronunciation dictionary
    SUP : crosstalk between speaker and system
    NOI : noise markers associated with words or the gaps between words
    GES : 2-D gesture labeling with reference to the time scale
    USH : holistic user state labeling and segmentation with reference to the time scale
    USM : user state labeling and segmentation from facial expression with reference to the time scale
    USP : meta-linguistic labeling and segmentation from voice input with reference to words and to the time scale, as input to automatic user state detection from voice input
    MAU : automatic phonetic/phonemic segmentation and labelling by MAUS

Since the BPF files integrate all other annotations into a single format, and since these annotations will be pruned of errors and/or updated from time to time, the BPF files change frequently (every time an annotation is updated). We therefore strongly recommend that you download the newest version of the BPF files before any critical work with the SK data.

Updated versions of the BPF files of SmartKom may be downloaded from ftp://ftp.bas.uni-muenchen.de/pub/BAS/SK
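
The file naming and time conventions described above can be illustrated with a minimal Python sketch. The helper names (split_bpf_name, is_system_file, samples_to_seconds) are hypothetical and not part of any BAS tool. The sketch assumes that the speaker ID is the part of the file name after the last underscore, as in the example 'w157_mn_ADA.par', and that the default time unit of 1/16000 sec applies.

```python
from pathlib import PurePath

# Default time unit stated in this README: 1/16000 sec.
SAMPLE_RATE_HZ = 16000
# The SmartKom system always has the ID 'SMA'.
SYSTEM_ID = "SMA"


def split_bpf_name(filename):
    """Split a BPF file name such as 'w157_mn_ADA.par' into (session, speaker).

    Assumption: the speaker ID follows the last underscore of the name.
    """
    stem = PurePath(filename).stem          # e.g. 'w157_mn_ADA'
    session, _, speaker = stem.rpartition("_")
    if not session or not speaker:
        raise ValueError(f"unexpected BPF file name: {filename!r}")
    return session, speaker


def is_system_file(filename):
    """Return True if the file holds the BPF of the SmartKom system."""
    return split_bpf_name(filename)[1] == SYSTEM_ID


def samples_to_seconds(samples, rate_hz=SAMPLE_RATE_HZ):
    """Convert a BPF time value (in units of 1/rate_hz sec) to seconds."""
    return samples / rate_hz
```

With these helpers, split_bpf_name('w157_mn_SMA.par') separates the session 'w157_mn' from the ID 'SMA', and a time value of 16000 corresponds to 1.0 sec. Where a tier states a different unit of time, pass the matching rate as rate_hz instead of relying on the default.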