Concept

The SIGNUM Database contains videos of isolated signs and of continuous sentences performed by various signers. The vocabulary comprises 450 signs in German Sign Language (DGS) representing different word types such as nouns, verbs, adjectives, and numbers. The signs selected are those that occur most frequently in everyday conversation and cannot be divided into smaller signs. Hence, they are called basic signs in the following. To select them, several books and visual media commonly used for learning DGS were evaluated.

All 450 basic signs differ in their manual parameters. Many of them, however, change their specific meaning when the manual performance is combined with a different facial expression. For example, the signs BÜRO (OFFICE) and SEKRETÄRIN (SECRETARY) are identical with respect to gesturing and can only be distinguished by the signer’s lip movements. In this case only the former sign is regarded as a basic sign, although both signs appear in the continuous sentences of the corpus. In total, 134 additional signs derived from the basic signs were integrated into the corpus.

Furthermore, some of the basic signs can be concatenated to create a new sign with a different meaning. For example, the sign KOPF+SCHMERZEN (HEADACHE) is composed of the two basic signs KOPF (HEAD) and SCHMERZEN (PAIN). Following this concept, 156 composed signs were collected and integrated as well. Although the selected vocabulary is limited to 450 basic signs, altogether 740 different meanings (450 + 134 + 156) can be expressed by means of recombination and concatenation.

Based on this extended vocabulary, a total of 780 continuous sentences were constructed. All sentences are grammatically well-formed, and there are no constraints regarding a specific sentence structure. Sentences range from two to eleven signs in length. No intentional pauses are placed between signs within a sentence, but the sentences themselves are separated from one another. The annotation follows the specifications of the Aachener Glossenumschrift, developed by the Deaf Sign Language Research Team (DESIRE) at RWTH Aachen University.

To model interindividual variation in articulation, all 450 basic signs and 780 sentences were performed once by each of 25 native signers of different sexes and ages. One of them was chosen as the so-called reference signer. His articulations were recorded three times, so that signer-dependent recognition rates can be evaluated. Altogether 33,210 utterances (12,150 isolated signs and 21,060 continuous sentences) are stored in the database.
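
The stated totals are consistent with 27 performances of every item: one each from 24 signers plus three from the reference signer. The following is a minimal Python sketch of that arithmetic. The variable names are illustrative only and are not part of the database.

```python
# Counts taken from the description above; names are illustrative only.
NUM_SIGNERS = 25
REFERENCE_REPETITIONS = 3  # the reference signer was recorded three times
NUM_BASIC_SIGNS = 450
NUM_SENTENCES = 780

# 24 signers perform each item once; the reference signer performs it three times.
performances_per_item = (NUM_SIGNERS - 1) + REFERENCE_REPETITIONS

isolated_utterances = NUM_BASIC_SIGNS * performances_per_item
sentence_utterances = NUM_SENTENCES * performances_per_item
total_utterances = isolated_utterances + sentence_utterances

print(performances_per_item)   # 27
print(isolated_utterances)     # 12150
print(sentence_utterances)     # 21060
print(total_utterances)        # 33210
```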

To evaluate the recognition performance for different vocabulary sizes, the corpus is divided into three subcorpora simulating a vocabulary of 150, 300, and 450 basic signs, respectively. The three subcorpora build on each other as follows:

  Subcorpus     Vocabulary Size   Isolated Signs   Continuous Sentences
  Subcorpus A   150               0001 - 0150      0001 - 0260
  Subcorpus B   300               0001 - 0300      0001 - 0520
  Subcorpus C   450               0001 - 0450      0001 - 0780
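
Because every range starts at 0001, each subcorpus contains the previous one. As a rough illustration, the lookup below returns the subcorpora that include a given item number. The function name `subcorpora_containing` and the dictionary layout are assumptions made for this sketch, not part of the database or its tooling. Only the upper bounds come from the table.

```python
from typing import List

# Upper ID bounds per subcorpus, copied from the table above.
# Every range starts at ID 1, so A is contained in B, and B in C.
SUBCORPORA = {
    "A": {"signs": 150, "sentences": 260},
    "B": {"signs": 300, "sentences": 520},
    "C": {"signs": 450, "sentences": 780},
}


def subcorpora_containing(item_id: int, kind: str) -> List[str]:
    """Return the names of all subcorpora whose ID range includes item_id.

    kind is "signs" (isolated signs) or "sentences" (continuous sentences).
    """
    if kind not in ("signs", "sentences"):
        raise ValueError(f"unknown kind: {kind!r}")
    return [
        name
        for name, bounds in SUBCORPORA.items()
        if 1 <= item_id <= bounds[kind]
    ]


print(subcorpora_containing(200, "signs"))      # ['B', 'C']
print(subcorpora_containing(600, "sentences"))  # ['C']
```

An item number outside 1 to 450 (signs) or 1 to 780 (sentences) yields an empty list.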