Storage Structure

Each signer performed the 450 basic signs and 780 continuous sentences of the corpus once. The first signer, called the reference signer, was recorded three times rather than once. Altogether 33,210 utterances (12,150 isolated signs and 21,060 continuous sentences) are available in the database. Each utterance is stored in a separate directory as a sequence of image files; the directory is packed into a ZIP file. Directory and file names are built from short letter abbreviations, each followed by a two- or four-digit number.

                         Directory                  Filename
  Isolated signs         /data/sig#1/per#2/iso#3/   s#1-p#2-i#3-f#5.jpg
  Continuous sentences   /data/sig#1/per#2/con#4/   s#1-p#2-c#4-f#5.jpg

The following table summarizes the abbreviations used in the naming convention.

                 Directory                        Filename
  Denotation     Abbreviation   Number (Digits)   Abbreviation   Number (Digits)
  signer         sig            #1 (2)            s              #1 (2)
  performance    per            #2 (2)            p              #2 (2)
  isolated       iso            #3 (4)            i              #3 (4)
  continuous     con            #4 (4)            c              #4 (4)
  frame          -              #5 (4)            f              #5 (4)
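The naming convention can be expressed as two small formatting helpers. This is only a sketch: the function names `utterance_dir` and `frame_filename` and the `kind` argument are hypothetical and not part of the database; the zero-padding widths are taken from the table above.

```python
def utterance_dir(signer: int, performance: int, kind: str, index: int) -> str:
    """Return the directory of one utterance, e.g. /data/sig01/per02/iso0150/.

    kind is "iso" (isolated sign) or "con" (continuous sentence).
    """
    if kind not in ("iso", "con"):
        raise ValueError("kind must be 'iso' or 'con'")
    # Signer and performance use 2 digits, the sign/sentence index uses 4.
    return f"/data/sig{signer:02d}/per{performance:02d}/{kind}{index:04d}/"


def frame_filename(signer: int, performance: int, kind: str, index: int, frame: int) -> str:
    """Return the file name of one frame, e.g. s01-p02-i0150-f0064.jpg."""
    if kind not in ("iso", "con"):
        raise ValueError("kind must be 'iso' or 'con'")
    # File names use the first letter of each directory abbreviation.
    return (
        f"s{signer:02d}-p{performance:02d}-"
        f"{kind[0]}{index:04d}-f{frame:04d}.jpg"
    )
```

For instance, `utterance_dir(1, 2, "iso", 150) + frame_filename(1, 2, "iso", 150, 64)` gives the full path of the 64th frame of that utterance.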

Note that as of version 1.2, the directories holding the individual frame image files are replaced by identically named ZIP files, and each of these ZIP files is accompanied by a UTF-8 encoded TXT label file giving the annotation of the recorded sign or sentence.
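Reading one utterance in the version 1.2 layout might look roughly like the following. This is a sketch under assumptions: the helper names `list_frames` and `read_label` are hypothetical, the exact file names of the ZIP and TXT files are not specified here and must be supplied by the caller, and the archive is assumed to contain the frame images as `.jpg` members.

```python
import zipfile


def list_frames(zip_path: str) -> list[str]:
    """Return the names of all .jpg members of a ZIP file, sorted by name.

    Frame numbers are zero-padded to four digits, so sorting members of
    one utterance by name also sorts them by frame number.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".jpg")
        )


def read_label(txt_path: str) -> str:
    """Return the annotation stored in a UTF-8 label file."""
    with open(txt_path, encoding="utf-8") as handle:
        return handle.read().strip()
```

The image data of a single frame can then be obtained with `zipfile.ZipFile(zip_path).read(name)` for any name returned by `list_frames`.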

Some examples to illustrate the naming convention (after unpacking the ZIP file):

  Directory/Filename                                  Explanation
  /data/sig01/per02/iso0150/s01-p02-i0150-f0064.jpg   64th frame of the image sequence containing the
                                                      150th isolated sign performed the second time
                                                      by the 1st signer (= reference signer)
  /data/sig25/per01/con0260/s25-p01-c0260-f0128.jpg   128th frame of the image sequence containing the
                                                      260th continuous sentence performed the first time
                                                      by the 25th signer
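
Going the other way, a frame path such as those in the examples can be decoded with a regular expression. The names `FrameId` and `parse_frame_path` below are hypothetical helpers for illustration, not part of the database distribution.

```python
import re
from typing import NamedTuple


class FrameId(NamedTuple):
    signer: int
    performance: int
    kind: str        # "isolated" or "continuous"
    index: int       # number of the sign or sentence
    frame: int


# s<2 digits>-p<2 digits>-<i|c><4 digits>-f<4 digits>.jpg
_FRAME_RE = re.compile(r"s(\d{2})-p(\d{2})-([ic])(\d{4})-f(\d{4})\.jpg")


def parse_frame_path(path: str) -> FrameId:
    """Decode the file-name part of a frame path into its components."""
    name = path.rsplit("/", 1)[-1]  # drop any leading directories
    match = _FRAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a frame file name: {name!r}")
    signer, performance, letter, index, frame = match.groups()
    kind = "isolated" if letter == "i" else "continuous"
    return FrameId(int(signer), int(performance), kind, int(index), int(frame))
```

Applied to the first example path, this yields signer 1, performance 2, isolated sign 150, frame 64.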