Face, Speech, and Acoustics  



 

Abstract

Johannes Behrends (City Center Hospital, University of Munich)

Vocal tract recovering using MRI

MRI enables in vivo analysis of the 3D structure of the human vocal tract. From such data, acoustic-articulatory models can be derived with computer-aided image analysis and pattern recognition methods, an important contribution to the construction of speech synthesis systems and to the clinical diagnosis of congenital or surgery-induced speech disorders. To this end, MRI examinations were performed during phonation with different slice orientations. Anatomically correct image registration yields a highly precise 3D reconstruction of the vocal tract at isotropic resolution. The segmentation of the vocal tract is based on 3D region growing. Poor rendering of the teeth and hard palate, together with the mouth and glottal openings, requires some preprocessing steps. The problem of the teeth and hard palate is solved by inserting computed tomography scans of dental impressions of each speaker into the anatomical dataset by interactive multi-modality image fusion. The mouth and glottal openings are closed by defining reference points on the "leaks". Additionally, the mouth region is convolved with an I-shaped kernel. After extraction of the vocal tract by 3D region growing, a curvilinear midline is computed which, in contrast to earlier work on this topic, runs through the vocal tract in three dimensions. This is done by unsupervised training of a Kohonen chain, modified to avoid overfolding of the codebook (i.e. |r'-r''|>1) (Der, 1998). After some postprocessing steps, including smoothing and sampling equidistant points on the midline, determining the cross-sectional area at each sampling point yields the characteristic "area function" for each phoneme. The system as a whole facilitates rapid and precise segmentation and analysis of the human vocal tract, a helpful contribution to MRI analysis in multi-speaker studies.
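
As an illustration only, the two central steps named in the abstract (3D region growing, then fitting a chain of nodes through the segmented volume) can be sketched in Python with NumPy. This is not the author's implementation. It rests on several assumptions that are not in the abstract: the airway is taken to be darker than a fixed intensity threshold, 6-connectivity is used, and the chain is a plain one-dimensional Kohonen map initialised along the principal axis of the voxel cloud. The anti-overfolding modification (Der, 1998), the closing of the mouth and glottal "leaks", and the area-function computation are not reproduced. The function names region_grow_3d and fit_kohonen_chain, all parameter values, and the synthetic test volume are hypothetical.

```python
from collections import deque

import numpy as np


def region_grow_3d(volume, seed, threshold):
    """Return a boolean mask of voxels 6-connected to `seed` with intensity < threshold.

    Assumption: the region of interest is darker than its surroundings.
    """
    mask = np.zeros(volume.shape, dtype=bool)
    if volume[seed] >= threshold:
        return mask  # seed is not inside the dark region
    mask[seed] = True
    queue = deque([seed])
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in neighbours:
            n = (z + dz, y + dy, x + dx)
            inside = all(0 <= c < s for c, s in zip(n, volume.shape))
            if inside and not mask[n] and volume[n] < threshold:
                mask[n] = True
                queue.append(n)
    return mask


def fit_kohonen_chain(points, n_nodes=8, n_iter=2000, seed=0):
    """Fit a standard 1D Kohonen map (a chain of nodes) to a 3D point cloud.

    Assumption: plain SOM update with a Gaussian neighbourhood along the chain;
    the modification against overfolding described in the abstract is omitted.
    """
    rng = np.random.default_rng(seed)
    # Initialise the nodes in order along the principal axis of the cloud.
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    proj = (points - mean) @ vt[0]
    t = np.linspace(proj.min(), proj.max(), n_nodes)
    nodes = mean + t[:, None] * vt[0]
    idx = np.arange(n_nodes)
    for it in range(n_iter):
        frac = it / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01                   # decaying learning rate
        sigma = max(n_nodes / 4.0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
        p = points[rng.integers(len(points))]            # random training voxel
        winner = np.argmin(np.linalg.norm(nodes - p, axis=1))
        h = np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))
        nodes += lr * h[:, None] * (p - nodes)           # pull winner and neighbours
    return nodes


# Synthetic stand-in for an MRI volume: bright tissue with a dark straight tube.
volume = np.ones((20, 9, 9))
volume[2:18, 3:6, 3:6] = 0.0   # hypothetical "airway"
volume[0, 0, 0] = 0.0          # dark voxel that is not connected to the tube

mask = region_grow_3d(volume, seed=(10, 4, 4), threshold=0.5)
voxels = np.argwhere(mask).astype(float)
midline = fit_kohonen_chain(voxels)
print(mask.sum(), midline.shape)
```

In this sketch the isolated dark voxel is excluded because region growing only follows connected neighbours, which is the same reason the abstract's mouth and glottal openings have to be closed before segmentation: an open "leak" would let the region grow out of the vocal tract. Sampling equidistant points along the fitted chain and measuring the cross-section at each one would be the next step towards an area function.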

 


Last modified: Mon Nov 18 13:43:21 CET 2002