A linear model of acoustic-to-facial mapping: Model parameters, data set size, and generalization across speakers
    1 Institute of Biomaterials and Biomedical Engineering and Oral Dynamics Laboratory, University of Toronto, 160-500 University Avenue, Toronto, Ontario M5G 1V7, Canada
    2 Department of Speech Language Pathology, Oral Dynamics Laboratory, Institute of Biomaterials and Biomedical Engineering, and Department of Psychology, University of Toronto and Toronto Rehabilitation Institute, Toronto, Canada
    3 Institute of Biomaterials and Biomedical Engineering and Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada
    a) Electronic mail: p.vanlieshout@utoronto.ca
    J. Acoust. Soc. Am. 124, 3183 (2008); http://dx.doi.org/10.1121/1.2982369


FIG. 1. The power spectrum of an acoustic frame during the production of /i/ by a male speaker, with a 16th order LPC filter response (envelope) and first six LSPs (dark vertical lines).

FIG. 2. A subject with illuminated markers (left). The four gestures used in this study (right).

FIG. 3. Extracted 3D marker positions showing the oral area between the lips. The polygon connects the four points on the lips.
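As an illustration of how the area of such a four-point lip polygon could be quantified, the following minimal sketch sums the cross products of consecutive vertices. It assumes the four markers are roughly coplanar and listed in order around the polygon; the function name `polygon_area_3d` and the sample coordinates are hypothetical and are not taken from the article.

```python
import math

def polygon_area_3d(points):
    """Area of a planar polygon given as ordered (x, y, z) vertices.

    Uses the vector-area formula: half the norm of the sum of cross
    products of consecutive vertices. For markers that are not exactly
    coplanar this gives only an approximation.
    """
    sx = sy = sz = 0.0
    n = len(points)
    for i in range(n):
        ax, ay, az = points[i]
        bx, by, bz = points[(i + 1) % n]  # wrap around to close the polygon
        sx += ay * bz - az * by
        sy += az * bx - ax * bz
        sz += ax * by - ay * bx
    return 0.5 * math.sqrt(sx * sx + sy * sy + sz * sz)

# Hypothetical marker positions (arbitrary units), ordered around the lips.
lip_markers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
area = polygon_area_3d(lip_markers)
```

The vertex order matters: listing the markers in a crossing order would produce a self-intersecting polygon and an incorrect area.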

FIG. 4. Overall gesture CCs and NMSEs across sentences.
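For readers unfamiliar with the two measures, the sketch below shows one common way to compute a correlation coefficient (CC) and a normalized mean squared error (NMSE) between a predicted and an actual trajectory. The CC here is the Pearson coefficient; normalizing the squared error by the variance of the actual trajectory is an assumption, since this page does not state the exact normalization used in the article. The function names and sample values are hypothetical.

```python
import math

def correlation_coefficient(predicted, actual):
    """Pearson correlation between two equal-length trajectories."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(ss_p * ss_a)

def nmse(predicted, actual):
    """Mean squared error divided by the variance of the actual trajectory.

    Assumed normalization; other definitions (e.g. dividing by signal
    energy) are also in use.
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    return mse / var_a

# Hypothetical trajectories: the prediction has the right shape but twice the scale.
actual = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 2.0, 4.0, 6.0]
cc = correlation_coefficient(predicted, actual)   # 1.0: shape matches exactly
err = nmse(predicted, actual)                     # 2.8: amplitude is badly off
```

The example shows why the two measures are reported together: an overscaled prediction can correlate perfectly with the actual trajectory while still having a large NMSE.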

FIG. 5. Distribution of sentence mean CCs, averaged across subjects and gestures.

FIG. 6. Predicted (solid line) vs actual (dotted line) trajectories for the four different gestures (jaw, UL, LC, and UL/LL) for the sentence “The museum hires musicians every evening” from subject F3. These data illustrate different combinations of CC and NMSE values.

FIG. 7. Correlation between CC and NMSE for jaw predictions for subject M6.

FIG. 8. Overscaled predictions for UL from subject M5 for the sentence “Nothing is as offensive as innocence.” As in Fig. 6, solid lines depict predicted trajectories and dotted lines depict actual trajectories.

FIG. 9. Top: Sensitivity of CCs to acoustic frame size, using 89-sentence training sets, averaged across subjects. Bottom: Sensitivity of CC to the number of sentences in the training data set using acoustic frame, averaged across subjects.

FIG. 10. Single subject, multisubject, and subject averaged transformations, separated by gesture.



Correlation between CCs and NMSEs across subjects.


A comparison of subject-dependent and subject-independent results.

