
This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time magnetic resonance imaging (rtMRI). The approach is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets acquired at different times but using the same stimuli. The aligned corpus offers the advantages of high temporal resolution (from EMA) and a complete mid-sagittal view (from rtMRI). The co-registration also yields optimal placement of EMA sensors as articulatory landmarks on the magnetic resonance images, thus providing richer spatio-temporal information about articulatory dynamics.
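The abstract does not detail the alignment algorithm, but a common building block for the temporal side of such registration is dynamic time warping (DTW) applied to parallel feature streams (e.g., acoustic features recorded alongside each modality). The following is a minimal illustrative sketch of DTW between two feature sequences sampled at different rates, not the authors' actual implementation; the function name `dtw_align` and the synthetic signals are assumptions for demonstration only.

```python
import numpy as np

def dtw_align(X, Y):
    """Dynamic time warping between two feature sequences.

    X: (n, d) array, Y: (m, d) array. Returns the optimal warping
    path as a list of (i, j) frame-index pairs and the total cost.
    """
    n, m = len(X), len(Y)
    # Pairwise Euclidean distances between frames of X and Y.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    # Accumulated-cost matrix with the standard step pattern
    # (diagonal, vertical, horizontal moves).
    C = np.full((n + 1, m + 1), np.inf)
    C[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            C[i, j] = D[i - 1, j - 1] + min(
                C[i - 1, j - 1], C[i - 1, j], C[i, j - 1]
            )
    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([C[i - 1, j - 1], C[i - 1, j], C[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path, C[n, m]

# Toy example: a densely sampled "EMA-like" signal aligned against a
# sparsely sampled "rtMRI-like" signal of the same underlying gesture.
t_ema = np.linspace(0.0, 1.0, 100)  # high temporal resolution
t_mri = np.linspace(0.0, 1.0, 23)   # much sparser frame rate
X = np.sin(2 * np.pi * t_ema)[:, None]
Y = np.sin(2 * np.pi * t_mri)[:, None]
path, cost = dtw_align(X, Y)
```

The resulting path maps every dense EMA frame onto its best-matching sparse rtMRI frame, which is the kind of frame-level correspondence a temporal registration step needs before spatial landmarks can be transferred between modalities.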

