
This study investigated the effect of temporal modulation rate on the intelligibility of speech synthesized primarily from phase information, using two methods: phase-based vocoded speech (preserving phase cues and discarding envelope cues) and Hilbert fine-structure stimuli (summing the multi-channel Hilbert fine-structure waveforms). Listening experiments with normal-hearing participants showed that both types of phase-based speech were significantly more intelligible when synthesized at a high temporal modulation rate (i.e., with short frames) than when synthesized from the whole speech segment. This intelligibility advantage appears to be attributable to better preservation of temporal envelope cues in the phase-based speech.
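
To make the idea of phase-only synthesis at different frame lengths concrete, the following is a minimal NumPy sketch and not the authors' implementation, whose details are not given in the abstract. It assumes non-overlapping rectangular frames and a plain FFT; the function name `phase_only_resynthesis` and the parameter `frame_len` are hypothetical. Each frame keeps its Fourier phase while its magnitude is replaced by 1, so setting `frame_len` to the signal length corresponds to processing the whole segment at once.

```python
import numpy as np


def phase_only_resynthesis(signal, frame_len):
    """Resynthesize a signal from Fourier phase only, frame by frame.

    Illustrative assumptions (not taken from the article): frames are
    non-overlapping and rectangular, and the last frame is zero-padded.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = int(np.ceil(len(signal) / frame_len))

    # Zero-pad so the signal divides evenly into frames.
    padded = np.zeros(n_frames * frame_len)
    padded[: len(signal)] = signal
    frames = padded.reshape(n_frames, frame_len)

    # Keep the phase of every bin and force its magnitude to 1.
    spectra = np.fft.rfft(frames, axis=1)
    unit_spectra = np.exp(1j * np.angle(spectra))

    # Back to the time domain, then trim the padding.
    out = np.fft.irfft(unit_spectra, n=frame_len, axis=1)
    return out.reshape(-1)[: len(signal)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1024)
    short_frames = phase_only_resynthesis(x, 64)       # high modulation rate
    whole_segment = phase_only_resynthesis(x, len(x))  # one long frame
    print(short_frames.shape, whole_segment.shape)
```

With short frames the discarded magnitude is reset many times per second, which is one way to see how slow envelope fluctuations across frames might survive in the phase; this reading is an interpretation of the abstract rather than a result demonstrated by the sketch.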

