Predicting binaural speech intelligibility using the signal-to-noise ratio in the envelope power spectrum domain
ANSI (1997). ANSI S3.5, American National Standard Methods for Calculation of the Speech Intelligibility Index (American National Standards Institute, New York).
Bernstein, J. G. W., and Grant, K. W. (2009). “Auditory and auditory-visual intelligibility of speech in fluctuating maskers for normal-hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 125, 3358–3372.
Bernstein, L. R., and Trahiotis, C. (1996). “The normalized correlation: Accounting for binaural detection across center frequency,” J. Acoust. Soc. Am. 100, 3774–3784.
Beutelmann, R., and Brand, T. (2006). “Prediction of speech intelligibility in spatial noise and reverberation for normal-hearing and hearing-impaired listeners,” J. Acoust. Soc. Am. 120, 331–342.
Beutelmann, R., Brand, T., and Kollmeier, B. (2010). “Revision, extension, and evaluation of a binaural speech intelligibility model,” J. Acoust. Soc. Am. 127, 2479–2497.
Blauert, J., Brueggen, M., Hartung, K., Bronkhorst, A. W., Drullmann, R., Reynaud, G., Pellieux, L., Krebber, W., and Sottek, R. (1998). “The AUDIS catalog of human HRTFs,” J. Acoust. Soc. Am. 103, 3082–3082.
Breebaart, J., van de Par, S., and Kohlrausch, A. (2001). “Binaural processing model based on contralateral inhibition. I. Model structure,” J. Acoust. Soc. Am. 110, 1074–1088.
Bronkhorst, A., and Plomp, R. (1988). “The effect of head-induced interaural time and level differences on speech intelligibility in noise,” J. Acoust. Soc. Am. 83, 1508–1516.
Brungart, D. S., and Iyer, N. (2012). “Better-ear glimpsing efficiency with symmetrically-placed interfering talkers,” J. Acoust. Soc. Am. 132, 2545–2556.
Carlile, S., and Corkhill, C. (2015). “Selective spatial attention modulates bottom-up informational masking of speech,” Sci. Rep. 5, 8662.
Chabot-Leclerc, A., Jørgensen, S., and Dau, T. (2014). “The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction,” J. Acoust. Soc. Am. 135, 3502–3512.
, C. L. ). “Odeon room acoustics program, version 8.0” (Last viewed 5/28/15).
Christiansen, S. K., Jepsen, M. L., and Dau, T. (2014). “Effects of tonotopicity, adaptation, modulation tuning, and temporal coherence in ‘primitive’ auditory stream segregation,” J. Acoust. Soc. Am. 135, 323–333.
Collin, B., and Lavandier, M. (2013). “Binaural speech intelligibility in rooms with variations in spatial location of sources and modulation depth of noise interferers,” J. Acoust. Soc. Am. 134, 1146–1159.
Culling, J. F., and Colburn, H. S. (2000). “Binaural sluggishness in the perception of tone sequences and speech in noise,” J. Acoust. Soc. Am. 107, 517–527.
Culling, J. F., Hawley, M. L., and Litovsky, R. Y. (2004). “The role of head-induced interaural time and level differences in the speech reception threshold for multiple interfering sound sources,” J. Acoust. Soc. Am. 116, 1057–1065.
Culling, J. F., Hawley, M. L., and Litovsky, R. Y. (2005). “Erratum: The role head-induced interaural time and level differences in the speech reception threshold for multiple interfering sound sources [J. Acoust. Soc. Am. 116, 1057 (2004)],” J. Acoust. Soc. Am. 118, 552–552.
Culling, J. F., and Mansell, E. R. (2013). “Speech intelligibility among modulated and spatially distributed noise sources,” J. Acoust. Soc. Am. 133, 2254–2261.
Culling, J. F., and Summerfield, Q. (1998). “Measurements of the binaural temporal window using a detection task,” J. Acoust. Soc. Am. 103, 3540–3553.
Dau, T., Kollmeier, B., and Kohlrausch, A. (1997). “Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers,” J. Acoust. Soc. Am. 102, 2892–2905.
Dreschler, W. A., Verschuure, H., Ludvigsen, C., and Westermann, S. (2001). “ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment,” Audiology 40, 148–157.
Elhilali, M., and Shamma, S. A. (2008). “A cocktail party with a cortical twist: How cortical mechanisms contribute to sound segregation,” J. Acoust. Soc. Am. 124, 3751–3771.
Ewert, S. D., and Dau, T. (2000). “Characterizing frequency selectivity for envelope fluctuations,” J. Acoust. Soc. Am. 108, 1181–1196.
Glyde, H., Buchholz, J., Dillon, H., Best, V., Hickson, L., and Cameron, S. (2013). “The effect of better-ear glimpsing on spatial release from masking,” J. Acoust. Soc. Am. 134, 2937–2945.
Hawley, M., Litovsky, R., and Culling, J. (2004). “The benefit of binaural hearing in a cocktail party: Effect of location and type of interferer,” J. Acoust. Soc. Am. 115, 833–843.
Houtgast, T., and Steeneken, H. J. (1973). “The modulation transfer function in room acoustics as a predictor of speech intelligibility,” Acta Acust. Acust. 28, 66–73.
IEC (2003). IEC60268-16, Sound System Equipment—Part 16: Objective Rating of Speech Intelligibility by Speech Transmission Index (International Electrotechnical Commission, Geneva, Switzerland).
ISO (2005). 389-7, Reference Zero for the Calibration of Audiometric Equipment—Part 7: Reference Threshold of Hearing under Free-Field and Diffuse-Field Listening Conditions (International Organization for Standardization, Geneva, Switzerland).
Jørgensen, S., and Dau, T. (2011). “Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing,” J. Acoust. Soc. Am. 130, 1475–1487.
Jørgensen, S., Ewert, S. D., and Dau, T. (2013). “A multi-resolution envelope-power based model for speech intelligibility,” J. Acoust. Soc. Am. 134, 436–446.
Lavandier, M., and Culling, J. F. (2007). “Speech segregation in rooms: Effects of reverberation on both target and interferer,” J. Acoust. Soc. Am. 122, 1713–1723.
Lavandier, M., and Culling, J. F. (2010). “Prediction of binaural speech intelligibility against noise in rooms,” J. Acoust. Soc. Am. 127, 387–399.
Lavandier, M., Jelfs, S., Culling, J. F., Watkins, A. J., Raimond, A. P., and Makin, S. J. (2012). “Binaural prediction of speech intelligibility in reverberant rooms with multiple noise sources,” J. Acoust. Soc. Am. 131, 218–231.
Levitt, H., and Rabiner, L. (1967). “Predicting binaural gain in intelligibility and release from masking for speech,” J. Acoust. Soc. Am. 42, 820–829.
Lőcsei, G., Hefting Pedersen, J., Laugesen, S., Santurette, S., Dau, T., and MacDonald, E. N. (2015). “Lateralized speech perception, temporal processing and cognitive function in NH and HI listeners,” presented at the Speech in Noise Workshop, Copenhagen, Denmark.
Loizou, P. C. (2007). Speech Enhancement: Theory and Practice, 1st ed. (CRC, Boca Raton, FL).
Marrone, N., Mason, C. R., and Kidd, G. (2008). “Tuning in the spatial dimension: Evidence from a masked speech identification task,” J. Acoust. Soc. Am. 124, 1146–1158.
Nielsen, J. B., Dau, T., and Neher, T. (2014). “A Danish open-set speech corpus for competing-speech studies,” J. Acoust. Soc. Am. 135, 407–420.
Plomp, R. (1976). “Binaural and monaural speech intelligibility of connected discourse in reverberation as a function of azimuth of a single competing sound source (speech or noise),” Acustica 34, 200–211.
Rennies, J., Brand, T., and Kollmeier, B. (2011). “Prediction of the influence of reverberation on binaural speech intelligibility in noise and in quiet,” J. Acoust. Soc. Am. 130, 2999–3012.
Rhebergen, K. S., and Versfeld, N. J. (2005). “A Speech Intelligibility Index-based approach to predict the speech reception threshold for sentences in fluctuating noise for normal-hearing listeners,” J. Acoust. Soc. Am. 117, 2181–2192.
Rhebergen, K. S., Versfeld, N. J., and Dreschler, W. A. (2009). “The dynamic range of speech, compression, and its effect on the speech reception threshold in stationary and interrupted noise,” J. Acoust. Soc. Am. 126, 3236–3245.
Rothauser, E., Chapman, W., Guttman, N., Nordby, K., Silbiger, H., Urbanek, G., and Weinstock, M. (1969). “IEEE recommended practice for speech quality measurements,” IEEE Trans. Audio Electroacoust. 17, 225–246.
Van Wijngaarden, S., and Drullman, R. (2008). “Binaural intelligibility prediction based on the speech transmission index,” J. Acoust. Soc. Am. 123, 4514–4523.
Verhey, J. L., Dau, T., and Kollmeier, B. (1999). “Within-channel cues in comodulation masking release (CMR): Experiments and model predictions using a modulation-filterbank model,” J. Acoust. Soc. Am. 106, 2733–2745.
Wagener, K., Kühnel, V., and Kollmeier, B. (1999). “Development and evaluation of a German sentence test I: Design of the Oldenburg sentence test,” Z. Audiol. Audiol. Acoust. 38, 4–15.
Wagener, K. C., and Brand, T. (2005). “Sentence intelligibility in noise for listeners with normal hearing and hearing impairment: Influence of measurement procedure and masking parameters,” Int. J. Audiol. 44, 144–156.
Wan, R., Durlach, N. I., and Colburn, H. S. (2010). “Application of an extended equalization-cancellation model to speech intelligibility with spatially distributed maskers,” J. Acoust. Soc. Am. 128, 3678–3690.
Wan, R., Durlach, N. I., and Colburn, H. S. (2014). “Application of a short-time version of the equalization-cancellation model to speech intelligibility experiments with speech maskers,” J. Acoust. Soc. Am. 136, 768–776.
Westermann, A., and Buchholz, J. M. (2015a). “The effect of spatial separation in distance on the intelligibility of speech in rooms,” J. Acoust. Soc. Am. 137, 757–767.
Westermann, A., and Buchholz, J. M. (2015b). “The influence of informational masking in reverberant, multi-talker environments,” J. Acoust. Soc. Am. 138, 584–593.
This study proposes a binaural extension to the multi-resolution speech-based envelope power spectrum model (mr-sEPSM) [Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134, 436–446]. The extension combines better-ear (BE) and binaural unmasking processes, implemented as two monaural realizations of the mr-sEPSM combined with a short-term equalization-cancellation process, and uses the signal-to-noise ratio in the envelope domain (SNRenv) as the decision metric. The model requires only two parameters to be fitted per speech material and does not require an explicit frequency weighting. It was validated against three data sets from the literature, which covered the following effects: the number of maskers, the masker type [speech-shaped noise (SSN), speech-modulated SSN, babble, and reversed speech], the masker azimuth(s), reverberation on the target and masker, and the interaural time difference of the target and masker. The Pearson correlation coefficient between the simulated speech reception thresholds and the data across all experiments was 0.91. A model version that considered only BE processing performed similarly to the complete model (correlation coefficient of 0.86), suggesting that BE processing could be considered sufficient to predict intelligibility in most realistic conditions.
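
As a rough illustration of the SNRenv decision metric and the better-ear (BE) idea named in the abstract, the sketch below assumes the common sEPSM-style definition, SNRenv = (P_mix - P_noise) / P_noise, where P_mix and P_noise are envelope powers of the noisy speech and of the noise alone in one channel. The function names, the floor value, and the example numbers are hypothetical and chosen only for illustration; the equalization-cancellation stage and the multi-resolution processing of the actual model are not represented, and the BE rule shown here (taking the larger SNRenv of the two ears per channel) is a simplifying assumption.

```python
import numpy as np


def snr_env(p_mix, p_noise, floor=1e-3):
    """Envelope-domain SNR per channel, in linear units.

    Assumes SNRenv = (P_mix - P_noise) / P_noise. The floor keeps the
    estimated speech power and the noise power strictly positive; its
    value here is an illustrative assumption.
    """
    p_mix = np.asarray(p_mix, dtype=float)
    p_noise = np.maximum(np.asarray(p_noise, dtype=float), floor)
    p_speech = np.maximum(p_mix - p_noise, floor)
    return p_speech / p_noise


def better_ear(snr_left, snr_right):
    """Pick the larger SNRenv of the two ears in each channel."""
    return np.maximum(snr_left, snr_right)


# Hypothetical envelope powers for two channels at each ear.
left = snr_env([0.30, 0.12], [0.10, 0.10])
right = snr_env([0.15, 0.50], [0.10, 0.10])
be = better_ear(left, right)
```

With these made-up inputs the left ear wins in the first channel and the right ear in the second, which is the per-channel selection a better-ear stage is meant to capture.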