A metric for predicting binaural speech intelligibility in stationary noise and competing speech maskers
One criterion in the design of binaural sound scenes in audio production is the extent to which the intended speech message is correctly understood. Object-based audio broadcasting systems have permitted sound editors to gain more access to the metadata (e.g., intensity and location) of each sound source, providing better control over speech intelligibility. The current study describes and evaluates a binaural distortion-weighted glimpse proportion metric (BiDWGP), which is motivated by better-ear glimpsing and binaural masking level differences. BiDWGP predicts intelligibility from two alternative input forms: either binaural recordings or monophonic recordings from each sound source along with their locations. Two listening experiments were performed with stationary noise and competing speech, one in the presence of a single masker, the other with multiple maskers, for a variety of spatial configurations. Overall, BiDWGP with both input forms predicts listener keyword scores with correlations of 0.95 and 0.91 for single- and multi-masker conditions, respectively. When considering masker type separately, correlations rise to 0.95 and above for both types of maskers. Predictions using the two input forms are very similar, suggesting that BiDWGP can be applied to the design of sound scenes where only individual sound sources and their locations are available.
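The better-ear glimpsing idea that motivates the metric can be illustrated with a minimal sketch. This is not the BiDWGP metric itself: it omits the distortion weighting and the binaural masking level difference component, and it assumes that per-ear speech and masker levels are already available as band-by-frame grids in dB. The function names and the 3 dB local SNR criterion are assumptions made for the illustration. A time-frequency cell counts as glimpsed if the speech exceeds the masker by the criterion at either ear.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]  # levels in dB, indexed [band][frame]


def glimpse_mask(speech_db: Grid, masker_db: Grid,
                 threshold_db: float = 3.0) -> List[List[bool]]:
    """Mark cells where speech exceeds the masker by more than threshold_db.

    The 3 dB default is an assumed local SNR criterion for this sketch.
    """
    return [
        [s - m > threshold_db for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(speech_db, masker_db)
    ]


def better_ear_glimpse_proportion(speech_left: Grid, masker_left: Grid,
                                  speech_right: Grid, masker_right: Grid,
                                  threshold_db: float = 3.0) -> float:
    """Proportion of cells glimpsed at the left ear, the right ear, or both."""
    left = glimpse_mask(speech_left, masker_left, threshold_db)
    right = glimpse_mask(speech_right, masker_right, threshold_db)
    total = 0
    glimpsed = 0
    for left_row, right_row in zip(left, right):
        for at_left, at_right in zip(left_row, right_row):
            total += 1
            glimpsed += at_left or at_right  # better ear: either ear suffices
    return glimpsed / total if total else 0.0


# Hypothetical levels for 2 bands x 2 frames at each ear.
speech_l = [[60.0, 50.0], [55.0, 40.0]]
masker_l = [[50.0, 55.0], [56.0, 45.0]]
speech_r = [[50.0, 60.0], [50.0, 40.0]]
masker_r = [[55.0, 50.0], [52.0, 45.0]]

print(better_ear_glimpse_proportion(speech_l, masker_l, speech_r, masker_r))  # 0.5
```

In this toy input each ear alone glimpses one cell in four, while combining the ears yields two in four, which is the kind of advantage a spatially separated masker can give a better-ear listener.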