Requirements for the evaluation of computational speech segregation systems
Recent studies on computational speech segregation have reported improved speech intelligibility in noise when an ideal binary mask is estimated and applied using supervised learning algorithms. However, an important requirement for such systems in technical applications is robustness to acoustic conditions not considered during training. This study demonstrates that the spectro-temporal noise variations that occur during training and testing determine the achievable segregation performance. In particular, such variations strongly affect the identification of those acoustical features in the system that are associated with perceptual attributes in speech segregation. The results could help establish a framework for the systematic evaluation of future segregation systems.