/content/asa/journal/jasa/137/5/10.1121/1.4919344
2015-05-01
2016-12-06

Abstract

Recent research suggests that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This idea was explored using a method that ensures interference cannot occur through energetic masking. Three-formant (F1 + F2 + F3) analogues of natural sentences were synthesized using a monotonous periodic source. Target formants were presented monaurally, with the target ear assigned randomly on each trial. A competitor for F2 (F2C) was presented contralaterally; listeners must reject F2C to optimize recognition. In experiment 1, F2Cs with various frequency and amplitude contours were used. F2Cs with time-varying frequency contours were effective competitors; constant-frequency F2Cs had far less impact. To a lesser extent, amplitude contour also influenced competitor impact; this effect was additive. In experiment 2, F2Cs were created by inverting the F2 frequency contour about its geometric mean and varying its depth of variation over a range from constant to twice the original (0%–200%). The impact on intelligibility was least for constant F2Cs and increased up to ∼100% depth, but little thereafter. The effect of an extraneous formant depends primarily on its frequency contour; interference increases as the depth of variation is increased until the range exceeds that typical for F2 in natural speech.
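
The experiment 2 manipulation (inverting the F2 contour about its geometric mean and scaling its depth of variation) can be illustrated with a minimal sketch. It assumes that both the inversion and the depth scaling operate on a log-frequency scale, which the use of a geometric mean suggests but the abstract does not state. The function names and the example contour are hypothetical and are not taken from the article.

```python
import math

def geometric_mean(freqs_hz):
    """Geometric mean of a frequency contour given in Hz."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))

def inverted_contour(freqs_hz, depth_percent):
    """Invert a contour about its geometric mean and scale its depth.

    Assumption: scaling is applied to log-frequency deviations from the
    geometric mean g, so each output value is g * (g / f) ** d, where
    d = depth_percent / 100.
      depth 0   -> constant contour at g
      depth 100 -> full inversion (g**2 / f)
      depth 200 -> inversion with twice the original log-frequency range
    """
    g = geometric_mean(freqs_hz)
    d = depth_percent / 100.0
    return [g * (g / f) ** d for f in freqs_hz]

# Hypothetical three-point F2 contour, for illustration only.
f2 = [1000.0, 2000.0, 4000.0]
for depth in (0, 100, 200):
    print(depth, [round(f) for f in inverted_contour(f2, depth)])
```

Under this assumption, a 0% competitor sits at the geometric mean throughout, a 100% competitor mirrors the original contour, and a 200% competitor doubles the excursion on a log scale.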
