Index of content:
Volume 128, Issue 2, August 2010
- SPEECH PERCEPTION 
128(2010); http://dx.doi.org/10.1121/1.3455796
This study addresses how prosodic expectations affect perceptual discrimination. Prosodic expectations were created using natural recordings of six-syllable sentences in dactylic, iambic, and trochaic metrical patterns at two speech rates, slow and quick. PSOLA resynthesis was used to lengthen target syllables located in three different serial positions in each of the three patterns. Subjects made forced-choice comparisons of durational structure in an AX task. Lengthening was detected significantly better for strong syllables than for weak ones in all metrical patterns and serial positions and at both speech rates. The result holds even when absolute duration is eliminated as a potential confound. Results are interpreted in the light of prior research showing that prosodically strong syllables offer perceptual advantages in recognition and identification tasks, even when prosodic strength is cued only by the prior context (and not by any acoustic phonetic properties of the target syllables). In conclusion, metrical expectations cause listeners to focus their attention on metrically prominent syllables, with attentional focus leading to better performance in tasks tapping multiple levels of processing.
128(2010); http://dx.doi.org/10.1121/1.3458857
It has been reported that listeners can benefit from a release in masking when the masker speech is spoken in a language that differs from the target speech compared to when the target and masker speech are spoken in the same language [Freyman, R. L. et al. (1999). J. Acoust. Soc. Am. 106, 3578–3588; Van Engen, K., and Bradlow, A. (2007). J. Acoust. Soc. Am. 121, 519–526]. It is unclear whether listeners benefit from this release in masking due to the lack of linguistic interference of the masker speech, from acoustic and phonetic differences between the target and masker languages, or a combination of these differences. In the following series of experiments, listeners’ sentence recognition was evaluated using speech and noise maskers that varied in the amount of linguistic content, including native-English, Mandarin-accented English, and Mandarin speech. Results from three experiments indicated that the majority of differences observed between the linguistic maskers could be explained by spectral differences between the masker conditions. However, when the recognition task increased in difficulty, i.e., at a more challenging signal-to-noise ratio, a greater decrease in performance was observed for the maskers with more linguistically relevant information than could be explained by spectral differences alone.
128(2010); http://dx.doi.org/10.1121/1.3458817
Experiment 1 replicated the finding that normal-hearing listeners identify speech better in modulated than in unmodulated noise. This modulated-unmodulated difference (“MUD”) has been previously shown to be reduced or absent for cochlear-implant listeners and for normal-hearing listeners presented with noise-vocoded speech. Experiments 2–3 presented normal-hearing listeners with noise-vocoded speech in unmodulated or 16-Hz-square-wave modulated noise, and investigated whether the introduction of simple binaural differences between target and masker could restore the masking release. Stimuli were presented over headphones. When the target and masker were presented to one ear, adding a copy of the masker to the other ear (“diotic configuration”) aided performance but did so to a similar degree for modulated and unmodulated maskers, thereby failing to improve the modulation masking release. Presenting an uncorrelated noise to the opposite ear (“dichotic configuration”) had no effect, either for modulated or unmodulated maskers, consistent with the improved performance in the diotic configuration being due to interaural decorrelation processing. For noise-vocoded speech, the provision of simple spatial differences did not allow listeners to take greater advantage of the dips present in a modulated masker.
128(2010); http://dx.doi.org/10.1121/1.3458851
Jin & Nelson (2006) found that although amplified speech recognition performance of hearing-impaired (HI) listeners was equal to that of normal-hearing (NH) listeners in quiet and in steady noise, HI listeners' performance was significantly poorer in modulated noise. As a follow-up, the current study investigated whether three factors (auditory integration, low- to mid-frequency audibility, and auditory filter bandwidths) might contribute to reduced sentence recognition of HI listeners in the presence of modulated interference. Three findings emerged. First, sentence recognition in modulated noise reported in Jin & Nelson (2006) was highly correlated with perception of sentences interrupted by silent gaps. This suggests that understanding speech interrupted by either noise or silent gaps requires similar perceptual integration of speech fragments available either in the dips of a gated noise or across silent gaps of an interrupted speech signal. Second, those listeners with the greatest hearing losses in the low frequencies were poorest at understanding interrupted sentences. Third, low- to mid-frequency hearing thresholds accounted for most of the variability in masking release (MR) for HI listeners. As suggested by Oxenham and his colleagues (2003 and 2009), low-frequency information within speech plays an important role in the perceptual segregation of speech from competing background noise.