Index of content:
Volume 125, Issue 6, June 2009
- SPEECH PERCEPTION 
125(2009); http://dx.doi.org/10.1121/1.3125342
Within tone languages that use pitch variations to contrast meaning, large variability exists in the pitches produced by different speakers. Context-dependent perception may help to resolve this perceptual challenge. However, whether speakers rely on context in contour tone perception is unclear; previous studies have produced inconsistent results. The present study aimed to provide an unambiguous test of the effect of context on contour lexical tone perception and to explore its underlying mechanisms. In three experiments, Mandarin listeners’ perception of Mandarin first and second (high-level and mid-rising) tones was investigated with preceding speech and non-speech contexts. Results indicate that the mean fundamental frequency of a preceding sentence affects perception of contour lexical tones and that the effect is contrastive. Following a sentence with a higher mean fundamental frequency, the next syllable is more likely to be perceived as a lower-frequency lexical tone, and vice versa. Moreover, non-speech precursors modeling the mean spectrum of also elicit this effect, suggesting general perceptual processing rather than articulatory-based or speaker-identity-driven mechanisms.
125(2009); http://dx.doi.org/10.1121/1.3125329
An interrupted signal may be perceptually restored and, as a result, perceived as continuous, when the interruptions are filled with loud noise bursts. Additionally, when the signal is speech, an improvement in intelligibility may be observed. The perceived continuity of interrupted tones is reduced when the signal level is ramped down before the noise burst and up after it, an effect that has been attributed to envelope discontinuities at the tone-noise interface [Bregman, A. S., and Dannenbring, G. L. (1977). Can. J. Psychiatry 31, 151–159]. The hypothesis of the present study was that the perceptual restoration of speech would also be reduced with similar envelope discontinuities that may occur in real life due to the release time constants of hearing-aid compression. In an effort to make the conditions more relevant to hearing aids, speech was amplitude-compressed and normal-hearing listeners of varying ages were recruited. Envelope amplitude ramps were placed at the onsets/offsets of speech segments of interrupted sentences and the restoration effect was measured in two ways: objectively as the improvement in intelligibility when noise was added in the gaps and subjectively through the perceived continuity measured by subjects’ own reporting. Both measures showed a reduction as the ramp duration increased, a trend observed for subjects of all ages and for all ramp configurations. These findings can be attributed to envelope discontinuities, with an additional contribution from reduced speech information due to ramping and temporal masking from loud noise bursts.
Multitalker speech perception with ideal time-frequency segregation: Effects of voice characteristics and number of talkers
125(2009); http://dx.doi.org/10.1121/1.3117686
When a target voice is masked by an increasingly similar masker voice, increases in energetic masking are likely to occur due to increased spectro-temporal overlap in the competing speech waveforms. However, the impact of this increase may be obscured by informational masking effects related to the increased confusability of the target and masking utterances. In this study, the effects of target-masker similarity and the number of competing talkers on the energetic component of speech-on-speech masking were measured with an ideal time-frequency segregation (ITFS) technique that retained all the target-dominated time-frequency regions of a multitalker mixture but eliminated all the time-frequency regions dominated by the maskers. The results show that target-masker similarity has a small but systematic impact on energetic masking, with roughly a release from masking for same-sex maskers versus same-talker maskers and roughly an additional release from masking for different-sex masking voices. The results of a second experiment measuring ITFS performance with up to 18 interfering talkers indicate that energetic masking increased systematically with the number of competing talkers. These results suggest that energetic masking differences related to target-masker similarity have a much smaller impact on multitalker listening performance than energetic masking effects related to the number of competing talkers in the stimulus and non-energetic masking effects related to the confusability of the target and masking voices.
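As a rough illustration of the ITFS idea described above, the following minimal sketch builds an ideal binary mask over a grid of time-frequency units and applies it to a mixture. It is not the authors' implementation: the function names, the 0 dB local criterion, the toy energy values, and the assumption that target and masker energies simply add in the mixture are all illustrative choices not stated in the abstract.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, criterion_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    A unit is kept (1) when the local target-to-masker ratio, in dB,
    exceeds criterion_db; otherwise it is removed (0).
    The 0 dB default is an assumption made for this sketch.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t == 0.0:
                keep = False          # no target energy: nothing to retain
            elif m == 0.0:
                keep = True           # target only: ratio is unbounded
            else:
                keep = 10.0 * math.log10(t / m) > criterion_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Zero out the mixture in every unit the mask marks as masker-dominated."""
    return [
        [x * k for x, k in zip(x_row, k_row)]
        for x_row, k_row in zip(mixture_energy, mask)
    ]


# Toy 2 x 3 grid (frequency channels x time frames); values are made up.
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 0.0]]
masker = [[1.0, 2.0, 2.0], [1.0, 0.5, 1.0]]
# Assumption: energies add in the mixture (cross terms ignored).
mixture = [[t + m for t, m in zip(tr, mr)] for tr, mr in zip(target, masker)]

mask = ideal_binary_mask(target, masker)
segregated = apply_mask(mixture, mask)
```

In this toy grid only the two units where the target is stronger than the masker survive; a unit with equal target and masker energy is discarded because the criterion is a strict inequality.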
125(2009); http://dx.doi.org/10.1121/1.3126344
This study assessed the effects of spectral smearing and temporal fine structure (TFS) degradation on masking release (MR), the improvement in speech identification in amplitude-modulated noise compared to steady noise that is observed for normal-hearing listeners. Syllables and noise stimuli were processed using either a spectral-smearing algorithm or a tone-excited vocoder. The two processing schemes simulated broadening of the auditory filters by factors of 2 and 4. Simulations of the early stages of auditory processing showed that the two schemes produced comparable excitation patterns; however, fundamental frequency (F0) information conveyed by TFS was degraded more severely by the vocoder than by the spectral-smearing algorithm. Both schemes reduced MR but, for each amount of spectral smearing, the vocoder produced a greater reduction in MR than the spectral-smearing algorithm, consistent with the effects of each scheme on F0 representation. Moreover, the effects of spectral smearing on MR produced by the two schemes were different for manner and voicing. Finally, MR data for listeners with moderate hearing loss were well matched by MR data obtained for normal-hearing listeners with vocoded stimuli, suggesting that impaired frequency selectivity alone may not be sufficient to account for the reduced MR observed for hearing-impaired listeners.