Index of content:
Volume 123, Issue 3, March 2008
- SPEECH PERCEPTION 
123(2008); http://dx.doi.org/10.1121/1.2828067
The present study sought to establish whether speech recognition can be disrupted by the presence of amplitude modulation (AM) at a remote spectral region, and whether that disruption depends upon the rate of AM. The goal was to determine whether this paradigm could be used to examine which modulation frequencies in the speech envelope are most important for speech recognition. Consonant identification for a band of speech located in either the low- or high-frequency region was measured in the presence of a band of noise located in the opposite frequency region. The noise was either unmodulated or amplitude modulated by a sinusoid, a band of noise with a fixed absolute bandwidth, or a band of noise with a fixed relative bandwidth. The frequency of the modulator was 4, 16, 32, or . Small amounts of modulation interference were observed for all modulator types, irrespective of the location of the speech band. More important, the interference depended on modulation frequency, clearly supporting the existence of selectivity of modulation interference with speech stimuli. Overall, the results suggest a primary role of envelope fluctuations around 4 and without excluding the possibility of a contribution by faster rates.
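As a rough illustration of the sinusoidal modulator condition described in this abstract, the sketch below applies sinusoidal amplitude modulation to a carrier signal. The function name `sinusoidal_am`, the `depth` parameter, and the form `1 + depth * sin(2*pi*f_m*t)` are assumptions chosen for illustration; the abstract does not specify the modulation depth or how the stimuli were generated.

```python
import math


def sinusoidal_am(carrier, sample_rate_hz, mod_rate_hz, depth=1.0):
    """Multiply each carrier sample by a sinusoidal envelope.

    Hypothetical helper: the envelope is 1 + depth * sin(2*pi*f_m*t),
    where t = n / sample_rate_hz. A depth of 0 leaves the carrier
    unmodulated; a depth of 1 gives full modulation.
    """
    modulated = []
    for n, sample in enumerate(carrier):
        t = n / sample_rate_hz
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * mod_rate_hz * t)
        modulated.append(sample * envelope)
    return modulated


# Usage: modulate a constant carrier at 4 Hz (one of the rates in the abstract),
# using an illustrative sample rate of 16 Hz so the envelope is easy to inspect.
example = sinusoidal_am([1.0] * 8, sample_rate_hz=16, mod_rate_hz=4)
```

In the study the carrier was a band of noise rather than a constant; a constant carrier is used here only so that the envelope itself is visible in the output.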
123(2008); http://dx.doi.org/10.1121/1.2832617
The application of the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in intelligibility. This mask is commonly applied to the time–frequency representation of a mixture signal and eliminates portions of a signal below a signal-to-noise-ratio (SNR) threshold while allowing others to pass through intact. The factors influencing intelligibility of ideal binary-masked speech are not well understood and are examined in the present study. Specifically, the effects of the local SNR threshold, input SNR level, masker type, and errors introduced in estimating the ideal mask are examined. Consistent with previous studies, intelligibility of binary-masked stimuli is quite high even at SNR for all maskers tested. Performance was affected most when masker-dominated units were wrongly labeled as target-dominated units. Performance plateaued near 100% correct for SNR thresholds ranging from . The existence of the plateau region suggests that it is the pattern of the ideal binary mask that matters most, rather than the local SNR of each unit. This pattern directs the listener’s attention to where the target is and enables them to segregate speech effectively in multitalker environments.
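The masking operation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `ideal_binary_mask` and `apply_mask`, the representation of the time–frequency units as nested lists of power values, and the choice of a strict "greater than" comparison at the threshold are all assumptions.

```python
import math


def ideal_binary_mask(target_power, masker_power, threshold_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    Hypothetical helper: a unit is kept (1) when its local SNR in dB,
    10*log10(target/masker), exceeds threshold_db, and removed (0) otherwise.
    Inputs are equally shaped nested lists of non-negative power values.
    """
    mask = []
    for target_row, masker_row in zip(target_power, masker_power):
        row = []
        for t, m in zip(target_row, masker_row):
            if t == 0:
                local_snr_db = -math.inf   # no target energy in this unit
            elif m == 0:
                local_snr_db = math.inf    # no masker energy in this unit
            else:
                local_snr_db = 10.0 * math.log10(t / m)
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Zero out the mixture units where the mask is 0; pass the rest intact."""
    return [[x * k for x, k in zip(x_row, k_row)]
            for x_row, k_row in zip(mixture, mask)]


# Usage with a toy 2x2 time-frequency grid (values are illustrative only).
target = [[4.0, 1.0], [0.5, 9.0]]
masker = [[1.0, 4.0], [2.0, 1.0]]
mixture = [[5.0, 5.0], [2.5, 10.0]]
mask = ideal_binary_mask(target, masker, threshold_db=0.0)
masked_mixture = apply_mask(mixture, mask)
```

Because the mask depends on the separate target and masker signals, it can only be computed when both are known in advance, which is why it is called "ideal"; the estimation errors examined in the study correspond to flipping entries of `mask`.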