Index of content:
Volume 130, Issue 2, August 2011
- SPEECH PERCEPTION 
Extending the articulation index to account for non-linear distortions introduced by noise-suppression algorithms 130(2011); http://dx.doi.org/10.1121/1.3605668
The conventional articulation index (AI) measure cannot be applied in situations where non-linear operations are involved and additive noise is present. This is because the definitions of the target and masker signals become vague following non-linear processing, as both the target and masker signals are affected. The aim of the present work is to modify the basic form of the AI measure to account for non-linear processing. This was done using a new definition of the output or effective SNR obtained following non-linear processing. The proposed output SNR definition for a specific band was designed to handle cases where the non-linear processing affects predominantly the target signal rather than the masker signal. The proposed measure also takes into consideration the fact that the input SNR in a specific band cannot be improved following any form of non-linear processing. Overall, the proposed measure quantifies the proportion of input band SNR preserved or transmitted in each band after non-linear processing. High correlation (r = 0.9) was obtained with the proposed measure when evaluated with intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech corrupted in four different real-world maskers.
130(2011); http://dx.doi.org/10.1121/1.3609258
The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated for normal-hearing and hearing-impaired listeners. Monaural and binaural speech intelligibility tests were performed in a virtual auditory environment where the spectral characteristics of ERs from a simulated room could be preserved. The useful ER energy was derived from the speech intelligibility results and the efficiency of the ERs was determined as the ratio of the useful ER energy to the total ER energy. Even though ER energy contributed to speech intelligibility, DS energy was always more efficient, leading to better speech intelligibility for both groups of listeners. The efficiency loss for the ERs was mainly ascribed to their altered spectrum compared to the DS and to the filtering by the torso, head, and pinna. No binaural processing other than a binaural summation effect could be observed.
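The efficiency defined in this abstract, the ratio of useful ER energy to total ER energy, can be illustrated with a minimal sketch. The function names and the example energy values are hypothetical and are not taken from the paper; the abstract does not say how the useful ER energy itself is derived from the intelligibility results, so it is treated here as a given input.

```python
import math


def er_efficiency(useful_er_energy: float, total_er_energy: float) -> float:
    """Ratio of useful early-reflection energy to total early-reflection energy.

    Both arguments are linear energies in the same (arbitrary) unit.
    """
    if total_er_energy <= 0:
        raise ValueError("total_er_energy must be positive")
    if useful_er_energy < 0:
        raise ValueError("useful_er_energy must be non-negative")
    return useful_er_energy / total_er_energy


def efficiency_in_db(efficiency: float) -> float:
    """Express an energy ratio in decibels (10 * log10 of the ratio)."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive to convert to dB")
    return 10.0 * math.log10(efficiency)


# Hypothetical values: a quarter of the ER energy is "useful".
example_efficiency = er_efficiency(0.5, 2.0)
example_efficiency_db = efficiency_in_db(example_efficiency)
```

Under this reading, an efficiency below 1 (a negative value in dB) means the ERs contribute less per unit energy than the direct sound would.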
Effects of fundamental frequency and vocal-tract length cues on sentence segregation by listeners with hearing loss 130(2011); http://dx.doi.org/10.1121/1.3605548
The purpose was to determine the effect of hearing loss on the ability to separate competing talkers using talker differences in fundamental frequency (F0) and apparent vocal-tract length (VTL). Performance of 13 adults with hearing loss and 6 adults with normal hearing was measured using the Coordinate Response Measure. For listeners with hearing loss, the speech was amplified and filtered according to the NAL-RP hearing aid prescription. Target-to-competition ratios varied from 0 to 9 dB. The target sentence was randomly assigned to the higher or lower values of F0 or VTL on each trial. Performance improved for F0 differences up to 9 and 6 semitones for people with normal hearing and hearing loss, respectively, but only when the target talker had the higher F0. Recognition for the lower F0 target improved when trial-to-trial uncertainty was removed (9-semitone condition). Scores improved with increasing differences in VTL for the normal-hearing group. On average, hearing-impaired listeners did not benefit from VTL cues, but substantial inter-subject variability was observed. The amount of benefit from VTL cues was related to the average hearing loss in the 1–3-kHz region when the target talker had the shorter VTL.
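For readers less used to semitone units, the F0 differences quoted above can be converted to frequency ratios with the standard equal-tempered relation, ratio = 2^(n/12). The sketch below is only a unit conversion; the function names and the 100 Hz base F0 are hypothetical and do not describe the stimuli used in the study.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio corresponding to an interval in semitones (2 ** (n / 12))."""
    return 2.0 ** (semitones / 12.0)


def shifted_f0(base_f0_hz: float, semitones: float) -> float:
    """F0 in Hz after shifting base_f0_hz upward by the given number of semitones."""
    return base_f0_hz * semitones_to_ratio(semitones)


# Hypothetical base F0 of 100 Hz, shifted by the two differences named in the abstract.
f0_plus_6 = shifted_f0(100.0, 6)  # ratio is the square root of 2
f0_plus_9 = shifted_f0(100.0, 9)  # ratio is 2 ** 0.75
```

A 6-semitone difference therefore corresponds to a ratio of about 1.41, and a 9-semitone difference to a ratio of about 1.68.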