Linking dynamic-range compression across the ears can improve speech intelligibility in spatially separated noise
Signal processing block diagram. Speech and noise signals were filtered with HRTFs and summed to give left- and right-ear signals. Subsequent processing stages are labeled down the center; those highlighted with asterisks involved joint (i.e., bilaterally linked) processing at the two ears. CHANNEL FILTERING: The signal at each ear was filtered into low- (0.1 to 2 kHz) and high- (2 to 5 kHz) frequency channels. *COMPRESSION*: Wide dynamic-range compression was applied separately in each frequency channel, either independently at each ear (unlinked condition) or linked across the ears (see main text for details). In the “uncompressed condition,” the compression stage was bypassed. *RMS LEVEL MATCHING*: (This and all subsequent stages are specific to the speech intelligibility experiment described in Sec. III.) Long-term levels were matched before and after compression, separately in each frequency channel but with identical gain at both ears to preserve ILDs. OPTIONAL RE-INCLUSION OF LOW-FREQUENCY CHANNEL: The low-frequency channel was added back to the signal at each ear in the both-channels condition only. *AMPLIFICATION TO COMFORTABLE LISTENING LEVEL*: Identical gain was applied at both ears to preserve ILDs. MUTE SIGNAL ON SIDE OF NOISE SOURCE IN MONAURAL CONDITION: Only the “better ear” signal was presented in the monaural condition.
Apparent long-term SNR at the better ear (left) and worse ear (right) in the high-frequency (upper panels) and low-frequency (lower panels) channels. The vertical arrows indicate the nominal source SNRs tested in the both-channels (“BOTH”) and high-frequency-channel-only (“HF-ONLY”) conditions of the speech intelligibility experiment, respectively (this experiment is described in Sec. III).
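The difference between the unlinked and linked compression stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the caption defers the linking rule to the main text, so the sketch assumes that linking drives both ears from the higher of the two input levels, and the threshold and compression ratio are hypothetical values chosen only for illustration.

```python
def compressor_gain_db(level_db, threshold_db=-40.0, ratio=3.0):
    """Static gain (dB) of a simple compressor.

    Hypothetical parameters: 0 dB gain below threshold; above it,
    output level grows by 1/ratio dB per dB of input.
    """
    excess = max(level_db - threshold_db, 0.0)
    return -excess * (1.0 - 1.0 / ratio)


def bilateral_gains_db(left_db, right_db, linked):
    """Return (left gain, right gain) in dB.

    Assumption: when linked, both ears share the gain computed from
    the higher of the two input levels, so the ILD is preserved.
    """
    if linked:
        shared = compressor_gain_db(max(left_db, right_db))
        return shared, shared
    return compressor_gain_db(left_db), compressor_gain_db(right_db)


# Illustrative input levels (dB): the right ear is 10 dB more intense.
left_in, right_in = -30.0, -20.0

gain_l, gain_r = bilateral_gains_db(left_in, right_in, linked=False)
ild_unlinked = (right_in + gain_r) - (left_in + gain_l)  # ILD is reduced

gain_l, gain_r = bilateral_gains_db(left_in, right_in, linked=True)
ild_linked = (right_in + gain_r) - (left_in + gain_l)  # ILD is unchanged
```

Under these assumptions, unlinked compression shrinks the 10 dB input ILD by the compression ratio, whereas the linked version applies identical gain at both ears and leaves it intact.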
Envelopes of an extract of the speech and noise signals in the high-frequency channel (2 to 5 kHz) at the better ear. The nominal source SNR was −2 dB. The data in each panel were normalized so that the overall rms level of the noise was 0 dB. At point a, which marks a dip in the speech envelope, the level of the speech is the same (to within 1 dB) for all three processing conditions because the compressors' behavior is dominated by the steady noise at such moments. At point b, marking a peak in the speech envelope, linked compression reduces the speech level by 2 dB compared to the uncompressed condition, and unlinked compression by 7 dB. This “penalizing” of the speech peaks by compression causes a reduction in the long-term apparent SNR, even though the instantaneous SNR is at all times unaffected by the processing.
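The distinction between instantaneous and apparent long-term SNR can be sketched in Python as follows. This is an illustrative toy, not the analysis used in the paper: the frame powers, threshold, and ratio are invented values, the compressor is assumed to act instantaneously on the level of the speech-plus-noise mixture, and speech and noise are assumed uncorrelated so that their powers add.

```python
import math


def power_to_db(power):
    return 10.0 * math.log10(power)


def compressor_gain_db(level_db, threshold_db=0.0, ratio=3.0):
    """Static compressor gain in dB (hypothetical threshold and ratio)."""
    excess = max(level_db - threshold_db, 0.0)
    return -excess * (1.0 - 1.0 / ratio)


def compress_frames(speech_pow, noise_pow):
    """Apply one mixture-driven gain per frame to both speech and noise."""
    out_speech, out_noise = [], []
    for s, n in zip(speech_pow, noise_pow):
        gain = 10.0 ** (compressor_gain_db(power_to_db(s + n)) / 10.0)
        out_speech.append(s * gain)
        out_noise.append(n * gain)
    return out_speech, out_noise


def long_term_snr_db(speech_pow, noise_pow):
    """Ratio of total speech power to total noise power, in dB."""
    return power_to_db(sum(speech_pow) / sum(noise_pow))


# Invented frame powers: speech alternates between dips and peaks,
# while the noise is steady.
speech = [0.01, 10.0] * 4
noise = [1.0] * len(speech)

comp_speech, comp_noise = compress_frames(speech, noise)

snr_before = long_term_snr_db(speech, noise)
snr_after = long_term_snr_db(comp_speech, comp_noise)

# Per-frame (instantaneous) SNR before and after compression.
inst_before = [power_to_db(s / n) for s, n in zip(speech, noise)]
inst_after = [power_to_db(s / n) for s, n in zip(comp_speech, comp_noise)]
```

Because each frame's gain multiplies speech and noise alike, the per-frame SNR is unchanged, yet the long-term SNR falls: the frames that receive the most attenuation are the ones containing the speech peaks.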
IGD (momentary difference in the gain applied at the right and left ears) plotted against time in the high-frequency channel following unlinked compression. The nominal source SNR was +4 dB for the top panel and −10 dB for the middle panel. The bottom panel shows the envelope of the original speech signal in the high-frequency channel for reference.
Standard deviation of the IGD plotted against nominal source SNR in the low-frequency (solid line) and high-frequency (dashed line) channels. This provides a measure of the magnitude of the dynamic changes to ILDs introduced by unlinked compression. The vertical arrows indicate the nominal source SNRs tested in the speech intelligibility experiment (cf. Fig. 2 caption).
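As a rough illustration of the IGD and its standard deviation, the following Python sketch computes both for a toy scene. The frame powers, the 10 dB head shadow, and the compressor parameters are invented for illustration, and the compressor is assumed to respond instantaneously; none of these values are taken from the paper.

```python
import math
import statistics


def power_to_db(power):
    return 10.0 * math.log10(power)


def compressor_gain_db(level_db, threshold_db=-20.0, ratio=3.0):
    """Static compressor gain in dB (hypothetical threshold and ratio)."""
    excess = max(level_db - threshold_db, 0.0)
    return -excess * (1.0 - 1.0 / ratio)


def igd_series_db(left_pow, right_pow):
    """Interaural gain difference per frame: right-ear gain minus left-ear gain."""
    return [
        compressor_gain_db(power_to_db(r)) - compressor_gain_db(power_to_db(l))
        for l, r in zip(left_pow, right_pow)
    ]


# Toy scene: speech from the front (equal power at both ears) and steady
# noise from the right, attenuated by an assumed 10 dB at the left ear.
speech = [0.01, 10.0] * 4
noise_left, noise_right = 0.1, 1.0

left = [s + noise_left for s in speech]
right = [s + noise_right for s in speech]

igd = igd_series_db(left, right)   # unlinked compression
igd_sd = statistics.pstdev(igd)    # magnitude of the dynamic ILD changes
```

In this toy scene the IGD is largest during speech dips, when the noise dominates and the two ears receive different levels, and close to zero during speech peaks. Its standard deviation summarizes how strongly the ILD fluctuates over time. With linked compression the two gains are identical by construction, so the IGD is zero throughout.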
Overall differences in performance across individual sentence lists after removing the mean experimental effects. The relative percent-correct score is plotted for each list (mean ±1 standard error). Positive (negative) values indicate that a particular list was harder (easier) than the average.
Mean percent-correct score across the ten participants for each experimental condition. Error bars indicate one standard error and asterisks indicate significant differences between processing conditions (* p < 0.05; ** p < 0.01; *** p < 0.001).
Mean binaural squelch (binaural performance minus monaural performance) across the ten participants for each experimental condition. Error bars indicate one standard error.
Comparison of predicted (lines, left axis) and measured (symbols, right axis) speech intelligibility for monaural listening to the ear with the better SNR. A correction was applied to the I3 values (+0.08 in the both-channels condition; −0.06 in the high-frequency-channel-only condition) to calibrate the model to the overall level of performance measured in each bandwidth condition.
Predicted intelligibility (I3) for monaural listening to the ear with the better SNR for a hypothetical hearing-impaired listener wearing bilateral in-the-canal hearing aids performing either unlinked (dashed line) or linked (solid line) compression.
As in Fig. 2 but for a two-talker scenario where target speech comes from directly in front and background speech (rather than noise) from an azimuth of 60°. The apparent long-term target-to-background ratio (TBR) after compression is plotted against the nominal source TBR.
Details of the hearing loss, hearing-aid configuration, and gain prescription used to predict speech intelligibility for a hypothetical hearing-impaired listener.