Index of content:
Volume 112, Issue 5, November 2002
- PSYCHOLOGICAL ACOUSTICS 
Primitive stream segregation of tone sequences without differences in fundamental frequency or passband
112(2002); http://dx.doi.org/10.1121/1.1508784
Peripheral-channeling theorists argue that differences in excitation pattern between successive sounds are necessary for stream segregation to occur. The component phases of complex tones comprising unresolved harmonics (F0=100 Hz) were manipulated to change pitch and timbre without changing the power spectrum. In experiment 1, listeners compared two alternating sequences of tones, A and B. One sequence was isochronous (tone duration=60 ms, intertone interval=40 ms). The other began isochronously, but the progressive delay of tone B made the rhythm irregular. Subjects had to identify the sequence with irregular rhythm. Stream segregation makes this task more difficult. A and B could differ in passband (1250–2500 Hz, 1768–3536 Hz, 2500–5000 Hz), component phase (cosine, alternating, random), or both. Stimuli were presented at 70 dB SPL in pink noise. Dissimilarity in either passband or phase increased discrimination thresholds. Moreover, phase differences raised threshold even when there was no passband difference. In experiment 2, listeners judged moment-by-moment the grouping of long ABA-ABA-… sequences. The measure was the proportion of time a sequence was heard as segregated. The factors that increased segregation were very similar to those that increased threshold in experiment 1. Overall, the findings indicate that substantial stream segregation can occur without differences in power spectrum. It is concluded that differences in peripheral channeling are not a requirement for stream segregation.
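The phase manipulation described above can be illustrated with a minimal sketch: harmonics of a 100-Hz fundamental within the 1250–2500 Hz passband are summed in cosine, alternating, or random phase. The sample rate, the unit component amplitudes, the reading of "alternating" as sine phase for odd harmonics and cosine phase for even harmonics, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import cmath
import math
import random

F0 = 100.0    # fundamental frequency in Hz (from the abstract)
DUR = 0.060   # tone duration in seconds (from the abstract)
FS = 20000    # sample rate in Hz (assumed)


def harmonics_in_band(lo, hi, f0=F0):
    """Harmonic numbers whose frequencies fall inside [lo, hi] Hz."""
    return [k for k in range(1, int(hi // f0) + 1) if lo <= k * f0 <= hi]


def phases(kind, harmonics, rng):
    """Starting phase (radians) for each harmonic."""
    if kind == "cosine":
        return [0.0 for _ in harmonics]
    if kind == "alternating":
        # Assumed convention: odd harmonics in sine phase, even in cosine phase.
        return [-math.pi / 2 if k % 2 else 0.0 for k in harmonics]
    if kind == "random":
        return [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]
    raise ValueError(kind)


def synthesize(harmonics, phis):
    """Sum of unit-amplitude cosines, one per harmonic."""
    n = round(FS * DUR)
    return [
        sum(math.cos(2.0 * math.pi * k * F0 * i / FS + p)
            for k, p in zip(harmonics, phis))
        for i in range(n)
    ]


def component_amplitudes(x, harmonics):
    """Amplitude of each harmonic, estimated by projection onto a complex
    exponential. The tone spans a whole number of periods, so the
    components are orthogonal over the analysis window."""
    n = len(x)
    out = []
    for k in harmonics:
        acc = sum(s * cmath.exp(-2j * math.pi * k * F0 * i / FS)
                  for i, s in enumerate(x))
        out.append(2.0 * abs(acc) / n)
    return out


def crest_factor(x):
    """Peak absolute value divided by RMS; a simple index of waveform peakiness."""
    rms = math.sqrt(sum(s * s for s in x) / len(x))
    return max(abs(s) for s in x) / rms
```

Under these assumptions, all three phase conditions yield the same component amplitudes (and so the same power spectrum), while the crest factor is largest in cosine phase. This is one way to see how the temporal waveform can change while the long-term spectrum stays fixed.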
112(2002); http://dx.doi.org/10.1121/1.1510141
The effect of spatial separation of sources on the masking of a speech signal was investigated for three types of maskers, ranging from energetic to informational. Normal-hearing listeners performed a closed-set speech identification task in the presence of a masker at various signal-to-noise ratios. Stimuli were presented in a quiet sound field. The signal was played from 0° azimuth and a masker was played either from the same location or from 90° to the right. Signals and maskers were derived from sentences that were preprocessed by a modified cochlear-implant simulation program that filtered each sentence into 15 frequency bands, extracted the envelopes from each band, and used these envelopes to modulate pure tones at the center frequencies of the bands. In each trial, the signal was generated by summing together eight randomly selected frequency bands from the preprocessed signal sentence. Three maskers were derived from the preprocessed masker sentences: (1) different-band sentence, which was generated by summing together six randomly selected frequency bands out of the seven bands not present in the signal (resulting in primarily informational masking); (2) different-band noise, which was generated by convolving the different-band sentence with Gaussian noise; and (3) same-band noise, which was generated by summing the same eight bands from the preprocessed masker sentence that were used in the signal sentence and convolving the result with Gaussian noise (resulting in primarily energetic masking). Results revealed that in the different-band sentence masker, the effect of spatial separation averaged 18 dB (at 51% correct), while in the different-band and same-band noise maskers the effect was less than 10 dB. These results suggest that, in these conditions, the advantage due to spatial separation of sources is greater for informational masking than for energetic masking.
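The band-selection step described above can be sketched as follows: eight of the 15 bands are drawn at random for the signal, and six of the remaining seven are drawn for the different-band masker. The function name, the zero-based band indices, and the use of Python's `random` module are illustrative assumptions; the filtering, envelope extraction, and noise convolution stages are not reproduced here.

```python
import random

N_BANDS = 15          # number of analysis bands (from the abstract)
N_SIGNAL_BANDS = 8    # bands summed to form the signal (from the abstract)
N_MASKER_BANDS = 6    # bands summed to form the different-band masker


def pick_bands(rng):
    """Return (signal_bands, different_band_masker_bands) for one trial.

    Bands are identified by hypothetical zero-based indices 0..14.
    """
    all_bands = list(range(N_BANDS))
    signal = sorted(rng.sample(all_bands, N_SIGNAL_BANDS))
    remaining = [b for b in all_bands if b not in signal]
    masker = sorted(rng.sample(remaining, N_MASKER_BANDS))
    return signal, masker
```

In this sketch the same-band noise masker would simply reuse the `signal` indices, whereas the different-band maskers never share a band with the signal, which is what keeps spectral overlap (and hence energetic masking) small in those conditions.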
112(2002); http://dx.doi.org/10.1121/1.1508793
The threshold for detecting a narrow-band noise signal in one or more masking noise bands is higher when the signal and masker bands have the same envelope (correlated condition) than when they have independent envelopes (uncorrelated condition). This comodulation detection difference (CDD) might be caused by perceptual grouping of the signal and masker bands when they are correlated. Alternatively, CDD may occur because, in the uncorrelated condition, the signal can be detected in the dips of the masker. A previous paper [S. J. Borrill and B. C. J. Moore, J. Acoust. Soc. Am. 111, 309–319 (2002)] described results and a model supporting a dip-listening explanation. The model predicted steeper psychometric functions for the correlated than for the uncorrelated condition, a prediction confirmed by experiment 1. In experiment 2, the width of the signal and masker bands was varied. The dip-listening model predicts a small decrease in CDD with increasing bandwidth, while an account based on perceptual grouping predicts a substantial decrease, as across-channel sensitivity to envelope disparity decreases with increasing envelope modulation rate. The CDD was independent of bandwidth. Experiment 3 showed no effect of masker–signal onset asynchrony on CDD, even though asynchrony should reduce perceptual grouping. An explanation of CDD is proposed based on the suppression that has been observed in cochlear mechanics and in the auditory nerve.
112(2002); http://dx.doi.org/10.1121/1.1506692
Although the ratio of direct-to-reverberant sound energy is known to be an important acoustic cue to sound source distance, human sensitivity to changes in this cue is largely unknown. Here, direct-to-reverberant energy discrimination thresholds were measured for six listeners using virtual sound source techniques that allow for convenient and precise control of this stimulus parameter. Four different types of source stimuli were tested: a 50 ms noise burst with abrupt onset/offset, a 300 ms duration noise burst with gradual onset/offset, a speech syllable, and an impulse. Over a range of direct-to-reverberant ratios from 0 to 20 dB, an adaptive 2AFC procedure (3-down, 1-up) was used to measure discrimination thresholds. For all stimuli, these thresholds ranged from 5 to 6 dB. A post hoc fitting procedure confirmed that slopes of the psychometric functions were homogeneous across stimulus conditions and listeners. These threshold results suggest that direct-to-reverberant energy ratio by itself provides only a coarse coding of sound source distance, because threshold values correspond to greater than 2-fold changes in physical distance for the acoustic environment under examination.
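The 3-down, 1-up adaptive rule mentioned above can be sketched generically: the tracked level decreases after three consecutive correct responses and increases after any incorrect response, so the track converges on the level giving about 79.4% correct (the value p satisfying p³ = 0.5). The class name, starting level, and step size below are hypothetical; the abstract does not give the step sizes or the stopping rule used in the study.

```python
class ThreeDownOneUp:
    """Minimal 3-down, 1-up staircase (illustrative, not the authors' code)."""

    def __init__(self, start, step):
        self.level = start      # current stimulus difference, e.g. in dB
        self.step = step        # fixed step size (assumed constant)
        self.run = 0            # consecutive correct responses so far
        self.direction = 0      # last move: -1 down, +1 up, 0 none yet
        self.reversals = []     # levels at which the track changed direction

    def update(self, correct):
        """Register one response and return the level for the next trial."""
        move = 0
        if correct:
            self.run += 1
            if self.run == 3:   # three in a row: make the task harder
                self.run = 0
                move = -1
        else:                   # any error: make the task easier
            self.run = 0
            move = +1
        if move:
            if self.direction and move != self.direction:
                self.reversals.append(self.level)
            self.direction = move
            self.level += move * self.step
        return self.level


def target_proportion_correct():
    """Proportion correct at which a 3-down, 1-up track is in equilibrium."""
    return 0.5 ** (1.0 / 3.0)
```

A threshold estimate is commonly taken as the mean of the levels stored in `reversals` after discarding the first few, although that averaging choice is also an assumption here.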
112(2002); http://dx.doi.org/10.1121/1.1510140
A stimulator array is described which can deliver a wide range of displacement waveforms from each contactor, allowing vibratory stimuli to be targeted towards different populations of mechanoreceptors in the skin. The array has a working bandwidth of 20–400 Hz and 100 moving contactors covering an area of on the fingertip. The array was validated with two experiments on the perception of moving vibratory targets within a uniform background vibration. In the first experiment, with target and background at the same frequency, equivalent discrimination of target movement was obtained at higher values of target/background amplitude ratio for 40-Hz stimuli than for 320-Hz stimuli. In the second experiment, discrimination of target movement within a different-frequency background (320-Hz target and 40-Hz background, or vice versa) was found to be much easier than within a same-frequency background. These results suggest that tactile spatial acuity is better at 320 Hz than 40 Hz and that it is possible to target different receptor populations in the skin by using these frequencies. However, there are problems with this interpretation: on the basis of characterization of touch receptors in previous studies, spatial acuity is expected to be worse at 320 Hz than at 40 Hz.