Volume 118, Issue 4, October 2005
Index of content:
- SPEECH PERCEPTION 
118 (2005); http://dx.doi.org/10.1121/1.2031975
This study compared how normal-hearing (NH) listeners and listeners with moderate to moderately severe cochlear hearing loss (HI) use and combine information within and across frequency regions in the perceptual separation of competing vowels with fundamental frequency differences (ΔF0) ranging from 0 to 9 semitones. Following the procedure of Culling and Darwin [J. Acoust. Soc. Am. 93, 3454–3467 (1993)], eight NH listeners and eight HI listeners identified competing vowels with either a consistent or inconsistent harmonic structure. Vowels were amplified to assure audibility for HI listeners. The contribution of frequency region depended on the value of ΔF0 between the competing vowels. When ΔF0 was small, both groups of listeners effectively utilized cues in the low-frequency region. In contrast, HI listeners derived significantly less benefit than NH listeners from cues conveyed by the high-frequency region at small ΔF0's. At larger ΔF0's, both groups combined cues from the low and high formant-frequency regions. Cochlear impairment appears to negatively impact the ability to use cues for within-formant grouping in the high-frequency region. However, cochlear loss does not appear to disrupt the ability to use within-formant cues in the low-frequency region or to group cues across formant regions.
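For readers unfamiliar with the semitone scale, a ΔF0 of n semitones corresponds to a frequency ratio of 2^(n/12). A minimal sketch of the conversion (the 100 Hz base F0 is an illustrative assumption, not a value taken from the study):

```python
# Convert a fundamental-frequency difference in semitones to Hz.
# The 100 Hz reference F0 is an assumed, illustrative value.

def delta_f0_hz(base_f0: float, semitones: float) -> float:
    """Return the competing vowel's F0 given a base F0 and a
    semitone offset (a ratio of 2**(n/12) for n semitones)."""
    return base_f0 * 2 ** (semitones / 12)

base = 100.0  # Hz (assumed)
for n in range(0, 10):  # the 0-9 semitone range used in the study
    print(f"dF0 = {n} st -> competing F0 = {delta_f0_hz(base, n):.1f} Hz")
```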
118 (2005); http://dx.doi.org/10.1121/1.2040047
Among the most influential publications in speech perception is Liberman, Delattre, and Cooper's [Am. J. Psychol. 65, 497–516 (1952)] report on the identification of synthetic, voiceless stops generated by the Pattern Playback. Their map of stop consonant identification shows a highly complex relationship between acoustics and perception. This complex mapping poses a challenge to many classes of relatively simple pattern recognition models, which are unable to capture the original finding of Liberman et al. that identification of /k/ was bimodal for bursts preceding front vowels but otherwise unimodal. A replication of this experiment was conducted in an attempt to reproduce these identification patterns using a simulation of the Pattern Playback device. Examination of spectrographic data from stimuli generated by the Pattern Playback revealed additional spectral peaks that are consistent with the harmonic distortion characteristic of tube amplifiers of that era. Only when harmonic distortion was introduced did bimodal /k/ responses in front-vowel context emerge. The acoustic consequence of this distortion is to add, e.g., a high-frequency peak to a mid-frequency burst or a mid-frequency peak to a low-frequency burst. This likely resulted in additional /k/ responses when the second peak approximated the second formant of front vowels. Although these results do not challenge the main observation made by Liberman et al. that perception of stop bursts is context dependent, they do show that the mapping from acoustics to perception is much less complex without these additional distortion products.
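The distortion account lends itself to a quick numerical illustration: passing a band-limited burst through a memoryless nonlinearity (a crude stand-in for tube-amplifier behavior, not the authors' analysis) adds energy at harmonics of the burst frequency. A minimal sketch, with all parameters (sample rate, burst frequency, drive level) assumed for illustration:

```python
import numpy as np

fs = 16_000                              # sample rate in Hz (assumed)
t = np.arange(0, 0.05, 1 / fs)           # 50 ms burst
burst = np.sin(2 * np.pi * 1200 * t)     # "mid-frequency" burst at 1.2 kHz

# Soft clipping, a crude model of tube-amplifier distortion,
# generates odd harmonics (3.6 kHz, 6 kHz, ...) above the burst.
distorted = np.tanh(3.0 * burst)

spectrum = np.abs(np.fft.rfft(distorted))
freqs = np.fft.rfftfreq(len(distorted), 1 / fs)
top = np.sort(freqs[np.argsort(spectrum)[-3:]])
print(f"strongest spectral components near {top.round()} Hz")
```

Run as written, this reports components near 1200, 3600, and 6000 Hz: the added high-frequency peaks are exactly the kind that could approximate a front vowel's second formant.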
118 (2005); http://dx.doi.org/10.1121/1.2005887
This study examined the effect of noise on the identification of four synthetic speech continua (/rɑ/-/lɑ/, /wɑ/-/jɑ/, /i/-/u/, and say-stay) by adults with cochlear implants (CIs) and adults with normal-hearing (NH) sensitivity, tested in quiet and in noise. Significant group-by-SNR interactions were found for endpoint identification accuracy for all continua except /i/-/u/. The CI listeners showed the least NH-like identification functions for the /rɑ/-/lɑ/ and /wɑ/-/jɑ/ continua. In a second experiment, NH adults identified four- and eight-band cochlear implant simulations of the four continua, to examine whether group differences in frequency selectivity could account for the group differences in the first experiment. Number of bands and SNR interacted significantly for /rɑ/-/lɑ/, /wɑ/-/jɑ/, and say-stay endpoint identification; the strongest effects were found for the /rɑ/-/lɑ/ and say-stay continua. Results suggest that the speech features most vulnerable to misperception in noise by listeners with CIs are those whose acoustic cues are rapidly changing spectral patterns, like the formant transitions in the /wɑ/-/jɑ/ and /rɑ/-/lɑ/ continua. However, the group differences in the first experiment cannot be wholly attributed to differences in frequency selectivity, as the number of bands in the second experiment affected performance differently than the group differences in the first experiment would suggest.
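The "four- and eight-band cochlear implant simulations" refer to the standard noise-vocoder technique: split the signal into analysis bands, extract each band's temporal envelope, and use it to modulate band-limited noise. A minimal sketch of that general technique (band edges, filter order, and the placeholder input are assumptions, not the study's exact parameters):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(signal: np.ndarray, fs: float, n_bands: int) -> np.ndarray:
    """Noise-vocode `signal` into `n_bands` channels (generic sketch)."""
    edges = np.geomspace(100, 6000, n_bands + 1)   # assumed band edges
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)            # analysis band
        envelope = np.abs(hilbert(band))           # temporal envelope
        carrier = sosfiltfilt(sos, np.random.randn(len(signal)))
        out += envelope * carrier                  # envelope-modulated noise
    return out

fs = 16_000
t = np.arange(0, 0.3, 1 / fs)
tone = np.sin(2 * np.pi * 440 * t)                 # placeholder input signal
four_band = vocode(tone, fs, n_bands=4)
eight_band = vocode(tone, fs, n_bands=8)
```

Fewer bands mean coarser spectral resolution, which is why the number of bands serves as a proxy for frequency selectivity in the second experiment.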
Incidental categorization of spectrally complex non-invariant auditory stimuli in a computer game task
118 (2005); http://dx.doi.org/10.1121/1.2011156
This study examined perceptual learning of spectrally complex nonspeech auditory categories in an interactive multi-modal training paradigm. Participants played a computer game in which they navigated through a three-dimensional space while responding to animated characters encountered along the way. Characters' appearances in the game correlated with distinctive sound category distributions, exemplars of which repeated each time the characters were encountered. As the game progressed, the speed and difficulty of required tasks increased and characters became harder to identify visually, so quick identification of approaching characters by sound patterns was, although never required or encouraged, of gradually increasing benefit. After 30 min of play, participants performed a categorization task, matching sounds to characters. Despite not being informed of audio-visual correlations, participants exhibited reliable learning of these patterns at posttest. Categorization accuracy was related to several measures of game performance, and category learning was sensitive to category distribution differences modeling acoustic structures of speech categories. Category knowledge resulting from the game was qualitatively different from that gained from an explicit unsupervised categorization task involving the same stimuli. Results are discussed with respect to information sources and mechanisms involved in acquiring complex, context-dependent auditory categories, including phonetic categories, and to multi-modal statistical learning.
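The "sound category distributions" tied to each character can be pictured as draws from overlapping distributions over acoustic parameters. A minimal sketch of the idea, in which the two-dimensional Gaussian parameterization, the parameter space, and all values are illustrative assumptions rather than the study's actual stimulus design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical exemplar generator: each game character is tied to a
# category defined by a distribution over acoustic parameters
# (here, a 2-D Gaussian over [center frequency in Hz, duration in ms];
# all means and covariances are illustrative assumptions).
categories = {
    "character_A": {"mean": [800, 150], "cov": [[90_000, 0], [0, 400]]},
    "character_B": {"mean": [2400, 250], "cov": [[90_000, 0], [0, 400]]},
}

def draw_exemplar(name: str) -> np.ndarray:
    """Sample one acoustic exemplar from a character's category."""
    c = categories[name]
    return rng.multivariate_normal(c["mean"], c["cov"])

for name in categories:
    print(name, draw_exemplar(name).round(1))
```

Because exemplars vary from draw to draw while the underlying distributions stay fixed, a player can only succeed by learning the category structure rather than memorizing individual sounds.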