Volume 116, Issue 6, December 2004
Index of content:
- SPEECH PERCEPTION 
116(2004); http://dx.doi.org/10.1121/1.1815131
This study explored the perceptual benefits of brief exposure to non-native speech. Native English listeners were exposed to English sentences produced by non-native speakers. Perceptual processing speed was tracked by measuring reaction times to visual probe words following each sentence. Three experiments using Spanish- and Chinese-accented speech indicate that processing speed is initially slower for accented speech than for native speech but that this deficit diminishes within one minute of exposure. Control conditions rule out explanations for the adaptation effect based on practice with the task and general strategies for dealing with difficult speech. Further results suggest that adaptation can occur within as few as two to four sentence-length utterances. The findings emphasize the flexibility of human speech processing and require models of spoken word recognition that can rapidly accommodate significant acoustic-phonetic deviations from native language speech patterns.
Enhancing Chinese tone recognition by manipulating amplitude envelope: Implications for cochlear implants
116(2004); http://dx.doi.org/10.1121/1.1783352
Tone recognition is important for speech understanding in tonal languages such as Mandarin Chinese. Cochlear implant patients are able to perceive some tonal information by using temporal cues such as periodicity-related amplitude fluctuations and similarities between the fundamental frequency (F0) contour and the amplitude envelope. The present study investigates whether modifying the amplitude envelope to better resemble the F0 contour can further improve tone recognition in multichannel cochlear implants. Chinese tone and vowel recognition were measured for six native Chinese normal-hearing subjects listening to a simulation of a four-channel cochlear implant speech processor with and without amplitude envelope enhancement. Two algorithms were proposed to modify the amplitude envelope to more closely resemble the F0 contour. In the first algorithm, the amplitude envelope as well as the modulation depth of periodicity fluctuations was adjusted for each spectral channel. In the second algorithm, the overall amplitude envelope was adjusted before multichannel speech processing, thus reducing any local distortions to the speech spectral envelope. The results showed that both algorithms significantly improved Chinese tone recognition. By adjusting the overall amplitude envelope to match the F0 contour before multichannel processing, vowel recognition was better preserved and less speech-processing computation was required. The results suggest that modifying the amplitude envelope to more closely resemble the F0 contour may be a useful approach toward improving Chinese-speaking cochlear implant patients’ tone recognition.
116(2004); http://dx.doi.org/10.1121/1.1810292
Native American English and non-native (Dutch) listeners identified either the consonant or the vowel in all possible American English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (0, 8, and 16 dB). The phoneme identification performance of the non-native listeners was less accurate than that of the native listeners. All listeners were adversely affected by noise. With these isolated syllables, initial segments were harder to identify than final segments. Crucially, the effects of language background and noise did not interact; the performance asymmetry between the native and non-native groups was not significantly different across signal-to-noise ratios. It is concluded that the frequently reported disproportionate difficulty of non-native listening under disadvantageous conditions is not due to a disproportionate increase in phoneme misidentifications.
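As an illustration of what presenting a syllable at a fixed signal-to-noise ratio involves, the following minimal sketch scales a noise track so that the RMS level ratio matches a target SNR in dB. It is not the authors' procedure: the function names, the sine-wave stand-in for a syllable, and the Gaussian-noise stand-in for multispeaker babble are all assumptions made for this example.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that 20*log10(rms(signal)/rms(scaled noise)) == snr_db.

    Returns the mixture and the gain that was applied to the noise.
    """
    gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20))
    mixture = [s + gain * n for s, n in zip(signal, noise)]
    return mixture, gain


# Hypothetical stand-ins: a 220 Hz tone for the syllable, Gaussian noise for babble.
random.seed(0)
syllable = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(1600)]
babble = [random.gauss(0.0, 1.0) for _ in range(1600)]

# The three SNRs named in the abstract.
for snr_db in (0, 8, 16):
    mixture, gain = mix_at_snr(syllable, babble, snr_db)
    achieved = 20 * math.log10(rms(syllable) / (gain * rms(babble)))
    print(f"target {snr_db} dB, achieved {achieved:.3f} dB")
```

Scaling the noise rather than the speech keeps the speech level constant across conditions, which is one common design choice; the abstract does not say which was used.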
Analysis of speech-based speech transmission index methods with implications for nonlinear operations
116(2004); http://dx.doi.org/10.1121/1.1804628
The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech.
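For readers unfamiliar with the intermediate metrics of the traditional STI, the sketch below walks through the usual chain for the additive-noise case only: modulation index, apparent SNR clipped to ±15 dB, and transmission index. It is a simplified illustration and not any of the eight methods analyzed in the paper. The function names are hypothetical, and the equal weighting across bands is an assumption made here for brevity; the standardized STI uses band-specific weights and several modulation frequencies per band.

```python
import math


def modulation_index_from_snr(snr_db):
    """Modulation reduction caused by stationary additive noise at a given SNR."""
    return 1.0 / (1.0 + 10 ** (-snr_db / 10))


def apparent_snr_db(m, limit=15.0):
    """Convert a modulation index to an apparent SNR, clipped to [-limit, +limit] dB."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)  # guard against log(0) and division by zero
    snr = 10 * math.log10(m / (1.0 - m))
    return max(-limit, min(limit, snr))


def transmission_index(snr_db, limit=15.0):
    """Map a clipped apparent SNR linearly onto the range 0..1."""
    return (snr_db + limit) / (2 * limit)


def sti_equal_weights(band_snrs_db):
    """Average the per-band transmission indices (equal weights: an assumption)."""
    indices = [
        transmission_index(apparent_snr_db(modulation_index_from_snr(s)))
        for s in band_snrs_db
    ]
    return sum(indices) / len(indices)


# Hypothetical per-band SNRs in dB, for illustration only.
print(sti_equal_weights([0.0, 5.0, 10.0]))
```

Because each step is a separate function, the intermediate values can be inspected individually, which mirrors the kind of analysis the abstract describes.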