Index of content:
Volume 120, Issue 1, July 2006
- SPEECH PERCEPTION 
120(2006); http://dx.doi.org/10.1121/1.2208457
Perception of breathy voice quality appears to be cued by changes in the vowel spectrum. These changes are related to alterations in the intensity of aspiration noise and the spectral slope of the harmonic energy [Shrivastav and Sapienza, J. Acoust. Soc. Am., 114 (4), 2217–2224 (2003)]. Ten young-adult listeners with normal hearing were tested using an adaptive listening task to determine the smallest change in signal-to-noise ratio that resulted in a change in breathiness. Six vowel continua, three female and three male, were generated using a Klatt synthesizer and served as stimuli. Results showed that listeners needed as much as a 20-dB increase in aspiration noise to perceive a change in breathiness against a relatively normal voice. In contrast, listeners needed approximately an 11-dB increase in aspiration noise to discriminate breathiness against a severely breathy voice. The difference limens for breathiness were observed to vary across the six talkers. Voices having aspiration noise that was predominantly in the high frequencies had smaller difference limens. No significant differences between male and female voices were observed.
120(2006); http://dx.doi.org/10.1121/1.2208427
Three experiments were conducted to study the relative contributions of speaking rate, temporal envelope, and temporal fine structure to clear speech perception. Experiment I used uniform time scaling to match the speaking rate between clear and conversational speech. Experiment II decreased the speaking rate in conversational speech without processing artifacts by increasing silent gaps between phonetic segments. Experiment III created “auditory chimeras” by mixing the temporal envelope of clear speech with the fine structure of conversational speech, and vice versa. Speech intelligibility in normal-hearing listeners was measured over a wide range of signal-to-noise ratios to derive speech reception thresholds (SRT). The results showed that processing artifacts in uniform time scaling, particularly time compression, reduced speech intelligibility. Inserting gaps in conversational speech improved the SRT by , but this improvement might be a result of increased short-term signal-to-noise ratios during level normalization. Data from auditory chimeras indicated that the temporal envelope cue contributed more to the clear speech advantage at high signal-to-noise ratios, whereas the temporal fine structure cue contributed more at low signal-to-noise ratios. Taken together, these results suggest that acoustic cues for the clear speech advantage are multiple and distributed.
120(2006); http://dx.doi.org/10.1121/1.2203595
In a follow-up study to that of Bent and Bradlow (2003), carrier sentences containing familiar keywords were read aloud by five talkers (Korean high proficiency; Korean low proficiency; Saudi Arabian high proficiency; Saudi Arabian low proficiency; native English). The intelligibility of these keywords to 50 listeners in four first language groups (Korean, ; Saudi Arabian, ; native English, ; other mixed first languages, ) was measured in a word recognition test. In each case, the non-native listeners found the non-native low-proficiency talkers who did not share the same first language as the listeners the least intelligible, at statistically significant levels, while not finding the low-proficiency talker who shared their own first language similarly unintelligible. These findings indicate a mismatched interlanguage speech intelligibility detriment for low-proficiency non-native speakers and a potential intelligibility problem between mismatched first language low-proficiency speakers unfamiliar with each other's accents in English. There was no strong evidence either to support an intelligibility benefit for the high-proficiency non-native talkers to the listeners from a different first language background or to indicate that the native talkers were more intelligible than the high-proficiency non-native talkers to any of the listeners.