Index of content:
Volume 125, Issue 2, February 2009
- SPEECH PERCEPTION 
The interaction of vocal characteristics and audibility in the recognition of concurrent syllables
125 (2009); http://dx.doi.org/10.1121/1.3050321
In concurrent-speech recognition, performance is enhanced when either the glottal pulse rate (GPR) or the vocal tract length (VTL) of the target speaker differs from that of the distracter, but relatively little is known about the trading relationship between the two variables, or how they interact with other cues such as signal-to-noise ratio (SNR). This paper presents a study in which listeners were asked to identify a target syllable in the presence of a distracter syllable, with carefully matched temporal envelopes. The syllables varied in GPR and VTL over a large range, and they were presented at different SNRs. The results showed that performance is particularly sensitive to the combination of GPR and VTL when the SNR is . Equal-performance contours showed that when there are no other cues, a two-semitone difference in GPR produced the same advantage in performance as a 20% difference in VTL. This corresponds to a trading relationship between GPR and VTL of 1.6. The results illustrate that the auditory system can use any combination of differences in GPR, VTL, and SNR to segregate competing speech signals.
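The trading figure quoted in the abstract above can be checked with a little arithmetic. The abstract does not say how the value of 1.6 is defined. The sketch below assumes it is the ratio of the logarithmic VTL difference to the logarithmic GPR difference along an equal-performance contour. The function names are hypothetical and chosen for illustration only.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Convert a pitch interval in semitones to a frequency ratio."""
    return 2.0 ** (semitones / 12.0)


def log_ratio_trade(gpr_semitones: float, vtl_percent: float) -> float:
    """Ratio of log VTL change to log GPR change (assumed definition)."""
    gpr_log = math.log(semitones_to_ratio(gpr_semitones))
    vtl_log = math.log(1.0 + vtl_percent / 100.0)
    return vtl_log / gpr_log


# Values taken from the abstract: 2 semitones of GPR vs. 20% of VTL.
trade = log_ratio_trade(2.0, 20.0)
print(f"{trade:.2f}")  # prints 1.58
```

Under this assumed definition, a two-semitone GPR difference (a frequency ratio of about 1.12) and a 20% VTL difference give a ratio of about 1.58, which rounds to the 1.6 reported.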
Identifying isolated, multispeaker Mandarin tones from brief acoustic input: A perceptual and acoustic study
125 (2009); http://dx.doi.org/10.1121/1.3050322
Lexical tone identification relies primarily on the processing of F0. Since F0 range differs across individuals, the interpretation of F0 usually requires reference to specific speakers. This study examined whether multispeaker Mandarin tone stimuli could be identified without cues commonly considered necessary for speaker normalization. The syllables, produced by 16 speakers of each gender, were digitally processed such that only the fricative and the first six glottal periods remained in the stimuli, neutralizing the dynamic F0 contrasts among the tones. Each stimulus was presented once, in isolation, to 40 native listeners who had no prior exposure to the speakers’ voices. Chi-square analyses showed that tone identification accuracy exceeded chance, as did tone classification based on F0 height. Acoustic analyses showed contrasts between the high- and low-onset tones in F0, duration, and two voice quality measures (F1 bandwidth and spectral tilt). Correlation analyses showed that F0 covaried with the voice quality measures and that tone classification based on F0 height also correlated with these acoustic measures. Since the same acoustic measures consistently distinguished the female from the male stimuli, gender detection may be implicated in F0 height estimation when no context, dynamic F0, or familiarity with speaker voices is available.
Language experience and consonantal context effects on perceptual assimilation of French vowels by American-English learners of French
125 (2009); http://dx.doi.org/10.1121/1.3050256
Recent research has called for an examination of perceptual assimilation patterns in second-language speech learning. This study examined the effects of language learning and consonantal context on perceptual assimilation of Parisian French (PF) front rounded vowels /y/ and /œ/ by American English (AE) learners of French. AE listeners differing in their French language experience (no experience, formal instruction, formal-plus-immersion experience) performed an assimilation task involving PF /y, œ, u, o, i, ε, a/ in bilabial /rabVp/ and alveolar /radVt/ contexts, presented in phrases. PF front rounded vowels were assimilated overwhelmingly to back AE vowels. For PF /œ/, assimilation patterns differed as a function of language experience and consonantal context. However, PF /y/ revealed no experience effect in alveolar context. In bilabial context, listeners with extensive experience assimilated PF /y/ to less often than listeners with no or only formal experience, a pattern predicting the poorest /u-y/ discrimination for the most experienced group. An “internal consistency” analysis indicated that responses were most consistent with extensive language experience and in bilabial context. Acoustical analysis revealed that acoustical similarities among PF vowels alone cannot explain context-specific assimilation patterns. Instead it is suggested that native-language allophonic variation influences context-specific perceptual patterns in second-language learning.
Intelligibility of interrupted sentences at subsegmental levels in young normal-hearing and elderly hearing-impaired listeners
125 (2009); http://dx.doi.org/10.1121/1.3021304
Although listeners can partially understand sentences interrupted by silence or noise, and their performance depends on the characteristics of the glimpses, few studies have examined effects of the types of segmental and subsegmental information on sentence intelligibility. Given the finding that intelligibility from vowel-only glimpses is twice that from consonants [Kewley-Port et al. (2007). “Contribution of consonant versus vowel information to sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners,” J. Acoust. Soc. Am. 122, 2365–2375], this study examined young normal-hearing and elderly hearing-impaired (EHI) listeners’ intelligibility of interrupted sentences that preserved four different types of subsegmental cues (steady-states at centers or transitions at margins; vowel onset or offset transitions). Forty-two interrupted sentences from TIMIT were presented twice at SPL, first with 50% and second with 70% of sentence duration. Compared to high sentence intelligibility for uninterrupted sentences, interrupted sentences had significant decreases in performance for all listeners, with a larger decrease for EHI listeners. Scores for both groups were significantly better for 70% duration than for 50% but were not significantly different for the type of subsegmental information. Performance by EHI listeners was associated with their high-frequency hearing thresholds rather than with age. Together with previous results using segmental interruption, these findings indicate that preservation of vowels in interrupted sentences provides greater benefit to sentence intelligibility than preservation of consonants or subsegmental cues.