Table of contents:
Volume 125, Issue 4, April 2009
- SPEECH PERCEPTION 
125(2009); http://dx.doi.org/10.1121/1.3086269
Previous research estimating vowel formant discrimination thresholds in words and sentences has often employed a modified two-alternative forced-choice (2AFC) task with adaptive tracking. Although this approach has produced stable data, the length and number of experimental sessions, as well as the unnaturalness of the task, limit generalization of results to ordinary speech communication. In this exploratory study, a typical identification task was used to estimate vowel formant discrimination thresholds. Specifically, a signal detection theory approach was used to develop a method for estimating vowel formant discrimination thresholds from a quicker, more natural single-interval classification task. In experiment 1, “classification thresholds” for words in isolation and embedded in sentences were compared to previously collected 2AFC data. Experiment 2 used a within-subjects design to compare thresholds estimated from both the classification and 2AFC tasks. Because of instabilities observed in the experiment 1 sentence data, experiment 2 examined only isolated words. Results from these experiments show that, for isolated words, thresholds estimated using the classification procedure are comparable to those estimated using the 2AFC task. These results, together with an analysis of several aspects of the classification procedure, support the viability of this new approach for estimating discrimination thresholds for speech stimuli.
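The signal detection theory machinery behind such a classification task can be sketched briefly: in a single-interval (yes/no) design, sensitivity is summarized by d′, the difference between the z-transformed hit and false-alarm rates; a discrimination threshold can then be defined as the stimulus change at which d′ reaches a chosen criterion. The function name and example rates below are illustrative, not values from the study.

```python
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)
    for a single-interval (yes/no) classification task."""
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    return z(hit_rate) - z(fa_rate)

# Illustrative rates only: 85% hits, 20% false alarms.
print(round(d_prime(0.85, 0.20), 3))  # -> 1.878
```

In practice d′ would be computed at several formant-shift magnitudes and the threshold read off where d′ crosses the criterion value.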
125(2009); http://dx.doi.org/10.1121/1.3083233
Ideal binary time-frequency masking is a signal separation technique that retains mixture energy in time-frequency units where the local signal-to-noise ratio exceeds a certain threshold and rejects mixture energy in all other time-frequency units. Two experiments were designed to assess the effects of ideal binary masking on the speech intelligibility of both normal-hearing (NH) and hearing-impaired (HI) listeners in different kinds of background interference. The results from Experiment 1 demonstrate that ideal binary masking leads to substantial reductions in speech-reception threshold for both NH and HI listeners, and that the reduction is greater in a cafeteria background than in speech-shaped noise. Furthermore, listeners with hearing loss benefit more than listeners with normal hearing, particularly for cafeteria noise, and ideal masking nearly equalizes the speech intelligibility performance of NH and HI listeners in noisy backgrounds. The results from Experiment 2 suggest that ideal binary masking in the low-frequency range yields larger intelligibility improvements than in the high-frequency range, especially for listeners with hearing loss. The findings from the two experiments have major implications for understanding speech perception in noise, computational auditory scene analysis, speech enhancement, and hearing aid design.
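The masking rule itself is simple to state in code. The sketch below (numpy, operating on hypothetical magnitude spectrograms; the local-criterion parameter name `lc_db` is an assumption following common usage, not the paper's notation) keeps time-frequency units whose local SNR exceeds the threshold and zeros the rest.

```python
import numpy as np

def ideal_binary_mask(target_mag, noise_mag, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units: 1 where the
    local SNR (target power over noise power, in dB) exceeds lc_db."""
    eps = 1e-12  # avoid log of zero in silent units
    snr_db = 10.0 * np.log10((target_mag ** 2 + eps) /
                             (noise_mag ** 2 + eps))
    return (snr_db > lc_db).astype(float)

# Toy 1x2 "spectrogram": first unit is target-dominated (+20 dB SNR),
# second is noise-dominated (-20 dB SNR).
mask = ideal_binary_mask(np.array([[1.0, 0.1]]), np.array([[0.1, 1.0]]))
# The mask multiplies the mixture spectrogram before resynthesis:
# retained units keep the mixture energy, rejected units are zeroed.
```

Note that this is "ideal" because it requires the target and noise signals separately, before mixing; estimating the mask from the mixture alone is the hard problem addressed by computational auditory scene analysis.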
Perception of allophonic cues to English word boundaries by Japanese second language learners of English
125(2009); http://dx.doi.org/10.1121/1.3082103
Perception of stop aspiration and glottal stop allophonic cues to word juncture in English by Japanese second language (L2) learners of English was examined, extending a study of Spanish L2 learners [Altenberg, E. P. (2005). Second Lang. Res. 21, 325–358]. Thirty Japanese listeners varying in length of residence (LOR) in the United States were tested on 42 contrasting pairs (e.g., aspiration: keeps talking vs keep stalking; glottal stop: a nice man vs an ice man; and double cues: grape in vs grey pin). Phrases were presented in randomly ordered lists, and subjects responded in a two-choice identification task followed by a phrase familiarity test. The Japanese listeners performed more poorly than an American English-speaking control group, especially on aspiration pairs. Aspiration pairs were differentiated significantly less well (73% correct) by Japanese listeners than were glottal stop pairs (91% correct) and double cue pairs (94% correct); response biases predicted from the relative familiarity of the phrases were evident only for aspiration pairs. Performance correlated with LOR, suggesting that aspiration cues take more immersion experience to learn than glottal stop cues. The patterns of errors were similar, but not identical, to those in Altenberg’s Spanish data.
125(2009); http://dx.doi.org/10.1121/1.3082117
This study investigates the relative contributions of auditory and cognitive factors to the common finding that an increase in speech rate affects elderly listeners more than young listeners. Since a direct relation between non-auditory factors, such as age-related cognitive slowing, and fast speech performance has been difficult to demonstrate, the present study took an on-line, rather than off-line, approach and focused on processing time. Elderly and young listeners were presented with speech at two rates of time compression and were asked to detect pre-assigned target words as quickly as possible. A number of auditory and cognitive measures were entered into a statistical model as predictors of the elderly participants’ fast speech performance: hearing acuity, an information processing rate measure, and two measures of reading speed. The results showed that hearing loss played a primary role in explaining elderly listeners’ increased difficulty with fast speech. However, non-auditory factors, such as reading speed and the extent to which participants were affected by an increased rate of presentation in a visual analog of the listening experiment, also predicted fast speech performance differences among the elderly participants. These on-line results confirm that slowed information processing is indeed part of elderly listeners’ difficulty in keeping up with fast speech.
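The predictor analysis described above amounts to a multiple regression of fast-speech performance on the auditory and cognitive measures. A minimal ordinary-least-squares sketch, with entirely fabricated toy numbers standing in for per-participant data (the variable names and values are assumptions, not the study's):

```python
import numpy as np

# Toy per-participant predictors (one row per elderly participant):
# hearing acuity (dB HL), information-processing rate, reading speed.
# All values are illustrative only.
X = np.array([[25.0, 1.2, 210.0],
              [40.0, 0.9, 180.0],
              [35.0, 1.0, 195.0],
              [50.0, 0.8, 160.0]])
y = np.array([0.82, 0.61, 0.70, 0.48])  # fast-speech detection accuracy

# Ordinary least squares with an intercept column prepended.
Xd = np.column_stack([np.ones(len(X)), X])
coef, residuals, rank, _ = np.linalg.lstsq(Xd, y, rcond=None)
predicted = Xd @ coef  # fitted fast-speech performance per participant
```

With real data one would of course inspect coefficient signs and significance to see which predictors (auditory vs non-auditory) carry the explanatory weight.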