Index of content:
Volume 116, Issue 4, October 2004
- SPEECH PRODUCTION 
116(2004); http://dx.doi.org/10.1121/1.1789491
Two subjects from the X-Ray Microbeam Speech Production Database were examined in their production of the vowels /ɪ/ and /ε/ in alveolar and dental consonant contexts. Secant lines, or first-order splines, between the three most anterior pellets were examined at vowel critical times. These critical times were zero crossings in the tangential acceleration of the midpoints of the secant lines. We expected and found, in general, that vowel reduction occurred as a function of vowel duration in measures of the secant line midpoint-to-palate distance and secant line orientation at vowel critical times. The shorter the vowel, the smaller the distance of the secant line midpoints to the palate and the less downward the orientation of the secant lines at the vowel critical times. Phonetic reduction was also apparent in the formant frequencies. There were differences between the speakers in terms of the range of vowel duration and degree of reduction. The subjects differed in the functional parts of the tongue spanned by the secant lines and the shape of their palates. These differences were factors in the observed relations between formant frequencies and the articulatory, secant line measures for each subject.
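The critical-time definition above (zero crossings in the tangential acceleration of a secant line midpoint) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the central-difference estimates, the fixed sampling interval `dt`, and the convention that y points upward are all assumptions made for illustration.

```python
import math

def midpoint(p, q):
    """Midpoint of the secant line joining pellets p and q, each an (x, y) pair."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def orientation_deg(p, q):
    """Angle of the secant line from p to q in degrees.

    Assumes y points upward, so a negative angle is a downward-sloping line.
    """
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def tangential_acceleration(path, dt):
    """Central-difference estimate of d(speed)/dt along a sampled 2-D path.

    Element k of the result corresponds to path sample k + 2.
    """
    speed = [
        math.hypot(path[i + 1][0] - path[i - 1][0],
                   path[i + 1][1] - path[i - 1][1]) / (2.0 * dt)
        for i in range(1, len(path) - 1)
    ]
    return [(speed[i + 1] - speed[i - 1]) / (2.0 * dt)
            for i in range(1, len(speed) - 1)]

def critical_sample_indices(path, dt):
    """Path indices nearest to each sign change of the tangential acceleration."""
    acc = tangential_acceleration(path, dt)
    hits = []
    for k in range(len(acc) - 1):
        if (acc[k] > 0) != (acc[k + 1] > 0):
            # Report whichever of the two bracketing samples is closer to zero.
            nearest = k if abs(acc[k]) <= abs(acc[k + 1]) else k + 1
            hits.append(nearest + 2)
    return hits
```

Here `path` would be the sequence of midpoints of one secant line over time. Real pellet data would normally be smoothed before differentiation, which this sketch omits.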
The distinctness of speakers’ productions of vowel contrasts is related to their discrimination of the contrasts
116(2004); http://dx.doi.org/10.1121/1.1787524
This study addresses the hypothesis that the more accurately a speaker discriminates a vowel contrast, the more distinctly the speaker produces that contrast. Measures of speech production and perception were collected from 19 young adult speakers of American English. In the production experiment, speakers repeated the words cod, cud, who’d, and hood in a carrier phrase at normal, clear, and fast rates. Articulatory movements and the associated acoustic signal were recorded, yielding measures of contrast distance between /ɑ/ and /ʌ/ and between /u/ and /ʊ/. In the discrimination experiment, sets of seven natural-sounding stimuli ranging from cod to cud and who’d to hood were synthesized, based on productions by one male and one female speaker. The continua were then presented to each of the 19 speakers in labeling and discrimination tasks. Consistent with the hypothesis, speakers with discrimination scores above the median produced greater acoustic contrasts than speakers with discrimination scores at or below the median. Such a relation between speech production and perception is compatible with a model of speech production in which articulatory movements for vowels are planned primarily in auditory space.
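The group comparison described above (scores above the median versus at or below the median) amounts to a median split. A minimal sketch follows, assuming one discrimination score and one contrast distance per speaker. The function name and any input values are hypothetical, and the study's statistical testing is not reproduced.

```python
from statistics import mean, median

def median_split_contrast(discrimination, contrast):
    """Mean contrast distance for speakers above vs. at or below the median score.

    discrimination and contrast are parallel lists, one entry per speaker.
    Returns (mean_for_high_group, mean_for_low_group).
    """
    cut = median(discrimination)
    high = [c for d, c in zip(discrimination, contrast) if d > cut]
    low = [c for d, c in zip(discrimination, contrast) if d <= cut]
    return mean(high), mean(low)
```

Because ties at the median go to the lower group, the two groups can differ in size, as they must with an odd number of speakers.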
116(2004); http://dx.doi.org/10.1121/1.1785571
One of the most important areas of study in speech motor control is the identification of control variables, the variables controlled by the nervous system during motor tasks. The current study examined two hypotheses regarding control variables in speech production: (1) pressure and resistance in the vocal tract are controlled, and (2) perceptual and acoustic accuracy are controlled. Aerodynamic and acoustic data were collected on 20 subjects in three conditions: normally (NT), with an open air pressure bleed tube in place (TWB), and with a closed bleed tube in place (TNB). The voice recordings collected from the speakers in the production study were used in the perceptual study. Results showed that oral pressure was significantly lower in the TWB condition than in the NT and TNB conditions. The in the TWB condition seemed to be related to maintenance of subglottal pressure. Examination of the perceptual and acoustic data indicated that perceptual accuracy for [ɑ] was achieved by maintaining to preserve a steady sound pressure level, fundamental frequency, and voicing. Overall, it appeared that speakers controlled pressure in compensating, but for the ultimate goal of maintaining acoustic and perceptual accuracy.
A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters
116(2004); http://dx.doi.org/10.1121/1.1715112
Three neural network models were trained on the forward mapping from articulatory positions to acoustic outputs for a single speaker of the Edinburgh multi-channel articulatory speech database. The model parameters (i.e., connection weights) were learned via the backpropagation of error signals generated by the difference between the acoustic outputs of the models and their acoustic targets. Efficacy of the trained models was assessed by subjecting the models’ acoustic outputs to speech intelligibility tests. The results of these tests showed that enough phonetic information was captured by the models to support rates of word identification as high as 84%, approaching an identification rate of 92% for the actual target stimuli. These forward models could serve as one component of a data-driven articulatory synthesizer. The models also provide the first step toward building a model of spoken word acquisition and phonological development trained on real speech.
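To illustrate the training principle mentioned above (connection weights adjusted by backpropagating the difference between model outputs and targets), here is a minimal sketch of a one-hidden-layer network in plain Python. It is not any of the three models in the paper: the architecture, the `tanh` units, the learning rate, the squared-error loss, and all names are assumptions, and the inputs and targets stand in for articulatory positions and acoustic parameters.

```python
import math
import random

def forward(x, w1, b1, w2, b2):
    """Hidden activations (tanh) and linear outputs for one input vector."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    out = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(w2, b2)]
    return h, out

def train_forward_model(inputs, targets, hidden=8, lr=0.05, epochs=200, seed=0):
    """Train by online gradient descent on 0.5 * squared error.

    Returns (predict, history), where history holds the mean squared error
    accumulated during each epoch.
    """
    rng = random.Random(seed)
    n_in, n_out = len(inputs[0]), len(targets[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(inputs, targets):
            h, out = forward(x, w1, b1, w2, b2)
            err = [o - t for o, t in zip(out, y)]  # output error signal
            total += sum(e * e for e in err)
            # Backpropagate through the output weights before updating them.
            delta_h = [(1.0 - hj * hj) * sum(err[k] * w2[k][j] for k in range(n_out))
                       for j, hj in enumerate(h)]
            for k in range(n_out):
                for j in range(hidden):
                    w2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * delta_h[j] * x[i]
                b1[j] -= lr * delta_h[j]
        history.append(total / len(inputs))

    def predict(x):
        return forward(x, w1, b1, w2, b2)[1]

    return predict, history
```

A realistic forward model would use far more data, held-out evaluation, and a numerical library. The sketch only shows how the output error drives the weight updates.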
Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners
116(2004); http://dx.doi.org/10.1121/1.1788730
Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting “clear” speech is significantly more intelligible than typical conversational speech. While variability among talkers during speech production is well known, only one study to date [Gagné et al., J. Acad. Rehab. Audiol. 27, 135–158 (1994)] has directly examined differences among talkers producing clear and conversational speech. Data from that study, which utilized ten talkers, suggested that talkers vary in the extent to which they improve their intelligibility by speaking clearly. Similar variability can also be seen in studies using smaller groups of talkers [e.g., Picheny, Durlach, and Braida, J. Speech Hear. Res. 28, 96–103 (1985)]. In the current paper, clear and conversational speech materials were recorded from 41 male and female talkers aged 18 to 45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. While female talkers showed a larger clear speech vowel intelligibility benefit than male talkers, neither talker age nor prior experience communicating with hearing-impaired listeners significantly affected the speaking style effect.