Volume 128, Issue 4, October 2010
Index of content:
- SPEECH PRODUCTION 
128(2010); http://dx.doi.org/10.1121/1.3478856
This study concentrates on one of the commonly occurring phonetic variations in English: the stop-like modification of the dental fricative /ð/. The variant exhibits a drastic change from the canonical /ð/; the manner of articulation is changed from one that is fricative to one that is stop-like. Furthermore, the place of articulation of stop-like /ð/ has been a point of uncertainty, leading to confusion between stop-like /ð/ and /d/. In this study, acoustic and spectral moment measures were taken from 100 stop-like /ð/ and 102 /d/ tokens produced by 59 male and 23 female speakers in the TIMIT corpus. Data analysis indicated that stop-like /ð/ is significantly different from /d/ in burst amplitude, burst spectrum shape, burst peak frequency, second formant at following-vowel onset, and spectral moments. Moreover, the acoustic differences from /d/ are consistent with those expected for a dental stop-like /ð/. Automatic classification experiments involving these acoustic measures suggested that they are salient in distinguishing stop-like /ð/ from /d/.
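As a rough illustration of what spectral moment measures are, the sketch below computes the first four moments of a burst spectrum treated as a probability distribution over frequency. The abstract does not say which moments were used or how the spectra were estimated, so the function name `spectral_moments`, the normalisation, and the use of excess kurtosis are assumptions for illustration only.

```python
import math


def spectral_moments(freqs, power):
    """Return (centroid, sd, skewness, excess kurtosis) of a power spectrum.

    freqs : frequencies in Hz, one per spectral bin
    power : non-negative power values, same length as freqs
    The spectrum is normalised to sum to 1 and treated as a distribution.
    This is a generic textbook definition, not the study's own procedure.
    """
    if len(freqs) != len(power) or not freqs:
        raise ValueError("freqs and power must be non-empty and equal in length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    weights = [p / total for p in power]

    # First moment: spectral centre of gravity.
    centroid = sum(f * w for f, w in zip(freqs, weights))
    # Second central moment and its square root (spread).
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("all energy lies in one bin; higher moments undefined")
    # Third and fourth standardised moments (tilt and peakedness).
    skewness = sum((f - centroid) ** 3 * w for f, w in zip(freqs, weights)) / sd ** 3
    kurtosis = sum((f - centroid) ** 4 * w for f, w in zip(freqs, weights)) / sd ** 4 - 3
    return centroid, sd, skewness, kurtosis
```

In a comparison like the one described, these values would be computed per token from the burst spectrum and then compared across the two consonant categories.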
128(2010); http://dx.doi.org/10.1121/1.3479538
The study investigated the articulatory basis of locus equations, regression lines relating F2 at the start of a Consonant-Vowel (CV) transition to F2 at the middle of the vowel, with C fixed and V varying. Several studies have shown that consonants of different places of articulation have locus equation slopes that descend from labial to velar to alveolar, and intercept magnitudes that increase in the opposite order. Using formulas from the theory of bivariate regression that express regression slopes and intercepts in terms of standard deviations and averages of the variables, it is shown that the slope directly encodes a well-established measure of coarticulation resistance. It is also shown that intercepts are directly related to the degree to which the tongue body assists the formation of the constriction for the consonant. Moreover, it is shown that the linearity of locus equations and the linear relation between locus equation slopes and intercepts originate in linearity in articulation between the horizontal position of the tongue dorsum in the consonant and that in the vowel. It is concluded that slopes and intercepts of acoustic locus equations are measures of articulator synergy.
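The bivariate regression formulas mentioned in this abstract can be sketched as follows: the slope equals the correlation times the ratio of standard deviations, and the intercept follows from the two means. The sketch assumes the usual convention of regressing F2 at CV onset on F2 at vowel midpoint; the function name `locus_equation` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def locus_equation(f2_mid, f2_onset):
    """Fit F2_onset = slope * F2_mid + intercept for one consonant.

    f2_mid   : F2 (Hz) at vowel midpoint, one value per CV token
    f2_onset : F2 (Hz) at CV transition onset, same token order
    Returns (slope, intercept, r).
    """
    n = len(f2_mid)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 tokens")
    m_x, m_y = mean(f2_mid), mean(f2_onset)
    s_x, s_y = stdev(f2_mid), stdev(f2_onset)
    if s_x == 0:
        raise ValueError("vowel-midpoint F2 must vary across tokens")

    # Sample covariance, using the same n - 1 denominator as stdev().
    cov = sum((x - m_x) * (y - m_y) for x, y in zip(f2_mid, f2_onset)) / (n - 1)
    r = cov / (s_x * s_y) if s_y > 0 else 0.0

    # Standard least-squares identities in terms of SDs and means.
    slope = r * s_y / s_x
    intercept = m_y - slope * m_x
    return slope, intercept, r


if __name__ == "__main__":
    # Made-up F2 values for one consonant across four vowel contexts.
    mid = [1000.0, 1500.0, 2000.0, 2500.0]
    onset = [1400.0, 1650.0, 1900.0, 2150.0]
    print(locus_equation(mid, onset))
```

Writing the slope as `r * s_y / s_x` rather than calling a regression routine keeps the dependence on the two standard deviations visible, which is the form the abstract's argument relies on.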
Adaptive auditory feedback control of the production of formant trajectories in the Mandarin triphthong /iau/ and its pattern of generalization
128(2010); http://dx.doi.org/10.1121/1.3479539
To test whether auditory feedback is involved in the planning of complex articulatory gestures in time-varying phonemes, the current study examined native Mandarin speakers' responses to perturbations of their auditory feedback of the trajectory of the first formant frequency during their production of the triphthong /iau/. On average, subjects adaptively adjusted their productions to partially compensate for the perturbations in auditory feedback. This result indicates that auditory feedback control of speech movements is not restricted to quasi-static gestures in monophthongs as found in previous studies, but also extends to time-varying gestures. To probe the internal structure of the mechanisms of auditory-motor transformations, the pattern of generalization of the adaptation learned on the triphthong /iau/ to other vowels with different temporal and spatial characteristics (produced only under masking noise) was tested. A broad but weak pattern of generalization was observed; the strength of the generalization diminished with increasing dissimilarity from /iau/. The details and implications of the pattern of generalization are examined and discussed in light of previous sensorimotor adaptation studies of both speech and limb motor control and a neurocomputational model of speech motor control.
128(2010); http://dx.doi.org/10.1121/1.3458847
The theory of relational acoustic invariance [Pickett, E. R., et al. (1999). Phonetica 56, 135–157] was tested with the Japanese stop quantity distinction in disyllables spoken at various rates. The questions were whether the perceptual boundary between the two phonemic categories of single and geminate stops is invariant across rates, and whether there is a close correspondence between the perception and production boundaries. The durational ratio of stop closure to word (where the “word” was defined as disyllables) was previously found to be an invariant parameter that classified the two categories in production, but the present study found that this ratio varied with different speaking rates in perception. However, regression and discriminant analyses of perception and production data showed that treating stop closure as a function of word duration with an intercept term represented the perception and production boundaries very well. This result indicated that the durational ratio of adjusted stop closure (i.e., closure with an added constant) to the word was invariant and distinguished the two phonemic categories clearly. Taken together, the results support the relational acoustic invariance theory, and help refine the theory with regard to exactly what form ‘invariance’ can take.
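The step from a linear boundary with an intercept to an invariant adjusted ratio can be written out briefly. The symbols here are assumptions introduced for illustration and do not come from the abstract: $c^{*}(w)$ is the closure duration at the category boundary for word duration $w$, and $a$ and $b$ are the fitted slope and intercept, whose values and signs the abstract does not report.

```latex
\[
c^{*}(w) = a\,w + b
\quad\Longrightarrow\quad
\frac{c^{*}(w)}{w} = a + \frac{b}{w}
\quad\text{and}\quad
\frac{c^{*}(w) + k}{w} = a, \qquad k = -b .
\]
```

Under these assumptions the plain closure-to-word ratio drifts with speaking rate whenever $b \neq 0$, because $b/w$ changes as $w$ changes, whereas the ratio of the adjusted closure $c^{*}(w) + k$ to the word stays fixed at $a$.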
Spectral and temporal changes to speech produced in the presence of energetic and informational maskers
128(2010); http://dx.doi.org/10.1121/1.3478775
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.