Volume 129, Issue 5, May 2011
Index of content:
- SPEECH PRODUCTION 
Effects of consonant manner and vowel height on intraoral pressure and articulatory contact at voicing offset and onset for voiceless obstruents
129(2011); http://dx.doi.org/10.1121/1.3561658
In obstruent consonants, a major constriction in the upper vocal tract yields an increase in intraoral pressure (P_io). Phonation requires that subglottal pressure (P_sub) exceed P_io by a threshold value, so as the transglottal pressure reaches the threshold, phonation will cease. This work investigates how P_io levels at phonation offset and onset vary before and after different German voiceless obstruents (stops, fricatives, affricates, clusters), and with following high vs low vowels. Articulatory contacts, measured using electropalatography, were recorded simultaneously with P_io to clarify how supraglottal constrictions affect P_io. Effects of consonant type on phonation thresholds could be explained mainly in terms of the magnitude and timing of vocal-fold abduction. Phonation offset occurred at lower values of P_io before fricative-initial sequences than stop-initial sequences, and onset occurred at higher levels of P_io following the unaspirated stops of clusters compared to fricatives, affricates, and aspirated stops. The vowel effects were somewhat surprising: High vowels had an inhibitory effect at voicing offset (phonation ceasing at lower values of P_io) in short-duration consonant sequences, but a facilitating effect on phonation onset that was consistent across consonantal contexts. The vowel influences appear to reflect a combination of vocal-fold characteristics and vocal-tract impedance.
129(2011); http://dx.doi.org/10.1121/1.3569714
Finding the control parameters of an articulatory model that result in given acoustics is an important problem in speech research. However, one should also be able to derive the same parameters from measured articulatory data. This paper presents a method to estimate the control parameters of the model by Maeda from electromagnetic articulography (EMA) data, which allows the derivation of full sagittal vocal tract slices from sparse flesh-point information. First, the articulatory grid system involved in the model’s definition is adapted to the speaker involved in the experiment, and EMA data are registered to it automatically. Then, articulatory variables that correspond to measurements defined by Maeda on the grid are extracted. An initial solution for the articulatory control parameters is found by a least-squares method, under constraints ensuring vocal tract shape naturalness. Dynamic smoothness of the parameter trajectories is then imposed by a variational regularization method. Generated vocal tract slices for vowels are compared with slices appearing in magnetic resonance images of the same speaker or found in the literature. Formants synthesized on the basis of these generated slices are adequately close to those tracked in real speech recorded concurrently with EMA.
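The smoothing step mentioned in the abstract can be illustrated with a generic sketch. The abstract does not specify the regularization functional, so the following assumes a simple Tikhonov-style penalty on first differences of a single parameter trajectory; the function name `smooth_trajectory` and the weight `lam` are hypothetical and are not taken from the paper. Minimizing the cost leads to a tridiagonal linear system, solved here with the Thomas algorithm.

```python
def smooth_trajectory(y, lam):
    """Return x minimising sum((x[t]-y[t])**2) + lam * sum((x[t+1]-x[t])**2).

    y   : list of per-frame parameter estimates (assumed input, e.g. from
          a frame-by-frame least-squares fit)
    lam : non-negative smoothness weight (hypothetical tuning parameter)
    """
    n = len(y)
    if n < 2 or lam == 0:
        return [float(v) for v in y]

    # Normal equations: (I + lam * D^T D) x = y, with D the first-difference
    # operator. The matrix is tridiagonal with off-diagonal entries -lam.
    diag = [1.0 + 2.0 * lam] * n
    diag[0] = diag[-1] = 1.0 + lam
    off = -lam

    # Thomas algorithm: forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = y[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (y[i] - off * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    noisy = [0.0, 1.0, 0.0, 1.0, 0.0]  # made-up jittery trajectory
    print(smooth_trajectory(noisy, 1.0))
```

A larger `lam` trades fidelity to the per-frame estimates for smoother trajectories; the constrained least-squares initialization described in the abstract is not reproduced here.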
129(2011); http://dx.doi.org/10.1121/1.3559709
Patterns of durational variation were examined by applying 15 previously published rhythm measures to a large corpus of speech from five languages. In order to achieve consistent segmentation across all languages, an automatic speech recognition system was developed to divide the waveforms into consonantal and vocalic regions. The resulting duration measurements rest strictly on acoustic criteria. Machine classification showed that rhythm measures could separate languages at rates above chance. Within-language variability in rhythm measures, however, was large and comparable to that between languages. Therefore, different languages could not be identified reliably from single paragraphs. In experiments separating pairs of languages, a rhythm measure that was relatively successful at separating one pair often performed very poorly on another pair: there was no broadly successful rhythm measure. Separation of all five languages at once required a combination of three rhythm measures. Many triplets were about equally effective, but the confusion patterns between languages varied with the choice of rhythm measures.
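To make the notion of a rhythm measure concrete, the sketch below computes three widely cited ones from a sequence of consonantal ("C") and vocalic ("V") interval durations: %V, ΔC, and the normalised pairwise variability index (nPVI). The abstract does not list which 15 measures were used, so treating these three as examples is an assumption, as are the function names and the `(label, duration)` input format. ΔC is computed here as a population standard deviation, which is one possible convention.

```python
from statistics import pstdev


def percent_v(intervals):
    """Percentage of total duration that is vocalic (%V)."""
    total = sum(d for _, d in intervals)
    vocalic = sum(d for label, d in intervals if label == "V")
    return 100.0 * vocalic / total


def delta_c(intervals):
    """Standard deviation of consonantal interval durations (Delta-C)."""
    return pstdev([d for label, d in intervals if label == "C"])


def npvi(durations):
    """Normalised pairwise variability index over successive durations."""
    pairs = list(zip(durations, durations[1:]))
    return 100.0 * sum(
        abs(a - b) / ((a + b) / 2.0) for a, b in pairs
    ) / len(pairs)


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    utterance = [("C", 0.08), ("V", 0.12), ("C", 0.15), ("V", 0.06)]
    vowels = [d for label, d in utterance if label == "V"]
    print(percent_v(utterance), delta_c(utterance), npvi(vowels))
```

Each function returns a single number per utterance, which is the kind of feature a classifier could combine when separating languages.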