Volume 110, Issue 6, December 2001
Index of content:
- SPEECH PRODUCTION 
Spatio-temporal analysis of irregular vocal fold oscillations: Biphonation due to desynchronization of spatial modes
110(2001); http://dx.doi.org/10.1121/1.1406498
This report is on direct observation and modal analysis of irregular spatio-temporal vibration patterns of vocal fold pathologies in vivo. The observed oscillation patterns are described quantitatively with multiline kymograms, spectral analysis, and spatio-temporal plots. The complex spatio-temporal vibration patterns are decomposed by empirical orthogonal functions into independent vibratory modes. It is shown quantitatively that biphonation can be induced either by left–right asymmetry or by desynchronized anterior–posterior vibratory modes, and the term “AP (anterior–posterior) biphonation” is introduced. The presented phonation examples show that for normal phonation the first two modes sufficiently explain the glottal dynamics. The spatio-temporal oscillation pattern associated with biphonation due to left–right asymmetry can be explained by the first three modes. Higher-order modes are required to describe the pattern for biphonation induced by anterior–posterior vibrations. Spatial irregularity is quantified by an entropy measure, which is significantly higher for irregular phonation than for normal phonation. Two asymmetry measures are introduced: the left–right asymmetry and the anterior–posterior asymmetry, as the ratios of the fundamental frequencies of left and right vocal fold and of anterior–posterior modes, respectively. These quantities clearly differentiate between left–right biphonation and anterior–posterior biphonation. This paper proposes methods to quantitatively analyze irregular vocal fold contour patterns in vivo and complements previous findings of desynchronization of vibration modes in computer models and in in vitro experiments.
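The decomposition into empirical orthogonal functions and the entropy measure described above can be illustrated with a minimal sketch. It assumes the vocal fold contour is available as a time-by-position array and that the entropy is the Shannon entropy of the normalized mode weights; the abstract does not give the exact definition, so that choice, the function names `eof_modes` and `mode_entropy`, and the synthetic test signals are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def eof_modes(field):
    """Decompose a (n_time, n_space) array into empirical orthogonal functions.

    Returns the spatial modes (rows), their temporal coefficients (columns),
    and the fraction of variance carried by each mode.
    """
    anomalies = field - field.mean(axis=0)          # remove the temporal mean at each position
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    weights = s**2 / np.sum(s**2)                   # normalized variance per mode
    return vt, u * s, weights

def mode_entropy(weights):
    """Shannon entropy of the mode weights (assumed definition, see lead-in)."""
    w = weights[weights > 0]
    return float(-np.sum(w * np.log(w)))

# Hypothetical synthetic data: one standing mode versus two desynchronized modes.
t = np.linspace(0, 1, 400, endpoint=False)          # time samples
x = np.linspace(0, 1, 50)                           # positions along the fold
regular = np.outer(np.sin(2 * np.pi * 100 * t), np.sin(np.pi * x))
irregular = regular + np.outer(np.sin(2 * np.pi * 137 * t), np.sin(2 * np.pi * x))

_, _, w_regular = eof_modes(regular)
_, _, w_irregular = eof_modes(irregular)
entropy_regular = mode_entropy(w_regular)
entropy_irregular = mode_entropy(w_irregular)
```

In this toy case the single-mode signal concentrates almost all variance in one mode, so its entropy is lower than that of the two-mode signal, which mirrors the qualitative claim that irregular phonation yields higher entropy.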
110(2001); http://dx.doi.org/10.1121/1.1397321
A new method for analysis of digital high-speed recordings of vocal-fold vibrations is presented. The method is based on the extraction of light-intensity time sequences from consecutive images, which in turn are Fourier transformed. The spectra thus acquired can be displayed in four different modes, each having its own benefits. When applied to the larynx, the method visualizes oscillations in the entire laryngeal area, not merely the glottal region. The method was applied to two laryngoscopic high-speed image sequences. Among these examples, covibrations in the ventricular folds and in the mucosa covering the arytenoid cartilages were found. In some cases the covibrations occurred at other frequencies than those of the glottis.
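The core step described above, extracting a light-intensity time sequence at each image position and Fourier transforming it, can be sketched as follows. This is an illustrative sketch only: the array layout, the frame rate, the function names `pixel_spectra` and `dominant_frequency_map`, and the synthetic frames are assumptions, and the four display modes mentioned in the abstract are not reproduced.

```python
import numpy as np

def pixel_spectra(frames, frame_rate):
    """Magnitude spectrum of the intensity time series at every pixel.

    frames: array of shape (n_frames, height, width), grayscale intensities.
    frame_rate: frames per second of the high-speed recording.
    """
    frames = np.asarray(frames, dtype=float)
    detrended = frames - frames.mean(axis=0)             # drop the static brightness per pixel
    spectra = np.abs(np.fft.rfft(detrended, axis=0))     # FFT along the time axis
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / frame_rate)
    return freqs, spectra

def dominant_frequency_map(freqs, spectra):
    """Frequency of the strongest spectral peak at each pixel."""
    return freqs[np.argmax(spectra, axis=0)]

# Hypothetical synthetic sequence: two image regions oscillating at different rates.
frame_rate = 2000.0
t = np.arange(200) / frame_rate
frames = np.zeros((200, 4, 4))
frames[:, :2, :] = 100 + 20 * np.sin(2 * np.pi * 150 * t)[:, None, None]
frames[:, 2:, :] = 100 + 5 * np.sin(2 * np.pi * 300 * t)[:, None, None]

freqs, spectra = pixel_spectra(frames, frame_rate)
fmap = dominant_frequency_map(freqs, spectra)
```

A map like `fmap` is one way to see regions that vibrate at a frequency other than that of the glottis, since each pixel is labeled by its own strongest oscillation.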
110(2001); http://dx.doi.org/10.1121/1.1413751
The effects of ingesting ethanol have been shown to be somewhat variable in humans. To date, there appear to be but few universals. Yet, the question often arises: is it possible to determine if a person is intoxicated by observing them in some manner? A closely related question is: can speech be used for this purpose and, if so, can the degree of intoxication be determined? One of the many issues associated with these questions involves the relationships between a person’s paralinguistic characteristics and the presence and level of inebriation. To this end, young, healthy speakers of both sexes were carefully selected and sorted into roughly equal groups of light, moderate, and heavy drinkers. They were asked to produce four types of utterances during a learning phase, when sober and at four strictly controlled levels of intoxication (three ascending and one descending). The primary motor speech measures employed were speaking fundamental frequency, speech intensity, speaking rate, and nonfluencies. Several statistically significant changes were found for increasing intoxication; the primary ones included rises in task duration and in nonfluencies. Minor gender differences were found but they lacked statistical significance. So did the small differences among the drinking category subgroups and the subject groupings related to levels of perceived intoxication. Finally, although it may be concluded that certain changes in speech suprasegmentals will occur as a function of increasing intoxication, these patterns cannot be viewed as universal since a few subjects (about 20%) exhibited no (or negative) changes.
110(2001); http://dx.doi.org/10.1121/1.1413749
Normal vowels are known to have irregularities in the pitch-to-pitch variation, which is quite important for speech signals to be perceived as natural human sound. Such pitch-to-pitch variation of vowels is studied in the light of nonlinear dynamics. For the analysis, five normal vowels recorded from three male and two female subjects are used, where the vowel signals are shown to have normal levels of the pitch-to-pitch variation. First, by the false nearest-neighbor analysis, nonlinear dynamics of the vowels are shown to be well analyzed by using a relatively low-dimensional reconstructing dimension of Then, we further studied nonlinear dynamics of the vowels by spike-and-wave surrogate analysis. The results imply that there exists nonlinear dynamical correlation between one pitch-waveform pattern and another in the vowel signals. On the basis of the analysis results, applicability of the nonlinear prediction technique to vowel synthesis is discussed.