Volume 103, Issue 3, March 1998
Index of content:
- SPEECH PRODUCTION 
103(1998); http://dx.doi.org/10.1121/1.421296
A model of the midsagittal plane motion of the tongue, jaw, hyoid bone, and larynx is presented, based on the λ version of the equilibrium point hypothesis. The model includes muscle properties and realistic geometrical arrangement of muscles, modeled neural inputs and reflexes, and dynamics of soft tissue and bony structures. The focus is on the organization of control signals underlying vocal tract motions and on the dynamic behavior of articulators. A number of muscle synergies or “basic motions” of the system are identified. In particular, it is shown that systematic sources of variation in an x-ray database of midsagittal vocal tract motions can be accounted for, at the muscle level, with six independent commands, each corresponding to a direction of articulator motion. There are two commands for the jaw (corresponding to sagittal plane jaw rotation and jaw protrusion), one command controlling larynx height, and three commands for the tongue (corresponding to forward and backward motion of the tongue body, arching and flattening of the tongue dorsum, and motion of the tongue tip). It is suggested that all movements of the system can be approximated as linear combinations of such basic motions. In other words, individual movements and sequences of movements can be accounted for by a simple additive control model. The dynamics of individual commands are also assessed. It is shown that the dynamic effects are not negligible in speechlike movements because of the different dynamic behaviors of soft and bony structures.
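The additive control idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: the function name `combine_basic_motions`, the data layout, and the displacement values in the usage note are hypothetical placeholders. Only the idea of summing weighted basic motions, and the command names used as dictionary keys, come from the abstract.

```python
from typing import Dict, List


def combine_basic_motions(
    basis: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted sum of basic-motion displacement vectors.

    basis   maps a command name to a displacement vector (hypothetical layout).
    weights maps a command name to its scalar command amplitude.
    """
    if not basis:
        return []
    length = len(next(iter(basis.values())))
    total = [0.0] * length
    for name, weight in weights.items():
        vector = basis[name]
        if len(vector) != length:
            raise ValueError(f"basis vector for {name!r} has the wrong length")
        # Additive model: each command contributes independently.
        for i, value in enumerate(vector):
            total[i] += weight * value
    return total
```

With a toy basis such as `{"jaw_rotation": [1.0, 0.0], "tongue_tip": [0.0, 2.0]}` (placeholder numbers, not measured data) and weights `{"jaw_rotation": 0.5, "tongue_tip": 1.5}`, the function returns `[0.5, 3.0]`. A full basis would hold one vector for each of the six commands named in the abstract.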
103(1998); http://dx.doi.org/10.1121/1.421305
The glottal to noise excitation ratio (GNE) is an acoustic measure designed to assess the amount of noise in a pulse train generated by the oscillation of the vocal folds. So far its properties have only been studied for synthesized signals, where it was found to be independent of variations of fundamental frequency (jitter) and amplitude (shimmer). On the other hand, other features designed for the same purpose, such as NNE (normalized noise energy) or CHNR (cepstrum-based harmonics-to-noise ratio), did not show this independence. This advantage of the GNE over NNE and CHNR, as well as its general applicability in voice quality assessment, is now tested for real speech using a large group of pathologic voices. A set of four acoustic features is extracted from a total of 22 mostly well-known acoustic voice quality measures by correlation analysis, mutual information analysis, and principal components analysis. Three of these measures are chosen to assess primarily different aspects of signal aperiodicity, while the fourth one indicates the noise content of the signal. All analysis methods lead to the same feature set, which consists of a measure of period correlation, jitter, shimmer, and GNE. The two-dimensional projection of this set, named the “hoarseness diagram,” allows a graphical illustration of voice quality that can be easily interpreted.
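For readers unfamiliar with jitter and shimmer, the sketch below computes one common "local" variant of each: the mean absolute difference between consecutive cycles, divided by the mean value. This is an assumption for illustration. The abstract does not say which jitter or shimmer definition the authors used, and the function name `relative_perturbation` is hypothetical. GNE and the period correlation measure are not sketched here.

```python
from typing import Sequence


def relative_perturbation(values: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to glottal cycle durations this gives a local jitter estimate;
    applied to per-cycle peak amplitudes it gives a local shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean value must be nonzero")
    # Absolute differences between each cycle and the next one.
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / mean_value
```

A perfectly periodic sequence such as `[10.0, 10.0, 10.0]` gives 0, while the alternating sequence `[9.0, 11.0, 9.0, 11.0]` gives 0.2, that is, a 20% cycle-to-cycle perturbation.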