^{1,a)}and Richard S. McGowan

### Abstract

In a re-analysis of x rays of speakers producing Swedish vowels, midsagittal pharyngeal dimensions were predicted from anterior tongue positions using procedures based on estimated tongue pellet positions. Principal component analysis was used to reduce the number of pellet degrees of freedom from eight to three prior to applying linear regression from these three independent variables to dependent vocal tract midsagittal cross distances. Except for the regions around the laryngopharynx and uvula, the pharynx dimensions were predictable from linear regressions and were significant at the level. Numerical experiments show that it is crucial to reduce the number of independent variables in tests of statistical significance.
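The two-stage procedure summarized above (principal component analysis to reduce eight pellet coordinates to three components, followed by linear regression onto a cross distance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `pcr_fit`, the synthetic data, and the choice of nine observations are assumptions made only for the example.

```python
import numpy as np

def pcr_fit(X, y, n_components=3):
    """Principal components regression (hypothetical helper):
    PCA on the predictors, then least squares on the leading scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each coordinate
    # SVD of the centered data; rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T                 # reduced independent variables
    explained = np.cumsum(s**2) / np.sum(s**2) # cumulative variance fraction
    # Ordinary least squares with an intercept term
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_err = float(resid @ resid)              # error sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())  # total sum of squares
    return {
        "components": components,
        "scores": scores,
        "explained": explained,
        "coef": coef,
        "r2": 1.0 - ss_err / ss_tot,
    }

# Synthetic stand-in data (NOT the x-ray measurements): 9 observations,
# 8 pellet coordinates, one cross distance per observation.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(9, 8))
y_demo = X_demo[:, 0] - 0.5 * X_demo[:, 3] + 0.1 * rng.normal(size=9)
fit = pcr_fit(X_demo, y_demo, n_components=3)
print("cumulative variance, first 3 components:", fit["explained"][:3])
print("R^2 of regression on 3 components:", fit["r2"])
```

Regressing on three scores rather than all eight coordinates leaves more residual degrees of freedom, which is the point the abstract makes about tests of statistical significance.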

This work was supported by Grant No. NIDCD-001247 to CReSS LLC. The authors would also like to acknowledge the personal help of Dr. Sundberg and Dr. Fant, who supplied the x-ray tracings analyzed in this article.

I. INTRODUCTION

II. PROCEDURE

A. Image acquisition, processing, and coordinate system

B. Reference points and measurement grids

C. Gridlines

1. Lower pharynx

2. Upper pharynx and posterior oral cavity

3. Anterior oral cavity

D. Estimated fleshpoint locations and principal components regression

E. Numerical experiments

III. RESULTS

A. PCA components

B. Principal component regressions predicting midsagittal vocal tract cross distances

C. Numerical experiments

IV. DISCUSSION

### Key Topics

- Phonetic segments (34.0)
- Vocal tract (14.0)
- Medical imaging (10.0)
- Medical magnetic resonance imaging (8.0)
- Medical X-ray imaging (6.0)

## Figures

Reference points for speaker BE. P1—mean position of the alveolar ridge reference point on all nine digitized vowel tracings; P2—mean position of the highest point of the palate; P3—mean position of the superior point on the rear pharyngeal wall; P4—mean position of the inferior point on the rear pharyngeal wall; P0—center of circular arc through P1, P2, and P3.
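The center P0 of the arc through P1, P2, and P3 is the circumcenter of those three points. A small sketch of that computation is given here; the function name `circle_center` and the sample coordinates are illustrative assumptions, not values from the study.

```python
def circle_center(p1, p2, p3):
    """Center of the circle through three non-collinear 2-D points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; no unique circle")
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy

# Made-up coordinates standing in for P1, P2, P3
p0 = circle_center((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
print(p0)
```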

Gridlines for speaker BE constructed based on the reference points P1 through P4.

PCA component shapes for speaker BE. Pellet displacements from mean position associated with variation in each component: (a) first principal component; (b) second principal component; (c) third principal component.

Regression results for speaker BE. For each gridline, black bars show the model sum of squares for the regression; gray bars show the error sum of squares; the total sum of squares is shown by the total height of the black and gray bars; solid line shows the regression ; dashed line shows the level at which is significant at the level according to the F test.

Regression results for speaker JS. For each gridline, black bars show the model sum of squares for the regression; gray bars show the error sum of squares; the total sum of squares is shown by the total height of the black and gray bars; solid line shows the regression ; dashed line shows the level at which is significant at the level according to the F test.

Regression results for speaker RL. For each gridline, black bars show the model sum of squares for the regression; gray bars show the error sum of squares; the total sum of squares is shown by the total height of the black and gray bars; solid line shows the regression ; dashed line shows the level at which is significant at the level according to the F test.

Regression results for speaker F. For each gridline, black bars show the model sum of squares for the regression; gray bars show the error sum of squares; the total sum of squares is shown by the total height of the black and gray bars; solid line shows the regression ; dashed line shows the level at which is significant at the level according to the F test.

The distribution of from multiple regressions with the final maximum number of random, uniformly distributed independent variables and midsagittal cross distances at each gridline as dependent variables. The light gray region of each distribution shows the fraction of regressions below the 95th percentile of each gridline’s distribution, the dark gray shows the fraction between the 95th and 99th percentiles, and the white shows the regressions above the 99th percentile. The solid horizontal line denotes the minimum value that would need to be attained for according to an F test, and the dashed line the minimum value that would need to be attained for according to an F test. Subject BE is shown in (a); subject JS in (b); subject RL in (c); and subject F in (d).
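The kind of numerical experiment described in this caption can be sketched as follows: a fixed dependent variable is regressed repeatedly on random, uniformly distributed independent variables, and the resulting goodness-of-fit values are summarized by percentiles. The sketch assumes the fit statistic is the ordinary coefficient of determination; the sample size, trial count, function names, and data are illustrative assumptions, not the study's parameters. The helper `r2_threshold_from_f` uses the standard identity relating an F statistic with (k, n - k - 1) degrees of freedom to the coefficient of determination, and takes the critical F value as an input rather than computing it.

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination for least squares with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def random_predictor_r2(y, k, n_trials=2000, seed=0):
    """Fit statistics obtained when y is regressed on k random uniform predictors."""
    rng = np.random.default_rng(seed)
    n = len(y)
    return np.array([
        r_squared(rng.uniform(-1.0, 1.0, size=(n, k)), y)
        for _ in range(n_trials)
    ])

def r2_threshold_from_f(f_crit, n, k):
    """Smallest coefficient of determination whose F statistic reaches f_crit,
    from F = (R2 / k) / ((1 - R2) / (n - k - 1))."""
    return k * f_crit / (k * f_crit + (n - k - 1))

# Synthetic dependent variable (NOT measured cross distances): 9 observations
y_demo = np.random.default_rng(42).normal(size=9)
for k in (2, 6):
    dist = random_predictor_r2(y_demo, k)
    p95, p99 = np.percentile(dist, [95, 99])
    print(f"k={k}: mean={dist.mean():.2f}, 95th={p95:.2f}, 99th={p99:.2f}")
```

With few observations, purely random predictors already give large fit values as k grows, which illustrates why reducing the number of independent variables matters for significance testing.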

## Tables

Vowels analyzed in this study. “√” indicates that the vowel was produced by the given speaker; “n/a” indicates that it was not; a phonetic transcription indicates that a variant with different phonemic length (e.g., [ø] rather than [ø:]) was produced.

Standard deviations (mm) in the x and y coordinates of reference points across images. P1 is on the alveolar ridge; P2 is the most superior point of the hard palate; P3 is on the dorsal wall of the pharynx at the level of the anterior tubercle of the atlas; P4 is on the dorsal wall of the pharynx at the level of the bottom of the vallecular sinus. Coordinates are relative to the tip of the upper incisor.

Numerical experiment parameters. Although a potential maximum of eight independent variables (an x- and a y-coordinate for each of four estimated fleshpoints) could be simulated, the small number of observations (vowels) and/or numerical degeneracy limit the actual maximum number of independent variables that can be used in each numerical experiment.

Cumulative variance accounted for by PCA components of eight pseudopellet variables

Mean for pharyngeal region gridlines. The pharyngeal regions include gridlines 11 through 30 for speaker BE; 11–24 for speaker JS; 11–28 for RL; and 12–24 for F. (See Figs. 4–7 for comparison.)
