Adaptive auditory feedback control of the production of formant trajectories in the Mandarin triphthong /iau/ and its pattern of generalization
Spectrogram and parsing of the training utterance. A spectrogram of the utterance / ʈʂɣ/ spoken by a male speaker is overlaid with F1 and F2 tracks estimated online by the experimental apparatus. The two vertical dashed lines indicate the beginning and end of the triphthong /iau/, automatically delineated online using heuristics described in Sec. II D.
Experimental design. The experiment was divided into seven phases. The first three phases, Pre, Prac-1 and Prac-2, were for familiarization purposes. The next four phases, Start, Ramp, Stay and End, comprised the main experimental stages. The Start phase served as a no-perturbation baseline, at the end of which a subject-specific perturbation field was calculated (see Sec. II F for details). Perturbation of auditory feedback was present only in the Ramp and Stay phases. Each phase consisted of a number of blocks. The numbers of blocks are shown in the brackets. Each block was divided into two parts, the first of which contained ten training phrases, the second of which contained ten test utterances.
Design of the perturbation fields. An example from a single subject is shown. (A) Formant trajectories from 120 repetitions of /iau/ were extracted and gathered from the Start phase and were used as the basis for calculating the average trajectory and the field boundaries. (B) Inflate and Deflate perturbation fields. The perturbation vectors were parallel to the F1 axis. The magnitudes of the vectors followed a quadratic function of F2, and were zero at the boundaries and greatest near the center of the field (see text for details).
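The quadratic profile described in panel (B) can be illustrated with a short sketch. It assumes a parabola in F2 that is zero at the two field boundaries and peaks at their midpoint; the caption only says the magnitude is greatest "near the center". The function names, the boundary and peak values, and the sign convention (Inflate raising F1, Deflate lowering it) are hypothetical choices for illustration, not details taken from the caption.

```python
def perturbation_f1_shift(f2, f2_lower, f2_upper, peak_shift_hz, direction="inflate"):
    """Return the F1 shift (Hz) applied to a point whose second formant is f2.

    Assumed form: a parabola in F2 that is zero at both field boundaries
    and reaches peak_shift_hz at the midpoint. All names and values here
    are hypothetical.
    """
    if f2_upper <= f2_lower:
        raise ValueError("f2_upper must exceed f2_lower")
    if f2 <= f2_lower or f2 >= f2_upper:
        return 0.0  # on or outside the field boundaries: no perturbation
    # Map F2 to u in (0, 1); 4*u*(1-u) is 0 at the edges and 1 at u = 0.5.
    u = (f2 - f2_lower) / (f2_upper - f2_lower)
    magnitude = peak_shift_hz * 4.0 * u * (1.0 - u)
    # Assumed sign convention: Inflate raises F1, Deflate lowers it.
    sign = 1.0 if direction == "inflate" else -1.0
    return sign * magnitude


def perturb(f1, f2, **field):
    """Shift F1 only; F2 is unchanged because the vectors are parallel to the F1 axis."""
    return f1 + perturbation_f1_shift(f2, **field), f2
```

With hypothetical boundaries of 1000 and 2000 Hz and a 100 Hz peak, a point at F2 = 1500 Hz receives the full shift, a point at 1250 Hz receives three quarters of it, and points on or beyond either boundary are left untouched.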
Adaptive changes in the formant trajectories of the training vowel /iau/ in representative subjects. The F1-F2 trajectories produced by subject IH of the Inflate group are plotted (A) in the formant plane and (B) as functions of time. Different line patterns (color version online) indicate different phases of the experiment (see legend). The dashed curves show the perturbed auditory feedback. The shading surrounding the curves shows ±3 SEM. The profiles of F1 and F2 in panel B are normalized in time. Panels C and D show analogous results from subject DF of the Deflate group.
Group-average formant trajectories of the training vowel /iau/. F1 and F2 were normalized with respect to the perturbation-field boundaries. (A) The mean F1-F2 trajectories of the Inflate group (color online). (B) The time-normalized trajectories of F1 (bottom) and F2 (top) of the Inflate group. Panels C and D show analogous results for the Deflate group. The shading shows ±1 SEM of the mean across subjects. The SEM of the End-phase trajectory is omitted for visual clarity.
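Time normalization lets trajectories of different durations be averaged point by point. The caption does not say how it was done; the sketch below assumes the simplest approach, resampling each formant track to a fixed number of equally spaced points by linear interpolation. The function name and the default of 100 points are hypothetical.

```python
def time_normalize(track, n_points=100):
    """Resample a formant track (list of Hz values) to n_points samples.

    Assumption: linear interpolation on a uniform time axis; the study's
    actual procedure may differ.
    """
    if len(track) < 2 or n_points < 2:
        raise ValueError("need at least two input samples and two output points")
    last = len(track) - 1
    out = []
    for k in range(n_points):
        # Fractional position of output sample k on the input index axis.
        pos = k * last / (n_points - 1)
        # Clamp the left index so that i + 1 is always a valid index.
        i = min(int(pos), last - 1)
        frac = pos - i
        out.append(track[i] + frac * (track[i + 1] - track[i]))
    return out
```

After resampling, the k-th sample of every repetition refers to the same relative position within the vowel, so a group mean and SEM can be computed sample by sample.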
Quantification of adaptive changes in several trajectory parameters for the training vowel /iau/. (A) The definitions of the parameters of the F1 and F2 trajectories of the triphthong /iau/ are shown schematically (see text for details). (B) The change of F1Max (maximum F1 during /iau/) from the Start-phase mean in the Stay and End phases. The End phase is subdivided into “End-early” and “End-late” to show the after-effect of the Stay-phase adaptation and its decay. The End-early and End-late phases included the first two and the last eight blocks of the End phase, respectively. The error bars show SEM across all 18 subjects in each group. The brackets with dots indicate a significant change of F1Max from the Start-phase baseline. The gray-shaded regions with asterisks indicate significant differences between the Inflate and Deflate groups according to two-sample t-tests. (C)–(F) The changes relative to the Start-phase mean in F1Begin, F1End, F2Mid and A-Ratio are shown in the same format as Panel B.
Amount of adaptation for the training vowel /iau/ in individual subjects. Fractions of compensation in F1Max with respect to the auditory perturbations are shown. The upper and lower panels show the subjects in the Inflate and Deflate groups, respectively. Positive values in both panels indicate compensatory changes, i.e., changes in productions in the direction opposite to the auditory perturbations. A value of 1.0 corresponds to complete compensation. In each group, the subjects are shown in descending order. The error bars show SEM across the trials. The asterisks show significant Stay-phase changes from the Start phase (two-sample t-test). Most of the subjects who showed significant compensatory responses in the Stay phase demonstrated a significant after-effect of these responses in the early End phase, as indicated by the gray bars. In each panel, the vertical dashed gray lines divide the subjects into three subgroups: a group that showed significant adaptation in F1Max, a group that showed no change, and a group that followed the auditory perturbation in their F1Max.
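The fraction of compensation can be read as the production change, sign-flipped, divided by the size of the auditory perturbation. The sketch below is one plausible formulation under that reading; the function name, the use of per-trial F1Max lists, and the way the perturbation at F1Max is supplied as a single signed number are assumptions, not details given in the caption.

```python
from statistics import mean


def compensation_fraction(f1max_start, f1max_stay, perturbation_hz):
    """Fraction of the F1Max perturbation countered by the production change.

    f1max_start, f1max_stay: per-trial F1Max values (Hz) from the two phases.
    perturbation_hz: signed shift applied to the auditory feedback at F1Max
    (assumed positive for Inflate, negative for Deflate).
    """
    if perturbation_hz == 0:
        raise ValueError("perturbation must be non-zero")
    production_change = mean(f1max_stay) - mean(f1max_start)
    # A change opposite in sign to the perturbation gives a positive fraction;
    # 1.0 means the production change fully cancels the perturbation.
    return -production_change / perturbation_hz
```

For example, with hypothetical values, a speaker whose mean F1Max drops by 30 Hz under a +60 Hz perturbation scores 0.5, and a speaker whose F1Max moves in the same direction as the perturbation scores below zero, matching the "following" subgroup described in the caption.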
The relations of the test vowels to the training vowel in formant space. Data in this plot are from the baseline (i.e., Start-phase) productions of all 21 subjects (13 Inflate, 8 Deflate) who showed significant compensatory adjustment to the auditory perturbation in the training utterances (see Fig. 7). The average Start-phase trajectories of the vowels in the test utterances are plotted in the same formant plane to illustrate their relationship to the trajectory of the training vowel /iau/.
Generalization of the auditory-motor adaptation to the test utterances. Data are from the 13 subjects in the Inflate group and the eight subjects in the Deflate group who showed significant Stay-phase adaptation in the training utterance. Panel A shows the average time- and frequency-normalized F1 trajectories of the training vowel /iau/ from the Inflate (left) and Deflate (right) groups in the Start and Stay phases (color online). The right-hand plot in Panel A shows the average F1Max changes from baseline in the Stay phase and the early and late parts of the End phase. The format of this plot is the same as Fig. 6(B), in which brackets with filled dots show significant within-group, between-phase changes, and gray shading with asterisks shows significant between-group differences. Panels B–H have the same layout as A; they show the data from the seven test vowels: /iau/, //, /uai/, /a/, /ia/, /au/, and /iou/, respectively. The dashed vertical lines in panel E show the time intervals from which F1Max was calculated.
Quantification of transfer of the adaptation to the test vowels. Each bar shows the difference between the Inflate and Deflate groups in the change in F1Max from the Start-phase baseline to the Stay-phase value. From left to right are the results for the training vowel /iau/ (leftmost column) and the seven test vowels.
List of stimulus utterances and their IPA transcriptions. The left half of the list shows the training utterances, during which auditory feedback of speech was played through the earphones. The right half shows the test utterances, which were masked by noise (see text for details).