Volume 106, Issue 1, July 1999
Index of content:
- SPEECH PERCEPTION 
Analysis and perception of spectral characteristics of amplitude and period fluctuations in normal sustained vowels
106(1999); http://dx.doi.org/10.1121/1.427065
Two kinds of fluctuations are always observed in the steady parts of normal sustained vowels. One is amplitude fluctuation, defined as the cyclic changes of maximum peak amplitudes. The other is period fluctuation, defined as the cyclic changes of pitch periods. The primary purpose of this paper is to present quantitative descriptions of amplitude and period sequences obtained from normal sustained vowels. These fluctuation sequences consisted of maximum peak amplitudes or pitch periods extracted successively from 512 consecutive pitch periods in the steady part. Results of the frequency analysis indicated that their frequency characteristics seemed to be subject to the spectral power law. In order to investigate the possibility that the frequency characteristics of the fluctuation sequences influence the voice quality of sustained vowels, psychoacoustic experiments were conducted. Amplitude and period sequences evaluated in the experiments were spectral (white noise), and sequences, respectively. The experimental results indicated that the subjective voice quality of synthesized sustained vowels could reflect the differences in the frequency characteristics of the fluctuation sequences.
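The abstract does not say how the frequency characteristics were estimated. As a rough illustration only, the sketch below assumes that "power law" means a spectrum of the form 1/f^β, and estimates the exponent by fitting a straight line to the log-log periodogram of a 512-point sequence. The function names (`periodogram`, `spectral_slope`, `make_demo_sequences`) and the synthetic test sequences are hypothetical and are not taken from the paper.

```python
import cmath
import math
import random


def periodogram(seq):
    """Return (freqs, power) for DFT bins 1 .. n/2 - 1 of the mean-removed sequence."""
    n = len(seq)
    mean = sum(seq) / n
    centered = [x - mean for x in seq]
    freqs, power = [], []
    for k in range(1, n // 2):
        # Direct DFT; O(n^2) is acceptable for a 512-point sequence.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        freqs.append(k / n)  # frequency in cycles per pitch period
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def spectral_slope(seq):
    """Least-squares slope of log10(power) versus log10(frequency).

    For a 1/f^beta spectrum the slope is approximately -beta.
    """
    freqs, power = periodogram(seq)
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power) if p > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    return sxy / sxx


def make_demo_sequences(n=512, seed=0):
    """Synthetic stand-ins (not real voice data): white noise and its running sum."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    walk, total = [], 0.0
    for step in white:
        total += step
        walk.append(total)
    return white, walk


if __name__ == "__main__":
    white, walk = make_demo_sequences()
    print("white-noise slope:", round(spectral_slope(white), 2))
    print("random-walk slope:", round(spectral_slope(walk), 2))
```

With these synthetic inputs, the white-noise sequence should give a slope close to zero, and the running sum (whose spectrum is only approximately 1/f^2) a clearly negative one. For real data, the input would be the sequence of peak amplitudes or pitch periods extracted from the steady part of the vowel.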
106(1999); http://dx.doi.org/10.1121/1.427066
An important speech cue is that of voice onset time (VOT), a cue for the perception of voicing and aspiration in word-initial stops. Preaspiration, an [h]-like sound between a vowel and the following stop, can be cued by voice offset time (VOffT), a cue which in most respects mirrors VOT. In Icelandic, VOffT is much more sensitive to the duration of the preceding vowel than VOT is to the duration of the following vowel. This has been explained by noting that preaspiration can only follow a phonemically short vowel. Lengthening of the vowel, either by changing its duration or by moving the spectrum towards that appropriate for a long vowel, will thus demand a longer VOffT to cue preaspiration. An experiment is reported showing that this greater effect that vowel quantity has on the perception of VOffT than on the perception of VOT cannot be explained by the effect of frequency at vowel offset.
106(1999); http://dx.doi.org/10.1121/1.427067
Most investigators agree that the acoustic information for American English vowels includes dynamic (time-varying) parameters as well as static “target” information contained in a single cross section of the syllable. Using the silent-center (SC) paradigm, the present experiment examined the case in which the initial and final portions of stop consonant–vowel–stop consonant (CVC) syllables containing the same vowel but different consonants were recombined into mixed-consonant SC syllables and presented to listeners for vowel identification. Ten vowels were spoken in six different syllables, /bVb, bVd, bVt, dVb, dVd, dVt/, embedded in a carrier sentence. Initial and final transitional portions of these syllables were cross-matched in: (1) silent-center syllables with original syllable durations (silences) preserved (mixed-consonant SC condition) and (2) mixed-consonant SC syllables with syllable duration equated across the ten vowels (fixed duration mixed-consonant SC condition). Vowel-identification accuracy in these two mixed-consonant SC conditions was compared with performance on the original SC and fixed duration SC stimuli, and in initial and final control conditions in which initial and final transitional portions were each presented alone. Vowels were identified highly accurately in both mixed-consonant SC and original syllable SC conditions (only 7%–8% overall errors). Neutralizing duration information led to small, but significant, increases in identification errors in both mixed-consonant and original fixed-duration SC conditions (14%–15% errors), but performance was still much more accurate than for initial and final control conditions (35% and 52% errors, respectively). Acoustical analysis confirmed that direction and extent of formant change from initial to final portions of mixed-consonant stimuli differed from that of original syllables, arguing against a explanation of the perceptual results.
Results do support the hypothesis that temporal trajectories specifying “style of movement” provide information for the differentiation of American English tense and lax vowels, and that this information is invariant over the place of articulation and voicing of the surrounding stop consonants.
106(1999); http://dx.doi.org/10.1121/1.427068
This positron emission tomography study used a correlational design to investigate neural activity during speech perception in six normal subjects and two aphasic patients. The normal subjects listened either to speech or to signal-correlated noise equivalents; the latter were nonspeech stimuli, similar to speech in complexity but not perceived as speechlike. Regions common to the auditory processing of both types of stimuli were dissociated from those specific to spoken words. Increasing rates of presentation of both speech and nonspeech correlated with cerebral activity in bilateral transverse gyri and adjacent superior temporal cortex. Correlations specific to speech stimuli were located more anteriorly in both superior temporal sulci. The only asymmetry in normal subjects was a left lateralized response to speech in the posterior superior temporal sulcus, corresponding closely to structural asymmetry on the subjects’ magnetic resonance images. Two patients, who had left temporal infarction but performed well on single word comprehension tasks, were also scanned while listening to speech. These cases showed right superior temporal activity correlating with increasing rates of hearing speech, but no significant left temporal activation. These findings together suggest that the dorsolateral temporal cortex of both hemispheres can be involved in prelexical processing of speech.