Volume 119, Issue 6, June 2006
Index of content:
- SPEECH PERCEPTION 
119(2006); http://dx.doi.org/10.1121/1.2195119
The extent to which context influences speech categorization can inform theories of pre-lexical speech perception. Across three conditions, listeners categorized speech targets preceded by speech context syllables. These syllables were presented as the sole context or paired with nonspeech tone contexts previously shown to affect speech categorization. Listeners’ context-dependent categorization across these conditions provides evidence that speech and nonspeech context stimuli jointly influence speech processing. Specifically, when the spectral characteristics of speech and nonspeech context stimuli are mismatched such that they are expected to produce opposing effects on speech categorization, the influence of nonspeech contexts may undermine, or even reverse, the expected effect of adjacent speech context. Likewise, when spectrally matched, the cross-class contexts may collaborate to increase effects of context. Similar effects are observed even when natural speech syllables, matched in source to the speech categorization targets, serve as the speech contexts. Results are well predicted by spectral characteristics of the context stimuli.
119(2006); http://dx.doi.org/10.1121/1.2190162
Three experiments examine the effect of a difference in fundamental frequency (F0) range between two simultaneous voices on the processing of unattended speech. Previous experiments have only found evidence for the processing of nominally unattended speech when it has consisted of isolated words which could have attracted the listener’s attention. A paradigm recently used by Dupoux et al. [J. Exp. Psychol.: Human Percept. Perform. 29(1), 172–184 (2003)] was modified so that participants had to detect a target word belonging to a specific category presented in a rapid list of words in the attended ear. In the unattended ear, concatenated sentences were presented, some containing a repetition prime presented just before a target word. Primes speeded category detection when the two messages were in a different F0 range. This priming effect was unaffected by whether the target was led to the left or the right ear, but disappeared when there was no F0 range difference between the messages. Finally, it was replicated when participants were compelled to focus on the attended message in order to perform a second task. The results demonstrate that repetition priming can be produced by words in unattended continuous speech provided that there is a difference in F0 range between the voices.
119(2006); http://dx.doi.org/10.1121/1.2188369
This study was designed to measure the relative contributions to speech intelligibility of spectral envelope peaks (including, but not limited to, formants) versus the detailed shape of the spectral envelope. The problem was addressed by asking listeners to identify sentences and nonsense syllables that were generated by two structurally identical source-filter synthesizers, one of which constructs the filter function based on the detailed spectral envelope shape while the other constructs the filter function using a purposely coarse estimate that is based entirely on the distribution of peaks in the envelope. Viewed in the broadest terms, the results showed that nearly as much speech information is conveyed by the peaks-only method as by the detail-preserving method. Just as clearly, however, every test showed some measurable advantage for spectral detail, although the differences were not large in absolute terms.
Improving syllable identification by a preprocessing method reducing overlap-masking in reverberant environments
119(2006); http://dx.doi.org/10.1121/1.2198191
Overlap-masking degrades speech intelligibility in reverberation [R. H. Bolt and A. D. MacDonald, J. Acoust. Soc. Am. 21(6), 577–580 (1949)]. To reduce the effect of this degradation, steady-state suppression has been proposed as a preprocessing technique [Arai et al., Proc. Autumn Meet. Acoust. Soc. Jpn., 2001; Acoust. Sci. Tech. 23(8), 229–232 (2002)]. This technique automatically suppresses steady-state portions of speech that have more energy but are less crucial for speech perception. The present paper explores the effect of steady-state suppression on syllable identification preceded by /a/ under various reverberant conditions. In each of two perception experiments, stimuli were presented to 22 subjects with normal hearing. The stimuli consisted of monosyllables in a carrier phrase with and without steady-state suppression and were presented under different reverberant conditions using artificial impulse responses. The results indicate that steady-state suppression statistically improves consonant identification for reverberation times of 0.7 to . Analysis of confusion matrices shows that identification of voiced consonants, stop and nasal consonants, and bilabial, alveolar, and velar consonants was especially improved by steady-state suppression. Steady-state suppression is demonstrated to be an effective preprocessing method for improving syllable identification by reducing the effect of overlap-masking under specific reverberant conditions.
119(2006); http://dx.doi.org/10.1121/1.2195091
Previous research has identified a “synchrony window” of several hundred milliseconds over which auditory-visual (AV) asynchronies are not reliably perceived. Individual variability in the size of this AV synchrony window has been linked with variability in AV speech perception measures, but it was not clear whether AV speech perception measures are related to synchrony detection for speech only or for both speech and nonspeech signals. An experiment was conducted to investigate the relationship between measures of AV speech perception and AV synchrony detection for speech and nonspeech signals. Variability in AV synchrony detection for both speech and nonspeech signals was found to be related to variability in measures of auditory-only (A-only) and AV speech perception, suggesting that temporal processing for both speech and nonspeech signals must be taken into account in explaining variability in A-only and multisensory speech perception.