Index of content:
Volume 134, Issue 1, July 2013
- SPEECH PROCESSING AND COMMUNICATION SYSTEMS 
Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages
134 (2013); http://dx.doi.org/10.1121/1.4807565
This study presents a method for analyzing speech rhythm using empirical mode decomposition of the speech amplitude envelope, which allows for extraction and quantification of syllabic- and supra-syllabic time-scale components of the envelope. The method of empirical mode decomposition of a vocalic energy amplitude envelope is illustrated in detail, and several types of rhythm metrics derived from this method are presented. Spontaneous speech extracted from the Buckeye Corpus is used to assess the effect of utterance length on metrics, and it is shown how metrics representing variability in the supra-syllabic time-scale components of the envelope can be used to identify stretches of speech with targeted rhythmic characteristics. Furthermore, the envelope-based metrics are used to characterize cross-linguistic differences in speech rhythm in the UC San Diego Speech Lab corpus of English, German, Greek, Italian, Korean, and Spanish speech elicited in read sentences, read passages, and spontaneous speech. The envelope-based metrics exhibit significant effects of language and elicitation method that argue for a nuanced view of cross-linguistic rhythm patterns.
134 (2013); http://dx.doi.org/10.1121/1.4807645
Recent perspectives suggest that the Lombard effect is an increase in the suprasegmental speech parameters of vocal intensity, duration, and fundamental frequency in the presence of noise. It has been viewed as a non-specific response to ambient noise, but this assumption has not been thoroughly tested. Two experiments using healthy adults measured intensity, duration, and F0 changes in broadband (0.2–20 kHz) and notched noise (0.05–4 kHz removed) during a picture naming task. The pilot experiment showed that broadband noise containing speech-similar frequencies significantly increased intensity, duration, and F0 while notched noise, which removed the majority of speech-similar frequencies, had no effect. The main experiment added bandpass noise (0.05–4.0 kHz) which contained a major portion of speech-similar frequencies and was the mirror image of the notched noise. Broadband and notched noise results were replicated. Bandpass noise increased intensity and duration, but to a lesser degree than did broadband noise, and had no effect on F0. Findings show that the Lombard effect is sensitive to frequencies vital for speech and is not a general response to any competing sound in the environment. Implications for suprasegmental control of speech are discussed.