Index of content:
Volume 121, Issue 1, January 2007
- SPEECH PROCESSING AND COMMUNICATION SYSTEMS 
121(2007); http://dx.doi.org/10.1121/1.2390679View Description Hide Description
A technique to synthesize laughter based on time-domain behavior of real instances of human laughter is presented. In the speech synthesis community, interest in improving the expressive quality of synthetic speech has grown considerably. While the focus has been on the linguistic aspects, such as precise control of speechintonation to achieve desired expressiveness, inclusion of nonlinguistic cues could further enhance the expressive quality of synthetic speech. Laughter is one such cue used for communicating, say, a happy or amusing context. It can be generated in many varieties and qualities: from a short exhalation to a long full-blown episode. Laughter is modeled at two levels, the overall episode level and at the local call level. The first attempts to capture the overall temporal behavior in a parametric model based on the equations that govern the simple harmonic motion of a mass-spring system is presented. By changing a set of easily available parameters, the authors are able to synthesize a variety of laughter. At the call level, the authors relied on a standard linear prediction based analysis-synthesismodel. Results of subjective tests to assess the acceptability and naturalness of the synthetic laughter relative to real human laughter samples are presented.