Volume 128, Issue 3, September 2010
Index of content:
- SPEECH PROCESSING AND COMMUNICATION SYSTEMS 
128(2010); http://dx.doi.org/10.1121/1.3467764
The purpose of this study was to explore the impact of different bandwidths on acoustic measures when low-cost internet technology is used for teletherapy in speech and language rehabilitation. Normal speech and voice samples were collected at a clinic and at a remote site by connecting the computers through Skype and VoiceEmotion software, while the disordered speech samples were taken from the teaching CD of a quality voice textbook. Pure tones at 200 and 1000 Hz were also collected. Four acoustic parameters were used for fidelity analysis: average fundamental frequency (F0), jitter percent, shimmer percent and noise-to-harmonic ratio (NHR). The average F0 increased across all samples and bandwidths. The increase and its variability were greater for the disordered voice samples. Speaking F0 was shown to both increase and decrease in no identifiable pattern with the different bandwidths. Jitter, shimmer and NHR differed significantly between pre- and post-transmission trials. The study provided preliminary data on the fidelity effect of internet transmission on acoustic variables for voice and speech. Cautious suggestions were also provided to speech and language therapists considering teletherapy for speech and voice diagnosis and treatment.
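For readers unfamiliar with the perturbation measures named in this abstract, the following is a minimal sketch of how jitter percent, shimmer percent and mean F0 are commonly defined from per-cycle measurements. It assumes the widely used "local" definitions (mean absolute cycle-to-cycle difference divided by the mean value); the abstract does not state which definitions or analysis software the authors used, and the function names here are illustrative only.

```python
def relative_perturbation_percent(values):
    """Mean absolute difference between consecutive values,
    expressed as a percentage of the mean value.

    Applied to glottal cycle periods this gives a local jitter percent;
    applied to per-cycle peak amplitudes it gives a local shimmer percent.
    (Assumed textbook definitions, not necessarily those of the study.)
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def mean_f0_hz(periods_s):
    """Mean F0 taken as the reciprocal of the mean cycle period (seconds)."""
    if not periods_s:
        raise ValueError("need at least one cycle")
    return len(periods_s) / sum(periods_s)


# Hypothetical per-cycle measurements, for illustration only.
periods = [0.0050, 0.0051, 0.0049, 0.0050]   # seconds per cycle
amplitudes = [0.80, 0.82, 0.79, 0.81]        # arbitrary linear units

jitter_percent = relative_perturbation_percent(periods)
shimmer_percent = relative_perturbation_percent(amplitudes)
f0 = mean_f0_hz(periods)
```

Comparing these values for the same sample before and after transmission is one simple way to quantify the kind of fidelity loss the study reports.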
128(2010); http://dx.doi.org/10.1121/1.3458854
Confusion matrices have been used as a tool for the analysis of speech perception, or human speech recognition (HSR), for decades. However, they are rarely employed in automatic speech recognition (ASR), mainly because a systematic procedure for exploring them has been lacking. The generalization of formal concept analysis employed in this paper provides a conceptual interpretation of confusion matrices that enables the analysis of the structure of confusions for both human and machine performance. Generalized formal concept analysis transforms confusion matrices into ordered lattices of confusion events, supporting classic results in HSR that identify a hierarchy of virtual articulatory-acoustic channels. Translating this technique into ASR, a detailed map of the relationships among the speech units employed in the system can be traced to make different sources of confusions apparent: the influence of the lexicon, segmentation errors, dialectal variations or limitations of the feature extraction procedures, among others.
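To illustrate the general idea of turning a confusion matrix into a lattice of concepts, the sketch below uses ordinary (binary) formal concept analysis on a thresholded confusion matrix. This is a deliberate simplification and not the generalized method of the paper: the threshold, the function names and the toy data are all assumptions made for the example, and the brute-force enumeration is only suitable for very small inventories of speech units.

```python
from itertools import combinations


def confusion_counts(pairs, labels):
    """Count how often each spoken unit (row) was recognized as each unit (column)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for spoken, heard in pairs:
        matrix[index[spoken]][index[heard]] += 1
    return matrix


def binary_context(matrix, threshold):
    """Relate row i to column j when the row-normalized rate reaches the threshold."""
    context = []
    for row in matrix:
        total = sum(row)
        context.append(
            {j for j, n in enumerate(row) if total and n / total >= threshold}
        )
    return context


def formal_concepts(context, n_columns):
    """Enumerate (extent, intent) pairs by closing every subset of rows."""
    concepts = set()
    rows = range(len(context))
    for size in range(len(context) + 1):
        for subset in combinations(rows, size):
            # Intent: columns shared by every row in the subset.
            intent = set(range(n_columns))
            for i in subset:
                intent &= context[i]
            # Extent: every row whose columns include the whole intent.
            extent = frozenset(i for i in rows if intent <= context[i])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Toy data (hypothetical): /p/ and /t/ are confused with each other, /k/ is not.
labels = ["p", "t", "k"]
pairs = (
    [("p", "p")] * 8 + [("p", "t")] * 2
    + [("t", "t")] * 7 + [("t", "p")] * 3
    + [("k", "k")] * 10
)
matrix = confusion_counts(pairs, labels)
context = binary_context(matrix, threshold=0.15)
concepts = formal_concepts(context, len(labels))
```

In this toy case the concept with extent {p, t} and intent {p, t} groups the two mutually confused units, while /k/ forms its own concept; ordering concepts by extent inclusion gives the lattice.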