This study analyzes continuous speech signals of people with Parkinson's disease (PD), considering recordings in three languages (Spanish, German, and Czech). A method for characterizing the speech signals, based on the automatic segmentation of utterances into voiced and unvoiced frames, is presented. The energy content of the unvoiced sounds is modeled using 12 Mel-frequency cepstral coefficients and 25 bands scaled according to the Bark scale. Four speech tasks comprising isolated words, rapid repetition of the syllables /pa/-/ta/-/ka/, sentences, and read texts are evaluated. The method proves to be more accurate than classical approaches in the automatic classification of speech of people with PD and healthy controls. The accuracies range from 85% to 99% depending on the language and the speech task. Cross-language experiments are also performed, confirming the robustness and generalization capability of the method, with accuracies ranging from 60% to 99%. This work is a step forward in the development of computer-aided tools for the automatic assessment of dysarthric speech signals in multiple languages.
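To make the Bark-band energy description concrete, the following is a minimal sketch of how the log-energy of a single frame could be computed in 25 Bark-scaled bands. It is not the authors' implementation. The function names `hz_to_bark` and `bark_band_log_energies`, the Hamming window, the Zwicker-Terhardt approximation of the Bark scale, and the choice of equal-width bands between 0 Hz and the Nyquist frequency are all assumptions made for illustration. The voiced/unvoiced segmentation and the 12 Mel-frequency cepstral coefficients are outside the scope of this sketch.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (assumed here)."""
    f_hz = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f_hz) + 3.5 * np.arctan((f_hz / 7500.0) ** 2)


def bark_band_log_energies(frame, fs, n_bands=25):
    """Log-energy of one frame in n_bands equal-width bands on the Bark axis.

    Hypothetical helper: the band layout and windowing are illustrative choices.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # Power spectrum and the frequency (in Hz) of each FFT bin.
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)

    # Band edges equally spaced in Bark between 0 Hz and the Nyquist frequency.
    edges = np.linspace(0.0, hz_to_bark(fs / 2.0), n_bands + 1)

    # Assign each FFT bin to a band; clip so the Nyquist bin lands in the last band.
    band_idx = np.clip(np.digitize(hz_to_bark(freqs), edges) - 1, 0, n_bands - 1)

    # Sum the power of all bins in each band, then take the logarithm.
    energies = np.bincount(band_idx, weights=power, minlength=n_bands)
    return np.log(energies + 1e-12)


# Usage: a 40 ms frame of a 1 kHz tone sampled at 44.1 kHz (synthetic input).
fs = 44100
t = np.arange(int(0.040 * fs)) / fs
frame = np.sin(2.0 * np.pi * 1000.0 * t)
features = bark_band_log_energies(frame, fs)
```

In a pipeline like the one described, a function of this kind would be applied only to the frames labeled as unvoiced, and the resulting vectors would be combined with the cepstral coefficients before classification.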
J.R.O.-A. was supported by grants of COLCIENCIAS through the call No. 528 “generación del bicentenario 2011.” This work was also financed by COLCIENCIAS through project No. 111556933858. The research leading to these results has received funding from the Hessen Agentur, Grant Nos. 397/13-36 (ASSIST 1) and 463/15-05 (ASSIST 2). The authors express thanks to CODI at Universidad de Antioquia for its support through “estrategia de sostenibilidad 2014-2015 de la Universidad de Antioquia.” This project was also funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, under Grant No. 9-135-1434-HiCi. The authors, therefore, acknowledge with thanks DSR technical and financial support.
I. INTRODUCTION
II. METHODS
   A. Preprocessing
   B. Speech modeling
      1. Modeling based on MFCC-GMM supervectors
      2. Prosody analysis
      3. Noise content, formant measures, and cepstral analysis of voiced frames
      4. Cepstral analysis and energy content of unvoiced frames
      5. Classification and validation
   C. Speech tasks
III. EXPERIMENTAL SETUP
   A. The data
      1. Spanish
      2. German
      3. Czech
IV. EXPERIMENTS AND RESULTS
   A. Results on reading texts
   B. Results on sentences
   C. Results on DDK evaluation
   D. Results on words
   E. Results on cross-language experiments
V. DISCUSSION
   A. The patients
   B. The results
      1. Results in read texts
      2. Results in sentences
      3. Results in DDK evaluation
      4. Results in isolated words
      5. Results in the cross-language experiments
VI. CONCLUSIONS