Speech intelligibility estimation using multi-resolution spectral features for speakers undergoing cancer treatment
Head and neck cancer can significantly hamper speech production, which often reduces speech intelligibility. A method of extracting spectral features is presented. The method uses a multi-resolution sinusoidal transform scheme, which enables a better representation of spectral and harmonic characteristics. Regression methods were used to predict interval-scaled intelligibility scores of utterances in the NKI-CCRT speech corpus. Including these features lowered the mean squared estimation error from 0.43 to 0.39 on a scale from 1 to 7 (p < 0.001). For binary intelligibility classification, their inclusion yielded an improvement of 5.0 percentage points when tested on a disjoint set.
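
The evaluation described above, comparing regression error with and without an added feature set, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the feature matrices `baseline` and `extra` are hypothetical stand-ins for a baseline feature set and the added spectral features, and closed-form ridge regression is an assumed choice, since the abstract does not name the regression method used.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression; a bias column is appended to X.
    The small penalty lam is applied to every coefficient, bias included."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(X, w):
    """Apply fitted weights w (last entry is the bias) to features X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def mse(y_true, y_pred):
    """Mean squared estimation error."""
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic data only: hypothetical stand-ins, not NKI-CCRT features.
rng = np.random.default_rng(0)
n = 400
baseline = rng.normal(size=(n, 5))  # assumed baseline feature set
extra = rng.normal(size=(n, 3))     # assumed added spectral features

# Synthetic scores that depend on both feature sets, kept on a 1 to 7 scale.
scores = (4.0
          + baseline @ np.array([0.3, -0.2, 0.1, 0.25, -0.15])
          + extra @ np.array([0.4, -0.3, 0.2])
          + rng.normal(scale=0.3, size=n))
scores = np.clip(scores, 1.0, 7.0)

# Disjoint train and test partitions.
train, test = slice(0, 300), slice(300, n)
combined = np.hstack([baseline, extra])

w_base = fit_ridge(baseline[train], scores[train])
w_both = fit_ridge(combined[train], scores[train])

mse_base = mse(scores[test], predict(baseline[test], w_base))
mse_both = mse(scores[test], predict(combined[test], w_both))
print(f"baseline only: {mse_base:.3f}  with added features: {mse_both:.3f}")
```

Because the synthetic scores are constructed to depend on the added features, the error drops by construction; the sketch shows only the shape of the comparison, not the size of the effect reported above.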