2014-05-01
2016-12-09

Abstract

There has been consistent interest among speech signal processing researchers in the accurate estimation of the fundamental frequency (F0) of speech signals. This study examines ten F0 estimation algorithms (some well-established and some proposed more recently) to determine which of these algorithms is, on average, better able to estimate F0 in the sustained vowel /a/. Moreover, a robust method for adaptively weighting the estimates of individual F0 estimation algorithms based on quality and performance measures is proposed, using an adaptive Kalman filter (KF) framework. The accuracy of the algorithms is validated using (a) a database of 117 synthetic realistic phonations obtained using a sophisticated physiological model of speech production and (b) a database of 65 recordings of human phonations where the glottal cycles are calculated from electroglottograph signals. On average, the sawtooth waveform inspired pitch estimator and the nearly defect-free F0 algorithms provided the best individual F0 estimates, and the proposed KF approach resulted in a ∼16% improvement in accuracy over the best single F0 estimation algorithm. These findings may be useful in speech signal processing applications where sustained vowels are used to assess vocal quality, when very accurate F0 estimation is required.
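
To make the fusion idea concrete, the following is a minimal sketch of how several F0 contours might be combined with a scalar Kalman filter. It is not the authors' adaptive KF: the function name fuse_f0_tracks, the random-walk state model, and the fixed per-estimator noise variances are all assumptions made for illustration, whereas the method described above adapts the weighting from quality and performance measures.

```python
from typing import List, Sequence


def fuse_f0_tracks(
    tracks: Sequence[Sequence[float]],
    meas_var: Sequence[float],
    process_var: float = 1.0,
) -> List[float]:
    """Fuse several F0 contours (Hz) with a scalar random-walk Kalman filter.

    tracks[i][k] is the F0 estimate of (hypothetical) estimator i at frame k.
    meas_var[i] is an assumed, fixed measurement-noise variance (Hz^2) for
    estimator i; a smaller value means that estimator is trusted more.
    process_var is the assumed frame-to-frame variance of the true F0.
    """
    n_frames = len(tracks[0])
    # Start from the mean of the first frame with a deliberately large variance.
    x = sum(t[0] for t in tracks) / len(tracks)
    p = 1e3
    fused: List[float] = []
    for k in range(n_frames):
        # Predict: F0 is modelled as a random walk, so only the variance grows.
        p += process_var
        # Update: absorb each estimator's measurement in turn.
        for track, r in zip(tracks, meas_var):
            gain = p / (p + r)           # Kalman gain for this measurement
            x += gain * (track[k] - x)   # move the state toward the measurement
            p *= 1.0 - gain              # shrink the state variance
        fused.append(x)
    return fused


if __name__ == "__main__":
    reliable = [100.0] * 20      # assumed low-noise estimator
    unreliable = [200.0] * 20    # assumed high-noise estimator (octave error)
    print(fuse_f0_tracks([reliable, unreliable], meas_var=[0.01, 1e6])[-1])
```

With a small variance assigned to the first contour and a very large one to the second, the fused contour stays close to the first, which is the behaviour a quality-driven weighting is meant to produce. In an adaptive scheme the entries of meas_var would be updated frame by frame rather than fixed in advance.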
