Selection of spectral compressive operator for vector Taylor series-based model adaptation in noisy environments
/content/asa/journal/jasa/135/6/10.1121/1.4874358
2014-05-13
2014-10-01

Abstract

This letter investigates the impact of spectral compression on the vector Taylor series (VTS)-based model adaptation algorithm. Unlike mel-frequency cepstral coefficients, which are obtained by logarithmic compression, the features here are extracted using fractional power compression. Since the relationship between the acoustic models for clean and noisy speech depends on the nonlinearity applied to the spectrum, it is important to select an appropriate compressive operator for model adaptation. In this letter, the dependency of the speech recognition system on the spectral nonlinearity is analyzed in various noisy environments. Experimental results confirm that replacing the compressive operator improves the performance of the model adaptation.
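
To make concrete why the clean-to-noisy relationship depends on the compressive operator, the following is a minimal sketch and not the authors' implementation. It assumes that speech and noise powers simply add in a single spectral channel (no channel distortion, no phase term), and all function and parameter names (`compress`, `expand`, `noisy_feature`, `clean_slope`, `gamma`) are hypothetical. Setting `gamma` to 0 selects the logarithm; a value between 0 and 1 selects a fractional power.

```python
import math


def compress(power, gamma):
    """Compress a positive spectral power value.

    gamma == 0 selects logarithmic compression; 0 < gamma <= 1 selects
    fractional power compression, power ** gamma. (Assumed convention.)
    """
    if gamma == 0.0:
        return math.log(power)
    return power ** gamma


def expand(feature, gamma):
    """Invert compress(): map a compressed feature back to power."""
    if gamma == 0.0:
        return math.exp(feature)
    return feature ** (1.0 / gamma)


def noisy_feature(clean, noise, gamma):
    """Compressed noisy-speech feature from compressed clean-speech and
    noise features, under the assumption that powers add: Y = X + N."""
    return compress(expand(clean, gamma) + expand(noise, gamma), gamma)


def clean_slope(clean, noise, gamma):
    """Derivative of noisy_feature() with respect to the clean feature.

    This is the scalar analogue of the Jacobian that a first-order
    Taylor expansion uses to linearize the clean-to-noisy mapping.
    """
    x = expand(clean, gamma)
    n = expand(noise, gamma)
    return (x / (x + n)) ** (1.0 - gamma)


def linearized_noisy_feature(clean, clean0, noise0, gamma):
    """First-order Taylor approximation around (clean0, noise0),
    with the noise feature held fixed at noise0."""
    y0 = noisy_feature(clean0, noise0, gamma)
    return y0 + clean_slope(clean0, noise0, gamma) * (clean - clean0)
```

Under these assumptions the slope reduces to `(X / (X + N)) ** (1 - gamma)`, so the same signal-to-noise ratio yields a different linearization for each choice of `gamma`. In this simplified picture, that is the sense in which the adapted model depends on the compressive operator.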
