Article title

Model-Based Feature Compensation for Robust Speech Recognition

Publication languages
EN
Abstracts
EN
This paper proposes a novel robust speech recognition approach built on model-based feature compensation. The approach combines GMM-based and HMM-based feature compensation and uses multiple recognition passes to achieve the best performance. In the initial recognition pass, GMM-based feature compensation is applied to obtain better clean-speech and noise models. These models are then refined with HMM-based feature compensation. The statistical models of the clean speech and the noise are combined using a vector Taylor series (VTS) approximation. Experimental results show that the proposed approach significantly outperforms both GMM-based feature compensation and HMM-based feature compensation applied without any compensation in the initial pass.
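The abstract does not give the combination equations, so the following is only a minimal sketch of how a first-order VTS combination and a GMM-based compensation step are commonly written. It assumes log-spectral features, diagonal covariances, additive noise only, and the mismatch function y = x + log(1 + exp(n - x)) that is standard in the VTS literature. The function names vts_combine and compensate are hypothetical and are not taken from the paper.

```python
import math

def vts_combine(mu_x, var_x, mu_n, var_n):
    """Combine one clean-speech Gaussian with one noise Gaussian.

    Assumed mismatch function per dimension (log-spectral domain):
        y = x + log(1 + exp(n - x))
    expanded to first order around the means (mu_x, mu_n).
    Returns the mean and diagonal variance of the noisy-speech Gaussian.
    """
    mu_y, var_y = [], []
    for mx, vx, mn, vn in zip(mu_x, var_x, mu_n, var_n):
        g = math.log1p(math.exp(mn - mx))     # shift caused by the noise
        a = 1.0 / (1.0 + math.exp(mn - mx))   # dy/dx at the expansion point
        mu_y.append(mx + g)
        # dy/dn = 1 - a, so the two variances mix with weights a^2 and (1-a)^2
        var_y.append(a * a * vx + (1.0 - a) ** 2 * vn)
    return mu_y, var_y

def log_gauss(y, mu, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
               for yi, m, v in zip(y, mu, var))

def compensate(y, weights, clean_means, clean_vars, mu_n, var_n):
    """Estimate the clean feature vector from a noisy frame y.

    Uses the posterior-weighted shift x_hat = y - sum_k p(k | y) * g_k,
    where the posteriors come from the VTS-combined noisy-speech GMM.
    """
    log_post, shifts = [], []
    for w, mu_x, var_x in zip(weights, clean_means, clean_vars):
        mu_y, var_y = vts_combine(mu_x, var_x, mu_n, var_n)
        log_post.append(math.log(w) + log_gauss(y, mu_y, var_y))
        shifts.append([my - mx for my, mx in zip(mu_y, mu_x)])
    top = max(log_post)                        # normalise in a stable way
    post = [math.exp(lp - top) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]
    return [yi - sum(p * s[d] for p, s in zip(post, shifts))
            for d, yi in enumerate(y)]
```

With equal speech and noise means the combined mean rises by log 2, and with very weak noise the compensated frame is essentially the noisy frame itself. A multi-pass scheme like the one the abstract describes would presumably re-estimate mu_n and var_n between passes. That step is not shown here, and math.exp can overflow when the noise mean exceeds the speech mean by several hundred.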
Pages
529-539
Physical description
Bibliography: 20 items.
Bibliography
  • [1] Moreno, P.J., Raj, B., Stern, R.M.: A Vector Taylor Series Approach for Environment-Independent Speech Recognition, Proc. ICASSP, 1995, 733-736.
  • [2] Moreno, P.J.: Speech Recognition in Noisy Environments, Ph.D. thesis, ECE Department, CMU, April 1996.
  • [3] Raj, B., Gouvea, E.B., Moreno, P.J., Stern, R.M.: Cepstral Compensation by Polynomial Approximation for Environment-Independent Speech Recognition, Proc. ICSLP, Philadelphia, 1996, 2340-2343.
  • [4] Kim, N.S.: Statistical Linear Approximation for Environment Compensation, IEEE Signal Processing Letters, Vol.5, No.1, 1998, 8-10.
  • [5] Shen, H.F., Liu, G., Guo, J., Li, Q.X.: Two-Domain Feature Compensation for Robust Speech Recognition, Proc. Second International Symposium on Neural Network (Wang, J., Liao, X., Yi, Z. Ed.), LNCS 3497, Springer-Verlag, Berlin, 2005, 351-356.
  • [6] Shen, H.F., Liu, G., Guo, J., Huang, P.M., Li, Q.X.: Environment Compensation Based on Maximum a Posteriori Estimation for Improved Speech Recognition, Proc. Fourth Mexican International Conference on Artificial Intelligence (Gelbukh, A., de Albornoz, A., Terashima, H. Ed.), LNAI 3789, Springer-Verlag, Berlin, 2005, 854-862.
  • [7] Gales, M.J.F.: Model-Based Techniques for Noise Robust Speech Recognition, Ph.D. thesis, University of Cambridge, September 1995.
  • [8] Acero, A., Li, D., Kristjansson, K., Zhang, J.: HMM Adaptation Using Vector Taylor Series for Noisy Speech Recognition, Proc. ICSLP 2000, Beijing, 2000.
  • [9] Shen, H.F., Li, Q.X., Guo, J., Liu, G.: HMM Parameter Adaptation Using the Truncated First-Order VTS and EM Algorithm for Robust Speech Recognition, Proc. 2005 International Conference on Computational Intelligence and Security, Lecture Notes in Artificial Intelligence (Hao, Y. Ed.), LNAI 3801, Springer-Verlag, Berlin, 2005, 979-984.
  • [10] Sarikaya, R., Hansen, J.H.: PCA-PMC: A Novel Use of a Priori Knowledge for parallel model combination, Proc. ICASSP 2000, 2000, 1113-1116.
  • [11] Tai-Hwei, H., Hsiao-Chuan, W.: A fast algorithm for parallel model combination for noisy speech recognition, Computer Speech and Language, No.14, 2000, 81-100.
  • [12] Sagayama, S., Yamaguchi, Y., Takahashi, S., Takahashi, J.: Jacobian Approach to Fast Acoustic Model Adaptation, Proc. ICASSP'97, Munich, Germany, 1997, 835-838.
  • [13] Sagayama, S., Kato, Y., Nakai, M., Shimodaira, H.: Jacobian Approach to Joint Adaptation to Noise, Channel and Vocal Tract Length, Proc. ISCA Workshop on Adaptation Methods, Sophia Antipolis, France, 2001, 117-120.
  • [14] Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society B, 1977, 1-38.
  • [15] Gauvain, J.L., Lee, C.H.: Maximum A Posteriori Estimation for Multivariate Gaussian Mixture Observation of Markov Chains, IEEE Transactions on Speech and Audio Processing, Vol.2, No.2, 1994, 291-298.
  • [16] Huo, Q., Lee, C.H.: On-Line Adaptive Learning of the Continuous Density Hidden Markov Model Based on Approximate Recursive Bayes Estimate, IEEE Transactions on Speech and Audio Processing, Vol.5, No.2, 1997, 161-172.
  • [17] Huo, Q., Chan, C., Lee, C.H.: Bayesian Adaptive Learning of the Parameters of Hidden Markov Model for Speech Recognition, IEEE Transactions on Speech and Audio Processing, Vol.3, No.5, 1995, 334-345.
  • [18] Zu, Y.Q.: Issues in the Scientific Design of the Continuous Speech Database, Available: http://www.cass.net.cn/chinese/s18 yys/yuyin/report/report 1998.htm.
  • [19] Rabiner, L.R.: A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proc. IEEE, Vol.77, 1989, 257-286.
  • [20] Varga, A., Steeneken, H.J.M., Tomlinson, M., Jones, D.: The NOISEX-92 Study on the Effect of Additive Noise on Automatic Speech Recognition, Documentation on the NOISEX-92 CD-ROMs, 1992.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS2-0010-0084