Abstracts
This paper describes a method for correcting short word deletion errors in automatic speech recognition. Such deletions are a frequent error type in Polish speech recognition. The proposed recognition process consists of two stages. In the first stage, the utterance is recognized by a typical speech recognizer based on a forward bigram language model. In the second stage, the recognized word sequence is analyzed to locate pairs of adjacent words that are likely to be separated by a short word such as a conjunction or preposition. The probability that a short word appears in the context of the located words is evaluated using centered trigrams and a backward bigram language model for short words prone to deletion. The set of probabilistic language properties used to correct deletions is called here the Local Bidirectional Language Model (in contrast to the purely forward or backward models typically used in speech recognition). The decision to insert a short word is based on comparing the deletion error probability of the first-stage recognizer with the error probability of a decision based only on centered trigrams and the backward model. Despite its simplicity, the method proved effective in correcting deletion errors for the most frequent Polish prepositions. The method was tested on the recognition of spoken medical reports, where the overall short word deletion error rate was reduced by almost 45%.
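The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy corpus, a hypothetical short word set (`SHORT_WORDS`), and a fixed `threshold` that stands in for the abstract's comparison of error probabilities. Only the centered trigram part is modeled; the backward bigram model is omitted. All function names are illustrative.

```python
from collections import Counter

# Hypothetical set of short words prone to deletion (assumption, not from the paper).
SHORT_WORDS = {"w", "z", "na"}


def train(sentences, short_words):
    """Count centered trigrams (left, short, right) and adjacent pairs (left, right)."""
    tri = Counter()
    adj = Counter()
    for s in sentences:
        for a, b in zip(s, s[1:]):
            adj[(a, b)] += 1
        for a, w, b in zip(s, s[1:], s[2:]):
            if w in short_words:
                tri[(a, w, b)] += 1
    return tri, adj


def best_insertion(left, right, tri, adj, short_words):
    """Return the most likely short word between left and right and its estimated probability.

    The estimate is count(left, w, right) divided by the number of times the pair
    was seen either adjacent or separated by any short word (a simplifying assumption).
    """
    counts = {w: tri[(left, w, right)] for w in short_words}
    total = sum(counts.values()) + adj[(left, right)]
    if total == 0:
        return None, 0.0  # context never observed: no evidence for an insertion
    best = max(sorted(counts), key=counts.get)
    return best, counts[best] / total


def correct(words, tri, adj, short_words, threshold=0.5):
    """Insert a short word between adjacent words when its estimate exceeds the threshold."""
    if not words:
        return []
    out = [words[0]]
    for left, right in zip(words, words[1:]):
        w, p = best_insertion(left, right, tri, adj, short_words)
        if w is not None and p > threshold:
            out.append(w)
        out.append(right)
    return out


# Toy training corpus (invented for illustration only).
corpus = [
    ["badanie", "wykonano", "w", "pozycji", "stojacej"],
    ["zdjecie", "wykonano", "w", "pozycji", "lezacej"],
    ["wykonano", "badanie", "kontrolne"],
]
tri, adj = train(corpus, SHORT_WORDS)
recognized = ["badanie", "wykonano", "pozycji", "stojacej"]  # "w" was deleted
corrected = correct(recognized, tri, adj, SHORT_WORDS)
```

In this toy setting the pair ("wykonano", "pozycji") is only ever seen with "w" between the two words, so the sketch restores the preposition there and leaves the other word pairs untouched. A fuller version would derive the threshold from the measured deletion error rate of the first-stage recognizer, as the abstract suggests.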
Pages
127–134
Physical description
Bibliography: 9 items, tables.
Authors
author
- Institute of Informatics, Wroclaw University of Technology, 50-370 Wroclaw, Wyb. Wyspianskiego
Bibliography
- [1] KATZ S.M., Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 35(3), 1987, pp. 400–401.
- [2] JELINEK F., Statistical Methods for Speech Recognition, MIT Press, Cambridge, Massachusetts, 1997.
- [3] CHEN S.F., GOODMAN J., An Empirical Study of Smoothing Techniques for Language Modeling, Proc. of the 34th Annual Meeting of the Association for Computational Linguistics, 1996, pp. 310–318.
- [4] JURAFSKY D., MARTIN J., Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, New Jersey, 2000.
- [5] LEE A., KAWAHARA T., SHIKANO K., Julius – an Open Source Real–Time Large Vocabulary Recognition Engine, Proc. of European Conference on Speech Communication and Technology (EUROSPEECH), 2001, pp. 1691–1694.
- [6] HNATKOWSKA B., SAS J., Application of Automatic Speech Recognition to Medical Reports Spoken in Polish, Journal of Medical Informatics & Technologies, Vol. 12, 2008, pp. 223–230.
- [7] SAS J., Optimal Spoken Dialog Control in Hands–Free Medical Information Systems, Journal of Medical Informatics & Technologies, Vol. 13, 2009, pp. 113–120.
- [8] YOUNG S., EVERMANN G., The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, 2009.
- [9] ZIOLKO B., SKURZOK D., ZIOLKO M., Word n–Grams for Polish, Proc. of 10th IASTED Int. Conf. on Artificial Intelligence and Applications (AIA 2010), 2010, pp. 197–201.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-PWA4-0017-0020