Application of local bidirectional language model to error correction in Polish medical speech recognition

Sas, J.

Article details

Article title

Application of local bidirectional language model to error correction in Polish medical speech recognition

Authors

Sas J.

Abstract

This paper describes a method for correcting short-word deletion errors in automatic speech recognition. Short-word deletions appear to be a frequent error type in Polish speech recognition. The proposed recognition process has two stages. In the first stage, the utterance is recognized by a typical speech recognizer based on a forward bigram language model. In the second stage, the recognized word sequence is analyzed to locate pairs of adjacent words that are likely to be separated by a short word such as a conjunction or preposition. The probability that a short word appears in the context of the located words is evaluated using centered trigrams and a backward bigram language model built for the short words prone to deletion. This set of probabilistic language properties is called here the Local Bidirectional Language Model, in contrast to the purely forward or backward models typically used in speech recognition. The decision to insert a short word is based on comparing the deletion error probability of the first-stage recognizer with the error probability of a decision based only on centered trigrams and the backward model. Despite its simplicity, the method proved effective in correcting deletion errors for the most frequent Polish prepositions. It was tested on the recognition of spoken medical reports, where the overall short-word deletion error rate was reduced by almost 45%.
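The second stage described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the class name `LocalBidirectionalLM`, the unsmoothed count-based estimates, the back-off from centered trigrams to the backward bigram, the toy corpus, and the fixed `threshold`, which merely stands in for the paper's comparison of error probabilities.

```python
from collections import Counter


class LocalBidirectionalLM:
    """Toy model for re-inserting deleted short words (illustrative only)."""

    def __init__(self, short_words):
        self.short_words = set(short_words)
        self.centered = Counter()  # (left, short, right) centered trigram counts
        self.adjacent = Counter()  # (left, right) counts for directly adjacent words
        self.backward = Counter()  # (short, right) backward bigram counts
        self.preceded = Counter()  # how often a word has any predecessor

    def train(self, sentences):
        for sent in sentences:
            for left, right in zip(sent, sent[1:]):
                self.adjacent[(left, right)] += 1
                self.preceded[right] += 1
                if left in self.short_words:
                    self.backward[(left, right)] += 1
            for left, mid, right in zip(sent, sent[1:], sent[2:]):
                if mid in self.short_words:
                    self.centered[(left, mid, right)] += 1

    def insertion_prob(self, left, short, right):
        # Centered estimate: share of (left, ?, right) contexts filled by `short`.
        gap_total = self.adjacent[(left, right)] + sum(
            self.centered[(left, s, right)] for s in self.short_words
        )
        if gap_total > 0:
            return self.centered[(left, short, right)] / gap_total
        # Assumed back-off: backward bigram P(short precedes right).
        if self.preceded[right] > 0:
            return self.backward[(short, right)] / self.preceded[right]
        return 0.0

    def correct(self, words, threshold=0.5):
        if not words:
            return []
        out = [words[0]]
        for left, right in zip(words, words[1:]):
            best = max(
                sorted(self.short_words),
                key=lambda s: self.insertion_prob(left, s, right),
            )
            if self.insertion_prob(left, best, right) > threshold:
                out.append(best)
            out.append(right)
        return out


# Hypothetical toy corpus of tokenized report fragments (diacritics omitted).
corpus = [
    ["guz", "w", "nerce", "lewej"],
    ["guz", "w", "nerce", "prawej"],
    ["zmiany", "w", "sercu"],
]
lm = LocalBidirectionalLM({"w", "z", "na"})
lm.train(corpus)
print(lm.correct(["guz", "nerce", "lewej"]))
```

In this sketch, a first-stage output that dropped the preposition "w" between "guz" and "nerce" is repaired because every observed context of that word pair contains "w". A realistic system would need smoothed estimates and a decision rule calibrated against the first-stage recognizer's measured deletion rate, as the abstract indicates.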

Keywords

speech recognition, language models, medical information systems

Publisher

University of Silesia, Institute of Informatics, Computer Systems Department

Journal

Journal of Medical Informatics & Technologies

Year

2010

Volume

Vol. 15

Pages

127–134

Physical description

Bibliography: 9 items, tables.

Contributors

author

Sas J.

jerzy.sas@pwr.wroc.pl

Institute of Informatics, Wroclaw University of Technology, 50-370 Wroclaw, Wyb. Wyspianskiego

Bibliography

[1] KATZ S.M., Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 35(3), 1987, pp. 400–401.
[2] JELINEK F., Statistical Methods for Speech Recognition, MIT Press, Cambridge, Massachusetts, 1997.
[3] CHEN S.F., GOODMAN J., An Empirical Study of Smoothing Techniques for Language Modeling, Proc. of the 34th Annual Meeting of the Association for Computational Linguistics, 1996, pp. 310–318.
[4] JURAFSKY D., MARTIN J., Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, New Jersey, 2000.
[5] LEE A., KAWAHARA T., SHIKANO K., Julius – an Open Source Real–Time Large Vocabulary Recognition Engine, Proc. of European Conference on Speech Communication and Technology (EUROSPEECH), 2001, pp. 1691–1694.
[6] HNATKOWSKA B., SAS J., Application of Automatic Speech Recognition to Medical Reports Spoken in Polish, Journal of Medical Informatics & Technologies, Vol. 12, 2008, pp. 223–230.
[7] SAS J., Optimal Spoken Dialog Control in Hands–Free Medical Information Systems, Journal of Medical Informatics & Technologies, Vol. 13, 2009, pp. 113–120.
[8] YOUNG S., EVERMANN G., The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, 2009.
[9] ZIOLKO B., SKURZOK D., ZIOLKO M., Word n–Grams for Polish, Proc. of 10th IASTED Int. Conf. on Artificial Intelligence and Applications (AIA 2010), 2010, pp. 197–201.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA4-0017-0020