Article title

Building compact language models for medical speech recognition in mobile devices with limited amount of memory

Publication languages
EN
Abstracts
EN
The article presents a method of building a compact language model for speech recognition on devices with a limited amount of memory. The most commonly used word-based bigram language models allow highly accurate speech recognition but require a large amount of memory to store, mainly because of the large number of word bigrams. The proposed method ranks bigrams according to their importance in speech recognition and replaces the explicit probability estimates of the less important bigrams with probabilities derived from a class-based model. The class-based model is created by assigning the words appearing in the corpus to classes that correspond to syntactic properties of words. The classes represent various combinations of part of speech and inflectional features such as number, case, tense, and person. To minimize the memory needed to store the class-based model, a method that reduces the number of part-of-speech classes is applied; it merges classes that appear in stochastically similar contexts in the corpus. Experiments carried out on selected domains of medical speech show that the method allows a 75% reduction in model size without significant loss of speech recognition accuracy.
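The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how bigram importance is measured, so raw bigram frequency is used here as a stand-in, and the names `build_compact_model`, `word_class`, and `keep_fraction` are hypothetical. Pruned bigrams fall back to the standard class-based estimate P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2).

```python
from collections import Counter


def build_compact_model(tokens, word_class, keep_fraction=0.25):
    """Keep explicit estimates for top-ranked bigrams; use classes for the rest.

    tokens        -- list of corpus words (assumed already tokenized)
    word_class    -- dict mapping each word to its class label (assumed given)
    keep_fraction -- share of distinct bigrams stored explicitly (assumption)
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    classes = [word_class[w] for w in tokens]
    class_unigrams = Counter(classes)
    class_bigrams = Counter(zip(classes, classes[1:]))

    # ASSUMPTION: importance is approximated by raw bigram frequency.
    ranked = sorted(bigrams, key=lambda b: (-bigrams[b], b))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    # Maximum-likelihood estimate for the bigrams that stay in the model.
    kept = {b: bigrams[b] / unigrams[b[0]] for b in ranked[:n_keep]}

    def prob(w2, w1):
        """Estimate P(w2 | w1) for words seen in the corpus."""
        if (w1, w2) in kept:
            return kept[(w1, w2)]
        c1, c2 = word_class[w1], word_class[w2]
        p_class = class_bigrams[(c1, c2)] / class_unigrams[c1]  # P(c2 | c1)
        p_word = unigrams[w2] / class_unigrams[c2]              # P(w2 | c2)
        return p_class * p_word

    return prob, kept


# Toy corpus and hand-assigned classes, for illustration only.
tokens = "patient has fever patient has cough patient reports pain".split()
word_class = {"patient": "N", "fever": "N", "cough": "N", "pain": "N",
              "has": "V", "reports": "V"}
prob, kept = build_compact_model(tokens, word_class, keep_fraction=0.25)
p_kept = prob("has", "patient")    # explicit word-bigram estimate
p_fallback = prob("pain", "has")   # class-based fallback estimate
```

The sketch omits smoothing and renormalization, so the mixed estimates do not sum to one for a given history; a usable model would need back-off weights or a similar correction. The memory saving comes from storing only the entries in `kept` plus the much smaller class-level tables.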
Pages
111–119
Physical description
Bibliography: 19 items, figures, tables.
Contributors
author
  • Institute of Informatics, Wroclaw University of Technology, 50-370 Wroclaw, ul. Wyb. Wyspianskiego 27
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-PWA4-0027-0013