Speech recognition for interaction with a robot in noisy environment

Maučec, M. S.; Kačič, Z.; Žgank, A.

Article - details

Article title

Speech recognition for interaction with a robot in noisy environment

Authors

Maučec M. S., Kačič Z., Žgank A.

Selected full texts from this journal

http://pe.org.pl/

Title variants

Rozpoznawanie mowy w interakcji z robotem w zaszumionym otoczeniu

Abstracts

One of the main problems with speech recognition for robots is noise. In this paper we propose two methods to enhance the robustness of continuous speech recognition in a noisy environment. We show that recognition accuracy can be improved by better weighting of the language model in the decision process. The second proposed method is based on language model adaptation. The experiments showed that both proposed techniques improve speech recognition accuracy by approximately 2%.

The article presents two methods for increasing robustness to interference and the effectiveness of speech recognition in a noisy environment. It is shown that an appropriate choice of weighting coefficients for the language model in the decision process increases the precision of sound recognition. The second method is based on language model adaptation. Experimental studies showed that both methods increase the effectiveness of speech recognition by about 2%.
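The idea of weighting the language model in the decision process can be illustrated with a minimal sketch. It assumes the common log-linear combination of an acoustic score and a scaled language model score; the record does not state the authors' exact formulation, and the function names, the word penalty term and all score values below are hypothetical.

```python
# Illustrative only: hypothetical scores, not data from the paper.

def combined_score(acoustic_logprob, lm_logprob, n_words,
                   lm_weight=1.0, word_penalty=0.0):
    """Log-linear decision score: acoustic + lm_weight * LM + penalty per word."""
    return acoustic_logprob + lm_weight * lm_logprob + word_penalty * n_words


def best_hypothesis(hypotheses, lm_weight, word_penalty=0.0):
    """Return the hypothesis with the highest combined score."""
    return max(
        hypotheses,
        key=lambda h: combined_score(
            h["acoustic"], h["lm"], len(h["words"]), lm_weight, word_penalty
        ),
    )


# Two competing transcriptions of one noisy utterance (made-up log scores).
# The second fits the noisy audio slightly better but is unlikely as a sentence.
hypotheses = [
    {"words": ["move", "forward"], "acoustic": -120.0, "lm": -4.0},
    {"words": ["mauve", "four", "word"], "acoustic": -118.0, "lm": -15.0},
]

low = best_hypothesis(hypotheses, lm_weight=0.1)
high = best_hypothesis(hypotheses, lm_weight=1.0)
print("lm_weight=0.1:", " ".join(low["words"]))
print("lm_weight=1.0:", " ".join(high["words"]))
```

With a small language model weight the acoustically preferred but implausible word sequence wins; a larger weight lets the language model override unreliable acoustic evidence. In practice such a weight would be tuned on held-out data recorded under the target noise conditions.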

Keywords

speech recognition, vocabulary, robustness

speech recognition, vocabulary, interference

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2013

Volume

Vol. 89, no. 5

Pages

232-236

Physical description

Bibliography: 22 items, tables, charts.

Contributors

author

Maučec M. S.

mirjam.sepesy@uni-mb.si

Faculty for Electrical Engineering and Computer Science, University of Maribor

author

Kačič Z.

kacic@uni-mb.si

Faculty for Electrical Engineering and Computer Science, University of Maribor

author

Žgank A.

andrej.zgank@uni-mb.si

Faculty for Electrical Engineering and Computer Science, University of Maribor

Bibliography

[1] ROJC, Matej. Web-based architecture RES based on finite-state machines for distributed evaluation and development of speech synthesis systems. Int. j. comput. linguist. res. (Print), Mar. 2011, vol. 2, no. 1, pp. 1-12.
[2] Drungilas D, Grisius G, Recognition of Human Emotions in Reasoning Algorithms of Wheelchair Type Robots. Informatica, 2010, Vol. 21, No. 4, pp. 521-532.
[3] Sato M, Iwasawa T, Sugiyama A, Nishizawa T, Takano Y, A Single-Chip Speech Dialogue Module and Its Evaluation on a Personal Robot, PaPeRo-Mini. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2010, Vol. E93-A, No. 1, pp. 261-271.
[4] Janusz Dulas: Automatic digits recognition for Polish – noisy phonemes identification, Electrical Review, 01/2011, pp. 280-283.
[5] R. P. Lippmann et al., Multi-style training for robust isolated word speech recognition, Proc. of ICASSP, 1987, pp. 705-708.
[6] Liao, Y. F., Fang, H. H., Hsu, C. H., Eigen-MLLR Environment/Speaker Compensation for Robust Speech Recognition. Proceeding Interspeech, 2008, Brisbane, Australia, pp. 1249–1252.
[7] Donglin Wang, Leung, H., Keun-Chang Kwak, Hosub Yoon, Enhanced Speech Recognition with Blind Equalization For Robot "WEVER-R2". The 16th IEEE International Symposium on Robot and Human interactive Communication, 2007, pp. 684-688.
[8] Yoshitaka Nishimura, Mikio Nakano, Kazuhiro Nakadai, Hiroshi Tsujino and Mitsuru Ishizuka, Speech Recognition for a Robot under its Motor Noises by Selective Application of Missing Feature Theory and MLLR, Proc. of ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA, 2006), Pittsburgh, pp.53-58.
[9] Yamamoto, S., Nakadai, K., Nakano, M., Tsujino, H., Valin, J.-M., Komatani, K., Ogata, T., Okuno, H.G., Design And Implementation Of A Robot Audition System For Automatic Speech Recognition Of Simultaneous Speech, IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, 2007, pp. 11-116.
[10] Yamamoto, S., Valin, J.-M., Nakadai, K., Rouat, J., Michaud, F., Ogata, T., Okuno, H.G., Enhanced Robot Speech Recognition Based on Microphone Array Source Separation and Missing Feature Theory, Proceedings of the 2005 IEEE International Conference on Robotics and Automation, pp. 1477-1482.
[11] Jianjun Huang, Yafei Zhang, Xiongwei Zhang, Tao Zhu: A Data Field method for speech enhancement incorporating Binary Time-Frequency Masking, Electrical Review, 2011, No. 7, pp. 225-229.
[12] Kelley, J.F., An empirical methodology for writing user-friendly natural language computer application. Proceeding of ACM SIG-CHI, 1983 Human Factors in Computing Systems. New York: ACM, pp. 193 - 196.
[13] Lee CH, On Automatic Speech Recognition at the Dawn of the 21st Century. IEICE Trans. Inf. Syst., vol. E86-D, 2003, No. 3, pp. 377-396.
[14] Maučec, M. S., Žgank, A., Speech recognition system of Slovenian broadcast news. Speech technologies. Rijeka: InTech. 2011, pp. 221-236.
[15] Lipeika A, Optimization of Formant Feature Based Speech Recognition. Informatica, 2010, Vol. 21, No. 3, pp. 361-374.
[16] Maskeliunas R, Rudzionis A, Rudzionis V., Advances on the Use of the Foreign Language Recognizer. Development of Multimodal Interfaces: Active Listening and Synchrony, Lecture Notes in Computer Science, 2010, Springer Verlag, vol. 5967, pp. 217-224.
[17] Cho Y, Yook D., Maximum Likelihood Training and Adaptation of Embedded Speech Recognizers for Mobile Environments. ETRI Journal, 2010, vol. 32, no. 1, pp. 160-162.
[18] Pyz G, Simonyte V, Slivinskas V., Modelling of Lithuanian Speech Diphthongs. Informatica, 2011, Vol. 22, No. 3, pp. 411-434.
[19] Arhar, Š., Gorjanc, V., Korpus FidaPLUS: nova generacija slovenskega referenčnega korpusa. Jezik in slovstvo 2007, 52/2., pp. 95-110.
[20] Žgank, A., Verdonik, D., Markus, A. Z., Kačič, Z., BNSI Slovenian broadcast news database - speech and text corpus, In INTERSPEECH, 2005, pp. 1537-1540.
[21] Jang C, Lee S, Jung S, Song B, Kim R, Kim R, Lee CH, OPRoS: A New Component-Based Robot Software Platform. ETRI Journal, 2010, vol. 32, no. 5, pp. 646-656.
[22] Rambow M, Rohrmüller F, Kouraks O, Brščivcić D, Wollherr D, Hirche S, Buss M, A Framework for Information Distribution, Task Execution and Decision Making in Multi-Robot Systems. IEICE Transactions on Information and Systems, 2010, Vol. E93-D, No. 6, pp. 1352-1360.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-1ec8b8e4-978c-4d03-b564-421e9085e60f