Allophones in automatic speech recognition

Kozierski, P.; Sadalla, T.; Dąbrowski, A.; Giernacki, W.

Article details

Article title

Allophones in automatic speech recognition

Authors

Kozierski P., Sadalla T., Dąbrowski A., Giernacki W.

Selected full texts from this journal

http://sait.cie.put.poznan.pl/

Identifiers

Title variants

Alofony w automatycznym rozpoznawaniu mowy

Publication languages

Abstracts

The common approach to speech recognition uses phonemes as the basic units of speech. The authors propose using allophones instead. For the rarest allophones, conversion into other allophones is proposed, with four methods for selecting which phones to convert. The results indicate that the additional information carried by allophonic notation cannot be used effectively without modifying currently used algorithms.

The typical approach to speech recognition treats phonemes as the basic units of speech. The authors instead propose using allophones. For the least frequent allophones, replacement with other allophones is proposed, along with four methods for choosing the phones to replace. Based on the results obtained, the effective use of the additional information carried by allophones will not be possible without modifying currently available algorithms.

Keywords

allophones; automatic speech recognition

alofony; rozpoznawanie mowy automatyczne

Publisher

Poznańskie Towarzystwo Przyjaciół Nauk

Journal

Studia z Automatyki i Informatyki

Year

2016

Volume

Vol. 41

Pages

47-53

Physical description

Bibliography: 21 items, figures, tables.

Contributors

author

Kozierski P.

piotr.kozierski@gmail.com

Poznan University of Technology, Faculty of Computing, Chair of Control and Systems Engineering, Division of Signal Processing and Electronic Systems
Poznan University of Technology, Faculty of Electrical Engineering, Institute of Control and Information Engineering, Piotrowo 3a street, 60-965 Poznan

author

Sadalla T.

talar.h.sadalla@doctorate.put.poznan.pl

Poznan University of Technology, Faculty of Electrical Engineering, Institute of Control and Information Engineering, Piotrowo 3a street, 60-965 Poznan

author

Dąbrowski A.

adam.dabrowski@put.poznan.pl

Poznan University of Technology, Faculty of Computing, Chair of Control and Systems Engineering, Division of Signal Processing and Electronic Systems, Piotrowo 3a street, 60-965 Poznan

author

Giernacki W.

wojciech.giernacki@put.poznan.pl

Poznan University of Technology, Faculty of Electrical Engineering, Institute of Control and Information Engineering, Piotrowo 3a street, 60-965 Poznan

Bibliography

[1] Polish language - pronunciation - phones (in Polish). https://pl.wiktionary.org/wiki/Aneks:J%C4%99zyk_polski_-_wymowa_-_g%C5%82oski. Accessed 11 January 2017.
[2] C. Allauzen, M. Riley, J. Schalkwyk, W. Skut, and M. Mohri. OpenFst: A general and efficient weighted finite-state transducer library. In Implementation and Application of Automata, pages 11-23. Springer Berlin Heidelberg, 2007.
[3] M. Bisani and H. Ney. Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication, 50(5):434-451, 2008.
[4] G. Demenko, M. Wypych, and E. Baranowska. Implementation of grapheme-to-phoneme rules and extended SAMPA alphabet in Polish text-to-speech synthesis. Speech and Language Technology, 7:79-97, 2003.
[5] P. Żelasko, B. Ziółko, T. Jadczyk, and D. Skurzok. AGH corpus of Polish speech. Language Resources and Evaluation, 50(3):585-601, 2015. DOI: 10.1007/s10579-015-9302-y.
[6] A. Karpov, K. Markov, I. Kipyatkova, D. Vazhenina, and A. Ronzhin. Large vocabulary Russian speech recognition using syntactico-statistical language modeling. Speech Communication, 56:213-228, 2014.
[7] P. Kłosowski. Improving speech processing based on phonetics and phonology of Polish language. Przegląd Elektrotechniczny, 89(8):303-307, 2013.
[8] P. Kozierski, T. Sadalla, A. Dąbrowski, and D. Horla. Program for Polish whispery speech corpus creation (in Polish). Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska (IAPGOŚ), accepted.
[9] P. Kozierski, T. Sadalla, S. Drgas, and A. Dąbrowski. Allophones in automatic whispery speech recognition. In Methods and Models in Automation and Robotics (MMAR), 21st International Conference on, pages 811-815, Międzyzdroje, 2016. DOI: 10.1109/MMAR.2016.7575241.
[10] Y. Mittal, P. Toshniwal, S. Sharma, D. Singhal, R. Gupta, and V. K. Mittal. A voice-controlled multi-functional smart home automation system. In 12th IEEE India International Conference (INDICON), pages 1-6, New Delhi, December 2015.
[11] S. Pigeon, C. Swail, E. Geoffrois, G. Bruckner, D. V. Leeuwen, C. Teixeira, et al. Use of Speech and Language Technology in Military Environments. North Atlantic Treaty Organization, Montreal, Canada, 2005.
[12] B. Plannerer. An Introduction to Speech Recognition. Munich, Germany, March 2005.
[13] F. Portet, M. Vacher, C. Golanski, C. Roux, and B. Meillon. Design and evaluation of a smart home voice interface for the elderly: Acceptability and objection aspects. Personal and Ubiquitous Computing, 17(1):127-144, 2013.
[14] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, et al. The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011. No. EPFL-CONF-192584.
[15] H. R. Sharifzadeh, I. V. McLoughlin, and F. Ahmadi. Reconstruction of normal sounding speech for laryngectomy patients through a modified CELP codec. Biomedical Engineering, IEEE Transactions on, 57(10):2448-2458, 2010.
[16] A. Stolcke. SRILM - an extensible language modeling toolkit. In Proc. Intl. Conf. Spoken Language Processing (INTERSPEECH), Denver, Colorado, September 2002.
[17] D. F. Syu, S. W. Syu, S. J. Ruan, Y. C. Huang, and C. K. Yang. FPGA implementation of automatic speech recognition system in a car environment. In 2015 IEEE 4th Global Conference on Consumer Electronics (GCCE), pages 485-486, October 2015.
[18] K. Szostek. HMM models optimization and their use in speech recognition (in Polish). Elektrotechnika i Elektronika, 24(2):172-182, 2005.
[19] S. Wakisaka, K. Ishiwatari, K. Ito, T. Toge, and M. Tanaka. Recognition Dictionary System Structure and Changeover Method of Speech Recognition System for Car Navigation. Washington, 2000. U.S. Patent No. 6,112,174.
[20] S. Wydra. The use of mixed parameterization in Polish speech recognition system (in Polish). In National Conference on Radiocommunication, Radiophony and Television (KKRRiT), Poznań, 2006.
[21] M. Wypych, E. Baranowska, and G. Demenko. A grapheme-to-phoneme transcription algorithm based on the SAMPA alphabet extension for the Polish language. In Phonetic Sciences, 15th International Congress of (ICPhS), pages 2601-2604, Barcelona, August 2003.

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities (2017 tasks).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-87cc95c4-e8ee-4807-8d5b-bc5131e5eb3b