Analysis of signal of audio speech in process of speech recognition

Kubanek, M.

Article details

Article title

Analysis of signal of audio speech in process of speech recognition

Authors

Kubanek M.

Abstract

The purpose of this work is to explain the theoretical issues and implementation techniques related to the fascinating field of speech recognition. The discussion focuses on some of the well-established and widely used speech coding standards required for speech recognition and speaker identification. By studying the most successful standards and understanding their principles, performance, and limitations, it is possible to apply a particular technique to a given situation according to the underlying constraints, with the ultimate goal of developing next-generation algorithms that improve on all of these aspects. The paper presents the author's own methods for determining the beginning and end of isolated words in audio speech. Cepstral speech analysis is applied to extract the audio features of a person's speech. Finally, the paper presents the results of speech coding.

Keywords

speech recognition, speaker identification, audio speech, audio feature extraction, speech coding

Publisher

Wydawnictwo Politechniki Częstochowskiej

Journal

Computing, Multimedia and Intelligent Techniques

Year

2006

Volume

Vol. 2, no. 1

Pages

55-64

Physical description

Bibliography: 13 items, figures, tables.

Contributors

Author

Kubanek M.

Czestochowa University of Technology, Institute of Computer and Information Sciences, ul. Dąbrowskiego 73, 42-200 Częstochowa, Poland, mariusz.kubanek@icis.pcz.pl

Bibliography

[1] Y. Aydin, H. Nakajima, Realistic articulated character positioning and balance control in interactive environments, Proceedings Computer Animation 1999, 160-168, 1999.
[2] C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, J. Zhou, Audio Visual Speech-Recognition, 2000 Final Report, 2000.
[3] W. C. Chu, Speech Coding Algorithms: Foundation and Evolution of Standardized Coders, John Wiley & Sons, Inc., New Jersey 2003.
[4] C. Basztura, Rozmawiać z komputerem, Wydawnictwo Prac Naukowych FORMAT, Wrocław 1992.
[5] R. G. Lyons, Wprowadzenie do cyfrowego przetwarzania sygnałów, WKiŁ, Warszawa 1999.
[6] M. Kubanek, Method of Speech recognition and Speaker Identification with Use Audio-Visual of Polish Speech and Hidden Markov Models, Image Analysis, Computer Graphics, Security Systems and Artificial Intelligence Applications, 1, 1, 353-364, 2005.
[7] L. Rabiner, B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall Signal Processing Series, 1993.
[8] A. M. Wiśniewski, Niejawne modele Markowa w rozpoznawaniu mowy, Biuletyn IAiR WAT, 7, 1997.
[9] M. N. Kaynak, Q. Zhi, A. D. Cheok, K. Sengupta, K. C. Chung, Audio-Visual Modeling for Bimodal Speech Recognition, Proc. 2001 International Fuzzy Systems Conference, 2001.
[10] B. P. Bogert, M. J. R. Healy, J. W. Tukey, The Frequency Analysis of Time-Series for Echoes, Proc. Symp. Time Series Analysis, 209-243, 1963.
[11] A. Wahab, G. See Ng, R. Dickiyanto, Speaker Verification System Based on Human Auditory and Fuzzy Neural Network System, Neurocomputing Manuscript Draft, Singapore.
[12] K. Sayood, Kompresja danych - wprowadzenie, Wydawnictwo RM, Warszawa 2002.
[13] M. Kubanek, Technique of Video Features Extraction for Audio-video Speech Recognition System, Computing, Multimedia and Intelligent Techniques, 1, 1, 181-190, 2005.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPC1-0001-0052