Article title

Analysis of signal of audio speech in process of speech recognition

Abstract
The purpose of this work is to explain the theoretical issues and implementation techniques related to the fascinating field of speech recognition. The discussion focuses on some of the well-established and widely used speech coding standards required for speech recognition and speaker identification. By studying the most successful standards and understanding their principles, performance, and limitations, it is possible to apply a particular technique to a given situation according to the underlying constraints, with the ultimate goal of developing next-generation algorithms that improve on all aspects. This document presents original methods for determining the beginning and end of isolated words in audio speech. Cepstral speech analysis is applied to extract the audio features of a person's speech. Finally, the paper shows the results of speech coding.
Physical description
Bibliography: 13 items, figures, tables.
  • Czestochowa University of Technology, Institute of Computer and Information Sciences, ul. Dąbrowskiego 73, 42-200 Częstochowa, Poland,
  • [1] Y. Aydin, H. Nakajama, Realistic articulated character positioning and balance control in interactive environments, Proceedings Computer Animation 1999, 160-168, 1999.
  • [2] C. Neti, G. Potamianos, J. Luttin, I. Mattews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, and J. Zhou, Audio Visual Speech-Recognition, 2000 Final Report, 2000.
  • [3] C. Chu Wai: Speech coding algorithms. Foundation and Evolution of Standardized Coders, A John Wiley & Sons, INC, New Jersey 2003.
  • [4] C. Basztura: Rozmawiać z komputerem, Wydawnictwo Prac Naukowych FORMAT, Wrocław 1992.
  • [5] R. G. Lyons, Wprowadzenie do cyfrowego przetwarzania sygnałów, WKiL, Warszawa 1999.
  • [6] M. Kubanek, Method of Speech recognition and Speaker Identification with Use Audio-Visual of Polish Speech and Hidden Markov Models, Image Analysis, Computer Graphics, Security Systems and Artificial Intelligence Applications, 1, 1, 353-364, 2005.
  • [7] L. Rabiner, B. H. Juang, Fundamentals of Speech Recognition, Prentice Hall Signal Processing Series, 1993.
  • [8] A. M. Wiśniewski, Niejawne modele Markowa w rozpoznawaniu mowy, Biuletyn IAiR WAT, 7, 1997.
  • [9] M. N. N. Kaynak, Q. ZHP, A. D. Check, K. Sengupta and K. C. Chung, Audio-Visual Modeling for Bimodal Speech Recognition, Proc. 2001 International Fuzzy Systems Conference, 2001.
  • [10] B. P. Bogert, M. J. R. Healy, J. W. Tukey, The Frequency Analysis of Time-Series for Echoes, Proc. Symp. Time Series Analysis, 209-243, 1963.
  • [11] A. Wahab, G. See Ng, R. Dickiyanto, Speaker Verification System Based on Human Auditory and Fuzzy Neural Network System, Neurocomputing Manuscript Draft, Singapore.
  • [12] K. Sayood, Kompresja danych - wprowadzenie, Wydawnictwo RM, Warszawa 2002.
  • [13] M. Kubanek, Technique of Video Features Extraction for Audio-video Speach Recognition System, Computing, Multimedia and Intelligent Techniques, 1, 1, 181-190, 2005.