Method of audio-visual polish speech recognition with use hidden Markov Models

Kubanek, M.

Article details

Article title

Method of audio-visual polish speech recognition with use hidden Markov Models

Authors

Kubanek M.

Abstract

This paper presents a novel approach that supplements the acoustic input with an orthogonal source of information, which not only considerably improves performance in severely degraded conditions but is also independent of the type of noise and reverberation. Visual speech is one such source, as it is not perturbed by the acoustic environment or noise. The paper proposes an original approach to lip tracking for an audio-visual speech recognition system and a novel audio-visual fusion technique. It presents a video analysis of visual speech for extracting visual features from a talking person in color video sequences. A method was developed for the automatic detection of the face, the eyes, the lip region, the lip corners, and the lip contour. Finally, the paper shows results of audio-visual speech recognition in noisy environments.

Keywords

lip-reading, lip-tracking, visual speech, visual feature extraction

Publisher

Wydawnictwo Politechniki Częstochowskiej

Journal

Computing, Multimedia and Intelligent Techniques

Year

2007

Volume

Vol. 3, no. 1

Pages

87-99

Physical description

Bibliography: 15 items, figures, tables.

Contributors

author

Kubanek M.

Czestochowa University of Technology, Institute of Computer and Information Sciences, ul. Dąbrowskiego 73, 42-200 Częstochowa, Poland, mariusz.kubanek@icis.pcz.pl

Bibliography

[1] Aydin Y., Nakajima M., Realistic articulated character positioning and balance control in interactive environments, Proceedings Computer Animation 1999, 1999, 160-168.
[2] Herda L., Fua P., Plänkers R., Boulic R., Thalmann D., Skeleton-based motion capture for robust reconstruction of human motion, Proceedings Computer Animation 2000, 2000, 77-83.
[3] Neti C., Potamianos G., Luettin J., Matthews I., Glotin H., Vergyri D., Sison J., Mashari A., Zhou J., Audio Visual Speech-Recognition, 2000 Final Report, 2000.
[4] Zhi Q., Kaynak M. N. N., Sengupta K., Cheok A. D., Ko C. C., A study of the modeling aspects in bimodal speech recognition, Proc. 2001 IEEE International Conference on Multimedia and Expo (ICME2001), 2001.
[5] Jian Z., Kaynak M. N. N., Cheok A. D., Chung K. C., Real-time Lip-tracking For Virtual Lip Implementation in Virtual Environments and Computer Games, Proc. 2001 International Fuzzy Systems Conference, 2001.
[6] Kubanek M., Technique of Video Features Extraction for Audio-video Speach Recognition System, Computing, Multimedia and Intelligent Techniques, 2005, 1, 1, 181-190.
[7] Rabiner L., Juang B. H., Fundamentals of Speech Recognition, Prentice Hall Signal Processing Series, 1993.
[8] Kaynak M. N. N., Zhi Q., Cheok A. D., Sengupta K., Chung K. C., Audio-Visual Modelling for Bimodal Speech Recognition, Proc. 2001 International Fuzzy Systems Conference, 2001.
[9] Bogert B. P., Healy M. J. R., Tukey J. W., The Quefrency Alanysis of Time Series for Echoes, Proc. Symp. Time Series Analysis, 1963, 209-243.
[10] Wahab A., See Ng G., Dickiyanto R., Speaker Verification System Based on Human Auditory and Fuzzy Neural Network System, Neurocomputing Manuscript Draft, Singapore.
[11] Kukharev G., Kuzminski A., Techniki biometryczne. Część 1: Metody rozpoznawania twarzy, Wydział Informatyki. Politechnika Szczecińska, 2003.
[12] Kubanek M., Method of Speech recognition and Speaker Identification with Use Audio-Visual of Polish Speech and Hidden Markov Models, Image Analysis, Computer Graphics, Security Systems and Artificial Intelligence Applications, 2005, 1, 1, 353-364.
[13] Kubanek M., Method of edge EDGE to extraction of features of image of mouth in technique of integrated recognizing of speech audio-video, Information Sciences, Publisher of Czestochowa University of Technology, 2003, 4, 115-125.
[14] Kaucic R., Dalton B., Blake A., Real-Time Lip Tracking for Audio-Visual Speech Recognition Applications, In Proc. European Conf. Computer Vision, Cambridge, UK, 1996, 376-387.
[15] Summerfield Q., MacLeod A., McGrath M., Brooke M., Lips, teeth and the benefits of lipreading, In A. W. Young and H. D. Ellis, editors, Handbook of Research on Face Processing, Elsevier Science Publishers, 1989, 223-233.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPC1-0001-0046