Article title

Vowel recognition based on acoustic and visual features

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
This work presents a system intended to facilitate speech training for hearing-impaired people. The system combines acoustic and visual modules for vowel data acquisition and analysis. Acoustic feature extraction uses mel-cepstral analysis. The Active Shape Model method is used to extract visual speech features from the shape and movement of the lips. Artificial Neural Networks (ANNs) serve as the classifier; the extracted feature vectors combine both modalities of human speech. The system is validated on recordings of speakers who were not used for creating the lip model or for training the ANN. Additional experiments with degraded acoustic information test the system's robustness against various distortions affecting speech utterances.
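The abstract states that the feature vectors passed to the classifier combine both modalities. A minimal sketch of one common way to do this (feature-level fusion by concatenation) is given below. The record does not describe the authors' actual fusion scheme, so the per-dimension standardization, the function names `standardize` and `fuse`, and the toy values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardize(vectors):
    """Scale each feature dimension to zero mean and unit variance."""
    columns = list(zip(*vectors))
    means = [mean(col) for col in columns]
    # Guard against constant dimensions to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(vec, means, stds)]
        for vec in vectors
    ]


def fuse(acoustic, visual):
    """Concatenate per-utterance acoustic and visual vectors (assumed early fusion)."""
    if len(acoustic) != len(visual):
        raise ValueError("both modalities need one vector per utterance")
    return [a + v for a, v in zip(standardize(acoustic), standardize(visual))]


# Hypothetical toy data: 3 utterances, 2 acoustic and 2 visual values each.
acoustic = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
visual = [[0.5, 4.0], [0.7, 4.0], [0.9, 4.0]]
fused = fuse(acoustic, visual)
print(len(fused), len(fused[0]))
```

In this sketch each fused vector would be the input to a classifier such as an ANN. Standardizing each modality first keeps one set of features from dominating the other purely because of its numeric range.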
Authors
author
author
Bibliography
  • [1] SAENKO K., LIVESCU K., GLASS J., DARRELL T., Production domain modeling of pronunciation for visual speech recognition, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 473–476, Philadelphia, March 2005.
  • [2] VERMA A., FARUQUIE T., NETI C., BASU S., SENIOR A., Late integration in audio-visual continuous speech recognition, Proceedings of Automatic Speech Recognition and Understanding, Colorado, 12–15 December 1999.
  • [3] KOSTEK B., DALKA P., CZYŻEWSKI A., Audiovisual speech recognition for training hearing impaired patients, Nymphaio, Sept. 2005, World Scientific Publishing, 2006.
  • [4] MOTLICEK P., CERNOCKY J., Multimodal phoneme recognition of meeting data, Vol. 3206/2004, Text, Speech and Dialogue: 7th International Conference, TSD 2004, Brno, Czech Republic, September 8-11, 2004. Proceedings, Lecture Notes in Computer Science (Sojka P., Kopecek I., Pala K. [Eds.]), 2004.
  • [5] DODD B., CAMPBELL R., Hearing by eye: The psychology of lipreading, Lawrence Erlbaum Press, 1987.
  • [6] SUMMERFIELD Q., Lipreading and audio-visual speech perception, Phil. Trans. R. Soc. Lond., B 335, 71–78 (1992).
  • [7] BORGHESE N. A., FERRIGNO G., REDOLFI M., PEDOTTI A., Automatic integrated analysis of jaw and lip movement in speech production, J. Acoustical Soc. of America, 101 1, 482–487 (1997).
  • [8] KUBANEK M., Audio-visual recognition of Polish speech based on hidden Markov models [in Polish], PhD Thesis, Czestochowa University of Technology, 2005.
  • [9] FABIAN P., BADURA S., LESZCZYŃSKI M., SKARBEK W., Mouth modeling by local PCA for audio visual synchronization [in Polish], XI International AES Symposium – The Art of Sound Engineering ISSET, pp. 79–85, Kraków 2005.
  • [10] KUBANEK M., Method of speech recognition and speaker identification with use audio-visual of Polish speech and hidden Markov models, Proc. Advanced Computer Systems – Computer Information Systems and Industrial Management Applications, ACS-CISIM, Ełk 2005.
  • [11] DAVIS S. B., MERMELSTEIN P., Comparison of parametric representation for monosyllabic word recognition in continuously spoken sentences, IEEE Trans. on Acoustics, Speech & Signal Processing, 28, 4, 357–366 (1980).
  • [12] KOSTEK B., DALKA P., Combining visual and acoustic modalities to ease speech recognition by hearing impaired people, 118th Audio Engineering Society Convention, Paper No. 6462, Barcelona 2005.
  • [13] NELDER J., MEAD R., A simplex method for function optimization, Computing Journal, 7, 4, 308–313 (1965).
  • [14] COOTES T., TAYLOR C., COOPER D., GRAHAM J., Active shape models – their training and application, Computer Vision and Image Understanding, 61, 1, 38–59 (1995).
  • [15] JACKSON J., A user’s guide to principal components, John Wiley and Sons, Inc., 1991.
  • [16] LUETTIN J., THACKER N., Speechreading using probabilistic models, Computer Vision and Image Understanding, 65, 2, 163–178 (1997).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BAT3-0039-0045