This document presents a novel approach that uses sources of information orthogonal to the acoustic input, which not only considerably improve performance in severely degraded conditions but are also independent of the type of noise and reverberation. Visual speech is one such source, as it is not perturbed by the acoustic environment or noise. We propose our own lip-tracking approach for an audio-visual speech recognition system, together with a novel audio-visual fusion technique. Video analysis of visual speech is presented for extracting visual features from a talking person in colour video sequences. A method was developed for the automatic detection of the face, the eyes, the lip region, the lip-corner regions, and the lip contour. Finally, the paper presents results of audio-visual speech recognition in noisy environments.
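The abstract does not specify how the audio and visual streams are combined. A common scheme for this kind of system is stream-weighted late fusion, where per-word log-likelihoods from the two recognizers are mixed with a weight reflecting the acoustic conditions. The sketch below is illustrative only: the word list, likelihood values, and the weight are assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical per-word log-likelihoods from independent audio and
# visual classifiers (values are illustrative, not from the paper).
words = ["zero", "jeden", "dwa"]
log_p_audio = np.array([-12.0, -9.5, -11.0])   # degraded by noise
log_p_visual = np.array([-8.0, -10.0, -7.5])   # unaffected by noise

def fuse(log_p_a, log_p_v, gamma):
    """Stream-weighted late fusion: gamma weights the audio stream."""
    return gamma * log_p_a + (1.0 - gamma) * log_p_v

# In clean audio gamma would be close to 1; in heavy noise it is
# lowered so that the visual stream dominates the decision.
scores = fuse(log_p_audio, log_p_visual, gamma=0.3)
recognized = words[int(np.argmax(scores))]  # -> "dwa" for these toy values
```

Making `gamma` depend on an estimate of the signal-to-noise ratio is one way such a fusion rule can stay robust across noise types, which is the property the abstract emphasizes.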
The following paper describes a novel lip-reading method developed for the purpose of isolated word recognition. The method is based on the concept of a discriminative deformable model, an image analysis method derived from the deformable grid paradigm. The discriminative deformable model is used to characterize the lip shape in each frame of the video sequence. The information extracted from consecutive frames is then analyzed using Hidden Markov Models. The proposed visual speech recognition method is evaluated on a Polish digit recognition task.
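The recognition step described above follows the standard isolated-word HMM scheme: one model per word is trained on sequences of lip-shape features, and a test sequence is assigned to the word whose model gives the highest likelihood, computed with the forward algorithm. The sketch below shows this decision rule with hand-set toy parameters; the word names, state counts, and quantized lip-shape codebook are assumptions for illustration, not the paper's actual models.

```python
import numpy as np

def log_forward(log_pi, log_A, log_B, obs):
    """Log-domain forward algorithm: returns log P(obs | model)."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # log-sum-exp over previous states for each current state
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def make_model(A, B, pi=(1.0, 0.0)):
    """Build log-parameters; eps guards log(0) for impossible events."""
    eps = 1e-12
    return (np.log(np.array(pi) + eps),
            np.log(np.array(A) + eps),
            np.log(np.array(B) + eps))

# Two hypothetical 2-state left-to-right word models over 3 quantized
# lip-shape classes (all parameters illustrative, not from the paper).
models = {
    "jeden": make_model(A=[[0.7, 0.3], [0.0, 1.0]],
                        B=[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "dwa":   make_model(A=[[0.7, 0.3], [0.0, 1.0]],
                        B=[[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}

obs = [0, 0, 1, 1]  # per-frame quantized lip-shape indices of an utterance
scores = {w: log_forward(*m, obs) for w, m in models.items()}
recognized = max(scores, key=scores.get)  # word with max log-likelihood
```

In a real system the per-frame observations would be the deformable-model shape parameters (typically with Gaussian-mixture emissions rather than a discrete codebook), and the per-word HMMs would be trained with Baum-Welch on labelled utterances.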