This paper addresses the problem of automatic audio content identification. To determine regions of speech, music, and silence in an audio stream, feature contours are fused with their envelopes. In addition, a voicing detector and a four-class music genre identification stage are incorporated into the classification system. To minimize errors at the boundaries between audio regions, a smoothed envelope of the feature contours is proposed. Experimental results show that the proposed scheme achieves acceptable classification rates for audio data segmentation. As a result, the approach can be applied to content-type-dependent multimedia processing.
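
The abstract does not specify which features or which smoothing are used, so the following is only a minimal illustrative sketch of the general idea of thresholding a smoothed envelope of a feature contour. It assumes short-time energy as the feature, a centered moving average as the envelope, and a fixed threshold; the function names, frame length, window width, and threshold value are all hypothetical and are not taken from the paper. The sketch covers only a silence versus non-silence decision, not the voicing detector or the genre stage.

```python
import math


def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (the feature contour)."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def smooth_envelope(contour, width):
    """Centered moving average of the contour; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(contour)):
        lo = max(0, i - half)
        hi = min(len(contour), i + half + 1)
        out.append(sum(contour[lo:hi]) / (hi - lo))
    return out


def label_frames(values, threshold):
    """Label each frame 'silence' if its value is below the threshold, else 'active'."""
    return ["silence" if v < threshold else "active" for v in values]


# Synthetic test signal at 8 kHz (an assumption made for illustration only):
# silence with one isolated 50 ms burst, then 0.5 s of a 440 Hz tone, then silence.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 2)]
signal = [0.0] * 1200 + tone[:400] + [0.0] * 2400 + tone + [0.0] * 4000

energy = short_time_energy(signal, frame_len=400)    # 50 ms frames, 30 in total
envelope = smooth_envelope(energy, width=3)          # hypothetical window width

raw_labels = label_frames(energy, threshold=0.06)    # decision on the raw contour
smooth_labels = label_frames(envelope, threshold=0.06)  # decision on the envelope
```

In this toy signal the isolated burst in frame 3 is marked as active when the raw contour is thresholded, but it is absorbed when the smoothed envelope is thresholded, while the boundaries of the longer tone region (frames 10 to 19) stay in place. A real system would need features and thresholds tuned on actual speech and music data.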