In this paper we present an application of automatic speech recognition (ASR) technology in the area of media monitoring. We describe the computational models and methods behind two ASR approaches, namely a Hidden Markov Model with a Gaussian Mixture Model and Deep Neural Networks, both of which were crucial to the development of ASR. Both approaches were implemented in ARM-1, our speech recognition engine developed for the Polish language. We detail our implementation choices, in particular the adjustments made for the media monitoring application, which were guided by the characteristics of media content. The performance of both versions of the engine is evaluated and compared.