Re-speaking is a mechanism for obtaining high-quality subtitles for live broadcasts and other public events. Because it relies on humans to perform the actual re-speaking, estimating the quality of the results is non-trivial. Most organizations rely on human effort for quality assessment, but purely automatic methods have been developed for similar problems, such as Machine Translation. This paper compares several of these methods: BLEU, EBLEU, NIST, METEOR, METEOR-PL, TER, and RIBES. Their scores are then matched against the human-derived NER metric, which is commonly used in re-speaking. The purpose of this paper is to assess whether the above automatic metrics, normally used for MT system evaluation, can be used in lieu of the manual NER metric to evaluate re-speaking transcripts.
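For reference, the NER metric mentioned above is conventionally computed as a word-level accuracy score; the formula below follows the common definition from the live-subtitling literature and is given here as context rather than taken from this paper's text:

    % Common NER definition (assumed, not quoted from this paper):
    % N - number of words in the subtitles, E - edition errors, R - recognition errors
    \mathrm{NER} = \frac{N - E - R}{N} \times 100

Under this convention, a transcript is typically considered acceptable when its NER score reaches roughly 98%.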