Linguistic aspects of text normalization in a Polish text-to-speech system

Graliński, F.; Jassem, K.; Wagner, A.; Wypych, M.

Abstract

The paper addresses linguistic problems of text normalization for the Polish language. Text normalization, which converts the written form of a text into its spoken form, is one of the preprocessing steps in text-to-speech systems. Normalizing texts in analytic languages such as English does not necessarily require deep linguistic analysis. For synthetic languages such as Polish, however, the paper shows that linguistic analysis is crucial to the normalization process. Existing Polish text-to-speech systems, although highly rated for the naturalness of their output, do not solve the main normalization problems. The authors’ team aims to develop a text-to-speech system that includes a strong text normalization module. The idea is to design the module using linguistic resources and mechanisms developed for a machine translation system, Translatica. Progress of the research may be followed at www.poleng.pl, where the user may input a Polish source text in written form and obtain its "translation" after normalization.

Keywords

text normalization, text-to-speech system

Publisher

Oficyna Wydawnicza Politechniki Wrocławskiej

Journal

Systems Science

Year

2006

Volume

Vol. 32, no. 4

Pages

7-15

Physical description

Bibliography: 22 items.

Contributors

author

Graliński F.

author

Jassem K.

author

Wagner A.

author

Wypych M.

Adam Mickiewicz University, Faculty of Mathematics and Computer Science, Poznań, Poland, gralinski@amu.edu.pl

Bibliography

[1] Black A.W., Lenzo K.A., Building Synthetic Voices, LTI, Carnegie Mellon University, 2003.
[2] Black A., Sproat R., Chen S., Text normalization tools for the Festival speech synthesis system, 2000, http://festvox.org/nsw
[3] Black A.W., Taylor P., Caley R., The Festival Speech Synthesis System. System documentation. Edition 1.4, 1999.
[4] Dubisz S. (ed.), The Universal Dictionary of Polish, Wydawnictwo Naukowe PWN, Warszawa 2003.
[5] Jassem K., Transfer w systemie POLENG3, [in:] G. Demenko, W. Jassem, K. Jassem, M. Karpiński (eds.), Speech and Language Technology, Vol. 6, Polskie Towarzystwo Fonetyczne, Poznań, 2002.
[6] Jassem K., Graliński F., Wypych M., Statistical and Heuristic Approach to Meaning Disambiguation in POLENG MT System, [in:] Demenko G. (ed.), Analiza, synteza i rozpoznawanie mowy w lingwistyce, technice i medycynie, Polskie Towarzystwo Fonetyczne, Szczyrk 2003.
[7] Linde-Usiekniewicz J. (ed.), The Great English-Polish Dictionary, Wydawnictwo Naukowe PWN, Warszawa 2003.
[8] Marasek K., Synteza mowy: przegląd technologii i zastosowań ze szczególnym uwzględnieniem języka polskiego, Polsko-Japońska Wyższa Szkoła Technik Komputerowych, Warszawa 2003.
[9] Mikheev A., Text segmentation, [in:] R. Mitkov (ed.), The Oxford Handbook of Computational Linguistics, Oxford University Press, 2004.
[10] Oliver D., Polish Text-to-Speech Synthesis, Master Thesis, University of Edinburgh 1998.
[11] Pellom B., Hacioglu K., Recent Improvements in the CU SONIC ASR System for Noisy Speech: The SPINE Task, Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Hong Kong, April 2003.
[12] Reichel U.D., Pfitzinger H.R., Text Preprocessing for Speech Synthesis, TC-STAR Workshop on Speech-to-Speech Translation, June 19-21, 2006, Barcelona, Spain.
[13] Sproat R., Multilingual text analysis for text-to-speech synthesis, J. Natural Language Engineering, 2(4), 1996, pp. 369-380.
[14] Wypych M., An Automatic Intonation Recognizer for the Polish Language Based on Machine Learning and Expert Knowledge, Proc. Interspeech 2005, Lisbon 2005.
[15] Wypych M., Demenko G., Baranowska E., A Grapheme-to-Phoneme Transcription Algorithm Based on the SAMPA Alphabet Extension for The Polish Language, Proc. Int. Congress of Phonetic Science, Barcelona 2003.
[16] Xydas G., Karberis G., Kouroupetroglou G., Text Normalization for the Pronunciation of Non-standard Words in an Inflected Language, Proc. 3rd Hellenic Conference on Artificial Intelligence (SETN04), Samos, Greece, May 5-8, 2004.
[17] Yarowsky D., Text normalization and ambiguity resolution in speech synthesis, J. Acoustical Society of America, Vol. 94, Issue 3, 1993, p. 1841.
[18] Acapela, http://www.acapela-group.com/
[19] Ivona, http://www.ivo.pl/
[20] Realspeak, http://www.nuance.com/realspeak/
[21] Speech Synthesis Markup Language (SSML), (2004). Version 1.0. W3C Recommendation, 7 September 2004, http://www.w3.org/TR/speech-synthesis/
[22] The IMS German Festival Manual, Institute for Natural Language Processing, University of Stuttgart, 2001, available at http://www.ims.uni-stuttgart.de/phonetik/synthesis/

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT5-0027-0081