Automatic prosodic modification in a Text-To-Speech synthesizer of Polish language

Łopatka, K.; Suchomski, P.; Czyżewski, A.

Article details

Article title

Automatic prosodic modification in a Text-To-Speech synthesizer of Polish language

Authors

Łopatka K., Suchomski P., Czyżewski A.

Title variants

Automatyczna modyfikacja prozodii w syntetyzerze mowy języka polskiego

Abstracts

A Text-To-Speech synthesizer for the Polish language with automatic prosodic modification is presented. Methods for automatic determination of accent and intonation are introduced. The application of prosodic speech processing algorithms to Text-To-Speech synthesis is described, and the impact of these modifications on the naturalness of the synthesized signal is discussed. The applied method is based on the TD-PSOLA algorithm. The developed synthesizer is used in applications employing multimodal computer interfaces.

Przedstawiono system syntezy mowy polskiej z funkcją automatycznej modyfikacji prozodii wypowiedzi. Opisane zostały metody automatycznego wyznaczania akcentu i intonacji wypowiedzi. Przedstawiono zastosowanie algorytmów przetwarzania sygnału mowy w procesie kształtowania prozodii. Omówiono wpływ zastosowanych modyfikacji na naturalność brzmienia syntezowanego sygnału. Zastosowana metoda oparta jest na algorytmie TD-PSOLA. Opracowany system syntezy mowy znajduje zastosowanie w aplikacjach wykorzystujących multimodalne interfejsy komputerowe.

Keywords

speech synthesis, diphone, PSOLA

synteza mowy, difon, PSOLA

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2011

Volume

Vol. 52, no. 5

Pages

106-110

Physical description

Bibliography of 10 items, charts.

Contributors

author

Łopatka K.

author

Suchomski P.

author

Czyżewski A.

Gdańsk University of Technology, Faculty of Electronics, Telecommunication and Informatics, Multimedia Systems Department, Gdańsk

Bibliography

[1] Kunka B., Kostek B.: Non-intrusive infrared-free eye tracking method, Signal Processing Algorithms, Architectures, Arrangements and Applications SPA 2009, 105-109, 24-26.09.2009.
[2] Johnson M. E.: Synthesis of English Intonation using explicit models of reading and spontaneous speech, Fourth Int. Conf. on Spoken Language, 3, 1844-1847, Philadelphia, 3-6.10.1996.
[3] Politis D.: Prosodic enhancements for a musical object oriented formant synthesizer, Electrotech. Conf. MELECON, 2, 702-705, 2000.
[4] Hammer F.: Time-scale modification using the phase vocoder. Institute for Electr. Music and Ac., Graz Univ. of Dramatic Arts Graz 2001.
[5] Laroche J., Dolson M.: Improved phase vocoder time-scale modification of audio, IEEE Trans. on Speech and Aud. Proc., 7, 3, New York, 05.1999.
[6] Cabral J., Oliveira L.: Pitch-Synchronous Time-Scaling for Prosodic and Voice Quality Transformations, 9th Conf. of the International Speech Communication Association, INTERSPEECH Lisboa 2005.
[7] Hamon C., Moulines E., Charpentier F.: A diphone synthesis system based on time-domain prosodic modifications of speech, 1989 IEEE Int. Conf. on Acoust., Speech and Sign. Proc., 238-241, 23-26.05.1989.
[8] Moulines E., Charpentier F.: Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Communication, 453-467, North-Holland, 1.08.1990.
[9] Dutoit T.: High quality text-to-speech synthesis of the French language, 125-135, Mons 1993.
[10] Czyżewski A., Łopatka K., Kunka B., Rybacki R., Kostek B.: Speech synthesis controlled by eye gazing, 129th Convention of the AES, San Francisco, 4-7.11.2010.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BWAK-0024-0020