Abstract
The Bonn Open Synthesis System (BOSS) is open-source software for unit selection speech synthesis that has been used to generate high-quality German and Dutch speech. This article presents ongoing research and development aimed at adapting BOSS to the Polish language. The first section explains the origins and workings of the unit selection method for speech synthesis. Section two details the structure of the Polish corpus and its segmental and prosodic annotation. The subsequent sections focus on the implementation of Polish TTS modules in the BOSS architecture (duration prediction and cost function) and on the steps involved in preparing a new speech corpus for BOSS.
Pages
371–376
Physical description
Bibliography: 26 items, figures.
Authors
author
author
author
- Institute of Linguistics, Department of Phonetics, Adam Mickiewicz University, 4 Niepodległości Ave., 61-874 Poznań, Poland
References
- [1] A.J. Hunt and A.W. Black, “Unit selection in a concatenative speech synthesis system using a large speech database”, Proc. IEEE Int. Conf. on Acoustics and Speech Signal Processing 1, 373–376 (1996).
- [2] A.W. Black and P. Taylor, “Automatically clustering similar units for unit selection in speech synthesis”, Proc. European Conf. on Speech Communication and Technology 2, 601–604 (1997).
- [3] A.P. Breen and P. Jackson, “Non-uniform unit selection and the similarity metric within BT’s Laureate TTS system”, Proc. Third Int. Workshop on Speech Synthesis 1, 373–376 (1998).
- [4] M. Beutnagel, M. Mohri, and M. Riley, “Rapid unit selection from a large speech corpus for concatenative speech synthesis”, Proc. Eur. Conf. on Speech Communication and Technology 2, 607–610 (1999).
- [5] A. Conkie, “Robust unit selection system for speech synthesis”, Collected Papers of the 137th Meeting of the Acoustical Society of America and the 2nd Convention of the European Acoustics Association: Forum Acusticum Berlin 1, 1PSCB\10 (1999).
- [6] M. Balestri, A. Pacchiotti, S. Quazza, P.L. Salza, and S. Sandri, “Choose the best to modify the least: a new generation concatenative synthesis system”, Proc. Eur. Conf. on Speech Communication and Technology 5, 2291–2294 (1999).
- [7] N. Iwahashi and Y. Sagisaka, “Speech segment network approach for an optimal synthesis unit set”, Computer Speech and Language 9, 335–352 (1995).
- [8] B. Möbius, “Rare events and closed domains: two delicate concepts in speech synthesis”, Int. J. Speech Technology 6 (1), 57–71 (2003).
- [9] J.P.H. Santen and A.L. Buchsbaum, “Methods for optimal text selection”, Proc. Eur. Conf. on Speech Communication and Technology 2, 553–556 (1997).
- [10] ECESS: European Center of Excellence on Speech Synthesis, http://www.ecess.eu (2008).
- [11] SYNSIG: Speech Synthesis Special Interest Group of ISCA, http://www.synsig.org/index.php/Blizzard Challengeil (2008).
- [12] BOSS: The Bonn Open Synthesis System, http://www.i .unibonn.de/search?SearchableText=boss (2008).
- [13] S. Breuer, “Multifunktionale und multilinguale Unit-Selection-Sprachsynthese – Designprinzipien für Architektur und Sprachbausteine”, PhD Thesis, Universität Bonn, Bonn, 2008.
- [14] E. Klabbers, K. Stöber, R. Veldhuis, P. Wagner, and S. Breuer, “Speech synthesis development made easy”, The Bonn Open Synthesis System 1, 521–524 (2001).
- [15] K. Stöber, T. Portele, P. Wagner, and W. Hess, “Synthesis by word concatenation”, Proc. Eur. Conf. on Speech Communication and Technology 2, 619–622 (1999).
- [16] SAMPA for Polish Homepage, http://www.phon.ucl.ac.uk/home/sampa/polish.htm (2008).
- [17] W. Jassem, “Illustrations of the IPA: Polish”, J. Int. Phonetic Association 33 (1), 103–107 (2003).
- [18] G. Demenko, M. Wypych, and E. Baranowska, “Implementation of grapheme-to-phoneme rules and extended SAMPA alphabet in Polish text-to-speech synthesis”, Speech and Language Technology 7, 79–97 (2003).
- [19] M. Szymański and S. Grocholewski, “Semi-automatic segmentation of speech: manual segmentation strategy. Problem space analysis”, Advances in Soft Computing, Computer Recognition Systems 1, 747–755 (2005).
- [20] K. Sjölander and J. Beskow, Wavesurfer, http://www.speech.kth.se/wavesurfer/ (2008).
- [21] K. Klessa, M. Szymański, S. Breuer, and G. Demenko, “Optimization of Polish segmental duration prediction with CART”, 6th ISCA Workshop on Speech Synthesis (SSW-6) Proc. 1, CD-ROM (2007).
- [22] G. Demenko, J. Bachan, B. Möbius, K. Klessa, M. Szymański, and S. Grocholewski, “Development and evaluation of Polish speech corpus for unit selection speech synthesis systems”, Proc. Interspeech 2008 1, CD-ROM (2008).
- [23] S. Breuer, K. Francuzik, G. Demenko, and M. Szymański, “Analysis of Polish segmental duration with CART”, Proc. Speech Prosody Conf. 1, CD-ROM (2006).
- [24] S. Breuer and J. Abresch, “Unit selection speech synthesis for a directory enquiries service”, Proc. ICPhS Barcelona 2003 1, CD-ROM (2003).
- [25] D. Gibbon and J. Bachan, “An automatic close copy speech synthesis tool for large-scale speech corpus evaluation”, Proc. Sixth International Language Resources and Evaluation (LREC’08) 1, CD-ROM (2008).
- [26] ELDA: Evaluations and Language resources Distribution Agency, http://www.elda.org/ (2008).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPG8-0039-0003