Analysis of State-Space Model based Voice Conversion

Sun, J.; Zhang, X.

Article - details

Article title

Analysis of State-Space Model based Voice Conversion

Authors

Sun J., Zhang X.

Selected full texts from this journal

http://pe.org.pl/

Identifiers

Title variants

Model SSM przetwarzania sygnałów głosowych (SSM model for voice signal processing)

Publication languages

Abstracts

A new State-Space Model (SSM) based voice conversion method was recently proposed that outperforms the traditional Gaussian Mixture Model (GMM) method. Although the implementation of the new method has been described in detail, its theoretical essence has not been analysed clearly. This paper gives an exhaustive theoretical and experimental analysis of the SSM-based method. From this analysis, a much simpler equivalent form and a performance upper bound of the new method are obtained. Finally, possible improvements are discussed.

A theoretical and experimental analysis of the new SSM algorithm for speech signal processing is presented.

Keywords

electrical technology; electrical power engineering; voice conversion; state-space model; SSM; linear multivariate regression; LMR; analysis

electrical engineering; electrical power engineering; voice processing; SSM model

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2011

Volume

Vol. 87, No. 10

Pages

373-376

Physical description

Bibliography: 16 items; illustrations, tables, charts.

Contributors

author

Sun J.

author

Zhang X.

Institute of Communications Engineering, Biaoyin 2, Yudao Street, Nanjing, China, 210007, sunjian001@gmail.com

Bibliography

[1] Stylianou Y., (2008). Voice Transformation: A survey, Springer Handbook of Speech Processing, ICASSP, (2009), 3585-3588
[2] Abe M., Nakamura S., Shikano K., Kuwabara H., Voice conversion through vector quantization, ICASSP, (1988), 655-658
[3] Stylianou Y., Cappe O., Moulines E., Continuous probabilistic transform for voice conversion, Speech and Audio Processing, IEEE Transactions on, 6(1998), No. 2, 131-142
[4] Shuang Z., Bakis R., Qin Y., IBM Voice Conversion Systems for 2007 TC-STAR Evaluation, Tsinghua Science & Technology, 13(2008), No. 4, 510-514
[5] Erro D., Moreno A., Bonafonte A., Voice Conversion based on Weighted Frequency Warping, Audio, Speech, and Language Processing, IEEE Transactions on, 18(2010), No. 5, 922-931
[6] Narendranath M., Murthy H. A., Rajendran S., Yegnanarayana B., Transformation of formants for voice conversion using artificial neural networks, Speech Commun., 16(1995), No. 2, 207-216
[7] Xu N., Yang Z., Zhang L. H., Zhu W. P., Bao J. Y., Voice conversion based on state-space model for modelling spectral trajectory, Electronics Letters, 45(2009), No. 14, 763-764
[8] Popa V., Nurminen J., Gabbouj M., A Novel Technique for Voice Conversion Based on Style and Content Decomposition with Bilinear Models, Proc. of the 10th Annual Conference of the International Speech Communication Association, (2009), 2655-2658
[9] Roweis S., Ghahramani Z., A unifying review of linear Gaussian models, Neural Comput., 11(1999), No. 2, 305-345
[10] Tanizaki H., Nonlinear Filters: Estimation and Applications (2nd edn.), Berlin: Springer-Verlag, (1996)
[11] Ljung L., System identification: theory for the user (2nd edn.), Upper Saddle River, NJ: Prentice Hall PTR, (1999)
[12] Kawahara H., Masuda-Katsuse I., de Cheveigné A., Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, Speech Communication, 27(1999), No. 3-4, 187-207
[13] Valbret H., Moulines E., Tubach J. P., Voice transformation using PSOLA technique, Speech Communication, 11(1992), No. 2-3, 175-187
[14] Ben-Israel A., Greville T. N. E., Generalized inverses: theory and applications (2nd edn.), New York: Springer, (2003)
[15] Kominek J., Black A. W., CMU ARCTIC databases for speech synthesis, 5th ISCA Speech Synthesis Workshop, (2003), 223-224
[16] Xydeas C. S., Papanastasiou C., Split matrix quantization of LPC parameters, Speech and Audio Processing, IEEE Transactions on, 7(1999), No. 2, 113-125

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA7-0054-0011