Formant frequency estimations of whispered speech in Chinese

Lv, G.; Zhao, H.

Article details

Article title

Formant frequency estimations of whispered speech in Chinese

Authors

Lv G., Zhao H.

Abstract

Formant frequencies are important cues for characterizing whispered speech. However, it is difficult to estimate whisper formants accurately with the conventional linear predictive coding (LPC) algorithm. The main reason is that the formant bandwidths of whispered speech are wider than those of voiced speech. This gives rise to the pole interaction problem, in which one or more real roots are regarded as spurious and deleted from the original LP polynomial. To reduce the degradation caused by pole interactions, an improved root-finding formant estimation algorithm is proposed. In this algorithm, the whisper formant bandwidth is modified so that the spectral energy of the remaining formant polynomial equals that of the original LP polynomial. Experimental results on six Chinese whispered monophthong phonemes show that the formant frequencies obtained by the proposed algorithm produce a more reliable formant spectrum than an algorithm that does not take the pole interaction effect into account.
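For orientation, the conventional root-finding approach that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the baseline only, not the authors' improved algorithm: it does not perform the bandwidth modification or the spectral-energy matching described above. The function names `lpc_coefficients` and `formants_from_roots`, and the optional `max_bandwidth_hz` cutoff, are hypothetical and introduced here purely for illustration.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LP coefficients [1, a1, ..., ap] (illustrative helper)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags r[0..order] of the frame.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Solve the Toeplitz normal equations R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))


def formants_from_roots(a, fs, max_bandwidth_hz=None):
    """Formant frequencies and bandwidths (Hz) from the roots of the LP polynomial.

    max_bandwidth_hz is an assumed, optional cutoff: roots with a wider
    bandwidth are treated as spurious and discarded, as in the conventional
    approach.
    """
    roots = np.roots(a)
    # Keep one root per complex-conjugate pair; roots on the real axis are dropped.
    roots = roots[np.imag(roots) > 1e-9]
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    order = np.argsort(freqs)
    freqs, bandwidths = freqs[order], bandwidths[order]
    if max_bandwidth_hz is not None:
        keep = bandwidths <= max_bandwidth_hz
        freqs, bandwidths = freqs[keep], bandwidths[keep]
    return freqs, bandwidths
```

With a fixed bandwidth cutoff, any pole whose bandwidth exceeds the cutoff is removed, which is the kind of deletion that becomes a problem when whisper formants are broad. The cutoff value is an assumption of this sketch and is not taken from the article.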

Keywords

whispered speech, formant, linear prediction, pole interaction

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

2009

Volume

Vol. 34, No. 2

Pages

127–135

Physical description

Bibliography: 7 items, figures, tables.

Contributors

author

Lv G.

author

Zhao H.

Soochow University, School of Electronic Information, Suzhou, 215021, P.R. China, lvgang@suda.edu.cn

Bibliography

[1] Tartter V.C., What’s in a Whisper?, Journal of the Acoustical Society of America, 86, 1678–1683 (1989).
[2] Itoh T., Takeda K., Itakura F., Analysis and recognition of whispered speech, Speech Communication, 45, 139–152 (2005).
[3] Morris R.W., Clements M.A., Modification of formants in the line spectrum domain, IEEE Signal Processing Letters, 9, 19–21 (2002).
[4] Ding H., Li X.L., Xu B.L., Initial/Final Segmentation of Chinese Whispered Speech based on the Auditory Model, Acoustic Application, 23, 20–25 (2004).
[5] Li X.L., Ding H., Xu B.L., Entropy-based Initial/Final Segmentation for Chinese Whispered Speech, Acta Acustica, 30, 69–75 (2005).
[6] Kuwabara H., Ohgushi K., Contributions of vocal tract resonant frequencies and bandwidths to the personal perception of speech, Acustica, 63, 120–128 (1987).
[7] Gong C.H., Zhao H.M., Lv G., Liu J.X., Formant estimation of whispered speech based on spectral segmentation, IEEE International Symposium on Signal Processing and Information Technology, 562–566 (2006).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT8-0014-0037