A preliminary study on Greek esophageal speech and a method for quality and intelligibility enhancement

Pastiadis, C.; Papanikolaou, G.

Abstract

The present work is a preliminary study of Greek esophageal speech, concerned mainly with major features such as pitch, formant frequencies, and speech power envelopes. The application to esophageal speech of various well-known techniques for normal voice analysis is reviewed. An improved method for resynthesizing voiced sounds (such as vowels or nasal consonants) by convolving an ARMA estimate of the speaker's vocal tract impulse response with a periodic glottal waveform is proposed as a tool for voice quality enhancement. Fundamental frequency values were found to be close to those reported in previous work. No alterations of the F1 and F2 formants due to laryngectomy were detected relative to normal speech values. However, speech power envelopes tended to be flatter at higher stages of speaker training. The proposed enhancement method was able to preserve speaker characteristics and to provide cues for higher-quality reproduction of vowels as well as nasals.

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

1999

Volume

Vol. 24, no. 1

Pages

25-38

Physical description

Bibliography: 24 items, figures, tables, charts.

Contributors

author

Pastiadis C.

Laboratory of Electroacoustics, Aristotle University of Thessaloniki, Greece

author

Papanikolaou G.

Laboratory of Electroacoustics, Aristotle University of Thessaloniki, Greece

Bibliography

[1] N. Bi, Y. Qi, Alaryngeal speech enhancement based on spectral substitution, ASA 127th Meeting M.I.T. 1994 June 6-10.
[2] N. Bi, Y. Qi, Application of speech conversion to alaryngeal speech enhancement, IEEE Trans. Speech & Audio Processing, 5, 2, 97-105 (1997).
[3] Y.M. Cheng, D. O’Shaughnessy, Automatic and reliable estimation of glottal closure instant and period, IEEE Trans. ASSP, 37, 12, 1805-1815, December 1989.
[4] P. Cook, Identification of control parameters in an articulatory vocal tract model with applications to the synthesis of singing, Ph.D. Thesis, Stanford Univ., 1991.
[5] J. Flanagan, Speech analysis, synthesis and perception, Springer Verlag, 1972.
[6] S. Furui, Digital speech processing, synthesis and recognition, Marcel Dekker, Inc., 1989.
[7] S. Furui, M. Mohan Sondhi, Advances in speech signal processing, Marcel Dekker, Inc., 1992.
[8] A. Iivonen, Articulatory vowel gesture presented in a psychoacoustical F1/F2-space, Studies in Logopedics and Phonetics, Univ. of Helsinki, 3, 19-44 (1992).
[9] H. Jarkin, M. Galler, N. Niedzielski, Enhancement of esophageal speech by injection noise rejection, Proc. ICASSP’97, Munich, 1997.
[10] S. Kay, Modern spectral estimation, Prentice-Hall Signal Processing Series, 1988.
[11] A. Leinonen, Intonational patterns and voice quality in esophageal speech, Studies in Logopedics and Phonetics, Univ. of Helsinki, 3, 151-159 (1992).
[12] K. Mourhos, Phonetic rehabilitation of alaryngeal people, 4th European Interuniversity Symposium of “Head and Neck Cancer: Improvements of Logoregional Control”, Thessaloniki 1994.
[13] A. Oppenheim, R. Schafer, Discrete-time signal processing, Prentice-Hall International, 1989.
[14] C. Pastiadis, G. Papanikolaou, A preliminary study on Greek esophageal speech and a method for voice quality enhancement, AES 102nd Convention Preprint, Munich, March 1997.
[15] Y. Qi, R. Fox, Analysis of nasal consonants using perceptual linear prediction, Journal of the Acoustical Society of America, 91, 3, 1718-1726 (1992).
[16] Y. Qi, Replacing tracheoesophageal voicing sources using LPC synthesis, Journal of the Acoustical Society of America, 88, 3, 1228-1235 (1990).
[17] Y. Qi, B. Weinberg, N. Bi, Enhancement of female esophageal and tracheoesophageal speech, Journal of the Acoustical Society of America, 98, 5, 2461-2465, 1995.
[18] Y. Qi, B. Weinberg, Characteristics of voicing source waveforms produced by esophageal and tracheoesophageal speakers, Journal of Speech & Hearing Research, 38, 536-548 (1995).
[19] J. Robbins, H. Fisher, F. Blom, M. Singer, A comparative study of normal, esophageal and tracheoesophageal speech production, Journal of Speech and Hearing Disorders, 49, 202-210 (1984).
[20] P. Rubin, L. Goldstein, Articulatory synthesis (ASY), Haskins Laboratories.
[21] M. Mohan Sondhi, New methods of pitch extraction, IEEE Trans. Audio and Electroacoustics, 18, 16, 262-266 (1968).
[22] I. Titze, H. Liang, Comparison of F0 extraction methods for high-precision voice perturbation measurements, Journal of Speech and Hearing Research, 36, 1120-1133 (1993).
[23] R. Tull, J. Rutledge, J. Mahler, Female alaryngeal speech enhancement for improved speaker identification using linear predictive synthesis, ASA 129th Meeting, Washington D.C., 1995 May 30-June 6.
[24] B. Weinberg, Y. Horii, B. Smith, Long-time spectral and intensity characteristics of esophageal speech, Journal of the Acoustical Society of America, 67, 5, 1781-1784 (1980).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT3-0007-0065