The intelligibility of Polish speech synthesized with a new SineWave Synthesis method

Gardzielewska, H.; Preis, A.


Article details

Journal

Archives of Acoustics

2007 | Vol. 32, No. 3 | 579–589

Article title

The intelligibility of Polish speech synthesized with a new SineWave Synthesis method

Authors

Gardzielewska, H., Preis, A.

Abstract

SineWave Synthesis (SWS) allows a significant reduction of the information carried by a speech signal, which is represented by the dynamic spectral properties of formants selected from natural speech. The synthesis discards all the detailed acoustic information carried by the signal, including the fundamental frequency as well as harmonic and noise components. Despite the impressive information reduction (the compression ratio for 3-tone synthesis reaches as much as 195:1), the linguistic and extra-linguistic information of the signal is to a large extent preserved. For the first time, a modified version of SWS was used to analyze Polish speech in order to evaluate the relationship between data reduction and the intelligibility of speech. Speech intelligibility was tested on utterances varying in grammatical structure, linguistic information, and duration. The modified SWS method, developed at Adam Mickiewicz University in Poznań, provided noticeably better results for Polish speech than the original method developed in the late 1970s at Haskins Laboratories.
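
The general idea of replacing each formant with a single time-varying sinusoid can be illustrated with a short sketch. This is not the modified method described in the abstract, nor the original Haskins implementation; it is a minimal illustration that assumes formant frequency and amplitude tracks are already available, one value per analysis frame. The function name `sinewave_synthesis`, the frame rate of 100 frames per second, the sample rate of 8000 Hz, and the example track values are all assumptions made for the example.

```python
import math

def sinewave_synthesis(freq_tracks, amp_tracks, frame_rate=100, sample_rate=8000):
    """Sum one sinusoid per formant track (hypothetical helper, illustration only).

    freq_tracks, amp_tracks: lists of equal-length lists, one list per formant,
    holding frequency in Hz and linear amplitude for each analysis frame.
    """
    n_frames = len(freq_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = (n_frames - 1) * samples_per_frame
    out = [0.0] * n_samples
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0
        for n in range(n_samples):
            frame, offset = divmod(n, samples_per_frame)
            t = offset / samples_per_frame
            # Linear interpolation between neighbouring frames.
            f = freqs[frame] + t * (freqs[frame + 1] - freqs[frame])
            a = amps[frame] + t * (amps[frame + 1] - amps[frame])
            out[n] += a * math.sin(phase)
            # Accumulate phase so the tone stays continuous as frequency changes.
            phase += 2.0 * math.pi * f / sample_rate
    return out

# Three made-up formant tracks over three frames (a "3-tone" signal).
f_tracks = [[500, 600, 700], [1500, 1400, 1300], [2500, 2500, 2500]]
a_tracks = [[1.0, 1.0, 1.0], [0.5, 0.5, 0.5], [0.25, 0.25, 0.25]]
signal = sinewave_synthesis(f_tracks, a_tracks)
```

Only the frame-rate tracks need to be stored to regenerate the signal, which is where the data reduction comes from; the fundamental frequency, harmonics, and noise components never enter the computation.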

Keywords

sinewave synthesis, synthetic speech perception, Polish speech intelligibility, linguistic and extralinguistic information

Journal

Archives of Acoustics

Year

2007

Volume

Vol. 32, No. 3

Pages

579–589

Physical description

Bibliography: 22 items, figures, tables

Contributors

author

Gardzielewska, H.

Adam Mickiewicz University, Institute of Acoustics, Umultowska 85, 61-114 Poznań, Poland, hania@spl.ia.amu.edu.pl

author

Preis, A.

Adam Mickiewicz University, Institute of Acoustics, Umultowska 85, 61-114 Poznań, Poland, apraton@amu.edu.pl

References

[1] REMEZ R. E., RUBIN P. E., PISONI D. B., CARRELL T. D., Speech perception without traditional speech cues, Science, 212, 947–950 (1981).
[2] REMEZ R. E., RUBIN P. E., BERNS S. E., PARDO J. S., LANG J. M., On the perceptual organization of speech, Psychological Review, 101, 129–136 (1994).
[3] MCAULAY R. J., QUATIERI T. F., Speech analysis-synthesis based on a sinusoidal representation, IEEE Trans. ASSP, 34, 744–754 (1986).
[4] DORMAN M., LOIZOU P., RAINEY D., Speech intelligibility as a function of the number of channels of stimulation for signal processors using sine-wave and noise-band outputs, Journal of the Acoustical Society of America, 102, 2403–2411 (1997).
[5] http://www.ee.columbia.edu/~dpwe/resources/matlab/sws/.
[6] JASSEM W., Podstawy fonetyki akustycznej, PWN, Warszawa 1973.
[7] ANSI. ANSI S3.6-1996, Specifications for Audiometers, American National Standards Institute, New York 1996.
[8] GROCHOLEWSKI S., CORPORA-Speech Database for Polish Diphones, Proc. Eurospeech’97, 1735–1738, 1997.
[9] KLATT D. H., KLATT L. C., Analysis, synthesis, and perception of voice quality variations among female and male talkers, Journal of the Acoustical Society of America, 87, 2, 820–857 (1990).
[10] NYGAARD L. C., PISONI D. B., Speech perception: New directions in research and theory, [in:] Speech, Language, and Communication, MILLER J. L., EIMAS P. D. [Eds.], Academic, San Diego, CA, pp. 63–96, 1995.
[11] MULLENNIX J. W., PISONI D. B., Stimulus variability and processing dependencies in speech perception, Perception and Psychophysics, 47, 379–390 (1990).
[12] GREEN K. P., TOMIAK G. R., KUHL P. K., The encoding of rate and talker information during phonetic perception, Perception and Psychophysics, 59, 675–692 (1997).
[13] REMEZ R. E., Talker identification based on phonetic information, Journal of Experimental Psychology: Human Perception and Performance, 23, 651–666 (1997).
[14] JASSEM W., Frequency and phonetics balanced Polish wordlists, [in:] Speech and Language Technology, JASSEM W., BASZTURA C. [Eds.], Polish Phonetic Association, Poznań, pp. 71–100, 1997.
[15] BRACHMAŃSKI S., STARONIEWICZ P., Phonetic structure of test material used for subjective speech quality measurements, [in:] Speech and Language Technology, JASSEM W., BASZTURA C. [Eds.], WPN Format, Poznań, pp. 71–80, 1999.
[16] JUSCZYK P. W., LUCE P. A., Speech perception and spoken word recognition: past and present, Ear and Hearing, 23, 2–40 (2002).
[17] REDDY D., Speech recognition by machine: a review, Proceedings of IEEE, 64, 4, 501–531 (1976).
[18] PISONI D. B., LUCE P. A., Acoustic-Phonetic representations in word recognition, Cognition, 25, 21–52 (1987).
[19] MARTIN C. S., MULLENNIX J. W., PISONI D. B., SUMMERS W. V., Effects of talker variability on recall of spoken word lists, Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 152–162 (1989).
[20] JUSCZYK P. W., PISONI D. B., MULLENNIX J., Some consequences of stimulus variability on speech processing by 2-month-old infants, Cognition, 43, 253–291 (1992).
[21] SHANNON R. V., ZENG F. G., WYGONSKI J., KAMATH V., EKELID M., Speech recognition with primarily temporal cues, Science, 270, 303–304 (1995).
[22] MCQUEEN J. M., CUTLER A., NORRIS D., Flow of information in the spoken word recognition system, Speech Communication, 41, 1, 257–270 (2003).

Document type

Bibliography

Identifiers

YADDA identifier

bwmeta1.element.baztech-5ce7d9a0-efb8-4a04-914b-498ef14c85bd