Article title

The intelligibility of Polish speech synthesized with a new SineWave Synthesis method

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
SineWave Synthesis (SWS) allows a significant reduction of the information carried by a speech signal by representing it through the dynamic spectral properties of formants selected from natural speech. The synthesis discards all the detailed acoustic information carried by the signal, including the fundamental frequency as well as the harmonic and noise components. Despite this substantial information reduction (the compression coefficient for 3-tone synthesis reaches as much as 195:1), the linguistic and extra-linguistic information of the signal is to a large extent preserved. For the first time, a modified version of SWS was used to analyze Polish speech in order to evaluate the relationship between data reduction and speech intelligibility. Intelligibility was tested on utterances varying in grammatical structure, linguistic information, and duration. The modified SWS method, developed at Adam Mickiewicz University in Poznań, provided noticeably better results for Polish speech than the original method developed in the late 1970s at Haskins Laboratories.
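The principle described in the abstract, replacing speech with a few tones that follow the formant tracks, can be illustrated with a minimal sketch. It is not the authors' modified method and not the Haskins implementation. The function name `sinewave_synthesis`, the frame-wise piecewise-constant tracks, and all numeric values are assumptions made for illustration only.

```python
import math

def sinewave_synthesis(freq_tracks, amp_tracks, frame_rate, sample_rate):
    """Sum one sinusoid per formant track (hypothetical helper).

    freq_tracks / amp_tracks: one list per tone, holding one frequency (Hz)
    or linear amplitude value per analysis frame.
    """
    n_frames = len(freq_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = n_frames * samples_per_frame
    out = [0.0] * n_samples
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0  # accumulated phase keeps each tone continuous across frames
        for n in range(n_samples):
            frame = n // samples_per_frame
            out[n] += amps[frame] * math.sin(phase)
            phase += 2.0 * math.pi * freqs[frame] / sample_rate
    return out

# Illustrative 3-tone input: two frames of made-up formant values.
freq_tracks = [[500.0, 600.0], [1500.0, 1400.0], [2500.0, 2500.0]]
amp_tracks = [[1.0, 1.0], [0.5, 0.5], [0.25, 0.25]]
signal = sinewave_synthesis(freq_tracks, amp_tracks, frame_rate=100, sample_rate=8000)
```

Only the per-frame frequency and amplitude of each tone are needed to drive the synthesis, which is where the data reduction comes from. Fundamental frequency, harmonics, and noise are never represented.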
Year
Pages
579–589
Physical description
Bibliography: 22 items, figures, tables.
Authors
  • Adam Mickiewicz University Institute of Acoustics Umultowska 85, 61-114 Poznań, Poland
author
  • Adam Mickiewicz University Institute of Acoustics Umultowska 85, 61-114 Poznań, Poland
Bibliography
  • [1] REMEZ R. E., RUBIN P. E., PISONI D. B., CARRELL T. D., Speech perception without traditional speech cues, Science, 212, 947–950 (1981).
  • [2] REMEZ R. E., RUBIN P. E., BERNS S. E., PARDO J. S., LANG J. M., On the perceptual organization of speech, Psychological Review, 101, 129–136 (1994).
  • [3] MCAULAY R. J., QUATIERI T. F., Speech analysis-synthesis based on a sinusoidal representation, IEEE Trans. ASSP, 34, 744–754 (1986).
  • [4] DORMAN M., LOIZOU P., RAINEY D., Speech intelligibility as a function of the number of channels of stimulation for signal processors using sine-wave and noise-band outputs, Journal of the Acoustical Society of America, 102, 2403–2411 (1997).
  • [5] http://www.ee.columbia.edu/~dpwe/resources/matlab/sws/.
  • [6] JASSEM W., Podstawy fonetyki akustycznej, PWN, Warszawa 1973.
  • [7] ANSI. ANSI S3.6-1996, Specifications for Audiometers, American National Standards Institute, New York 1996.
  • [8] GROCHOLEWSKI S., CORPORA-Speech Database for Polish Diphones, Proc. Eurospeech’97, 1735–1738, 1997.
  • [9] KLATT D. H., KLATT L. C., Analysis, synthesis, and perception of voice quality variations among female and male talkers, Journal of the Acoustical Society of America, 87, 2, 820–857 (1990).
  • [10] NYGAARD L. C., PISONI D. B., Speech perception: New directions in research and theory, [in:] Speech, Language, and Communication, MILLER J. L., EIMAS P. D. [Eds.], Academic, San Diego, CA, pp. 63–96, 1995.
  • [11] MULLENNIX J. W., PISONI D. B., Stimulus variability and processing dependencies in speech perception, Perception and Psychophysics, 47, 379–390 (1990).
  • [12] GREEN K. P., TOMIAK G. R., KUHL P. K., The encoding of rate and talker information during phonetic perception, Perception and Psychophysics, 59, 675–692 (1997).
  • [13] REMEZ R. E., Talker identification based on phonetic information, Journal of Experimental Psychology: Human Perception and Performance, 23, 651–666 (1997).
  • [14] JASSEM W., Frequency and phonetics balanced Polish wordlists, [in:] Speech and Language Technology, JASSEM W., BASZTURA C. [Eds.], Polish Phonetic Association, Poznań, pp. 71–100, 1997.
  • [15] BRACHMAŃSKI S., STARONIEWICZ P., Phonetic structure of test material used for subjective speech quality measurements, [in:] Speech and Language Technology, JASSEM W., BASZTURA C. [Eds.], WPN Format, Poznań, pp. 71–80, 1999.
  • [16] JUSCZYK P. W., LUCE P. A., Speech perception and spoken word recognition: past and present, Ear and Hearing, 23, 2–40 (2002).
  • [17] REDDY D., Speech recognition by machine: a review, Proceedings of IEEE, 64, 4, 501–531 (1976).
  • [18] PISONI D. B., LUCE P. A., Acoustic-Phonetic representations in word recognition, Cognition, 25, 21–52 (1987).
  • [19] MARTIN C. S., MULLENNIX J. W., PISONI D. B., SUMMERS W. V., Effects of talker variability on recall of spoken word lists, Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 152–162 (1989).
  • [20] JUSCZYK P. W., PISONI D. B., MULLENNIX J., Some consequences of stimulus variability on speech processing by 2-month-old infants, Cognition, 43, 253–291 (1992).
  • [21] SHANNON R. V., ZENG F. G., WYGONSKI J., KAMATH V., EKELID M., Speech recognition with primarily temporal cues, Science, 270, 303–304 (1995).
  • [22] MCQUEEN J. M., CUTLER A., NORRIS D., Flow of information in the spoken word recognition system, Speech Communication, 41, 1, 257–270 (2003).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-5ce7d9a0-efb8-4a04-914b-498ef14c85bd