Using auditory properties in multimicrophone speech enhancement

Borowicz, A.; Petrovsky, A.

Article - details

Article title

Using auditory properties in multimicrophone speech enhancement

Authors

Borowicz A., Petrovsky A.

Identifiers

Title variants

Wykorzystanie właściwości słuchowych w wielomikrofonowym uzdatnianiu mowy

Publication languages

Abstracts

This article presents a perceptually motivated multichannel speech enhancement system. The proposed approach uses a generalized sidelobe canceller (GSC) for speech dereverberation and noise suppression. The conventional GSC structure is modified by introducing a weighting factor into the noise cancellation loop. This allows perceptually optimal shaping of the residual noise spectrum, which reduces speech distortion. A comparative evaluation of the selected methods has been performed using objective speech quality measures. Experimental results show that the proposed approach outperforms conventional ones, providing better speech quality.

Artykuł przedstawia motywowany percepcyjnie wielokanałowy system uzdatniania mowy. Proponowane podejście wykorzystuje uogólnioną metodę tłumienia listków bocznych (ang. Generalised Sidelobe Canceller) do usuwania pogłosu i szumu. Zmodyfikowano konwencjonalną strukturę algorytmu GSC poprzez wprowadzenie współczynnika wagowego w pętli usuwania szumu. Umożliwia to optymalne, w sensie percepcyjnym, kształtowanie widma szumu resztkowego, co skutkuje zmniejszeniem zniekształceń mowy. Przeprowadzono ocenę porównawczą wybranych metod z wykorzystaniem obiektywnych miar jakości mowy. Wyniki eksperymentów pokazują, że proponowane podejście przewyższa metody konwencjonalne, zapewniając lepszą jakość mowy.
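The structure described in the abstract can be illustrated with a minimal time-domain sketch. It is not the authors' implementation: the record gives no algorithmic details, so the function name `gsc_weighted`, the scalar `weight`, the adjacent-difference blocking matrix, and the NLMS update are all assumptions chosen for illustration. In particular, the paper's weighting is described as perceptual (shaping the residual noise spectrum), whereas this sketch uses a single constant.

```python
import numpy as np

def gsc_weighted(x, weight=1.0, taps=8, step=0.1, eps=1e-8):
    """Illustrative GSC with a weighting factor on the noise-cancellation path.

    x      : array of shape (channels, samples), assumed already time-aligned
             so that the target speech is identical in every channel.
    weight : hypothetical scalar in [0, 1]; 1 gives a conventional GSC output,
             0 gives the fixed beamformer output only.
    """
    n_ch, n_s = x.shape
    fixed = x.mean(axis=0)            # fixed beamformer (delay-and-sum)
    blocked = x[:-1] - x[1:]          # blocking matrix: adjacent differences
                                      # cancel the aligned target component
    w = np.zeros((n_ch - 1, taps))    # adaptive noise-canceller coefficients
    out = np.zeros(n_s)
    out[:taps - 1] = fixed[:taps - 1]  # not enough history yet: pass through
    for n in range(taps - 1, n_s):
        # Most recent `taps` samples of each noise reference, newest first.
        u = blocked[:, n - taps + 1:n + 1][:, ::-1]
        noise_est = np.sum(w * u)
        # Assumption: adapt on the full (unweighted) error, then apply the
        # weight only when forming the output.
        err = fixed[n] - noise_est
        w += step * err * u / (np.sum(u * u) + eps)   # NLMS update
        out[n] = fixed[n] - weight * noise_est
    return out
```

With `weight=1.0` the sketch behaves like a conventional GSC; smaller values leave part of the estimated noise in the output, which is the simplest stand-in for the residual-noise shaping mentioned in the abstract.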

Keywords

speech enhancement GSC

uzdatnianie mowy GSC

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2012

Volume

Vol. 53, no. 5

Pages

30-34

Physical description

Bibliography: 16 items, illustrations, charts.

Contributors

author

Borowicz A.

author

Petrovsky A.

Bialystok University of Technology, Department of Digital Media and Computer Graphics

Bibliography

[1] Frost O.: An algorithm for linearly constrained adaptive array processing, in Proc. IEEE, vol. 60, Aug 1972, pp. 926-935.
[2] Capon J.: High resolution frequency-wavenumber spectrum analysis, in Proc. IEEE, vol. 57, no. 8, Aug 1969, pp. 1408-1418.
[3] Griffiths L., Jim C.: An alternative approach to linearly constrained adaptive beamforming. IEEE Trans. Antennas Propag., vol. AP-30, no. 1, pp. 27-34, Jan 1982.
[4] Gannot S., D. Burshtein, and E. Weinstein: Signal enhancement using beamforming and nonstationarity with applications to speech. IEEE Trans. Signal Process., vol. 49, no. 8, pp. 1614-1626, 2001.
[5] Chen J., J. Benesty, and Y. Huang: A minimum distortion noise reduction algorithm with multiple microphones. IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 3, pp. 481-493, 2008.
[6] Huang Y., J. Benesty, and J. Chen: Analysis and comparison of multichannel noise reduction methods in a common framework. IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 957-968, 2008.
[7] Gustafsson S., P. Jax, and P. Vary: A novel psychoacoustically motivated audio enhancement algorithm preserving background noise characteristic, in Proc. ICASSP, vol. 1, 1998, pp. 397-400.
[8] Petrovsky A., M. Parfieniuk, and A. Borowicz: Warped DFT based perceptual noise reduction system, in Proc. AES 116th, Berlin, Germany, May 2004, 14 p.
[9] Schelkunoff S.: A mathematical theory of linear arrays. Bell Syst. Tech. J., vol. 22, pp. 80-107, Jan 1943.
[10] Breed B., J. Strauss: A short proof of the equivalence of LCMV and GSC beamforming. IEEE Signal Process. Lett., vol. 9, no. 6, pp. 168-169, Jun 2004.
[11] Souden M., J. Benesty, and S. Affes: On optimal frequency-domain multichannel linear filtering for noise reduction. IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 2, pp. 260-276, 2010.
[12] Spriet A., M. Moonen, and J. Wouters: Robustness analysis of multi-channel Wiener filtering and generalized sidelobe cancellation for multi-microphone noise reduction in hearing aid applications. IEEE Trans. Speech Audio Process., vol. 13, no. 4, pp. 487-503, 2005.
[13] Johnston J.: Transform coding of audio signals using perceptual noise criteria. IEEE J. on Selected Areas in Comm., vol. 6, pp. 314-323, February 1988.
[14] Talmon R., I. Cohen, and S. Gannot: Multichannel speech enhancement using convolutive transfer function approximation in reverberant environments, in Proc. ICASSP, 2009, pp. 3885-3888.
[15] Garofolo J., L. Lamel, W. Fisher, J. Fiscus, D. Pallett, and N. Dahlgren: DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus. National Institute of Standards and Technology (NIST), CD-ROM, 1993.
[16] Rix A. W., J. G. Beerends, M. P. Hollier, and A. P. Hekstra: Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs, in Proc. ICASSP, 2001, pp. 749-752.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BWA1-0052-0056