A Data Field method for speech enhancement incorporating Binary Time-Frequency Masking

Huang, J.; Zhang, Y.; Zhang, X.; Zhu, T.

Article - details

Article title

A Data Field method for speech enhancement incorporating Binary Time-Frequency Masking

Authors

Huang J., Zhang Y., Zhang X., Zhu T.

Selected full texts from this journal

http://pe.org.pl/

Identifiers

Title variants

Metoda pola danych oraz maskowania czasowoczęstotliwościowego wykorzystana do poprawy jakości dźwięku (Polish: "A data field and time-frequency masking method used to improve sound quality")

Publication languages

Abstracts

A data field approach coupled with binary time-frequency masking is presented for the speech enhancement problem. In the proposed approach, the data field method is employed to model the time and frequency dependencies of speech. This formulation has proved very helpful in enhancing speech quality because it exploits the correlation of speech in both time and frequency. The experimental results demonstrate that the proposed algorithm offers an improved signal-to-noise ratio and less spectral distortion.

A data field method combined with binary time-frequency masking was applied to improve the quality of speech audio. This significantly improved the audio quality by exploiting temporal and frequency correlation. An improved signal-to-noise ratio and a lower level of distortion were obtained.

Keywords

speech enhancement; data field; time-frequency masking; noise estimate; electrical technology

speech sound; data field; electrical engineering

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2011

Volume

Vol. 87, no. 7

Pages

225-229

Physical description

Bibliography: 15 items, figures, tables, charts.

Contributors

author

Huang J.

author

Zhang Y.

author

Zhang X.

author

Zhu T.

Institute of Command Automation, PLA University of Science and Technology, hjj954@gmail.com

Bibliography

[1] I. Cohen, "Relaxed statistical model for speech enhancement and a priori SNR estimation," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 870-881, 2005.
[2] I. Andrianakis and P. R. White, "On the application of Markov Random Fields to speech enhancement," in Proc. IMA Int. Conf. Mathematics in Signal Processing, pp. 198-201, 2006.
[3] Yipeng Li, John Woodruff, and DeLiang Wang, "Monaural musical sound separation based on pitch and common amplitude modulation," IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 7, pp. 1361-1371, 2009.
[4] S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, no. 2, pp. 113-120, 1979.
[5] Yang Lu and Philipos C. Loizou, "A geometric approach to spectral subtraction," Speech Communication, vol. 50, pp. 453-466, 2008.
[6] R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504-512, Jul. 2001.
[7] Simon Doclo, Jan Wouters, and Marc Moonen, "Speech enhancement with multichannel Wiener filter techniques in multimicrophone binaural hearing aids," Journal of the Acoustical Society of America, vol.25, no. 1, pp. 360-371, 2009.
[8] Plapous, C., Marro, C., and Scalart, P., "Improved Signal-to-Noise Ratio Estimation for Speech Enhancement," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 6, pp. 2098-2108, 2006.
[9] Gomez, R. and Kawahara, T., "Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions," Proceedings of ICASSP, Dallas, pp. 4566-4569, 2010.
[10] D. Li and Y. Du, Artificial Intelligence with Uncertainty. National Defense Press, 2005.
[11] D. Li, K. Liu, Y. Sun, and M. Han, "Emerging Clapping Synchronization From a Complex Multiagent Network With Local Information via Local Control," IEEE Trans. on Circuits and Systems-II: Express Briefs, vol. 56, no. 6, pp. 504-507, 2009.
[12] Hu G. and Wang D. L., "A tandem algorithm for pitch estimation and voiced speech segregation," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 8, pp. 2067-2079, 2010.
[13] Y. Shao, S. Srinivasan, Z. Jin, and D. L. Wang, "A computational auditory scene analysis system for speech segregation and robust speech recognition," Computer Speech and Language, vol. 24, pp. 77-93, 2010.
[14] Tao Xu and Wenwu Wang, "A block-based compressed sensing method for underdetermined blind speech separation incorporating binary mask," ICASSP, 2009.
[15] Y. Hu and P. Loizou, "Subjective comparison of speech enhancement algorithms," Proceedings of ICASSP, Toulouse, France, pp. 153-156, May 2006.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-PWA7-0045-0052