Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform

Tao, Z.; Zhao, H. M.; Zhang, X-J.; Wu, D.

Content

Full texts:

httpaa_czasopisma_pan_plimagesdataaawydaniano3201103speechenhancementbasedonthemulti-scalesand.pdf

Abstract

This paper proposes a speech enhancement method based on multi-scale, multi-threshold processing of the auditory perception wavelet transform, suited to low signal-to-noise ratio (SNR) environments. The method reduces noise by applying thresholds based on the auditory masking effect of the human ear to the auditory perception wavelet transform parameters of the speech signal. To prevent high-frequency loss during noise suppression, a voiced/unvoiced decision is first made on the speech signal; the unvoiced and voiced segments are then processed with different thresholds and different decision rules. Finally, objective and subjective tests are performed on the enhanced speech. The results show that, compared with spectral subtraction methods, the proposed method keeps the unvoiced components intact while suppressing both the residual noise and the background noise, so the enhanced speech has better clarity and intelligibility.
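
The general idea in the abstract (decide whether a frame is voiced or unvoiced, then threshold wavelet coefficients scale by scale with a gentler threshold for unvoiced frames) can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a plain Haar wavelet for the auditory perception wavelet transform, a zero-crossing-rate test for the voicing decision, and a universal threshold in place of masking-derived thresholds. All function names and the parameters `zcr_limit`, `unvoiced_scale`, and the per-scale divisor are hypothetical choices made only for illustration.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level orthonormal Haar DWT; len(x) must be divisible by 2**levels."""
    approx = np.asarray(x, dtype=float)
    details = []  # details[0] is the finest scale
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    for d in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def soft_threshold(c, t):
    """Shrink coefficients toward zero by t (soft thresholding)."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def is_unvoiced(frame, zcr_limit=0.3):
    """Assumed voicing test: a high zero-crossing rate suggests an unvoiced frame."""
    signs = np.signbit(frame)
    return bool(np.mean(signs[1:] != signs[:-1]) > zcr_limit)

def enhance_frame(frame, levels=3, unvoiced_scale=0.3):
    """Threshold each scale separately; unvoiced frames get smaller thresholds."""
    approx, details = haar_dwt(frame, levels)
    # Noise level estimated from the finest scale (median absolute deviation).
    sigma = np.median(np.abs(details[0])) / 0.6745
    base = sigma * np.sqrt(2.0 * np.log(len(frame)))  # universal threshold
    scale = unvoiced_scale if is_unvoiced(frame) else 1.0
    # Assumed rule: the threshold shrinks at coarser scales (j = 0 is finest).
    shrunk = [soft_threshold(d, scale * base / (j + 1))
              for j, d in enumerate(details)]
    return haar_idwt(approx, shrunk)
```

Lowering the threshold for unvoiced frames reflects the abstract's stated goal of not removing high-frequency, noise-like consonant energy along with the noise; the factor 0.3 here is arbitrary.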

Keywords

speech enhancement; low SNR; auditory perception wavelet transform; unvoiced enhancement; masking effect

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

2011

Volume

Vol. 36, No. 3

Pages

519-532

Physical description

18 references, tables, charts.

Authors

author

Tao Z.

author

Zhao H. M.

author

Zhang X-J.

author

Wu D.

Soochow University, School of Electronic Information, Suzhou 215006, China, hmzhao@suda.edu.cn

References

1. Boll S. (1979), Suppression of Acoustic Noise in Speech Using Spectral Subtraction, IEEE Transactions on Acoustics, Speech, and Signal Processing, 2, 113-120.
2. Berouti M., Schwartz R., Makhoul J. (1979), Enhancement of speech corrupted by acoustic noise, Proc. IEEE ICASSP, Washington, DC, 208-211.
3. Donoho D.L. (1995), De-noising by soft-thresholding, IEEE Transactions on Information Theory, 41, 3, 613-627.
4. Ephraim Y., Malah D. (1984), Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, 32, 6, 1109-1121.
5. Ephraim Y., Malah D. (1985), Speech Enhancement Using a Minimum Mean-Square Error Log-Spectral Amplitude Estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, 33, 2, 443-445.
6. Hu Y., Chen N. (2006), A Method of Unvoiced/Voiced Classification and Pitch Detection Based on Wavelet Transform, Audio Engineering, 11, 63-66.
7. Johnston J.D. (1998), Transform Coding of Audio Signals Using Perceptual Noise Criteria, IEEE Transactions on Selected Areas in Communication, 6, 2, 314-323.
8. Lockwood P., Boudy J. (1992), Experiments with a Nonlinear Spectral Subtractor (NSS), Hidden Markov Models and Projection for Robust Recognition in Cars, Speech Communication, 11, 215-228.
9. Mallat S.G., Zhong S. (1999), Singularity detection and processing with wavelet, IEEE Transactions on Information Theory, 38, 2, 517-543.
10. Seok J.W., Bae K.S. (1997), Speech enhancement with reduction of noise components in the wavelet domain, Copyright 1997 IEEE, 1323-1326.
11. Shen Y.Q., Jin H.Z. (2000), Speech enhancement based on wavelet transform, Bulletin of Science and Technology, 16, 3, 206-211.
12. Tao Z., Zhao H.M., Gong C.H. (2005), Speech enhancement based on masking properties of human auditory system and bark wavelet transform, Acta Acustica, 30, 4, 367-372.
13. Tao Z., Zhao H.M., Gu J.H., Tan X.D., Wu J. (2008), Speech feature extraction of cochlear implants on the basis of auditory perception wavelet transform, IEEE ICALIP, Shanghai, 80-85.
14. Tao Z., Zhao H.M., Wu J., Gu J.H., Xu Y.S., Wu D. (2010), A lifting wavelet domain audio watermarking algorithm based on the statistical characteristics of sub-band coefficients, Archives of Acoustics, 35, 4, 481-491.
15. Traunmuller H. (1990), Analytical expression for the tonotopic sensory scale, Journal of the Acoustical Society of America, 88, 97-100.
16. Virag N. (1999), Single Channel Speech Enhancement Based on Masking Properties of Human Auditory System, IEEE Transactions on Speech and Audio Processing, 7, 2, 126-137.
17. Xu Y.S., Weaver J.B., Healy D.M., Lu J. (1994), Wavelet transform domain filters: a spatially selective noise filtration technique, IEEE Transactions on Image Processing, 3, 6, 747-758.
18. Zhu X.W., Yang D.C., Wang W., Mou F., Xu B.L. (2003), The research on speech enhancement based on the simulation of auditory model using frame-synchronized combined wavelet packet transform algorithms, Acta Acustica, 28, 1, 12-16.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUS8-0020-0032