Logatom articulation index evaluation of speech enhanced by blind source separation and single-channel noise reduction

Drgas, Sz.; Kociński, J.; Sęk, A. P.

Abstract

The subjective logatom articulation index of speech signals enhanced by means of various digital signal processing methods has been measured. To improve intelligibility, the convolutive blind source separation (BSS) algorithm by Parra and Spence [1] has been used in combination with classical denoising algorithms. The efficiency of these algorithms has been investigated for speech material recorded in two spatial configurations. It has been shown that the BSS algorithm can substantially improve speech recognition. Moreover, combining BSS with single-microphone denoising methods can further increase the logatom articulation index.

Keywords

speech enhancement; logatom articulation index; blind source separation

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

2008

Volume

Vol. 33, No. 4

Pages

455–474

Physical description

Bibliography: 41 items, figures, tables.

Authors

author

Drgas Sz.

author

Kociński J.

author

Sęk A. P.

Adam Mickiewicz University, Faculty of Physics, Institute of Acoustics, 85 Umultowska Str., 61-614 Poznań, Poland, Szymon.Drgas@amu.edu.pl

References

[1] PARRA L., SPENCE C., Convolutive blind source separation of non-stationary sources, US Patent US6167417, IEEE Transactions on Speech and Audio Processing, 8, 3, 320–327 (2000).
[2] PREVES D.A., Hearing aids and listening in noise, Seminars in Hearing, 21, 2, 103–122 (2000).
[3] DIGIOVANNI J.J., NELSON P.B., SCHLAUCH R.S., A psychophysical evaluation of spectral enhancement, Journal of Speech, Language, and Hearing Research, 48, 5, 1121–1135 (2005).
[4] EZEKIEL S., OBLITEY W., TRIMBLE R., Hearing aid speech enhancement: A multiresolution analysis approach, [in:] IASTED International Conference on Internet and Multimedia Systems and Applications, pp. 533–537, EuroIMSA, 2005.
[5] CHUNG K., ZENG F.-G., ACKER K.N., Effects of directional microphone and adaptive multichannel noise reduction algorithm on cochlear implant performance, Journal of the Acoustical Society of America, 120, 4, 2216–2227 (2006).
[6] GAO J., ZHANG H., HU G., Real-time implementation of an efficient speech enhancement algorithm for digital hearing aids, Tsinghua Science and Technology, 11, 4, 475–480 (2006).
[7] FRANCK B.A.M., BOYMANS M., DRESCHLER W.A., Interactive fitting of multiple algorithms implemented in the same digital hearing aid, International Journal of Audiology, 46, 7, 388–397 (2007).
[8] HOEGE H., Basic parameters in speech processing. The need for evaluation, Archives of Acoustics, 32, 1, 67–74 (2007).
[9] ROCH M.A., HURTIG R.R., HUANG T., LIU J., ARTEAGA S.M., Foreground auditory scene analysis for hearing aids, Pattern Recognition Letters, 28, 11, 1351–1359 (2007).
[10] WON J.H., SCHIMMEL S.M., DRENNAN W.R., SOUZA P.E., ATLAS L., RUBINSTEIN J.T., Improving performance in noise for hearing aids and cochlear implants using coherent modulation filtering, Hearing Research, 239, 1–2, 1–11 (2008).
[11] O’SHAUGHNESSY D., Speech communications. Human and machine. Second edition, Piscataway: IEEE Press (2000).
[12] HOJAN E., FASTL H., MALENDA J., HOJAN-JEZIERSKA D., Investigations into speech intelligibility in the presence of different masking noises for hearing aids with variable attack and release times, Archives of Acoustics, 30, 2, 159–171 (2005).
[13] KOCIŃSKI J., SĘK A.P., Speech intelligibility in various spatial configurations of background noise, Archives of Acoustics, 30, 2, 173–191 (2005).
[14] SMARAGDIS P., Information theoretic approaches to source separation, [in:] MAS Department, MSc thesis, Massachusetts Institute of Technology, Massachusetts 1997.
[15] SMARAGDIS P., Efficient blind separation of convolved sound mixtures, [in:] IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 19–22, New Paltz NY 1997.
[16] HYVÄRINEN A., KARHUNEN J., OJA E., Independent component analysis, John Wiley & Sons, Inc, New York 2001.
[17] CICHOCKI A., AMARI S., Adaptive blind signal and image processing: learning algorithms and applications, John Wiley & Sons, Ltd, Chichester / New York / Weinheim / Brisbane / Singapore / Toronto, 2003.
[18] CHOI S., CICHOCKI A., PARK H.-M., LEE S.Y., Blind source separation and independent component analysis: a review, Neural Information Processing – Letters and Reviews, 6, 1, 1–57 (2005).
[19] BELOUCHRANI A., AMIN M.G., A new approach for blind source separation using time-frequency distribution, Proceedings SPIE, 2846, 193–203 (1996).
[20] MOLGEDEY L., SCHUSTER H.G., Separation of a mixture of independent signals using time delayed correlations, Physical Review Letters, 72, 23, 3634–3637 (1994).
[21] BELOUCHRANI A., ABED-MERAIM K., A blind source separation technique using second-order statistics, IEEE Transactions on Signal Processing, 45, 2, 434–444 (1997).
[22] CICHOCKI A., BELOUCHRANI A., Sources separation of temporally correlated sources from noisy data using bank of band-pass filters, Third International Conference on Independent Component Analysis and Signal Separation (ICA-2001), pp. 173–178, San Diego, USA 2001.
[23] CHOI S., CICHOCKI A., BELOUCHRANI A., Second order nonstationary source separation, Journal of VLSI Signal Processing, 32, 1–2, 93–104 (2002).
[24] CHOI S., CICHOCKI A., ZHANG L.L., AMARI S., Approximate maximum likelihood source separation using the natural gradient, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 86, 1, 198–205 (2003).
[25] MATSUOKA K., OHYA M., KAWAMOTO M., A neural net for blind separation of nonstationary signals, Neural Networks, 8, 3, 411–420 (1995).
[26] PHAM D.-T., SERVIERE C., BOUMARAF H., Blind separation of convolutive audio mixtures using nonstationarity, [in:] ICA 2003, Nara, Japan 2003.
[27] MAKINO S., SAWADA H., MUKAI R., ARAKI S., Blind source separation of convolutive mixtures of speech in frequency domain, IEICE Trans. Fundamentals, E88, 7, 1640–1655 (2005).
[28] HARMELING S., convbss, FRAUNHOFER FIRST Berlin, Berlin 2001.
[29] KOCIŃSKI J., Influence of blind source separation on speech intelligibility, Archives of Acoustics, 30, 4 (Supplement), 149–152 (2005).
[30] LIBISZEWSKI P., KOCIŃSKI J., Efficiency of blind source separation in a real room, Archives of Acoustics, 32, 4 (Supplement), 337–342 (2007).
[31] PARRA L., SPENCE C., On-line blind source separation of non-stationary signals, Journal of VLSI Signal Processing, 26, 1–2, 39–46 (2000).
[32] ASANO F., GOTO M., ITOU K., ASOH H., Real-time sound source localization and separation system and its application to automatic speech recognition, Eurospeech, 1013–1016, 2001.
[33] BOLL S.F., Suppression of acoustic noise in speech using spectral subtraction, IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-27, 2, 113–120 (1979).
[34] BEROUTI M., SCHWARTZ R., MAKHOUL J., Enhancement of speech corrupted by acoustic noise, IEEE International Conference on Acoustics, Speech and Signal Processing, 4, 208–211 (1979).
[35] EPHRAIM Y., MALAH D., Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-33, 2, 443–445 (1985).
[36] SCALART P., FILHO J.V., Speech enhancement based on a priori signal to noise estimation, IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, 629–632 (1996).
[37] BRACHMAŃSKI S., STARONIEWICZ P., Phonetic structure of a test material used in subjective measurements of speech quality [in Polish], Speech and Language Technology, Poznań, 71–80 (1999).
[38] BRACHMAŃSKI S., Effect of additive interference on speech transmission, Archives of Acoustics, 27, 2, 95–108 (2002).
[39] BRACHMAŃSKI S., Estimation of logatom intelligibility with the STI method for Polish speech transmitted via communication channels, Archives of Acoustics, 29, 4, 555–562 (2004).
[40] ZAVAREHEI E., MMSESTSA85.m, http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=7655&objectType=FILE.
[41] ZAVAREHEI E., WienerScalart96.m, 2005, http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=7673&objectType=FILE.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT8-0014-0011