Model-Based Method for Acoustic Echo Cancelation and Near-End Speaker Extraction: Non-negative Matrix Factorization

Agrawal, P.; Shandilya, M.

doi:10.26636/jtit.2018.122617

Article details

Article title

Model-Based Method for Acoustic Echo Cancelation and Near-End Speaker Extraction: Non-negative Matrix Factorization

Authors

Agrawal P., Shandilya M.

Identifiers

DOI

10.26636/jtit.2018.122617

Abstract

The rapid growth of wireless communication and hands-free telephony makes acoustic echo a problem in full-duplex communication applications. This paper proposes a simulation of model-based acoustic echo cancelation and near-end speaker extraction using a statistical method based on non-negative matrix factorization (NMF). An NMF-based acoustic echo canceler is developed and its implementation is presented, together with the all-positive, real-time elements and the factorization techniques involved. Experimental results are compared against widely used adaptive algorithms, which suffer from long impulse responses, increased computational load, and misconvergence when the near-end enclosure changes. The statistical NMF method eliminates these shortcomings, reducing echo and enhancing the processed audio signal.
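As a rough illustration of the kind of factorization the abstract refers to, the sketch below uses the multiplicative updates of Lee and Seung (reference [21]) for the Euclidean cost. It is not the authors' implementation: the function names `nmf` and `extract_near_end`, the ranks, the iteration counts, the soft-mask step, and the random matrices standing in for magnitude spectrograms are all assumptions made for this example. The idea shown is to learn an echo basis from the far-end signal, keep it fixed while factorizing the microphone signal, and attribute the remaining components to the near-end speaker.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, W_fixed=None, seed=0):
    """Approximate non-negative V (F x T) as W @ H (Euclidean cost).

    If W_fixed (F x k) is given, it forms the first k columns of W
    and is never updated; only the remaining columns are learned.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    k = 0 if W_fixed is None else W_fixed.shape[1]
    W = rng.random((F, rank)) + EPS
    if k:
        W[:, :k] = W_fixed
    H = rng.random((rank, T)) + EPS
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W_new = W * (V @ H.T) / (W @ H @ H.T + EPS)
        W[:, k:] = W_new[:, k:]  # leave the fixed columns untouched
    return W, H


def extract_near_end(V_mic, W_echo, near_rank, n_iter=200):
    """Estimate the near-end part of V_mic given a fixed echo basis."""
    k = W_echo.shape[1]
    W, H = nmf(V_mic, k + near_rank, n_iter, W_fixed=W_echo)
    echo_est = W[:, :k] @ H[:k]
    near_est = W[:, k:] @ H[k:]
    # Soft mask (an assumption of this sketch) applied to the mixture.
    mask = near_est / (echo_est + near_est + EPS)
    return mask * V_mic


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-ins for magnitude spectrograms (64 bins, 100 frames).
    V_far = rng.random((64, 4)) @ rng.random((4, 100))
    V_near = rng.random((64, 3)) @ rng.random((3, 100))
    V_mic = 0.5 * V_far + V_near  # toy "echo path": plain attenuation

    W_echo, _ = nmf(V_far, rank=4)
    V_hat = extract_near_end(V_mic, W_echo, near_rank=3)
    print(V_hat.shape)
```

In a real system the inputs would be short-time Fourier transform magnitudes of the far-end and microphone signals, and the echo path would be a room response rather than a scalar gain; the toy mixture here only demonstrates the mechanics of the factorization.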

Keywords

adaptive algorithms; convergence; echo cancellation; non-negative matrix factorization (NMF)

Publisher

Instytut Łączności - Państwowy Instytut Badawczy

Journal

Journal of Telecommunications and Information Technology

Year

2018

Volume

no. 2

Pages

15–22

Physical description

35 references, figures, tables

Contributors

author

Agrawal P.

pallaviagrawal4@gmail.com

Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal (M. P.), India

author

Shandilya M.

madhu_shandilya@yahoo.in

Department of Electronics and Communication Engineering, Maulana Azad National Institute of Technology, Bhopal (M. P.), India

Bibliography

[1] E. Hänsler and G. Schmidt, Acoustic Echo and Noise Control: A Practical Approach. John Wiley & Sons, 2005 (ISBN: 978-0-471-45346-8).
[2] P. A. Naylor and N. D. Gaubitch, Eds., Speech Dereverberation, 1st ed. Springer, 2010 (doi: 10.1007/978-1-84996-056-4).
[3] M. M. Sondhi and D. A. Berkley, “Silencing echoes on the telephone network”, Proc. of the IEEE, vol. 68, no. 8, pp. 948–963, 1980 (doi: 10.1109/PROC.1980.11774).
[4] C. Paleologu, J. Benesty, and S. Ciochină, “An improved proportionate NLMS algorithm based on the L0 norm”, in Proc. IEEE Int. Conf. on Acoust., Speech and Sig. Process. ICASSP 2010, Dallas, TX, USA, 2010, pp. 309–312 (doi: 10.1109/ICASSP.2010.5495903).
[5] S. Malik and G. Enzner, “Recursive Bayesian control of multichannel acoustic echo cancellation”, IEEE Sig. Process. Lett., vol. 18, no. 11, pp. 619–622, 2011 (doi: 10.1109/LSP.2011.2166385).
[6] S. L. Gay and J. E. Benesty, Eds., Acoustic Signal Processing for Telecommunication. Boston: Kluwer Academic, 2000 (doi: 10.1007/978-1-4419-8644-3).
[7] J. Gunther and T. Moon, “Blind acoustic echo cancellation without double-talk detection”, IEEE Worksh. on Appl. of Sig. Process. to Audio and Acoust., New Paltz, NY, USA, 2015 (doi: 10.1109/WASPAA.2015.7336925).
[8] J. Benesty, T. Gänsler, D. R. Morgan, M. M. Sondhi, and S. L. Gay, Eds., Advances in Network and Acoustic Echo Cancellation. New York: Springer-Verlag, 2001 (ISBN: 978-3-662-04437-7).
[9] Y. Hu and P. C. Loizou, “Evaluation of objective quality measures for speech enhancement”, IEEE Trans. on Audio, Speech, and Lang. Process., vol. 16, no. 1, pp. 229–238, 2008 (doi: 10.1109/TASL.2007.911054).
[10] A. H. Abdullah, M. I. Yusof, and S. R. M. Baki, “Adaptive noise cancellation: a practical study of the least-mean square (LMS) over recursive least-square (RLS) algorithm”, in Proc. Student Conf. on Res. and Develop. SCOReD 2002, pp. 448–452, Shah Alam, Malaysia, 2002 (doi: 10.1109/SCORED.2002.1033154).
[11] X. Wang, T. Shen, and W. Wang, “An approach for echo cancellation system based on improved NLMS algorithm”, in Proc. Int. Conf. on Wirel. Commun., Netw. and Mob. Comput. WiCom 2007, Shanghai, China, 2007 (doi: 10.1109/WICOM.2007.708).
[12] S. Wu, X. Qiu, and M. Wu, “Stereo acoustic echo cancellation employing frequency-domain preprocessing and adaptive filter”, IEEE Trans. on Audio, Speech, and Lang. Process., vol. 19, no. 3, pp. 614–623, 2011 (doi: 10.1109/TASL.2010.2052804).
[13] M. H. Maruo, J. C. M. Bermudez, and L. S. Resende, “Statistical analysis of a jointly optimized beamformer-assisted acoustic echo canceller”, IEEE Trans. on Sig. Process., vol. 62, no. 1, pp. 252–265, 2014 (doi: 10.1109/TSP.2013.2284138).
[14] R. C. Nongpiur and D. J. Shpak, “Maximizing the signal-to-alias ratio in non-uniform filter banks for acoustic echo cancellation”, IEEE Trans. on Circuits and Syst.: Regular Papers, vol. 59, no. 10, pp. 2315–2325, 2012 (doi: 10.1109/TCSI.2012.2185333).
[15] D. L. Wang and G. J. Brown, Eds., Computational Auditory Scene Analysis: Principles, Algorithms and Applications. Wiley-IEEE Press, 2006 (ISBN: 978-0-471-74109-1).
[16] T. S. Wada and B.-H. Juang, “Enhancement of residual echo for robust acoustic echo cancellation”, IEEE Trans. on Audio, Speech, and Lang. Process., vol. 20, no. 1, pp. 175–189, 2012 (doi: 10.1109/TASL.2011.2159592).
[17] S. K. Nagendra and V. S. Kumar, “Echo cancellation in audio signal using LMS algorithm”, in Nat. Conf. on Recent Trends in Engin. and Technol. NCRTET 2011, Anand, Gujarat, India, 2011.
[18] U. Mahbub, S. A. Fattah, W.-P. Zhu, and M. O. Ahmad, “Single-channel acoustic echo cancellation in noise based on gradient-based adaptive filtering”, EURASIP Journal on Audio, Speech, and Music Processing, vol. 20, pp. 1–16, 2014 (doi: 10.1186/1687-4722-2014-20).
[19] P. Paatero and U. Tapper, “Positive matrix factorization: A nonnegative factor model with optimal utilization of error estimates of data values”, Environmetrics, vol. 5, pp. 111–126, 1994 (doi: 10.1002/env.3170050203).
[20] A. Bansal, S. Choukse, K. Nathwani, and R. M. Hegde, “Acoustic echo cancellation using a multi-resolution non-negative matrix factorization method”, in Proc. 22nd Nat. Conf. on Commun. NCC 2016, Guwahati, India, 2016, pp. 1–5 (doi: 10.1109/NCC.2016.7561119).
[21] D. D. Lee and H. S. Seung, “Algorithms for non-negative matrix factorization”, in Advances in Neural Information Processing Systems 13. Proceedings of the 2000 Conference, T. K. Leen, T. G. Dietterich, and V. Tresp, Eds. MIT Press, 2001, pp. 556–562.
[22] P. O’Grady, “Sparse Separation of Underdetermined Speech Mixtures”, Ph.D. Dissertation, Hamilton Institute, National University of Ireland Maynooth, Ireland, 2007 [Online]. Available: https://www.hamilton.ie/publications/ogrady2007_phd.pdf
[23] S. Ciochină, C. Paleologu, J. Benesty, and C. Anghel, “An optimized affine projection algorithm for acoustic echo cancellation”, in Proc. Int. Conf. on Speech Technol. and Human-Comp. Dialogue SpeD 2015, Bucharest, Romania, 2015, pp. 1–6 (doi: 10.1109/SPED.2015.7343092).
[24] C. Fevotte, N. Bertin, and J. L. Durrieu, “Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis”, Neural Computation, vol. 21, no. 3, pp. 793–830, 2009 (doi: 10.1162/neco.2008.04-08-771).
[25] S. Roweis, “One microphone sound source separation”, in Advances in Neural Information Processing Systems 13, T. K. Leen, T. G. Dietterich, and V. Tresp, Eds. MIT Press, 2001, pp. 793–799.
[26] F. Yang, M. Wu, and J. Yang, “Stereophonic acoustic echo suppression based on Wiener filter in the short-time Fourier transform domain”, IEEE Sig. Process. Lett., vol. 19, no. 4, pp. 227–230, 2012 (doi: 10.1109/LSP.2012.2187446).
[27] S. Rickard and O. Yilmaz, “On the approximate W-disjoint orthogonality of speech”, in Proc. IEEE Int. Conf. on Acoust., Speech, and Sig. Process. ICASSP 2002, Orlando, FL, USA, 2002, vol. 1, pp. I-529–I-532 (doi: 10.1109/ICASSP.2002.5743771).
[28] O. Yilmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking”, IEEE Trans. on Sig. Process., vol. 52, no. 7, pp. 1830–1847, 2004 (doi: 10.1109/TSP.2004.828896).
[29] C. Avendano, “Acoustic echo suppression in the STFT domain”, in Proc. IEEE Worksh. on Appl. of Sig. Process. to Audio and Acoust., New Paltz, NY, USA, pp. 175–178, 2001 (doi: 10.1109/ASPAA.2001.969571).
[30] C. Faller and J. Chen, “Suppressing acoustic echo in a spectral envelope space”, IEEE Trans. on Speech and Audio Process., vol. 13, no. 5, pp. 1048–1062, 2005 (doi: 10.1109/TSA.2005.852012).
[31] E. A. P. Habets, S. Gannot, and I. Cohen, “Robust early echo cancellation and late echo suppression in the STFT domain”, in Proc. of 11th Int. Worksh. on Acoust. Echo and Noise Control IWAENC 2008, Seattle, WA, USA, 2008.
[32] P. Yun-Sik and C. Joon-Hyuk, “Frequency domain acoustic echo suppression based on soft decision”, IEEE Sig. Process. Lett., vol. 16, no. 1, pp. 53–56, 2009 (doi: 10.1109/LSP.2008.2008571).
[33] M. Cooke, J. Barker, S. Cunningham, and X. Shao, “An audio-visual corpus for speech perception and automatic speech recognition”, The J. of the Acoustical Soc. of America, vol. 120, no. 5, pp. 2421–2424, 2006 (doi: 10.1121/1.2229005).
[34] Y.-X. Wang and Y.-J. Zhang, “Nonnegative matrix factorization: A comprehensive review”, IEEE Trans. on Knowl. and Data Engin., vol. 25, no. 6, pp. 1336–1353, 2013 (doi: 10.1109/TKDE.2012.51).
[35] A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, “Perceptual evaluation of speech quality (PESQ) – A new method for speech quality assessment of telephone networks and codecs”, in Proc. IEEE Int. Conf. on Acoust., Speech and Sig. Process. ICASSP 2001, Salt Lake City, UT, USA, 2001, vol. 2, pp. 749–752 (doi: 10.1109/ICASSP.2001.941023).

Notes

Record prepared under agreement 509/P-DUN/2018, financed by the Ministry of Science and Higher Education (MNiSW) from funds allocated to science dissemination activities (2018).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-2dcdde17-932b-46c8-8515-6949dc35f37f