Performance Analysis of MVDR Beamformer Applied on an End-fire Microphone Array Composed of Unidirectional Microphones

Šarić, Zoran; Subotić, Miško; Bilibajkić, Ružica; Barjaktarović, Marko; Zdravković, Nebojša

doi:10.24425/aoa.2021.138154

Article - details

Article title

Performance Analysis of MVDR Beamformer Applied on an End-fire Microphone Array Composed of Unidirectional Microphones

Authors

Šarić Zoran , Subotić Miško , Bilibajkić Ružica , Barjaktarović Marko , Zdravković Nebojša

Identifiers

DOI

10.24425/aoa.2021.138154

Abstract

A microphone array with a minimum variance distortionless response (MVDR) beamformer is commonly used for ambient noise suppression. Unfortunately, the MVDR beamformer performs poorly in a real reverberant room because of multipath wave propagation. To overcome this problem, we propose three improvements. First, we propose an end-fire microphone array, which has been shown to have a better directivity index than the corresponding broadside microphone array. Second, we propose using unidirectional microphones instead of omnidirectional ones. Third, we propose adapting the beamformer's adaptive algorithm during speech pauses, which improves its robustness against room reverberation and against deviation from the optimal receiving direction. The performance of the proposed microphone array was analyzed theoretically using a diffuse noise model. Simulation analysis was performed for combined diffuse and coherent noise using the image model of the reverberant room. Real-room tests were conducted using a four-microphone array placed in a small office room. The theoretical analysis and the real-room tests showed that the proposed solution considerably improves speech quality.
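The comparison of end-fire and broadside steering under a diffuse noise model can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes omnidirectional sensors (not the unidirectional microphones studied in the paper), a far-field source, and the textbook diffuse-field coherence sin(kd)/(kd). The 5 cm spacing, 1 kHz analysis frequency, diagonal loading value and all function names are illustrative assumptions.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed value)

def steering_vector(freq, positions, theta):
    """Far-field steering vector for a linear array lying on one axis.
    theta is measured from the array axis (0 rad = end-fire)."""
    delays = positions * np.cos(theta) / C
    return np.exp(-2j * np.pi * freq * delays)

def diffuse_coherence(freq, positions):
    """Coherence matrix of spherically isotropic (diffuse) noise for
    omnidirectional sensors: sin(kd)/(kd) with k = 2*pi*freq/C."""
    dist = np.abs(positions[:, None] - positions[None, :])
    # np.sinc(x) is sin(pi*x)/(pi*x), hence the argument 2*f*d/C.
    return np.sinc(2.0 * freq * dist / C)

def mvdr_weights(noise_cov, d, loading=1e-2):
    """MVDR weights w = R^-1 d / (d^H R^-1 d), with diagonal loading
    (the loading value is an assumption made for numerical robustness)."""
    r = noise_cov + loading * np.eye(len(d))
    rinv_d = np.linalg.solve(r, d)
    return rinv_d / (d.conj() @ rinv_d)

def directivity_index_db(w, d, gamma):
    """Gain toward the look direction relative to diffuse noise, in dB."""
    num = np.abs(w.conj() @ d) ** 2
    den = np.real(w.conj() @ gamma @ w)
    return 10.0 * np.log10(num / den)

positions = 0.05 * np.arange(4)  # four microphones, 5 cm apart (assumed)
freq = 1000.0                    # analysis frequency in Hz (assumed)
gamma = diffuse_coherence(freq, positions)

d_end = steering_vector(freq, positions, 0.0)          # end-fire look direction
d_broad = steering_vector(freq, positions, np.pi / 2)  # broadside look direction
w_end = mvdr_weights(gamma, d_end)
w_broad = mvdr_weights(gamma, d_broad)

di_endfire = directivity_index_db(w_end, d_end, gamma)
di_broadside = directivity_index_db(w_broad, d_broad, gamma)
print(f"end-fire DI:  {di_endfire:.2f} dB")
print(f"broadside DI: {di_broadside:.2f} dB")
```

Under these assumptions the end-fire steering direction yields the higher directivity index, which agrees with the abstract's first proposal. Modelling unidirectional capsules would require a different coherence function and is outside this sketch.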

Keywords

adaptive beamforming; ambient noise suppression; differential microphone array; end-fire microphone array; MVDR beamformer

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

2021

Volume

Vol. 46, No. 4

Pages

611–621

Physical description

Bibliography: 40 items, figures, tables, charts.

Contributors

author

Šarić Zoran

sariczoran@yahoo.com

Laboratory of Acoustics, Life Activities Advancement Center Serbia

https://orcid.org/0000-0001-9964-9974

author

Subotić Miško

m.subotic@add-for-life.com

Laboratory of Acoustics, Life Activities Advancement Center Serbia

author

Bilibajkić Ružica

bilibajkic}@add-for-life.com

Laboratory of Acoustics, Life Activities Advancement Center Serbia

author

Barjaktarović Marko

mbarjaktarovic@etf.bg.ac.rs

Faculty of Electrical Engineering, University of Belgrade Serbia

author

Zdravković Nebojša

nzdravkovic@medf.kg.ac.rs

Faculty of Medical Sciences, University of Kragujevac Serbia

Notes

This paper is a result of research funded by the Ministry of Education, Science and Technological Development of the Republic of Serbia.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-d033a9ec-a257-4f94-a6b0-5d755c99ae38