Combining Multiple Sound Sources Localization Hybrid Algorithm and Fuzzy Rule Based Classification for Real-time Speaker Tracking Application

Ibala, C; Astapov, S; Bettens, F; Escobar, F; Chang, X; Valderrama, C; Riid, A

Abstract

This work presents a novel approach to tracking a specific speaker among multiple speakers, using Minimum Variance Distortionless Response (MVDR) beamforming and fuzzy logic rule-based classification for speaker recognition. Sound source localization is performed with an improved delay-and-sum beamforming (DSB) computation methodology. The proposed hybrid algorithm first computes the Generalized Cross Correlation (GCC) to create a reduced search spectrum for the DSB algorithm. This methodology reduces the DSB localization computation burden by more than 70%. Moreover, for beamforming of high-frequency sound sources, DSB is preferred to MVDR in order to reduce logic and power consumption.
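
As a rough illustration of the two-stage idea described in the abstract, the Python sketch below uses a GCC delay estimate between the two outer microphones to narrow the range of steering angles that a delay-and-sum beamformer then scans. It is not the authors' implementation. The uniform linear array, the far-field model, the PHAT weighting, the ±15° search window, the speed of sound, and all function names (`gcc_phat`, `dsb_power`, `hybrid_localize`, `simulate`) are assumptions made for this example only.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def gcc_phat(x, y, fs, max_tau):
    """Estimate the delay of y relative to x (in seconds) with GCC-PHAT."""
    n = 2 * len(x)                          # zero-pad to avoid circular wrap
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    max_shift = max(1, int(np.ceil(max_tau * fs)))
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(cc) - max_shift) / fs


def dsb_power(X, freqs, mic_pos, theta_deg):
    """Output power of a delay-and-sum beam steered to theta_deg."""
    taus = mic_pos * np.sin(np.deg2rad(theta_deg)) / C
    steer = np.exp(2j * np.pi * freqs[None, :] * taus[:, None])  # undo delays
    beam = np.sum(X * steer, axis=0)
    return np.sum(np.abs(beam) ** 2)


def hybrid_localize(x, mic_pos, fs, grid_deg, window_deg=15.0):
    """Coarse GCC estimate first, then DSB scan over a reduced angle grid."""
    aperture = mic_pos[-1] - mic_pos[0]
    tau = gcc_phat(x[0], x[-1], fs, aperture / C)
    coarse = np.rad2deg(np.arcsin(np.clip(tau * C / aperture, -1.0, 1.0)))
    candidates = grid_deg[np.abs(grid_deg - coarse) <= window_deg]
    X = np.fft.rfft(x, axis=1)
    freqs = np.fft.rfftfreq(x.shape[1], 1.0 / fs)
    powers = [dsb_power(X, freqs, mic_pos, th) for th in candidates]
    best = candidates[int(np.argmax(powers))]
    return best, len(candidates) / len(grid_deg)


def simulate(theta_deg, mic_pos, fs, n, rng, noise=0.1):
    """Synthetic far-field white-noise source plus sensor noise (test data)."""
    s = rng.standard_normal(n)
    f = np.fft.rfftfreq(n, 1.0 / fs)
    taus = mic_pos * np.sin(np.deg2rad(theta_deg)) / C
    X = np.fft.rfft(s)[None, :] * np.exp(-2j * np.pi * f[None, :] * taus[:, None])
    return np.fft.irfft(X, n) + noise * rng.standard_normal((len(mic_pos), n))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics = np.arange(8) * 0.04              # 8 microphones, 4 cm spacing
    grid = np.arange(-90.0, 91.0, 1.0)      # full 1-degree search grid
    frames = simulate(40.0, mics, 16000, 4096, rng)
    angle, fraction = hybrid_localize(frames, mics, 16000, grid)
    print(f"estimated angle: {angle:.1f} deg, grid fraction scanned: {fraction:.2f}")
```

In this sketch the saving comes only from evaluating fewer steering angles; the cost of the GCC stage itself is not accounted for, so the fraction printed here should not be read as a reproduction of the reduction reported in the abstract.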

Keywords

DSB, GCC, localization, tracking, MVDR, fuzzy logic, classification, speaker recognition, speech recognition, voice biometrics, FPGA

Publisher

Lodz University of Technology, Department of Microelectronics and Computer Science

Journal

International Journal of Microelectronics and Computer Science

Year

2013

Volume

Vol. 4, no. 1

Pages

12-25

Physical description

Bibliography: 37 items

Contributors

author

Ibala C

sibala@acm.org

Department of Electronics and Computer Engineering, University of Limerick, Ireland

author

Astapov S

Sergei.astapov@dcc.ttu.ee

Tallinn University of Technology, Estonia

author

Bettens F

Faculty of Engineering, University of Mons, Belgium

author

Escobar F

Faculty of Engineering, University of Mons, Belgium

author

Chang X

author

Valderrama C

Faculty of Engineering, University of Mons, Belgium

author

Riid A

Tallinn University of Technology, Estonia

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-5afe99bf-3e76-4843-9bab-b412e85297e6