A two-step approach to blind deconvolution of speech and sound sources in the time domain

Okazaki, F.A.; Kasprzak, W.

Selected full texts from this journal

http://journals.pan.pl/dlibra/journal/95347

Abstract

To understand voice commands given by an operator, a user or any other person, a robot needs to focus on a single source, acquire a clear speech sample and recognize it. A two-step approach to the deconvolution of speech and sound mixtures in the time domain is proposed. In the first step, we apply a deconvolution procedure constrained in the sense that the de-mixing matrix has fixed diagonal values with no parameters at non-zero delays. We derive an adaptive rule for updating this de-mixing (deconvolution) matrix. As a consequence of the constraint, the individual outputs extracted in the first step may still be self-convolved. In the second step, we try to eliminate this corruption by a de-correlation process applied independently to each output channel.

Keywords

blind signal analysis; convolved mixtures; independent component analysis; robotic sensors; speech reconstruction

Publisher

Polska Akademia Nauk, Wydział IV Nauk Technicznych

Journal

Bulletin of the Polish Academy of Sciences. Technical Sciences

Year

2005

Volume

Vol. 53, no. 1

Pages

49–55

Physical description

Bibliography: 19 items, 10 figures.

Contributors

author

Okazaki F.A.

author

Kasprzak W.

Institute of Control and Computation Engineering, Warsaw University of Technology, 15/19 Nowowiejska St., 00-665 Warszawa, Poland, okazaki@elka.pw.edu.pl

Bibliography

[1] K. Tchon (ed.), VIII State Conference of Robotics, (Polanica Zdrój, Poland, June 2004), WKiŁ, Warszawa, (2005), (in Polish).
[2] C. Zielinski, “A unified formal description of behavioural and deliberative robotic multi-agent systems”, Proc. 7th IFAC International Symposium on Robot Control SYROCO 2003, Wrocław, Poland, vol. 2, 479–486 (2003).
[3] A. Cichocki and S. Amari, Adaptive Blind Signal and Image Processing, John Wiley, Chichester, UK, (2002).
[4] A. Hyvarinen, J. Karhunen and E. Oja, Independent Component Analysis, John Wiley & Sons, New York etc., (2001).
[5] N. Murata, S. Ikeda and A. Ziehe, “An approach to blind source separation based on temporal structure of speech signal”, Neurocomputing 41(4), 1–24 (2001).
[6] W. Kasprzak, “Blind deconvolution of timely correlated sources by gradient descent search”, in: Image Processing and Communications, An International Journal, edited by ATR Bydgoszcz, 9(1), 33–52 (2003).
[7] P. Smaragdis, “Blind separation of convolved mixtures in frequency domain”, Neurocomputing 22 (1), 21–34 (1998).
[8] N. Araki et al., “Fundamental limitation of frequency domain blind source separation for convolved mixture of speech”, Proceedings ICASSP2001, 5, 2737–2740 (2001).
[9] W. Kasprzak and A. Okazaki, “Blind deconvolution of timely-correlated sources by homomorphic filtering in Fourier space”, Fourth Int. Symposium on Independent Component Analysis and Blind Signal Separation - ICA’2003, Nara, Japan, 2003, NTT Comm. Science Lab., pp. 1029–1034 (2003).
[10] H. Saruwatari, S. Kurita and K. Takeda, “Blind source separation combining frequency-domain ICA and beamforming”, Proceedings ICASSP2001, 5, 2733–2736 (2001).
[11] T. Nishikawa, H. Saruwatari, K. Shikano, S. Araki and S. Makino, “Multistage ICA for blind source separation of real acoustic convolutive mixture”, Fourth Int. Symposium on Independent Component Analysis and Blind Signal Separation – ICA’2003, Nara, Japan, NTT Comm. Science Lab., pp. 523–528 (2003).
[12] D.C.B. Chan, Blind Signal Separation, Ph.D. thesis, University of Cambridge, Engineering Department, Cambridge, UK, (1997).
[13] W. Kasprzak, A. Cichocki and S. Amari, “Blind source separation with convolutive noise cancellation”, Neural Computing and Applications, Springer-Verlag London Ltd., vol. 6, 127–141 (1997).
[14] A. Okazaki and W. Kasprzak, “Deconvolution of speech signals from their mixture under constant mixing matrix diagonal”, VIII State Conference of Robotics, (Polanica Zdrój, June 2004), WKiŁ, Warszawa, (2004), (in Polish).
[15] I. Sabala, Multichannel Deconvolution and Separation of Statistically Independent Signals for Unknown Dynamic Systems, Ph.D. dissertation, Warsaw University of Technology, Dep. of Electrical Engineering, Warsaw, (1998).
[16] M-Audio Inc., “M-Audio Delta Series 44 User’s Manual”, M-Audio, http://www.m-audio.com, (2000).
[17] Shure Corporation, “Shure C608 user’s manual”, Shure Company, (2003).
[18] F. Antonioli, “WWW page of company FASoft with program n-Track Studio”, http://www.fasoft.com
[19] M. Burbeck, J. Haberman and D. Mazzoni, “WWW page of project Audacity”, http://audacity.sourceforge.net/.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPG5-0005-0037