Article title

ICA-based Single Channel Audio Separation: New Bases and Measures of Distance

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Independent Component Analysis (ICA) can be used for single-channel audio separation if the mixed signal is transformed into the time-frequency domain and the resulting matrix of magnitude coefficients is processed by ICA. Previous works used only frequency (spectral) vectors and the Kullback-Leibler distance measure for this task. New decomposition bases are proposed: time vectors and time-frequency components. The applicability of several different measures of distance between components is analysed. An algorithm for clustering the components is presented and tested on mixes of two and three sounds. The perceptual quality of separation obtained with the proposed distance measures was evaluated in listening tests, which indicated the “beta” and “correlation” measures as the most appropriate. The “Euclidean” distance is shown to be appropriate for sounds with varying amplitudes. The perceptual effect of the amount of variance used was also evaluated.
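The clustering step described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact formulas, so everything below is an assumption: the three distance functions are common textbook definitions of the Euclidean, correlation and (symmetrised) Kullback-Leibler measures, the "beta" measure is omitted because its definition is not stated here, and `cluster_components` is a hypothetical average-linkage grouping, not the author's algorithm. The ICA step itself is not implemented; synthetic spectral vectors stand in for ICA components.

```python
import numpy as np


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two component vectors."""
    return float(np.linalg.norm(a - b))


def correlation_distance(a, b):
    """One minus the Pearson correlation coefficient (assumed definition)."""
    return float(1.0 - np.corrcoef(a, b)[0, 1])


def symmetric_kl_distance(a, b, eps=1e-12):
    """Symmetrised Kullback-Leibler divergence between two vectors,
    each normalised to sum to one (assumed definition)."""
    p = np.abs(a) + eps
    q = np.abs(b) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def cluster_components(components, n_sources, distance):
    """Hypothetical helper: merge the two closest clusters (average
    linkage) until only n_sources clusters of component indices remain."""
    clusters = [[i] for i in range(len(components))]
    while len(clusters) > n_sources:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = np.mean([distance(components[a], components[b])
                             for a in clusters[i] for b in clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Synthetic stand-ins for spectral components: two "sources" whose
# energy is concentrated around different frequency bins.
bins = np.arange(64)


def bump(center, width=3.0):
    return np.exp(-0.5 * ((bins - center) / width) ** 2)


components = [bump(10), bump(40), bump(11), bump(41)]

for name, dist in [("euclidean", euclidean_distance),
                   ("correlation", correlation_distance),
                   ("kl", symmetric_kl_distance)]:
    print(name, cluster_components(components, 2, dist))
```

In this toy case all three measures group components 0 and 2 together and 1 and 3 together. On real mixtures the choice of measure matters, which is the question the listening tests in the abstract address.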
Year
Pages
311-331
Physical description
Bibliography: 34 items, tables, charts.
Authors
author
  • Studio sQuat Professional Sound Studio Recording Pl. Tysiaclecia PP 1, 22-100 Chełm, Poland, 75mika@wp.pl
Bibliography
  • 1. Bach F.R., Jordan M.I. (2005), Blind one-microphone speech separation: A spectral learning approach, Advances in neural information processing systems, 17, 65-72.
  • 2. Barry D., Fitzgerald D., Coyle E., Lawlor G. (2005), Single Channel Source Separation using Short-time Independent Component Analysis, 119th Audio Engineering Society Convention, Convention Paper 6603, New York.
  • 3. Barry D., Lawlor B., Coyle E. (2004), Sound source separation: azimuth discrimination and resynthesis, Proc. of the 7th Int. Conference on Digital Audio Effects (DAFX-04), Naples, Italy.
  • 4. Bech S., Zacharov N. (2006), Perceptual Audio Evaluation, John Wiley & Sons, Inc., Chichester, England.
  • 5. Box G., Tiao G. (1973), Bayesian Inference In Statistical Analysis, John Wiley & Sons, Inc., England.
  • 6. Brungart D.S., Chang P.S., Simpson B.D., Wang D.L. (2006), Isolating the energetic component of speech-on-speech masking with ideal t-f segregation, Journal of the Acoustical Society of America, 120, 4007-4018.
  • 7. Cardoso J.-F. (1998), Blind Signal Separation: statistical principles, Proceedings of the IEEE, 86, 10, 2009-2025.
  • 8. Casey M.A. (2001), Separation of Mixed Audio Sources by Independent Subspace Analysis, MERL - A Mitsubishi Electric Research Laboratory, TR-2001-31.
  • 9. Cooney R., Cahill N., Lawlor R. (2006), An Enhanced implementation of the ADRess (Azimuth Discrimination and Resynthesis) Music Source Separation Algorithm, Audio Engineering Society, Convention Paper 6984, 121st Convention, San Francisco, USA.
  • 10. Cover T.M., Thomas J.A. (1991), Elements of Information Theory, John Wiley & Sons, Inc., New York.
  • 11. Davies M.E., James C.J. (2007), Source separation using single channel ICA, Signal Process., 87, 8, 1819-1832.
  • 12. Duan Z., Zhang Y., Zhang C., Shi Z. (2008), Unsupervised Single-Channel Music Source Separation by Average Harmonic Structure Modeling, IEEE Transactions on Audio, Speech and Language Processing, 16, 4, 766-778.
  • 13. Dziubinski M., Kostek B. (2010), Evaluation of the separation algorithm performance employing ANNs, Chapter in Advances in Soft Computing, 80, pp. 27-37, Springer Verlag, Berlin, Heidelberg.
  • 14. Hyvärinen A., Karhunen J., Oja E. (2001), Independent Component Analysis, A Wiley-Interscience Publication, John Wiley & Sons, Inc., New York.
  • 15. Jain A.K., Dubes R.C. (1988), Algorithms for Clustering Data, Prentice-Hall advanced reference series, Prentice-Hall, Inc., Upper Saddle River, NJ, USA.
  • 16. Jain A.K., Murty M.N., Flynn P.J. (1999), Data Clustering: A Review, ACM Computing Surveys, 31, 3.
  • 17. Jang G.-J., Lee T.-W. (2002), Learning statistically efficient features for speaker recognition, Neurocomputing, 49, 1-4, 329-348.
  • 18. Jang G.-J., Lee T.-W. (2003), A Maximum Likelihood Approach to Single-Channel Source Separation, Journal of Machine Learning Research, 4, 1365-1392.
  • 19. Kostek B. (2005), Perception-Based Data Processing in Acoustics, Springer Verlag, Berlin.
  • 20. Lee T.-W., Lewicki M.S. (2000), The generalized Gaussian Mixture Model using ICA, International workshop on ICA, 239-244.
  • 21. Litvin Y., Cohen I. (2009), Single-channel source separation of audio signals using Bark Scale Wavelet Packet Decomposition, IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2009, 1-4.
  • 22. Masters A.S. (2006), Stereo music source separation via Bayesian modeling, Ph.D. dissertation, Stanford University, USA.
  • 23. MacQueen J. (1967), Some methods for classification and analysis of multivariate observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 281-297.
  • 24. Mijović B., De Vos M., Gligorijević I., Taelman J., Van Huffel S. (2010), Source Separation From Single-Channel Recordings by Combining Empirical-Mode Decomposition and Independent Component Analysis, IEEE Transactions on Biomedical Engineering, 57, 9, 2188-2196.
  • 25. Mika D. (2009), Separation of sounds from various sources in a mixed acoustic signal [in Polish], Ph.D. Thesis, AGH University, Kraków, Poland.
  • 26. Paatero P., Tapper U. (1997), Least squares formulation of robust non-negative factor analysis, Chemometr. Intell. Lab., 37, 1, 23-35.
  • 27. Papoulis A. (1991), Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 3rd edition, New York.
  • 28. Rickard S., Yilmaz O. (2002), On the approximate W-disjoint orthogonality of speech, [in:] ICASSP, Orlando, Florida, 529-531.
  • 29. Seber G.A.F. (1984), Multivariate Observations, John Wiley & Sons, Inc., New York.
  • 30. Taghia J., Ali Doostari M. (2009), Subband-based Single-channel Source Separation of Instantaneous Audio Mixtures, World Applied Sciences Journal, 6, 784-792.
  • 31. Vinyes M., Bonada J., Loscos A. (2006), Demixing Commercial Music Productions via Human-Assisted T-f Masking, Audio Engineering Society, Convention Paper 6719, 120th Convention, Paris, France.
  • 32. Wang D.L., Brown G.J. (2006), Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, IEEE Press/Wiley-Interscience, Hoboken, NJ.
  • 33. Wang B., Plumbley M. (2006), Investigating single-channel audio source separation methods based on non-negative matrix factorization, ICA Research Network International Workshop.
  • 34. Yilmaz O., Rickard S. (2004), Blind separation of speech mixtures via t-f masking, IEEE Transactions on Signal Processing, 52, 7, 1830-1847.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS8-0020-0021