Article title

Frequency Selection Based Separation of Speech Signals with Reduced Computational Time Using Sparse NMF

Publication languages
EN
Abstracts
EN
This paper describes the application of wavelet decomposition to speed up the separation of mixed speech signals by non-negative matrix factorisation (NMF). It is assumed that the basis vectors from the training data of the individual speakers have already been recorded. The magnitude spectrogram of the mixed signal is factorised using NMF, taking the sparseness of speech signals into account. The high-frequency components of the signal contain only a very small amount of the signal energy. Rejecting these components reduces the size of the input signal, which in turn reduces the computational time of the matrix factorisation. The lower-energy part of the signal is separated out using wavelet decomposition. The work covers wideband microphone speech signals and standard audio signals from digital video equipment. Compared with an existing model, the proposed model improves separation capability in terms of the correlation between the separated and original signals. The obtained signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) are also larger than those of the existing model. The proposed model also reduces computational time, which results in faster operation.
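The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes pre-trained per-speaker basis matrices, simply truncates the upper frequency rows of the magnitude spectrogram (a stand-in for the wavelet decomposition the paper uses), and estimates sparse activations with a multiplicative KL-divergence update carrying an L1 penalty. The function names (keep_low_bins, sparse_activations, separate) and the parameters keep_fraction and sparsity are hypothetical, chosen only for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def keep_low_bins(V, keep_fraction=0.5):
    """Keep only the lowest rows of a (frequency x time) magnitude spectrogram.

    Assumption: plain truncation stands in for the paper's wavelet-based
    rejection of low-energy high-frequency components.
    """
    n_keep = max(1, int(round(V.shape[0] * keep_fraction)))
    return V[:n_keep]


def sparse_activations(V, W, sparsity=0.1, n_iter=200, seed=0):
    """Estimate H >= 0 such that V ~ W @ H, with the basis W held fixed.

    Multiplicative update for the KL divergence with an L1 penalty on H
    (the penalty weight `sparsity` is an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    col_sums = W.sum(axis=0)[:, None]  # W^T applied to an all-ones matrix
    for _ in range(n_iter):
        ratio = V / (W @ H + EPS)
        H *= (W.T @ ratio) / (col_sums + sparsity)
    return H


def separate(V, bases, sparsity=0.1, n_iter=200):
    """Split V into one magnitude estimate per speaker using soft masks."""
    W = np.hstack(bases)  # concatenate the per-speaker dictionaries
    H = sparse_activations(V, W, sparsity, n_iter)
    total = W @ H + EPS
    estimates, start = [], 0
    for B in bases:
        stop = start + B.shape[1]
        # Each speaker's share of the model is applied as a mask on V.
        estimates.append(V * (B @ H[start:stop]) / total)
        start = stop
    return estimates
```

Because the update cost scales with the number of frequency rows in V and W, halving the retained bins roughly halves the work per iteration, which is the source of the speed-up the abstract describes. Resynthesis to the time domain (mixture phase, inverse transform) is omitted here.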
Year
Pages
287–295
Physical description
Bibliography: 19 items, figures, tables, charts
Authors
  • Department of Electronics, Aligarh Muslim University, Aligarh, India
author
  • Department of Electronics, Aligarh Muslim University, Aligarh, India
author
  • Department of Electronics, Aligarh Muslim University, Aligarh, India
author
  • Department of Electronics, Aligarh Muslim University, Aligarh, India
Bibliography
  • 1. Benetos E., Kotti M., Kotropoulos C. (2006), Musical instrument classification using non-negative Matrix factorization algorithms and subset feature selection, IEEE International Conference on Acoustics, Speech and Signal Processing, 5, 221–224.
  • 2. Cho Y-C., Choi S., Bang S-Y. (2003), Non-negative component parts of sound for classification, 3rd IEEE International Symposium on Signal Processing and Information Technology, 633–636.
  • 3. Demir C., Saraclar M., Cemgil A. T. (2013), Single-channel speech-music separation for robust ASR with mixture models, IEEE Transactions on Audio, Speech, and Language Processing, 21, 4, 725–736.
  • 4. Févotte C., Bertin N., Durrieu J. (2009), Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis, Neural Computation, 21, 793–830.
  • 5. Févotte C., Gribonval R., Vincent E. (2005), BSS EVAL toolbox user guide revision 2.0, Tech. Rep. 1706, IRISA, Rennes, France.
  • 6. Hoyer P. O. (2004), Non-negative matrix factorization with sparseness constraint, Journal of Machine Learning Research, 1457–1469.
  • 7. Kim J., Park H. (2008), Sparse nonnegative matrix factorization for clustering, Georgia Institute of Technology, GT-CSE-08-01.
  • 8. Lee D. D., Seung H. S. (1999), Learning the parts of objects by non-negative matrix factorization, Nature, 401, 788–791.
  • 9. Lee D. D., Seung H. S. (2000), Algorithms for non-negative matrix factorization, Advances in Neural Information Processing Systems, 13, 556–562.
  • 10. Nasersharif B., Abdali S. (2015), Speech/music separation using non-negative matrix factorization with combination of cost functions, International Symposium on Artificial Intelligence and Signal Processing (AISP), 107–111.
  • 11. Paatero P., Tapper U. (1994), Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values, Environmetrics, 5, 111–126.
  • 12. Reetz H., Jongman A. (2011), Phonetics: transcription, production, acoustics, and perception, Wiley-Blackwell, ISBN: 978-1-4443-5854-4, pp. 182–200.
  • 13. Schmidt M. N., Olsson R. K. (2006), Single-channel speech separation using sparse non-negative matrix factorization, 9th International Conference on Spoken Language Processing, Pittsburgh, PA, USA.
  • 14. Upadhyaya P., Farooq O., Varshney P., Upadhyaya A. (2013), Enhancement of VSR using low dimension visual feature, International Conference of Multimedia, Signal Processing and Communication Technologies (IMPACT), Aligarh, India, pp. 71–74.
  • 15. Vincent E., Gribonval R., Févotte C. (2006), Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech, and Language Processing, 14, 1462–1469.
  • 16. Walpole R. E., Myers R. H., Myers S. L., Ye K. E. (2011), Probability and Statistics for Engineers and Scientists, 9th ed., Pearson, ISBN: 978-0-3216-2911-1, p. 433.
  • 17. Wang Y., Li Y., Ho K. C., Zare A., Skubic M. (2014), Sparsity promoted non-negative matrix factorization for source separation and detection, 19th International Conference on Digital Signal Processing (DSP), 640–645.
  • 18. Wang Z., Sha F. (2014), Discriminative non-negative matrix factorization for single-channel speech separation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3749–3753, Florence, Italy, 4–9 May.
  • 19. Zhu B., Li W., Li R., Xue X. (2013), Multi-stage non-negative matrix factorization for monaural singing voice separation, IEEE Transactions on Audio, Speech, and Language Processing, 21, 10, 2096–2107.
Notes
Prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education) under agreement 812/P-DUN/2016 for science dissemination activities (2017 tasks).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b54f05ff-57e4-4e22-b2c7-9a9bcdf5349b