Article title

Segregation of songs and instrumentals: a precursor to voice/accompaniment separation from songs in noisy scenario

Content
Identifiers
Title variants
Languages of publication
EN
Abstracts
EN
The music industry has come a long way since its inception, and music producers have embraced modern technology to infuse life into their creations. Systems capable of separating sounds by source, especially vocals from songs, have always been a necessity and have accordingly drawn the attention of researchers. The challenge of vocal separation grows even greater in a multi-instrument environment. Before attempting source separation, a system must first be able to detect whether a piece of music contains vocals at all. Performing source separation on audio contaminated with noise is also highly challenging. In this paper, such a system is proposed and tested on a database of more than 99 hours of instrumentals and songs. Experiments were performed with both noise-free and noisy audio clips. Using line spectral frequency-based features, we obtained the highest accuracies of 99.78% and 99.34% (in the noise-free and noisy scenarios, respectively) from among six different classifiers, viz. BayesNet, Support Vector Machine, Multilayer Perceptron, LibLinear, Simple Logistic and Decision Table.
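As an illustrative sketch only (not the authors' implementation, whose exact frame sizes and predictor order the abstract does not state), line spectral frequencies can be computed from linear-prediction coefficients via the standard symmetric/antisymmetric polynomial construction. The order-10 predictor and the autocorrelation-method LPC below are assumptions for the example:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lsf_features(signal, order=10):
    """Line spectral frequencies (radians in (0, pi)) of one audio frame.

    Sketch under assumptions: autocorrelation-method LPC followed by the
    P(z)/Q(z) root-angle construction; not the paper's own pipeline.
    """
    # Autocorrelation-method LPC: solve the symmetric Toeplitz normal equations
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    A = np.concatenate([[1.0], -a])  # A(z) = 1 - sum_k a_k z^{-k}

    # Symmetric / antisymmetric polynomials
    #   P(z) = A(z) + z^-(p+1) A(1/z),  Q(z) = A(z) - z^-(p+1) A(1/z);
    # their unit-circle root angles interleave and define the LSFs.
    A1 = np.concatenate([A, [0.0]])
    P, Q = A1 + A1[::-1], A1 - A1[::-1]
    angles = np.angle(np.concatenate([np.roots(P), np.roots(Q)]))
    # Keep one angle per conjugate pair, dropping the trivial roots at 0 and pi
    eps = 1e-9
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])
```

For an order-p predictor this yields p monotonically increasing frequencies per frame; frame-wise statistics of such values could then feed a classifier of the kind compared in the paper.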
Contributors
  • Dept. of Computer Science, West Bengal State University, Kolkata, India
  • Dept. of Computer Science and Engineering, Aliah University, Kolkata, India
author
  • Dept. of Computer Science, The University of South Dakota, SD, USA
  • Dept. of Informatics, University of Evora, Evora, Portugal
  • Dept. of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, India
author
  • Dept. of Computer Science, West Bengal State University, Kolkata, India
Bibliography
  • [1] T.‑W. Leung, C.‑W. Ngo, and R. Lau, “ICA‑FX features for classification of singing voice and instrumental sound”. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 2, 2004, 367–370, 10.1109/ICPR.2004.1334222, ISSN: 1051‑4651.
  • [2] A. Chanrungutai and C. A. Ratanamahatana, “Singing voice separation for mono‑channel music using Non‑negative Matrix Factorization”. In: 2008 International Conference on Advanced Technologies for Communications, 2008, 243–246, 10.1109/ATC.2008.4760565.
  • [3] M. Rocamora and P. Herrera, “Comparing audio descriptors for singing voice detection in music audio files”. In: 11th Brazilian Symposium on Computer Music, São Paulo, Brazil, vol. 26, 2007.
  • [4] C.‑L. Hsu and J.‑S. Jang, “On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR‑1K Dataset”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 2, 2010, 310–319, 10.1109/TASL.2009.2026503.
  • [5] Z. Rafii and B. Pardo, “A simple music/voice separation method based on the extraction of the repeating musical structure”. In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, 221–224, 10.1109/ICASSP.2011.5946380.
  • [6] Z. Rafii and B. Pardo, “Repeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 1, 2013, 73–84, 10.1109/TASL.2012.2213249.
  • [7] A. Liutkus, Z. Rafii, R. Badeau, B. Pardo, and G. Richard, “Adaptive filtering for music/voice separation exploiting the repeating musical structure”. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, 53–56, 10.1109/ICASSP.2012.6287815.
  • [8] A. Ghosal, R. Chakraborty, B. C. Dhara, and S. K. Saha, “Song/instrumental classification using spectrogram based contextual features”. In: Proceedings of the CUBE International Information Technology Conference, New York, NY, USA, 2012, 21–25, 10.1145/2381716.2381722.
  • [9] M. Mauch, H. Fujihara, K. Yoshii, and M. Goto, “Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music”. In: ISMIR, 2011, 233–238.
  • [10] H. Burute and P. B. Mane, “Separation of singing voice from music accompaniment using matrix factorization method”. In: 2015 International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), 2015, 166–171, 10.1109/ICATCCT.2015.7456876.
  • [11] A. Ghosal, R. Chakraborty, B. C. Dhara, and S. K. Saha, “Instrumental/song classification of music signal using RANSAC”. In: 2011 3rd International Conference on Electronics Computer Technology, vol. 1, 2011, 269–272, 10.1109/ICECTECH.2011.5941603.
  • [12] L. Regnier and G. Peeters, “Singing voice detection in music tracks using direct voice vibrato detection”. In: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, 1685–1688, 10.1109/ICASSP.2009.4959926.
  • [13] A. Ozerov, P. Philippe, F. Bimbot, and R. Gribonval, “Adaptation of Bayesian Models for Single‑Channel Source Separation and its Application to Voice/Music Separation in Popular Songs”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 5, 2007, 1564–1578, 10.1109/TASL.2007.899291.
  • [14] C.‑L. Hsu, D. Wang, J.‑S. R. Jang, and K. Hu, “A Tandem Algorithm for Singing Pitch Extraction and Voice Separation From Music Accompaniment”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 5, 2012, 1482–1491, 10.1109/TASL.2011.2182510.
  • [15] B. Zhu, W. Li, R. Li, and X. Xue, “Multi‑Stage Non‑Negative Matrix Factorization for Monaural Singing Voice Separation”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, 2013, 2096–2107, 10.1109/TASL.2013.2266773.
  • [16] “Youtube”. https://www.youtube.com/, 2020. Accessed on: 2020‑09‑20.
  • [17] “Ethnologue: Languages of the World”. https://www.ethnologue.com/, 2020. Accessed on: 2020‑09‑20.
  • [18] H. Mukherjee, S. M. Obaidullah, S. Phadikar, and K. Roy, “SMIL ‑ A Musical Instrument Identification System”. In: J. K. Mandal, P. Dutta, and S. Mukhopadhyay, eds., Computational Intelligence, Communications, and Business Analytics, Singapore, 2017, 129–140, 10.1007/978‑981‑10‑6427‑2_11.
  • [19] H. Mukherjee, S. Phadikar, P. Rakshit, and K. Roy, “REARC‑a Bangla Phoneme recognizer”. In: 2016 International Conference on Accessibility to Digital World (ICADW), 2016, 177–180, 10.1109/ICADW.2016.7942537.
  • [20] K. K. Paliwal, “On the use of line spectral frequency parameters for speech recognition”, Digital Signal Processing, vol. 2, no. 2, 1992, 80–87, 10.1016/1051‑2004(92)90028‑W.
  • [21] N. Friedman, D. Geiger, and M. Goldszmidt, “Bayesian Network Classifiers”, Machine Learning, vol. 29, no. 2, 1997, 131–163, 10.1023/A:1007465528199.
  • [22] N. Cristianini and J. Shawe‑Taylor, “An Introduction to Support Vector Machines and Other Kernel‑based Learning Methods”. Cambridge University Press, March 2000.
  • [23] R.‑E. Fan, K.‑W. Chang, C.‑J. Hsieh, X.‑R. Wang, and C.‑J. Lin, “LIBLINEAR: A Library for Large Linear Classification”, The Journal of Machine Learning Research, vol. 9, 2008, 1871–1874.
  • [24] H. Mukherjee, C. Halder, S. Phadikar, and K. Roy, “READ—A Bangla Phoneme Recognition System”. In: S. C. Satapathy, V. Bhateja, S. K. Udgata, and P. K. Pattnaik, eds., Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, Singapore, 2017, 599–607, 10.1007/978‑981‑10‑3153‑3_59.
  • [25] M. Sumner, E. Frank, and M. Hall, “Speeding Up Logistic Model Tree Induction”. In: A. M. Jorge, L. Torgo, P. Brazdil, R. Camacho, and J. Gama, eds., Knowledge Discovery in Databases: PKDD 2005, Berlin, Heidelberg, 2005, 675–683, 10.1007/11564126_72.
  • [26] R. Kohavi. “The power of decision tables”. In: N. Lavrac and S. Wrobel, eds., Machine Learning: ECML‑95, volume 912, 174–189. Springer, Berlin, Heidelberg, 1995.
  • [27] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update”, ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, 2009, 10–18, 10.1145/1656274.1656278.
  • [28] J. Demšar, “Statistical Comparisons of Classifiers over Multiple Data Sets”, Journal of Machine Learning Research, vol. 7, 2006, 1–30.
Notes
Record created with funds from the Ministry of Science and Higher Education (MNiSW), agreement No. 461252, under the programme "Social Responsibility of Science" - module: Popularisation of science and promotion of sport (2020).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-c3478ef0-46e1-4f10-91e8-05b067b65e21