Article title

Searching for loops and sound samples with feature learning

Authors
Identifiers
Title variants
Conference
17th Conference on Computer Science and Intelligence Systems
Publication languages
EN
Abstracts
EN
In this paper, we evaluate feature learning in the problem of retrieving subjectively interesting sounds from electronic music tracks. We describe an active learning system designed to find sounds categorized as samples or loops. These retrieval tasks originate from a broader R&D project concerning the use of machine learning to streamline the creation of video game content synchronized with soundtracks. The method is expected to operate with limited data availability and therefore cannot rely on supervised learning of what constitutes an "interesting sound". We apply an active learning procedure that finds sound samples without predefined classes through user interaction, and we evaluate the use of neural network feature extraction in this problem.
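Below is a minimal sketch of an active-learning retrieval loop of the kind the abstract describes: rank sounds by their learned feature embeddings, query the user for labels, and refine the ranking. The embedding source, the logistic-regression relevance model, and the uncertainty-sampling query strategy are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression

def active_retrieval(embeddings, ask_user, n_rounds=10, batch=5, seed=0):
    """Rank sound indices by predicted relevance after interactive labeling.

    embeddings: (n_sounds, d) array of learned audio features, e.g. from a
    pretrained network; ask_user: callable mapping a sound index to a label
    (1 = interesting, 0 = not interesting).
    """
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    # Seed with random examples: no classes are predefined up front.
    labeled = list(rng.choice(n, size=batch, replace=False))
    labels = [ask_user(i) for i in labeled]
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        if len(set(labels)) > 1:
            clf.fit(embeddings[labeled], labels)
            scores = clf.predict_proba(embeddings)[:, 1]
            # Uncertainty sampling: query sounds the model is least sure about.
            order = np.argsort(np.abs(scores - 0.5))
        else:
            # Only one class seen so far: keep exploring at random.
            order = rng.permutation(n)
        seen = set(labeled)
        queries = [i for i in order if i not in seen][:batch]
        labeled += queries
        labels += [ask_user(i) for i in queries]
    if len(set(labels)) > 1:
        clf.fit(embeddings[labeled], labels)
        return np.argsort(-clf.predict_proba(embeddings)[:, 1])
    return rng.permutation(n)

The loop is agnostic to where the embeddings come from; swapping in features from any pretrained audio network (for example, one trained with the self-supervised methods cited in the bibliography below) leaves it unchanged.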
Year
Volume
Pages
13-18
Physical description
Bibliography: 27 items, charts.
Contributors
author
  • Wroclaw University of Science and Technology, Faculty of Information and Communication Technology, Department of Artificial Intelligence
Bibliography
  • 1. E. J. Humphrey, J. P. Bello, Y. LeCun, “Moving beyond feature design: Deep architectures and automatic feature learning in music informatics,” in ISMIR 2012, pp. 403-408.
  • 2. M. Defferrard, K. Benzi, P. Vandergheynst, X. Bresson, “FMA: A dataset for music analysis,” arXiv preprint https://arxiv.org/abs/1612.01840, 2017, https://doi.org/10.48550/arXiv.1612.01840
  • 3. Y. A. Chen, Y. H. Yang, J. C. Wang, H. Chen, “The AMG1608 dataset for music emotion recognition,” in ICASSP 2015, pp. 693-697, https://doi.org/10.1109/ICASSP.2015.7178058
  • 4. J. W. Kim, J. Salamon, P. Li, J. P. Bello, “Crepe: A convolutional representation for pitch estimation,” in ICASSP 2018, pp. 161-165, https://doi.org/10.1109/ICASSP.2018.8461329
  • 5. J. Jakubik, “Retrieving Sound Samples of Subjective Interest With User Interaction,” in Proc. of the 2020 Federated Conference on Computer Science and Information Systems, 2020, pp. 387-390, https://doi.org/10.15439/2020F82
  • 6. B. McFee, D. Ellis, “Analyzing Song Structure with Spectral Clustering,” in ISMIR 2014, pp. 405-410, https://doi.org/10.5281/zenodo.1415778
  • 7. S. Kothinti, K. Imoto, D. Chakrabarty, G. Sell, S. Watanabe, M. Elhilali, “Joint acoustic and class inference for weakly supervised sound event detection,” in ICASSP 2019, pp. 36-40, https://doi.org/10.1109/ICASSP.2019.8682772
  • 8. H. Xie, T. Virtanen, “Zero-Shot Audio Classification via Semantic Embeddings,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp. 1233-1242, https://doi.org/10.48550/arXiv.2011.12133
  • 9. S. Makino, “Audio Source Separation,” Springer, 2018.
  • 10. J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, M. B. Sandler, “A tutorial on onset detection in music signals,” in IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, 2005, pp. 1035-1047, https://doi.org/10.1109/TSA.2005.851998
  • 11. R. Marxer, J. Janer, “Study of Regularizations and Constraints in NMF-Based Drums Monaural Separation,” in Proc. of the Int. Conference on Digital Audio Effects (DAFx’13), Maynooth, Ireland, 2013.
  • 12. L. Lu, M. Wang, H. J. Zhang, “Repeating pattern discovery and structure analysis from acoustic music data,” in Proc. of the 6th ACM SIGMM Int. Workshop on Multimedia Information Retrieval, 2004, pp. 275-282, https://doi.org/10.1145/1026711.1026756
  • 13. P. López-Serrano, C. Dittmar, J. Driedger, M. Müller, “Towards Modeling and Decomposing Loop-Based Electronic Music,” in ISMIR 2016, pp. 502-508.
  • 14. J. B. L. Smith, M. Goto, “Nonnegative tensor factorization for source separation of loops in audio,” in ICASSP 2018, Calgary, Canada, pp. 171–175.
  • 15. J. B. L. Smith, Y. Kawasaki, M. Goto, “Unmixer: An interface for extracting and remixing loops,” in ISMIR 2019, Delft, Netherlands, pp. 824–831, https://doi.org/10.5281/zenodo.3527938
  • 16. C. Chen, S. Xin, “Combined Transfer and Active Learning for High Accuracy Music Genre Classification Method,” in 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), IEEE, 2021, https://doi.org/10.1109/ICBAIE52039.2021.9390062
  • 17. A. Sarasúa, C. Laurier, P. Herrera, “Support vector machine active learning for music mood tagging,” in 9th International Symposium on Computer Music Modeling and Retrieval (CMMR), London, 2012, https://doi.org/10.1007/s00530-006-0032-2
  • 18. W. Li, X. Feng, M. Xue, “Reducing manual labeling in singing voice detection: An active learning approach,” in 2016 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2016, https://doi.org/10.1109/ICME.2016.7552987
  • 19. Y. Fu, X. Zhu, B. Li, “A survey on instance selection for active learning,” in Knowledge and Information Systems, vol. 35, no. 2, pp. 249-283, 2013, https://doi.org/10.1007/s10115-012-0507-8
  • 20. T. H. Hsieh, L. Su, Y. H. Yang, “A streamlined encoder/decoder architecture for melody extraction,” in ICASSP 2019, pp. 156-160, https://doi.org/10.1109/ICASSP.2019.8682389
  • 21. J. Spijkervet, J. A. Burgoyne, “Contrastive Learning of Musical Representations,” arXiv preprint https://arxiv.org/abs/2103.09410, 2021, https://doi.org/10.48550/arXiv.2103.09410
  • 22. J.-B. Grill, F. Strub, F. Altché, C. Tallec, P. Richemond, E. Buchatskaya, M. Valko, “Bootstrap Your Own Latent: A new approach to self-supervised learning,” in Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 21271-21284, https://doi.org/10.48550/arXiv.2006.07733
  • 23. K. Nguyen, Y. Nguyen, B. Le, “Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR,” arXiv preprint https://arxiv.org/abs/2108.00587, 2021, https://doi.org/10.48550/arXiv.2108.00587
  • 24. B. McFee, C. Raffel, D. Liang, D. P. W. Ellis, M. McVicar, E. Battenberg, O. Nieto, “librosa: Audio and music signal analysis in python,” in Proc. of the 14th python in science conference, pp. 18-25, 2015.
  • 25. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, et al. “PyTorch: An Imperative Style, High-Performance Deep Learning Library,” in Advances in Neural Information Processing Systems, vol. 32, 2019, pp. 8024-8035, https://doi.org/10.48550/arXiv.1912.01703
  • 26. C. R. Harris, K. J. Millman, S. J. van der Walt, “Array programming with NumPy,” Nature, vol. 585, pp. 357-362, 2020, https://doi.org/10.1038/s41586-020-2649-2
  • 27. F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” in Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011, https://doi.org/10.48550/arXiv.1201.0490
Notes
Record created with funds from the Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the "Social Responsibility of Science" programme - module: Popularization of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-8fe2c367-aed5-4858-a356-803853b7d01f