Retrieving Sound Samples of Subjective Interest With User Interaction

Jakubik, Jan

doi:10.15439/2020F82

Article - details

Article title

Retrieving Sound Samples of Subjective Interest With User Interaction

Authors

Jakubik Jan

Selected full texts from this journal

http://annals-csis.org

Identifiers

DOI

10.15439/2020F82

Title variants

Conference

Federated Conference on Computer Science and Information Systems (15 ; 06-09.09.2020 ; Sofia, Bulgaria)

Publication languages

Abstracts

This paper concerns the retrieval of audio samples with a high degree of user interaction, motivated by a practical use case. We consider an open set recognition scenario in which the goal is to find all occurrences of a subjectively interesting sound selected by a user within a particular audio file. We use only a single starting example and maintain interaction through yes-no answers from the user, indicating whether any new retrieved sound matches the target pattern. We present a small dataset for this task and evaluate a baseline solution based on Nonnegative Matrix Factorization and greedy feature selection.

Keywords

music information retrieval; matrix decomposition; active learning

(Polish keywords, translated) information retrieval; music; matrix decomposition; active learning

Publisher

Polskie Towarzystwo Informatyczne

Journal

Annals of Computer Science and Information Systems

Year

2020

Volume

Vol. 21

Pages

387-390

Physical description

Bibliography: 15 items, charts, formulas.

Contributors

author

Jakubik Jan

jan.jakubik@pwr.edu.pl

Department of Computational Intelligence, Wroclaw University of Science and Technology

Bibliography

[1] Rainer Typke, Frans Wiering, and Remco Veltkamp. A survey of music information retrieval systems. pages 153-160, 01 2005.
[2] Douglas Turnbull, Luke Barrington, David Torres, and Gert Lanckriet. Semantic annotation and retrieval of music and sound effects. IEEE Transactions on Audio, Speech, and Language Processing, 16(2):467-476, 2008.
[3] Moataz El Ayadi, Mohamed S. Kamel, and Fakhri Karray. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognition, 44(3):572-587, March 2011.
[4] Allen Huang and Raymond Wu. Deep learning for music. arXiv preprint arXiv:1606.04930, 2016.
[5] Walter J Scheirer, Anderson de Rezende Rocha, Archana Sapkota, and Terrance E Boult. Toward open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7):1757-1772, 2012.
[6] Wei Wang, Vincent W Zheng, Han Yu, and Chunyan Miao. A survey of zero-shot learning: Settings, methods, and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 10(2):1-37, 2019.
[7] Zhao Shuyang, Toni Heittola, and Tuomas Virtanen. Active learning for sound event classification by clustering unlabeled data. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 751-755. IEEE, 2017.
[8] Jeong Choi, Jongpil Lee, Jiyoung Park, and Juhan Nam. Zero-shot learning for audio-based music classification and tagging. arXiv preprint arXiv:1907.02670, 2019.
[9] Yifan Fu, Xingquan Zhu, and Bin Li. A survey on instance selection for active learning. Knowledge and Information Systems, 35(2):249-283, 2013.
[10] A. Holub, P. Perona, and M. C. Burl. Entropy-based active learning for object recognition. In 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pages 1-8, 2008.
[11] Ozan Sener and Silvio Savarese. Active learning for convolutional neural networks: A core-set approach. arXiv preprint arXiv:1708.00489, 2017.
[12] Jeong Choi, Jongpil Lee, Jiyoung Park, and Juhan Nam. Zero-shot learning for audio-based music classification and tagging. arXiv preprint arXiv:1907.02670, 2019.
[13] Vibha Tiwari. MFCC and its applications in speaker recognition. International Journal on Emerging Technologies, 1(1):19-22, 2010.
[14] Brian McFee, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. librosa: Audio and music signal analysis in Python. In Proceedings of the 14th Python in Science Conference, volume 8, 2015.
[15] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.

Notes

1. Track 2: Computer Science & Systems

2. Technical Session: Advances in Computer Science & Systems

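3. Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement no. 461252, under the programme "Społeczna odpowiedzialność nauki" ("Social Responsibility of Science") - module: Popularisation of science and promotion of sport (2021).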

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-82339a2e-c5ed-4af8-8ba5-6ad604fc4016