Identifiers
Title variants
Publication languages
Abstracts
Dynamic Time Warping is a standard algorithm for matching time series irrespective of local tempo variations. Applying it in a Query-by-Humming interface to multimedia databases requires transposition independence, which involves additional, sometimes computationally expensive processing and may still fail, e.g., in the presence of a pitch trend or accidental key changes. The tune-following method proposed in this paper solves the pitch alignment problem adaptively, inspired by the human ability to ignore typical errors in sung melodies. Experimental validation on a database containing 4431 queries and over 5000 templates confirmed that the proposed algorithm improves the global recognition rate.
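As a minimal illustration of the baseline the abstract refers to, the sketch below implements textbook Dynamic Time Warping on pitch sequences together with a simple mean-removal step for transposition independence. It is not the tune-following method of the paper, which the abstract does not specify. The function names, the absolute-difference local cost, and the example pitch values (MIDI note numbers) are assumptions made for illustration only.

```python
import math


def dtw_distance(query, template):
    """Textbook DTW: accumulated absolute pitch difference along the best
    monotonic alignment, using steps (1,0), (0,1) and (1,1)."""
    n, m = len(query), len(template)
    # cost[i][j] = minimal accumulated cost of aligning query[:i] with template[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - template[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # query frame repeated
                                     cost[i][j - 1],      # template frame repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def remove_mean(pitches):
    """Simple (assumed) transposition normalisation: subtract the mean pitch."""
    mean = sum(pitches) / len(pitches)
    return [p - mean for p in pitches]


# Hypothetical data: the query is the template sung 5 semitones higher,
# with the first and last notes held longer (a local tempo variation).
template = [60, 62, 64, 65, 67]
query = [65, 65, 67, 69, 70, 72, 72]

raw = dtw_distance(query, template)
normalised = dtw_distance(remove_mean(query), remove_mean(template))
print(raw, normalised)
```

In this example the held notes shift the mean of the query, so the normalised distance is small but not exactly zero. A global offset correction of this kind cannot follow a pitch trend or a key change in the middle of the query, which is the limitation the abstract points to.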
Keywords
Publisher
Journal
Year
Volume
Pages
467--476
Physical description
Bibliography: 32 items, tables, charts.
Authors
author
- Institute of Information Technology, Lodz University of Technology, Wólczańska 215, 90-924 Łódź, Poland
Bibliography
- 1. Adams N., Marquez D., Wakefield G. (2005), Iterative deepening for melody alignment and retrieval, [in:] ISMIR 2005, 6th Int. Conf. on Music Information Retrieval, pp. 199-206.
- 2. Bisesi E., Parncutt R. (2013), An accent-based approach to automatic rendering of piano performance: Preliminary auditory evaluation, Archives of Acoustics, 36, 2, 283-296.
- 3. Boersma P. (1993), Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound, Institute of Phonetic Sciences, University of Amsterdam, Proceedings, 17, 97-110.
- 4. Dziubinski M., Kostek B. (2005), Octave error immune and instantaneous pitch detection algorithm, Journal of New Music Research, 34, 3, 273-292.
- 5. ESAC-DATA (2009), http://www.esac-data.org.
- 6. Eyben F., Böck S., Schuller B., Graves A. (2010), Universal onset detection with bidirectional long short-term memory neural networks, [in:] 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pp. 589-594.
- 7. Gerhard D. (2003), Pitch extraction and fundamental frequency: History and current techniques, Tech. rep., Dept. of Computer Science, University of Regina.
- 8. Ghias A., Logan J., Chamberlin D., Smith B.C. (1995), Query by humming - musical information retrieval in an audio database, [in:] Proc. of the 3rd ACM Int. Conf. on Multimedia, MULTIMEDIA ’95, pp. 231-236.
- 9. Głaczyński J., Łukasik E. (2011), Automatic music summarization. A “thumbnail” approach, Archives of Acoustics, 36, 2, 297-309.
- 10. Huang S., Wang L., Hu S., Jiang H., Xu B. (2008), Query by humming via multiscale transportation distance in random query occurrence context, [in:] IEEE Int. Conf. on Multimedia and Expo, pp. 1225-1228.
- 11. Itakura F. (1975), Minimum prediction residual principle applied to speech recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing, 23, 1, 67-72.
- 12. Jang (2009), http://mirlab.org/dataset/public/mir-qbsh-corpus.rar.
- 13. Jang D., Song C.J., Shin S., Lee J.S., Park S.J., Jang S.J., Lee S.P., Seo K.H. (2011), Query by singing/humming system based on the combination of DTW distances for MIREX 2011, http://www.music-ir.org/mirex/abstracts/2011/JSSLPl.pdf
- 14. Jang J.S.R., Lee H.R. (2001), Hierarchical filtering method for content-based music retrieval via acoustic input, [in:] Proceedings of the ninth ACM International Conference on Multimedia, MULTIMEDIA ’01, pp. 401-410.
- 15. Jeon W., Ma C. (2011), Efficient search of music pitch contours using wavelet transforms and segmented dynamic time warping, [in:] IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2304-2307.
- 16. Keogh E. (2002), Exact indexing of dynamic time warping, [in:] Proceedings of the 28th International Conference on Very Large Data Bases, VLDB ’02, pp. 406-417.
- 17. Lau E., Ding A., Calvin J. (2005), MusicDB: A query by humming system, Final project report, Massachusetts Institute of Technology, USA.
- 18. Lijffijt J., Papapetrou P., Hollmen J., Athitsos V. (2010), Benchmarking dynamic time warping for music retrieval, [in:] Proceedings of the 3rd International Conference on Pervasive Technologies Related to Assistive Environments, PETRA ’10, pp. 59:1-59:7.
- 19. Macrae R., Dixon S. (2010), Accurate real-time windowed time warping, [in:] Proceedings of the 11th International Society for Music Information Retrieval Conference, Downie J.S., Veltkamp R.C. [Eds.], ISMIR 2010, pp. 423-428.
- 20. McNab R.J., Smith L.A., Witten I.H., Henderson C.L., Cunningham S.J. (1996), Towards the digital music library: tune retrieval from acoustic input, [in:] Proceedings of the first ACM International Conference on Digital Libraries, DL ’96, pp. 11-18.
- 21. MIREX (2013), http://www.music-ir.org/mirex/wiki/2013:Main_Page.
- 22. Sakoe H., Chiba S. (1978), Dynamic programming algorithm optimization for spoken word recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing, 26, 1, 43-49.
- 23. Sakurai Y., Faloutsos C., Yamamuro M. (2007), Stream monitoring under the time warping distance, Research showcase, Carnegie Mellon University, URL http://repository.cmu.edu/compsci/529.
- 24. Salamon J., Gómez E. (2012), Melody extraction from polyphonic music signals using pitch contour characteristics, IEEE Transactions on Audio, Speech, and Language Processing, 20, 6, 1759-1770.
- 25. Salvador S., Chan P. (2004), FastDTW: Toward accurate dynamic time warping in linear time and space, [in:] 3rd Workshop on Mining Temporal and Sequential Data.
- 26. Typke R., Wiering F., Veltkamp R.C. (2007), Transportation distances and human perception of melodic similarity, Musicae Scientiae, pp. 153-181.
- 27. Uitdenbogerd A., Zobel J. (1999), Melodic matching techniques for large music databases, [in:] Proceedings of the seventh ACM International Conference on Multimedia (Part 1), MULTIMEDIA ’99, pp. 57-66.
- 28. Wang L., Huang S., Hu S., Laing J., Xu B. (2008), An effective and efficient method for query by humming system based on multi-similarity measurement fusion, [in:] Int. Conf. on Audio, Language and Image Processing, pp. 471-475.
- 29. Wolkowicz J., Kulka Z., Keselj V. (2008), N-gram based approach to composer recognition, Archives of Acoustics, 33, 1, 43-55.
- 30. Yang J., Liu J., Zhang W. (2010), A fast query by humming system based on notes, [in:] INTERSPEECH, pp. 2898-2901.
- 31. Yu H.M., Tsai W.H., Wang H.M. (2008), A query-by-singing system for retrieving karaoke music, IEEE Transactions on Multimedia, 10, 8, 1626-1637.
- 32. Zhu Y., Shasha D. (2003), Warping indexes with envelope transforms for query by humming, [in:] Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD ’03, pp. 181-192.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3ec2969a-1967-4f15-ba56-e984f82e0bf5