Automatyczne znakowanie danych audio na platformie serwera baz danych Oracle

Kubiak, R.; Niewiadomy, D.; Pelikant, A.

Article details

Article title

Automatyczne znakowanie danych audio na platformie serwera baz danych Oracle

Authors

Kubiak R., Niewiadomy D., Pelikant A.

Title variants

Automated audio data tagging system for Oracle database platform

Abstract

This chapter presents the design of a system for automatic tagging of audio recordings. The system is based on an optimized Dynamic Time Warping (DTW) algorithm, a nonlinear time-alignment method, operating on mel-frequency cepstral coefficients (MFCC) and human factor cepstral coefficients (HFCC). The tagging mechanism relies on a fully configurable reference database of recordings and tag mappings. Finally, test results are presented that confirm the high quality of the proposed algorithms.

Keywords

automatic audio tagging, tags, MFCC, HFCC, DTW

Publisher

Wydawnictwo Politechniki Śląskiej

Journal

Studia Informatica

Year

2010

Volume

Vol. 31, No. 2A

Pages

363-374

Physical description

Bibliography: 10 items.

Contributors

author

Kubiak R.

author

Niewiadomy D.

author

Pelikant A.

Politechnika Łódzka, Instytut Mechatroniki i Systemów Informatycznych, dominik.niewiadomy@gmail.com

Bibliography

1. Weinstein E.: Query By Humming. A Survey, NYU and Google.
2. Ghias A., Logan J., Chamberlin D.: Query by humming - musical information retrieval in an audio database. ACM Multimedia 95, 1995.
3. Pelikant A., Niewiadomy D.: Klasyfikator podobieństwa w zapytaniach QBH oparty o współczynniki MFCC. BDAS, 2008.
4. Pelikant A., Niewiadomy D.: Query by Voice Example and sound similarity based on the Dynamic Time Warping algorithm. SMC, 2009.
5. Pelikant A., Niewiadomy D.: Implementation of MFCC vector generation in classification context. Journal of Applied Computer Science, 2008.
6. Skowronski M., Harris J.: Human Factor Cepstral Coefficients. J. Acoustical Society of America, Vol. 112, No. 5, Cancun, Mexico, Nov. 2002, p. 2305.
7. Sakoe H., Chiba S.: Dynamic programming algorithm optimization for spoken word recognition. Acoustics, Speech and Signal Processing, IEEE Transactions on, Vol. 26, No. 1.
8. Itakura F.: Minimum prediction residual principle applied to speech recognition. Acoustics, Speech and Signal Processing, IEEE Transactions on, Vol. 23, No. 1, 1975, pp. 67-72.
9. Sakurai Y., Faloutsos C., Yamamuro M.: Stream Monitoring under the Time Warping Distance. Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference.
10. http://en.wikipedia.org/wiki/Receiver_operating_characteristic.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BSL7-0046-0034