Korpusy wykorzystywane w systemach rozpoznawania mówcy

Ball, D.; Bielińska, E.

Article details

Article title

Korpusy wykorzystywane w systemach rozpoznawania mówcy

Authors

Ball D., Bielińska E.

Identifiers

Title variants

Corpuses used in speaker recognition systems

Publication languages

Abstracts

The article reviews corpora used in speaker recognition systems. English-language corpora are compared with corpora developed for other languages. The main features of the corpora are tabulated and compared, with particular attention to how Polish-language corpora relate to other published corpora.

Keywords

corpora, speaker recognition, TIMIT, NTIMIT

Publisher

Wydawnictwo Politechniki Śląskiej

Journal

Studia Informatica

Year

2010

Volume

Vol. 31, no. 4A

Pages

5-32

Physical description

Bibliography: 41 items

Contributors

author

Ball D.

author

Bielińska E.

Politechnika Śląska, Instytut Informatyki, Gliwice, Akademicka 16, damian.ball@polsl.pl

Bibliography

1. Campbell J. P. Jr.: Speaker recognition. Department of Defense, Fort Meade.
2. Melin H.: Databases for Speaker Recognition. Activities in COST250 Working Group 2, In: COST250 - Speaker Recognition in Telephony (2000).
3. Feng L., Hansen L. K.: A new database for speaker recognition. Informatics and Mathematical Modelling, Technical University of Denmark, IMM-Technical Report, 2005-05.
4. Ortega-Garcia J., Gonzalez-Rodriguez J., Marrero-Aguiar V., Cg. Diaz-Gomez J. J., Cap. Garcia-Jimenez R., Cap. Lucena-Molina J., Tcol. Sanchez-Molero J. A. G.: AHUMADA: A large speech corpus in Spanish for speaker identification and verification. Available online 2 June 2000, IEEE International Conference on Acoustics Speech and Signal Processing, May 1998, pp. 773-776.
5. Ramos D., Gonzalez-Rodriguez J., Gonzalez-Dominguez J., Lucena-Molina J. J.: Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish. Proceedings of Interspeech 2008, September 2008, pp. 1493-1496.
6. Martin A. F.: Speaker Databases and Evaluation. In: Encyclopedia of Biometrics. National Institute of Standards and Technology, Gaithersburg, Maryland, USA.
7. Toledano D. T., Hernández-López D., Esteve-Elizalde C., Fiérrez J., Ortega-García J., Ramos D., Gonzalez-Rodriguez J.: BioSec Multimodal Biometric Database in Text-Dependent Speaker Recognition. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco, 2008.
8. Marasek K., Gubrynowicz R.: Budowa bazy dialogów telefonicznej mowy polskiej w ramach projektu EC LUNA. The Linguistic Engineering Group, Natural Language Processing Seminar, 2007.
9. Zheng N., Qin Ch., Lee T., Ching P.C.: CU2C - A Dual-condition Cantonese Speech Database for Speaker Recognition Applications. Oriental COCOSDA, Jakarta, Indonesia, 2005.
10. Center for Spoken Language Understanding @ OGI, http://cslu.cse.ogi.edu/.
11. Wildermoth B. R., Paliwal K. K.: GMM Based Speaker Recognition on Readily Available Databases. Proc. Microelectronic Engineering Research Conference, Brisbane, Australia, November 2003.
12. Dyrek M., Gałka J., Ziółko B.: Measures On Wavelet Segmentation of Speech. Proceedings of the 8th WSEAS International Conference on Multimedia Systems and Signal Processing, Hangzhou, China, 2008.
13. Grocholewski S.: Podstawy systemu rozpoznawania mowy dla języka polskiego. III Krajowa Konferencja pt. Multimedialne i Sieciowe Systemy Informacyjne, 09.2002.
14. Petrovska D., Hennebert J., Melin H., Genoud D.: Polycost: A telephone-speech database for speaker recognition. Speech Communication, June 2000, Volume 31, Issues 2-3, pp. 265-270.
15. Genoud D., Ellis D., Morgan N.: Simultaneous speech and speaker recognition using hybrid architecture. ICSI Technical Report TR-99-012, July 1999.
16. Le Floch J. L., Montacie C., Caraty M. J.: Speaker recognition experiments on the NTIMIT database. Proceedings of Eurospeech 95, Madrid, Spain, September 1995, vol. 1, pp. 379-382.
17. NIST Speaker Recognition Evaluation Plans, http://www.nist.gov/speech/test.htm.
18. European Language Resources Association, http://www.icp.grenet.fr/ELRA/.
19. Linguistic Data Consortium, http://www.ldc.upenn.edu/.
20. Oregon Graduate Institute, http://cslu.cse.ogi.edu/.
21. Dreuw P., Rybach D., Deselaers T., Zahedi M., Ney H.: Speech Recognition Techniques for a Sign Language Recognition System. Interspeech/ICSLP 2007, Antwerp, Belgium, 2007, pp. 2513-2516.
22. Falcone M., Gallo A.: The Siva Speech Database for Speaker Verification: Description and Evaluation. In Proceedings COST250 Workshop on Speaker Recognition in Telephony, 1996.
23. Kajarekar S. S., Scheffer N., Graciarena M., Shriberg E., Stolcke A., Ferrer L., Bocklet T.: The SRI NIST 2008 Speaker recognition evaluation system. Proc. ICASSP, Taipei, Taiwan, 2009.
24. Ganchev T., Fakotakis N., Kokkinakis G.: Toward 2003 NIST Speaker Recognition Evaluation: The WCL-1 System. International Workshop Speech and Computer SPECOM'2003, 2003.
25. Kajarekar S. S., Ferrer L., Stolcke A., Shriberg E.: Voice-based Speaker Recognition Combining Acoustic and Stylistic Features. Advances in Biometrics: Sensors, Algorithms and Systems, Springer, London 2008, pp. 183-201.
26. Wydra S.: Zastosowanie parametryzacji mieszanej w systemie rozpoznawania mowy polskiej. Krajowa konferencja radiokomunikacji radiofonii i telewizji KKRRiT, 2006.
27. Campbell J. P. Jr., Reynolds D. A.: Corpora for the Evaluation of Speaker Recognition Systems. Proceedings of the Acoustics, Speech, and Signal Processing, 1999 IEEE International Conference, 1999, Volume 02, pp. 829-832.
28. Gałka J.: Optymalizacja parametryzacji sygnału w aspekcie rozpoznawania mowy polskiej - rozprawa doktorska. Kraków 2008.
29. Zahedi M., Keysers D., Deselaers T., Ney H.: Combination of Tangent Distance and an Image Distortion Model for Appearance-Based Sign Language Recognition. DAGM (Deutsche Arbeitsgemeinschaft für Mustererkennung) Symposium 2005, pp. 401-408.
30. Staroniewicz P., Sadowski J.: Akustyczna baza danych SpeechDat dla języka polskiego. LVI Otwarte Seminarium z Akustyki, Kraków-Zakopane, 14-17 September 1999, pp. 141-144.
31. Ferrer L., Graciarena M., Zymnis A., Shriberg E.: System combination using auxiliary information for speaker verification. Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference, Las Vegas, NV, 2008, pp. 4853-4856.
32. Fierrez-Aguilar J., Ortega-Garcia J., Torre-Toledano D., Gonzalez-Rodriguez J.: BioSec baseline corpus: A multimodal biometric database. Pattern Recognition, No. 4, April 2007, pp. 1389-1392.
33. Garcia-Salicetti S., Beumier C., Chollet G.: A multimodal person authentication database including face, voice, fingerprint, hand and signature modalities, Lect. Notes Comput. Sc. 2688, 2003, pp. 845-853.
34. Li S.Z., Jain A. K.: Encyclopedia of Biometrics.
35. Dumas B., Pugin C., Hennebert J., Petrovska-Delacrétaz D., Humm A., Evéquoz F., Ingold R., Von Rotz D.: MyIDea - Multimodal Biometrics Database, Description of Acquisition Protocols. In proc. of Third COST 275 Workshop (COST 275), Hatfield (UK), October 27-28 2005, pp. 59-62.
36. Dessimoz D., Richiardi J., Prof. Champod Ch., Dr. Drygajlo A.: Multimodal Biometrics for Identity Documents. Forensic Science International, 11 April 2007, Volume 167, Issue 2, pp. 154-159.
37. Galbally J., Fierrez J., Ortega-Garcia J., Freire M. R., Alonso-Fernandez F., Siguenza J. A., Garrido-Salas J., Anguiano-Rey E., Gonzalez-de-Rivera G., Ribalda R., Faundez-Zanuy M., Ortega J. A., Cardeñoso-Payo V., Viloria A., Vivaracho C. E., Moro Q. I., Igarza J. J., Sanchez J., Hernáez I., Orrite-Urunuela C.: BiosecurID: a Multimodal Biometric Database. Pattern Analysis & Applications, Volume 13, Number 2, May 2010, pp. 235-246.
38. Meng H., Ching P.C., Lee T., Mak M. W., Mak B., Moon Y.S., Siu M.H., Tang X., Hui H. P. S., Lee A., Lo W. K., Ma B., Sio E. K. T.: The multi-biometric, multi-device and multilingual (M3) Corpus. Pattern Recognition, Volume 43, Issue 3, March 2010, pp. 1094-1105.
39. Cole R., Noel M., Burnett D. C., Fanty M., Lander T., Oshika B., Sutton S.: Corpus Development Activities at the Center for Spoken Language Understanding. In Proceedings of the 1994 ARPA Human Language Technology Workshop.
40. Woo R. H., Park A., Hazen T. J.: The MIT Mobile Device Speaker Verification Corpus: Data Collection and Preliminary Experiments. Speaker and Language Recognition Workshop, IEEE Odyssey 2006, San Juan, June 2006, pp. 1-6.
41. Bailly-Bailliere E., Bengio S.: The BANCA Database and Evaluation Protocol. 4th International Conference on Audio- and Video-Based Biometric Person Authentication, Guildford, UK, 2003, vol. 2688.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BSL7-0050-0017