

Article title

Recognition of speaker’s age group and gender for a large database of telephone-recorded voices

Publication languages
EN
Abstracts
EN
The paper presents the results of automatic recognition of speakers' age group and gender, performed on the large SpeechDAT(E) acoustic database for the Polish language. The database contains recordings of 1000 speakers (486 male, 514 female) aged 12 to 73, recorded under telephone conditions. Three age groups were recognized for each gender. Mel-frequency cepstral coefficients (MFCC) were used as the parametric description of the speech signals. Among the classification methods tested in this study, the best results were obtained with the Support Vector Machine (SVM) method.
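To illustrate the MFCC parametrization mentioned in the abstract, the following is a minimal sketch of how cepstral coefficients could be computed for a single speech frame. It is not the authors' implementation: the 8 kHz sampling rate, the 256-sample frame, the 20 mel filters, the 13 coefficients, and all function names are assumptions chosen for illustration. In a full system, vectors like these would be passed to a classifier such as an SVM.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (one of several variants in use).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Hamming window followed by a naive DFT (fine for a short demo frame).
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centres equally spaced on the mel scale.
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    # Assumed parameters: 20 filters and 13 coefficients are typical, not from the paper.
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log filterbank energies, floored to avoid log(0).
    log_e = [math.log(max(sum(f * p for f, p in zip(filt, spec)), 1e-10))
             for filt in bank]
    # DCT-II of the log energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Usage: a synthetic 200 Hz tone stands in for one 32 ms telephone-band frame.
frame = [math.sin(2.0 * math.pi * 200.0 * i / 8000.0) for i in range(256)]
coeffs = mfcc_frame(frame, 8000)
```

A practical pipeline would compute such vectors for every overlapping frame of a recording and aggregate them per speaker before classification; the aggregation strategy is not specified in the abstract.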
Year
Pages
art. no. 2022203
Physical description
Bibliography: 7 items, 1 figure, charts.
Authors
  • Wrocław University of Science and Technology, Faculty of Electronics, Photonics and Microsystems, Department of Acoustics, Multimedia and Signal Processing, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
References
  • 1. D. Kwasny, D. Hemmerling; Gender and Age Estimation Methods Based on Speech Using Deep Neural Networks; Sensors 2021, 21, 4785. DOI: 10.3390/s21144785
  • 2. T. Bocklet, A. Maier, J.G. Bauer, F. Burkhardt, E. Noth; Age and gender recognition for telephone applications based on GMM supervectors and support vector machines; Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, USA, March 31 - April 4, 2008; IEEE: Piscataway, USA, 2008. DOI: 10.1109/ICASSP.2008.4517932
  • 3. P. Pollak, J. Cernocky, J. Boudy, K. Choukri, H. Heuvel, K. Vicsi, A. Virag, R. Siemund, W. Majewski, J. Sadowski, P. Staroniewicz, H. Tropf, J. Kochanina, A. Ostrukhov, M. Rusko, M. Trnka; SpeechDat(E) - Eastern European telephone speech databases; Proceedings of the Second International Conference on Language Resources and Evaluation (LREC'00), Athens, Greece, May 31 - June 2, 2000; European Language Resources Association: Athens, Greece, 2000.
  • 4. P. Staroniewicz, J. Sadowski; SpeechDat Polish Database for the Fixed Telephone Network (Polish database documentation file), 2000 (http://www.elra.info).
  • 5. F. Bimbot, J.F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-Garcia, D. Petrovska-Delacretaz, D.A. Reynolds; A Tutorial on Text-Independent Speaker Verification; EURASIP J. Adv. Signal Process. 2004, 101962. DOI: 10.1155/S1110865704310024
  • 6. T.K. Ho; Random decision forests; Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Canada, August 14-16, 1995; IEEE: Piscataway, USA, 1995. DOI: 10.1109/ICDAR.1995.598994
  • 7. Y.W. Chang, C.J. Hsieh, K.W. Chang, M. Ringgaard, C.J. Lin; Training and Testing Low-degree Polynomial Data Mappings via Linear SVM; Journal of Machine Learning Research 2010, 11(48), 1471-1490.
Notes
Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Social Responsibility of Science" - module: Popularisation of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-a3910e4b-e4bf-471e-a263-9699cae522e6