Age and gender recognition based on analysis of voice

Gabryś, J.; Gil, G.; Kiszka, P.

Article details

Article title

Rozpoznawanie wieku i płci na podstawie analizy głosu

Authors

Gabryś J., Gil G., Kiszka P.

Title variants

Age and gender recognition based on analysis of voice

Abstract

Human speech carries information about the speaker beyond its verbal message: a recording of a person's speech makes it possible to extract characteristics such as gender, age, and emotional state. Methods of automatic age and gender recognition infer these characteristics from the speech recording alone. The paper presents an overview of methods for recognizing a speaker's age and gender from speech. The authors implemented and tested a combination of two feature types, MFCC (Mel-frequency Cepstral Coefficients) and voice pitch (f0), with an SVM (Support Vector Machine) classifier for voice samples. Tests of the implementation indicate that the method is effective in the majority of test cases.
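To make the pitch-plus-classifier idea concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation: MFCC extraction is left out, only gender (not age) is classified, the "recordings" are synthetic harmonic frames, and the SVM is a small linear one trained by subgradient descent on the regularized hinge loss. All function names, the 8 kHz sampling rate, and the toy f0 values are hypothetical choices made for illustration.

```python
import math
import statistics

SR = 8000  # sampling rate in Hz (assumption, not taken from the paper)

def synth_voiced_frame(f0, n=1024, harmonics=4):
    """Hypothetical stand-in for a recorded voiced frame: harmonics of f0."""
    return [sum(math.sin(2 * math.pi * k * f0 * t / SR) / k
                for k in range(1, harmonics + 1))
            for t in range(n)]

def estimate_f0(frame, fmin=60.0, fmax=400.0):
    """Pitch estimate: the lag with the largest autocorrelation, as a frequency."""
    lag_min, lag_max = int(SR / fmax), int(SR / fmin)

    def autocorr(lag):
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    best_lag = max(range(lag_min, lag_max + 1), key=autocorr)
    return SR / best_lag

def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM on 1-D features, trained by subgradient descent
    on the L2-regularized hinge loss. Labels must be -1 or +1."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * (w * x + b) < 1   # point inside the margin?
            w -= lr * lam * w                # shrinkage from the L2 penalty
            if violated:                     # hinge-loss subgradient step
                w += lr * y * x
                b += lr * y
    return w, b

# Toy training set: illustrative f0 values only, not data from the paper.
male_f0 = [SR / p for p in (80, 70, 64, 60)]    # about 100 to 133 Hz
female_f0 = [SR / p for p in (44, 40, 36, 32)]  # about 182 to 250 Hz
train_f0 = [estimate_f0(synth_voiced_frame(f)) for f in male_f0 + female_f0]
labels = [-1] * len(male_f0) + [+1] * len(female_f0)

# Standardize the single feature before training.
mu, sigma = statistics.mean(train_f0), statistics.pstdev(train_f0)
w, b = train_linear_svm([(f - mu) / sigma for f in train_f0], labels)

def predict_gender(frame):
    """Classify a frame from its estimated pitch alone."""
    score = w * (estimate_f0(frame) - mu) / sigma + b
    return "female" if score > 0 else "male"

print(predict_gender(synth_voiced_frame(SR / 75)))  # frame with f0 near 107 Hz
print(predict_gender(synth_voiced_frame(SR / 38)))  # frame with f0 near 211 Hz
```

A real system along the lines described in the abstract would append MFCC values to the feature vector and use a multi-class SVM so that age groups can be separated as well; pitch alone is shown here only because it keeps the example short and dependency-free.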

Keywords

automatic speech recognition; age; gender; MFCC coefficients; speaker classification; support vector machine (SVM)

Publisher

Jacek Doskocz

Journal

Acta Bio-Optica et Informatica Medica. Inżynieria Biomedyczna

Year

2015

Volume

Vol. 21, no. 3

Pages

165-169

Physical description

Bibliography: 10 items.

Contributors

author

Gabryś J.

justyna.gabrys@gmail.com

Akademia Górniczo-Hutnicza im. S. Staszica, Wydział Elektroniki, Automatyki, Informatyki i Inżynierii Biomedycznej, Katedra Automatyki i Inżynierii Biomedycznej, 30-059 Kraków, al. Mickiewicza 30

author

Gil G.

Akademia Górniczo-Hutnicza im. S. Staszica, Wydział Elektroniki, Automatyki, Informatyki i Inżynierii Biomedycznej, Katedra Automatyki i Inżynierii Biomedycznej, 30-059 Kraków, al. Mickiewicza 30

author

Kiszka P.

Akademia Górniczo-Hutnicza im. S. Staszica, Wydział Elektroniki, Automatyki, Informatyki i Inżynierii Biomedycznej, Katedra Automatyki i Inżynierii Biomedycznej, 30-059 Kraków, al. Mickiewicza 30

Bibliography

[1] S. Schötz: A perceptual study of speaker age, Working Papers 49, Department of Linguistics and Phonetics, Lund University, 2001.
[2] F. Metze, J. Ajmera, R. Englert et al.: Comparison of four approaches to age and gender recognition for telephone applications, Acoustics, Speech and Signal Processing, 2007.
[3] T. Bocklet, A. Maier, J.G. Bauer: Age and gender recognition for telephone applications based on GMM supervectors and support vector machines, Acoustics, Speech and Signal Processing, 2008.
[4] T. Seehapoch, S. Wongthanavasu: Speech emotion recognition using Support Vector Machines, Knowledge and Smart Technology (KST), 2013.
[5] V. Hubeika: Estimation of Gender and Age from Recorded Speech, Proc. ACM Student Research Competition, 2006.
[6] M. Feld, F. Burkhardt, C. Müller: Automatic Speaker Age and Gender Recognition in the Car for Tailoring Dialog and Mobile Services, Proc. Interspeech, 2010.
[7] D.P.W. Ellis: PLP and RASTA (and MFCC, and inversion) in Matlab, http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, 2005.
[8] A. Mishra: Multi Class Support Vector Machine, File Exchange, MATLAB Central, http://www.mathworks.com/matlabcentral/fileexchange/33170-multi-class-support-vector-machine, 2012.
[9] L. Feng: Speaker Recognition, Informatics and Mathematical Modelling, Technical University of Denmark, DTU, 2004.
[10] X. Sun: Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio, Proc. of ICASSP 2002, Orlando, Florida, May 13-17, 2002.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-19fba6d9-a6fc-4671-aa3d-1521258dee09