Article title

Speech Emotion Recognition Using Hybrid Generative and Discriminative Models

Title variants
PL
System wykrywania emocji w głosie na podstawie modelu dyskryminacyjnego (Speech emotion detection system based on a discriminative model)
Publication languages
EN
Abstracts
EN
In this paper, we use Sequential Forward Selection to select 8-dimensional frame-level features from the full 69-dimensional feature set, and we reduce the dimensionality of the utterance-level eigenvectors from 63 to 12 by Fisher discriminant analysis. Two kinds of GMM multidimensional likelihoods are then proposed for hybrid generative and discriminative models. Experimental results on the Berlin emotional speech database show that the GMM-MAP/SVM series hybrid model is the best of the hybrid generative and discriminative models, with a recognition rate of up to 85.1% (see the sketch after the abstracts below).
PL
The paper presents a system for recognizing emotions in speech based on a discriminative model. The system's effectiveness is demonstrated on the Berlin database.
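
For orientation, here is a minimal sketch of the series hybrid idea the abstract describes: per-class GMM likelihoods serve as a fixed-length feature vector for a discriminative SVM back end. This is an illustrative assumption built on scikit-learn, not the authors' implementation; in particular, the paper MAP-adapts class GMMs (GMM-MAP), while this sketch simply fits each class GMM by EM, and the function names train_hybrid and predict_hybrid are hypothetical.

    # Illustrative sketch only: the paper's GMM-MAP adaptation step is
    # replaced here by a plain EM fit per class for brevity.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import SVC

    def train_hybrid(utterances, labels, n_classes, n_components=16):
        # utterances: list of (n_frames, n_features) arrays of
        # frame-level features; labels: class index per utterance.
        # Generative front end: one GMM per emotion class, fit by EM
        # on the pooled frames of that class's utterances.
        gmms = []
        for c in range(n_classes):
            frames = np.vstack([u for u, y in zip(utterances, labels) if y == c])
            gmms.append(GaussianMixture(n_components, covariance_type="diag").fit(frames))
        # Map each variable-length utterance to a fixed-length vector
        # of per-class average log-likelihoods.
        scores = np.array([[gmm.score(u) for gmm in gmms] for u in utterances])
        # Discriminative back end: an SVM trained on the likelihood vectors.
        svm = SVC(kernel="rbf", probability=True).fit(scores, labels)
        return gmms, svm

    def predict_hybrid(gmms, svm, utterance):
        # Score a single (n_frames, n_features) utterance.
        score_vec = np.array([[gmm.score(utterance) for gmm in gmms]])
        return svm.predict(score_vec)[0]

The appeal of the series arrangement is that the generative stage absorbs variable utterance length, handing the SVM a fixed-length, low-dimensional representation.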
Pages
105–108
Physical description
Bibliography: 10 items, diagrams, tables, charts
Bibliography
  • [1] Vondra M., Vich R.: Evaluation of Speech Emotion Classification Based on GMM and Data Fusion. In: Proc. of Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions, 2009, 98-105.
  • [2] Morrison D., Wang R. L., De Silva L. C.: Ensemble methods for spoken emotion recognition in call-centres. Speech Communication, 49 (2007), No. 2, 98-112.
  • [3] Kim S., Georgiou P. G., Lee S., et al.: Real-time emotion detection system using speech: multi-modal fusion of different timescale features. In: 2007 IEEE Ninth Workshop on Multimedia Signal Processing, 2007, 48-51.
  • [4] Schwenker F., et al.: The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech. In: Artificial Neural Networks - ICANN 2009, Part I, 2009, 894-903.
  • [5] You C. H., Lee K. A., Li H.: An SVM Kernel with GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition. IEEE Signal Processing Letters, 16 (2009), No. 1, 49-52.
  • [6] Yun S., Yoo C. D.: Speech Emotion Recognition via a Max-Margin Framework Incorporating a Loss Function Based on the Watson and Tellegen's Emotion Model. In: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2009, 4169-4172.
  • [7] Guo Y. F., Shu T. T.: Feature Extraction Method Based on the Generalised Fisher Discriminant Criterion and Facial Recognition. Pattern Analysis & Applications, 4 (2001), No. 1, 61-66.
  • [8] Platt J. C.: Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. In: Advances in Large Margin Classifiers, MIT Press, Cambridge, 1999, 61-74.
  • [9] Burkhardt F., Paeschke A., Rolfes M., et al.: A Database of German Emotional Speech. In: Proc. INTERSPEECH 2005.
  • [10] Wu T.-F., Lin C.-J., Weng R. C.: Probability Estimates for Multiclass Classification by Pairwise Coupling. Journal of Machine Learning Research, 5 (2004), 975-1005.
YADDA identifier
bwmeta1.element.baztech-article-BPOH-0063-0001