Generalized Maximal Margin Discriminant Analysis for Speech Emotion Recognition

Jin, Y.; Zheng, W.; Zhao, L.; Yan, J.

Artykuł - szczegóły

Tytuł artykułu

Generalized Maximal Margin Discriminant Analysis for Speech Emotion Recognition

Autorzy

Jin Y. , Zheng W. , Zhao L. , Yan J.

Wybrane pełne teksty z tego czasopisma

http://pe.org.pl/

Identyfikatory

Warianty tytułu

Analiza dyskryminacyjna maksymalnego marginesu w rozpoznawaniu emocji w mowie

Języki publikacji

Abstrakty

A novel speech emotion recognition method based on the generalized maximum margin discriminant analysis (GMMDA) method is proposed in this paper. GMMDA is a multi-class extension of our proposed two-class dimensionality reduction method based on maximum margin discriminant analysis (MMDA), which utilizes the normal direction of optimal hyperplane of linear support vector machine (SVM) as the projection vector for feature extraction. To generate an optimal set of projection vectors from MMDA-based dimensionality reduction method, we impose orthogonal restrictions on the projection vectors and then recursively solve the problem. Moreover, to deal with the multi-class speech emotion recognition problem, we present two recognition schemes based on our proposed dimensionality reduction approach. One is using “one-versus-one" strategy for multi-class classification, and the other one is to compose the projection vectors of each pair of classes to obtain a transformation matrix for the multi-class dimensionality reduction.

W artykule przedstawiono metodę analizy emisji głosu pod kątem rozpoznawania emocji. Rozwiązanie bazuje na analizie dyskryminacyjnej maksymalnego marginesu GMMDA.

Słowa kluczowe

dimensional reduction speech emotion recognition support vector machine (SVM)

redukcja wymiarowości analiza dyskryminacyjna rozpoznawanie emocji

Wydawca

Wydawnictwo SIGMA-NOT

Czasopismo

Przegląd Elektrotechniczny

Rocznik

2013

Tom

R. 89, nr 5

Strony

86--91

Opis fizyczny

Bibliogr. 21 poz., rys., tab.

Twórcy

autor

Jin Y.

jinyun9999@gmail.com

Southeast University
Jiangsu Normal University

autor

Zheng W.

wenming_zheng@seu.edu.cn

Southeast University

autor

Zhao L.

Southeast University

autor

Yan J.

Southeast University

Bibliografia

[1] R. Cowie, et al.,“Emotion recognition in human-computer interaction," IEEE Signal Process Magazine, Vol.18, pp.32-80, 2001.
[2] T. Vogt and E.Andre, “Comparing feature sets for acted and spontaneous speech in view of automatic emotion recognition.," presented at the Proc. Multimedia and Expo(ICME05), Amsterdam, Netherlands, 2005.
[3] B. Schuller, et al., “Brute-forcing hierarchical functionals for paralinguistics: a waste of feature space?," presented at the Proc. ICASSP, Las Vegas, NV, 2008.
[4] B. Schuller, et al., “Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles," in Proc. Interspeech 2006, pp. 1818-1821.
[5] B. Schuller, et al., “Recognising realistic emotions and affect in speech: State of the art and lessons learnt from te first challenge," speech communication, 2011.
[6] I. K.Fodor, “A survey of dimension reduction techniques," 2002.
[7] P. Pudil, et al., “Floating search methods in feature selection," Pattern Recognition Letter, vol. 15, pp. 1119-1125, 1994.
[8] C.Bishop. Pattern recognition and machine learning. Springer,2006
[9] I. T. Jolliffe, Principle Component Analysis. Berlin, Germany: Springer, 2002.
[10] K. Fukunaga, Introduction to Statistical Pattern Recogniton: Academic Press, 1990.
[11] A. Kocsor, et al., “Margin Maximizing Discriminant Analysis," ECML, pp. 227-238, 2004.
[12] K. Kovacs, et al., “Maximum Margin Discriminant Analysis based Face Recognition," presented at the Proc. Joint Hungarian-Austrian Conf. Image Process Pattern Recognition, 2005.
[13] I. W.-H. Tsang, et al., “Large-Scale Maximum Margin Discriminant Analysis Using Core Vector Machines " IEEE Transactions On Neural Network, vol. 19, pp. 610-624, 2008.
[14] I. W.Tsang, et al., “Efficient kernel feature extraction for massive data sets," presented at the Proceedings of the 12th ACM SIGKDD international conference on Knowledge Discovery and Data Mining, NY, USA, 2006.
[15] I. W. Tsang, et al., “Diversified SVM Ensembles for Large Data Sets," presented at the Machine Learning: ECML, 2006.
[16] H. Li, et al., “Efficient and Robust Feature Extraction by Maximum Margin Criterion," IEEE Transactions On Neural Networks, vol. 17, pp. 157-165, 2006.
[17] S. Gu, et al., “Discriminant analysis via support vectors," Neurocomputing, vol. 73, pp. 1669-1675, 2010.
[18] V. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[19] F.Burkhardt, et al., “A database of german emotional speech," presented at the Interspeech, 2005.
[20] D.Bitouk, R.Verma, A.nenkova, Class-level spectral features for emotion recognition. Speech Communication.(2010), doi:10.1016/j.specom.2010.02.010
[21] P. Boersma. Praat, a system for doing phonetics by computer. Glot International, 5(9/10):341–345, 2001.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-3a6e5be1-38d2-46e0-b469-22e92bbd496b