Spectral methods in Polish emotional speech recognition

Powroźnik, P.; Czerwiński, D.

doi:10.12913/22998624/65138

Abstract

This article addresses emotion recognition based on the analysis of Polish emotional speech signals. The research used the Polish database of emotional speech prepared and shared by the Medical Electronics Division of the Lodz University of Technology. The speech signal was processed by Artificial Neural Networks (ANN), whose inputs were information obtained from the signal spectrogram. Experiments were conducted for three different spectrogram divisions. The ANN consists of four layers, but the number of neurons in each layer depends on the spectrogram division. The research focused on six emotional states: neutral, sadness, joy, anger, fear and boredom. The average effectiveness of emotion recognition was about 80%.
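
The pipeline summarized in the abstract (spectrogram, division into regions, ANN inputs) might be sketched roughly as follows. This is only an illustration, not the authors' implementation: the frame length, hop size, grid sizes, use of the mean magnitude per region, the hidden-layer sizes, and the synthetic test tone are all assumptions, since the record does not state them. The helper names `spectrogram` and `block_features` are hypothetical.

```python
import numpy as np

EMOTIONS = ["neutral", "sadness", "joy", "anger", "fear", "boredom"]

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so that rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def block_features(spec, n_freq_blocks, n_time_blocks):
    """Split the spectrogram into a grid and return the mean magnitude per cell."""
    rows = np.array_split(spec, n_freq_blocks, axis=0)
    feats = [cell.mean()
             for row in rows
             for cell in np.array_split(row, n_time_blocks, axis=1)]
    return np.array(feats)

# Synthetic one-second tone at 16 kHz, standing in for a real recording (assumption).
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(signal)
for grid in [(4, 4), (8, 8), (16, 16)]:  # three hypothetical spectrogram divisions
    x = block_features(spec, *grid)
    # Hypothetical four-layer layout: input, two hidden layers, six-emotion output.
    layers = [x.size, x.size // 2, x.size // 4, len(EMOTIONS)]
    print(grid, layers)
```

The loop shows why the layer sizes would change with the division: a finer grid yields a longer feature vector, so the input layer (and, in this sketch, the hidden layers derived from it) grows accordingly, while the output layer stays at six units, one per emotional state.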

Keywords

artificial neural network; spectrogram; emotional speech recognition

Publisher

Lublin University of Technology
Polish Society of Ecological Engineering (PTIE), Branch of PTIE in Lublin

Journal

Advances in Science and Technology. Research Journal

Year

2016

Volume

Vol. 10, no. 32

Pages

73–81

Physical description

Bibliography: 18 items, figures, tables.

Authors

author

Powroźnik P.

pawel.powroznik@pollub.edu.pl

Institute of Computer Sciences, Lublin University of Technology, Nadbystrzycka 36, 20-618 Lublin, Poland

author

Czerwiński D.

d.czerwinski@pollub.pl

Institute of Computer Sciences, Lublin University of Technology, Nadbystrzycka 36, 20-618 Lublin, Poland

Bibliography

1. Berlin Database of Emotional Speech, www.expressive-speech.net.
2. Bracewell R. The Fourier transform and its applications. McGraw-Hill International Editions, Electrical Engineering Series, Singapore, 2000.
3. Website of the Institute of Electronics, Lodz University of Technology (www.eletel.p.lodz.pl).
4. Dennis J., Tran H.D. and Chang E.S. Overlapping sound event recognition using local spectrogram features and the generalized Hough transform. Pattern Recognition Letters, 34, 2013, 1085–1093.
5. Duan Z., Mysore G.J. and Smaragdis P. Speech enhancement by online non-negative spectrogram decomposition in non-stationary noise environments. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013.
6. Hang Q., Wang K. and Ren F. Speech emotion recognition using combination of features. ICICIP 2013, 523–528.
7. Kamińska D. and Pelikant A. Zastosowanie multimedialnej klasyfikacji w rozpoznawaniu stanów emocjonalnych na podstawie mowy spontanicznej. IAPGOŚ, 03, 2012, 36–39.
8. Kamińska D., Sapiński T., Niewiadomy D. and Pelikant A. Porównanie wydajności współczynników perceptualnych na potrzeby automatycznego rozpoznawania emocji w sygnale mowy. Studia Informatica, 34, 2013, 59–66.
9. Kim E.H., Hyu K.H., Kim S.H. and Kwak Y.K. Speech emotion recognition using eigen-FFT in clean and noisy environments. 16th IEEE International Conference on Robots and Human Interactive Communication, Jeju, Korea, 2007.
10. Konratowski E. Czasowo – częstotliwościowa analiza drgań z wykorzystaniem metody overlapping. Logistyka, 3, 2014, 3104–3110.
11. Konratowski E. Monitoring of the multichannel audio signal. Computational Collective Intelligence. Technologies and Applications, Lecture Notes in Artificial Intelligence, Springer Verlag, 6422, 2010, 298–306.
12. Kozieł G. Zastosowanie transformaty Fouriera w stenografii dźwięku. Studia Informatica, 32, 2011, 542–552.
13. McCulloch W. and Pitts W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 1943, 115–133.
14. Panda S.P. and Nayak A.K. Automatic speech segmentation in syllable centric speech recognition systems. International Journal of Speech Technology, 9, 2016, 9–18.
15. Powroźnik P. Polish emotional speech recognition using artificial neural network. Advances in Science and Technology Research Journal, 8(24), 2014, 24–27.
16. Ramakrishnan S. Recognition of emotion from speech, A review. Speech Enhancement, Modeling and Recognition – Algorithms and Applications, 2012.
17. Szymczyk T. Rozpoznawanie tekstur z wykorzystaniem baz modeli. Prace Instytutu Elektroniki, 249, 2011, 95–115.
18. Zieliński T.P. Cyfrowe przetwarzanie sygnałów. Od teorii do zastosowań, WKiŁ, Warszawa 2009.

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-7237a39e-ddce-4de9-834b-00bc4ac43795