The study investigates the use of speech signals to recognise speakers' emotional states. The introduction defines and categorizes emotions and describes how they are expressed through facial expressions, speech and physiological signals. For the purpose of this work, a proprietary corpus of emotionally marked speech recordings was created. The recordings come from the media, including live journalistic broadcasts, and show spontaneous emotional reactions to real-time stimuli. For speech signal analysis, a dedicated script was written in Python. Its algorithm parameterizes the speech recordings and determines features correlated with the emotional content of speech. After parameterization, data clustering was performed to group speakers' feature vectors into larger collections that represent specific emotional states. Student's t-test for dependent samples was then used to identify descriptors whose values differ significantly between emotional states. Potential applications of this research are proposed, along with directions for future studies of the topic.
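The dependent-samples comparison mentioned above can be illustrated with a minimal sketch. It is not the study's script: the function name `paired_t_statistic`, the choice of descriptor, and all numeric values are hypothetical and serve only to show how a paired t statistic is formed from the per-speaker differences of one feature between two emotional states.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(state_a, state_b):
    """Return (t, degrees_of_freedom) for a dependent-samples t-test.

    state_a and state_b hold one feature value per speaker, in the same
    speaker order, measured in two different emotional states.
    """
    if len(state_a) != len(state_b):
        raise ValueError("paired samples must have equal length")
    n = len(state_a)
    if n < 2:
        raise ValueError("need at least two pairs")
    # The test operates on the within-speaker differences.
    diffs = [a - b for a, b in zip(state_a, state_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical values of a single descriptor for five speakers
# (illustrative numbers only, not data from the study).
neutral = [118.0, 121.5, 110.2, 130.4, 125.1]
anger = [142.3, 150.0, 128.9, 155.2, 149.8]

t, df = paired_t_statistic(anger, neutral)
print(f"t = {t:.2f}, df = {df}")
```

The resulting statistic would then be compared against the t distribution with `df` degrees of freedom to decide whether the descriptor differs significantly between the two states.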