Search results - BazTech

Results found: 2

Searched for:
in keywords: Mel Frequency Cepstral Coefficient

Analysis of speech signal parameters in the context of their usefulness in the automatic assessment of singing expression quality

Zaporowski Szymon, Kostek Bożena

Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej

2019

No. 68

61--64

This work concerns the approach to parameterization for classifying emotions in singing and its comparison with the classification of emotions in speech. For this purpose, the emotionally expressive speech and song database RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) was used, which contains recordings of professional actors presenting six different emotions. Mel-frequency cepstral coefficients (MFCC) and selected MPEG-7 low-level descriptors were then computed. A forest of trees was used to select the features with the best ranking scores. Emotions were then classified using a support vector machine (SVM). It was found that a parameterization that is effective for speech is not effective for singing. Basic parameters were determined which, according to the obtained results, allow a significant reduction in the dimensionality of the feature vectors while increasing classification effectiveness.

This paper concerns the approach to parameterization for classifying emotions in singing and compares it with the classification of emotions in speech. For this purpose, the RAVDESS database of emotional speech and song was used; it contains recordings of professional actors presenting six different emotions. Mel Frequency Cepstral Coefficients (MFCC) and selected MPEG-7 low-level descriptors were then calculated. A feature selection algorithm based on a forest of trees was used to determine the coefficients and descriptors with the best ranking results. The emotions were then classified using a Support Vector Machine (SVM). The classification was repeated several times, and the results were averaged. It was found that descriptors used for emotion detection in speech are not as useful for singing. Basic parameters for singing were determined which, according to the obtained results, allow a significant reduction in the dimensionality of the feature vectors while increasing the classification efficiency of emotion detection.
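
As a rough illustration of the MFCC step mentioned in the abstract, the sketch below computes cepstral coefficients for a single frame using only the Python standard library. It is not the authors' implementation: the mel formula is one common variant, and the frame length, sample rate, filter count, and all function names (`hz_to_mel`, `mel_filterbank`, `mfcc_frame`) are assumptions chosen for the example.

```python
import math

def hz_to_mel(f_hz):
    # One common variant of the Hz-to-mel mapping (assumed here).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Apply a Hamming window, then a naive DFT; return the one-sided power spectrum.
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(windowed[t] * math.cos(2.0 * math.pi * k * t / n) for t in range(n))
        im = sum(windowed[t] * math.sin(2.0 * math.pi * k * t / n) for t in range(n))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centre frequencies are equally spaced on the mel scale.
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    # Log mel-band energies followed by an unnormalized DCT-II.
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * s for w, s in zip(filt, spec)) + 1e-10)
                    for filt in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]

# Hypothetical input: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice a full recording would be split into overlapping frames and the per-frame coefficients summarized (for example by mean and variance) before feature selection and classification; those steps are not shown here.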

Automatic Genre Classification Using Fractional Fourier Transform Based Mel Frequency Cepstral Coefficient and Timbral Features

Bhalke D. G., Rajesh B., Bormane D. S.

Archives of Acoustics

2017

Vol. 42, No. 2

213--222

This paper presents the Automatic Genre Classification of Indian Tamil Music and Western Music using Timbral and Fractional Fourier Transform (FrFT) based Mel Frequency Cepstral Coefficient (MFCC) features. The classifier model for the proposed system has been built using K-NN (K-Nearest Neighbours) and Support Vector Machine (SVM). In this work, the performance of various features extracted from music excerpts has been analysed to identify the appropriate feature descriptors for the two major genres of Indian Tamil music, namely Classical music (Carnatic based devotional hymn compositions) and Folk music, and for the Western genres of Rock and Classical music from the GTZAN dataset. The results for Tamil music have shown that the feature combination of Spectral Roll-off, Spectral Flux, Spectral Skewness and Spectral Kurtosis, combined with Fractional MFCC features, outperforms all other feature combinations, yielding a higher classification accuracy of 96.05%, as compared to the accuracy of 84.21% with conventional MFCC. It has also been observed that the FrFT based MFCC efficiently classifies the two Western genres of Rock and Classical music from the GTZAN dataset with a higher classification accuracy of 96.25%, as compared to the classification accuracy of 80% with MFCC.
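
The timbral descriptors named in the abstract (spectral roll-off, flux, skewness, kurtosis) can be sketched from a single magnitude spectrum as shown below. Definitions of these features vary between toolkits; the 85% roll-off threshold, the unnormalized squared-difference flux, and the function names are assumptions for illustration, not the formulas used in the paper. The sketch also assumes a spectrum that is not all zeros and not concentrated in a single bin.

```python
import math

def spectral_rolloff(mag, freqs, fraction=0.85):
    # Lowest frequency at which the cumulative magnitude reaches `fraction` of the total.
    threshold = fraction * sum(mag)
    cumulative = 0.0
    for m, f in zip(mag, freqs):
        cumulative += m
        if cumulative >= threshold:
            return f
    return freqs[-1]

def spectral_flux(mag_prev, mag_cur):
    # Sum of squared bin-wise differences between two consecutive spectra.
    return sum((c - p) ** 2 for p, c in zip(mag_prev, mag_cur))

def spectral_moments(mag, freqs):
    # Treat the normalized spectrum as a distribution over frequency
    # and return (centroid, spread, skewness, kurtosis).
    total = sum(mag)
    weights = [m / total for m in mag]
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    spread = math.sqrt(variance)
    skewness = sum((f - centroid) ** 3 * w for f, w in zip(freqs, weights)) / spread ** 3
    kurtosis = sum((f - centroid) ** 4 * w for f, w in zip(freqs, weights)) / spread ** 4
    return centroid, spread, skewness, kurtosis

# Hypothetical symmetric 5-bin spectrum, used only to exercise the functions.
freqs = [0.0, 100.0, 200.0, 300.0, 400.0]
mag = [1.0, 2.0, 4.0, 2.0, 1.0]
rolloff = spectral_rolloff(mag, freqs)
flux = spectral_flux([0.0] * 5, [1.0, 0.0, 0.0, 0.0, 0.0])
centroid, spread, skewness, kurtosis = spectral_moments(mag, freqs)
print(rolloff, flux, centroid, skewness, kurtosis)
```

For a symmetric spectrum such as this one the skewness is zero, which makes it a convenient sanity check before applying the functions to real frames.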