Application of Mel Cepstral Representation of Voice Recordings for Diagnosing Vocal Disorders

Grygiel, J.; Strumiłło, P.; Niebudek-Bogusz, E.

Article details

Selected full texts from this journal

http://pe.org.pl/

Identifiers

Title variants

Zastosowanie reprezentacji Mel Cepstralnej sygnału mowy do badania zaburzeń głosu

Publication languages

Abstracts

The aim of this study was to assess the applicability of Mel Frequency Cepstral Coefficients (MFCC) of voice samples to diagnosing vocal nodules and polyps. Patients' voice samples were analysed acoustically by measuring the MFCC and the values of the first three formants. The mel coefficients were classified using Sammon mapping and Support Vector Machines. In tests conducted on 95 patients, voice disorders were detected with accuracy reaching approximately 80%.

The aim of this work was to assess whether analysis of Mel Frequency Cepstral Coefficients (MFCC) of recorded patient voice samples can support the diagnosis of vocal nodules and polyps. The patients' speech recordings underwent acoustic analysis using MFCC parameters and the values of the first three formants. Sammon mapping and a Support Vector Machine were used to classify the cepstral coefficients. In tests performed on 95 patient speech recordings, voice disorders were detected with approximately 80% accuracy.
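
The abstracts name Sammon mapping as one of the tools applied to the cepstral coefficients. As a rough illustration only, the sketch below computes Sammon's stress, the quantity that a Sammon mapping minimises when it projects high-dimensional feature vectors onto fewer dimensions. The function names and the toy vectors are hypothetical and are not taken from the study; the record does not describe how the authors implemented the mapping.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_stress(high_dim, low_dim):
    """Sammon's stress between original points and their low-dimensional images.

    E = (1 / sum(d*_ij)) * sum((d*_ij - d_ij)^2 / d*_ij) over all pairs i < j,
    where d*_ij is the distance in the original space and d_ij the distance
    in the projected space. A value of 0 means all pairwise distances are kept.
    """
    if len(high_dim) != len(low_dim):
        raise ValueError("both point sets must contain the same number of points")
    total_original = 0.0
    weighted_error = 0.0
    for i, j in combinations(range(len(high_dim)), 2):
        d_star = euclidean(high_dim[i], high_dim[j])
        if d_star == 0.0:
            continue  # coincident original points are skipped to avoid division by zero
        d = euclidean(low_dim[i], low_dim[j])
        total_original += d_star
        weighted_error += (d_star - d) ** 2 / d_star
    if total_original == 0.0:
        raise ValueError("all original points coincide; stress is undefined")
    return weighted_error / total_original


# Hypothetical toy data: three 3-D "feature vectors" that already lie in a plane,
# so dropping the third coordinate preserves every pairwise distance.
features_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
projection_2d = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
stress_flat = sammon_stress(features_3d, projection_2d)

# A projection that shortens a distance (5 in 3-D becomes 4 in 2-D) gives nonzero stress.
stress_lossy = sammon_stress([(0.0, 0.0, 3.0), (4.0, 0.0, 0.0)], [(0.0, 0.0), (4.0, 0.0)])
```

In a full mapping, the low-dimensional coordinates would be adjusted iteratively (for example by gradient descent) until this stress stops decreasing; that optimisation step is omitted here.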

Keywords

MFCC; SVM; voice disorders; Sammon mapping

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2012

Volume

Vol. 88, no. 6

Pages

8–11

Physical description

Bibliography: 9 items, figures, diagrams.

Contributors

author

Grygiel J.

author

Strumiłło P.

author

Niebudek-Bogusz E.

Technical University of Lodz, Institute of Electronics, Łódź, Poland, jacek.grygiel@gmail.com

Bibliography

[1] J.I. Godino-Llorente, R. Fraile, N. Sáenz-Lechón, V. Osma-Ruiz, P. Gómez-Vilda, Automatic detection of voice impairments from text-dependent running speech, Biomedical Signal Processing and Control, pp. 176–182, March 2011.
[2] J.I. Godino-Llorente, P. Gómez-Vilda, M. Blanco-Velasco, Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters, IEEE Transactions on Biomedical Engineering, vol. 53, no. 10, pp. 1943–1953, October 2006.
[3] J.D. Arias-Londoño, J.I. Godino-Llorente, N. Sáenz-Lechón, V. Osma-Ruiz, G. Castellanos-Domínguez, Automatic Detection of Pathological Voices Using Complexity Measures, Noise Parameters, and Mel-Cepstral Coefficients, IEEE Transactions on Biomedical Engineering, vol. 58, no. 2, pp. 370–379, February 2011.
[4] A.A. Dibazar, S. Narayanan, T.W. Berger, Feature Analysis for Automatic Detection of Pathological Speech, Proceedings of the Second Joint EMBS/BMES Conference Houston, TX, USA, October 23–26, 2002.
[5] J.H. Martin, D. Jurafsky, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, 2nd edition, June 2008.
[6] S. Osowski, Neural Networks: an algorithmic approach (in Polish), WNT Warszawa, 1996.
[7] C. Maciel, J. Pereira, and D. Stewart, “Identifying healthy and pathologically affected voice signals,” IEEE Signal Process. Mag., vol. 27, no. 1, pp. 120–123, Jan. 2010.
[8] Ce Peng, Wenxi Chen, Xin Zhu, Baikun Wan, Daming Wei, "Pathological Voice Classification Based on a Single Vowel's Acoustic Features," 7th IEEE International Conference on Computer and Information Technology, CIT 2007, pp. 1106–1110, 16–19 Oct. 2007.
[9] I.R. Titze, "Workshop on acoustic voice analysis summary statement," in Proc. Workshop on Acoustic Voice Analysis, Denver, Colorado, February 1994.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPOK-0039-0002