Article title

Deep learning vs feature engineering in the assessment of voice signals for diagnosis in Parkinson’s disease

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Voice acoustic analysis can be a valuable and objective tool supporting the diagnosis of many neurodegenerative diseases, especially in times of remote medical examination during the pandemic. The article compares the application of selected signal processing methods and machine learning algorithms to the classification of acquired speech signals representing the vowel a with prolonged phonation, recorded from patients with Parkinson’s disease and healthy subjects. The study was conducted using three different feature engineering techniques for the generation of speech signal features, as well as a deep learning approach based on the processing of images involving spectrograms of different time and frequency resolutions. The research utilized real recordings acquired in the Department of Neurology at the Medical University of Warsaw, Poland. The discriminatory ability of the feature vectors was evaluated using the SVM technique. The spectrograms were processed by the popular AlexNet convolutional neural network, adapted to the binary classification task according to the strategy of transfer learning. The numerical experiments showed different efficiencies of the examined approaches; however, the sensitivity of the best test, based on features selected with respect to the biological grounds of voice articulation, reached 97% with a specificity no worse than 93%. The results could be further slightly improved by combining the selected deep learning and feature engineering algorithms in one stacked ensemble model.
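A minimal illustrative sketch of the two examined branches is given below; it is not the authors' implementation. It assumes standard open-source libraries (NumPy, SciPy, scikit-learn, PyTorch with a recent torchvision) and uses synthetic placeholder data in place of the Warsaw recordings and the engineered feature vectors; the sampling rate, spectrogram window length and SVM hyperparameters are likewise illustrative assumptions.

# Sketch only: SVM on hand-crafted features vs. AlexNet transfer learning on spectrograms.
# Synthetic placeholder data; not the authors' code or parameters.

import numpy as np
from scipy.signal import spectrogram
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

import torch
import torch.nn as nn
from torchvision import models

rng = np.random.default_rng(0)

# --- Feature-engineering branch: SVM evaluating the discriminatory power of a feature matrix ---
X = rng.normal(size=(60, 20))        # placeholder: 60 recordings x 20 acoustic features
y = rng.integers(0, 2, size=60)      # placeholder labels: 1 = Parkinson's disease, 0 = healthy
svm = SVC(kernel="rbf", C=1.0)
print("SVM cross-validated accuracy:", cross_val_score(svm, X, y, cv=5).mean())

# --- Deep-learning branch: AlexNet transfer learning on a log-magnitude spectrogram ---
fs = 16000                                            # assumed sampling rate
t = np.arange(2 * fs) / fs                            # 2 s of a toy sustained tone standing in for the vowel a
signal = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 240 * t)

f, tt, S = spectrogram(signal, fs, nperseg=512)       # nperseg sets the time/frequency resolution trade-off
img = np.log10(S + 1e-10)                             # log-magnitude spectrogram
img = (img - img.min()) / (img.max() - img.min())     # normalise to [0, 1]

net = models.alexnet(weights="DEFAULT")               # ImageNet-pretrained weights (torchvision >= 0.13 API)
net.classifier[6] = nn.Linear(4096, 2)                # replace the output layer for the binary task

x = torch.tensor(img, dtype=torch.float32)[None, None]                # shape (1, 1, F, T)
x = nn.functional.interpolate(x, size=(224, 224)).repeat(1, 3, 1, 1)  # AlexNet expects 3 x 224 x 224
logits = net(x)                                       # fine-tuning of the classifier would follow as usual
print("AlexNet logits:", logits.detach().numpy())

In the reported study the spectrogram time and frequency resolutions and the feature engineering techniques were varied, and the best feature-based and deep models were additionally combined in a stacked ensemble; the sketch stops before those steps.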
Year
Pages
art. no. e137347
Physical description
Bibliography: 33 items, figures, tables.
Authors
  • Faculty of Electronics, Military University of Technology, ul. Gen. Sylwestra Kaliskiego 2, 00-908 Warsaw, Poland
  • Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland
  • Faculty of Electronics, Military University of Technology, ul. Gen. Sylwestra Kaliskiego 2, 00-908 Warsaw, Poland
  • Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland
  • Department of Neurology, Medical University of Warsaw, ul. Banacha 1a, 02-097 Warsaw, Poland
Bibliography
  • [1] Y.D. Kumar and A.M. Prasad, “MEMS accelerometer system for tremor analysis”, Int. J Adv. Eng. Global Technol. 2(5), 685‒693 (2014).
  • [2] P. Pierleoni, “A Smart Inertial System for 24h Monitoring and Classification of Tremor and Freezing of Gait in Parkinson’s Disease”, IEEE Sens. J. 19(23), 11612‒11623 (2019).
  • [3] W. Pawlukowska, K. Honczarenko, and M. Gołąb-Janowska, “Nature of speech disorders in Parkinson disease”, Pol. Neurol. Neurosurg. 47(3), 263‒269 (2013), [in Polish].
  • [4] S.A. Factor, Parkinson’s Disease: Diagnosis & Clinical Management, 2nd Edition, 2002.
  • [5] R. Chiaramonte and M. Bonfiglio, “Acoustic analysis of voice in Parkinson’s disease: a systematic review of voice disability and meta-analysis of studies”, Rev. Neurologia 70(11), 393‒405 (2020).
  • [6] J. Mekyska et al., “Robust and complex approach of pathological speech signal analysis”, Neurocomputing 167, 94‒111 (2015).
  • [7] B. Erdogdu Sakar, G. Serbes, and C. Sakar, “Analyzing the effectiveness of vocal features in early telediagnosis of Parkinson’s disease”, PLoS One 12(8) (2017).
  • [8] L. Berus, S. Klancnik, M. Brezocnik, and M. Ficko, “Classifying Parkinson’s Disease Based on Acoustic Measures Using Artificial Neural Networks”, Sensors (Basel) 19(1), 16 (2019).
  • [9] L. Jeancolas et al., “Automatic detection of early stages of Parkinson’s disease through acoustic voice analysis with mel-frequency cepstral coefficients”, 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2017, pp. 1‒6.
  • [10] D.A. Rahn, M. Chou, J.J. Jiang, and Y. Zhang, “Phonatory impairment in Parkinson’s disease: evidence from nonlinear dynamic analysis and perturbation analysis”, J. Voice 21, 64‒71 (2007).
  • [11] J. Kurek, B. Świderski, S. Osowski, M. Kruk, and W. Barhoumi, “Deep learning versus classical neural approach to mammogram recognition”, Bull. Pol. Acad. Sci. Tech. Sci. 66(6), 831‒840 (2018).
  • [12] S. Sivaranjini and C.M. Sujatha, “Deep learning based diagnosis of Parkinson’s disease using convolutional neural network”, Multimed. Tools Appl. 79, 15467–15479 (2020).
  • [13] M. Wodziński, A. Skalski, D. Hemmerling, J.R. Orozco-Arroyave, and E. Noth, “Deep Learning Approach to Parkinson’s Disease Detection Using Voice Recordings and Convolutional Neural Network Dedicated to Image Classification”, 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019, pp. 717‒720.
  • [14] J. Chmielińska, K. Białek, A. Potulska-Chromik, J. Jakubowski, E. Majda-Zdancewicz, M. Nojszewska, A. Kostera-Pruszczyk and A. Dobrowolski, “Multimodal data acquisition set for objective assessment of Parkinson’s disease”, Proc. SPIE 11442, Radioelectronic Systems Conference 2019, 114420F (2020).
  • [15] M. Kuhn and K. Johnson, Applied Predictive Modeling, New York: Springer, 2013.
  • [16] P. Liang, C. Deng, J. Wu, Z. Yang, and J. Zhu, “Intelligent Fault Diagnosis of Rolling Element Bearing Based on Convolutional Neural Network and Frequency Spectrograms”, 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), San Francisco, USA, 2019, pp. 1‒5.
  • [17] M.S. Wibawa, I.M.D. Maysanjaya, N.K.D.P. Novianti, and P.N. Crisnapati, “Abnormal Heart Rhythm Detection Based on Spectrogram of Heart Sound using Convolutional Neural Network”, 2018 6th International Conference on Cyber and IT Service Management (CITSM), Parapat, Indonesia, 2019, pp. 1‒4.
  • [18] M. Curilem, J.P. Canário, L. Franco, and R.A. Rios, “Using CNN To Classify Spectrograms of Seismic Events From Llaima Volcano (Chile)”, 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 2018, pp. 1‒8.
  • [19] D. Rethage, J. Pons and X. Serra, “A Wavenet for Speech Denoising”, 2018 IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018, pp. 5069‒5073.
  • [20] A. Krizhevsky, I. Sutskever, and G.E. Hinton, “Imagenet classification with deep convolutional neural networks”, Neural Information Processing Systems, 2012.
  • [21] J. Jakubowski and J. Chmielińska, “Detection of driver fatigue symptoms using transfer learning”, Bull. Pol. Acad. Sci. Tech. Sci. 66(6), 869‒874 (2018).
  • [22] A. Benba, A. Jilbab, and A. Hammouch, “Voice analysis for detecting persons with Parkinson’s disease using MFCC and VQ”, International conference on circuits, systems and signal processing (ICCSSP’14), Russia, 2014.
  • [23] E. Niebudek-Bogusz, J. Grygiel, P. Strumiłło, and M. Śliwińska-Kowalska, “Nonlinear acoustic analysis in the evaluation of occupational voice disorders”, Occupational Medicine, 64(1), 29–35 (2013), [in Polish].
  • [24] E. Majda and A.P. Dobrowolski, “Modeling and optimization of the feature generator for speaker recognition systems”, Electr. Rev. 88(12A), 131‒136 (2012).
  • [25] Y. Maryn, N. Roy, M. De Bodt, P.B. van Cauwenberge, P. Corthals, “Acoustic measurement of overall voice quality: a meta-analysis”, J. Acoust. Soc. Am. 126(5), 2619‒2634 (2009), doi: 10.1121/1.3224706.
  • [26] E. Niebudek-Bogusz, J. Grygiel, P. Strumiłło, and M. Śliwińska-Kowalska, “Mel cepstral analysis of voice in patients with vocal nodules”, Otorhinolaryngology 10(4), 176‒181 (2011), [in Polish].
  • [27] A. Krysiak, “Language, speech and communication disorders in Parkinson’s disease”, Neuropsychiatr. Neuropsychol. 6(1), 36–42 (2011), [in Polish].
  • [28] F. Alías, J.C. Socoró, and X. Sevillano, “A Review of Physical and Perceptual Feature Extraction Techniques for Speech, Music and Environmental Sounds”, Appl. Sci. 6(5), 143 (2016).
  • [29] X. Valero and F. Alias, “Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification”, IEEE Trans. Multimedia 14(6), 1684‒1689 (2012).
  • [30] M. Slaney, “An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank”, Apple Computer Technical Report #35, 1993.
  • [31] D.M. Agrawal, H.B. Sailor, M.H. Soni, and H.A. Patil, “Novel TEO-based gammatone features for environmental sound classification”, 2017 25th European Signal Processing Conference (EUSIPCO), Kos, Greece, 2017, pp. 1809‒1813.
  • [32] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Upper Saddle River: Pearson Education, 2010.
  • [33] A. Chatzimparmpas, R.M. Martins, K. Kucher, and A. Kerren, “StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics”, IEEE Transactions on Visualization and Computer Graphics 27(2), 1547‒1557 (2021), doi: 10.1109/TVCG.2020.3030352.
Remarks
Record developed with funds from the Ministry of Science and Higher Education (MNiSW), agreement No. 461252, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: popularisation of science and promotion of sport (2021).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-2c0d00f4-1083-4a42-b536-7993e0c646b5