Article title
Authors
Content
Full texts:
Identifiers
DOI
Title variants
Publication languages
Abstracts
This paper presents a comprehensive study on machine listening for the localisation of snore sound excitation. We investigate the effects of varying the frame size and overlap of the analysed audio chunks used for extracting low-level descriptors. In addition, we explore the performance of each kind of feature when it is fed into different classifier models, including support vector machines, k-nearest neighbours, linear discriminant analysis, random forests, extreme learning machines, kernel-based extreme learning machines, multilayer perceptrons, and deep neural networks. Experimental results demonstrate that wavelet packet transform energy can outperform most other features. A deep neural network trained with subband energy ratios reaches the highest performance, achieving an unweighted average recall of 72.8% across four types of snoring.
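The unweighted average recall (UAR) reported in the abstract is the mean of the per-class recalls, so each class counts equally regardless of how many samples it contains. The following is a minimal sketch in plain Python; the function name, the four class labels, and the data are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class present in y_true counts equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("no samples given")
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Made-up labels for four hypothetical snore classes (not data from the paper).
y_true = ["V", "V", "V", "V", "O", "O", "T", "E"]
y_pred = ["V", "V", "V", "O", "O", "O", "T", "V"]
uar = unweighted_average_recall(y_true, y_pred)
```

On this toy data the per-class recalls are 0.75, 1.0, 1.0 and 0.0, giving a UAR of 0.6875, while plain accuracy would be 0.75. The gap arises because the single-sample class that is missed entirely weighs as much as the largest class.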
Keywords
Publisher
Journal
Year
Volume
Pages
465--475
Physical description
Bibliography: 43 items, photographs, figures, tables.
Creators
author
- Machine Intelligence & Signal Processing Group, Chair of Human-Machine Communication, Technische Universität München, Munich, Germany
- ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany
author
- Munich School of Bioengineering, Technische Universität München, Garching, Germany
author
- Group on Language, Audio & Music, Department of Computing, Imperial College London, London, UK
author
- audEERING GmbH, Gilching, Germany
author
- ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany
author
- Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum rechts der Isar, Technische Universität München, Munich, Germany
author
- Department of Otorhinolaryngology, Head and Neck Surgery, Alfried Krupp Krankenhaus, Essen, Germany
author
- Department of Otorhinolaryngology, Head and Neck Surgery, Carl-Thiem-Klinikum Cottbus, Cottbus, Germany
author
- Munich School of Bioengineering, Technische Universität München, Garching, Germany
author
- Group on Language, Audio & Music, Department of Computing, Imperial College London, London, UK
- audEERING GmbH, Gilching, Germany
- ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany
Bibliography
- 1. Abdel-Hamid O., Mohamed A.-R., Jiang H., Deng L., Penn G., Yu D. (2014), Convolutional neural networks for speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22, 10, 1533-1545.
- 2. Agrawal S., Stone P., McGuinness K., Morris J., Camilleri A. (2002), Sound frequency analysis and the site of snoring in natural and induced sleep, Clinical Otolaryngology & Allied Sciences, 27, 3, 162-166.
- 3. Aldrich M. S. (1999), Sleep medicine, Oxford University Press, New York, USA.
- 4. Basheer I., Hajmeer M. (2000), Artificial neural networks: fundamentals, computing, design, and application, Journal of Microbiological Methods, 43, 1, 3-31.
- 5. Beeton R. J., Wells I., Ebden P., Whittet H., Clarke J. (2007), Snore site discrimination using statistical moments of free field snoring sounds recorded during sleep nasendoscopy, Physiological Measurement, 28, 10, 1225-1236.
- 6. Bishop C. M. (2006), Pattern recognition and machine learning, Springer, New York, US.
- 7. Breiman L. (2001), Random forests, Machine Learning, 45, 1, 5-32.
- 8. Chang C.-C., Lin C.-J. (2011), LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2, 27:1-27:27, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- 9. Cortes C., Vapnik V. (1995), Support-vector networks, Machine Learning, 20, 3, 273-297.
- 10. El Badawey M. R., McKee G., Marshall H., Heggie N., Wilson J. A. (2003), Predictive value of sleep nasendoscopy in the management of habitual snorers, Annals of Otology, Rhinology & Laryngology, 112, 1, 40-44.
- 11. Eyben F. (2015), Real-time speech and music classification by large audio feature space extraction, Springer International Publishing, Cham, Switzerland.
- 12. Eyben F. et al. (2016), The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing, IEEE Transactions on Affective Computing, 7, 2, 190-202.
- 13. Eyben F., Weninger F., Groß F., Schuller B. (2013), Recent developments in openSMILE, the Munich open-source multimedia feature extractor, [in:] Proc. ACM MM, pp. 835-838, Barcelona, Catalunya, Spain.
- 14. Eyben F., Wöllmer M., Schuller B. (2010), openSMILE: the Munich versatile and fast open-source audio feature extractor, [in:] Proc. ACM MM, pp. 1459-1462, Firenze, Italy.
- 15. Fiz J. A., Jane R. (2012), Snoring analysis. A complex question, Journal of Sleep Disorders: Treatment and Care, 1, 1, 1-3.
- 16. Herzog M. et al. (2014), Evaluation of acoustic characteristics of snoring sounds obtained during drug-induced sleep endoscopy, Sleep and Breathing, pp. 1-9.
- 17. Hessel N., de Vries N. (2002), Diagnostic work-up of socially unacceptable snoring. II. Sleep endoscopy, European Archives of Oto-Rhino-Laryngology, 259, 158-161.
- 18. Hill P., Lee B., Osborne J., Osman E. (1999), Palatal snoring identified by acoustic crest factor analysis, Physiological Measurement, 20, 2, 167-174.
- 19. Huang G.-B. (2014), An insight into extreme learning machines: random neurons, random features and kernels, Cognitive Computation, 6, 3, 376-390.
- 20. Huang G.-B., Zhu Q.-Y., Siew C.-K. (2006), Extreme learning machine: theory and applications, Neurocomputing, 70, 1, 489-501.
- 21. Kezirian E. J., Hohenhorst W., de Vries N. (2011), Drug-induced sleep endoscopy: the VOTE classification, European Archives of Oto-Rhino-Laryngology, 268, 8, 1233-1236.
- 22. Marin J. M., Carrizo S. J., Vicente E., Agusti A. G. (2005), Long-term cardiovascular outcomes in men with obstructive sleep apnoea-hypopnoea with or without treatment with continuous positive airway pressure: an observational study, The Lancet, 365, 9464, 1046-1053.
- 23. Miyazaki S., Itasaka Y., Ishikawa K., Togawa K. (1998), Acoustic analysis of snoring and the site of airway obstruction in sleep related respiratory disorders, Acta Oto-Laryngologica, 118, 537, 47-51.
- 24. Mokhlesi B., Ham S., Gozal D. (2016), The effect of sex and age on the comorbidity burden of OSA: an observational analysis from a large nationwide US health claims database, The European Respiratory Journal, 47, 4, 1162-1169.
- 25. Pancoast S., Akbacak M. (2012), Bag-of-audio-words approach for multimedia event classification, [in:] Proceedings of INTERSPEECH, pp. 2105-2108, Portland, Oregon.
- 26. Peppard P. E., Young T., Barnet J. H., Palta M., Hagen E. W., Hla K. M. (2013), Increased prevalence of sleep-disordered breathing in adults, American Journal of Epidemiology, 177, 9, 1006-1014.
- 27. Peppard P. E., Young T., Palta M., Skatrud J. (2000), Prospective study of the association between sleep-disordered breathing and hypertension, New England Journal of Medicine, 342, 19, 1378-1384.
- 28. Pevernagie D., Aarts R. M., De Meyer M. (2010), The acoustics of snoring, Sleep Medicine Reviews, 14, 2, 131-144.
- 29. Qian K., Fang Y., Xu Z., Xu H. (2013), Comparison of two acoustic features for classification of different snore signals, Chinese Journal of Electron Devices, 36, 4, 455-459.
- 30. Qian K. et al. (2017), Classification of the excitation location of snore sounds in the upper airway by acoustic multi-feature analysis, IEEE Transactions on Biomedical Engineering, 64, 8, 1731-1741.
- 31. Qian K., Janott C., Zhang Z., Heiser C., Schuller B. (2016), Wavelet features for classification of VOTE snore sounds, [in:] Proc. IEEE ICASSP, pp. 221-225, Shanghai, China.
- 32. Qian K., Xu Z., Xu H., Ng B. P. (2014), Automatic detection of inspiration related snoring signals from original audio recording, [in:] Proc. ChinaSIP, pp. 95-99, Xi’an, China.
- 33. Qian K., Xu Z., Xu H., Wu Y., Zhao Z. (2015), Automatic detection, segmentation and classification of snore related signals from overnight audio recording, IET Signal Processing, 9, 1, 21-29.
- 34. Roebuck A. et al. (2014), A review of signals used in sleep analysis, Physiological Measurement, 35, 1, R1-R57.
- 35. Sak H., Senior A. W., Beaufays F. (2014), Long short-term memory recurrent neural network architectures for large scale acoustic modeling, [in:] Proceedings of INTERSPEECH, pp. 338-342, Singapore.
- 36. Schmitt M. et al. (2016), A bag-of-audio-words approach for snore sounds excitation localisation, [in:] Proc. ITG Speech Communication, pp. 230-234, Paderborn, Germany.
- 37. Schuller B., Steidl S., Batliner A. (2009), The INTERSPEECH 2009 emotion challenge, [in:] Proc. INTERSPEECH, pp. 312-315, Brighton, UK.
- 38. Schuller B. et al. (2013), The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism, [in:] Proc. INTERSPEECH, pp. 148-152, Lyon, France.
- 39. Spiegel M. R., Schiller J. J., Srinivasan R. A., LeVan M. (2009), Probability and statistics, McGraw-Hill, New York, NY, USA.
- 40. Strollo Jr P. J., Rogers R. M. (1996), Obstructive sleep apnea, New England Journal of Medicine, 334, 2, 99-104.
- 41. Vincent P., Larochelle H., Lajoie I., Bengio Y., Manzagol P.-A. (2010), Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research, 11, 3371-3408.
- 42. Yaggi H. K., Concato J., Kernan W. N., Lichtman J. H., Brass L. M., Mohsenin V. (2005), Obstructive sleep apnea as a risk factor for stroke and death, New England Journal of Medicine, 353, 19, 2034-2041.
- 43. Young T., Palta M., Dempsey J., Skatrud J., Weber S., Badr S. (1993), The occurrence of sleep-disordered breathing among middle-aged adults, New England Journal of Medicine, 328, 17, 1230-1235.
Notes
Record prepared under agreement 509/P-DUN/2018, financed from MNiSW (Polish Ministry of Science and Higher Education) funds allocated to science dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-5dcb813b-aae1-43e5-9438-3501343020ce