Teaching Machines on Snoring: A Benchmark on Computer Audition for Snore Sound Excitation Localisation

Qian, K.; Janott, C.; Zhang, Z.; Deng, J.; Baird, A.; Heiser, C.; Hohenhorst, W.; Herzog, M.; Hemmert, W.; Schuller, B.

doi:10.24425/123918

Abstract

This paper presents a comprehensive study of machine listening for the localisation of snore sound excitation. We investigate the effects of varying the frame size and overlap of the analysed audio chunks when extracting low-level descriptors. In addition, we explore the performance of each kind of feature when fed into a range of classifier models, including support vector machines, k-nearest neighbours, linear discriminant analysis, random forests, extreme learning machines, kernel-based extreme learning machines, multilayer perceptrons, and deep neural networks. Experimental results demonstrate that wavelet packet transform energy can outperform most other features. A deep neural network trained with subband energy ratios reaches the highest performance, achieving an unweighted average recall of 72.8% across four types of snoring.
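
The unweighted average recall (UAR) quoted in the abstract is the mean of the per-class recalls, so each class counts equally no matter how many samples it has. The following is a minimal sketch of that definition, not the authors' evaluation code. The function name `unweighted_average_recall` and the class labels `A` to `D` are hypothetical placeholders, and the label lists are made-up toy data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR).

    Every class present in y_true contributes equally,
    regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced four-class example (hypothetical labels and predictions).
y_true = ["A"] * 6 + ["B"] * 2 + ["C"] * 1 + ["D"] * 1
y_pred = ["A"] * 6 + ["A", "B"] + ["A"] + ["D"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the per-class recalls are 1.0, 0.5, 0.0 and 1.0, so the UAR is 0.625 while plain accuracy is 0.8. The gap shows why UAR is the more conservative measure when class sizes are imbalanced.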

Keywords

snore sound; obstructive sleep apnea; acoustic features; machine learning

Publisher

Instytut Podstawowych Problemów Techniki PAN
Komitet Akustyki PAN
Polskie Towarzystwo Akustyczne

Journal

Archives of Acoustics

Year

2018

Volume

Vol. 43, No. 3

Pages

465-475

Physical description

Bibliography: 43 items; photographs, figures, tables.

Contributors

author

Qian K.

andykun.qian@tum.de

Machine Intelligence & Signal Processing Group, Chair of Human-Machine Communication, Technische Universität München, Munich, Germany
ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany

author

Janott C.

Munich School of Bioengineering, Technische Universität München, Garching, Germany

author

Zhang Z.

Group on Language, Audio & Music, Department of Computing, Imperial College London, London, UK

author

Deng J.

audEERING GmbH, Gilching, Germany

author

Baird A.

ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany

author

Heiser C.

Department of Otorhinolaryngology, Head and Neck Surgery, Klinikum rechts der Isar, Technische Universität München, Munich, Germany

author

Hohenhorst W.

Department of Otorhinolaryngology, Head and Neck Surgery, Alfried Krupp Krankenhaus, Essen, Germany

author

Herzog M.

Department of Otorhinolaryngology, Head and Neck Surgery, Carl-Thiem-Klinikum Cottbus, Cottbus, Germany

author

Hemmert W.

Munich School of Bioengineering, Technische Universität München, Garching, Germany

author

Schuller B.

Group on Language, Audio & Music, Department of Computing, Imperial College London, London, UK
audEERING GmbH, Gilching, Germany
ZD.B Chair of Embedded Intelligence for Health Care & Wellbeing, Universität Augsburg, Augsburg, Germany

References

1. Abdel-Hamid O., Mohamed A.-R., Jiang H., Deng L., Penn G., Yu D. (2014), Convolutional neural networks for speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22, 10, 1533-1545.
2. Agrawal S., Stone P., McGuinness K., Morris J., Camilleri A. (2002), Sound frequency analysis and the site of snoring in natural and induced sleep, Clinical Otolaryngology & Allied Sciences, 27, 3, 162-166.
3. Aldrich M. S. (1999), Sleep medicine, Oxford University Press, New York, USA.
4. Basheer I., Hajmeer M. (2000), Artificial neural networks: fundamentals, computing, design, and application, Journal of Microbiological Methods, 43, 1, 3-31.
5. Beeton R. J., Wells I., Ebden P., Whittet H., Clarke J. (2007), Snore site discrimination using statistical moments of free field snoring sounds recorded during sleep nasendoscopy, Physiological Measurement, 28, 10, 1225-1236.
6. Bishop C. M. (2006), Pattern recognition and machine learning, Springer, New York, US.
7. Breiman L. (2001), Random forests, Machine Learning, 45, 1, 5-32.
8. Chang C.-C., Lin C.-J. (2011), LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2, 27:1-27:27, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
9. Cortes C., Vapnik V. (1995), Support-vector networks, Machine Learning, 20, 3, 273-297.
10. El Badawey M. R., McKee G., Marshall H., Heggie N., Wilson J. A. (2003), Predictive value of sleep nasendoscopy in the management of habitual snorers, Annals of Otology, Rhinology & Laryngology, 112, 1, 40-44.
11. Eyben F. (2015), Real-time speech and music classification by large audio feature space extraction, Springer International Publishing, Cham, Switzerland.
12. Eyben F. et al. (2016), The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing, IEEE Transactions on Affective Computing, 7, 2, 190-202.
13. Eyben F., Weninger F., Groß F., Schuller B. (2013), Recent developments in openSMILE, the Munich open-source multimedia feature extractor, [in:] Proc. ACM MM, pp. 835-838, Barcelona, Catalunya, Spain.
14. Eyben F., Wöllmer M., Schuller B. (2010), openSMILE: the Munich versatile and fast open-source audio feature extractor, [in:] Proc. ACM MM, pp. 1459-1462, Firenze, Italy.
15. Fiz J. A., Jane R. (2012), Snoring analysis. A complex question, Journal of Sleep Disorders: Treatment and Care, 1, 1, 1-3.
16. Herzog M. et al. (2014), Evaluation of acoustic characteristics of snoring sounds obtained during drug-induced sleep endoscopy, Sleep and Breathing, pp. 1-9.
17. Hessel N., de Vries N. (2002), Diagnostic work-up of socially unacceptable snoring. II. Sleep endoscopy, European Archives of Oto-Rhino-Laryngology, 259, 158-161.
18. Hill P., Lee B., Osborne J., Osman E. (1999), Palatal snoring identified by acoustic crest factor analysis, Physiological Measurement, 20, 2, 167-174.
19. Huang G.-B. (2014), An insight into extreme learning machines: random neurons, random features and kernels, Cognitive Computation, 6, 3, 376-390.
20. Huang G.-B., Zhu Q.-Y., Siew C.-K. (2006), Extreme learning machine: theory and applications, Neurocomputing, 70, 1, 489-501.
21. Kezirian E. J., Hohenhorst W., de Vries N. (2011), Drug-induced sleep endoscopy: the VOTE classification, European Archives of Oto-Rhino-Laryngology, 268, 8, 1233-1236.
22. Marin J. M., Carrizo S. J., Vicente E., Agusti A. G. (2005), Long-term cardiovascular outcomes in men with obstructive sleep apnoea-hypopnoea with or without treatment with continuous positive airway pressure: an observational study, The Lancet, 365, 9464, 1046-1053.
23. Miyazaki S., Itasaka Y., Ishikawa K., Togawa K. (1998), Acoustic analysis of snoring and the site of airway obstruction in sleep related respiratory disorders, Acta Oto-Laryngologica, 118, 537, 47-51.
24. Mokhlesi B., Ham S., Gozal D. (2016), The effect of sex and age on the comorbidity burden of OSA: an observational analysis from a large nationwide US health claims database, The European Respiratory Journal, 47, 4, 1162-1169.
25. Pancoast S., Akbacak M. (2012), Bag-of-audio-words approach for multimedia event classification, [in:] Proceedings of INTERSPEECH, pp. 2105-2108, Portland, Oregon.
26. Peppard P. E., Young T., Barnet J. H., Palta M., Hagen E. W., Hla K. M. (2013), Increased prevalence of sleep-disordered breathing in adults, American Journal of Epidemiology, 177, 9, 1006-1014.
27. Peppard P. E., Young T., Palta M., Skatrud J. (2000), Prospective study of the association between sleep-disordered breathing and hypertension, New England Journal of Medicine, 342, 19, 1378-1384.
28. Pevernagie D., Aarts R. M., De Meyer M. (2010), The acoustics of snoring, Sleep Medicine Reviews, 14, 2, 131-144.
29. Qian K., Fang Y., Xu Z., Xu H. (2013), Comparison of two acoustic features for classification of different snore signals, Chinese Journal of Electron Devices, 36, 4, 455-459.
30. Qian K. et al. (2017), Classification of the excitation location of snore sounds in the upper airway by acoustic multi-feature analysis, IEEE Transactions on Biomedical Engineering, 64, 8, 1731-1741.
31. Qian K., Janott C., Zhang Z., Heiser C., Schuller B. (2016), Wavelet features for classification of VOTE snore sounds, [in:] Proc. IEEE ICASSP, pp. 221-225, Shanghai, China.
32. Qian K., Xu Z., Xu H., Ng B. P. (2014), Automatic detection of inspiration related snoring signals from original audio recording, [in:] Proc. ChinaSIP, pp. 95-99, Xi’an, China.
33. Qian K., Xu Z., Xu H., Wu Y., Zhao Z. (2015), Automatic detection, segmentation and classification of snore related signals from overnight audio recording, IET Signal Processing, 9, 1, 21-29.
34. Roebuck A. et al. (2014), A review of signals used in sleep analysis, Physiological Measurement, 35, 1, R1-R57.
35. Sak H., Senior A. W., Beaufays F. (2014), Long short-term memory recurrent neural network architectures for large scale acoustic modeling, [in:] Proceedings of INTERSPEECH, pp. 338-342, Singapore.
36. Schmitt M. et al. (2016), A bag-of-audio-words approach for snore sounds excitation localisation, [in:] Proc. ITG Speech Communication, pp. 230-234, Paderborn, Germany.
37. Schuller B., Steidl S., Batliner A. (2009), The INTERSPEECH 2009 emotion challenge, [in:] Proc. INTERSPEECH, pp. 312-315, Brighton, UK.
38. Schuller B. et al. (2013), The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism, [in:] Proc. INTERSPEECH, pp. 148-152, Lyon, France.
39. Spiegel M. R., Schiller J. J., Srinivasan R. A., LeVan M. (2009), Probability and statistics, McGraw-Hill, New York, NY, USA.
40. Strollo Jr P. J., Rogers R. M. (1996), Obstructive sleep apnea, New England Journal of Medicine, 334, 2, 99-104.
41. Vincent P., Larochelle H., Lajoie I., Bengio Y., Manzagol P.-A. (2010), Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research, 11, 3371-3408.
42. Yaggi H. K., Concato J., Kernan W. N., Lichtman J. H., Brass L. M., Mohsenin V. (2005), Obstructive sleep apnea as a risk factor for stroke and death, New England Journal of Medicine, 353, 19, 2034-2041.
43. Young T., Palta M., Dempsey J., Skatrud J., Weber S., Badr S. (1993), The occurrence of sleep-disordered breathing among middle-aged adults, New England Journal of Medicine, 328, 17, 1230-1235.

Notes

Record prepared under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2018).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-5dcb813b-aae1-43e5-9438-3501343020ce