Publication languages
EN
Abstracts
EN
Background and motivation: Artificial intelligence, and unsupervised learning in particular, has shown promise in medical research. Medical time series are difficult to analyze because of their complexity, and existing unsupervised learning methods often fail to classify the resulting variations effectively. We introduce a clustering-based classification framework designed to handle such data accurately, with the aim of improving classification of biomedical signals.
Methods: Our method integrates agglomerative hierarchical clustering with Hilbert vector space representations of medical signals and biological sequences. We rigorously define the underlying mathematical principles and evaluate the method on simulated cardiac signals, real-world neural signal datasets, and open-source protein sequences, with the MNIST dataset used for illustrative purposes.
Results: The proposed method achieved a 96% success rate in classifying protein sequences by function and effectively identified families within a large protein set. In cardiac signal analysis, it retained 0.996 of the variance in a condensed 6-dimensional space, correctly classifying 87.4% of simulated atrial flutter groups and 99.91% of main groups when conduction direction was excluded. For neural signals, it tracked neural activity in mouse brain recordings with near-perfect accuracy, as confirmed by expert evaluations.
Conclusion: The proposed method offers a novel, translational approach to the processing and classification of medical and biological time series, addressing some of the prevalent challenges in the field and paving the way for more reliable and effective biomedical signal analysis.
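The pipeline summarized in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each sampled signal is treated as a vector in a finite-dimensional inner product space, that the reduction to a 6-dimensional space is done with PCA, and that clusters are merged with Ward's criterion. The synthetic sinusoids and the helper names `pca_project` and `ward_agglomerative` are hypothetical and serve only as an example.

```python
import numpy as np


def pca_project(X, n_components):
    """Project rows of X onto the top principal components.

    Returns the projected data and the fraction of variance retained.
    """
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2
    retained = var[:n_components].sum() / var.sum()
    return Xc @ Vt[:n_components].T, retained


def ward_agglomerative(Z, n_clusters):
    """Naive agglomerative clustering with Ward's merge cost.

    At each step, merge the two clusters whose union gives the smallest
    increase in within-cluster sum of squares.
    """
    clusters = [[i] for i in range(len(Z))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = Z[clusters[a]], Z[clusters[b]]
                na, nb = len(ca), len(cb)
                diff = ca.mean(axis=0) - cb.mean(axis=0)
                cost = na * nb / (na + nb) * float(diff @ diff)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(Z), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Hypothetical data: two groups of noisy sinusoids (3 Hz and 5 Hz).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
group_a = np.sin(2 * np.pi * 3 * t) + 0.1 * rng.standard_normal((20, t.size))
group_b = np.sin(2 * np.pi * 5 * t) + 0.1 * rng.standard_normal((20, t.size))
signals = np.vstack([group_a, group_b])

# Reduce to 6 dimensions (assumed PCA), then cluster in the reduced space.
reduced, retained = pca_project(signals, n_components=6)
labels = ward_agglomerative(reduced, n_clusters=2)
print(f"variance retained: {retained:.3f}")
print("cluster labels:", labels)
```

The quadratic search over cluster pairs keeps the sketch short and readable; for larger datasets a library routine for hierarchical clustering would be a more practical choice.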
Authors
  • Department of Computer Science, ETH Zurich, Zurich, Switzerland
  • Department of Bioengineering, University of California San Diego, San Diego, CA, USA
  • Department of Internal Medicine, Section of Cardiovascular Medicine, Yale School of Medicine, New Haven, CT, USA
  • ITACA Institute, Universitat Politècnica de València, Valencia, Spain
  • ITACA Institute, Universitat Politècnica de València, Valencia, Spain
author
  • ITACA Institute, Universitat Politècnica de València, Valencia, Spain
  • ITACA Institute, Universitat Politècnica de València, Valencia, Spain
Bibliography
  • [1] Kononenko Igor. Machine learning for medical diagnosis: history, state of the art and perspective. Artif Intell Med 2001;23(1):89-109.
  • [2] Dinga Richard, Penninx Brenda WJH, Veltman Dick J, Schmaal Lianne, Marquand Andre F. Beyond accuracy: measures for assessing machine learning models, pitfalls and guidelines. 2019, 743138, BioRxiv.
  • [3] Beam Andrew L, Manrai Arjun K, Ghassemi Marzyeh. Challenges to the reproducibility of machine learning models in health care. JAMA 2020;323(4):305-6.
  • [4] Nezamabadi Kasra, Sardaripour Neda, Haghi Benyamin, Forouzanfar Mohamad. Unsupervised ECG analysis: A review. IEEE Rev Biomed Eng 2022;16:208-24.
  • [5] Hosseini Mohammad-Parsa, Hosseini Amin, Ahi Kiarash. A review on machine learning for EEG signal processing in bioengineering. IEEE Rev Biomed Eng 2020;14:204-18.
  • [6] Zou Quan, Lin Gang, Jiang Xingpeng, Liu Xiangrong, Zeng Xiangxiang. Sequence clustering in bioinformatics: an empirical study. Brief Bioinform 2020;21(1):1-10.
  • [7] Plant Darren, Barton Anne. Machine learning in precision medicine: lessons to learn. Nat Rev Rheumatol 2021;17(1):5-6.
  • [8] Gui Chloe, Chan Victoria. Machine learning in medicine. Univ West Ont Med J 2017;86(2):76-8.
  • [9] Gianfrancesco Milena A, Tamang Suzanne, Yazdany Jinoos, Schmajuk Gabriela. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 2018;178(11):1544-7.
  • [10] Xiao Han, Rasul Kashif, Vollgraf Roland. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. 2017, arXiv preprint arXiv:1708.07747.
  • [11] Cohen Gregory, Afshar Saeed, Tapson Jonathan, Van Schaik Andre. EMNIST: Extending MNIST to handwritten letters. In: 2017 international joint conference on neural networks. IEEE; 2017, p. 2921-6.
  • [12] Zhao Zhengtuo, Li Xue, He Fei, Wei Xiaoling, Lin Shengqing, Xie Chong. Parallel, minimally-invasive implantation of ultra-flexible neural electrode arrays. J Neural Eng 2019;16(3):035001.
  • [13] Niediek Johannes, Boström Jan, Elger Christian E, Mormann Florian. Reliable analysis of single-unit recordings from the human brain under noisy conditions: tracking neurons over hours. PLoS One 2016;11(12):e0166598.
  • [14] Steinmetz Nicholas A, Aydin Cagatay, Lebedeva Anna, Okun Michael, Pachitariu Marius, Bauza Marius, et al. Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings. Science 2021;372(6539):eabf4588.
  • [15] Quiroga Rodrigo Quian. Plugging in to human memory: advantages, challenges, and insights from human single-neuron recordings. Cell 2019;179(5):1015-32.
  • [16] Niehaus Thomas D, Thamm Antje MK, de Crécy-Lagard Valérie, Hanson Andrew D. Proteins of unknown biochemical function: a persistent problem and a roadmap to help overcome it. Plant Physiol 2015;169(3):1436-42.
  • [17] Alberts Bruce, Johnson Alexander, Lewis Julian, Raff Martin, Roberts Keith, Walter Peter. Analyzing protein structure and function. In: Molecular biology of the cell. 4th ed. Garland Science; 2002.
  • [18] Johnson Mark, Zaretskaya Irena, Raytselis Yan, Merezhuk Yuri, McGinnis Scott, Madden Thomas L. NCBI BLAST: a better web interface. Nucleic Acids Res 2008;36(suppl_2):W5-9.
  • [19] Bateman Alex, Coin Lachlan, Durbin Richard, Finn Robert D, Hollich Volker, Griffiths-Jones Sam, et al. The pfam protein families database. Nucleic Acids Res 2004;32(suppl_1):D138-41.
  • [20] Mistry Jaina, Chuguransky Sara, Williams Lowri, Qureshi Matloob, Salazar Gustavo A, Sonnhammer Erik LL, et al. Pfam: The protein families database in 2021. Nucleic Acids Res 2021;49(D1):D412-9.
  • [21] Blum Matthias, Chang Hsin-Yu, Chuguransky Sara, Grego Tiago, Kandasaamy Swaathi, Mitchell Alex, et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res 2021;49(D1):D344-54.
  • [22] Bileschi Maxwell L, Belanger David, Bryant Drew H, Sanderson Theo, Carter Brandon, Sculley D, et al. Using deep learning to annotate the protein universe. Nat Biotechnol 2022;1-6.
  • [23] Kulmanov Maxat, Hoehndorf Robert. DeepGOPlus: improved protein function prediction from sequence. Bioinformatics 2021;37(8):1187.
  • [24] Zhou Naihui, Jiang Yuxiang, Bergquist Timothy R, Lee Alexandra J, Kacsoh Balint Z, Crocker Alex W, et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol 2019;20(1):1-23.
  • [25] Cikes Maja, Sanchez-Martinez Sergio, Claggett Brian, Duchateau Nicolas, Piella Gemma, Butakoff Constantine, et al. Machine learning-based phenogrouping in heart failure to identify responders to cardiac resynchronization therapy. Eur J Heart Fail 2019;21(1):74-85.
  • [26] Mincholé Ana, Rodriguez Blanca. Artificial intelligence for the electrocardiogram. Nat Med 2019;25(1):22-3.
  • [27] Jambukia Shweta H, Dabhi Vipul K, Prajapati Harshadkumar B. Classification of ECG signals using machine learning techniques: A survey. In: 2015 international conference on advances in computer engineering and applications. IEEE; 2015, p. 714-21.
  • [28] Bollmann Andreas, Lombardi Federico. Electrocardiology of atrial fibrillation. IEEE Eng Med Biol Mag 2006;25(6):15-23.
  • [29] Hagiwara Yuki, Fujita Hamido, Oh Shu Lih, Tan Jen Hong, San Tan Ru, Ciaccio Edward J, et al. Computer-aided diagnosis of atrial fibrillation based on ECG signals: A review. Inform Sci 2018;467:99-114.
  • [30] Benditt David G, Benson Jr D Woodrow, Dunnigan Ann, Gornick Charles C, Anderson Robert W. Atrial flutter, atrial fibrillation, and other primary atrial tachycardias. Med Clin North Am 1984;68(4):895-918.
  • [31] Herzog Eyal, Argulian Edgar, Levy Steven B, Aziz Emad F. Pathway for the management of atrial fibrillation and atrial flutter. Crit Pathw Cardiol 2017;16(2):47-52.
  • [32] Ruipérez-Campillo Samuel, Castrejón Sergio, Martínez Marcel, Cervigón Raquel, Meste Olivier, Merino José Luis, et al. Non-invasive characterisation of macroreentrant atrial tachycardia types from a vectorcardiographic approach with the slow conduction region as a cornerstone. Comput Methods Programs Biomed 2021;200:105932.
  • [33] Ruipérez-Campillo Samuel, Castrejón Sergio, Martínez Marcel, Cervigón Raquel, Meste Olivier, Merino José Luis, et al. Slow Conduction Regions as a valuable vectorcardiographic parameter for the non-invasive identification of atrial flutter types. In: 2020 computing in cardiology. IEEE; 2020, p. 1-4.
  • [34] DeMasi Orianna, Kording Konrad, Recht Benjamin. Meaningless comparisons lead to false optimism in medical machine learning. PLoS One 2017;12(9):e0184604.
  • [35] Kussul Ernst, Baidyk Tatiana. Improved method of handwritten digit recognition tested on MNIST database. Image Vis Comput 2004;22(12):971-81.
  • [36] Keren Gideon, Baggen Stan. Recognition models of alphanumeric characters. Percept Psychophys 1981;29(3):234-46.
  • [37] Lindeberg Tony. Scale-space for discrete signals. IEEE Trans Pattern Anal Mach Intell 1990;12(3):234-54.
  • [38] Halmos Paul R. Finite-dimensional vector spaces. Courier Dover Publications; 2017.
  • [39] Gudder Stanley. Inner product spaces. Amer Math Monthly 1974;81(1):29-36.
  • [40] Ghorbani Hamid. Mahalanobis distance and its application for detecting multivariate outliers. Facta Univ Ser Math Inform 2019;34:583-95.
  • [41] The Mahalanobis distance. Chemometr Intell Lab Syst 2000;50(1):1-18.
  • [42] Johnson Stephen C. Hierarchical clustering schemes. Psychometrika 1967;32(3):241-54.
  • [43] Hu Minhui, Zeng Kaiwei, Wang Yaohua, Guo Yang. Threshold-based hierarchical clustering for person re-identification. Entropy 2021;23(5):522.
  • [44] Murtagh Fionn, Contreras Pedro. Algorithms for hierarchical clustering: an overview. Wiley Interdiscip Rev Data Min Knowl Discov 2012;2(1):86-97.
  • [45] Bhattacharya Binay K, Toussaint Godfried T. On geometric algorithms that use the furthest-point voronoi diagram. In: Machine intelligence and pattern recognition, vol. 2, Elsevier; 1985, p. 43-61.
  • [46] Murtagh Fionn, Legendre Pierre. Ward’s hierarchical clustering method: clustering criterion and agglomerative algorithm. 2011, arXiv preprint arXiv:1111.6285.
  • [47] Lurka Adam. Spatio-temporal hierarchical cluster analysis of mining-induced seismicity in coal mines using Ward’s minimum variance method. J Appl Geophys 2021;184:104249.
  • [48] Shalchyan Vahid, Jensen Winnie, Farina Dario. Spike detection and clustering with unsupervised wavelet optimization in extracellular neural recordings. IEEE Trans Biomed Eng 2012;59(9):2576-85.
  • [49] Caliński Tadeusz, Harabasz Jerzy. A dendrite method for cluster analysis. Commun Stat - Theory Methods 1974;3(1):1-27.
  • [50] Kulmanov Maxat, Khan Mohammed Asif, Hoehndorf Robert. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics 2018;34(4):660-8.
  • [51] Zhou Guangjie, Wang Jun, Zhang Xiangliang, Guo Maozu, Yu Guoxian. Predicting functions of maize proteins using graph convolutional network. BMC Bioinformatics 2020;21(16):1-16.
  • [52] Dickey Adam S, Suminski Aaron, Amit Yali, Hatsopoulos Nicholas G. Single-unit stability using chronically implanted multielectrode arrays. J Neurophysiol 2009;102(2):1331-9.
  • [53] Emondi AA, Rebrik SP, Kurgansky AV, Miller KD. Tracking neurons recorded from tetrodes across time. J Neurosci Methods 2004;135(1-2):95-105.
  • [54] Yuan AX, Colonell J, Lebedeva A, Okun M, Charles AS, Harris TD. Multi-day neuron tracking in high density electrophysiology recordings using earth mover’s distance. eLife 2024;12:RP92495.
  • [55] Yuan Zheng. Prediction of protein subcellular locations using Markov chain models. FEBS Lett 1999;451(1):23-6.
  • [56] Kayed Mohammed, Anter Ahmed, Mohamed Hadeer. Classification of garments from fashion MNIST dataset using CNN LeNet-5 architecture. In: 2020 international conference on innovative trends in communication and computer engineering. IEEE; 2020, p. 238-43.
Notes
Record prepared with funds from MNiSW, agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularisation of Science (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-06a3e2e9-c441-44d9-812f-cd15e9b0c29d