Title variants
Publication languages
Abstracts
Current advances in high-throughput and imaging technologies are paving the way for next-generation healthcare tailored to the clinical and molecular characteristics of each patient. The Big Data obtained from these technologies is of little value to society unless it can be analyzed, interpreted, and applied in a relatively customized and inexpensive way. We propose a flexible decision support system for multi-omics data analysis, called IntelliOmics, built from well-designed and maintained components with open licenses for both personal and commercial use. Our proposal aims to offer some insight into how to build your own local end-to-end service for personalized medicine: from raw data upload, through intelligent integration and exploration, to detailed analysis accompanying clinical medical reports. The high-throughput data is collected and processed in a parallel and distributed manner using the Hadoop framework and user-defined scripts. Heterogeneous data, transformed mainly with Apache Hive, is then integrated into a so-called ‘knowledge base’. On this basis, manual analysis in the form of hierarchical rules can be performed, as well as automatic data analysis with Apache Spark and its machine learning library MLlib. Finally, diagnostic and prognostic tools, charts, tables, statistical tests and print-ready clinical reports are provided for an individual patient or a group of patients. The experimental evaluation was performed as part of clinical decision support for targeted therapy in non-small cell lung cancer. The system successfully processed over a hundred multi-omic patient data sets and offers various functionalities for different types of users: researchers, biostatisticians/bioinformaticians, clinicians and medical boards.
Journal
Year
Volume
Pages
1646--1663
Physical description
Bibliography: 75 items, figures, tables.
Authors
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
author
- Faculty of Computer Science, Bialystok University of Technology, Wiejska 45A, 15-351 Bialystok, Poland, m.czajkowski@pb.edu.pl
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
author
- Clinical Research Centre, Medical University of Bialystok, Bialystok, Poland
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
author
- Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-73f0f65e-8710-4997-aab0-c06b18248ed3