Title variants
Publication languages
Abstracts
The most common type of liver cancer is hepatocellular carcinoma (HCC), which begins in hepatocytes. Like most types of cancer, HCC shows no symptoms in its early stages and is therefore difficult to detect at that point. Symptoms begin to appear in the advanced stages of the disease, as a result of the unchecked growth of cancer cells. Early detection can therefore enable timely treatment and reduce the mortality rate. In this paper, we propose a novel machine learning model that combines seven classifiers, including K-nearest neighbor (KNN), random forest, and Naïve Bayes, in a stacking (ensemble) learning method, with genetic optimization selecting the features for each classifier to obtain the highest HCC detection accuracy. To prepare the data for further processing, we applied normalization techniques and used the KNN algorithm to fill in the missing values. We trained and evaluated the developed algorithm with stratified cross-validation on data from 165 HCC patients collected at Coimbra's Hospital and University Centre (CHUC). The dataset contains a total of 49 clinically significant features, divided into quantitative and qualitative groups. The proposed algorithm achieved its highest accuracy and F1-score of 0.9030 and 0.8857, respectively. The developed model is ready to be tested on larger databases and could be employed in cancer screening laboratories to help clinicians make an accurate diagnosis.
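The two preprocessing steps named in the abstract, normalization and KNN-based filling of missing values, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the toy data, the choice of min-max scaling, the distance measure, the value of k, and the order of the two steps are all assumptions made for illustration only.

```python
import math

def min_max_normalize(rows):
    """Scale each column to [0, 1]; missing values (None) are left untouched."""
    scaled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(present), max(present)
        span = hi - lo
        for r in scaled:
            if r[j] is not None:
                r[j] = (r[j] - lo) / span if span else 0.0
    return scaled

def knn_impute(rows, k=2):
    """Replace each None with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over the features that both
    rows have observed (an assumption; other measures are possible).
    Assumes every column has at least one observed value in another row.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

# Hypothetical toy data: four patients, two quantitative features, one gap.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [10.0, 100.0]]
filled = knn_impute(min_max_normalize(data), k=2)
```

Normalizing before imputing keeps features with large numeric ranges from dominating the neighbor distances; whether the paper applies the steps in this order is not stated in the abstract.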
Journal
Year
Volume
Pages
1512–1524
Physical description
Bibliography: 59 items, figures, tables, charts.
Authors
author
- Department of Information and Communications Technology, Faculty of Computer Science and Telecommunications, Cracow University of Technology, ul. Warszawska 24, F-3, 31-155 Krakow, Poland
author
- Information Technology Department, Faculty of Computers and Information, Menoufia University, Shibin el-Kom, Menoufia, Egypt
author
- Department of Information and Communications Technology, Faculty of Computer Science and Telecommunications, Cracow University of Technology, ul. Warszawska 24, F-3, 31-155 Krakow, Poland; Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Baltycka 5, 44-100 Gliwice, Poland, plawiak@pk.edu.pl, plawiak@iitis.pl, plawiak.pawel@gmail.com
author
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore; Department of Biomedical Engineering, School of Science and Technology, Singapore School of Social Sciences, Singapore; Department of Biomedical Informatics and Medical Engineering, Asia University, Taiwan
author
- Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Krakow, Poland
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-2dfc635c-78fa-41eb-b396-d6a94259ab34