Article title

Determining relevant biomarkers for prediction of breast cancer using anthropometric and clinical features: A comparative investigation in machine learning paradigm

Publication languages
EN
Abstracts
EN
Early detection of breast cancer plays a crucial role in the planning and outcome of the associated treatment. The purpose of this article is threefold: (i) to investigate whether clinical features obtained from routine blood analysis, combined with anthropometric measurements, can be used to predict breast cancer with machine learning techniques; (ii) to explore the role of machine learning components such as feature selection, data division protocols and classification in determining suitable biomarkers for breast cancer prediction; and (iii) to evaluate a recent database of clinical and anthropometric measurements acquired from normal individuals and individuals suffering from breast cancer. A database consisting of anthropometric and clinical attributes is used in the experiments. Several feature selection and statistical significance analysis methods are used to determine the relevance of each feature. Furthermore, popular classifiers, namely kernel-based support vector machine (SVM), naïve Bayes, linear discriminant, quadratic discriminant, logistic regression, K-nearest neighbor (K-NN) and random forest, are implemented and evaluated for breast cancer risk prediction using these features. The feature selection results indicate that, among the nine features considered in this study, glucose, age and resistin are the most relevant and effective biomarkers for breast cancer prediction. When only these three features are used for classification, the medium K-NN classifier achieves the highest classification accuracy, 92.105%, followed by the medium Gaussian SVM, which achieves a classification accuracy of 83.684%, both under the hold-out data division protocol.
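The evaluation step described in the abstract (a K-NN classifier on three features under a hold-out split) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the value of k, the split ratio, or any preprocessing, so the sketch assumes k = 10 (the setting MATLAB's Classification Learner uses for its "Medium KNN" preset), a 70/30 hold-out split, and z-score scaling. The function names are hypothetical, and the data are synthetic values generated only to make the example runnable; they are not the study's database and the resulting accuracy says nothing about the reported figures.

```python
import math
import random
from collections import Counter


def make_synthetic(n_per_class=60, seed=0):
    """Return (features, labels) with arbitrary made-up values; NOT the study's data."""
    rng = random.Random(seed)
    x, y = [], []
    # Arbitrary class means for three columns standing in for glucose, age, resistin.
    for label, (mu_a, mu_b, mu_c) in enumerate([(90, 50, 10), (105, 58, 16)]):
        for _ in range(n_per_class):
            x.append([rng.gauss(mu_a, 12), rng.gauss(mu_b, 12), rng.gauss(mu_c, 6)])
            y.append(label)
    return x, y


def standardize(train, test):
    """Z-score each feature using statistics from the training set only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_x, train_y, query, k=10):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    # On a tied vote, Counter keeps insertion order, so the nearer neighbor's label wins.
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def holdout_accuracy(x, y, test_fraction=0.3, k=10, seed=0):
    """Shuffle once, hold out a fraction for testing, and return test accuracy."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(x) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train_x, test_x = standardize([x[i] for i in train_idx], [x[i] for i in test_idx])
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_x, train_y, q, k) == y[i]
               for q, i in zip(test_x, test_idx))
    return hits / n_test


if __name__ == "__main__":
    features, labels = make_synthetic()
    print(f"hold-out accuracy on synthetic data: {holdout_accuracy(features, labels):.3f}")
```

Scaling uses training-set statistics only, so no information from the held-out samples leaks into the classifier. A single hold-out split gives a noisy estimate on small datasets, so repeating the split with different seeds and averaging is a common complement.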
Authors
  • Department of Biomedical Engineering, National Institute of Technology, Raipur, G.E. Road, Raipur, Chhattisgarh 492010, India
References
  • [1] American Cancer Society. Cancer Facts & Figures 2016; 2016.
  • [2] Cancer Prevention and Early Detection: Facts and Figures 2012. American Cancer Society Cancer Action Network; 2012.
  • [3] Singh BK, Verma K, Thoke AS. A dual feature selection approach for classification of breast tumors in ultrasound images using ANN and SVM. Artif Intell Syst Mach Learn 2015;7(3):78–84.
  • [4] Singh BK, Verma K, Thoke AS. Fuzzy cluster based neural network classifier for classifying breast tumors in ultrasound images. Expert Syst Appl 2016;66:114–23.
  • [5] Singh BK, Verma K, Panigrahi L, Thoke AS. Integrating radiologist feedback with computer aided diagnostic systems for breast cancer risk prediction in ultrasonic images: an experimental investigation in machine learning paradigm. Expert Syst Appl 2017;90:209–23.
  • [6] Singh BK, Verma K, Thoke AS, Suri JS. Risk stratification of 2D ultrasound-based breast lesions using hybrid feature selection in machine learning paradigm. Measurement 2017;105:146–57.
  • [7] Panigrahi L, Verma K, Singh BK. Ultrasound image segmentation using a novel multi-scale Gaussian kernel fuzzy clustering and multi-scale vector field convolution. Expert Syst Appl 2019;115:486–98.
  • [8] Opstal-van Winden AW, Rodenburg W, Pennings JL, van Oostrom C, Beijnen JH, Peeters PH, van Gils CH, de Vries A. A bead-based multiplexed immunoassay to evaluate breast cancer biomarkers for early detection in pre-diagnostic serum. Int J Mol Sci 2012;13(10):13587–604.
  • [9] Santillán-Benítez JG, Mendieta-Zerón H, Gómez-Oliván LM, Torres-Juárez JJ, González-Bañales JM, Hernández-Peña LV, Ordóñez-Quiroz A. The tetrad BMI, leptin, leptin/adiponectin (L/A) ratio and CA 15-3 are reliable biomarkers of breast cancer. J Clin Lab Anal 2013;27(1):12–20.
  • [10] Dalamaga M, Sotiropoulos G, Karmaniolas K, Pelekanos N, Papadavid E, Lekka A. Serum resistin: a biomarker of breast cancer in postmenopausal women? Association with clinicopathological characteristics, tumor markers, inflammatory and metabolic parameters. Clin Biochem 2013;46(7–8):584–90.
  • [11] Kloten V, Becker B, Winner K, Schrauder MG, Fasching PA, Anzeneder T, Veeck J, Hartmann A, Knüchel R, Dahl E. Promoter hypermethylation of the tumor-suppressor genes ITIH5, DKK3, and RASSF1A as novel biomarkers for blood-based breast cancer screening. Breast Cancer Res 2013;15(1):R4.
  • [12] Zhu Q, Wang L, Tannenbaum S, Ricci A, DeFusco P, Hegde P. Pathologic response prediction to neoadjuvant chemotherapy utilizing pretreatment near-infrared imaging parameters and tumor pathologic criteria. Breast Cancer Res 2014;16(5):456.
  • [13] Zheng B, Yoon SW, Lam SS. Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms. Expert Syst Appl 2014;41(4):1476–82.
  • [14] Provatopoulou X, Georgiou GP, Kalogera E, Kalles V, Matiatou MA, Papapanagiotou I, Sagkriotis A, Zografos GC, Gounaris A. Serum irisin levels are lower in patients with breast cancer: association with disease diagnosis and tumor characteristics. BMC Cancer 2015;15(1):898.
  • [15] Assiri AM, Kamel HF. Evaluation of diagnostic and predictive value of serum adipokines: leptin, resistin and visfatin in postmenopausal breast cancer. Obes Res Clin Pract 2016;10(4):442–53.
  • [16] Lee E, Moon A. Identification of biomarkers for breast cancer using databases. J Cancer Prev 2016;21(4):235.
  • [17] Alicković E, Subasi A. Breast cancer diagnosis using GA feature selection and rotation forest. Neural Comput Appl 2017;28(4):753–63.
  • [18] Choi J, Park S, Yoon Y, Ahn J. Improved prediction of breast cancer outcome by identifying heterogeneous biomarkers. Bioinformatics 2017;33(22):3619–26.
  • [19] Nicolini A, Ferrari P, Duffy MJ. Prognostic and predictive biomarkers in breast cancer: past, present and future. Semin Cancer Biol 2018;52:56–73.
  • [20] Patrício M, Pereira J, Crisóstomo J, Matafome P, Gomes M, Seiça R, Caramelo F. Using resistin, glucose, age and BMI to predict the presence of breast cancer. BMC Cancer 2018;18(1):29.
  • [21] Phillips M, Cataneo RN, Cruz-Ramos JA, Huston J, Ornelas O, Pappas N, Pathak S. Prediction of breast cancer risk with volatile biomarkers in breath. Breast Cancer Res Treat 2018;1–8.
  • [22] Chen YZ, Kim Y, Soliman H, Ying G, Lee JK. Single drug biomarker prediction for ER breast cancer outcome from chemotherapy. Endocr Relat Cancer 2018;25(6):595–605.
  • [23] Besutti G, Iotti V, Rossi PG. Molecular imaging biomarkers for breast cancer risk and personalized screening. Transl Cancer Res 2018;7(5):1319–25.
  • [24] Kyrochristos ID, Ziogas DE, Lykoudis EG, Roukos DH. Breast cancer genome analysis in time and space: biomarker development strategy. Biomark Med 2018;12(6):547–50.
  • [25] Feng W, Li Y, Chu J, Li J, Zhang Y, Ding X, Fu Z, Li W, Huang X, Yin Y. Identification of tRNA-derived small noncoding RNAs as potential biomarkers for prediction of recurrence in triple-negative breast cancer. Cancer Med 2018;7(10):5130–44.
  • [26] Weaver O, Leung JW. Biomarkers and imaging of breast cancer. Am J Roentgenol 2018;210(2):271–8.
  • [27] Landau S, Everitt BS. A handbook of statistical analyses using SPSS. Chapman & Hall/CRC; 2004.
  • [28] Chandrashekar G, Sahin F. A survey on feature selection methods. Comput Electr Eng 2014;40(1):16–28.
  • [29] Setiono R, Liu H. Improving backpropagation learning with feature selection. Appl Intell 1996;6(2):129–39.
  • [30] Zhao Z, Morstatter F, Sharma S, Alelyani S, Anand A, Liu H. Advancing feature selection research. ASU Feature Sel Repos 2010;1–28.
  • [31] Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn 1993;11(1):63–90.
  • [32] Robnik-Šikonja M, Kononenko I. Theoretical and empirical analysis of ReliefF and RReliefF. Mach Learn 2003;53(1–2):23–69.
  • [33] Yu L, Liu H. Feature selection for high-dimensional data: a fast correlation-based filter solution. Proceedings of the 20th International Conference on Machine Learning (ICML-03); 2003. p. 856–63.
  • [34] Hall MA. Correlation-based feature selection for machine learning; 1999.
  • [35] Kolde R, Laur S, Adler P, Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics 2012;28(4):573–80.
  • [36] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995;20(3):273–97.
  • [37] Boser BE, Guyon IM, Vapnik VN. A training algorithm for optimal margin classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory. ACM; 1992. p. 144–52.
  • [38] Tharwat A. Linear vs. quadratic discriminant analysis classifier: a tutorial. Int J Appl Pattern Recognit 2016;3(2):145–80.
  • [39] Li T, Zhu S, Ogihara M. Using discriminant analysis for multi-class classification: an experimental investigation. Knowl Inf Syst 2006;10(4):453–72.
  • [40] Kalmegh SR. Comparative analysis of weka data mining algorithm random forest, random tree and lad tree for classification of indigenous news data. Int J Emerg Technol Adv Eng 2015;5(1):507–17.
  • [41] Cappelletti V, Appierto V, Tiberio P, Fina E, Callari M, Daidone MG. Circulating biomarkers for prediction of treatment response. J Natl Cancer Inst Monogr 2015;2015(51):60–3.
  • [42] Novaković J, Strbac P, Bulatovic D. Toward optimal feature selection using ranking methods and classification algorithms. Yugosl J Oper Res 2016;21(1).
  • [43] Isaac ER. Test of hypothesis – concise formula summary. Available from: https://www.researchgate.net/profile/Ebenezer_Isaac/publication/283318687_Test_of_Hypothesis_-_Concise_Formula_Summary/links/5632e74c08aefa44c3685cd7/Test-of-Hypothesis-Concise-Formula-Summary.pdf [accessed 09.03.19].
Notes
Record prepared under agreement 509/P-DUN/2018, financed from funds of the Ministry of Science and Higher Education (MNiSW) allocated to activities disseminating science (2019).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-b774ee94-72aa-41fb-8640-a5f8d615a397