A comparative study of kernel-based vector machines with probabilistic outputs for medical diagnosis

Qian, Xusheng; Zhou, Zhiyong; Hu, Jisu; Zhu, Jianbing; Huang, He; Dai, Yakang

doi:10.1016/j.bbe.2021.09.003

Artykuł - szczegóły

Tytuł artykułu

A comparative study of kernel-based vector machines with probabilistic outputs for medical diagnosis

Autorzy

Qian Xusheng , Zhou Zhiyong , Hu Jisu , Zhu Jianbing , Huang He , Dai Yakang

Wybrane pełne teksty z tego czasopisma

Identyfikatory

DOI

10.1016/j.bbe.2021.09.003

Warianty tytułu

Języki publikacji

Abstrakty

In this paper, support vector machines (SVMs), least squares SVMs (LSSVMs), relevance vector machines (RVMs), and probabilistic classification vector machines (PCVMs), are compared on sixteen binary and multiclass medical datasets. Particular emphasis is put on the comparison among the commonly used Gaussian radial basis function (GRBF) kernel, and the relatively new generalized min–max (GMM) kernel and exponentiated-GMM (eGMM) kernel. Since most medical decisions involve uncertainty, a postprocessing approach based on Platt’s method and pairwise coupling is employed to produce probabilistic outputs for prediction uncertainty assessment. The extensive empirical study illustrates that the SVM classifier using the tuning-free GMM kernel (SVM-GMM) shows good usability and broad applicability, and exhibits competitive performance against some state-of-the-art methods. These results indicate that SVM-GMM can be used as the first-choice method when selecting an appropriate kernel-based vector machine for medical diagnosis. As an illustration, SVM-GMM efficiently achieves a high accuracy of 98.92% on the thyroid disease dataset consisting of 7200 samples.

Słowa kluczowe

Kernel based vector machines generalized min–max kernel probabilistic output medical diagnosis

maszyna wektorów nośnych wyjście probabilistyczne diagnoza medyczna

Wydawca

Nałęcz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences
Elsevier

Czasopismo

Biocybernetics and Biomedical Engineering

Rocznik

2021

Tom

Vol. 41, no. 4

Strony

1486--1504

Opis fizyczny

Bibliogr. 91 poz., rys., tab.

Twórcy

autor

Qian Xusheng

School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Suzhou, China; Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, China

autor

Zhou Zhiyong

Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, China

autor

Hu Jisu

School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Suzhou, China; Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, China

autor

Zhu Jianbing

Suzhou Science and Technology Town Hospital, Suzhou, China

autor

Huang He

hhuang@suda.edu.cn

School of Electronics and Information Engineering, Soochow University, Suzhou, China

autor

Dai Yakang

daiyk@sibet.ac.cn

Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, China; Jinan Guoke Medical Engineering Technology Development co., LTD, Jinan, China

Bibliografia

[1] Kononenko I. Machine learning for medical diagnosis: history, state of the art and perspective. Artificial Intelligence in Medicine 2001;23(1):89–109.
[2] Itani S, Lecron F, Fortemps P. Specifics of medical data mining for diagnosis aid: A survey. Expert Systems with Applications 2019;118:300–14.
[3] Qin J, Chen L, Liu Y, Liu C, Feng C, Chen B. A machine learning methodology for diagnosing chronic kidney disease. IEEE Access 2019;8:20991–1002.
[4] Haileamlak A. Chronic kidney disease is on the rise. Ethiopian Journal of Health Sciences 2018;28(6):681.
[5] Olivares R, Munoz R, Soto R, Crawford B, Cárdenas D, Ponce A, Taramasco C. An optimized brain-based algorithm for classifying Parkinson’s disease. Applied Sciences 2020;10(5):1827.
[6] Ball N, Teo W-P, Chandra S, Chapman J. Parkinson’s disease and the environment. Frontiers in Neurology 2019;10:218.
[7] Raza K. Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. In: U-Healthcare Monitoring Systems. Elsevier; 2019. p. 179–96.
[8] Kaptoge S, Pennells L, De Bacquer D, Cooney MT, Kavousi M, Stevens G, Riley LM, Savin S, Khan T, Altay S, et al. World Health Organization cardiovascular disease risk charts: revised models to estimate risk in 21 global regions. The Lancet Global Health 2019;7(10):e1332–45.
[9] Hayashi Y. Synergy effects between grafting and subdivision in Re-RX with J48graft for the diagnosis of thyroid disease. Knowledge-Based Systems 2017;131:170–82.
[10] Shah N, Ursani TJ, Shah NA, Raza HMZ. Prevalence and etiology of thyroid disease: A review. Pure and Applied Biology (PAB) 2021;10(3):691–702.
[11] Chen Y, Guo A, Chen Q, Quan B, Liu G, Li L, Hong J,Wei H, Hao Z. Intelligent classification of antepartum cardiotocography model based on deep forest. Biomedical Signal Processing and Control 2021;67 102555.
[12] Liu N, Shen J, Xu M, Gan D, Qi E-S, Gao B. Improved cost-sensitive support vector machine classifier for breast cancer diagnosis. Mathematical Problems in Engineering 2018;2018.
[13] Ullah MF. Breast cancer: current perspectives on the disease status. Breast Cancer Metastasis and Drug Resistance 2019:51–64.
[14] Er O, Tanrikulu AC, Abakay A, Temurtas F. An approach based on probabilistic neural network for diagnosis of mesotheliomas disease. Computers & Electrical Engineering 2012;38(1):75–81.
[15] Krówczyńska M, Wilk E. Asbestos exposure and the mesothelioma incidence in Poland. International Journal of Environmental Research and Public Health 2018;15 (8):1741.
[16] Sahebi G, Movahedi P, Ebrahimi M, Pahikkala T, Plosila J, Tenhunen H. Gefes: A generalized wrapper feature selection approach for optimizing classification performance. Computers in Biology and Medicine 2020;125 103974.
[17] Kandwal M, Jindal R, Chauhan P, Roy S. Skin diseases in geriatrics and their effect on the quality of life: A hospital-based observational study. Journal of Family Medicine and Primary Care 2020;9(3):1453.
[18] Kalantari A, Kamsin A, Shamshirband S, Gani A, Alinejad-Rokny H, Chronopoulos AT. Computational intelligence approaches for classification of medical data: State-of-theart, future challenges and research directions. Neurocomputing 2018;276:2–22.
[19] K. Bache, M. Lichman, UCI machine learning repository. URL: http://archive.ics.uci.edu/ml, 2013.
[20] Cortes C, Vapnik V. Support-vector networks. Machine Learning 1995;20(3):273–97.
[21] Muller K-R, Mika S, Ratsch G, Tsuda K, Scholkopf B. An introduction to kernel-based learning algorithms. IEEE Transactions on Neural Networks 2001;12(2):181–201.
[22] T. Joachims, Making large-scale SVM learning practical. In‘‘ B. Schölkopf, C. Burges, A. Smola. MIT Press, Cambridge, MA, 1999. URL:http://www.cs.cornell.edu/people/tj/svm_light/index.html.
[23] C.-C. Chang and C.-J. Lin, ‘‘LIBSVM: A library for support vector machines,” ACM transactions on intelligent systems and technology (TIST), vol. 2, no. 3, pp. 1–27, 2011. URL:http://www.csie.ntu.edu.tw/ cjlin/libsvm.
[24] Arora G, Dubey AK, Jaffery ZA, Rocha A. Bag of feature and support vector machine based early diagnosis of skin cancer. Neural Computing and Applications 2020:1–8.
[25] Ab Hamid TMT, Sallehuddin R, Yunos ZM, Ali A. Ensemble based filter feature selection with harmonize particle swarm optimization and support vector machine for optimal cancer classification. Machine Learning with Applications 2021 100054.
[26] Li M, Chen W, Zhang T. Automatic epilepsy detection using wavelet-based nonlinear analysis and optimized SVM. Biocybernetics and Biomedical Engineering 2016;36(4):708–18.
[27] Suykens JA, Vandewalle J. Least squares support vector machine classifiers. Neural Processing Letters 1999;9(3):293–300.
[28] Van Gestel T, Suykens JA, Baesens B, Viaene S, Vanthienen J, Dedene G, De Moor B, Vandewalle J. Benchmarking least squares support vector machine classifiers. Machine Learning 2004;54(1):5–32.
[29] Maheshwari S, Pachori RB, Acharya UR. Automated diagnosis of glaucoma using empirical wavelet transform and correntropy features extracted from fundus images. IEEE Journal of Biomedical and Health Informatics 2016;21(3):803–13.
[30] Yang K, Zhou B, Yi F, Chen Y, Chen Y. Colorectal cancer diagnostic algorithm based on sub-patch weight color histogram in combination of improved least squares support vector machine for pathological image. Journal of Medical Systems 2019;43(9):1–9.
[31] Chen L, Zhou S. Sparse algorithm for robust LSSVM in primal space. Neurocomputing 2018;275:2880–91.
[32] Schölkopf B, Herbrich R, Smola AJ. A generalized representer theorem. In: International Conference on Computational Learning Theory. Springer; 2001. p. 416–26.
[33] Wang K, Zhong P. Robust non-convex least squares loss function for regression with outliers. Knowledge-Based Systems 2014;71:290–302.
[34] Yang X, Tan L, He L. A robust least squares support vector machine for regression and classification with noise. Neurocomputing 2014;140:41–52.
[35] Afifi S, GholamHosseini H, Sinha R. Dynamic hardware systemDynamic hardware system for cascade SVM classification of melanoma. Neural Computing and Applications 2020;32(6):1777–88.
[36] J. Wu, F. Gou, Y. Tan, A Staging Auxiliary Diagnosis Model for Nonsmall Cell Lung Cancer Based on the Intelligent Medical System Computational and Mathematical Methods in Medicine, vol. 2021, 2021.
[37] Begoli E, Bhattacharya T, Kusnezov D. The need for uncertainty quantification in machine-assisted medical decision making. Nature Machine Intelligence 2019;1(1):20–3.
[38] Luo J, Wong C-M, Vong C-M. Multinomial bayesian extreme learning machine for sparse and accurate classification model. Neurocomputing 2021;423:24–33.
[39] Lu C, Van Gestel T, Suykens JA, Van Huffel S, Vergote I, Timmerman D. Preoperative prediction of malignancy of ovarian tumors using least squares support vector machines. Artificial Intelligence in Medicine 2003;28(3):281–306.
[40] Maros ME, Capper D, Jones DT, Hovestadt V, von Deimling A, Pfister SM, Benner A, Zucknick M, Sill M. Machine learning workflows to estimate class probabilities for precision cancer diagnostics on DNA methylation microarray data. Nature Protocols 2020;15(2):479–512.
[41] Platt J et al. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers 1999;10(3):61–74.
[42] Lin H-T, Lin C-J, Weng RC. A note on platts probabilistic outputs for support vector machines. Machine Learning 2007;68(3):267–76.
[43] Wu T-F, Lin C-J, Weng RC. Probability estimates for multiclass classification by pairwise coupling. Journal of Machine Learning Research 2004;5(Aug):975–1005.
[44] Tipping ME. Sparse bayesian learning and the relevance vector machine. Journal of Machine Learning Research 2001;1(Jun):211–44.
[45] Chen H, Tino P, Yao X. Probabilistic classification vector machines. IEEE Transactions on Neural Networks 2009;20(6):901–14.
[46] Psorakis I, Damoulas T, Girolami MA. Multiclass relevance vector machines: sparsity and accuracy. IEEE Transactions on Neural Networks 2010;21(10):1588–98.
[47] Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological) 1977;39(1):1–22.
[48] Qian X, Huang H, Hu J, Zhou Z, Geng C, Dai Y. Probabilistic classification vector machines for multiclass classification. In: 2019 Chinese Control And Decision Conference (CCDC). IEEE; 2019. p. 1028–32.
[49] Pillonetto G, Dinuzzo F, Chen T, De Nicolao G, Ljung L. Kernel methods in system identification, machine learning and function estimation: A survey. Automatica 2014;50(3):657–82.
[50] Shahid AH, Singh M. Computational intelligence techniques for medical diagnosis and prognosis: Problems and current developments. Biocybernetics and Biomedical Engineering 2019;39(3):638–72.
[51] Li P, Zhang C-H. Theory of the GMM kernel. In: Proceedings of the 26th International Conference on World Wide Web. p. 1053–62.
[52] Li P. Linearized GMM kernels and normalized random Fourier features. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. p. 315–24.
[53] P. Li, Tunable GMM kernels, arXiv preprint arXiv:1701.02046, 2017.
[54] I.Y. Chen, S. Joshi, M. Ghassemi, R. Ranganath, Probabilistic machine learning for healthcare, Annual Review of Biomedical Data Science 4 (2020).
[55] Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 1997;30(7):1145–59.
[56] Hand DJ, Till RJ. A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 2001;45(2):171–86.
[57] Demšar J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 2006;7(Jan):1–30.
[58] Yuille AL, Rangarajan A. The concave-convex procedure. Neural Computation 2003;15(4):915–36.
[59] Zhou S. Sparse LSSVM in primal using Cholesky factorization for large-scale problems. IEEE Transactions on Neural Networks and Learning Systems 2015;27(4):783–95.
[60] Huang H-M, Lin C. A kernel-based image denoising method for improving parametric image generation. Medical Image Analysis 2019;55:41–8.
[61] T. Anderson, An introduction to multivariate statistical analysis, 3rd Edition John Wiley and Sons, New York, 2003.
[62] Rousseeuw PJ, Leroy AM. Robust Regression and Outlier Detection, vol. 589. John wiley & sons; 2005.
[63] C.C. Aggarwal, A. Hinneburg, D.A. Keim, On the surprising behavior of distance metrics in high dimensional space, in: International Conference on Database Theory.
[64] De Brabanter K, De Brabanter J, Suykens JA, De Moor B. Optimized fixed-size kernel models for large data sets. Computational Statistics & Data Analysis 2010;54(6):1484–504.
[65] Xavier-de Souza S, Suykens JA, Vandewalle J, Bollé D. Coupled simulated annealing. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 2009;40(2):320–35.
[66] J.A. Nelder, R. Mead, A simplex method for function minimization, The Computer Journal 7 (4) (1965) 308–313.
[67] K. Leuven, LS-SVMlab v1.8. URL: http://www.esat.kuleuven.be/sista/lssvmlab, 2011.
[68] Kang S, Cho S, Kang P. Constructing a multi-class classifier using one-against-one approach with different binary classifiers. Neurocomputing 2015;149:677–82.
[69] Sharma D, Yadav U, Sharma P. The concept of sensitivity and specificity in relation to two types of errors and its application in medical research. Journal of Reliability and Statistical studies 2009:53–8.
[70] Fawcett T. ROC graphs: Notes and practical considerations for researchers. Machine Learning 2004;31(1):1–38.
[71] Huang J, Ling CX. Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on knowledge and Data Engineering 2005;17(3):299–310.
[72] Lieli RP, Hsu Y-C. Using the area under an estimated ROC curve to test the adequacy of binary predictors. Journal of Nonparametric Statistics 2019;31(1):100–30.
[73] Nanni L, Lumini A. An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring. Expert Systems with Applications 2009;36(2):3028–33.
[74] Wang X-Z, He Y-L, Wang DD. Non-naive bayesian classifiers for classification problems with continuous attributes. IEEE Transactions on Cybernetics 2013;44(1):21–39.
[75] Friedman M. A comparison of alternative tests of significance for the problem of m rankings. The Annals of Mathematical Statistics 1940;11(1):86–92.
[76] Iman RL, Davenport JM. Approximations of the critical region of the fbietkan statistic. Communications in Statistics-Theory and Methods 1980;9(6):571–95.
[77] Zhang GP, Berardi VL. An investigation of neural networks in thyroid function diagnosis. Health Care Management Science 1998;1(1):29–37.
[78] Yu Z, Li T, Yu N, Pan Y, Chen H, Liu B. Reconstruction of hidden representation for robust feature extraction. ACM Transactions on Intelligent Systems and Technology (TIST) 2019;10(2):1–24.
[79] Bouhamed H, Masmoudi A, Rebai A. Bayesian classifier structure-learning using several general algorithms. Procedia Computer Science 2015;46:476–82.
[80] Faris H, Aljarah I, Mirjalili S. Training feedforward neural networks using multi-verse optimizer for binary classification problems. Applied Intelligence 2016;45(2):322–32.
[81] Helwan A, Idoko JB, Abiyev RH. Machine learning techniques for classification of breast tissue. Procedia Computer Science 2017;120:402–10.
[82] Fang W, Gong X, Liu G, Wu Y, Fu Y. A balance adjusting approach of extended belief-rule-based system for imbalanced classification problem. IEEE Access 2020;8:41201–12.
[83] Zhang J, Chen L. Clustering-based undersampling with random over sampling examples and support vector machine for imbalanced classification of breast cancer diagnosis. Computer Assisted Surgery 2019;24(sup2):62–72.
[84] Unal Y, Polat K, Kocer HE. Pairwise FCM based feature weighting for improved classification of vertebral column disorders. Computers in Biology and Medicine 2014;46:61–70.
[85] Little M, McSharry P, Roberts S, Costello D, Moroz I. Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection. Nature Precedings 2007. 1–1.
[86] Xu J, Zhang M. Use of magnetic resonance imaging and artificial intelligence in studies of diagnosis of Parkinson’s disease. ACS Chemical Neuroscience 2019;10(6):2658–67.
[87] Moon S, Song H-J, Sharma VD, Lyons KE, Pahwa R, Akinwuntan AE, Devos H. Classification of Parkinson’s disease and essential tremor based on balance and gait characteristics from wearable motion sensors via machine learning techniques: a data-driven approach. Journal of NeuroEngineering and Rehabilitation 2020;17(1):1–8.
[88] W.H. Wolberg, O.L. Mangasarian, Multisurface method of pattern separation for medical diagnosis applied to breast cytology, Proceedings of the National Academy of Sciences 87(23) (1990) 9193–9196.
[89] Kazi M, Parshad R, Seenu V, Mathur S, Haresh K, et al. Fine-Needle Aspiration Cytology (FNAC) in breast cancer: A reappraisal based on retrospective review of 698 cases.World Journal of Surgery 2017;41(6):1528–33.
[90] Kulluk S, Ozbakir L, Baykasoglu A. Training neural networks with harmony search algorithms for classification problems. Engineering Applications of Artificial Intelligence 2012;25(1):11–9.
[91] Jaddi NS, Abdullah S. A cooperative-competitive master-slave global-best harmony search for ANN optimization and water-quality prediction. Applied Soft Computing 2017;51:209–24.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-b5514ca2-4d55-4452-b1df-f4a08569183c