Article title

A comparative study on performance of basic and ensemble classifiers with various datasets

Publication languages
EN
Abstracts
EN
Classification plays a critical role in machine learning (ML) systems for processing images, text, and high-dimensional data. The primary goal of classification is to predict class labels from training data. An optimal model for a particular classification problem is chosen based on its performance and execution time. This paper compares and analyzes the performance of basic and ensemble classifiers using 10-fold cross-validation, and discusses their essential concepts, advantages, and disadvantages. Five basic classifiers, namely Naïve Bayes (NB), Multi-layer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), together with the ensemble of all five classifiers and a few further combinations, are compared on five datasets from the University of California Irvine (UCI) ML Repository and a Diabetes Health Indicators dataset from the Kaggle repository. Classifier performance is evaluated with Accuracy, Recall, Precision, Area Under Curve (AUC), and F-Score. Experimental results show that SVM performs best on two of the six datasets (Diabetes Health Indicators and Waveform), RF performs best on the Arrhythmia, Sonar, and Tic-tac-toe datasets, and the best ensemble combination is DT+SVM+RF on the Ionosphere dataset, with respective accuracies of 72.58%, 90.38%, 81.63%, 73.59%, 94.78%, and 94.01%. The proposed ensemble combinations outperformed the conventional models on a few datasets.
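The evaluation protocol named in the abstract (10-fold cross-validation, an ensemble built from several base classifiers, and metrics such as Accuracy, Precision, Recall, and F-Score) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The abstract does not state how ensemble members are combined, so hard majority voting is an assumption here, and the function names `k_fold_indices`, `majority_vote`, and `binary_metrics` are hypothetical. AUC is omitted because it needs class scores rather than labels.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_vote(per_classifier_predictions):
    """Hard voting (assumed combination rule): pick the most frequent label
    per sample. Ties go to the label that appears first among the votes."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*per_classifier_predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-score for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}
```

In a full experiment, each fold from `k_fold_indices` would serve once as the test set while the remaining folds train the base classifiers; their test-fold predictions would be passed to `majority_vote`, and `binary_metrics` would be averaged over the ten folds.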
Year
Pages
107–132
Physical description
Bibliography: 41 items, figures, tables.
Authors
  • VIT-AP University, Amaravati-522237, Andhra Pradesh, India
  • VIT-AP University, Amaravati-522237, Andhra Pradesh, India
References
  • [1] Alshayeji, M. H., Ellethy, H., Abed, S., & Gupta, R. (2022). Computer-aided detection of breast cancer on the Wisconsin dataset: An artificial neural networks approach. Biomedical Signal Processing and Control, 71(Part A), 103141. https://doi.org/10.1016/j.bspc.2021.103141
  • [2] Alshdaifat, E., Al-hassan, M., & Aloqaily, A. (2021). Effective heterogeneous ensemble classification: An alternative approach for selecting base classifiers. ICT Express, 7(3), 342–349. https://doi.org/10.1016/j.icte.2020.11.005
  • [3] Baumann, P., Hochbaum, D. S., & Yang, Y. T. (2019). A comparative study of the leading machine learning techniques and two new optimization algorithms. European Journal of Operational Research, 272(3), 1041–1057. https://doi.org/10.1016/j.ejor.2018.07.009
  • [4] bin Basir, M. A., & binti Ahmad, F. (2017). New Feature Selection Model Based Ensemble Rule Classifiers Method for Dataset Classification. International Journal of Artificial Intelligence & Applications, 8(2), 37–43. https://doi.org/10.5121/ijaia.2017.8204
  • [5] Chandrika, Divya, C., Gowramma, G. S., & Varun, C. R. (2018). A comparative analysis on evaluation of classification algorithms based on ionospheric data. International Journal of Computer Sciences and Engineering, 6(5), 636–640. https://doi.org/10.26438/ijcse/v6i5.636640
  • [6] Consuegra-Ayala, J. P., Gutiérrez, Y., Almeida-Cruz, Y., & Palomar, M. (2022). Intelligent ensembling of auto-ML system outputs for solving classification problems. Information Sciences, 609, 766–780. https://doi.org/10.1016/j.ins.2022.07.061
  • [7] Ecemis, C., Acu, N., & Sari, Z. (2022). Classification of Imbalanced Cardiac Arrhythmia Data. European Journal of Science and Technology, 34, 546–552. https://doi.org/10.31590/ejosat.1083423
  • [8] Fang, X., Klawohn, J., De Sabatino, A., Kundnani, H., Ryan, J., Yu, W., & Hajcak, G. (2022). Accurate classification of depression through optimized machine learning models on high-dimensional noisy data. Biomedical Signal Processing and Control, 71(Part B), 103237. https://doi.org/10.1016/j.bspc.2021.103237
  • [9] Farhat, N. H. (1992). Photonic neural networks and learning machines: The role of electron-trapping materials. IEEE Expert-Intelligent Systems and Their Applications, 7(5), 63–72. https://doi.org/10.1109/64.163674
  • [10] Fath, A. H., Madanifar, F., & Abbasi, M. (2020). Implementation of multilayer perceptron (MLP) and radial basis function (RBF) neural networks to predict solution gas-oil ratio of crude oil systems. Petroleum, 6(1), 80–91. https://doi.org/10.1016/j.petlm.2018.12.002
  • [11] Ganie, S. M., & Malik, M. B. (2022). An Ensemble Machine Learning Approach for Predicting Type-II Diabetes Mellitus based on Lifestyle Indicators. Healthcare Analytics, 2, 100092. https://doi.org/10.1016/j.health.2022.100092
  • [12] Gupta, V., Srinivasan, S., & Kudli, S. S. (2014). Prediction and Classification of Cardiac Arrhythmia. https://cs229.stanford.edu/proj2014/Vasu%20Gupta,%20Sharan%20Srinivasan,%20Sneha%20Kudli,%20Prediction%20and%20Classification%20of%20Cardiac%20Arrhythmia.pdf
  • [13] Hongle, D., Yan, Z., Lin, Z., Yeh-Cheng, C., Gang, K., & Chen, Y.-C. (2022). Selective Ensemble Learning Algorithm for Imbalanced Dataset. Preprint. https://doi.org/10.21203/rs.3.rs-721493/v1
  • [14] Jia, J., & Qiu, W. (2020). Research on an ensemble classification algorithm based on differential privacy. IEEE Access, 8, 93499–93513. https://doi.org/10.1109/ACCESS.2020.2995058
  • [15] Kilincer, I. F., Ertam, F., & Sengur, A. (2021). Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Computer Networks, 188, 107840. https://doi.org/10.1016/j.comnet.2021.107840
  • [16] Kushwah, J. S., Kumar, A., Patel, S., Soni, R., Gawande, A., & Gupta, S. (2021). Comparative study of regressor and classifier with decision tree using modern tools. Materials Today: Proceedings, 56(6), 3571–3576. https://doi.org/10.1016/j.matpr.2021.11.635
  • [17] Ma, T. M., Yamamori, K., & Thida, A. (2020). A comparative approach to naïve bayes classifier and support vector machine for email spam classification. 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020 (pp. 324–326). IEEE. https://doi.org/10.1109/GCCE50665.2020.9291921
  • [18] Maniruzzaman, M., Jahanur Rahman, M., Ahammed, B., Abedin, M. M., Suri, H. S., Biswas, M., El-Baz, A., Bangeas, P., Tsoulfas, G., & Suri, J. S. (2019). Statistical characterization and classification of colon microarray gene expression data using multiple machine learning paradigms. Computer Methods and Programs in Biomedicine, 176, 173–193. https://doi.org/10.1016/j.cmpb.2019.04.008
  • [19] Mohamed, A. R. (2017). Comparative Study of Four Supervised Machine Learning Techniques for Classification. International Journal of Applied Science and Technology, 7(2), 5–18.
  • [20] Nazari, E., Aghemiri, M., Avan, A., Mehrabian, A., & Tabesh, H. (2021). Machine learning approaches for classification of colorectal cancer with and without feature selection method on microarray data. Gene Reports, 25, 101419. https://doi.org/10.1016/j.genrep.2021.101419
  • [21] Ngo, G., Beard, R., & Chandra, R. (2022). Evolutionary bagging for ensemble learning. Neurocomputing, 510, 1–14. https://doi.org/10.1016/j.neucom.2022.08.055
  • [23] Patel, H. H., & Prajapati, P. (2018). Study and analysis of decision tree based classification algorithms. International Journal of Computer Sciences and Engineering, 6(10), 74–78. https://doi.org/10.26438/ijcse/v6i10.7478
  • [24] Patel, N., & Upadhyay, S. (2012). Study of various decision tree pruning methods with their empirical comparison in WEKA. International Journal of Computer Applications, 60(12), 20–25. https://doi.org/10.5120/9744-4304
  • [25] Priyanka, & Kumar, D. (2020). Decision tree classifier: A detailed survey. International Journal of Information and Decision Sciences, 12(3), 246–269. https://doi.org/10.1504/ijids.2020.108141
  • [26] Pugliese, R., Regondi, S., & Marini, R. (2021). Machine learning-based approach: global trends, research directions, and regulatory standpoints. Data Science and Management, 4, 19–29. https://doi.org/10.1016/j.dsm.2021.12.002
  • [27] Punyapornwithaya, V., Klaharn, K., Arjkumpa, O., & Sansamur, C. (2022). Exploring the predictive capability of machine learning models in identifying foot and mouth disease outbreak occurrences in cattle farms in an endemic setting of Thailand. Preventive Veterinary Medicine, 207, 105706. https://doi.org/10.1016/J.PREVETMED.2022.105706
  • [28] Qian, X., Zhou, Z., Hu, J., Zhu, J., Huang, H., & Dai, Y. (2021). A comparative study of kernel-based vector machines with probabilistic outputs for medical diagnosis. Biocybernetics and Biomedical Engineering, 41(4), 1486–1504. https://doi.org/10.1016/j.bbe.2021.09.003
  • [29] Revathi, A., Kaladevi, R., Ramana, K., Jhaveri, R. H., Kumar, M. R., & Kumar, M. S. P. (2022). Early detection of cognitive decline using machine learning algorithm and cognitive ability test. Security and Communication Networks, 2022, 4190023. https://doi.org/10.1155/2022/4190023
  • [30] Rezvani, S., & Wang, X. (2022). Intuitionistic fuzzy twin support vector machines for imbalanced data. Neurocomputing, 507, 16–25. https://doi.org/10.1016/j.neucom.2022.07.083
  • [31] Sevinç, E. (2022). An empowered AdaBoost algorithm implementation: A COVID-19 dataset study. Computers and Industrial Engineering, 165, 107912. https://doi.org/10.1016/j.cie.2021.107912
  • [32] Shafi, A. S. M., Molla, M. M. I., Jui, J. J., & Rahman, M. M. (2020). Detection of colon cancer based on microarray dataset using machine learning as a feature selection and classification techniques. SN Applied Sciences, 2(7), 1–8. https://doi.org/10.1007/s42452-020-3051-2
  • [33] Shi, Q., Suganthan, P. N., & Katuwal, R. (2022). Weighting and pruning based ensemble deep random vector functional link network for tabular data classification. arXiv:2201.05809. http://arxiv.org/abs/2201.05809
  • [34] Swathy, M., & Saruladha, K. (2021). A comparative study of classification and prediction of cardio-vascular diseases (CVD) using machine learning and deep learning techniques. ICT Express, 8(1), 109–116. https://doi.org/10.1016/j.icte.2021.08.021
  • [35] Tewari, S., & Dwivedi, U. D. (2020). A comparative study of heterogeneous ensemble methods for the identification of geological lithofacies. Journal of Petroleum Exploration and Production Technology, 10(5), 1849–1868. https://doi.org/10.1007/s13202-020-00839-y
  • [36] Thirunavukkarasu, K., Singh, A. S., Rai, P., & Gupta, S. (2018). Classification of IRIS dataset using classification based KNN Algorithm in supervised learning. 2018 4th International Conference on Computing Communication and Automation, ICCCA 2018 (pp. 4–7). IEEE. https://doi.org/10.1109/CCAA.2018.8777643
  • [37] Uddin, S., Khan, A., Hossain, M. E., & Moni, M. A. (2019). Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making, 19(1), 1–16. https://doi.org/10.1186/s12911-019-1004-8
  • [38] Wade, B. S. C., Joshi, S. H., Gutman, B. A., & Thompson, P. M. (2017). Machine learning on high dimensional shape data from subcortical brain surfaces: A comparison of feature selection and classification methods. Pattern Recognition, 63, 731–739. https://doi.org/10.1016/j.patcog.2016.09.034
  • [39] Wei, X., Zou, N., Zeng, L., & Pei, Z. (2022). PolyJet 3D printing: Predicting color by multilayer perceptron neural network. Annals of 3D Printed Medicine, 5, 100049. https://doi.org/10.1016/j.stlm.2022.100049
  • [40] Yakut, Ö., & Bolat, E. D. (2022). A high-performance arrhythmic heartbeat classification using ensemble learning method and PSD based feature extraction approach. Biocybernetics and Biomedical Engineering, 42(2), 667–680. https://doi.org/10.1016/j.bbe.2022.05.004
  • [41] Yogita, B., Akanksha, M., Shefali, A., Tanya, M., & Gresha, B. (2020). Classification of Cardiac Arrhythmia Using Kernelized SVM. 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184) (pp. 922–926). IEEE. https://doi.org/10.1109/ICOEI48184.2020.9143000
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-197a87b1-57c4-464b-9440-a8a494879f89