Article title

Improving coronary heart disease prediction by outlier elimination

Publication languages
EN
Abstracts
EN
Heart disease is currently the leading cause of death worldwide. According to the World Health Organization, almost 18 million people die of heart diseases (cardiovascular diseases) every year. A system for early detection and prevention of heart disease is therefore needed. Detection of heart disease depends largely on large volumes of complex pathological and clinical data, so researchers and medical professionals are showing keen interest in its accurate prediction. Heart disease is a general term for a large number of medical conditions affecting the heart, one of which is coronary heart disease (CHD). Coronary heart disease is caused by the buildup of plaque on the artery walls. In this paper, several base and ensemble machine learning classifiers are applied to a heart disease dataset to predict coronary heart disease. The base classifiers are k-nearest neighbor, multilayer perceptron, multinomial naïve Bayes, logistic regression, decision tree, random forest and support vector machine. The ensemble classifiers are majority voting, weighted average, bagging and boosting. The dataset is obtained from the Framingham Heart Study, a long-term, ongoing cardiovascular study of people from the city of Framingham, Massachusetts, USA. Classifier performance is evaluated using accuracy, precision, recall and F1 score. According to our results, the highest accuracies were achieved by the logistic regression, random forest, majority voting, weighted average and bagging classifiers, and the best among these was the weighted average ensemble classifier.
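The abstract names majority voting and weighted average ensembles and four evaluation metrics without showing how they are computed. The following is a minimal sketch of those ideas, not the authors' implementation: the function names, the three hypothetical base classifiers, their weights, and all toy labels and probabilities are assumptions invented for illustration and are not taken from the Framingham dataset.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: for each sample, return the label most base classifiers chose.

    predictions is a list of per-classifier label lists of equal length.
    With an odd number of binary classifiers there are no ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def weighted_average_vote(probabilities, weights, threshold=0.5):
    """Soft voting: weighted mean of positive-class probabilities, then threshold."""
    total = sum(weights)
    labels = []
    for probs in zip(*probabilities):
        avg = sum(w * p for w, p in zip(weights, probs)) / total
        labels.append(1 if avg >= threshold else 0)
    return labels


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (assumed): true CHD labels and positive-class probabilities
# from three hypothetical base classifiers.
y_true = [1, 0, 1, 1, 0, 0]
probs = [
    [0.9, 0.2, 0.4, 0.7, 0.6, 0.1],  # classifier A
    [0.6, 0.4, 0.3, 0.8, 0.3, 0.2],  # classifier B
    [0.3, 0.7, 0.6, 0.4, 0.4, 0.6],  # classifier C
]
weights = [0.5, 0.3, 0.2]  # assumed weights, e.g. from validation accuracy

# Hard labels for majority voting: threshold each classifier at 0.5.
hard = [[1 if p >= 0.5 else 0 for p in clf] for clf in probs]

majority_pred = majority_vote(hard)
weighted_pred = weighted_average_vote(probs, weights)
majority_scores = binary_metrics(y_true, majority_pred)
weighted_scores = binary_metrics(y_true, weighted_pred)
```

Majority voting uses only each classifier's final label, while the weighted average lets a more trusted classifier's probability count for more; on a real dataset the two can disagree on borderline samples.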
Year
Pages
70–88
Physical description
Bibliography: 43 items, figures, tables
Authors
author
  • PG Department of Computer Sciences, University of Kashmir, Srinagar, India
  • PG Department of Computer Sciences, University of Kashmir, Srinagar, India
author
  • Directorate of IT & SS, University of Kashmir, Srinagar, India
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-47af51e3-75ec-4f0c-a03f-5310dd6118c8