Friedman and Wilcoxon Evaluations Comparing SVM, Bagging, Boosting, K-NN and Decision Tree Classifiers

Biju, V. G.; Prashanth, CM

doi:10.1515/jacsm-2017-0002

Artykuł - szczegóły

Tytuł artykułu

Friedman and Wilcoxon Evaluations Comparing SVM, Bagging, Boosting, K-NN and Decision Tree Classifiers

Autorzy

Biju V. G. , Prashanth CM

Treść / Zawartość

Pełne teksty:

Pobierz

Identyfikatory

DOI

10.1515/jacsm-2017-0002

Warianty tytułu

Języki publikacji

Abstrakty

This paper describes a number of experiments to compare and validate the performance of machine learning classifiers. Creating machine learning models for data with wide varieties has huge applications in predictive modelling across multiple domain of science. This work reviews state of the art techniques in machine learning classifiers methods with several extent of magnitude in statistics and key findings that will be helpful in establishing best methodological practices for class predictions. Comprehensive comparative review analysis with statistical validations for various machine learning algorithm for SVM, Bagging, Boosting, Decision Trees and Nearest Neighborhood algorithm on multiple data sets is carried out. Focus on the statistical analysis of the results using Friedman-Test and Wilcoxon Test as well as other interpretative metrics like classification rate, ROC, F-measure are evaluated to benchmark results.

Słowa kluczowe

bagging boosting SVM KNN decision tree

Wydawca

Społeczna Akademia Nauk w Łodzi

Czasopismo

Journal of Applied Computer Science Methods

Rocznik

2017

Tom

Vol. 9 No. 1

Strony

23--47

Opis fizyczny

Bibliogr. 36 poz., rys., tab.

Twórcy

autor

Biju V. G.

vinai.george@christuniversity.in

Department of Computer Science and Engineering Christ University Faculty of Engineering, India

autor

Prashanth CM

Department of Computer Science and Engineering Sapthgiri College of Engineering, India

Bibliografia

1. Labatut, Vincent, and Hocine Cherifi “Accuracy measures for the comparison of classifiers”. arXiv preprint arXiv, 1207.3790, 2012.
2. Duman, Ekrem, Yeliz Ekinci, and Aydin Tanriverdi “Comparing alternative classifiers for database marketing: The case of imbalanced datasets”. Expert Systems with Applications, 39.1, pp.48-53, 2012.
3. Aydemir, Onder, and Temel Kayikcioglu “Comparing common machine learning classifiers in low-dimensional feature vectors for brain computer interface applications”. International Journal of Innovative Computing, Information and Control 9.3, pp.1145-1157, 2013.
4. Majnik, Matjaz, and Zoran Bosnic “ROC analysis of classifiers in machine learning: A survey”. Intelligent data analysis, 17.3, pp.531-558, 2013.
5. Kim, Yoosin, Do Young Kwon, and Seung Ryul Jeong “Comparing machine learning classifiers for movie WOM opinion mining”. KSII Transactions on Internet and Information Systems 9.8, pp.3178-3190, 2015.
6. Kotfila, Christopher, and Ozlem Uzuner “A systematic comparison of feature space effects on disease classifier performance for phenotype identification of five diseases”. Journal of biomedical informatics 58, S92-S102, 2015.
7. Demsar, Janez “Statistical comparisons of classifiers over multiple data sets”. Journal of Machine learning research 7.Jan, pp.1-30, 2006.
8. Mollazade, Kaveh, Mahmoud Omid, and Arman Arefi “Comparing data mining classifiers for grading raisins based on visual features”. Computers and electronics in agriculture 84, pp.124-131, 2012.
9. Dalton, Anthony, and Gearoid OLaighin “Comparing supervised learning techniques on the task of physical activity recognition”. IEEE journal of biomedical and health informatics 17.1, pp.46-52, 2013.
10. Bekhuis, Tanja, and Dina Demner-Fushman “Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. Artificial intelligence in medicine 55.3, pp.197-207, 2012.
11. Deufemia, Vincenzo, et al. “Comparing classifiers for web user intent understanding”. Empowering Organizations. Springer International Publishing, pp.147-159, 2016.
12. Taghizadeh-Mehrjardi, R., et al. “Comparing data mining classifiers to predict spatial distribution of USDA-family soil groups in Baneh region”, Iran. Geoderma 253: 67-77, 2015.
13. Orru, Graziella, et al “Using support vector machine to identify imaging biomarkers of neurological and psychiatric disease: a critical review”. Neuroscience and Biobehavioral Reviews 36.4, pp.1140-1152, .2012.
14. Qi, Zhiquan, Yingjie Tian, and Yong Shi “Robust twin support vector machine for pattern classification”. Pattern Recognition 46.1, pp.305-316, 2013.
15. Geng, Yishuang, et al. “Enlighten wearable physiological monitoring systems: On-body rf characteristics based human motion classification using a support vector machine”. IEEE transactions on mobile computing 15.3, pp.656-671, 2016.
16. Tehrany, Mahyat Shafapour, et al. “Flood susceptibility assessment using GISbased support vector machine model with different kernel types”. Catena 125, pp.91-101, 2015.
17. Azar, Ahmad Taher, and Shereen M. El-Metwally. “Decision tree classifiers for automated medical diagnosis”. Neural Computing and Applications 23.7-8, pp.2387-2403, 2013.
18. Lajnef, Tarek, et al. “Learning machines and sleeping brains: automatic sleep stage classification using decision-tree multi-class support vector machines”. Journal of neuroscience methods 250, pp.94-105, 2015.
19. Wang, Ran, et al. “Segment based decision tree induction with continuous valued attributes”. IEEE transactions on cybernetics 45.7, pp.1262-1275, 2015.
20. Oliver, Jonathan J., and David J. “Hand On pruning and averaging decision trees. Machine Learning”: Proceedings of the Twelfth International Conference, Morgan Kaufmann, pp.430-437, 1995.
21. Parvin, Hamid, Miresmaeil MirnabiBaboli, and Hamid Alinejad-Rokny. “Proposing a classifier ensemble framework based on classifier selection and decision tree”. Engineering Applications of Artificial Intelligence 37, pp.34-42, 2015.
22. Simidjievski, Nikola, Ljupco Todorovski, and Saso Dzeroski “Predicting longterm population dynamics with bagging and boosting of process-based models”. Expert Systems with Applications 42.22, pp.8484-8496, 2015.
23. Wang, Guan-Wei, Chun-Xia Zhang, and Gao Guo “Investigating the Effect of Randomly Selected Feature Subsets on Bagging and Boosting”. Communications in Statistics-Simulation and Computation 44.3, pp.636-646, 2015.
24. Abdollahi-Arpanahi, R., et al. “Assessment of bagging GBLUP for wholegenome prediction of broiler chicken traits." Journal of Animal Breeding and Genetics 132.3, pp.218-228, 2015.
25. Hegde, Chiranth, Scott Wallace, and Ken Gray “Using Trees, Bagging, and Random Forests to Predict Rate of Penetration During Drilling”. SPE Middle East Intelligent Oil and Gas Conference and Exhibition. Society of Petroleum Engineers, doi:10.2118/176792-MS, 2015.
26. Korytkowski, Marcin, Leszek Rutkowski, and Rafal Scherer “Fast image classification by boosting fuzzy classifiers”. Information Sciences 327, pp.175— 182, 2016.
27. Appel, Ron, Thomas J. Fuchs, Piotr Dollar, and Pietro Perona “Quickly Boosting Decision Trees-Pruning Underachieving Features Early”. In ICML (3), pp.594-602, 2013.
28. Kim, Tae-Kyun, and Roberto Cipolla “Multiple classifier boosting and treestructured classifiers”. Machine Learning for Computer Vision. Springer Berlin Heidelberg, pp.163-196, 2013.
29. Ye, Jerry, Jyh-Herng Chow, Jiang Chen, and Zhaohui Zheng “Stochastic gradient boosted distributed decision trees”. In Proceedings of the 18th ACM conference on Information and knowledge management, ACM, pp. 2061-2064, 2009.
30. Nowak, Bartosz A., et al. “Multi-class nearest neighbour classifier for incomplete data handling”. International Conference on Artificial Intelligence and Soft Computing. Springer International Publishing, 2015.
31. Osth, John, William AV Clark, and Bo Malmberg “Measuring the Scale of Segregation Using k-Nearest Neighbor Aggregates”. Geographical Analysis 47.1, pp.34-49, 2015.
32. Chavez, Edgar, et al. “Near neighbor searching with K nearest references”. Information Systems 51, pp.43-61, 2015.
33. Bhulai, Sandjai “Nearest neighbour algorithms for forecasting call arrivals in call centers”. Intelligent Decision Technologies. Springer International Publishing, pp. 77-87, 2015.
34. Blaszczyn'ski, Jerzy, and Jerzy Stefanowski “Neighbourhood sampling in bagging for imbalanced data”. Neurocomputing, 150, pp.529-542, 2015.
35. Kamley S, Jaloree S, Thakur RS. “Performance Forecasting of Share Market using Machine Learning Techniques: A Review”. International Journal of Electrical and Computer Engineering. 6(6):3196, 2016.
36. Vidyullatha P, Rao DR. Machine Learning Techniques on Multidimensional Curve Fitting Data Based on R-Square and Chi-Square Methods. International Journal of Electrical and Computer Engineering. 1;6(3):974, 2016.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-a5488307-f42d-43ee-a091-27045f1d8f2c