Abstract
Supervised classification covers a number of data mining methods that learn from training data. These methods have been successfully applied to complex multi-criteria classification problems in many domains, including economics. In this paper we discuss the features of selected supervised classification methods based on decision trees and apply them to direct marketing campaign data from a Portuguese banking institution. We discuss and compare the following classification methods: decision trees, bagging, boosting, and random forests. The classification problem is defined in a scenario in which a bank's clients decide whether to activate their deposits. The results are used to evaluate the effectiveness of the classification rules. (original abstract)
Pages
36-48
Authors
author
- Cracow University of Technology
author
- Opole University
author
- Cracow University of Technology
References
- [1] Odrzygóźdź Z., Szczęsny W. (2011) Data Mining - how to select the tools for analysts in a large bank, Information Systems in Management XIII, WULS Press, Warszawa, Poland, 97-103.
- [2] Chapelle O. et al. (2006) Semi-Supervised Learning, The MIT Press, Cambridge, MA, USA.
- [3] Liu G., Yang F. (2012) The application of data mining in the classification of spam messages, Proceedings of CSIP'12, 1315-1317.
- [4] Kruczkowski M., Niewiadomska-Szynkiewicz E. (2014) Comparative study of supervised learning methods for malware analysis, JTIT, Vol. 4, 24-33.
- [5] Suchacka G., Sobków M. (2015) Detection of Internet robots using a Bayesian approach, Proceedings of CYBCONF'15, Gdynia, Poland, 365-370.
- [6] Soiraya B., Mingkhwan A., Haruechaiyasak C. (2008) E-commerce web site trust assessment based on text analysis, International Journal of Business and Information, Vol. 3, Issue 1, 86-114.
- [7] Shen D., Ruvini J.-D., Sarwar B. (2012) Large-scale item categorization for e-commerce, Proceedings of CIKM'12, Maui, HI, USA, 595-604.
- [8] Suchacka G., Skolimowska-Kulig M., Potempa A. (2015) A k-Nearest Neighbors method for classifying user sessions in e-commerce scenario, JTIT, Vol. 3, 64-69.
- [9] Suchacka G., Skolimowska-Kulig M., Potempa A. (2015) Classification of e-customer sessions based on Support Vector Machine, Proceedings of ECMS'15, Albena, Bulgaria, 594-600.
- [10] Hop W. (2013) Web-shop order prediction using machine learning, Master's Thesis, Erasmus University Rotterdam.
- [11] Ngai E.W.T., Xiu L., Chau D.C.K. (2009) Application of data mining techniques in customer relationship management: A literature review and classification, Expert Systems with Applications, Vol. 36, Issue 2, Part 2, 2592-2602.
- [12] Morgan J.N., Sonquist J.A. (1963) Problems in the analysis of survey data, and a proposal, Journal of the American Statistical Association, Vol. 58, Issue 302, 415-434.
- [13] Breiman L. et al. (1984) Classification and Regression Trees, Wadsworth, CA.
- [14] Nobibon F. et al. (2011) Optimization models for targeted offers in direct marketing: Exact and heuristic algorithms, European Journal of Operational Research, Vol. 210, Issue 3, 670-683.
- [15] Moro S. et al. (2011) Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology, Proceedings of the European Simulation and Modelling Conference - ESM'2011, Portugal, 117-121.
- [16] Koronacki J., Ćwik J. (2005) Statystyczne systemy uczące się, WNT, Warszawa, Poland.
- [17] Walesiak M., Gatnar E. (2009) Statystyczna analiza danych z wykorzystaniem programu R, PWN, Warszawa, Poland.
- [18] Rokach L., Maimon O. (2005) Decision trees, in: The Data Mining and Knowledge Discovery Handbook, Springer, 165-192.
- [19] Tukey J.W. (1977) Exploratory data analysis, Addison-Wesley, Reading.
- [20] Hansen L.K., Salamon P. (1990) Neural network ensembles, IEEE Transactions on Pattern Analysis & Machine Intelligence, Vol. 12, Issue 10, 993-1001.
- [21] Breiman L. (1996) Bagging predictors, Machine Learning, Vol. 24, Issue 2, 123-140.
- [22] STAT 897D: Applied Data Mining and Statistical Learning course (available online: https://onlinecourses.science.psu.edu/stat857/).
- [23] Freund Y., Schapire R.E. (1996) Experiments with a new boosting algorithm, in: ICML, Vol. 96, 148-156.
- [24] Babenko B. Note: A Derivation of Discrete AdaBoost (available online: http://vision.ucsd.edu/~bbabenko/data/boosting_note.pdf).
- [25] Breiman L. (2001) Random forests, Machine Learning, Vol. 45, Issue 1, 5-32.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171428847