A comparative study of corporate credit ratings prediction with machine learning

Doğan, Seyyide; Büyükkör, Yasin; Atan, Murat

doi:10.37190/ord220102

Abstract

Credit scores are critical for financial-sector investors and government officials, so it is important to develop reliable, transparent and appropriate tools for obtaining ratings. This study aims to predict company credit scores with machine learning and modern statistical methods, using both sectoral and aggregated data. The analyses cover 1881 companies operating in three different sectors that applied for loans from Turkey’s largest public bank. The experimental results are compared in terms of classification accuracy, sensitivity, specificity, precision and the Matthews correlation coefficient. When credit ratings are estimated on a sectoral basis, the classification rate changes considerably. The results show that logistic regression, support vector machines, random forest and XGBoost perform better than decision tree and k-nearest neighbour on all data sets.
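The five evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary (two-class) setting and uses hypothetical confusion-matrix counts; the record does not state how many rating classes the study uses or how the authors computed the measures, so the function name `binary_metrics` and all numbers below are illustrative only.

```python
import math


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute the five measures from the counts of a 2x2 confusion matrix.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    # Denominator of the Matthews correlation coefficient.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),     # positive predictive value
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, not taken from the study.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when class sizes are unbalanced.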

Keywords

credit rating, credit risk, machine learning

Publisher

Oficyna Wydawnicza Politechniki Wrocławskiej

Journal

Operations Research and Decisions

Year

2022

Volume

Vol. 32, No. 1

Pages

25–47

Physical description

Bibliography: 70 items, figures.

Authors

author

Doğan Seyyide

dogans@kmu.edu.tr

Department of Econometrics, Karamanoglu Mehmetbey University, Karaman, Turkey

author

Büyükkör Yasin

Department of Econometrics, Karamanoglu Mehmetbey University, Karaman, Turkey

author

Atan Murat

Department of Econometrics, Ankara Hacı Bayram Veli University, Ankara, Turkey

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-35fc8a03-d13d-49db-a3ff-5706b31f2d8e