Local Levenberg-Marquardt algorithm for learning feedforwad neural networks

Bilski, Jarosław; Kowalczyk, Bartosz; Marchlewska, Alina; Zurada, Jacek M.

doi:10.2478/jaiscr-2020-0020

Artykuł - szczegóły

Tytuł artykułu

Local Levenberg-Marquardt algorithm for learning feedforwad neural networks

Autorzy

Bilski Jarosław , Kowalczyk Bartosz , Marchlewska Alina , Zurada Jacek M.

Treść / Zawartość

Pełne teksty:

lski_Local Levenberg-Marquardt Algorithm.pdf

Pobierz

Identyfikatory

DOI

10.2478/jaiscr-2020-0020

Warianty tytułu

Języki publikacji

Abstrakty

This paper presents a local modification of the Levenberg-Marquardt algorithm (LM). First, the mathematical basics of the classic LM method are shown. The classic LM algorithm is very efficient for learning small neural networks. For bigger neural networks, whose computational complexity grows significantly, it makes this method practically inefficient. In order to overcome this limitation, local modification of the LM is introduced in this paper. The main goal of this paper is to develop a more complexity efficient modification of the LM method by using a local computation. The introduced modification has been tested on the following benchmarks: the function approximation and classification problems. The obtained results have been compared to the classic LM method performance. The paper shows that the local modification of the LM method significantly improves the algorithm’s performance for bigger networks. Several possible proposals for future works are suggested.

Słowa kluczowe

feed-forward neural network neural network learning algorithm optimization problem Levenberg-Marquardt algorithm QR decomposition Givens rotation

Wydawca

University of Social Sciences

Czasopismo

Journal of Artificial Intelligence and Soft Computing Research

Rocznik

2020

Tom

Vol. 10, No. 4

Strony

299--316

Opis fizyczny

Bibliogr. 46 poz., rys.

Twórcy

autor

Bilski Jarosław

jaroslaw.bilski@pcz.pl

Department of Computer Engineering, Czestochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

autor

Kowalczyk Bartosz

Department of Computer Engineering, Czestochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

autor

Marchlewska Alina

University of Social Science, Łodź, Poland
Clark University Worcester, MA, USA

autor

Zurada Jacek M.

Department Electrical and Computer Engineering, University of Louisville, Louisville, KY 40292, USA

Bibliografia

[1] Ryotaro Kamimura. Supposed maximum mutual information for improving generalization and interpretation of multi-layered neural networks. Journal of Artificial Intelligence and Soft Computing Research, 9(2):123–147, 2019.
[2] M. Abbas M. Javaid, Jia-Bao Liu, W. C. Teh,and Jinde Cao. Topological properties of fourlayered neural networks. Journal of Artificial Intelligence and Soft Computing Research, 9(2):111–122, 2019.
[3] Oded Koren, Carina Antonia Hallin, Nir Perel,and Dror Bendet. Decision-making enhancementin a big data environment: Application of the kmeans algorithm to mixed data. Journal of Artificial Intelligence and Soft Computing Research,9(4):293–302, 2019.
[4] S. Albawi, T. A. Mohammed, and S. Al-Zawi. Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET), pages 1–6, 2017.
[5] A. M. Taqi, A. Awad, F. Al-Azzo, and M. Milanova. The impact of multi-optimizers and data augmentation on tensorflow convolutional neural network performance. In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pages 140–145, April 2018.
[6] Jiuxiang Gu, Zhenhua Wang, Jason Kuen,Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang, Jianfei Cai, and Tsuhan Chen. Recent advances in convolutional neural networks. Pattern Recognition, 77:354 –377, 2018.
[7] Robert K. Nowicki and Janusz T. Starczewski. A new method for classification of imprecise data using fuzzy rough fuzzification. Inf. Sci., 414:33–52, 2017.
[8] Janusz T. Starczewski, Katarzyna Nieszporek, Michal Wrobel, and Konrad Grzanek. A fuzzy ´ SOM for understanding incomplete 3d faces. In ICAISC (2), volume 10842 of Lecture Notes in Computer Science, pages 73–80. Springer, 2018.
[9] Michal Wrobel, Katarzyna Nieszporek, Janusz T. ´ Starczewski, and Andrzej Cader. A fuzzy measure for recognition of handwritten letter strokes. In ICAISC (1), volume 10841 of Lecture Notes in Computer Science, pages 761–770. Springer, 2018.
[10] Sou Nobukawa, Haruhiko Nishimura, and Teruya Yamanishi. Pattern classification by spiking neural networks combining self-organized and rewardrelated spike-timing-dependent plasticity. Journal of Artificial Intelligence and Soft Computing Research, 9(4):283–291, 2019.
[11] Miguel Costa, Daniel Oliveira, Sandro Pinto, and Adriano Tavares. Detecting driver’s fatigue, distraction and activity using a non-intrusive aibased monitoring system. Journal of Artificial Intelligence and Soft Computing Research, 9(4):247–266, 2019.
[12] Xin Wang, Yi Guo, Yuanyuan Wang, and Jinhua Yu. Automatic breast tumor detection in abvs images based on convolutional neural network and superpixel patterns. Neural Computing and Applications, 31(4):1069–1081, 2019.
[13] Muhammad Irfan Sharif, Jian Ping Li, Muhammad Attique Khan, and Muhammad Asim Saleem. Active deep neural network features selection for segmentation and recognition of brain tumors using mri images. Pattern Recognition Letters, 129:181 – 189, 2020.
[14] P. Mohamed Shakeel, T. E. E. Tobely, H. Al-Feel, G. Manogaran, and S. Baskar. Neural network based brain tumor detection using wireless infrared imaging sensor. IEEE Access, 7:5577–5588, 2019.
[15] Alexander Rakhlin, Alexey Shvets, Vladimir Iglovikov, and Alexandr A. Kalinin. Deep convolutional neural networks for breast cancer histology image analysis. In Aurelio Campilho, Fakhri ´Karray, and Bart ter Haar Romeny, editors, Image Analysis and Recognition, pages 737–744, Cham,2018. Springer International Publishing.
[16] Xin Cai, Yufeng Qian, Qingshan Bai, and Wei Liu. Exploration on the financing risks of enterprise supply chain using back propagation neural network. Journal of Computational and Applied Mathematics, 367:112457, 2020.
[17] Amin Hedayati Moghaddam, Moein Hedayati Moghaddam, and Morteza Esfandyari. Stock market index prediction using artificial neural network. Journal of Economics, Finance and Administrative Science, 21(41):89 – 93, 2016.
[18] Songqiao Qi, Kaijun Jin, Baisong Li, and Yufeng Qian. The exploration of internet finance by using neural network. Journal of Computational and Applied Mathematics, 369:112630, 2020.
[19] Gustavo Botelho de Souza, Daniel Felipe da Silva Santos, Rafael Gonc¸alves Pires, Aparecido Nilceu Marananil, and Jo ao Paulo Papa. Deep features extraction for robust fingerprint spoofing attack detection. Journal of Artificial Intelligence and Soft Computing Research, 9(1):41–49, 2019.
[20] Apeksha Shewalkar, Deepika Nyavanandi, and Simone A. Ludwig. Performance evaluation of deep neural networks applied to speech recognition: Rnn, lstm and gru. Journal of Artificial Intelligence and Soft Computing Research, 9(4):235–245, 2019.
[21] A. V. Kurbesov, D. V. Ryabkin, I. I. Miroshnichenko, N. A. Aruchidi, and K. Kh. Kalugyan. Automated voice recognition of emotions through the use of neural networks. In Rafik A. Aliev, Janusz Kacprzyk, Witold Pedrycz, Mo Jamshidi, Mustafa B. Babanli, and Fahreddin M. Sadikoglu, editors, 10th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions - ICSCCW-2019, pages 675–682, Cham, 2020. Springer International Publishing.
[22] X. Changzhen, W. Cong, M. Weixin, and S. Yanmei. A traffic sign detection algorithm based on deep convolutional neural network. In 2016 IEEE International Conference on Signal and Image Processing (ICSIP), pages 676–679, Aug 2016.
[23] Katsuba Yurii and Grigorieva Liudmila. Application of artificial neural networks in vehicles design self-diagnostic systems for safety reasons. Transportation Research Procedia, 20:283 – 287, 2017. 12th International Conference ”Organization and Traffic Safety Management in large cities SPbOTSIC-2016, 28-30 September 2016, St. Petersburg, Russia.
[24] Max W. Y. Lam. One-match-ahead forecasting in two-team sports with stacked bayesian regressions. Journal of Artificial Intelligence and Soft Computing Research, 8(3):159–171, 2018.
[25] N. P. Patel and A. Kale. Optimize approach to voice recognition using iot. In 2018 International Conference On Advances in Communication and Computing Technology (ICACCT), pages 251–256, 2018.
[26] Yi Mou and Kun Xu. The media inequality: Comparing the initial human-human and human-ai social interactions. Computers in Human Behavior, 72:432 – 440, 2017.
[27] Werbos J. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974.
[28] Scott E. Fahlman. An empirical study of learning speed in back-propagation networks. Technical report, 1988.
[29] M. Riedmiller and H. Braun. A direct adaptive method for faster backpropagation learning: the rprop algorithm. In IEEE International Conference on Neural Networks, pages 586–591 vol.1, March 1993.
[30] Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on International Conference on Machine Learning -Volume 28, ICML’13, pages III–1139–III–1147.JMLR.org, 2013.
[31] M. T. Hagan and M.B. Menhaj. Training feedforward networks with the marquardt algorithm. IEEE Transactions on Neuralnetworks, 5:989–993, 1994.
[32] N. Ampazis and S. J. Perantonis. Two highly efficient second-order algorithms for training feedforward networks. IEEE Transactions on Neural Networks, 13(5):1064–1074, 2002.
[33] J. S. Smith, B. Wu, and B. M. Wilamowski. Neural network training with Levenberg–Marquardt and adaptable weight compression. IEEE Transactions on Neural Networks and Learning Systems, 30(2):580–587, 2019.
[34] Miao Cui, Kai Yang, Xiao liang Xu, Sheng dong Wang, and Xiao wei Gao. A modified Levenberg–Marquardt algorithm for simultaneous estimation of multi-parameters of boundary heat flux by solving transient nonlinear inverse heat conduction problems. International Journal of Heat and Mass Transfer, 97:908 – 916, 2016.
[35] Jiyang Dong, Ke Lu, Jian Xue, Shuangfeng Dai, Rui Zhai, and Weiguo Pan. Accelerated nonrigid image registration using improved Levenberg–Marquardt method. Information Sciences, 423:66 – 79, 2018.
[36] Jarosław Bilski, Bartosz Kowalczyk, and Jacek M. Zurada. Application of the givens rotations in the ˙ neural network learning algorithm. In Artificial Intelligence and Soft Computing, volume 9602 of Lecture Notes in Artificial Intelligence, pages 46–56. Springer-Verlag Berlin Heidelberg, 2016.
[37] Jacek Smolag, Jarosław Bilski, and Leszek Rutkowski. Systolic array for neural networks. In IV KSNiIZ, pages 487–497, 1999.
[38] Jacek Smolag and Jarosław Bilski. A systolic array for fast learning of neural networks. In V NNSC, pages 754–758, 2000.
[39] D. Rutkowska, R.K. Nowicki, and Y. Hayashi. Parallel processing by implication-based neuro–fuzzy systems. Lecture Notes in Computer Science, 2328:599–607, 2002.
[40] Jarosław Bilski and Jacek Smol. Parallel realisation of the recurrent RTRN neural network learning. In Artificial Intelligence and Soft Computing, volume 5097 of Lecture Notes in Computer Science, pages 11–16. Springer-Verlag Berlin Heidelberg, 2008.
[41] Jarosław Bilski and Jacek Smolag. Parallel architectures for learning the RTRN and Elman dynamic neural network. IEEE Transactions on Parallel and Distributed Systems, 26(9):2561 – 2570, 2015.
[42] Jarosław Bilski, Jacek Smolag, and Jacek M. Zurada. Parallel approach to the Levenberg- ˙Marquardt learning algorithm for feedforward neural networks. In Artificial Intelligence and Soft Computing, volume 9119 of Lecture Notes in Computer Science, pages 3–14. Springer-Verlag Berlin Heidelberg, 2015.
[43] J. Bilski and B.M. Wilamowski. ParallelLevenberg-Marquardt algorithm without error backpropagation. Artificial Intelligence and Soft Computing, Springer-Verlag Berlin Heidelberg, LNAI 10245:25–39, 2017.
[44] Ewaryst Rafajłowicz and Wojciech Rafajłowicz. Iterative learning in optimal control of linear dynamic processes. International Journal of Control, 91(7):1522–1540, 2018.
[45] Ewaryst Rafajłowicz and Wojciech Rafajłowicz. Iterative learning in repetitive optimal control of linear dynamic processes. LNCS, 9692:705–717, 06 2016.
[46] Piotr Jurewicz, Wojciech Rafajłowicz, Jacek Reiner, and Ewaryst Rafajłowicz. Simulations for tuning a laser power control system of the cladding process. In Khalid Saeed and Władysław Homenda, editors, Computer Information Systems and Industrial Management, pages 218–229, Cham, 2016. Springer International Publishing.

Uwagi

Opracowanie rekordu ze środków MNiSW, umowa Nr 461252 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2020).

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-fa51989f-052c-4998-867c-a2727a11fd80