Fast computational approach to the Levenberg-Marquardt algorithm for training feedforward neural networks

Bilski, Jarosław; Smoląg, Jacek; Kowalczyk, Bartosz; Grzanek, Konrad; Izonin, Ivan

doi:10.2478/jaiscr-2023-0006

Abstract

This paper presents a parallel approach to the Levenberg-Marquardt (LM) algorithm. Training neural networks with the LM algorithm involves significant computational complexity and therefore long computation times. As a result, the algorithm becomes impractical when the network has a large number of weights. The proposed solution uses vector instructions to reduce the computation time of the LM learning algorithm. The new approach was tested on several classification and function approximation problems and compared with a classical computational method. The article presents the idea of parallel neural network computations in detail and reports the speedup obtained for the different problems.

Keywords

feed-forward neural network; neural network learning algorithm; Levenberg-Marquardt algorithm; QR decomposition; Givens rotation
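
For orientation, the following is a minimal sketch of a single Levenberg-Marquardt weight update solved through a QR decomposition built from Givens rotations, the two techniques named in the keywords. It is not the authors' implementation: the function names `givens_qr_solve` and `lm_step`, the damping parameter `mu`, and the small Jacobian are illustrative assumptions. The sketch uses the standard identity that solving (JᵀJ + μI)·Δw = -Jᵀe is equivalent to a least-squares problem on the Jacobian stacked over sqrt(μ)·I. The inner loop that applies one rotation across a pair of rows is the kind of regular arithmetic that vector instructions could in principle speed up, but the record does not say how the paper arranges it.

```python
import math


def givens_qr_solve(A, b):
    """Least-squares solution of A x ~ b using Givens-rotation QR (pure Python)."""
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # working copy, reduced to upper-triangular form
    y = b[:]                   # receives the same rotations, i.e. Q^T b
    for j in range(n):
        for i in range(j + 1, m):
            if R[i][j] == 0.0:
                continue  # entry already zero, no rotation needed
            r = math.hypot(R[j][j], R[i][j])
            c, s = R[j][j] / r, R[i][j] / r
            # Apply the same rotation to every remaining column of rows j and i.
            for k in range(j, n):
                top, bot = R[j][k], R[i][k]
                R[j][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
            y[j], y[i] = c * y[j] + s * y[i], -s * y[j] + c * y[i]
    # Back-substitution on the upper-triangular n x n block.
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        acc = y[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))
        x[j] = acc / R[j][j]
    return x


def lm_step(J, e, mu):
    """Return dw solving (J^T J + mu I) dw = -J^T e, assuming mu > 0."""
    n = len(J[0])
    root = math.sqrt(mu)
    # Stack the Jacobian over sqrt(mu) * I; the right-hand side is -e padded with zeros.
    A = [row[:] for row in J]
    A += [[root if i == k else 0.0 for k in range(n)] for i in range(n)]
    b = [-v for v in e] + [0.0] * n
    return givens_qr_solve(A, b)


# Hypothetical data: 3 error terms, 2 weights.
J = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
e = [1.0, 2.0, 3.0]
dw = lm_step(J, e, mu=0.5)
```

Working on the stacked matrix avoids forming JᵀJ explicitly, which is a common reason to prefer a QR-based solve over the normal equations.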

Publisher

University of Social Sciences

Journal

Journal of Artificial Intelligence and Soft Computing Research

Year

2023

Volume

Vol. 13, No. 2

Pages

45–61

Physical description

Bibliography: 40 items, figures.

Contributors

author

Bilski Jarosław

jaroslaw.bilski@pcz.pl

Department of Computational Intelligence, Częstochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

https://orcid.org/0000-0003-1769-3934

author

Smoląg Jacek

Department of Computational Intelligence, Częstochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

https://orcid.org/0000-0002-1326-3374

author

Kowalczyk Bartosz

Department of Computational Intelligence, Częstochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

https://orcid.org/0000-0002-7683-9051

author

Grzanek Konrad

Institute of Information Technologies, University of Social Sciences, ul. Sienkiewicza 9, 90-113 Łódź, Poland

https://orcid.org/0000-0003-2193-143X

author

Izonin Ivan

Department of Artificial Intelligence, Lviv Polytechnic National University, Lviv, 79905, Ukraine

https://orcid.org/0000-0002-9761-0096

References

[1] Marcin Gabryel, Dawid Lada, Zbigniew Filutowicz, Zofia Patora-Wysocka, Marek Kisiel-Dorohinicki, and Guang Yi Chen. Detecting anomalies in advertising web traffic with the use of the variational autoencoder. Journal of Artificial Intelligence and Soft Computing Research, 12(4):255–256, 2022.
[2] Marcin Zalasiński, Łukasz Laskowski, Tacjana Niksa-Rynkiewicz, Krzysztof Cpałka, Aleksander Byrski, Krzysztof Przybyszewski, Paweł Trippner, and Shi Dong. Evolutionary algorithm for selecting dynamic signatures partitioning approach. Journal of Artificial Intelligence and Soft Computing Research, 12(4):267–279, 2022.
[3] S. Albawi, T. A. Mohammed, and S. Al-Zawi. Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET), pages 1–6, 2017.
[4] A. M. Taqi, A. Awad, F. Al-Azzo, and M. Milanova. The impact of multi-optimizers and data augmentation on tensorflow convolutional neural network performance. In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pages 140–145, April 2018.
[5] Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang, Jianfei Cai, and Tsuhan Chen. Recent advances in convolutional neural networks. Pattern Recognition, 77:354–377, 2018.
[6] Robert K. Nowicki and Janusz T. Starczewski. A new method for classification of imprecise data using fuzzy rough fuzzification. Inf. Sci., 414:33–52, 2017.
[7] Janusz T. Starczewski, Katarzyna Nieszporek, Michal Wróbel, and Konrad Grzanek. A fuzzy SOM for understanding incomplete 3d faces. In ICAISC (2), volume 10842 of Lecture Notes in Computer Science, pages 73–80. Springer, 2018.
[8] Michal Wróbel, Katarzyna Nieszporek, Janusz T. Starczewski, and Andrzej Cader. A fuzzy measure for recognition of handwritten letter strokes. In ICAISC (1), volume 10841 of Lecture Notes in Computer Science, pages 761–770. Springer, 2018.
[9] Jarosław Bilski, Bartosz Kowalczyk, Alina Marchlewska, and Jacek M. Zurada. Local Levenberg-Marquardt algorithm for learning feedforwad neural networks. Journal of Artificial Intelligence and Soft Computing Research, 10(4):299–316, 2020.
[10] Jarosław Bilski, Bartosz Kowalczyk, Andrzej Marjański, Michał Gandor, and Jacek Zurada. A novel fast feedforward neural networks training algorithm. Journal of Artificial Intelligence and Soft Computing Research, 11(4):287–306, 2021.
[11] Jarosław Bilski, Bartosz Kowalczyk, Marek Kisiel-Dorohinicki, Agnieszka Siwocha, and Jacek Żurada. Towards a very fast feedforward multilayer neural networks training algorithm. Journal of Artificial Intelligence and Soft Computing Research, 12(3):181–195, 2022.
[12] Xin Wang, Yi Guo, Yuanyuan Wang, and Jinhua Yu. Automatic breast tumor detection in abvs images based on convolutional neural network and superpixel patterns. Neural Computing and Applications, 31(4):1069–1081, 2019.
[13] Muhammad Irfan Sharif, Jian Ping Li, Muhammad Attique Khan, and Muhammad Asim Saleem. Active deep neural network features selection for segmentation and recognition of brain tumors using mri images. Pattern Recognition Letters, 129:181–189, 2020.
[14] P. Mohamed Shakeel, T. E. E. Tobely, H. AlFeel, G. Manogaran, and S. Baskar. Neural network based brain tumor detection using wireless infrared imaging sensor. IEEE Access, 7:5577–5588, 2019.
[15] Alexander Rakhlin, Alexey Shvets, Vladimir Iglovikov, and Alexandr A. Kalinin. Deep convolutional neural networks for breast cancer histology image analysis. In Aurélio Campilho, Fakhri Karray, and Bart ter Haar Romeny, editors, Image Analysis and Recognition, pages 737–744, Cham, 2018. Springer International Publishing.
[16] Xin Cai, Yufeng Qian, Qingshan Bai, and Wei Liu. Exploration on the financing risks of enterprise supply chain using back propagation neural network. Journal of Computational and Applied Mathematics, 367:112457, 2020.
[17] Amin Hedayati Moghaddam, Moein Hedayati Moghaddam, and Morteza Esfandyari. Stock market index prediction using artificial neural network. Journal of Economics, Finance and Administrative Science, 21(41):89–93, 2016.
[18] Songqiao Qi, Kaijun Jin, Baisong Li, and Yufeng Qian. The exploration of internet finance by using neural network. Journal of Computational and Applied Mathematics, 369:112630, 2020.
[19] A. V. Kurbesov, D. V. Ryabkin, I. I. Miroshnichenko, N. A. Aruchidi, and K. Kh. Kalugyan. Automated voice recognition of emotions through the use of neural networks. In Rafik A. Aliev, Janusz Kacprzyk, Witold Pedrycz, Mo Jamshidi, Mustafa B. Babanli, and Fahreddin M. Sadikoglu, editors, 10th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions - ICSCCW-2019, pages 675–682, Cham, 2020. Springer International Publishing.
[20] X. Changzhen, W. Cong, M. Weixin, and S. Yanmei. A traffic sign detection algorithm based on deep convolutional neural network. In 2016 IEEE International Conference on Signal and Image Processing (ICSIP), pages 676–679, Aug 2016.
[21] Katsuba Yurii and Grigorieva Liudmila. Application of artificial neural networks in vehicles’ design self-diagnostic systems for safety reasons. Transportation Research Procedia, 20:283–287, 2017. 12th International Conference "Organization and Traffic Safety Management in large cities", SPbOTSIC-2016, 28-30 September 2016, St. Petersburg, Russia.
[22] N. P. Patel and A. Kale. Optimize approach to voice recognition using iot. In 2018 International Conference On Advances in Communication and Computing Technology (ICACCT), pages 251–256, 2018.
[23] Yi Mou and Kun Xu. The media inequality: Comparing the initial human-human and human-ai social interactions. Computers in Human Behavior, 72:432–440, 2017.
[24] P. J. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974.
[25] Scott E. Fahlman. An empirical study of learning speed in back-propagation networks. Technical report, 1988.
[26] M. Riedmiller and H. Braun. A direct adaptive method for faster backpropagation learning: the rprop algorithm. In IEEE International Conference on Neural Networks, pages 586–591 vol.1, March 1993.
[27] Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning - Volume 28, ICML’13, pages III–1139–III–1147. JMLR.org, 2013.
[28] M. T. Hagan and M.B. Menhaj. Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks, 5:989–993, 1994.
[29] N. Ampazis and S. J. Perantonis. Two highly efficient second-order algorithms for training feedforward networks. IEEE Transactions on Neural Networks, 13(5):1064–1074, 2002.
[30] J. S. Smith, B. Wu, and B. M. Wilamowski. Neural network training with Levenberg–Marquardt and adaptable weight compression. IEEE Transactions on Neural Networks and Learning Systems, 30(2):580–587, 2019.
[31] Miao Cui, Kai Yang, Xiao liang Xu, Sheng dong Wang, and Xiao wei Gao. A modified Levenberg–Marquardt algorithm for simultaneous estimation of multi-parameters of boundary heat flux by solving transient nonlinear inverse heat conduction problems. International Journal of Heat and Mass Transfer, 97:908–916, 2016.
[32] Jiyang Dong, Ke Lu, Jian Xue, Shuangfeng Dai, Rui Zhai, and Weiguo Pan. Accelerated nonrigid image registration using improved Levenberg–Marquardt method. Information Sciences, 423:66–79, 2018.
[33] Jarosław Bilski, Bartosz Kowalczyk, and Jacek M. Żurada. Application of the Givens rotations in the neural network learning algorithm. In Artificial Intelligence and Soft Computing, volume 9602 of Lecture Notes in Artificial Intelligence, pages 46–56. Springer-Verlag Berlin Heidelberg, 2016.
[34] Jacek Smoląg, Jarosław Bilski, and Leszek Rutkowski. Systolic array for neural networks. In IV KSNiIZ, pages 487–497, 1999.
[35] Jacek Smoląg and Jarosław Bilski. A systolic array for fast learning of neural networks. In V NNSC, pages 754–758, 2000.
[36] D. Rutkowska, R.K. Nowicki, and Y. Hayashi. Parallel processing by implication-based neuro–fuzzy systems. Lecture Notes in Computer Science, 2328:599–607, 2002.
[37] Jarosław Bilski and Jacek Smoląg. Parallel realisation of the recurrent RTRN neural network learning. In Artificial Intelligence and Soft Computing, volume 5097 of Lecture Notes in Computer Science, pages 11–16. Springer-Verlag Berlin Heidelberg, 2008.
[38] Jarosław Bilski and Jacek Smoląg. Parallel architectures for learning the RTRN and Elman dynamic neural network. IEEE Transactions on Parallel and Distributed Systems, 26(9):2561–2570, 2015.
[39] Jarosław Bilski, Jacek Smoląg, and Jacek M. Żurada. Parallel approach to the Levenberg-Marquardt learning algorithm for feedforward neural networks. In Artificial Intelligence and Soft Computing, volume 9119 of Lecture Notes in Computer Science, pages 3–14. Springer-Verlag Berlin Heidelberg, 2015.
[40] J. Bilski and B.M. Wilamowski. Parallel Levenberg-Marquardt algorithm without error backpropagation. Artificial Intelligence and Soft Computing, Springer-Verlag Berlin Heidelberg, LNAI 10245:25–39, 2017.

Notes

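Record prepared with funds from MEiN (the Polish Ministry of Education and Science), agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science) - module: Popularisation of science and promotion of sport (2022-2023).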

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-41a1ef60-a15f-4db1-a7d6-351248ea99bd