A novel fast feedforward neural networks training algorithm

Bilski, Jarosław; Kowalczyk, Bartosz; Marjański, Andrzej; Gandor, Michał; Zurada, Jacek

doi:10.2478/jaiscr-2021-0017

Artykuł - szczegóły

Tytuł artykułu

A novel fast feedforward neural networks training algorithm

Autorzy

Bilski Jarosław , Kowalczyk Bartosz , Marjański Andrzej , Gandor Michał , Zurada Jacek

Treść / Zawartość

Pełne teksty:

Pobierz

Identyfikatory

DOI

10.2478/jaiscr-2021-0017

Warianty tytułu

Języki publikacji

Abstrakty

In this paper1 a new neural networks training algorithm is presented. The algorithm originates from the Recursive Least Squares (RLS) method commonly used in adaptive filtering. It uses the QR decomposition in conjunction with the Givens rotations for solving a normal equation - resulting from minimization of the loss function. An important parameter in neural networks is training time. Many commonly used algorithms require a big number of iterations in order to achieve a satisfactory outcome while other algorithms are effective only for small neural networks. The proposed solution is characterized by a very short convergence time compared to the well-known backpropagation method and its variants. The paper contains a complete mathematical derivation of the proposed algorithm. There are presented extensive simulation results using various benchmarks including function approximation, classification, encoder, and parity problems. Obtained results show the advantages of the featured algorithm which outperforms commonly used recent state-of-the-art neural networks training algorithms, including the Adam optimizer and the Nesterov’s accelerated gradient.

Słowa kluczowe

neural network training algorithm QR decomposition Givens rotations approximation classification

Wydawca

University of Social Sciences

Czasopismo

Journal of Artificial Intelligence and Soft Computing Research

Rocznik

2021

Tom

Vol. 11, No. 4

Strony

287--306

Opis fizyczny

Bibliogr. 38 poz., rys.

Twórcy

autor

Bilski Jarosław

jaroslaw.bilski@pcz.pl

Department of Intelligent Computer Systems, Częstochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

autor

Kowalczyk Bartosz

Department of Intelligent Computer Systems, Częstochowa University of Technology, al. Armii Krajowej 36, 42-200 Częstochowa, Poland

autor

Marjański Andrzej

Management Department, University of Social Sciences, 90-113 Łódź, Poland
Clark University, Worcester, MA 01610, USA

autor

Gandor Michał

Faculty of Computer Science and Telecommunications, Cracow University of Technology Warszawska 24, 31-155 Krakow, Poland

autor

Zurada Jacek

Department of Computer and Electrical Engineering, University of Louisville, KY 40292, USA

Bibliografia

[1] J. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, 1974.
[2] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, X. Wang, G. Wang, J. Cai, and T. Chen. Recent advances in convolutional neural networks. Pattern Recognition, 77: 354–377, 2018.
[3] J. Bilski and A.I. Galushkin. A new proposition of the activation function for significant improvement of neural networks performance. In Artificial Intelligence and Soft Computing, volume 9602 of Lecture Notes in Computer Science, pages 35–45. Springer-Verlag Berlin Heidelberg, 2016.
[4] N.A. Khan and A. Shaikh. A smart amalgamation of spectral neural algorithm for nonlinear lane-emden equations with simulated annealing. Journal of Artificial Intelligence and Soft Computing Research, 7(3): 215–224, 2017.
[5] O. Chang, P. Constante, A. Gordon, and M. Singana. A novel deep neural network that uses space-time features for tracking and recognizing a moving object. Journal of Artificial Intelligence and Soft Computing Research, 7(2): 125–136, 2017.
[6] A. Shewalkar, D. Nyavanandi, and S. A. Ludwig. Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU. Journal of Artificial Intelligence and Soft Computing Research, 9(4): 235–245, 2019.
[7] J.B. Liu, J. Zhao, S. Wang, M. Javaid, and J. Cao. On the topological properties of the certain neural networks. Journal of Artificial Intelligence and Soft Computing Research, 8(4): 257–268, 2018.
[8] Y. Li, R. Cui, Z. Li, and D. Xu. Neural network approximation based near-optimal motion planning with kinodynamic constraints using rrt. IEEE Transactions on Industrial Electronics, 65(11): 8718–8729, Nov 2018.
[9] R. Shirin. A neural network approach for retailer risk assessment in the aftermarket industry. Benchmarking: An International Journal, 26(5): 1631–1647, Jan 2019.
[10] M. Costam, D. Oliveira, S. Pinto, and A. Tavares. Detecting driver’s fatigue, distraction and activity using a non-intrusive ai-based monitoring system. Journal of Artificial Intelligence and Soft Computing Research, 9(4): 247–266, 2019.
[11] A.K. Singh, S.K. Jha, and A.V. Muley. Candidates selection using artificial neural network technique in a pharmaceutical industry. In Siddhartha Bhattacharyya, Aboul Ella Hassanien, Deepak Gupta, Ashish Khanna, and Indrajit Pan, editors, International Conference on Innovative Computing and Communications, pages 359–366, Singapore, 2019. Springer Singapore.
[12] A.Y. Hannun, P. Rajpurkar, M. Haghpanahi, G.H. Tison, C. Bourn, M. P. Turakhia, and A.Y. Ng. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature Medicine, 25(1): 65–69, 2019.
[13] D. Hagan and H. Hagan. Soft computing tools for virtual drug discovery. Journal of Artificial Intelligence and Soft Computing Research, 8(3): 173–189, 2018.
[14] E. Angelini, G. di Tollo, and A. Roli. A neural network approach for credit risk evaluation. The Quarterly Review of Economics and Finance, 48(4): 733–755, 2008.
[15] Ghosh and Reilly. Credit card fraud detection with a neural-network. In 1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, volume 3, pages 621–630, Jan 1994.
[16] K.Y. Tam and M. Kiang. Predicting bank failures: A neural network approach. Applied Artificial Intelligence, 4(4): 265–282, 1990.
[17] U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, and H. Adeli. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Computers in Biology and Medicine, 100: 270–278, 2018.
[18] O. Abedinia, N. Amjady, and N. Ghadimi. Solar energy forecasting based on hybrid neural network and improved metaheuristic algorithm. Computational Intelligence, 34(1): 241–260, 2018.
[19] H. Liu, X. Mi, and Y. Li. Wind speed forecasting method based on deep learning strategy using empirical wavelet transform, long short term memory neural network and Elman neural network. Energy Conversion and Management, 156: 498–514, 2018.
[20] J.C.R. Whittington and R. Bogacz. Theories of error back-propagation in the brain. Trends in Cognitive Sciences, 23(3): 235–250, 2019.
[21] A.K. Singh, B. Kumar, S.K. Singh, S.P. Ghrera, and A. Mohan. Multiple watermarking technique for securing online social network contents using back propagation neural network. Future Generation Computer Systems, 86: 926–939, 2018.
[22] Z. Cao, N. Guo, M. Li, K. Yu, and K. Gao. Back propagation neural network based signal acquisition for Brillouin distributed optical fiber sensors. Opt. Express, 27(4): 4549–4561, Feb 2019.
[23] M.T. Hagan and M.B. Menhaj. Training feed-forward networks with the marquardt algorithm. IEEE Transactions on Neuralnetworks, 5: 989–993, 1994.
[24] B.T. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5): 1–17, 1964.
[25] Yu. E. Nesterov. A method for solving the convex programming problem with convergence rate O(1/sqr(k)). In Soviet Mathematics Dok-lady, number 27: 372-376, 1983.
[26] I. Sutskever, J. Martens, G. Dahl, and G. Hinton. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on International Conference on Machine Learning -Volume 28, ICML’13, pages III–1139–III–1147. JMLR.org, 2013.
[27] S.E. Fahlman. An empirical study of learning speed in back-propagation networks. Technical report, 1988.
[28] M. Riedmiller and H. Braun. A direct adaptive method for faster backpropagation learning: the rprop algorithm. In IEEE International Conference on Neural Networks, pages 586–591 vol.1, March 1993.
[29] D.P. Kingma and J. Ba. Adam: A method for stochastic optimization, 2014.
[30] J. Bilski and L. Rutkowski. A fast training algorithm for neural networks. IEEE Transaction on Circuits and Systems Part II, 45(6): 749–753, 1998.
[31] W. Givens. Computation of plain unitary rotations transforming a general matrix to triangular form. Journal of The Society for Industrial and Applied Mathematics, 6: 26–50, 1958.
[32] C.L. Lawson and R.J. Hanson. Solving Least Squares Problems. Prentice-Hall series in automatic computation. Prentice-Hall, 1974.
[33] A. Kiełbasiński and H. Schwetlick. Numeryczna Algebra Liniowa: Wprowadzenie do Obliczeń Zautomatyzowanych. Wydawnictwa Naukowo-Techniczne, Warszawa, 1992.
[34] Louis Guttman. Enlargement Methods for Computing the Inverse Matrix. The Annals of Mathematical Statistics, 17(3): 336 – 343, 1946.
[35] J. Bilski and B.M. Wilamowski. Parallel learning of feedforward neural networks without error backpropagation. In Artificial Intelligence and Soft Computing, pages 57–69, Cham, 2016. Springer International Publishing.
[36] J. Bilski, B. Kowalczyk, and K. Grzanek. The parallel modification to the Levenberg-Marquardt algorithm. In Artificial Intelligence and Soft Computing, volume 10841 of Lecture Notes in Artificial Intelligence, pages 15–24. Springer-Verlag Berlin Heidelberg, 2018.
[37] J. Bilski and B.M. Wilamowski. Parallel Levenberg-Marquardt algorithm without error backpropagation. Artificial Intelligence and Soft Computing, Springer-Verlag Berlin Heidelberg, LNAI 10245: 25–39, 2017.
[38] J. Bilski and J. Smoląg. Fast conjugate gradient algorithm for feedforward neural networks. In Leszek Rutkowski, Rafał Scherer, Marcin Korytkowski, Witold Pedrycz, Ryszard Tadeusiewicz, and Jacek M. Zurada, editors, Artificial Intelligence and Soft Computing, pages 27–38, Cham, 2020. Springer International Publishing.

Uwagi

Opracowanie rekordu ze środków MNiSW, umowa Nr 461252 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2021).

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-1148f663-0c6b-4b57-882a-80801792fbac