A survey of factors influencing MLP error surface

Kordos, M.; Duch, W.

Article - details

Article title

A survey of factors influencing MLP error surface

Authors

Kordos M., Duch W.

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/cc/cc33/cc3347.pdf [remote]

Abstracts

Visualization of neural network error surfaces and learning trajectories helps to understand how numerous factors influence the neural learning process. This understanding can be used to improve the training and design of MLP networks. The following topics are discussed, using a few benchmark datasets for illustration: general properties of error surfaces, including local minima, plateaus and narrow funnels; their dependence on network structure, input data, and transfer and error functions; the consequences of weight initialization; and interesting directions in the weight space. The error surfaces are shown in 3-dimensional PCA-based projections. Finally, the possibility of effectively reducing the number of weights is discussed.
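The PCA-based projection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes one plausible reading of the abstract, in which the weight vectors visited during training are collected, the first two principal directions of that trajectory are found, and the error is evaluated on a grid spanned by those directions (the error value being the third axis). The XOR data, the 2-2-1 network, the learning rate, and all function names (`mlp_error`, `train`, `pca_directions`, `error_surface`) are assumptions made for illustration; plain gradient descent with a finite-difference gradient is used only for brevity.

```python
import numpy as np

def mlp_error(w, X, y, n_hidden):
    """Mean squared error of a one-hidden-layer MLP given a flat weight vector w."""
    n_in = X.shape[1]
    k = n_in * n_hidden
    W1 = w[:k].reshape(n_in, n_hidden)        # input-to-hidden weights
    b1 = w[k:k + n_hidden]                    # hidden biases
    W2 = w[k + n_hidden:k + 2 * n_hidden]     # hidden-to-output weights
    b2 = w[-1]                                # output bias
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return float(np.mean((out - y) ** 2))

def numerical_gradient(w, X, y, n_hidden, eps=1e-5):
    """Central-difference estimate of the error gradient."""
    g = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (mlp_error(w + step, X, y, n_hidden)
                - mlp_error(w - step, X, y, n_hidden)) / (2 * eps)
    return g

def train(X, y, n_hidden=2, steps=300, lr=0.5, seed=0):
    """Gradient descent that records every visited weight vector."""
    rng = np.random.default_rng(seed)
    n_w = X.shape[1] * n_hidden + 2 * n_hidden + 1
    w = rng.normal(scale=0.5, size=n_w)
    trajectory = [w.copy()]
    for _ in range(steps):
        w = w - lr * numerical_gradient(w, X, y, n_hidden)
        trajectory.append(w.copy())
    return np.array(trajectory)

def pca_directions(trajectory, n_components=2):
    """Mean of the trajectory and its leading principal directions (via SVD)."""
    center = trajectory.mean(axis=0)
    _, _, vt = np.linalg.svd(trajectory - center, full_matrices=False)
    return center, vt[:n_components]

def error_surface(center, directions, X, y, n_hidden, span=5.0, n=25):
    """Error evaluated on an n-by-n grid in the plane of the two directions."""
    coords = np.linspace(-span, span, n)
    E = np.empty((n, n))
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            w = center + a * directions[0] + b * directions[1]
            E[i, j] = mlp_error(w, X, y, n_hidden)
    return coords, E

# XOR as a stand-in dataset (an assumption; the paper uses benchmark datasets).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

traj = train(X, y)
center, dirs = pca_directions(traj)
coords, E = error_surface(center, dirs, X, y, n_hidden=2)
```

The grid `E` can then be passed to any 3-D surface plotter, with the projected trajectory `(traj - center) @ dirs.T` drawn on top to show the learning path.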

Keywords

neural networks, MLP, error surface, visualization, learning trajectory

neural network, error surface, visualization, learning trajectory

Publisher

Systems Research Institute, Polish Academy of Sciences

Journal

Control and Cybernetics

Year

2004

Volume

Vol. 33, no. 4

Pages

611-631

Physical description

Bibliography: 13 items, charts.

Contributors

author

Kordos M.

Faculty of Automatic Control, Electronics and Computer Science, The Silesian University of Technology, Gliwice, Poland

author

Duch W.

Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
School of Computer Engineering, Nanyang Technological University, Singapore, www.phys.uni.torun.pl/~duch

Bibliography

Denker, J., Schwartz, D., Wittner, B., Solla, S., Howard, R., Jackel, L. and Hopfield, J.J. (1987) Automatic learning, rule extraction and generalization. Complex Systems 1, 887-922.
Duch, W. and Jankowski, N. (1999) Survey of neural transfer functions. Neural Computing Surveys 2, 163-213.
Duch, W., Adamczak, R. and Grąbczewski, K. (2001) A new methodology of extraction, optimization and application of crisp and fuzzy logical rules. IEEE Transactions on Neural Networks 12, 277-306.
Kordos, M. and Duch, W. (2003) Search-based Training for Logical Rule Extraction by Multilayer Perceptron. Proc. of Int. Conf. on Artificial Neural Networks (ICANN), Istanbul, June 2003, 86-89.
Kordos, M. and Duch, W. (2003) Multilayer Perceptron Trained with Numerical Gradient. Proc. of Int. Conf. on Artificial Neural Networks (ICANN), Istanbul, June 2003, 106-109.
Kordos, M. and Duch, W. (2004) Variable Step Search Algorithm for MLP Training. The 8th IASTED Int. Conf. on Artificial Intelligence and Soft Computing, Marbella, Spain, Sept. 2004, 215-221.
Gallagher, M. and Downs, T. (2003) Visualization of Learning in Multilayer Perceptron Networks using PCA. IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics 33, 28-34.
Hyvärinen, A. and Oja, E. (2002) Independent Component Analysis: A Tutorial. http://www.cis.hut.fi/projects/ica
Levin, A.U., Leen, T.K. and Moody, J.E. (1994) Fast Pruning Using Principal Components. Advances in Neural Information Processing 6, 35-42.
Mertz, C.J. and Murphy, P.M. (1999) UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
Møller, M.F. (1993) A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning. Neural Networks 6, 525-533.
Ranganathan, A. (2004) The Levenberg-Marquardt Algorithm. http://www.cc.gatech.edu/people/home/ananth
Sussmann, H.J. (1992) Uniqueness of the weights for minimal feedforward nets with a given input-output map. Neural Networks 5, 589-593.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT5-0007-0073