Evolutionary learning of rich neural networks in the Bayesian model selection framework

Matteucci, M.; Spadoni, D.

Article details

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/amc/amc14/amc14311.pdf [remote]

Abstract

In this paper we focus on the problem of using a genetic algorithm for model selection within a Bayesian framework. We propose reducing model selection to a search problem, solved by evolutionary computation that explores a posterior distribution over the model space. As a case study, we introduce ELeaRNT (Evolutionary Learning of Rich Neural Network Topologies), a genetic algorithm that evolves a particular class of models, namely Rich Neural Networks (RNN), to find an optimal domain-specific non-linear function approximator with good generalization capability. To evolve this kind of neural network, ELeaRNT uses a Bayesian fitness function. The experimental results show that ELeaRNT, using a Bayesian fitness function, finds in a completely automated way networks that are well matched to the analysed problem and of acceptable complexity.

Keywords

Rich Neural Networks; Bayesian model selection; genetic algorithm; Bayesian fitness

neural network; Bayesian model; genetic algorithm

Publisher

Oficyna Wydawnicza Uniwersytetu Zielonogórskiego

Journal

International Journal of Applied Mathematics and Computer Science

Year

2004

Volume

Vol. 14, No. 3

Pages

423–440

Physical description

Bibliography: 38 items; figures, tables, charts.

Contributors

author

Matteucci M.

matteucci@elet.polimi.it

Department of Electronics and Information, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milan, Italy

author

Spadoni D.

spadoni@alari.ch

ALaRI (Advanced Learning and Research Institute), University of Lugano, Lugano, Switzerland

Bibliography

[1] Angeline P.J. (1994): Genetic Programming and Emergent Intelligence, In: Advances in Genetic Programming (K.E. Kinnear, Jr., Ed.). — Cambridge, MA: MIT Press, pp. 75–98.
[2] Bebis G., Georgiopoulos M. and Kasparis T. (1997): Coupling weight elimination with genetic algorithms to reduce network size and preserve generalization. — Neurocomput., Vol. 17, No. 3–4, pp. 167–194.
[3] Bernardo J.M. and Smith A.F.M. (1994): Bayesian Theory. — New York: Wiley.
[4] Bishop C.M. (1995): Neural Networks for Pattern Recognition. — Oxford: Oxford University Press.
[5] Castellano G., Fanelli A.M. and Pelillo M. (1997): An iterative pruning algorithm for feedforward neural networks. — IEEE Trans. Neural Netw., Vol. 8, No. 3, pp. 519–531.
[6] Chib S. and Greenberg E. (1995): Understanding the Metropolis-Hastings algorithm. — Amer. Stat., Vol. 49, No. 4, pp. 327–335.
[7] Denison D.G.T., Holmes C.C., Mallick B.K. and Smith A.F.M. (2002): Bayesian Methods for Nonlinear Classification and Regression. — New York: Wiley.
[8] Dudzinski M.L. and Mykytowycz R. (1961): The eye lens as an indicator of age in the wild rabbit in Australia. — CSIRO Wildlife Res., Vol. 6, No. 1, pp. 156–159.
[9] Flake G.W. (1993): Nonmonotonic activation functions in multilayer perceptrons. — Ph.D. thesis, Dept. Comput. Sci., University of Maryland, College Park, MD.
[10] Fletcher R. (1987): Practical Methods of Optimization. — New York: Wiley.
[11] Goldberg D.E. (1989): Genetic Algorithms in Search, Optimization, and Machine Learning. — Reading, MA: Addison-Wesley.
[12] Gull S.F. (1989): Developments in maximum entropy data analysis, In: Maximum Entropy and Bayesian Methods, Cambridge 1988 (J. Skilling, Ed.). — Dordrecht: Kluwer, pp. 53–71.
[13] Hancock P.J.B. (1992): Genetic algorithms and permutation problems: A comparison of recombination operators for neural net structure specification. — Proc. COGANN Workshop, Int. Joint Conf. Neural Networks, Piscataway, NJ, IEEE Computer Press, pp. 108–122.
[14] Hashem S. (1997): Optimal linear combinations of neural networks. — Neural Netw., Vol. 10, No. 4, pp. 599–614.
[15] Hassibi B. and Stork D.G. (1992): Second order derivatives for network pruning: Optimal Brain Surgeon, In: Advances in Neural Information Processing Systems (S.J. Hanson, J.D. Cowan and C. Lee Giles, Eds.). — San Mateo, CA: Morgan Kaufmann, Vol. 5, pp. 164–171.
[16] Hastings W.K. (1970): Monte Carlo sampling methods using Markov chains and their applications. — Biometrika, Vol. 57, pp. 97–109.
[17] Haykin S. (1999): Neural Networks. A Comprehensive Foundation (2nd Edition). — New Jersey: Prentice Hall.
[18] Hoeting J., Madigan D., Raftery A. and Volinsky C. (1998): Bayesian model averaging. — Tech. Rep. No. 9814, Department of Statistics, Colorado State University.
[19] Hornik K.M., Stinchcombe M. and White H. (1989): Multilayer feedforward networks are universal approximators. — Neural Netw., Vol. 2, No. 5, pp. 359–366.
[20] Liu Y. and Yao X. (1996): A population-based learning algorithm which learns both architectures and weights of neural networks. — Chinese J. Adv. Softw. Res., Vol. 3, No. 1, pp. 54–65.
[21] Lovell D. and Tsoi A. (1992): The performance of the neocognitron with various s-cell and c-cell transfer functions. — Tech. Rep., Intelligent Machines Laboratory, Department of Electrical Engineering, University of Queensland.
[22] MacKay D.J.C. (1992): A practical Bayesian framework for backpropagation networks. — Neural Comput., Vol. 4, No. 3, pp. 448–472.
[23] MacKay D.J.C. (1995): Probable networks and plausible predictions — a review of practical Bayesian methods for supervised neural networks. — Netw. Comput. Neural Syst., Vol. 6, No. 3, pp. 469–505.
[24] MacKay D.J.C. (1999): Comparison of approximate methods for handling hyperparameters. — Neural Comput., Vol. 11, No. 5, pp. 1035–1068.
[25] Mani G. (1990): Learning by gradient descent in function space. — Tech. Rep. No. WI 52703, Computer Sciences Department, University of Wisconsin, Madison, WI.
[26] Matteucci M. (2002a): ELeaRNT: Evolutionary learning of rich neural network topologies. — Tech. Rep. No. CMU–CALD–02–103, Carnegie Mellon University, Pittsburgh, PA.
[27] Matteucci M. (2002b): Evolutionary learning of adaptive models within a Bayesian framework. — Ph.D. thesis, Dipartimento di Elettronica e Informazione, Politecnico di Milano.
[28] Montana D.J. and Davis L. (1989): Training feedforward neural networks using genetic algorithms. — Proc. 3rd Int. Conf. Genetic Algorithms, San Francisco, CA, USA, pp. 762–767.
[29] Pearlmutter B.A. (1994): Fast exact multiplication by the Hessian. — Neural Comput., Vol. 6, No. 1, pp. 147–160.
[30] Press W.H., Teukolsky S.A., Vetterling W.T. and Flannery B.P. (1992): Numerical Recipes in C: The Art of Scientific Computing. — Cambridge, UK: University Press.
[31] Ronald E. and Schoenauer M. (1994): Genetic lander: An experiment in accurate neuro-genetic control. — Proc. 3rd Conf. Parallel Problem Solving from Nature, Berlin, Germany, pp. 452–461.
[32] Rumelhart D.E., Hinton G.E. and Williams R.J. (1986): Learning representations by back-propagating errors. — Nature, Vol. 323, pp. 533–536.
[33] Stone M. (1974): Cross-validatory choice and assessment of statistical predictions. — J. Royal Stat. Soc., Series B, Vol. 36, pp. 111–147.
[34] Tierney L. and Kadane J.B. (1986): Accurate approximations for posterior moments and marginal densities. — J. Amer. Stat. Assoc., Vol. 81, pp. 82–86.
[35] Tikhonov A.N. (1963): Solution of incorrectly formulated problems and the regularization method. — Soviet Math. Dokl., Vol. 4, pp. 1035–1038.
[36] Wasserman L. (1999): Bayesian model selection and model averaging. — J. Math. Psych., Vol. 44, No. 1, pp. 92–107.
[37] Weigend A.S., Rumelhart D.E. and Huberman B.A. (1991): Generalization by weight elimination with application to forecasting, In: Advances in Neural Information Processing Systems, Vol. 3 (R. Lippmann, J. Moody and D. Touretzky, Eds.). — San Francisco, CA: Morgan Kaufmann, pp. 875–882.
[38] Williams P.M. (1995): Bayesian regularization and pruning using a Laplace prior. — Neural Comput., Vol. 7, No. 1, pp. 117–143.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPZ1-0007-0038