Approximating Arbitrary Reinforcement Signal by Learning Classifier Systems using Micro Genetic Algorithm

Hamzeh, A.; Rahmani, A.

Selected full texts from this journal

https://fi.episciences.org/

Abstract

Learning Classifier Systems (LCSs) are evolutionary learning mechanisms that combine Genetic Algorithms with the Reinforcement Learning paradigm. They try to evolve state-action-reward mappings so as to propose, for each environmental state, the action that maximizes the achieved reward. In the first versions of learning classifier systems, a state-action pair could only be mapped to a constant real-valued reward, so to model a fairly complex environment, LCSs had to develop redundant state-action pairs mapped to different reward values. An extension to a well-known LCS, the Accuracy Based Learning Classifier System (XCS), was recently developed that can map state-action pairs to a linear reward function. This extension, called XCSF, can develop a more compact population than the original XCS. However, further research has shown that XCSF is not able to develop proper mappings when the input parameters come from certain intervals. In our previous work, we addressed this issue with an approach inspired by the idea of using an evolutionary method to approximate the reward landscape. The first results seemed promising, but the approach, called XCSFG, converged to the goal very slowly. In this paper, we propose a new extension to XCSFG that employs a micro-GA, which needs a far smaller population than a simple GA, so we expect it to help XCSFG converge faster. The reported results show that this new extension can be regarded as an alternative approach in the XCSF family with respect to convergence speed, approximation accuracy, and population compactness.
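
As a rough illustration of the micro-GA idea mentioned in the abstract, the sketch below fits the two weights of a linear payoff approximation with a generic micro-GA: a population of five, elitism, crossover without mutation, and a random restart once the population has converged. This is not the authors' XCSFG extension, whose details the record does not give. The function names, the real-valued encoding, the blend crossover, the restart tolerance, and the sample data are all assumptions made for the example.

```python
import random


def micro_ga(fitness, n_genes, bounds=(-1.0, 1.0), pop_size=5,
             generations=200, restart_tol=1e-3, seed=0):
    """Generic micro-GA sketch (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_individual():
        return [rng.uniform(lo, hi) for _ in range(n_genes)]

    def tournament(pop):
        # Binary tournament: pick two distinct individuals, keep the fitter.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=fitness)
        # Nominal convergence: every individual lies close to the elite.
        spread = max(max(abs(g - e) for g, e in zip(ind, elite))
                     for ind in pop)
        if spread < restart_tol:
            # Restart: keep the elite, redraw everyone else at random.
            pop = [elite] + [random_individual()
                             for _ in range(pop_size - 1)]
            continue
        # Otherwise breed: the elite plus blend-crossover children.
        children = [elite]
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            alpha = rng.random()
            children.append([alpha * x + (1 - alpha) * y
                             for x, y in zip(p1, p2)])
        pop = children
    # The elite is always carried over, so the best fitness never decreases.
    return max(pop, key=fitness)


def linear_fitness(samples):
    """Negative squared error of the linear prediction w[0] + w[1] * x."""
    def fitness(w):
        return -sum((w[0] + w[1] * x - y) ** 2 for x, y in samples)
    return fitness


# Assumed toy payoff landscape: y = 0.5 * x - 0.2 sampled on [0, 1].
samples = [(x / 10.0, 0.5 * (x / 10.0) - 0.2) for x in range(11)]
fitness = linear_fitness(samples)
weights = micro_ga(fitness, n_genes=2)
```

Because the elite survives every generation and every restart, running for more generations with the same seed can only keep or improve the best fitness. How quickly the result approaches the target weights depends on the assumed crossover and restart settings.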

Keywords

function approximation; Learning Classifier Systems; Micro Genetic Algorithm; XCSF

Publisher

IOS Press

Journal

Fundamenta Informaticae

Year

2008

Volume

Vol. 86, no. 1-2

Pages

93-111

Physical description

Bibliography: 20 items; tables.

Contributors

author

Hamzeh A.

author

Rahmani A.

Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran, hamzeh@iust.ac.ir

Bibliography

[1] S. Bradtke, A. Barto, 1996, Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, vol. 22(1/2/3), pp. 33-57.
[2] D.E. Goldberg, 1989, Sizing Populations for Serial and Parallel Genetic Algorithms, In J. David Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, pp. 70-79.
[3] A. Hamzeh, A. Rahmani, 2005, An Evolutionary Function Approximation Approach to Compute Predicted reward in XCSF, In Proceedings of 16th European Conference of Machine Learning, Porto, Portugal.
[4] S. Haykin, 1998, Neural Networks: A Comprehensive Foundation, Prentice Hall PTR.
[5] J.H. Holland, 1986, Escaping Brittleness: The Possibilities of General-Purpose Learning Algorithms Applied to Parallel Rule-Based Systems, in R.S. Michalski, J.G. Carbonell, and T.M. Mitchell, eds., Machine Learning: An Artificial Intelligence Approach, Morgan Kaufmann, Los Altos, Calif., 2nd edition, 1986, pp. 593-623.
[6] G. Kanji, 1994, 100 Statistical Tests, SAGE Publications.
[7] K. Krishnakumar, 1989, Micro-Genetic Algorithms for Stationary and non-Stationary Function Optimization, In SPIE Proceedings, Intelligent Control and Adaptive Systems, pp. 289-296.
[8] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, 2005, XCS with Computed Prediction for the Learning of Boolean Functions, In the Proceedings of the IEEE Congress on Evolutionary Computation Conference (CEC2005),
[9] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, 2005, Generalization in the XCSF Classifier System: Analysis, Improvement, and Extension, Technical Report 2005012, Illinois Genetic Algorithms Laboratory.
[10] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, 2005, Generalization in XCSF for Real Inputs, Technical Report 2005023, Illinois Genetic Algorithms Laboratory.
[11] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, 2005, Extending XCSF beyond Linear Approximation, In the Proceedings of the Genetic and Evolutionary Computation Conference 2005 (GECCO 2005), pp. 1827-1834.
[12] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, 2006, Prediction Update Algorithms for XCSF: RLS, Kalman filter, and Gain Adaptation, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2006), Maarten Keijzer et al., eds., ACM Press, New York, pp. 1505-1512.
[13] A. Nikanjam, A. Rahmani, 2006, An Anticipatory Approach to Improve XCSF, In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, Seattle, Washington, USA, July 08 - 12, 2006. GECCO '06. ACM Press, New York, NY, pp. 1595-1596.
[14] G.T. Pulido, C.A. Coello, 2001, A Micro-Genetic Algorithm for Multiobjective Optimization, In Proceedings of Evolutionary Multi-Criterion Optimization conference, pp. 126-140.
[15] G.T. Pulido, C.A. Coello, 2003, The Micro Genetic Algorithm 2: Towards Online Adaptation in Evolutionary Multiobjective Optimization, In Proceedings of Evolutionary Multi-Criterion Optimization conference 2003, pp. 252-266.
[16] R. Sutton, A. Barto, 1998, Reinforcement learning, Cambridge, MIT Press, ISBN: 0262193981.
[17] B. Widrow, M.E. Hoff, 1988, Adaptive Switching Circuits, Chapter Neurocomputing: Foundation of Research, pp. 126-134. Cambridge: The MIT Press.
[18] S.W. Wilson, 1995, Classifier Fitness Based on Accuracy, Evolutionary Computation vol. 3(2), pp. 149-175.
[19] S.W. Wilson, 2002, Classifiers that Approximate Functions, Journal of Natural Computing vol. 1(2-3), pp. 211-234.
[20] S.W. Wilson, 2004, Classifier Systems for Continuous Payoff Environments, In Genetic and Evolutionary Computation - GECCO-2004, Part II, Volume 3103 of Lecture Notes in Computer Science, Seattle, WA, USA, pp. 824-835, Springer-Verlag.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUS5-0018-0005