Article title

Reinforcement Learning with Approximation Spaces

Publication language
EN
Abstract
This paper introduces a rough set approach to reinforcement learning by swarms of cooperating agents. The problem considered here is how to guide reinforcement learning using knowledge of acceptable behavior patterns. This is made possible by considering the behavior patterns of swarms in the context of approximation spaces. Rough set theory, introduced by Zdzisław Pawlak in the early 1980s, provides a ground for deriving pattern-based rewards within approximation spaces. This article investigates both conventional and approximation space-based forms of reinforcement comparison and the actor-critic method, as well as two forms of the off-policy Monte Carlo learning control method. The study of swarm behavior by collections of biologically inspired bots is carried out in an artificial ecosystem testbed. This ecosystem has an ethological basis that makes it possible to observe and explain the behavior of biological organisms, and this carries over into the study of reinforcement learning by interacting robotic devices. The results of ecosystem experiments with six forms of reinforcement learning are given. The contribution of this article is the presentation of several viable alternatives to conventional reinforcement learning methods, defined in the context of approximation spaces.
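The reinforcement comparison method named in the abstract can be illustrated with a minimal sketch. The code below follows the conventional tabular formulation (as in Sutton and Barto [54]), not the paper's approximation space-based variant, which derives a pattern-based reward from rough coverage instead of using a plain scalar reward; all function and variable names here are illustrative.

```python
import math
import random

def softmax(prefs):
    """Convert action preferences into selection probabilities."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforcement_comparison(reward_fn, n_actions, steps=5000,
                             alpha=0.1, beta=0.1, seed=0):
    """Tabular reinforcement comparison: each action's preference is
    adjusted by how much its reward exceeds a running reference reward."""
    rng = random.Random(seed)
    prefs = [0.0] * n_actions      # action preferences p(a)
    ref = 0.0                      # reference (baseline) reward
    for _ in range(steps):
        probs = softmax(prefs)
        a = rng.choices(range(n_actions), weights=probs)[0]
        r = reward_fn(a, rng)
        prefs[a] += beta * (r - ref)   # strengthen actions that beat the baseline
        ref += alpha * (r - ref)       # incrementally track the mean reward
    return prefs

# Hypothetical two-armed bandit: action 1 pays more on average.
bandit = lambda a, rng: rng.gauss(1.0 if a == 1 else 0.0, 0.5)
prefs = reinforcement_comparison(bandit, n_actions=2)
```

After training, the preference for the higher-paying action dominates; the approximation space-based forms studied in the paper keep this update structure but change how the reward signal is computed.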
Pages
323-349
Physical description
Charts, bibliography: 59 items
Authors
author
author
Bibliography
  • [1] Bazan, J., Hoa Nguyen, S., Son Nguyen, H., Skowron, A.: Rough set methods in approximation of hierarchical concepts. Proc. of RSCTC'2004, Lecture Notes in Artificial Intelligence 3066, Springer, Heidelberg, 2004, 346-355.
  • [2] Bazan, J.: The Road simulator Homepage at logic.mimuw.edu.pl/~bazan/simulator
  • [3] Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press, Oxford, UK, 1999.
  • [4] Efremovič, V.A.: The geometry of proximity I, Mat. Sb. 31(73), 1952, 189-200.
  • [5] Engelking, R., Sieklucki, K.: Outline of General Topology. North-Holland Pub. Co., Amsterdam, 1968.
  • [6] Gaskett, C.: Q-Learning for Robot Control. Ph.D. Thesis, supervisor: A. Zelinsky, Department of Systems Engineering, The Australian National University, 2002.
  • [7] Gomolinska, A.: Rough validity, confidence, and coverage of rules in approximation spaces, Transactions on Rough Sets III, 2005, 57-81.
  • [8] Hammersley, J.M., Handscomb, D.C.: Monte Carlo Methods. Methuen & Co Ltd, London, 1964.
  • [9] Huygens, C.: De Ratiociniis in Ludo Aleae (On Reasoning or Computing in Games of Chance), 1657.
  • [10] Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey, Journal of Artificial Intelligence Research, 4, 1996, 237-285.
  • [11] Komorowski, J., Pawlak, Z., Polkowski, L., Skowron, A.: Rough sets: A tutorial, in S.K. Pal, A. Skowron (Eds.) Rough Fuzzy Hybridization. A New Trend in Decision-Making (Singapore: Springer-Verlag Singapore Pte. Ltd., 1999), 3-98.
  • [12] Lehner, P.N.: Handbook of ethological methods, 2nd Ed., Cambridge University Press, Cambridge, UK, 1979, 1996.
  • [13] Lesniewski, S.: Foundations of the General Theory of Sets. In: S.J. Surma, J.T. Srzednicki, D.I. Barnett, V.F. Rickey (Eds.), S. Lesniewski, Collected Works, 2 vols., Kluwer Academic Pub., Dordrecht, vol. 1, 1992, 129-173.
  • [14] Lockery, D.: Reinforcement Learning Methods. Research Report CIL02.11052005, Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, University of Manitoba, 11 April 2005.
  • [15] McCallum, A.K.: Reinforcement Learning with Selective Perception and Hidden State, Ph.D. Thesis, University of Rochester, U.S.A., 1996.
  • [16] Martinoli, A., Mondada, F.: Collective and cooperative group behaviours: Biologically inspired experiments in robotics. In: O. Khatib, J.K. Salisbury (Eds.), Proc. 4th Int. Symp. on Experimental Robotics (ISER'95), Lecture Notes in Control and Information Sciences, 223, Springer-Verlag, Berlin, 1997, 3-10.
  • [17] Metropolis, N., Ulam, S.: The Monte Carlo method, Journal of the American Statistical Association, 44(247), 1949, 335-341.
  • [18] Mikhailov, G.A.: Monte-Carlo method. In: M. Hazewinkel (Ed.), Encyclopedia of Mathematics, 4, Kluwer Academic Pub., Dordrecht, 1995.
  • [19] Nguyen, H.S.: Discretization of Real Value Attributes: Boolean Reasoning Approach, Ph.D. Thesis, supervisor: A. Skowron, Warsaw University, 1997.
  • [20] Nguyen, S.H., Bazan, J., Skowron, A., Nguyen, H.S.: Layered learning for concept synthesis, Transactions on Rough Sets I, LNCS 3100, 2004, 187-208.
  • [21] Nguyen Thi, S.H.: Regularity Analysis and Its Applications in Data Mining, Doctoral Thesis, supervisor: Bogdan S. Chlebus, Faculty of Mathematics, Computer Science and Mechanics, Warsaw University, July 1999.
  • [22] Orłowska, E.: Semantics of Vague Concepts. Applications of Rough Sets. Institute for Computer Science, Polish Academy of Sciences Report 469, March 1982.
  • [23] Pawlak, Z.: Classification of Objects by Means of Attributes. Institute for Computer Science, Polish Academy of Sciences Report 429, March 1981.
  • [24] Pawlak, Z.: Rough Sets. Institute for Computer Science, Polish Academy of Sciences Report 431, March 1981.
  • [25] Pawlak, Z.: Rough sets, International J. Comp. Inform. Science, 11, 1982, 341-356.
  • [26] Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data, Theory and Decision Library, Series D: System Theory, Knowledge Engineering and Problem Solving, vol. 9, Kluwer Academic Pub., Dordrecht, 1991.
  • [27] Peters, J.F.: Approximation space for intelligent system design patterns. Engineering Applications of Artificial Intelligence, 17(4), 2004, 1-8.
  • [28] Peters, J.F., Ramanna, S.: Measuring acceptance of intelligent system models. In: M. Gh. Negoita et al. (Eds.), Knowledge-Based Intelligent Information and Engineering Systems, Lecture Notes in Artificial Intelligence, 3213, Part I, 2004, 764-771.
  • [29] Peters, J.F.: Approximation spaces for hierarchical intelligent behavioral system models. In: B. Dunin-Kęplicz, A. Jankowski, A. Skowron, M. Szczuka (Eds.), Monitoring, Security and Rescue Techniques in Multiagent Systems, Advances in Soft Computing, Physica-Verlag, Heidelberg, 2004, 13-30.
  • [30] Peters, J.F.: Rough ethology: Towards a Biologically-Inspired Study of Collective Behavior in Intelligent Systems with Approximation Spaces. Transactions on Rough Sets, III, LNCS 3400, 2005, 153-174.
  • [31] Peters, J.F., Skowron, A., Synak, P., Ramanna, S.: Rough sets and information granulation. In: Bilgiç, T., De Baets, B., Kaynak, O. (Eds.), Tenth Int. Fuzzy Systems Assoc. World Congress (IFSA), Istanbul, Turkey, Lecture Notes in Artificial Intelligence 2715, Physica-Verlag, Heidelberg, 2003, 370-377.
  • [32] Peters, J.F., Ahn, T.C., Borkowski, M., Degtyaryov, V., Ramanna, S.: Line-crawling robot navigation: A rough neurocomputing approach. In: C. Zhou, D. Maravall, D. Ruan (Eds.), Autonomous Robotic Systems. Studies in Fuzziness and Soft Computing 116, Physica-Verlag, Heidelberg, 2003, 141-164.
  • [33] Peters, J.F., Henry, C., Ramanna, S.: Rough Ethograms: Study of Intelligent System Behavior. In: M.A. Kłopotek, S. Wierzchoń, K. Trojanowski (Eds.), New Trends in Intelligent Information Processing and Web Mining (IIS05), Gdańsk, Poland, June 13-16, 2005, 117-126.
  • [34] Peters, J.F., Henry, C., Ramanna, S.: Reinforcement learning with pattern-based rewards. In: Proc. Fourth Int. IASTED Conf. Computational Intelligence (CI 2005), Calgary, Alberta, Canada, 4-6 July 2005, 267-272.
  • [35] Peters, J.F., Henry, C., Ramanna, S.: Reinforcement Learning in Swarms that Learn, Research Report CIL01.01012005, Computational Intelligence Laboratory, University of Manitoba, 2005.
  • [36] Peters, J.F., Skowron, A., Stepaniuk, J., Ramanna, S.: Towards an ontology of approximate reason, Fundamenta Informaticae, 51(1,2), 2002, 157-173.
  • [37] Peters, J.F., Lockery, D., Ramanna, S.: Monte Carlo off-policy reinforcement learning: A rough set approach. In: Proc. Fifth Int. Conf. on Hybrid Intelligent Systems, Rio de Janeiro, Brazil, 6-9 Nov. 2005, 187-192.
  • [38] Peters, J.F.: Approximation spaces in off-policy Monte Carlo learning. Plenary paper in: T. Burczynski, W. Cholewa, W. Moczulski (Eds.), Recent Developments in Artificial Intelligence Methods, AI-METH Series, Gliwice, 2005, 139-144.
  • [39] Polkowski, L.: Rough Sets: Mathematical Foundations. Springer-Verlag, Heidelberg, 2002.
  • [40] Polkowski, L., Skowron, A.: Rough mereology: A new paradigm for approximate reasoning, International Journal of Approximate Reasoning, 15(4), 1997, 333-365.
  • [41] Polkowski, L., Skowron, A. (Eds.): Rough Sets in Knowledge Discovery 2, Studies in Fuzziness and Soft Computing 19, Springer-Verlag, Heidelberg, 1998.
  • [42] Precup, D., Sutton, R.S., Singh, S.: Eligibility traces for off-policy policy evaluation. In: Proc. 17th Conf. on Machine Learning (ICML 2000), Morgan Kaufmann, San Francisco, 2000, 1-8.
  • [43] Randlov, J.: Solving Complex Problems with Reinforcement Learning, Ph.D. Thesis, University of Copenhagen, 2001.
  • [44] Robert, C.P., Casella, G.: Monte Carlo Statistical Methods, 2nd Ed. Springer, Berlin, 2004.
  • [45] Rubinstein, R.Y.: Simulation and the Monte Carlo Method. John Wiley & Sons, Toronto, 1981.
  • [46] Rummery, G.A.: Problem Solving with Reinforcement Learning, Ph.D. Thesis, supervisor: Mahesan Niranjan, Trinity College, University of Cambridge, 26 July 1995.
  • [47] Skowron, A., Stepaniuk, J.: Generalized approximation spaces. In: Lin, T.Y., Wildberger, A.M. (Eds.), Soft Computing, Simulation Councils, San Diego, 1995, 18-21.
  • [48] Skowron, A., Swiniarski, R., Synak, P.: Approximation spaces and information granulation, Transactions on Rough Sets III, 2005, 175-189.
  • [49] Skowron, A.: Rough sets and vague concepts, Fundamenta Informaticae, 64(1-4), 2004, 417-431.
  • [50] Skowron, A., Bazan J., Stepaniuk, J.: Modelling complex patterns by information systems, Fundamenta Informaticae, 67(1-3), 2005, 203-217.
  • [51] Skowron, A., Stepaniuk, J., Peters, J.F., Swiniarski, R.: Calculi of approximation spaces, Fundamenta Informaticae, 2006, (to appear).
  • [52] Stepaniuk, J.: Approximation spaces, reducts and representatives. In: [41], 109-126.
  • [53] Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine Learning, 3, 1988, 9-44.
  • [54] Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction (Cambridge, MA: The MIT Press, 1998).
  • [55] Tinbergen, N.: Social Behavior in Animals, 2nd Ed., The Scientific Book Club, London, 1953, 1965.
  • [56] Tinbergen, N.: On aims and methods of ethology, Zeitschrift für Tierpsychologie 20: 410-433, 1963.
  • [57] Ulam, S.: On the Monte Carlo method. In: Proc. 2nd Symposium on Large-scale Digital Calculating Machinery, 1951, 207-212.
  • [58] Watkins, C.J.C.H.: Learning from Delayed Rewards, Ph.D. Thesis, supervisor: Richard Young, King's College, University of Cambridge, UK, May, 1989.
  • [59] Watkins, C.J.C.H., Dayan, P.: Technical note: Q-learning, Machine Learning, 8, 1992, 279-292.
YADDA identifier
bwmeta1.element.baztech-article-BUS2-0010-0041