The machine learning approach: analysis of experimental results

Poliscuk, J. E.

Article - details

Article title

The machine learning approach: analysis of experimental results

Authors

Poliscuk J. E.

Selected full texts from this journal

https://eczasopisma.p.lodz.pl/JACS/issue/archive

Abstracts

The article analyses a reinforcement learning method in which the subject of learning is defined. The essence of this method is the selection of actions by trial and error and the awarding of deferred rewards. The theoretical analysis is supplemented by practical studies based on an implementation of the Sarsa(Lambda) algorithm with replacing eligibility traces and an Epsilon-greedy policy.
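The combination named in the abstract can be illustrated with a minimal sketch of tabular Sarsa(Lambda) with replacing eligibility traces and an Epsilon-greedy policy. This is not the author's implementation, which this record does not reproduce. The corridor environment `chain_step`, the function names, and all parameter values (`alpha`, `gamma`, `lam`, `epsilon`, `max_steps`) are assumptions chosen only for illustration.

```python
import random

def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q[(state, a)] for a in range(n_actions)]
    best = max(values)
    # Break ties at random so the untrained agent still explores.
    return rng.choice([a for a, v in enumerate(values) if v == best])

def chain_step(state, action, n_states):
    """Hypothetical toy environment (an assumption, not from the article):
    a corridor whose goal is the rightmost state. Action 0 moves left,
    action 1 moves right; the only reward (1.0) is given on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def sarsa_lambda(n_states=6, n_actions=2, episodes=200, alpha=0.1,
                 gamma=0.9, lam=0.8, epsilon=0.1, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in range(n_actions)}
    for _ in range(episodes):
        e = {key: 0.0 for key in q}          # eligibility traces, reset per episode
        state = 0
        action = epsilon_greedy(q, state, n_actions, epsilon, rng)
        for _ in range(max_steps):           # cap the episode length as a safeguard
            nxt, reward, done = chain_step(state, action, n_states)
            nxt_action = epsilon_greedy(q, nxt, n_actions, epsilon, rng)
            # On-policy TD target: uses the action actually selected next.
            target = reward if done else reward + gamma * q[(nxt, nxt_action)]
            delta = target - q[(state, action)]
            e[(state, action)] = 1.0         # replacing trace: set to 1, do not accumulate
            for key in q:
                q[key] += alpha * delta * e[key]
                e[key] *= gamma * lam        # decay every trace
            if done:
                break
            state, action = nxt, nxt_action
    return q

if __name__ == "__main__":
    q = sarsa_lambda()
    for s in range(5):
        print(s, round(q[(s, 0)], 3), round(q[(s, 1)], 3))
```

The deferred reward reaches earlier state-action pairs through the traces: each TD error is applied to every pair in proportion to its current trace, so a single reward at the goal updates the whole recently visited path.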

Keywords

algorithm TD(0); algorithm TD(Lambda); Bellman equation; Markov decision making process; mechanism of eligibility traces; method of temporal difference learning; reinforcement learning method

algorytm TD 0; algorytm TD lambda; równanie Bellmana; decyzyjny proces Markowa; mechanizm śladów wybieralności; Temporal Difference Learning; uczenie na podstawie różnic w czasie; uczenie ze wzmocnieniem

Publisher

Wydawnictwo Politechniki Łódzkiej

Journal

Journal of Applied Computer Science

Year

2003

Volume

Vol. 11, No. 1

Pages

61-76

Physical description

Bibliography: 8 items.

Contributors

author

Poliscuk J. E.

Department of Electrical Engineering, University of Montenegro, Cetinjski put bb, Podgorica, Yugoslavia, jaroslav@Server1.cis.cg.ac.yu

Bibliography

[1] Boyan J. A., Littman M. L.: Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach, Advances in Neural Information Processing Systems: Proceedings of the 994 Conference, San Francisco, CA, USA, 1994.
[2] Doya K.: Reinforcement Learning in Continuous Time and Space, Neural Computation, 2000, Vol. 12, No. 1, pp. 219-246.
[3] Kaelbling L. P., Littman M. L., Moore A. W.: Reinforcement Learning: A Survey, Journal of Artificial Intelligence Research, 1996, Vol. 4, pp. 237-285.
[4] Lewis M. E., Puterman M. L.: A Probabilistic Analysis of Bias Optimality in Unichain Markov Decision Process, IEEE Transactions on Automatic Control, 2001, Vol. 46, No. 1, pp. 96-101.
[5] Poliscuk J. E.: A contribution to methodology of development of Decision Support Systems and Expert Systems, Doctoral Thesis, Faculty of Organization and Informatics, University of Zagreb, Croatia, 1992.
[6] Rolls E. T., Milward T., Wiskott L.: A Model of Invariant Object Recognition in the Visual System: Learning Rules, Activation Functions, Lateral Inhibition, and Information-Based Performance Measures, Neural Computation, 2000, Vol. 12, No. 11, pp. 2547-2573.
[7] Sutton R. S., Barto A. G.: Reinforcement Learning: An Introduction, MIT Press - Bradford Books, Cambridge, MA, USA, 1998.
[8] Szepesvari C., Littman M. L.: A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms, Neural Computation, 1999, Vol. 11, No. 8, pp. 2017-2061.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-LOD7-0027-0088