Searching for an optimal exploration strategy using reinforcement learning

Pluciński, M.

Article - details

Article title

Poszukiwanie optymalnej strategii eksploracji z zastosowaniem uczenia ze wzmocnieniem (Searching for an optimal exploration strategy using reinforcement learning)

Authors

Pluciński M.

Abstract

The paper presents an application of reinforcement learning to the search for an optimal policy in an exploration problem (also known as the Jeep problem). The continuous problem is unrealistic, so the work concentrated mainly on the discrete Jeep problem. The influence of the main learning parameters on the learning speed is examined and described, and some example policies found for different problem conditions are presented.

Keywords

reinforcement learning; exploration problem; TD algorithm; Jeep problem

Publisher

Komisja Informatyki Polskiej Akademii Nauk, Oddział w Gdańsku

Journal

Metody Informatyki Stosowanej

Year

2008

Volume

No. 1 (Vol. 13)

Pages

127-137

Physical description

Bibliography: 16 items, figures, tables.

Contributors

author

Pluciński M.

Politechnika Szczecińska, Wydział Informatyki

Bibliography

[1] Ball W.W.R., Coxeter H.S.M. Mathematical Recreations and Essays, 13th edn, New York: Dover, 1987.
[2] Brauer W., Brauer U. Reconsidering the Jeep Problem - or how to transport a birthday present to Salosauna, Proceedings of the Colloquium in Honor of Arto Salomaa on Results and Trends in Theoretical Computer Science, pp. 30-33, 1994.
[3] Cichosz P. Systemy uczące się, Wyd. Naukowo-Techniczne, Warszawa, 2000.
[4] Fine N.J. The Jeep Problem, The American Mathematical Monthly, vol. 54, no. 1, pp. 24-31, 1947.
[5] Franklin J.N. The range of a fleet of aircraft, Journal of the Society for Industrial and Applied Mathematics, vol. 8, no. 3, pp. 541-548, 1960.
[6] Gale D. The Jeep Once More or Jeeper by the Dozen, The American Mathematical Monthly, vol. 77, no. 5, pp. 493-501, 1970.
[7] Giffen W.J. Deterministic and stochastic extensions of the Jeep problem, PhD dissertation, Purdue University, 2004.
[8] Imada A. Can learning robot solve a 2-D jeep problem? Proceedings of the International Symposium on Autonomous Minirobots for Research and Edutainment (AMiRE'2007), Buenos Aires, Argentina, 2007.
[9] Kaelbling L.P., Littman M.L., Moore A.W. Reinforcement learning: A survey, Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[10] Klęsk P. The Jeep Problem, searching for the best strategy with a genetic algorithm, Information Processing and Security Systems, vol. 3, Springer US, pp. 453-464, 2005.
[11] Phipps C.G. The Jeep Problem: A more general solution, The American Mathematical Monthly, vol. 54, pp. 458-462, 1947.
[12] Pluciński M. Application of the probabilistic RBF neural network in the reinforcement learning of a mobile robot, Proceedings of the 14th International Multi-Conference on "Advanced Computer Systems", Międzyzdroje, Poland, 2007.
[13] Rote G., Zhang G. Optimal logistics for expeditions - the jeep problem with complete refilling, 'Spezialforschungsbereich F 003' Technical Report No. 71, TU Graz, Austria, 1996.
[14] Sutton R.S. Learning to predict by the methods of temporal differences, Machine Learning, vol. 3, pp. 9-44, 1988.
[15] Sutton R.S., Barto A.G. Reinforcement learning: An introduction, The MIT Press, 1998.
[16] Tesauro G. Practical issues in temporal difference learning, Machine Learning, vol. 8, pp. 257-277, 1992.
Wolfram Web Resource, http://mathworld.wolfram.com/JeepProblem.html.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPS3-0010-0012