Q-learning: from discrete to continuous representation

Věchet, S.; Krejsa, J.

Article - details

Article title

Q-learning: from discrete to continuous representation

Authors

Věchet S., Krejsa J.

Identifiers

Title variants

Metoda Q-learning: od reprezentacji dyskretnej do ciągłej

Conference

International Conference Mechatronics 2004 (5 ; 23-25.09.2004 ; Warsaw, Poland)

Publication languages

Abstracts

The Q-learning algorithm in its standard form is limited to discrete states and actions. To improve the quality of control, the algorithm must be modified to allow direct use of continuous variables. One possible way, presented in the paper, is to replace the table with a suitable approximator.

Algorytm metody Q-learning w swej standardowej formie jest ograniczony przez dyskretne stany i działania. W celu ulepszenia jakości sterowania algorytm ten trzeba zmodyfikować, aby umożliwić bezpośrednie wykorzystanie zmiennych ciągłych. Jednym z możliwych sposobów jest przedstawione w artykule zastąpienie tablicy odpowiednim aproksymatorem.
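
The idea stated in the abstract, replacing the Q-table with an approximator, can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a Gaussian-kernel weighted average over stored samples, a simpler stand-in for Locally Weighted Regression, and it assumes a one-dimensional state and action. The names `KernelQ` and `q_update`, the bandwidth, and the finite list of candidate actions used for the maximisation are all hypothetical choices made for illustration.

```python
import math


class KernelQ:
    """Memory-based Q approximator (illustrative stand-in for a Q-table).

    Predictions are Gaussian-kernel weighted averages of stored Q samples,
    so any continuous (state, action) pair can be queried directly.
    """

    def __init__(self, bandwidth=0.5, default=0.0):
        self.bandwidth = bandwidth  # kernel width (assumed value)
        self.default = default      # returned when no sample is close enough
        self.points = []            # stored (state, action) pairs
        self.values = []            # Q estimates stored at those pairs

    def predict(self, state, action):
        h2 = 2.0 * self.bandwidth ** 2
        weights = [
            math.exp(-((s - state) ** 2 + (a - action) ** 2) / h2)
            for s, a in self.points
        ]
        total = sum(weights)
        if total < 1e-12:
            return self.default
        return sum(w * q for w, q in zip(weights, self.values)) / total

    def add(self, state, action, target):
        self.points.append((state, action))
        self.values.append(target)


def q_update(q, state, action, reward, next_state, candidate_actions,
             alpha=0.5, gamma=0.9):
    """One Q-learning step that stores a new sample instead of writing a table cell."""
    # The max over actions is approximated on a finite candidate set (assumption).
    best_next = max(q.predict(next_state, a) for a in candidate_actions)
    old = q.predict(state, action)
    target = old + alpha * (reward + gamma * best_next - old)
    q.add(state, action, target)
    return target


q = KernelQ()
first_target = q_update(q, state=0.0, action=1.0, reward=1.0,
                        next_state=0.1, candidate_actions=[-1.0, 0.0, 1.0])
nearby_estimate = q.predict(0.05, 1.0)   # generalises to an unseen nearby state
far_estimate = q.predict(100.0, 100.0)   # no nearby sample, falls back to default
```

Because every update appends a sample, the memory grows without bound in this sketch; a practical version would need some pruning or merging of stored points.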

Keywords

Q-learning; machine learning; Locally Weighted Regression

metoda Q-learning; proces uczenia się maszynowego; regresja lokalnie ważona

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2004

Volume

Vol. 45, no. 8-9

Pages

12-14

Physical description

Bibliography: 3 items, charts.

Contributors

author

Věchet S.

Brno University of Technology, Faculty of Mechanical Engineering, Institute of Mechanics, Mechatronics and Biomechanics, Czech Republic

author

Krejsa J.

Institute of Thermomechanics, CAS, Mechatronics Centre Brno, Czech Republic

Bibliography

1. Atkeson C. G., Moore A. W., Schaal S.: Locally Weighted Learning, Technical Report, ATR Human Information Processing Laboratories, Japan, 1996.
2. Schaal S., Atkeson C. G.: Receptive Field Weighted Regression, Technical Report TR-H-209, ATR Human Information Processing Laboratories, Japan, 1997.
3. Věchet S., Krejsa J., Březina T.: Using Q-learning with LWA for Inverted Pendulum Control, Mechatronics, Robotics and Biomechanics 2003, pp. 91-92, Hrotovice, 2003.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BWA2-0011-0030