Results found: 2

Search results
Searched for:
in keywords: temporal difference learning
EN
This paper addresses an online 6D SLAM method for a tracked wheel robot in an unknown and unstructured environment. While the robot pose is represented by its position and orientation in 3D space, the environment is mapped with natural landmarks in the same space, collected autonomously using visual data from feature detectors. The observation model opportunistically employs features detected from either monocular or stereo vision. These features are represented using an inverse depth parametrization. The motion model uses odometry readings from motor encoders and orientation changes measured with an IMU. A dimension-bounded EKF (DBEKF) is introduced that keeps the dimension of the state bounded. A new landmark classifier based on a Temporal Difference Learning methodology identifies undesired landmarks to be removed from the state. By enforcing an upper bound on the number of landmarks in the EKF state, the computational complexity is bounded by a constant without compromising filter integrity. All experimental work was done using real data from RAPOSA-NG, a tracked wheel robot developed for Search and Rescue missions.
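The abstract does not give the classifier's internals, but the idea of TD-scored landmark pruning can be illustrated. The following is a minimal sketch, assuming a per-landmark quality value updated by a TD(0) rule from match/mismatch rewards; all class names, the reward scheme, and the parameter values are illustrative assumptions, not the paper's actual method.

```python
import numpy as np


class LandmarkPruner:
    """Sketch of TD-style landmark scoring for a dimension-bounded EKF state.

    Every name, reward, and constant here is an illustrative assumption,
    not the DBEKF paper's actual classifier.
    """

    def __init__(self, max_landmarks, alpha=0.1, gamma=0.9):
        self.max_landmarks = max_landmarks  # upper bound on landmarks in the state
        self.alpha = alpha                  # TD learning rate
        self.gamma = gamma                  # discount factor
        self.value = {}                     # learned quality per landmark id

    def td_update(self, lid, reward):
        # TD(0) update: move the landmark's value toward the one-step
        # return r + gamma * V. Each landmark is scored independently,
        # so the bootstrap target reuses its own current value.
        v = self.value.get(lid, 0.0)
        self.value[lid] = v + self.alpha * (reward + self.gamma * v - v)

    def observe(self, lid, matched):
        # Assumed reward scheme: +1 when the landmark is re-observed and
        # matched, -1 when a predicted observation fails.
        self.td_update(lid, 1.0 if matched else -1.0)

    def prune(self, landmark_ids):
        # When the state exceeds the bound, return the lowest-valued
        # landmarks so the caller can remove them and keep the EKF
        # covariance at a fixed, constant size.
        excess = len(landmark_ids) - self.max_landmarks
        if excess <= 0:
            return []
        ranked = sorted(landmark_ids, key=lambda i: self.value.get(i, 0.0))
        return ranked[:excess]
```

Dropping only the lowest-valued landmarks is what caps the EKF update cost at a constant: the covariance matrix never grows beyond a fixed dimension, regardless of how many features the detectors propose.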
EN
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented as weighted piece counters. CTDL is a randomized learning technique that interweaves two search processes operating in intra-game and inter-game modes. Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to differences observed between its values for consecutively visited game states. For the inter-game learning component, we provide a coevolutionary algorithm that maintains a sample of strategies and uses the outcomes of games played between them to iteratively modify the probability distribution according to which new strategies are generated and added to the sample. We analyze CTDL's sensitivity to all important parameters, including the trace decay constant that controls the lookahead horizon of TDL, and the relative intensity of intra-game and inter-game learning. We also investigate how the presence of memory (an archive) affects search performance, and find that the archive-based approach is superior to the other techniques considered here, producing strategies that outperform a handcrafted weighted piece counter strategy and simple liberty-based heuristics. This encouraging result can potentially generalize not only to other strategy representations for small-board Go, but also to various games and a broader class of problems, because CTDL is generic and does not rely on any problem-specific knowledge.
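The intra-game TDL component described above has a standard form: after each move, the evaluation weights are shifted along the gradient toward the value of the next visited state, with the trace decay constant controlling how far back credit propagates. Below is a minimal sketch, assuming a tanh-squashed weighted piece counter on a 5x5 board and illustrative parameter values; the function names, board encoding, and reward convention are assumptions, not the paper's code.

```python
import numpy as np

BOARD = 5 * 5  # small-board Go, 5x5 assumed for illustration


def evaluate(w, x):
    # Weighted piece counter squashed to (-1, 1): x[i] is +1 for a
    # black stone, -1 for white, 0 for empty (a common WPC convention).
    return np.tanh(np.dot(w, x))


def tdl_episode(w, states, outcome, alpha=0.01, lam=0.9):
    """One intra-game TD(lambda) pass over the visited states.

    `states` is the sequence of board vectors seen during the game and
    `outcome` is the final reward (+1 win, -1 loss). Parameter values
    are illustrative assumptions, not the paper's settings.
    """
    trace = np.zeros_like(w)  # eligibility trace over the weights
    for t in range(len(states)):
        x = states[t]
        v = evaluate(w, x)
        # Gradient of tanh(w.x) w.r.t. w is (1 - v^2) * x; accumulate it
        # into the trace. lam, the trace decay constant, sets how long
        # past positions stay eligible for credit (the lookahead horizon).
        trace = lam * trace + (1.0 - v * v) * x
        # TD error: difference between consecutive evaluations, with the
        # game outcome standing in for the value of the terminal state.
        v_next = outcome if t == len(states) - 1 else evaluate(w, states[t + 1])
        w += alpha * (v_next - v) * trace
    return w
```

A typical use would initialize `w = np.zeros(BOARD)` and call `tdl_episode` after every self-play game, while the coevolutionary layer separately varies and selects whole weight vectors between games.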