Reinforcement Learning in Ship Handling

Łącki, M.

Article details

Article title

Reinforcement Learning in Ship Handling

Authors

Łącki M.

Content

Full texts:

Reinforcement Learning in Ship Handling.pdf

Identifiers

Title variants

Publication languages

Abstracts

This paper presents the idea of using machine learning techniques to simulate and demonstrate learning behaviour in ship manoeuvring. A simulated ship model is treated as an agent that, by sensing its environment, learns to navigate through restricted waters and to select an optimum trajectory. In the learning phase, the agent observes the current state and chooses one of the available actions. The agent receives a positive reward for reaching the destination and a negative reward for hitting an obstacle. A few reinforcement learning algorithms are considered. Experimental results from a simulation program are presented for different layouts of possible routes within a restricted area.
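
The reward scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not say which algorithms, state representation, or reward magnitudes the paper uses, so everything here is an assumption made for illustration: tabular Q-learning stands in as one common reinforcement learning algorithm, the 4x4 layout is hypothetical, rewards are set to +1 and -1, and the names `GRID`, `step`, `train`, and `greedy_path` are invented for this example.

```python
import random

# Hypothetical layout (not from the paper):
# 'S' start, 'G' destination, '#' obstacle, '.' open water.
GRID = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
START = (0, 0)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    inside = 0 <= r < len(GRID) and 0 <= c < len(GRID[0])
    if not inside or GRID[r][c] == "#":
        return state, -1.0, True   # assumed penalty: hit an obstacle or the boundary
    if GRID[r][c] == "G":
        return (r, c), 1.0, True   # assumed reward: destination reached
    return (r, c), 0.0, False


def best_action(q, state, rng):
    """Pick an action with the highest Q-value, breaking ties at random."""
    values = [q.get((state, i), 0.0) for i in range(len(ACTIONS))]
    top = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == top])


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) to an estimated return
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))   # explore
            else:
                a = best_action(q, state, rng)    # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS))
            )
            old = q.get((state, a), 0.0)
            # Q-learning update: move the estimate toward r + gamma * max Q(s', .)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned greedy policy from START; return (path, last_reward)."""
    state, path = START, [START]
    for _ in range(max_steps):
        values = [q.get((state, i), 0.0) for i in range(len(ACTIONS))]
        state, reward, done = step(state, ACTIONS[values.index(max(values))])
        path.append(state)
        if done:
            return path, reward
    return path, 0.0
```

Calling `greedy_path(train())` returns the sequence of cells visited under the learned policy together with the final reward, which makes it easy to check whether the agent ends at the destination or collides.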

Keywords

Ship Handling; Reinforcement Learning; Machine Learning Techniques; Manoeuvring; Restricted Waters; Markov Decision Process (MDP); Artificial Neural Network (ANN); Multi-Agent Environment

Publisher

Faculty of Navigation, Gdynia Maritime University

Journal

TransNav : International Journal on Marine Navigation and Safety of Sea Transportation

Year

2008

Volume

Vol. 2, no. 2

Pages

157--160

Physical description

Bibliography: 6 items, figures.

Contributors

Author

Łącki M.

Gdynia Maritime University, Gdynia, Poland

Bibliography

1. Eden, T., Knittel, A. & Uffelen, R. 2002. Reinforcement Learning: Tutorial.
2. Kaelbling, L.P., Littman, M.L. & Moore, A.W. 1996. Reinforcement Learning: A Survey.
3. The Reinforcement Learning Repository, University of Massachusetts, Amherst.
4. Sutton, R. 1996. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. In Touretzky, D., Mozer, M. & Hasselmo, M. (Eds.), Neural Information Processing Systems 8.
5. Sutton, R. & Barto, A. 1998. Reinforcement Learning: An Introduction.
6. Tesauro, G. 1995. Temporal Difference Learning and TD-Gammon. Communications of the Association for Computing Machinery, vol. 38, no. 3.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-622854f3-e86b-445e-b90d-9a633ef86e8b