This paper applies reinforcement learning to the search for an optimal policy in the exploration problem (also known as the Jeep problem). Because the continuous formulation of the problem is unrealistic, the work concentrates on the discrete Jeep problem. The influence of the main learning parameters on learning speed is examined and described, and example policies found for different problem conditions are presented.