Wyniki wyszukiwania - BazTech

Ograniczanie wyników

2 2021

Powiadomienia systemowe

Sesja wygasła!

Znaleziono wyników: 2

Liczba wyników na stronie

Wyniki wyszukiwania

Wyszukiwano:
w słowach kluczowych: uczenie przez naśladowanie

Sortuj według:

Ogranicz wyniki do:

Analiza możliwości wykorzystania algorytmów uczenia maszynowego w środowisku Unity

Litwynenko Karina, Plechawska-Wójcik Małgorzata

Journal of Computer Sciences Institute

2021

Vol. 20

197--204

Algorytmy uczenia ze wzmocnieniem zyskują coraz większą popularność, a ich rozwój jest możliwy dzięki istnieniu narzędzi umożliwiających ich badanie. Niniejszy artykuł dotyczy możliwości zastosowania algorytmów uczenia maszynowego na platformie Unity wykorzystującej bibliotekę Unity ML-Agents Toolkit. Celem badania było porównanie dwóch algorytmów: Proximal Policy Optimization oraz Soft Actor-Critic. Zweryfikowano również możliwość poprawy wyników uczenia poprzez łączenie tych algorytmów z metodą uczenia przez naśladowanie Generative Adversarial Imitation Learning. Wyniki badania wykazały, że algorytm PPO może sprawdzić się lepiej w nieskomplikowanych środowiskach o nienatychmiastowym charakterze nagród, zaś dodatkowe zastosowanie GAIL może wpłynąć na poprawę skuteczności uczenia.

Reinforcement learning algorithms are gaining popularity, and their advancement is made possible by the presence of tools to evaluate them. This paper concerns the applicability of machine learning algorithms on the Unity platform using the Unity ML-Agents Toolkit library. The purpose of the study was to compare two algorithms: Proximal Policy Optimization and Soft Actor-Critic. The possibility of improving the learning results by combining these algorithms with Generative Adversarial Imitation Learning was also verified. The results of the study showed that the PPO algorithm can perform better in uncomplicated environments with non-immediate rewards, while the additional use of GAIL can improve learning performance.

An automated driving strategy generating method based on WGAIL–DDPG

Zhang Mingheng, Wan Xing, Gang Longhui, Lv Xinfei, Wu Zengwen, Liu Zhaoyang

International Journal of Applied Mathematics and Computer Science

2021

Vol. 31, no. 3

461--470

Reliability, efficiency and generalization are basic evaluation criteria for a vehicle automated driving system. This paper proposes an automated driving decision-making method based on the Wasserstein generative adversarial imitation learning–deep deterministic policy gradient (WGAIL–DDPG(λ)). Here the exact reward function is designed based on the requirements of a vehicle’s driving performance, i.e., safety, dynamic and ride comfort performance. The model’s training efficiency is improved through the proposed imitation learning strategy, and a gain regulator is designed to smooth the transition from imitation to reinforcement phases. Test results show that the proposed decision-making model can generate actions quickly and accurately according to the surrounding environment. Meanwhile, the imitation learning strategy based on expert experience and the gain regulator can effectively improve the training efficiency for the reinforcement learning model. Additionally, an extended test also proves its good adaptability for different driving conditions.