Search results - BazTech

1

Sterowanie autonomicznym bezzałogowym statkiem powietrznym z wykorzystaniem uczenia przez wzmacnianie [Control of an autonomous unmanned aerial vehicle using reinforcement learning]

Miera Paweł, Szolc Hubert, Kryjak Tomasz

Pomiary Automatyka Robotyka | 2023 | Vol. 27, no. 4 | pp. 85--91

PL (English translation)

Reinforcement learning is becoming increasingly important in robot control, and simulation plays a key role in this process. In the area of unmanned aerial vehicles (UAVs, including drones), the number of published papers that address this problem using this approach is growing. The article describes a system for autonomous drone control whose task is to fly in a given direction (according to the adopted reference frame) and avoid trees encountered in a forest, based on readings from a rotating LiDAR sensor. It was prepared using the Proximal Policy Optimization (PPO) algorithm, an example of reinforcement learning (RL). For this purpose, a custom simulator was developed in Python. The Gazebo environment, integrated with the Robot Operating System (ROS), was also used to test the resulting control algorithm. The solution was implemented on an Nvidia Jetson Nano eGPU and tested in real-world conditions. During these tests, the drone successfully completed the assigned tasks and was able to repeatably avoid trees while flying through the forest.

EN

Reinforcement learning is of increasing importance in robot control, and simulation plays a key role in this process. In the field of unmanned aerial vehicles (UAVs, including drones), the number of published scientific papers using this approach is also increasing. In this work, an autonomous drone control system was prepared whose task is to fly forward (according to its coordinate system) and avoid the trees encountered in a forest, based on data from a rotating LiDAR sensor. The Proximal Policy Optimization (PPO) algorithm, an example of reinforcement learning (RL), was used to prepare it. A custom simulator was developed in Python for this purpose. The Gazebo environment, integrated with the Robot Operating System (ROS), was also used to test the resulting control algorithm. Finally, the prepared solution was implemented on the Nvidia Jetson Nano eGPU and verified in real-world test scenarios. During these tests, the drone successfully completed the set task and was able to repeatably avoid trees while flying through the forest.
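The abstract above names PPO but gives no implementation details. As a rough illustration only, the sketch below computes the standard clipped surrogate objective that PPO maximises over a batch of samples. The function name, the argument names and the default clipping value of 0.2 are assumptions made for this example and are not taken from the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximised).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is a commonly used value, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice between the unclipped and the clipped term.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With identical old and new log-probabilities the ratio is 1 and the objective reduces to the mean advantage; once the ratio leaves the clipping range in the direction that would increase the objective, the sample stops contributing further gain.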

2

Stabilizer design of PSS3B based on the KH algorithm and Q-Learning for damping of low frequency oscillations in a single-machine power system

Mohamadi Farshid, Sedaghati Alireza

Journal of Power Technologies | 2023 | Vol. 103, no. 4 | pp. 230--242

EN

The aim of this study is to use reinforcement learning to generate a complementary signal that enhances the performance of the power system stabilizer. Reinforcement learning is an important branch of machine learning within artificial intelligence and a general approach to solving Markov Decision Process (MDP) problems. In this paper, a reinforcement learning-based control method, Q-learning, is presented and used to improve the performance of a 3-Band Power System Stabilizer (PSS3B) in a single-machine power system. To this end, the parameters of the 3-band power system stabilizer are first set by optimizing an eigenvalue-based objective function with the new KH optimization algorithm, and its efficiency is then improved in real time using the proposed Q-learning-based reinforcement learning algorithm. Fundamental features of the proposed reinforcement learning-based stabilizer are its simplicity and its independence of the system model and of changes in the operating point. To evaluate the efficiency of the proposed reinforcement learning-based 3-band power system stabilizer, its results are compared with those of the conventional power system stabilizer and of the 3-band power system stabilizer designed using the KH algorithm, under different operating points. The simulation results, based on the performance indicators, show that the power system stabilizer proposed in this study outperforms the two other methods in terms of reduced settling time and damping of low-frequency oscillations.
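The abstract above refers to Q-learning without showing the update rule. A minimal tabular sketch of the standard Q-learning update follows. The state and action labels, the learning rate and the discount factor are placeholders chosen for illustration; the paper's actual state encoding, action set and reward are not given in the abstract.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value.

    q: mapping (state, action) -> value, e.g. defaultdict(float).
    actions: all actions available in next_state (hypothetical set).
    alpha, gamma: learning rate and discount factor (assumed values).
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target and update.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: states and actions are illustrative labels only.
q_table = defaultdict(float)
signal_levels = ["decrease", "hold", "increase"]
q_learning_update(q_table, "s0", "hold", 1.0, "s1", signal_levels)
```

Because the update needs only observed transitions and rewards, it does not require a model of the controlled system, which matches the model independence mentioned in the abstract.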

3

Simplification of deep reinforcement learning in traffic control using the Bonsai Platform

Skuba Michał, Janota Aleš

Journal of civil engineering and transport | 2020 | Vol. 2, No. 4 | pp. 191--202

EN

The paper deals with the problem of traffic light control at a road intersection. The authors use a model of a real road junction created in the AnyLogic modelling tool. For two scenarios, three simulation experiments are performed: fixed-time control, fixed-time control after AnyLogic-based optimization, and dynamic control obtained through the cooperation of the AnyLogic tool and the Bonsai platform, exploiting the benefits of deep reinforcement learning. There is currently a trend to simplify machine learning processes as much as possible, making them accessible to practitioners with no artificial intelligence background and without the need to become data scientists. Project Bonsai provides an easy-to-use connector that allows AnyLogic models to be connected to the Bonsai platform, a novel approach to machine learning that does not require setting any hyperparameters. Because real operational data are unavailable, the model uses simulation data only and covers the presence and movement of vehicles only (no pedestrians). The optimization problem consists in minimizing the average time that agents (vehicles) must spend in the model while passing through the modelled intersection. Another observed parameter is the maximum time spent in the model by an individual vehicle. The authors share their practical, mainly methodological, experience with the simulation process and also indicate the economic cost of training.

PL (English translation)

The article addresses the problem of controlling traffic lights at road intersections. The authors use a model of a real road junction created in the AnyLogic modelling tool. For two scenarios, three simulation experiments are performed: fixed-time traffic light control, fixed-time traffic light control after AnyLogic-based optimization, and dynamic control obtained through cooperation between AnyLogic and the Bonsai platform, exploiting the benefits of deep reinforcement learning. There is currently a tendency to simplify machine learning processes as far as possible, so that they are accessible to practitioners with no experience in artificial intelligence and without the need to become data scientists. Project Bonsai is an easy-to-use connector that allows AnyLogic models to be used together with the Bonsai platform, a novel approach to machine learning that does not require setting hyperparameters. Because real operational data are unavailable, the model uses simulation data only, covering the presence and movement of vehicles only (no pedestrians). The optimization problem consists in minimizing the average time that agents (vehicles) must spend in the model while passing through the modelled intersection. Another observed parameter is the maximum time that individual vehicles spend in the model. The authors share their practical, mainly methodological, experience with the simulation process and indicate the economic cost of training.
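The two quantities tracked in this record, the average and the maximum time a vehicle spends in the model, can be computed from per-vehicle entry and exit timestamps. The sketch below is a plain illustration of that arithmetic; the function name and the timestamp lists are hypothetical and do not come from the AnyLogic model or the Bonsai platform.

```python
def travel_time_stats(entry_times, exit_times):
    """Return (average, maximum) time spent in the model per vehicle.

    entry_times / exit_times: matching lists of timestamps in seconds,
    one pair per vehicle (hypothetical simulation log format).
    """
    durations = [t_out - t_in for t_in, t_out in zip(entry_times, exit_times)]
    average = sum(durations) / len(durations)
    return average, max(durations)


# Three illustrative vehicles entering at 0 s, 10 s and 20 s.
avg_time, max_time = travel_time_stats([0, 10, 20], [30, 50, 40])
```

In a reinforcement learning setting, the negative of the average time could serve as a reward signal, although the abstract does not state how the reward was actually defined.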

4

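Zastosowanie algorytmów uczenia przez wzmacnianie w układzie wyznaczania trajektorii zadanej manewrującego statku [Application of reinforcement learning algorithms in a reference trajectory generation unit for a manoeuvring vessel]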

Rak A.

Zeszyty Naukowe Akademii Morskiej w Gdyni | 2013 | no. 78 | pp. 86--96

PL (English translation)

The article presents a concept for autonomous generation of the reference trajectory in an electronic-navigation vessel motion control system. The trajectory is determined from information about the target position of the vessel, supplied by the operator, and from the navigational situation, determined by a set of electronic navigation devices. The system operates using reinforcement learning algorithms. The article presents the principles of these algorithms in both the discrete and the continuous version, the latter with approximation of the state space. The resulting trajectory can be executed by a ship autopilot equipped with a multidimensional, nonlinear course and position controller.

EN

The paper presents the concept of an autonomous reference trajectory generation unit for a vessel motion control system. The reference trajectory is determined from information about the target position of the vessel, provided by the operator, and from the navigational situation, determined by the navigational equipment fitted on the vessel. The key data processing concept of the system relies on reinforcement learning algorithms. The paper presents the principles of selected RL algorithms in both discrete and continuous domains. The trajectory determined in the proposed module can be executed by a marine autopilot equipped with a multidimensional, nonlinear course and position controller.
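The record above contrasts discrete RL algorithms with continuous ones that approximate the state space, without naming a specific method. One common way to handle a continuous state is to replace the Q-table with a linear function of state features and apply a semi-gradient Q-learning update, sketched below. The feature vectors, action names and step sizes are assumptions for illustration and are not taken from the paper.

```python
def linear_q(weights, features):
    """Approximate Q-value as a dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def semi_gradient_q_update(weights, features, action, reward, next_features,
                           alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step with linear approximation.

    weights: dict mapping each action to its weight vector (list of floats).
    features / next_features: feature vectors of the current and next state
    (hypothetical encodings, e.g. scaled position and heading errors).
    Returns the temporal-difference error used for the update.
    """
    best_next = max(linear_q(w, next_features) for w in weights.values())
    td_error = reward + gamma * best_next - linear_q(weights[action], features)
    # For a linear approximator the gradient is the feature vector itself.
    weights[action] = [w + alpha * td_error * x
                       for w, x in zip(weights[action], features)]
    return td_error


# Hypothetical usage with two illustrative rudder commands.
rudder_weights = {"port": [0.0, 0.0], "starboard": [0.0, 0.0]}
semi_gradient_q_update(rudder_weights, [1.0, 0.5], "port", 1.0, [0.0, 1.0])
```

The tabular case is recovered when each state is encoded as a one-hot feature vector, which is one way to read the discrete and continuous variants as members of the same family.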