Sterowanie autonomicznym bezzałogowym statkiem powietrznym z wykorzystaniem uczenia przez wzmacnianie

Miera, Paweł; Szolc, Hubert; Kryjak, Tomasz

doi:10.14313/PAR_250/85

Article details

Title variants

Control of an Autonomous Unmanned Aerial Vehicle Using Reinforcement Learning

Publication languages

Abstract

Reinforcement learning (RL) is of increasing importance in robot control, and simulation plays a key role in this process. In the field of unmanned aerial vehicles (UAVs, including drones), the number of published scientific papers that address this problem with this approach is also growing. This article presents an autonomous drone control system whose task is to fly in a given direction (according to the adopted reference frame) and avoid the trees encountered in a forest, based on readings from a rotating LiDAR sensor. The controller was prepared with the Proximal Policy Optimization (PPO) algorithm, an example of reinforcement learning. A custom simulator written in Python was developed for this purpose. The Gazebo environment, integrated with the Robot Operating System (ROS), was also used to test the resulting control algorithm. Finally, the solution was implemented on the Nvidia Jetson Nano eGPU and verified in real-world tests. In these tests, the drone successfully completed the set task and was able to repeatably avoid trees while flying through the forest.

Keywords

reinforcement learning, RL, drones, autonomous control, automatic control, ROS, Gazebo

Publisher

Sieć Badawcza Łukasiewicz - Przemysłowy Instytut Automatyki i Pomiarów PIAP

Journal

Pomiary Automatyka Robotyka

Year

2023

Volume

Vol. 27, No. 4

Pages

85-91

Physical description

Bibliography: 13 items, photographs, figures, tables, charts, formulas

Authors

author

Miera Paweł

miera@student.agh.edu.pl

AGH Akademia Górniczo-Hutnicza im. S. Staszica, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, Vision Systems Laboratory, Embedded Vision Systems Group, al. Mickiewicza 30, 30-059 Kraków, Poland

https://orcid.org/0009-0005-4690-0535

author

Szolc Hubert

szolc@agh.edu.pl

AGH Akademia Górniczo-Hutnicza im. S. Staszica, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, Vision Systems Laboratory, Embedded Vision Systems Group, al. Mickiewicza 30, 30-059 Kraków, Poland

https://orcid.org/0000-0003-3018-5731

author

Kryjak Tomasz

tomasz.kryjak@agh.edu.pl

AGH Akademia Górniczo-Hutnicza im. S. Staszica, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, Vision Systems Laboratory, Embedded Vision Systems Group, al. Mickiewicza 30, 30-059 Kraków, Poland

https://orcid.org/0000-0001-6798-4444

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-8fb422ff-775e-44ce-b0cd-a0e0d4aff5b8