Wyniki wyszukiwania - BazTech

1

Metrics for assessing generalization of deep reinforcement learning in parameterized environments

Aleksandrowicz Maciej, Jaworek-Korjakowska Joanna

Journal of Artificial Intelligence and Soft Computing Research

|

2024

|

Vol. 14, No. 1

45--61

EN

In this work, a study focusing on proposing generalization metrics for Deep Reinforcement Learning (DRL) algorithms was performed. The experiments were conducted in DeepMind Control (DMC) benchmark suite with parameterized environments. The performance of three DRL algorithms in selected ten tasks from the DMC suite has been analysed with existing generalization gap formalism and the proposed ratio and decibel metrics. The results were presented with the proposed methods: average transfer metric and plot for environment normal distribution. These efforts allowed to highlight major changes in the model’s performance and add more insights about making decisions regarding models’ requirements.

2

CNC machine control using deep reinforcement learning

Kalandyk Dawid, Kwiatkowski Bogdan, Mazur Damian

Bulletin of the Polish Academy of Sciences. Technical Sciences

|

2024

|

Vol. 72, nr 3

art. no. e148940

EN

Optimization of industrial processes such as manufacturing or processing of specific materials constitutes a point of interest for many researchers, and its application can lead not only to speeding up the processes in question, but also to reducing the energy cost incurred during them. This article presents a novel approach to optimizing the spindle motion of a computer numeric control (CNC) machine. The proposed solution is to use deep learning with reinforcement to map the performance of the reference points realization optimization (RPRO) algorithm used in the industry. A detailed study was conducted to see how well the proposed method performs the targeted task. In addition, the influence of a number of different factors and hyperparameters of the learning process on the performance of the trained agent was investigated. The proposed solution achieved very good results, not only satisfactorily replicating the performance of the benchmark algorithm, but also speeding up the machining process and providing significantly higher accuracy.

3

Handling realistic noise in multi-agent systems with self-supervised learning and curiosity

Szemenyei Marton, Reizinger Patrik

Journal of Artificial Intelligence and Soft Computing Research

|

2022

|

Vol. 12, No. 2

135--148

EN

Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multiagent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. Our extensive experiments show that the proposed methods achieve state-of-the-art performance, compared against actor-critic methods, ICM, and PPO.

4

A hybrid control strategy for a dynamic scheduling problem in transit networks

Liu Zhongshan, Yu Bin, Zhang Li, Wang Wensi

International Journal of Applied Mathematics and Computer Science

|

2022

|

Vol. 32, no. 4

553--567

EN

Public transportation is often disrupted by disturbances, such as the uncertain travel time caused by road congestion. Therefore, the operators need to take real-time measures to guarantee the service reliability of transit networks. In this paper, we investigate a dynamic scheduling problem in a transit network, which takes account of the impact of disturbances on bus services. The objective is to minimize the total travel time of passengers in the transit network. A two-layer control method is developed to solve the proposed problem based on a hybrid control strategy. Specifically, relying on conventional strategies (e.g., holding, stop-skipping), the hybrid control strategy makes full use of the idle standby buses at the depot. Standby buses can be dispatched to bus fleets to provide temporary or regular services. Besides, deep reinforcement learning (DRL) is adopted to solve the problem of continuous decision-making. A long short-term memory (LSTM) method is added to the DRL framework to predict the passenger demand in the future, which enables the current decision to adapt to disturbances. The numerical results indicate that the hybrid control strategy can reduce the average headway of the bus fleet and improve the reliability of bus service.

5

An automated driving strategy generating method based on WGAIL–DDPG

Zhang Mingheng, Wan Xing, Gang Longhui, Lv Xinfei, Wu Zengwen, Liu Zhaoyang

International Journal of Applied Mathematics and Computer Science

|

2021

|

Vol. 31, no. 3

461--470

EN

Reliability, efficiency and generalization are basic evaluation criteria for a vehicle automated driving system. This paper proposes an automated driving decision-making method based on the Wasserstein generative adversarial imitation learning–deep deterministic policy gradient (WGAIL–DDPG(λ)). Here the exact reward function is designed based on the requirements of a vehicle’s driving performance, i.e., safety, dynamic and ride comfort performance. The model’s training efficiency is improved through the proposed imitation learning strategy, and a gain regulator is designed to smooth the transition from imitation to reinforcement phases. Test results show that the proposed decision-making model can generate actions quickly and accurately according to the surrounding environment. Meanwhile, the imitation learning strategy based on expert experience and the gain regulator can effectively improve the training efficiency for the reinforcement learning model. Additionally, an extended test also proves its good adaptability for different driving conditions.

6

Simplification of deep reinforcement learning in traffic control using the Bonsai Platform

Skuba Michał, Janota Aleš

Journal of civil engineering and transport

|

2020

|

Vol. 2, No. 4

191--202

EN

The paper deals with the problem of traffic light control of road intersection. The authors use a model of a real road junction created in the AnyLogic modelling tool. For two scenarios, there are three simulation experiments performed – fixed time control, fixed time control after AnyLogic-based optimizations, and dynamic control obtained through the cooperation of the AnyLogic tool and the Bonsai platform, utilizing benefits of deep reinforcement learning. At present, there are trends to simplify machine learning processes as much as possible to make them accessible to practitioners with no artificial intelligence background and without the need to become data scientists. Project Bonsai represents an easy-to-use connector, that allows to use AnyLogic models connected to the Bonsai platform - a novel approach to machine learning without the need to set any hyper-parameters. Due to unavailability of real operational data, the model uses simulation data only, with presence and movement of vehicles only (no pedestrians). The optimization problem consists in minimizing the average time that agents (vehicles) must spend in the model, passing the modelled intersection. Another observed parameter is the maximum time of individual vehicles spent in the model. The authors share their practical, mainly methodological, experiences with the simulation process and indicate economic cost needed for training as well.

PL

Artykuł dotyczy problemu sterowania sygnalizacją świetlną na skrzyżowaniach dróg. Autorzy wykorzystują model rzeczywistego węzła drogowego utworzony w narzędziu do modelowania AnyLogic. Dla dwóch scenariuszy wykonywane są trzy eksperymenty symulacyjne - sterowanie światłami sygnalizacyjnymi o stałym czasie działania, sterowanie światłami sygnalizacyjnymi o stałym czasie działania po optymalizacji w oparciu o AnyLogic, i sterowanie dynamiczne dzięki współpracy między AnyLogic i platformą Bonsai, wykorzystując korzyści płynące z uczenia się przez głębokie wzmocnienie. Obecnie istnieją tendencje do maksymalnego upraszczania procesów uczenia maszynowego, aby były dostępne dla praktyków bez doświadczenia w zakresie sztucznej inteligencji i bez konieczności zostania naukowcami danych. Project Bonsai to łatwe w obsłudze złącze, które pozwala na korzystanie z modeli AnyLogic podłączonych do platformy Bonsai - nowatorskie podejście do uczenia maszynowego bez konieczności ustawiania hiperparametrów. Ze względu na niedostępność rzeczywistych danych eksploatacyjnych model wykorzystuje tylko dane symulacyjne, tylko z obecnością i ruchem pojazdów (bez pieszych). Problem optymalizacji polega na zminimalizowaniu średniego czasu, jaki agenci (pojazdy) muszą spędzać w modelu, mijając modelowane skrzyżowanie. Kolejnym obserwowanym parametrem jest maksymalny czas przebywania poszczególnych pojazdów w modelu. Autorzy dzielą się praktycznymi, głównie metodologicznymi, doświadczeniami związanymi z procesem symulacji oraz wskazują koszty ekonomiczne potrzebne do uczenia.

7

Multi agent deep learning with cooperative communication

Simões David, Lau Nuno, Reis Luís Paulo

Journal of Artificial Intelligence and Soft Computing Research

|

2020

|

Vol. 10, No. 3

189--207

EN

We consider the problem of multi agents cooperating in a partially-observable environment. Agents must learn to coordinate and share relevant information to solve the tasks successfully. This article describes Asynchronous Advantage Actor-Critic with Communication (A3C2), an end-to-end differentiable approach where agents learn policies and communication protocols simultaneously. A3C2 uses a centralized learning, distributed execution paradigm, supports independent agents, dynamic team sizes, partiallyobservable environments, and noisy communications. We compare and show that A3C2 outperforms other state-of-the-art proposals in multiple environments.

8

Deep reinforcement learning overview of the state of the art

Fenjiro Y., Benbrahim H.

Journal of Automation Mobile Robotics and Intelligent Systems

|

2018

|

Vol. 12, No. 3

20--39

EN

Artificial intelligence has made big steps forward with reinforcement learning (RL) in the last century, and with the advent of deep learning (DL) in the 90s, especially, the breakthrough of convolutional networks in computer vision field. The adoption of DL neural networks in RL, in the first decade of the 21 century, led to an end-toend framework allowing a great advance in human-level agents and autonomous systems, called deep reinforcement learning (DRL). In this paper, we will go through the development Timeline of RL and DL technologies, describing the main improvements made in both fields. Then, we will dive into DRL and have an overview of the state-ofthe- art of this new and promising field, by browsing a set of algorithms (Value optimization, Policy optimization and Actor-Critic), then, giving an outline of current challenges and real-world applications, along with the hardware and frameworks used. In the end, we will discuss some potential research directions in the field of deep RL, for which we have great expectations that will lead to a real human level of intelligence.