A proposal to use reinforcement learning to optimize decision-making in the field of counteracting money laundering and terrorist financing. Part 2

Kędzierski, Maciej Aleksander

doi:10.37055/nsz/188842

Article details

Article title (original)

Propozycja wykorzystania uczenia przez wzmacnianie w celu optymalizowania podejmowania decyzji w zakresie przeciwdziałania praniu pieniędzy oraz finansowania terroryzmu. Część 2

Selected full texts from this journal

http://nsz.wat.edu.pl/

Identifiers

DOI

10.37055/nsz/188842

Title variants

A proposal to use reinforcement learning to optimize decision-making in the field of counteracting money laundering and terrorist financing. Part 2

Publication languages

Abstract

Reinforcement learning is not limited to training a single agent; the method also extends to multi-agent settings. This matters because decision-making and information management in the AML/CFT system of an obligated institution are becoming increasingly complex. Consequently, applying reinforcement learning in this setting requires introducing multiple agents that interact both with the environment and with one another. Possible approaches include multi-agent reinforcement learning and a semi-independent policy training method with a shared representation for heterogeneous multi-agent reinforcement learning. Moreover, because the AML/CFT decision-making process draws on artificial intelligence only in a supporting role, the human factor remains essential in this management system. Reinforcement Learning from Human Feedback, which brings human input into the learning process, can serve as a starting point for meeting this need.

Keywords

reinforcement learning; money laundering; multi-agent; training set; feedback

Publisher

Wojskowa Akademia Techniczna im. Jarosława Dąbrowskiego

Journal

Nowoczesne Systemy Zarządzania

Year

2023

Volume

Vol. 18, no. 4

Pages

49–68

Physical description

Bibliography: 21 items, figures.

Contributors

author

Kędzierski Maciej Aleksander

sulawezi.mk@onet.eu

OIRP Warszawa, Poland

https://orcid.org/0000-0003-3074-1355

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-3914acc4-3b0d-4b7a-8545-b7ac4fc8db09