Article title
Content
Full texts:
Identifiers
Title variants
Languages of publication
Abstracts
A honeypot is used to attract and monitor attacker activities and to capture valuable information that can be used to support sound cybersecurity practice. Predictive modelling of a honeypot system based on a Markov decision process (MDP) and a partially observable Markov decision process (POMDP) is performed in this paper. Analyses are conducted over both a finite planning horizon and an infinite planning horizon for a discounted MDP. Four methods are used in the infinite-horizon analysis of the discounted MDP: value iteration (VI), policy iteration (PI), linear programming (LP), and Q-learning. The results of the four methods are compared to evaluate the validity of the created MDP model and of its parameters. The optimal policy that maximises the total expected reward of the states of the honeypot system is obtained from the MDP model employed. In the infinite-horizon modelling of the discounted POMDP of the honeypot system, the effects of the observation probability of receiving commands, the probability of attacking the honeypot, the probability of the honeypot being disclosed, and the transition rewards on the total expected reward of the honeypot system are studied.
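The infinite-horizon MDP analysis described above relies on value iteration, policy iteration, linear programming, and Q-learning. As an illustration only, the sketch below shows value iteration for a generic discounted MDP; the two-state transition probabilities, rewards, and discount factor are hypothetical placeholders and are not the honeypot model or the parameters used in the paper.

```python
import numpy as np

# Minimal sketch of value iteration for a generic discounted MDP.
# P, R, and gamma below are hypothetical placeholders, not the paper's honeypot model.

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """P[a, s, s'] = transition probability, R[a, s] = expected immediate reward."""
    _, n_states = R.shape
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * V(s')
        Q = R + gamma * np.einsum('ast,t->as', P, V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)  # optimal state values and a greedy policy
        V = V_new

# Tiny two-state, two-action example with made-up numbers.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
V_opt, policy = value_iteration(P, R)
print("Optimal values:", V_opt, "Greedy policy:", policy)
```

Solving the same hypothetical P and R with policy iteration, a linear programme, or tabular Q-learning should converge to the same optimal policy, which is the kind of cross-check between methods that the abstract describes.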
Journal
Year
Volume
Pages
32–50
Physical description
Bibliography: 31 items, figures, tables, charts
Authors
author
- Institute for Systems Engineering Research, Mississippi State University, Mississippi, USA
author
- Institute for Systems Engineering Research, Mississippi State University, Mississippi, USA
author
- Institute for Systems Engineering Research, Mississippi State University, Mississippi, USA
author
- Institute for Systems Engineering Research, Mississippi State University, Mississippi, USA
Bibliography
- [1] T.M. McKenzie, Is Cyber Deterrence Possible?, Air University Press, 2017.
- [2] S. Srujana, P. Sreeja, G. Swetha, H. Shanmugasundaram, "Cutting edge technologies for improved cybersecurity model: A survey," in International Conference on Applied Artificial Intelligence and Computing (ICAAIC), 2022, pp. 1392–1396.
- [3] L. Vokorokos, A. Pekár, N. Ádám, P. Darányi, "Yet another attempt in user authentication," Acta Polytechnica Hungarica, vol. 10, no. 3, pp. 37–50, 2013.
- [4] J. Palša, J. Hurtuk, E. Chovancová, M. Havira, "Configuration honeypots with an emphasis on logging of the attacks and redundancy," in IEEE 20th Jubilee World Symposium on Applied Machine Intelligence and Informatics (SAMI), 2022, pp. 000073–000076, doi: 10.1109/SAMI54271.2022.9780801.
- [5] F. Franzen, L. Steger, J. Zirngibl, P. Sattler, "Looking for honey once again: Detecting RDP and SMB honeypots on the Internet," in IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2022.
- [6] M. Boffa, G. Milan, L. Vassio, I. Drago, M. Mellia, Z.B. Houidi, "Towards NLP-based processing of honeypot logs," in IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2022, pp. 314–321.
- [7] Z. Shamsi, D. Zhang, D. Kyoung, A. Liu, "Measuring and Clustering Network Attackers using Medium-Interaction Honeypots," in IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2022, pp. 294–306.
- [8] X. Liu, H. Zhang, S. Dong, Y. Zhang, "Network Defense Decision-Making Based on a Stochastic Game System and a Deep Recurrent Q-Network," Computers & Security, vol. 111, p. 102480, 2021, doi: 10.1016/j.cose.2021.102480.
- [9] H. Itoh, H. Nakano, R. Tokushima, H. Fukumoto, "A Partially Observable Markov Decision Process-Based Blackboard Architecture for Cognitive Agents in Partially Observable Environments," IEEE Transactions on Cognitive and Developmental Systems, 2020, doi: 10.1109/TCDS.2020.3034428.
- [10] M. Haklidir, H. Temeltaş, "Guided Soft Actor Critic: A Guided Deep Reinforcement Learning Approach for Partially Observable Markov Decision Processes," IEEE Access, vol. 9, pp. 159672–159683, 2021, doi: 10.1109/ACCESS.2021.3131772.
- [11] A.R. Cassandra, "A Survey of POMDP Applications," 2003. [Online]. Available: http://www.cassandra.org/arc/papers/applications.pdf. [Accessed: Nov. 27, 2022].
- [12] O. Hayatle, H. Otrok, A. Youssef, "A Markov Decision Process Model for High Interaction Honeypots," Information Security Journal: A Global Perspective, vol. 22, no. 4, pp. 159–170, 2013.
- [13] M. Mohri, A. Rostamizadeh, A. Talwalkar, Foundations of Machine Learning. Cambridge, Massachusetts: MIT Press, 2012.
- [14] M.A. Alsheikh, D.T. Hoang, D. Niyato, H.P. Tan, S. Lin, "Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey," IEEE Communications Surveys & Tutorials, vol. 17, no. 3, pp. 1239–1267, 2015, doi: 10.1109/COMST.2015.2420686.
- [15] Y. Chen, J. Hong, C.C. Liu, "Modeling of Intrusion and Defense for Assessment of Cyber Security at Power Substations," IEEE Transactions on Smart Grid, vol. 9, no. 4, pp. 2541–2552, 2018.
- [16] R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction. Cambridge, Massachusetts: MIT Press, 2018.
- [17] M. van Otterlo, "Markov Decision Processes: Concepts and Algorithms," in Reinforcement Learning: Adaptation, Learning, and Optimization, vol. 12, M. Wiering and M. van Otterlo, Eds. Berlin, Heidelberg: Springer, doi: 10.1007/978-3-642-27645-3_1.
- [18] M. van Otterlo, M. Wiering, "Reinforcement Learning and Markov Decision Processes," in Reinforcement Learning: Adaptation, Learning, and Optimization, vol. 12, M. Wiering and M. van Otterlo, Eds. Berlin, Heidelberg: Springer, pp. 3–42, doi: 10.1007/978-3-642-27645-3_1.
- [19] O. Sigaud, O. Buffet, Markov Decision Processes in Artificial Intelligence. Hoboken, New Jersey: John Wiley & Sons, 2013.
- [20] S.J. Majeed, M. Hutter, "On Q-learning Convergence for Non-Markov Decision Processes," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2018, pp. 2546–2552, doi: 10.24963/ijcai.2018/353.
- [21] E. Zanini, Markov Decision Processes. Berlin, Heidelberg: Springer, 2014.
- [22] Y. Liu, H. Liu, B. Wang, "Autonomous Exploration for Mobile Robot Using Q-Learning," in Proceedings of the 2nd International Conference on Advanced Robotics and Mechatronics (ICARM), 2017, pp. 614–619, doi: 10.1109/ICARM.2017.8273233.
- [23] G.E. Monahan, "State of the Art–A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms," Management Science, vol. 28, no. 1, pp. 1–16, 1982, doi: 10.1287/mnsc.28.1.1.
- [24] M.L. Littman, A.R. Cassandra, L.P. Kaelbling, "Efficient Dynamic-Programming Updates in Partially Observable Markov Decision Processes," Technical Report CS-95-19, Department of Computer Science, Brown University, Providence, Rhode Island, 1995.
- [25] S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. London: Pearson, 2010.
- [26] H. Kurniawati, D. Hsu, W.S. Lee, "SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces," in Robotics: Science and Systems, 2008.
- [27] J. Pineau, G. Gordon, S. Thrun, "Point-Based Value Iteration: An Anytime Algorithm for POMDPs," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), vol. 3, 2003, pp. 1025–1032.
- [28] E.J. Sondik, The Optimal Control of Partially Observable Markov Processes. Stanford, California: Stanford University, 1971.
- [29] N.L. Zhang, W. Liu, "Planning in Stochastic Domains: Problem Characteristics and Approximation," Technical Report HKUST-CS96-31, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, 1996.
- [30] A.R. Cassandra, M.L. Littman, N.L. Zhang, "Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes," arXiv preprint arXiv:1302.1525, 2013.
- [31] R.I. Brafman, "A Heuristic Variable Grid Solution Method for POMDPs," in Proceedings of the AAAI/IAAI, 1997, pp. 727–733.
Notes
Record developed with funds from the Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-636ad7fe-a0b8-4e3c-b161-377302f12902