Solving Markov decision processes by d-graph algorithms

Kátai, Z.

Artykuł - szczegóły

Tytuł artykułu

Solving Markov decision processes by d-graph algorithms

Autorzy

Kátai Z.

Treść / Zawartość

Pełne teksty:

Pobierz

Identyfikatory

Warianty tytułu

Języki publikacji

Abstrakty

Markov decision processes (MDPs) provide a mathematical model for sequential decisionmaking (sMDP/dMDP: stochastic/ deterministic MDP). We introduce the concept of generalized dMDP (g-dMDP) where each action may result in more than one next (parallel or clone) state. The common tools to represent dMDPs are digraphs, but these are inadequate for sMDPs and g-dMDPs. We introduce d-graphs as general tools to represent all the above mentioned processes (stationary versions). We also present a combined d-graph algorithm that implements dynamic programming strategies to find optimal policies for the finite/infinite horizon versions of these Markov processes. (The preliminary version of this paper was presented at the Conference MACRo 2011.)

Słowa kluczowe

Markov decision processes dynamic programming graph representation graph algorithms optimization problems

Wydawca

Systems Research Institute, Polish Academy of Sciences

Czasopismo

Control and Cybernetics

Rocznik

2012

Tom

Vol. 41, no. 3

Strony

577--593

Opis fizyczny

Bibliogr. 15 poz., wykr.

Twórcy

autor

Kátai Z.

Sapientia University, Tírgu Mures, Romania

Bibliografia

BELLMAN, R. E. (1957) Dynamic Programming. Princeton University Press.
CLEMPNER, J. (2005) Colored decision process petri nets: modeling, analysis and stability. International Journal of Applied Mathematics and Computer Science, 15 (3), 405-420.
GIMBERT, H. (2008) A Class of Markov Decision Processes with Pure and Stationary Optimal Strategies. Third Congress of the Game Theory Society. Evanston, Illinois, USA. Retrieved November 20, 2012, from http://www.labri.fr/perso/gimbert/recherche/games08.pdf
HOWARD, R. A. (1960) Dynamic Programming and Markov Processes. MIT Press, Cambridge, Massachusetts.
KATAI, Z. (2006) Dynamic programming and d-graphs. Studia Universitatis Babes-Bolyai, Informatics, 51 (2), 41-52.
KÁTAI, Z. and CSÍKI, A. (2009) Automated dynamic programming. Acta Universitatis Sapientiae, Informatica, 1 (2), 149-164.
KÁTAI, Z. and FŨLÕP, P. I. (2010) Modeling dynamic programming problems: Petri nets versus d-graphs. In: Proceedings of 8th International Conference on Applied Informatics (ICAI), 1, 217-226. Eger, Hungary.
KATAI, Z. (2010) Modelling dynamic programming problems by generalized d-graphs. Acta Universitatis Sapientiae Informatics, 2 (2), 210-230.
KRISTENSEN, A. R. (1996) Dynamic programming and Markov decision processes. Dina Notat, 49. Retrieved November 20, 2012, from Danish Informatics Network in the Agricultural Sciences Web Site:http://www.prodstyr.ihh.kvl.dk/pdf/notat49.pdf
LEW, A. (2002) A Petri net model for discrete dynamic programming. In: Proceedings of the 9th Bellman Continuum: International Workshop on Uncertain Systems and Soft Computing, 16-21. Beijing, China.
LEW, A. and MAUCH, H. (2004) Bellman nets: A Petri net model and tool for dynamic programming. In: Proceedings of 5th International Conference on Modelling, Computation and Optimization in Information Systems and Management Sciences (MCO), 241-248.
LEW, A. and MAUCH, H. (2007) Dynamic Programming, A Computational Tool. Studies in Computational Intelligence, 38. Springer.
MADANI, O., THORUP, M. and ZWICK, U. (2009) Discounted Deterministic Markov Decision Processes and Discounted All-Pairs Shortest Paths. In: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, 958-967.
MAUCH, H. (2006) DP2PN2Solver: A flexible dynamic programming solver software tool. Control and Cybernetics, 35(3), 687-702.
PUTERMAN, M. L. (1994) Markov Decision Processes. Wiley.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-article-BATC-0011-0119