PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

Solving Markov decision processes by d-graph algorithms

Autorzy
Treść / Zawartość
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
Markov decision processes (MDPs) provide a mathematical model for sequential decisionmaking (sMDP/dMDP: stochastic/ deterministic MDP). We introduce the concept of generalized dMDP (g-dMDP) where each action may result in more than one next (parallel or clone) state. The common tools to represent dMDPs are digraphs, but these are inadequate for sMDPs and g-dMDPs. We introduce d-graphs as general tools to represent all the above mentioned processes (stationary versions). We also present a combined d-graph algorithm that implements dynamic programming strategies to find optimal policies for the finite/infinite horizon versions of these Markov processes. (The preliminary version of this paper was presented at the Conference MACRo 2011.)
Rocznik
Strony
577--593
Opis fizyczny
Bibliogr. 15 poz., wykr.
Twórcy
autor
  • Sapientia University, Tírgu Mures, Romania
Bibliografia
  • BELLMAN, R. E. (1957) Dynamic Programming. Princeton University Press.
  • CLEMPNER, J. (2005) Colored decision process petri nets: modeling, analysis and stability. International Journal of Applied Mathematics and Computer Science, 15 (3), 405-420.
  • GIMBERT, H. (2008) A Class of Markov Decision Processes with Pure and Stationary Optimal Strategies. Third Congress of the Game Theory Society. Evanston, Illinois, USA. Retrieved November 20, 2012, from http://www.labri.fr/perso/gimbert/recherche/games08.pdf
  • HOWARD, R. A. (1960) Dynamic Programming and Markov Processes. MIT Press, Cambridge, Massachusetts.
  • KATAI, Z. (2006) Dynamic programming and d-graphs. Studia Universitatis Babes-Bolyai, Informatics, 51 (2), 41-52.
  • KÁTAI, Z. and CSÍKI, A. (2009) Automated dynamic programming. Acta Universitatis Sapientiae, Informatica, 1 (2), 149-164.
  • KÁTAI, Z. and FŨLÕP, P. I. (2010) Modeling dynamic programming problems: Petri nets versus d-graphs. In: Proceedings of 8th International Conference on Applied Informatics (ICAI), 1, 217-226. Eger, Hungary.
  • KATAI, Z. (2010) Modelling dynamic programming problems by generalized d-graphs. Acta Universitatis Sapientiae Informatics, 2 (2), 210-230.
  • KRISTENSEN, A. R. (1996) Dynamic programming and Markov decision processes. Dina Notat, 49. Retrieved November 20, 2012, from Danish Informatics Network in the Agricultural Sciences Web Site:http://www.prodstyr.ihh.kvl.dk/pdf/notat49.pdf
  • LEW, A. (2002) A Petri net model for discrete dynamic programming. In: Proceedings of the 9th Bellman Continuum: International Workshop on Uncertain Systems and Soft Computing, 16-21. Beijing, China.
  • LEW, A. and MAUCH, H. (2004) Bellman nets: A Petri net model and tool for dynamic programming. In: Proceedings of 5th International Conference on Modelling, Computation and Optimization in Information Systems and Management Sciences (MCO), 241-248.
  • LEW, A. and MAUCH, H. (2007) Dynamic Programming, A Computational Tool. Studies in Computational Intelligence, 38. Springer.
  • MADANI, O., THORUP, M. and ZWICK, U. (2009) Discounted Deterministic Markov Decision Processes and Discounted All-Pairs Shortest Paths. In: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, 958-967.
  • MAUCH, H. (2006) DP2PN2Solver: A flexible dynamic programming solver software tool. Control and Cybernetics, 35(3), 687-702.
  • PUTERMAN, M. L. (1994) Markov Decision Processes. Wiley.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-article-BATC-0011-0119
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.