{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T04:13:38Z","timestamp":1777349618310,"version":"3.51.4"},"reference-count":32,"publisher":"Institute for Operations Research and the Management Sciences (INFORMS)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mathematics of OR"],"published-print":{"date-parts":[[2012,5]]},"abstract":"<jats:p> We consider Markov decision processes where the values of the parameters are uncertain. This uncertainty is described by a sequence of nested sets (that is, each set contains the previous one), each of which corresponds to a probabilistic guarantee for a different confidence level. Consequently, a set of admissible probability distributions of the unknown parameters is specified. This formulation models the case where the decision maker is aware of and wants to exploit some (yet imprecise) a priori information of the distribution of parameters, and it arises naturally in practice where methods for estimating the confidence region of parameters abound. We propose a decision criterion based on distributional robustness: the optimal strategy maximizes the expected total reward under the most adversarial admissible probability distributions. We show that finding the optimal distributionally robust strategy can be reduced to the standard robust MDP where parameters are known to belong to a single uncertainty set; hence, it can be computed in polynomial time under mild technical conditions. <\/jats:p>","DOI":"10.1287\/moor.1120.0540","type":"journal-article","created":{"date-parts":[[2012,4,20]],"date-time":"2012-04-20T00:13:29Z","timestamp":1334880809000},"page":"288-300","source":"Crossref","is-referenced-by-count":80,"title":["Distributionally Robust Markov Decision Processes"],"prefix":"10.1287","volume":"37","author":[{"given":"Huan","family":"Xu","sequence":"first","affiliation":[{"name":"Department of Mechanical Engineering, National University of Singapore, Singapore, 117576"}]},{"given":"Shie","family":"Mannor","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, Technion, Israel, 32000"}]}],"member":"109","reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.1109\/9.159584"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1109\/9.159585"},{"key":"B3","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-0805-2_4"},{"key":"B5","volume-title":"Thinking and Deciding","author":"Baron J","year":"2000"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6377(99)00016-4"},{"key":"B7","volume-title":"Neuro-Dynamic Programming","author":"Bertsekas DP","year":"1996"},{"key":"B8","volume-title":"Theory of Games and Statistical Decisions","author":"Blackwell D","year":"1954"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511804441"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.1007\/s10957-006-9084-x"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1080.0685"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1090.0741"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1016\/0005-1098(81)90047-9"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.1080\/17442508708833436"},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177729689"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-937X.2007.00464.x"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1016\/0304-4068(89)90018-9"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1090.0795"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-97881-4"},{"key":"B20","doi-asserted-by":"publisher","DOI":"10.1287\/moor.1040.0129"},{"key":"B21","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1515\/9783112479926-009","volume-title":"Advances in Mathematical Optimization","author":"Kall P","year":"1988"},{"key":"B22","doi-asserted-by":"publisher","DOI":"10.2307\/1969794"},{"issue":"3","key":"B23","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1093\/oxfordjournals.oep.a042139","volume":"46","author":"Kelsey D","year":"1994","journal-title":"Oxford Econom. Papers"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.1060.0614"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1050.0216"},{"key":"B26","doi-asserted-by":"publisher","DOI":"10.1287\/opre.1060.0353"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316887"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.1515\/9781400873173"},{"key":"B29","first-page":"201","volume-title":"Studies in Mathematical Theory of Inventory and Production","author":"Scarf H","year":"1958"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-005-0680-6"},{"key":"B31","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.39.10.1095"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.2140\/pjm.1958.8.171"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1287\/opre.42.4.739"}],"container-title":["Mathematics of Operations Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/pubsonline.informs.org\/doi\/pdf\/10.1287\/moor.1120.0540","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,2]],"date-time":"2023-04-02T12:03:21Z","timestamp":1680437001000},"score":1,"resource":{"primary":{"URL":"https:\/\/pubsonline.informs.org\/doi\/10.1287\/moor.1120.0540"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,5]]},"references-count":32,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,5]]}},"alternative-id":["10.1287\/moor.1120.0540"],"URL":"https:\/\/doi.org\/10.1287\/moor.1120.0540","relation":{},"ISSN":["0364-765X","1526-5471"],"issn-type":[{"value":"0364-765X","type":"print"},{"value":"1526-5471","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,5]]}}}