Value iteration for simple stochastic games: Stopping criterion and learning algorithm

The classical problem of reachability in simple stochastic games is typically solved by value iteration (VI), which produces a sequence of under-approximations of the value of the game, but is only guaranteed to converge in the limit. We provide an additional converging sequence of over-approximatio...

Detailed description

Bibliographic details
Published in:	Information and computation, vol. 285, p. 104886
Main authors:	Eisentraut, Julia; Kelmendi, Edon; Křetínský, Jan; Weininger, Maximilian
Format:	Journal Article
Language:	English
Published:	Elsevier Inc, 1 May 2022
Subjects:	Markov decision processes; Probabilistic verification; Reachability; Stochastic games; Value iteration
ISSN:	0890-5401, 1090-2651
Online access:	Full text

Abstract	The classical problem of reachability in simple stochastic games is typically solved by value iteration (VI), which produces a sequence of under-approximations of the value of the game, but is only guaranteed to converge in the limit. We provide an additional converging sequence of over-approximations, based on an analysis of the game graph. Together, these two sequences entail the first error bound and hence the first stopping criterion for VI on simple stochastic games, indicating when the algorithm can be stopped for a given precision. Consequently, VI becomes an anytime algorithm returning the approximation of the value and the current error bound. We further use this error bound to provide a learning-based asynchronous VI algorithm; it uses simulations and thus often avoids exploring the whole game graph, but still yields the same guarantees. Finally, we experimentally show that the overhead for computing the additional sequence of over-approximations often is negligible.
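The idea of running value iteration with a lower and an upper bound and stopping once they are close can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a game with no end components other than the target and sink states, in which case plain Bellman updates applied from above also converge to the value. In general games the upper sequence can stay strictly above the value, which is the case the article's graph-based analysis addresses and which is not reproduced here. The names `bounded_value_iteration`, `EXAMPLE_GAME`, and the state labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# An action is a probability distribution over successor states.
Action = List[Tuple[float, str]]
# Each state maps to (owner, actions); owner is "max" or "min".
Game = Dict[str, Tuple[str, List[Action]]]


def bounded_value_iteration(
    game: Game,
    targets: Set[str],
    sinks: Set[str],
    eps: float = 1e-6,
    max_iters: int = 100_000,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Iterate lower and upper bounds on the reachability value.

    ASSUMPTION: the game has no end components apart from targets and sinks.
    Without that assumption the upper bound may not converge to the value.
    """
    lower = {s: 1.0 if s in targets else 0.0 for s in game}
    upper = {s: 0.0 if s in sinks else 1.0 for s in game}

    def bellman(values: Dict[str, float], state: str) -> float:
        owner, actions = game[state]
        expected = [sum(p * values[t] for p, t in a) for a in actions]
        return max(expected) if owner == "max" else min(expected)

    for _ in range(max_iters):
        for s in game:
            if s in targets or s in sinks:
                continue  # values of absorbing states are fixed
            lower[s] = bellman(lower, s)
            upper[s] = bellman(upper, s)
        # Stopping criterion: the gap bounds the error of either approximation.
        if all(upper[s] - lower[s] <= eps for s in game):
            break
    return lower, upper


# Hypothetical four-state game: s0 belongs to the maximizer, s1 to the minimizer.
EXAMPLE_GAME: Game = {
    "s0": ("max", [[(0.5, "goal"), (0.5, "s1")], [(1.0, "s1")]]),
    "s1": ("min", [[(0.5, "goal"), (0.5, "sink")], [(0.9, "s0"), (0.1, "sink")]]),
    "goal": ("max", []),
    "sink": ("max", []),
}

if __name__ == "__main__":
    lo, up = bounded_value_iteration(EXAMPLE_GAME, {"goal"}, {"sink"})
    for state in ("s0", "s1"):
        print(state, lo[state], up[state])
```

At any point during the loop the pair of bounds brackets the true value, so the procedure can be interrupted early and still report a sound error bound, which is the anytime behaviour described in the abstract.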
ArticleNumber	104886
Author	Eisentraut, Julia; Kelmendi, Edon; Křetínský, Jan; Weininger, Maximilian
Author_xml	1. Eisentraut, Julia (ORCID 0000-0002-7735-8751), Technical University of Munich, Germany; 2. Kelmendi, Edon (ORCID 0000-0003-3100-1500), University of Oxford, United Kingdom; 3. Křetínský, Jan, Technical University of Munich, Germany; 4. Weininger, Maximilian (ORCID 0000-0002-0163-2152, email maxi.weininger@tum.de), Technical University of Munich, Germany
CitedBy_id	crossref_primary_10_1016_j_ic_2024_105193 crossref_primary_10_4204_EPTCS_428_4 crossref_primary_10_1016_j_ic_2024_105214 crossref_primary_10_1016_j_ic_2024_105236
Cites_doi	10.1007/s10703-010-0097-6 10.1007/s10703-013-0183-7 10.1016/0890-5401(90)90004-2 10.1016/0890-5401(92)90048-K 10.1007/s001650300007 10.1145/3060139 10.1002/9780470316887 10.1016/S0004-3702(00)00039-4 10.1016/j.ejcon.2016.04.009 10.1007/s004539910020 10.1109/TAC.2016.2598476 10.1016/j.tcs.2016.12.003 10.1287/mnsc.25.4.352 10.1007/BF01720283 10.1287/mnsc.12.5.359 10.1145/1968.1972 10.1109/TSMCC.2007.913919 10.1007/s10703-006-0005-2 10.1145/2168260.2168264 10.1016/j.jcss.2012.12.001
ContentType	Journal Article
Copyright	2022 The Authors
Copyright_xml	– notice: 2022 The Authors
DBID	6I. AAFTH AAYXX CITATION
DOI	10.1016/j.ic.2022.104886
DatabaseName	ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering Computer Science
EISSN	1090-2651
ExternalDocumentID	10_1016_j_ic_2022_104886 S0890540122000281
GrantInformation_xml	German Research Foundation (grant KR 4890/2-1; funder ID https://doi.org/10.13039/501100001659); TUM IGSSE (grant 10.06); the German Excellence Initiative and the European Union Seventh Framework Programme (grant 291763 for TUM – IAS); Studienstiftung des Deutschen Volkes (funder ID https://doi.org/10.13039/501100004350)
ISICitedReferencesCount	13
ISSN	0890-5401
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Keywords	Probabilistic verification Stochastic games Value iteration Markov decision processes Reachability
Language	English
License	This is an open access article under the CC BY-NC-ND license.
LinkModel	OpenURL
ORCID	0000-0003-3100-1500 0000-0002-0163-2152 0000-0002-7735-8751
OpenAccessLink	https://dx.doi.org/10.1016/j.ic.2022.104886
PublicationCentury	2000
PublicationDate	May 2022 2022-05-00
PublicationDateYYYYMMDD	2022-05-01
PublicationDate_xml	– month: 05 year: 2022 text: May 2022
PublicationDecade	2020
PublicationTitle	Information and computation
PublicationYear	2022
Publisher	Elsevier Inc
Publisher_xml	– name: Elsevier Inc
StartPage	104886
SubjectTerms	Markov decision processes Probabilistic verification Reachability Stochastic games Value iteration
Title	Value iteration for simple stochastic games: Stopping criterion and learning algorithm
URI	https://dx.doi.org/10.1016/j.ic.2022.104886
Volume	285