New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
• We propose new approximate dynamic programming algorithms.
• These algorithms can solve large-scale undiscounted Markov decision processes.
• Optimal control problems for production systems with 35,973,840 states were solved.
• The kanban, base stock, CONWIP, hybrid and extended kanban systems are cons...
| Published in: | European journal of operational research Vol. 249; No. 1; pp. 22 - 31 |
|---|---|
| Main authors: | Ohno, Katsuhisa; Nakade, Koichi; Boh, Toshitaka; Tamura, Takayoshi |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier B.V, 16 February 2016; Elsevier Sequoia S.A |
| Subjects: | |
| ISSN: | 0377-2217, 1872-6860 |
| Online access: | Full text |
| Abstract | • We propose new approximate dynamic programming algorithms. • These algorithms can solve large-scale undiscounted Markov decision processes. • Optimal control problems for production systems with 35,973,840 states were solved. • The kanban, base stock, CONWIP, hybrid and extended kanban systems are considered. • We show numerical comparisons between optimal controls and optimized pull controls.
Undiscounted Markov decision processes (UMDPs) can formulate optimal stochastic control problems that minimize the expected total cost per period for various systems. We propose new approximate dynamic programming (ADP) algorithms for large-scale UMDPs that can overcome the curses of dimensionality. These algorithms, called simulation-based modified policy iteration (SBMPI) algorithms, extend the simulation-based modified policy iteration method (SBMPIM) (Ohno, 2011) for optimal control problems of multistage JIT-based production and distribution systems with stochastic demand and production capacity. The SBMPI algorithms introduce two main ideas: the simulation-based policy evaluation step of the SBMPIM is replaced by the partial policy evaluation step of the modified policy iteration method (MPIM), and the algorithms start from the expected total cost per period and relative value estimated by simulating the system under a reasonable initial policy.
For numerical comparisons, the optimal control problem of a three-stage JIT-based production and distribution system with stochastic demand and production capacity is formulated as a UMDP. The demand distribution is changed from the shifted binomial distribution in Ohno (2011) to a Poisson distribution, and near-optimal policies for the optimal control problems with 35,973,840 states are computed by the SBMPI algorithms and the SBMPIM. The computational results show that the SBMPI algorithms are at least 100 times faster than the SBMPIM in solving the numerical problems and are robust with respect to initial policies. Numerical examples are solved to show the effectiveness of the near-optimal control obtained with the SBMPI algorithms compared with optimized pull systems whose optimal parameters are computed using the SBOS (simulation-based optimal solutions) from Ohno (2011). |
|---|---|
| Author | Ohno, Katsuhisa Nakade, Koichi Boh, Toshitaka Tamura, Takayoshi |
| Author_xml | 1. Ohno, Katsuhisa (ohno@aitech.ac.jp), Faculty of Business Administration, Aichi Institute of Technology, Jiyugaoka 2-49-2, Chikusa-ku, Nagoya 464-0044, Japan; 2. Boh, Toshitaka (boh.toshitaka@jp.panasonic.com), Sales SCM Group, Marketing & Logistics Solution Business Unit, Corporate Information Systems Company, Panasonic Co., Kadoma-city 571-8686, Japan; 3. Nakade, Koichi (nakade@nitech.ac.jp), Nagoya Institute of Technology, Gokisocho, Syowa-ku, Nagoya 466-8555, Japan; 4. Tamura, Takayoshi (tamuratak@aitech.ac.jp), Faculty of Business Administration, Aichi Institute of Technology, Jiyugaoka 2-49-2, Chikusa-ku, Nagoya 464-0044, Japan |
| CODEN | EJORDT |
| CitedBy_id | crossref_primary_10_1016_j_cie_2019_04_005 crossref_primary_10_1016_j_ymssp_2019_106570 crossref_primary_10_1007_s10696_025_09608_7 crossref_primary_10_1016_j_ejor_2018_02_047 crossref_primary_10_1007_s10586_018_2078_2 crossref_primary_10_1016_j_cie_2018_02_031 crossref_primary_10_1016_j_ejor_2016_06_006 crossref_primary_10_1016_j_ejor_2019_02_024 crossref_primary_10_1016_j_ejor_2022_08_049 crossref_primary_10_1016_j_eswa_2018_02_032 crossref_primary_10_1016_j_ejor_2017_11_003 crossref_primary_10_1016_j_cie_2019_106092 crossref_primary_10_1002_asmb_2619 crossref_primary_10_1016_j_ejor_2020_11_005 crossref_primary_10_1016_j_cie_2016_11_019 crossref_primary_10_1109_TASE_2020_2984739 crossref_primary_10_1080_24725854_2025_2469275 |
| Cites_doi | 10.1287/mnsc.6.4.475 10.1080/07408170008963914 10.1080/002075497195713 10.1080/07408170208928908 10.1016/j.cie.2011.01.007 10.1017/S0269964803172051 10.1016/j.ejor.2011.03.005 10.1287/opre.1120.1044 10.1080/00207549508930216 10.1080/00207549008942761 10.1287/opre.35.1.121 10.1287/mnsc.45.4.560 10.1007/s11768-011-1005-3 10.1007/s11768-011-0313-y |
| ContentType | Journal Article |
| Copyright | 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS) Copyright Elsevier Sequoia S.A. Feb 16, 2016 |
| DOI | 10.1016/j.ejor.2015.07.026 |
| DatabaseName | CrossRef; Computer and Information Systems Abstracts; Mechanical & Transportation Engineering Abstracts; Technology Research Database; Engineering Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef; Technology Research Database; Computer and Information Systems Abstracts Academic; Mechanical & Transportation Engineering Abstracts; ProQuest Computer Science Collection; Computer and Information Systems Abstracts; Engineering Research Database; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Discipline | Engineering Computer Science Business |
| EISSN | 1872-6860 |
| EndPage | 31 |
| ExternalDocumentID | 3867149021 10_1016_j_ejor_2015_07_026 S0377221715006591 |
| Genre | Feature |
| ISICitedReferencesCount | 21 |
| ISSN | 0377-2217 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Approximate dynamic programming algorithms; JIT-based production and distribution system; The curses of dimensionality; Optimal control; Undiscounted Markov decision processes |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-02-16 |
| PublicationDateYYYYMMDD | 2016-02-16 |
| PublicationDate_xml | – month: 02 year: 2016 text: 2016-02-16 day: 16 |
| PublicationDecade | 2010 |
| PublicationPlace | Amsterdam |
| PublicationPlace_xml | – name: Amsterdam |
| PublicationTitle | European journal of operational research |
| PublicationYear | 2016 |
| Publisher | Elsevier B.V Elsevier Sequoia S.A |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier Sequoia S.A |
| References | Bellman (1957). Dynamic programming.
Bertsekas, Tsitsiklis (1996). Neuro-dynamic programming.
Bertsekas (2010). Pathologies of temporal difference methods in approximate dynamic programming. Proceedings of 2010 conference on decision and control, Atlanta, GA.
Bertsekas (2011). Approximate policy iteration: A survey and some new methods. Journal of Control Theory and Applications, 9(3), 310-335. doi:10.1007/s11768-011-1005-3
Bonvik, Couch, Gershwin (1997). A comparison of production-line control mechanisms. International Journal of Production Research, 35(3), 789-804. doi:10.1080/002075497195713
Buşoniu, Babuška, Schutter, Ernst (2010). Reinforcement learning and dynamic programming using function approximators.
Cao (2007). Stochastic learning and optimization: A sensitivity-based approach.
Chang, Fu, Hu, Marcus (2007). Simulation-based algorithms for Markov decision processes.
Clark, Scarf (1960). Optimal policies for multi-echelon inventory problems. Management Science, 6, 475-490. doi:10.1287/mnsc.6.4.475
Cooper, Henderson, Lewis (2003). Convergence of simulation-based policy iteration. Probability in the Engineering and Informational Sciences, 17, 213-234. doi:10.1017/S0269964803172051
Dallery, Liberopoulos (2000). Extended kanban control system: Combining kanban and base stock. IIE Transactions, 32, 369-386. doi:10.1080/07408170008963914
Das, Gosavi, Mahadevan, Marchalleck (1999). Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45, 560-574. doi:10.1287/mnsc.45.4.560
Desai, Farias, Moallemi (2012). Approximate dynamic programming via a smoothed linear program. Operations Research, 60, 655-674. doi:10.1287/opre.1120.1044
Gosavi (2003). Simulation-based optimization: Parametric optimization techniques and reinforcement learning.
Gosavi, Bandla, Das (2002). A reinforcement learning approach to a single-leg airline revenue management problem with multiple fare classes and overbooking. IIE Transactions, 34, 729-742. doi:10.1080/07408170208928908
Handbook of learning and approximate dynamic programming (2004).
He, Fu, Marcus (2000). A simulation-based policy iteration algorithm for average cost unichain Markov decision processes. In Computing tools for modeling, optimization and simulation, 161-182.
Katanyukul, Duff, Chong (2011). Approximate dynamic programming for an inventory problem: Empirical comparison. Computers & Industrial Engineering, 60, 719-743. doi:10.1016/j.cie.2011.01.007
Monden (2012). Toyota production system.
Ohno (1985). Modified policy iteration algorithm with nonoptimality tests for undiscounted Markov decision process. Working Paper.
Ohno (2011). The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems. European Journal of Operational Research, 213, 124-133. doi:10.1016/j.ejor.2011.03.005
Ohno, Ichiki (1987). Computing optimal policies for controlled tandem queueing systems. Operations Research, 35(1), 121-126. doi:10.1287/opre.35.1.121
Ohno, Ito (2004). An optimal control of a production and distribution system by neuro-dynamic programming and a comparison of pull systems. Journal of Japan Industrial Management Association, 55(4), 179-188.
Ohno, Nakashima, Kojima (1995). Optimal numbers of two kinds of kanbans in a JIT production system. International Journal of Production Research, 33(5), 1387-1401. doi:10.1080/00207549508930216
Ohno, Yashima, Ito (2003). Neuro-dynamic programming algorithms for computing optimal control of production lines. Journal of Japan Industrial Management Association, 54(5), 316-325.
Powell (2007). Approximate dynamic programming: Solving the curses of dimensionality.
Powell, Ma (2011). A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications. Journal of Control Theory and Applications, 9(3), 336-352. doi:10.1007/s11768-011-0313-y
Puterman (1994). Markov decision processes: Discrete stochastic dynamic programming.
Spearman, Woodruff, Hopp (1990). CONWIP: A pull alternative to kanban. International Journal of Production Research, 28(5), 879-894. doi:10.1080/00207549008942761
Sutton, Barto (1998). Reinforcement learning. |
| StartPage | 22 |
| SubjectTerms | Algorithms; Approximate dynamic programming algorithms; Binomial distribution; Dynamic programming; Iterative methods; JIT-based production and distribution system; Markov analysis; Optimal control; Optimization algorithms; Production capacity; Simulation; Stochastic control theory; Studies; The curses of dimensionality; Undiscounted Markov decision processes |
| Title | New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system |
| URI | https://dx.doi.org/10.1016/j.ejor.2015.07.026 https://www.proquest.com/docview/1733197675 |
| Volume | 249 |
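
The abstract describes replacing a simulation-based policy evaluation step with the partial policy evaluation step of the modified policy iteration method (MPIM). As a rough illustration only, the following is a minimal sketch of textbook modified policy iteration for an undiscounted (average-cost) MDP on a tiny explicit model. It is not the authors' SBMPI algorithm: it uses exact transition probabilities rather than simulation, it starts from zero relative values rather than simulated estimates, and the function name, the parameter `m`, and the three-state demo model (`P_demo`, `c_demo`) are all assumptions made for this example. The sketch also assumes every stationary policy induces an irreducible, aperiodic chain.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def modified_policy_iteration(
    P: List[Matrix],   # P[a][s][t]: probability of moving s -> t under action a
    c: Matrix,         # c[s][a]: one-period cost of action a in state s
    m: int = 5,        # number of partial policy evaluation sweeps (assumed setting)
    eps: float = 1e-9,
    max_iter: int = 10_000,
) -> Tuple[float, List[int], List[float]]:
    """Return (average cost per period, greedy policy, relative values)."""
    n_states, n_actions = len(c), len(c[0])
    h = [0.0] * n_states  # relative values; this sketch simply starts from zero
    lower = upper = 0.0
    policy = [0] * n_states
    for _ in range(max_iter):
        # Policy improvement: pick the cost-minimizing action with respect to h.
        policy, w = [], []
        for s in range(n_states):
            q = [c[s][a] + sum(P[a][s][t] * h[t] for t in range(n_states))
                 for a in range(n_actions)]
            best = min(range(n_actions), key=q.__getitem__)
            policy.append(best)
            w.append(q[best])
        # Lower and upper bounds on the optimal average cost per period.
        diffs = [w[s] - h[s] for s in range(n_states)]
        lower, upper = min(diffs), max(diffs)
        if upper - lower < eps:
            break
        # Partial policy evaluation: m extra sweeps under the fixed policy.
        for _ in range(m):
            w = [c[s][policy[s]]
                 + sum(P[policy[s]][s][t] * w[t] for t in range(n_states))
                 for s in range(n_states)]
        # Normalize against reference state 0 so the values stay bounded.
        h = [x - w[0] for x in w]
    return 0.5 * (lower + upper), policy, h


# Hypothetical three-state, two-action model, used only for illustration.
P_demo = [
    [[0.5, 0.5, 0.0], [0.2, 0.5, 0.3], [0.1, 0.4, 0.5]],  # action 0
    [[0.9, 0.1, 0.0], [0.6, 0.3, 0.1], [0.5, 0.3, 0.2]],  # action 1
]
c_demo = [[1.0, 3.0], [2.0, 2.5], [6.0, 4.0]]

gain, policy, h = modified_policy_iteration(P_demo, c_demo)
print("average cost per period:", gain)
print("policy:", policy)
```

Setting `m = 0` reduces the sketch to relative value iteration, while a large `m` approaches full policy evaluation. The partial evaluation step trades between these two extremes.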