New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system

Bibliographic details
Published in: European journal of operational research, Vol. 249, No. 1, pp. 22-31
Main authors: Ohno, Katsuhisa; Boh, Toshitaka; Nakade, Koichi; Tamura, Takayoshi
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier B.V, 16 Feb 2016
Elsevier Sequoia S.A
ISSN: 0377-2217, 1872-6860
Online access: Full text
Abstract
•We propose new approximate dynamic programming algorithms.
•These algorithms can solve large-scale undiscounted Markov decision processes.
•Optimal control problems for production systems with 35,973,840 states were solved.
•The kanban, base stock, CONWIP, hybrid and extended kanban systems are considered.
•We show numerical comparisons between optimal controls and optimized pull controls.

Undiscounted Markov decision processes (UMDPs) can formulate optimal stochastic control problems that minimize the expected total cost per period for various systems. We propose new approximate dynamic programming (ADP) algorithms for large-scale UMDPs that can overcome the curses of dimensionality. These algorithms, called simulation-based modified policy iteration (SBMPI) algorithms, are extensions of the simulation-based modified policy iteration method (SBMPIM) (Ohno, 2011) for optimal control problems of multistage JIT-based production and distribution systems with stochastic demand and production capacity. The main new concepts of the SBMPI algorithms are that the simulation-based policy evaluation step of the SBMPIM is replaced by the partial policy evaluation step of the modified policy iteration method (MPIM), and that the algorithms start from the expected total cost per period and relative value estimated by simulating the system under a reasonable initial policy. For numerical comparisons, the optimal control problem of the three-stage JIT-based production and distribution system with stochastic demand and production capacity is formulated as a UMDP. The demand distribution is changed from a shifted binomial distribution in Ohno (2011) to a Poisson distribution, and near-optimal policies of the optimal control problems with 35,973,840 states are computed by the SBMPI algorithms and the SBMPIM.
The computational results show that the SBMPI algorithms are at least 100 times faster than the SBMPIM in solving the numerical problems and are robust with respect to initial policies. Numerical examples are solved to show the effectiveness of the near-optimal control obtained with the SBMPI algorithms compared with optimized pull systems whose optimal parameters are computed using the SBOS (simulation-based optimal solutions) from Ohno (2011).
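The abstract names the partial policy evaluation step of modified policy iteration but does not spell it out. The following is only a generic sketch of that idea for a tiny average-cost (undiscounted) MDP with exact expectations; it is not the authors' simulation-based SBMPI algorithm. The two-state model (`P`, `c`), the function names, the reference state `ref`, and the zero starting values are all assumptions made for illustration; the abstract says SBMPI instead starts from values estimated by simulation under an initial policy.

```python
def bellman_backup(P, c, h):
    """Apply the optimality operator once; return (T h, greedy policy)."""
    n = len(h)
    values, policy = [], []
    for s in range(n):
        # Expected one-period cost plus relative value of the next state.
        q = [c[s][a] + sum(P[s][a][t] * h[t] for t in range(n))
             for a in range(len(c[s]))]
        best = min(range(len(q)), key=q.__getitem__)
        values.append(q[best])
        policy.append(best)
    return values, policy


def partial_evaluation(P, c, policy, h, sweeps, ref=0):
    """Apply the fixed-policy operator `sweeps` times instead of solving
    the evaluation equations exactly; keep h[ref] pinned at zero."""
    n = len(h)
    for _ in range(sweeps):
        w = [c[s][policy[s]] + sum(P[s][policy[s]][t] * h[t] for t in range(n))
             for s in range(n)]
        h = [x - w[ref] for x in w]
    return h


def modified_policy_iteration(P, c, sweeps=5, tol=1e-9, max_iter=1000):
    """Alternate greedy improvement and partial evaluation until the span
    of (T h - h) falls below tol; return (policy, gain estimate, h)."""
    h = [0.0] * len(c)
    for _ in range(max_iter):
        th, policy = bellman_backup(P, c, h)
        diff = [x - y for x, y in zip(th, h)]
        if max(diff) - min(diff) < tol:
            return policy, 0.5 * (max(diff) + min(diff)), h
        h = partial_evaluation(P, c, policy, h, sweeps)
    raise RuntimeError("no convergence within max_iter")


# Hypothetical toy model: P[s][a][t] is the probability of moving from
# state s to state t under action a; c[s][a] is the one-period cost.
P = [
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.1, 0.9], [0.9, 0.1]],
]
c = [[0.0, 1.0], [4.0, 2.0]]

policy, gain, h = modified_policy_iteration(P, c)
print(policy, gain)
```

For this toy model the result can be checked by hand: under the policy `[0, 1]` the stationary distribution is (9/14, 5/14), so the expected cost per period is 2 · 5/14 = 5/7, which is the smallest among the four stationary policies.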
Author Ohno, Katsuhisa
Nakade, Koichi
Boh, Toshitaka
Tamura, Takayoshi
Author_xml – sequence: 1
  givenname: Katsuhisa
  surname: Ohno
  fullname: Ohno, Katsuhisa
  email: ohno@aitech.ac.jp
  organization: Faculty of Business Administration, Aichi Institute of Technology, Jiyugaoka 2-49-2, Chikusa-ku, Nagoya 464-0044, Japan
– sequence: 2
  givenname: Toshitaka
  surname: Boh
  fullname: Boh, Toshitaka
  email: boh.toshitaka@jp.panasonic.com
  organization: Sales SCM Group, Marketing & Logistics Solution Business Unit, Corporate Information Systems Company, Panasonic Co., Kadoma-city 571-8686, Japan
– sequence: 3
  givenname: Koichi
  surname: Nakade
  fullname: Nakade, Koichi
  email: nakade@nitech.ac.jp
  organization: Nagoya Institute of Technology, Gokisocho, Syowa-ku, Nagoya 466-8555, Japan
– sequence: 4
  givenname: Takayoshi
  surname: Tamura
  fullname: Tamura, Takayoshi
  email: tamuratak@aitech.ac.jp
  organization: Faculty of Business Administration, Aichi Institute of Technology, Jiyugaoka 2-49-2, Chikusa-ku, Nagoya 464-0044, Japan
CODEN EJORDT
CitedBy_id crossref_primary_10_1016_j_cie_2019_04_005
crossref_primary_10_1016_j_ymssp_2019_106570
crossref_primary_10_1007_s10696_025_09608_7
crossref_primary_10_1016_j_ejor_2018_02_047
crossref_primary_10_1007_s10586_018_2078_2
crossref_primary_10_1016_j_cie_2018_02_031
crossref_primary_10_1016_j_ejor_2016_06_006
crossref_primary_10_1016_j_ejor_2019_02_024
crossref_primary_10_1016_j_ejor_2022_08_049
crossref_primary_10_1016_j_eswa_2018_02_032
crossref_primary_10_1016_j_ejor_2017_11_003
crossref_primary_10_1016_j_cie_2019_106092
crossref_primary_10_1002_asmb_2619
crossref_primary_10_1016_j_ejor_2020_11_005
crossref_primary_10_1016_j_cie_2016_11_019
crossref_primary_10_1109_TASE_2020_2984739
crossref_primary_10_1080_24725854_2025_2469275
Cites_doi 10.1287/mnsc.6.4.475
10.1080/07408170008963914
10.1080/002075497195713
10.1080/07408170208928908
10.1016/j.cie.2011.01.007
10.1017/S0269964803172051
10.1016/j.ejor.2011.03.005
10.1287/opre.1120.1044
10.1080/00207549508930216
10.1080/00207549008942761
10.1287/opre.35.1.121
10.1287/mnsc.45.4.560
10.1007/s11768-011-1005-3
10.1007/s11768-011-0313-y
ContentType Journal Article
Copyright 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS)
Copyright Elsevier Sequoia S.A. Feb 16, 2016
DBID AAYXX
CITATION
7SC
7TB
8FD
FR3
JQ2
L7M
L~C
L~D
DOI 10.1016/j.ejor.2015.07.026
DatabaseName CrossRef
Computer and Information Systems Abstracts
Mechanical & Transportation Engineering Abstracts
Technology Research Database
Engineering Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Mechanical & Transportation Engineering Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Engineering Research Database
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
Business
EISSN 1872-6860
EndPage 31
ExternalDocumentID 3867149021
10_1016_j_ejor_2015_07_026
S0377221715006591
Genre Feature
ID FETCH-LOGICAL-c451t-8ea83ed75fea61c86f6ac3d6e4c296c7e6b19b741ba8797129f51f319d9debf3
ISICitedReferencesCount 21
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000366536400002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0377-2217
IngestDate Sun Jul 13 03:36:32 EDT 2025
Tue Nov 18 21:23:31 EST 2025
Sat Nov 29 01:41:14 EST 2025
Fri Feb 23 02:27:39 EST 2024
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Approximate dynamic programming algorithms
JIT-based production and distribution system
The curses of dimensionality
Optimal control
Undiscounted Markov decision processes
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c451t-8ea83ed75fea61c86f6ac3d6e4c296c7e6b19b741ba8797129f51f319d9debf3
Notes SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
PQID 1733197675
PQPubID 45678
PageCount 10
ParticipantIDs proquest_journals_1733197675
crossref_citationtrail_10_1016_j_ejor_2015_07_026
crossref_primary_10_1016_j_ejor_2015_07_026
elsevier_sciencedirect_doi_10_1016_j_ejor_2015_07_026
PublicationCentury 2000
PublicationDate 2016-02-16
PublicationDateYYYYMMDD 2016-02-16
PublicationDate_xml – month: 02
  year: 2016
  text: 2016-02-16
  day: 16
PublicationDecade 2010
PublicationPlace Amsterdam
PublicationPlace_xml – name: Amsterdam
PublicationTitle European journal of operational research
PublicationYear 2016
Publisher Elsevier B.V
Elsevier Sequoia S.A
Publisher_xml – name: Elsevier B.V
– name: Elsevier Sequoia S.A
References Clark, Scarf (bib0009) 1960; 6
Gosavi (bib0016) 2003
Ohno (bib0025) 2011; 213
Cooper, Henderson, Lewis (bib0010) 2003; 17
Powell, Ma (bib0027) 2011; 9
Bertsekas (bib0004) 2011; 9
He, Fu, Marcus (bib0017) 2000
Powell (bib0026) 2007
Cao (bib0007) 2007
Monden (bib0019) 2012
(bib0029) 2004
Bonvik, Couch, Gershwin (bib0005) 1997; 35
Desai, Farias, Moallemi (bib0014) 2012; 60
Dallery, Liberopoulos (bib0011) 2000; 32
Gosavi, Bandla, Das (bib0015) 2002; 34
Ohno, Ichiki (bib0021) 1987; 35
Bertsekas (bib0003) 2010
Katanyukul, Duff, Chong (bib0018) 2011; 60
Puterman (bib0028) 1994
Bellman (bib0001) 1957
Spearman, Woodruff, Hopp (bib0030) 1990; 28
Buşoniu, Babuška, Schutter, Ernst (bib0006) 2010
Ohno, Yashima, Ito (bib0023) 2003; 54
Sutton, Barto (bib0031) 1998
Chang, Fu, Hu, Marcus (bib0008) 2007
Ohno, Nakashima, Kojima (bib0022) 1995; 33
Ohno, Ito (bib0024) 2004; 55
Ohno (bib0020) 1985
Bertsekas, Tsitsiklis (bib0002) 1996
Das, Gosavi, Mahadevan, Marchalleck (bib0012) 1999; 45
References_xml – volume: 60
  start-page: 655
  year: 2012
  end-page: 674
  ident: bib0014
  article-title: Approximate dynamic programming via a smoothed linear program
  publication-title: Operations Research
– volume: 55
  start-page: 179
  year: 2004
  end-page: 188
  ident: bib0024
  article-title: An optimal control of a production and distribution system by neuro-dynamic programming and a comparison of pull systems
  publication-title: Journal of Japan Industrial Management Association
– volume: 6
  start-page: 475
  year: 1960
  end-page: 490
  ident: bib0009
  article-title: Optimal policies for multi-echelon inventory problems
  publication-title: Management Science
– year: 1998
  ident: bib0031
  article-title: Reinforcement learning
– volume: 213
  start-page: 124
  year: 2011
  end-page: 133
  ident: bib0025
  article-title: The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems
  publication-title: European Journal of Operational Research
– year: 2010
  ident: bib0006
  article-title: Reinforcement learning and dynamic programming using function approximators
– volume: 33
  start-page: 1387
  year: 1995
  end-page: 1401
  ident: bib0022
  article-title: Optimal numbers of two kinds of kanbans in a JIT production system
  publication-title: International Journal of Production Research
– year: 2007
  ident: bib0007
  article-title: Stochastic learning and optimization – A sensitivity-based approach
– volume: 9
  start-page: 336
  year: 2011
  end-page: 352
  ident: bib0027
  article-title: A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
  publication-title: Journal of Control Theory and Applications
– year: 2003
  ident: bib0016
  article-title: Simulation-based optimization: Parametric optimization techniques and reinforcement learning
– year: 2007
  ident: bib0008
  article-title: Simulation-based algorithms for Markov decision processes
– volume: 54
  start-page: 316
  year: 2003
  end-page: 325
  ident: bib0023
  article-title: Neuro-dynamic programming algorithms for computing optimal control of production lines
  publication-title: Journal of Japan Industrial Management Association
– volume: 9
  start-page: 310
  year: 2011
  end-page: 335
  ident: bib0004
  article-title: Approximate policy iteration: A survey and some new methods
  publication-title: Journal of Control Theory and Applications
– year: 2007
  ident: bib0026
  article-title: Approximate dynamic programming – Solving the curses of dimensionality
– volume: 35
  start-page: 121
  year: 1987
  end-page: 126
  ident: bib0021
  article-title: Computing optimal policies for controlled tandem queueing systems
  publication-title: Operations Research
– year: 1985
  ident: bib0020
  article-title: Modified policy iteration algorithm with nonoptimality tests for undiscounted Markov decision process
  publication-title: Working Paper
– volume: 17
  start-page: 213
  year: 2003
  end-page: 234
  ident: bib0010
  article-title: Convergence of simulation-based policy iteration
  publication-title: Probability in the Engineering and Informational Sciences
– year: 2004
  ident: bib0029
  publication-title: Handbook of learning and approximate dynamic programming
– year: 1994
  ident: bib0028
  article-title: Markov decision processes: Discrete stochastic dynamic programming
– volume: 32
  start-page: 369
  year: 2000
  end-page: 386
  ident: bib0011
  article-title: Extended kanban control system: Combining kanban and base stock
  publication-title: IIE Transactions
– volume: 34
  start-page: 729
  year: 2002
  end-page: 742
  ident: bib0015
  article-title: A reinforcement learning approach to a single-leg airline revenue management problem with multiple fare classes and overbooking
  publication-title: IIE Transactions
– year: 1996
  ident: bib0002
  article-title: Neuro-dynamic programming
– year: 2012
  ident: bib0019
  article-title: Toyota production system
– volume: 60
  start-page: 719
  year: 2011
  end-page: 743
  ident: bib0018
  article-title: Approximate dynamic programming for an inventory problem: Empirical comparison
  publication-title: Computers & Industrial Engineering
– start-page: 161
  year: 2000
  end-page: 182
  ident: bib0017
  article-title: A simulation-based policy iteration algorithm for average cost unichain Markov decision processes
  publication-title: Computing tools for modeling, optimization and simulation
– year: 1957
  ident: bib0001
  article-title: Dynamic programming
– year: 2010
  ident: bib0003
  article-title: Pathologies of temporal difference methods in approximate dynamic programming
  publication-title: Proceedings of 2010 conference on decision and control, Atlanta, GA
– volume: 45
  start-page: 560
  year: 1999
  end-page: 574
  ident: bib0012
  article-title: Solving semi-Markov decision problems using average reward reinforcement learning
  publication-title: Management Science
– volume: 28
  start-page: 879
  year: 1990
  end-page: 894
  ident: bib0030
  article-title: CONWIP: A pull alternative to kanban
  publication-title: International Journal of Production Research
– volume: 35
  start-page: 789
  year: 1997
  end-page: 804
  ident: bib0005
  article-title: A comparison of production-line control mechanisms
  publication-title: International Journal of Production Research
– volume: 6
  start-page: 475
  year: 1960
  ident: 10.1016/j.ejor.2015.07.026_bib0009
  article-title: Optimal policies for multi-echelon inventory problems
  publication-title: Management Science
  doi: 10.1287/mnsc.6.4.475
– start-page: 161
  year: 2000
  ident: 10.1016/j.ejor.2015.07.026_bib0017
  article-title: A simulation-based policy iteration algorithm for average cost unichain Markov decision processes
– year: 2007
  ident: 10.1016/j.ejor.2015.07.026_bib0007
– volume: 55
  start-page: 179
  issue: 4
  year: 2004
  ident: 10.1016/j.ejor.2015.07.026_bib0024
  article-title: An optimal control of a production and distribution system by neuro-dynamic programming and a comparison of pull systems
  publication-title: Journal of Japan Industrial Management Association
– volume: 32
  start-page: 369
  year: 2000
  ident: 10.1016/j.ejor.2015.07.026_bib0011
  article-title: Extended kanban control system: Combining kanban and base stock
  publication-title: IIE Transactions
  doi: 10.1080/07408170008963914
– volume: 35
  start-page: 789
  issue: 3
  year: 1997
  ident: 10.1016/j.ejor.2015.07.026_bib0005
  article-title: A comparison of production-line control mechanisms
  publication-title: International Journal of Production Research
  doi: 10.1080/002075497195713
– year: 2007
  ident: 10.1016/j.ejor.2015.07.026_bib0026
– volume: 34
  start-page: 729
  year: 2002
  ident: 10.1016/j.ejor.2015.07.026_bib0015
  article-title: A reinforcement learning approach to a single-leg airline revenue management problem with multiple fare classes and overbooking
  publication-title: IIE Transactions
  doi: 10.1080/07408170208928908
– year: 2004
  ident: 10.1016/j.ejor.2015.07.026_bib0029
– year: 1985
  ident: 10.1016/j.ejor.2015.07.026_bib0020
  article-title: Modified policy iteration algorithm with nonoptimality tests for undiscounted Markov decision process
– volume: 60
  start-page: 719
  year: 2011
  ident: 10.1016/j.ejor.2015.07.026_bib0018
  article-title: Approximate dynamic programming for an inventory problem: Empirical comparison
  publication-title: Computers & Industrial Engineering
  doi: 10.1016/j.cie.2011.01.007
– year: 2007
  ident: 10.1016/j.ejor.2015.07.026_bib0008
– volume: 54
  start-page: 316
  issue: 5
  year: 2003
  ident: 10.1016/j.ejor.2015.07.026_bib0023
  article-title: Neuro-dynamic programming algorithms for computing optimal control of production lines
  publication-title: Journal of Japan Industrial Management Association
– volume: 17
  start-page: 213
  year: 2003
  ident: 10.1016/j.ejor.2015.07.026_bib0010
  article-title: Convergence of simulation-based policy iteration
  publication-title: Probability in the Engineering and Informational Sciences
  doi: 10.1017/S0269964803172051
– volume: 213
  start-page: 124
  year: 2011
  ident: 10.1016/j.ejor.2015.07.026_bib0025
  article-title: The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems
  publication-title: European Journal of Operational Research
  doi: 10.1016/j.ejor.2011.03.005
– year: 2010
  ident: 10.1016/j.ejor.2015.07.026_bib0003
  article-title: Pathologies of temporal difference methods in approximate dynamic programming
– volume: 60
  start-page: 655
  year: 2012
  ident: 10.1016/j.ejor.2015.07.026_bib0014
  article-title: Approximate dynamic programming via a smoothed linear program
  publication-title: Operations Research
  doi: 10.1287/opre.1120.1044
– year: 1996
  ident: 10.1016/j.ejor.2015.07.026_bib0002
– year: 2010
  ident: 10.1016/j.ejor.2015.07.026_bib0006
– volume: 33
  start-page: 1387
  issue: 5
  year: 1995
  ident: 10.1016/j.ejor.2015.07.026_bib0022
  article-title: Optimal numbers of two kinds of kanbans in a JIT production system
  publication-title: International Journal of Production Research
  doi: 10.1080/00207549508930216
– volume: 28
  start-page: 879
  issue: 5
  year: 1990
  ident: 10.1016/j.ejor.2015.07.026_bib0030
  article-title: CONWIP: A pull alternative to kanban
  publication-title: International Journal of Production Research
  doi: 10.1080/00207549008942761
– volume: 35
  start-page: 121
  issue: 1
  year: 1987
  ident: 10.1016/j.ejor.2015.07.026_bib0021
  article-title: Computing optimal policies for controlled tandem queueing systems
  publication-title: Operations Research
  doi: 10.1287/opre.35.1.121
– volume: 45
  start-page: 560
  year: 1999
  ident: 10.1016/j.ejor.2015.07.026_bib0012
  article-title: Solving semi-Markov decision problems using average reward reinforcement learning
  publication-title: Management Science
  doi: 10.1287/mnsc.45.4.560
– year: 1998
  ident: 10.1016/j.ejor.2015.07.026_bib0031
– year: 1994
  ident: 10.1016/j.ejor.2015.07.026_bib0028
– year: 1957
  ident: 10.1016/j.ejor.2015.07.026_bib0001
– volume: 9
  start-page: 310
  issue: 3
  year: 2011
  ident: 10.1016/j.ejor.2015.07.026_bib0004
  article-title: Approximate policy iteration: A survey and some new methods
  publication-title: Journal of Control Theory and Applications
  doi: 10.1007/s11768-011-1005-3
– year: 2012
  ident: 10.1016/j.ejor.2015.07.026_bib0019
– year: 2003
  ident: 10.1016/j.ejor.2015.07.026_bib0016
– volume: 9
  start-page: 336
  issue: 3
  year: 2011
  ident: 10.1016/j.ejor.2015.07.026_bib0027
  article-title: A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
  publication-title: Journal of Control Theory and Applications
  doi: 10.1007/s11768-011-0313-y
SSID ssj0001515
Score 2.2956004
SourceID proquest
crossref
elsevier
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 22
SubjectTerms Algorithms
Approximate dynamic programming algorithms
Binomial distribution
Dynamic programming
Iterative methods
JIT-based production and distribution system
Markov analysis
Optimal control
Optimization algorithms
Production capacity
Simulation
Stochastic control theory
Studies
The curses of dimensionality
Undiscounted Markov decision processes
Title New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
URI https://dx.doi.org/10.1016/j.ejor.2015.07.026
https://www.proquest.com/docview/1733197675
Volume 249
WOSCitedRecordID wos000366536400002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  customDbUrl:
  eissn: 1872-6860
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001515
  issn: 0377-2217
  databaseCode: AIEXJ
  dateStart: 19950105
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier
linkProvider Elsevier