A simulation-based approximate dynamic programming approach to dynamic and stochastic resource-constrained multi-project scheduling problem

Bibliographic Details
Published in: European Journal of Operational Research, Vol. 315, No. 2, pp. 454–469
Main Authors: Satic, U., Jacko, P., Kirkbride, C.
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.06.2024
Subjects:
ISSN: 0377-2217, 1872-6860
Description
Summary: We consider the dynamic and stochastic resource-constrained multi-project scheduling problem, which allows for the random arrival of projects and stochastic task durations. Completing projects generates rewards, which are reduced by a tardiness cost in the case of late completion. Multiple types of resources are available, and projects consume different amounts of these resources while being processed. The problem is modelled as an infinite-horizon discrete-time Markov decision process and seeks to maximise the expected discounted long-run profit. We use an approximate dynamic programming (ADP) algorithm with a linear approximation model, which can be used for online decision making. Our approximation model uses project elements that are easily accessible to a decision-maker, with the model coefficients obtained offline via a combination of Monte Carlo simulation and least-squares estimation. Our numerical study shows that ADP often statistically significantly outperforms the optimal reactive baseline algorithm (ORBA). In experiments on smaller problems, however, both typically perform suboptimally compared to the optimal scheduler obtained by stochastic dynamic programming. ADP has an advantage over ORBA and dynamic programming in that it can be applied to larger problems. We also show that ADP generally produces statistically significantly higher profits than common algorithms used in practice, such as a rule-based algorithm and a reactive genetic algorithm.

Highlights:
• Dynamic programming is optimal in small problems but intractable for large problems.
• Rule-based algorithms are straightforward to apply but can perform poorly.
• Reactive baseline algorithms have a restricted view of future profits.
• Approximate dynamic programming learns about actions by simulating the future.
• Approximate dynamic programming performs competitively compared to alternatives.
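As an illustration of the approach described in the summary, the following Python sketch shows the two phases in miniature: an offline phase that fits a linear value-function approximation by Monte Carlo rollouts and least-squares estimation, and an online phase that chooses actions by one-step lookahead with the fitted values. Everything here is hypothetical (a toy one-project simulator, a made-up feature map phi, an always-start baseline policy); it is not the paper's model, only a minimal sketch of the general technique.

    import numpy as np

    rng = np.random.default_rng(1)
    GAMMA = 0.95  # discount factor for the long-run profit

    def step(state, start_next):
        # Crude one-period simulator (toy assumption, not the paper's model):
        # a project is a batch of tasks; each period the active task finishes
        # with probability 0.7; completing the project pays a reward reduced
        # by a tardiness cost once the due date has passed (slack < 0).
        remaining, slack = state
        reward = 0.0
        if remaining == 0 and start_next:            # start a new arrival
            remaining = int(rng.integers(2, 6))
            slack = int(rng.integers(3, 10))
        if remaining > 0:
            if rng.random() < 0.7:                   # stochastic task duration
                remaining -= 1
            slack -= 1
            if remaining == 0:                       # project completed
                reward = 10.0 + 2.0 * min(slack, 0)  # reward minus tardiness
        return (remaining, slack), reward

    def phi(state):
        # Hypothetical features: quantities a decision-maker can read directly
        # from the state (remaining work, time to due date, an interaction).
        remaining, slack = state
        return np.array([1.0, remaining, slack, remaining * slack])

    # Offline phase: Monte Carlo rollouts under a fixed baseline policy, then
    # least-squares estimation of the linear value-function coefficients.
    X, y = [], []
    for _ in range(2000):
        s0 = (int(rng.integers(0, 6)), int(rng.integers(-2, 10)))
        s, ret, g = s0, 0.0, 1.0
        for _ in range(60):                          # truncated rollout horizon
            s, r = step(s, start_next=True)          # always-start baseline
            ret += g * r
            g *= GAMMA
        X.append(phi(s0))
        y.append(ret)
    theta, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

    # Online phase: one-step lookahead using the fitted approximation.
    def act(state, n_samples=50):
        # Estimate Q(state, a) = E[reward + GAMMA * phi(next) @ theta] by
        # simulation and return the better of the two actions.
        best_a, best_q = False, -np.inf
        for a in (False, True):
            q = 0.0
            for _ in range(n_samples):
                s2, r = step(state, a)
                q += (r + GAMMA * phi(s2) @ theta) / n_samples
            if q > best_q:
                best_a, best_q = a, q
        return best_a

    print(act((0, 5)))  # e.g. decide whether to start a newly arrived project

Because the coefficients theta are fitted once offline, the online step reduces to a cheap simulate-and-score loop, which is what makes this style of ADP usable for real-time decision making on problems too large for exact dynamic programming.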
DOI: 10.1016/j.ejor.2023.10.046