Deep Reinforcement Learning Based Optimization Algorithm for Permutation Flow-Shop Scheduling

Bibliographic Details
Published in:	IEEE transactions on emerging topics in computational intelligence Vol. 7; no. 4; pp. 983 - 994
Main Authors:	Pan, Zixiao; Wang, Ling; Wang, Jingjing; Lu, Jiawen
Format:	Journal Article
Language:	English
Published:	Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.08.2023
Subjects:	Algorithms; Artificial neural networks; Combinatorial analysis; Completion time; Computational intelligence; Computing time; Decoding; Deep learning; deep neural network; Dynamic scheduling; Encoding; flow-shop scheduling; improvement strategy; Job shop scheduling; Machine learning; Optimization; optimization algorithm; Optimization algorithms; Permutations; Reinforcement learning
ISSN:	2471-285X

Description
Summary:	Reinforcement learning (RL), a paradigm modeled on the human learning process, has become an emerging topic in computational intelligence (CI). Combining RL with CI is a promising way to develop efficient algorithms for complex combinatorial optimization (CO) problems such as machine scheduling. This paper proposes an efficient optimization algorithm based on deep RL for the permutation flow-shop scheduling problem (PFSP) with the objective of minimizing the maximum completion time. First, a new deep neural network (PFSPNet) is designed for the PFSP to produce end-to-end output without restriction on problem size. Second, an actor-critic RL method is used to train PFSPNet without relying on high-quality labelled data. Third, an improvement strategy is designed to refine the solution provided by PFSPNet. Simulation results and statistical comparisons show that the proposed algorithm obtains better results than existing heuristics in similar computational time when solving the PFSP.
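The objective named in the summary, the maximum completion time (makespan) of a job permutation, can be illustrated with a minimal sketch. This is not the paper's PFSPNet or its improvement strategy, neither of which is specified in this record. The names `makespan`, `swap_local_search`, and `proc_times` are hypothetical, and the pairwise-swap search is a generic stand-in for "refining a given permutation", assumed here only for illustration.

```python
from typing import List, Sequence, Tuple


def makespan(proc_times: Sequence[Sequence[int]], permutation: Sequence[int]) -> int:
    """Maximum completion time of a permutation flow-shop schedule.

    proc_times[j][m] is the processing time of job j on machine m (assumed layout).
    Every job visits machines 0..M-1 in order, and all machines use the same job order.
    """
    n_machines = len(proc_times[0])
    # completion[m] = time at which machine m finishes the most recently scheduled job
    completion = [0] * n_machines
    for job in permutation:
        prev_machine_done = 0  # when this job left the previous machine
        for m in range(n_machines):
            start = max(completion[m], prev_machine_done)
            completion[m] = start + proc_times[job][m]
            prev_machine_done = completion[m]
    return completion[-1]


def swap_local_search(
    proc_times: Sequence[Sequence[int]], permutation: Sequence[int]
) -> Tuple[List[int], int]:
    """Generic pairwise-swap refinement (illustrative only, not the paper's strategy)."""
    best = list(permutation)
    best_cost = makespan(proc_times, best)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            for k in range(i + 1, len(best)):
                cand = best[:]
                cand[i], cand[k] = cand[k], cand[i]
                cost = makespan(proc_times, cand)
                if cost < best_cost:  # accept strictly better neighbours only
                    best, best_cost = cand, cost
                    improved = True
    return best, best_cost


if __name__ == "__main__":
    # Toy instance: 3 jobs, 2 machines (made-up numbers).
    proc_times = [[3, 2], [1, 4], [2, 1]]
    print(makespan(proc_times, [0, 1, 2]))
    print(swap_local_search(proc_times, [0, 1, 2]))
```

In a pipeline of the kind the summary describes, the starting permutation would come from the trained network rather than from a fixed order; the sketch only shows how a candidate permutation is scored and how a simple refinement step can use that score.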
DOI:	10.1109/TETCI.2021.3098354