A self-learning whale optimization algorithm based on reinforcement learning for a dual-resource flexible job shop scheduling problem

Bibliographic details
Published in:	Applied Soft Computing, Volume 180; p. 113436
Main authors:	Manafi, Ehsan; Domenech, Bruno; Tavakkoli-Moghaddam, Reza; Ranaboldo, Matteo
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 1 August 2025
Subjects:	Flexible job shop scheduling; Machine learning; Meta-heuristics; Reconfigurable manufacturing systems; Reinforcement learning
ISSN:	1568-4946

Description
Summary:	Finding advanced optimization algorithms that schedule manufacturing activities efficiently is a key research area in production systems, and it calls for more sophisticated models with higher computational complexity. This has driven growing interest in improving the performance of meta-heuristics by incorporating reinforcement learning. This paper addresses a dual-resource flexible job shop scheduling (DRFJSS) problem in which each operation requires two resources, a reconfigurable machine tool (RMT) and a worker, to be processed. A mixed-integer linear programming (MILP) model is formulated to minimize the makespan. Since the proposed model cannot optimally solve most medium-sized instances, a self-learning whale optimization algorithm (SLWOA) is developed to handle the problem efficiently. In the proposed SLWOA, an agent is trained by the state–action–reward–state–action (SARSA) algorithm to balance exploration and exploitation. The results show that the SLWOA has a stronger global search ability and faster convergence speed than the original whale optimization algorithm.
• Studying dual-resource scheduling in shop floors with reconfigurable machine tools.
• Formulating a position-based MILP model for scheduling optimization.
• Proposing a self-learning whale algorithm for large instance problems.
• Designing states, actions, and rewards for reinforcement learning integration.
• Developing a variable neighbourhood search to improve the local search.
DOI:	10.1016/j.asoc.2025.113436
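
The summary says that a SARSA-trained agent balances exploration and exploitation inside the whale optimization algorithm, but this record does not give the paper's states, actions, or rewards. The following is only a minimal sketch of the general idea under assumptions that are not taken from the paper: the agent picks one of the three standard WOA moves per iteration, the state is simply whether the last iteration improved the best solution, the reward is 1 for an improvement and 0 otherwise, and a toy continuous function (`sphere`) stands in for the makespan. All names (`slwoa_sketch`, `move`, `choose`) and parameter values are hypothetical.

```python
import math
import random

# Assumed action set: the three standard WOA position updates.
ACTIONS = ("encircle", "spiral", "explore")


def sphere(x):
    """Toy objective standing in for the makespan (assumption)."""
    return sum(v * v for v in x)


def move(whale, best, other, action, a, rng):
    """Apply one standard WOA position update to a single whale."""
    if action == "spiral":
        l = rng.uniform(-1.0, 1.0)
        k = math.exp(l) * math.cos(2.0 * math.pi * l)  # spiral constant b = 1
        return [abs(b - w) * k + b for w, b in zip(whale, best)]
    # "encircle" moves relative to the best whale, "explore" to a random one.
    target = best if action == "encircle" else other
    new = []
    for w, t in zip(whale, target):
        A = 2.0 * a * rng.random() - a
        C = 2.0 * rng.random()
        new.append(t - A * abs(C * t - w))
    return new


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection over one row of the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def slwoa_sketch(dim=5, whales=10, iters=200, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(whales)]
    best = min(pop, key=sphere)[:]
    history = [sphere(best)]
    # Assumed states: 0 = last iteration stalled, 1 = last iteration improved.
    q = [[0.0] * len(ACTIONS) for _ in range(2)]
    state = 0
    action = choose(q, state, epsilon, rng)
    for t in range(iters):
        a = 2.0 * (1.0 - t / iters)  # WOA control parameter, decreasing from 2
        before = sphere(best)
        pop = [move(w, best, rng.choice(pop), ACTIONS[action], a, rng)
               for w in pop]
        cand = min(pop, key=sphere)
        if sphere(cand) < before:
            best = cand[:]
        reward = 1.0 if sphere(best) < before else 0.0  # assumed reward
        next_state = int(reward)
        next_action = choose(q, next_state, epsilon, rng)
        # SARSA is on-policy: bootstrap from the action actually chosen next.
        td_target = reward + gamma * q[next_state][next_action]
        q[state][action] += alpha * (td_target - q[state][action])
        state, action = next_state, next_action
        history.append(sphere(best))
    return best, history, q
```

Calling `best, history, q = slwoa_sketch()` returns the best position found, the best objective value after each iteration, and the learned Q-table. A scheduling variant would additionally need an encoding of operation sequences and machine/worker assignments, which this sketch does not attempt.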