Automated algorithm design using proximal policy optimisation with identified features

Bibliographic Details
Published in:	Expert systems with applications Vol. 216; p. 119461
Main Authors:	Yi, Wenjie, Qu, Rong, Jiao, Licheng
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd 15.04.2023
Subjects:	Automated algorithm design; Feature identification; Reinforcement learning; Search pattern; Vehicle routing problem
ISSN:	0957-4174

Description
Summary:	Automated algorithm design is attracting considerable research attention for solving complex combinatorial optimisation problems, because most metaheuristics are particularly effective on certain problems, or on certain instances of the same problem, but perform poorly on others. Within a general algorithm design framework, this study investigates reinforcement learning for the automated design of metaheuristic algorithms. Two groups of features, search-dependent and instance-dependent, are first identified to represent the search space of algorithm design and to support effective reinforcement learning on this new task. With these key features, a state-of-the-art reinforcement learning technique, proximal policy optimisation (PPO), is employed to automatically combine the basic algorithmic components within the general framework into effective metaheuristics. Patterns of the best designed algorithm, in particular the utilisation and transition of algorithmic components, are investigated. Experimental results on the benchmark dataset for the capacitated vehicle routing problem with time windows demonstrate the effectiveness of the identified features in assisting automated algorithm design with the proposed reinforcement learning model.
• Two groups of features are identified to describe the search space of algorithm design.
• PPO is employed to extract useful knowledge hidden in the search data.
• The influence of state representation (with different features) is investigated.
• The utilisation and transition of algorithmic components are analysed.
DOI:	10.1016/j.eswa.2022.119461
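
The summary names proximal policy optimisation as the learning technique but, being an abstract, gives no formulation. As a rough illustration only, the sketch below computes the standard clipped surrogate objective that PPO maximises. It is not the authors' implementation: the function name `ppo_clipped_objective`, its arguments, and the clipping range `eps=0.2` are assumptions made for this example, and the record does not state how the paper defines actions, rewards, or advantages.

```python
import math

def ppo_clipped_objective(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximise).

    new_logp / old_logp: log-probabilities of the chosen actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for those actions.
    eps: clipping range; 0.2 is an assumed value, not taken from the paper.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)                       # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio)) # keep ratio in [1-eps, 1+eps]
        total += min(ratio * adv, clipped * adv)        # pessimistic of the two terms
    return total / len(advantages)

# Usage: one sample whose action became twice as likely under the new policy.
value = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
```

In the setting the summary describes, each action would presumably correspond to choosing an algorithmic component, and the state would be built from the search-dependent and instance-dependent features; both mappings are assumptions here, since the record does not spell them out.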