Automated algorithm design using proximal policy optimisation with identified features


Bibliographic details
Published in: Expert Systems with Applications, Volume 216; p. 119461
Main authors: Yi, Wenjie; Qu, Rong; Jiao, Licheng
Format: Journal Article
Language: English
Published: Elsevier Ltd, 15.04.2023
ISSN:0957-4174
Description
Summary: Automated algorithm design has recently attracted considerable research attention for solving complex combinatorial optimisation problems, because most metaheuristics are particularly effective on certain problems, or on certain instances of the same problem, but perform poorly on others. Within a general algorithm design framework, this study investigates reinforcement learning for the automated design of metaheuristic algorithms. Two groups of features, namely search-dependent and instance-dependent features, are first identified to represent the search space of algorithm design and to support effective reinforcement learning on the new task of algorithm design. With these key features, a state-of-the-art reinforcement learning technique, namely proximal policy optimisation (PPO), is employed to automatically combine the basic algorithmic components within the general framework to develop effective metaheuristics. Patterns of the best designed algorithm, in particular the utilisation and transition of algorithmic components, are investigated. Experimental results on the capacitated vehicle routing problem with time windows benchmark dataset demonstrate the effectiveness of the identified features in assisting automated algorithm design with the proposed reinforcement learning model.
• Two groups of features are identified to describe the search space of algorithm design.
• PPO is employed to extract useful knowledge hidden in the search data.
• The influence of state representation (with different features) is investigated.
• The utilisation and transition of algorithmic components are analysed.
DOI:10.1016/j.eswa.2022.119461
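
The summary names proximal policy optimisation but this record gives no model details. As a rough illustration only, the sketch below computes the standard clipped surrogate objective that PPO maximises. It assumes that each probability ratio compares how likely the new and the old policy are to select a given algorithmic component in a given state. The function name, the `epsilon` default, and the sample numbers are hypothetical and are not taken from the article.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch.

    ratios[i]     : pi_new(a_i | s_i) / pi_old(a_i | s_i); here a_i is assumed
                    to be the choice of an algorithmic component (assumption).
    advantages[i] : advantage estimate for that choice.
    epsilon       : clipping range (0.2 is a common default, assumed here).
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch of three component selections.
    print(ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is the core idea behind PPO's stability.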