Rank-based ant colony optimization applied to dynamic traveling salesman problems


Detailed bibliography
Published in: Engineering Optimization, Vol. 37, No. 8, pp. 831-847
Main author: Liu, Jenn-Long
Format: Journal Article
Language: English
Publisher: Taylor & Francis, 01.12.2005
ISSN: 0305-215X, 1029-0273
Description
Summary: This study proposes a rank-based ant colony optimization (ACO) method with a rank-based nonlinear selective pressure function and a modified Q-learning method to enhance the convergence characteristics of the original ACO of Dorigo et al., which defines the probability of exploring the next city to be visited by ants with a random proportional rule. The probability distribution of the random proportional rule, which is similar to that of the stochastic universal sampling method commonly applied in the selection operation of genetic algorithms, is good at exploring favorable paths in small traveling salesman problems (TSPs), but inefficient at exploring such paths in large TSPs. Therefore, this study presents a rank-based nonlinear selective pressure function, based on the ranking of [τ(r, z)][η(r, z)]^β, to improve the performance of the state transition rule of the original ACO, as well as a modified Q-learning method to solve reinforcement learning problems efficiently. The modified Q-learning method, which incorporates a constant pheromone trail distribution into standard Q-learning, can yield a solution effectively when applied in the local-updating rule of an ACO. In this article, the optimal settings of the control parameter (q) used in the rank-based selective pressure function and the discount factor (γ) associated with the modified Q-learning were investigated numerically using the benchmark St70 case of the static TSP. Furthermore, the improved ACO was applied to the static TSP of the KroA100 case and a route-programming problem with 532 nodes. This study also applied the rank-based ACO to solve dynamic TSPs. By introducing the rank-based nonlinear selective pressure function and the modified Q-learning model into the original ACO, the presented rank-based ACO algorithm effectively explores paths affected by changes in the environment.
In this work, the environment changes are traffic jams and road closures between cities, which sometimes force the salesman to change his route.
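The two ingredients described above can be sketched in Python. This is only an illustrative reading of the abstract, not the paper's algorithm: the geometric ranking weight q·(1-q)^i is one common nonlinear selective-pressure form (the paper's exact function is not given here), and the local update interprets "a constant pheromone trail distribution with standard Q-learning" as an Ant-Q-style rule with a constant term τ0 added to the discounted lookahead. All function and parameter names (`rank_based_choice`, `local_update`, `tau0`) are hypothetical.

```python
import random


def rank_based_choice(candidates, tau, eta, beta=2.0, q=0.3):
    """Pick the next city via rank-based nonlinear selective pressure.

    Candidate cities are ranked by desirability tau[c] * eta[c] ** beta
    (descending); the i-th ranked city gets weight q * (1 - q) ** i.
    This geometric form is an assumption -- the paper's exact pressure
    function is not specified in the abstract.
    """
    ranked = sorted(candidates, key=lambda c: tau[c] * eta[c] ** beta,
                    reverse=True)
    weights = [q * (1 - q) ** i for i in range(len(ranked))]
    pick = random.uniform(0.0, sum(weights))
    acc = 0.0
    for city, w in zip(ranked, weights):
        acc += w
        if pick <= acc:
            return city
    return ranked[-1]  # numerical safety net


def local_update(tau, edge, rho=0.1, tau0=0.01, gamma=0.3,
                 next_candidates=()):
    """Ant-Q-style local pheromone update with a constant term tau0:

    tau(r, s) <- (1 - rho) * tau(r, s)
                 + rho * (tau0 + gamma * max_z tau(s, z))

    where next_candidates are the cities z reachable from s.
    """
    _, s = edge
    future = max((tau[(s, z)] for z in next_candidates), default=0.0)
    tau[edge] = (1 - rho) * tau[edge] + rho * (tau0 + gamma * future)
```

In a full ACO loop, each ant would call `rank_based_choice` to extend its tour and `local_update` on each traversed edge; a global update on the best tour would follow, which is omitted here.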
DOI: 10.1080/03052150500340504