基于端到端深度强化学习求解有能力约束的车辆路径问题

Bibliographic Details
Title: 基于端到端深度强化学习求解有能力约束的车辆路径问题. (Chinese)
Alternate Title: Solving capacitated vehicle routing problems based on end to end deep reinforcement learning. (English)
Authors: 葛斌, 田文智, 夏晨星, 秦望博
Source: Application Research of Computers / Jisuanji Yingyong Yanjiu; Nov2024, Vol. 41 Issue 11, p3245-3250, 6p
Subject Terms: REINFORCEMENT learning, DEEP reinforcement learning, VEHICLE routing problem, REPRESENTATIONS of graphs, HEURISTIC algorithms
Abstract (English): The capacitated vehicle routing problem (CVRP) is currently the most common problem model in supply chain applications and is usually solved with heuristic algorithms. As the problem scale grows, however, heuristic algorithms become slow and cannot guarantee solution quality. This paper proposes an end-to-end deep reinforcement learning (DRL) network framework for the CVRP. First, an edge graph attention network encoder (EGATE) produces feature embeddings of the graph representation of the vehicle routing problem. A multi-head attention decoder (MAD) then decodes the encoded graph representation, and a multi-decoding strategy is proposed to increase the diversity of the solution space. The end-to-end network model is trained with the REINFORCE algorithm with a rollout baseline; the baseline is updated adaptively to improve training effectiveness, and reward function normalization and the Adam optimizer are used to further optimize the algorithm. Finally, experiments on problems of different scales and comparisons with other algorithms validate the feasibility and effectiveness of the proposed end-to-end DRL framework. On the CVRPLIB public dataset, the trained model obtains a relatively good solution with an average solution time of only 0.189 s. [ABSTRACT FROM AUTHOR]
Abstract (Chinese): 有能力约束的车辆路径问题 (CVRP) 是现阶段供应链应用最常见的问题模型, 现多采用启发式算法求解。但随着问题规模增大, 启发式算法求解速度慢且无法保证解的质量。提出端到端深度强化学习 (DRL) 网络框架对CVRP进行研究。首先利用边聚合图注意力网络编码器 (EGATE) 对车辆路径规划问题的图表示进行特征嵌入编码;然后设计多头注意力解码器 (MAD) 进行解码, 并提出多解码策略以增加解的空间多样性;接着利用带回滚基线的基线 REINFORCE 算法对端到端网络模型进行训练, 基线可自适应性更新以提升模型训练效果, 并利用奖励函数归一化和Adam 优化器对算法进行优化。最后通过对不同规模问题的实验以及与其他算法进行对比, 验证了所提出端到端 DRL 框架的可行性与有效性, 经过训练的模型在 CVRPLIB 公共数据集上的平均求解时间仅需0.189s即可得到较优解。 [ABSTRACT FROM AUTHOR]
Database: Complementary Index