PIAR: Path-Improved Adaptive Routing for Dragonfly Networks
For the next-generation exascale supercomputing communication systems, Dragonfly topology offers strong scalability, low latency, and cost efficiency. Dragonfly networks have already been implemented in current supercomputers and will continue to expand in future systems. Adaptive routing in Dragonf...
| Published in: | Proceedings / IEEE International Conference on Cluster Computing, pp. 1-11 |
|---|---|
| Main authors: | Wang, Zhenghao; Wang, Qiang; Lai, Mingche; Xu, Jiaqing; Xu, Jinbo; Xie, Min; Chen, Guo |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 2025-09-02 |
| Subjects: | |
| ISSN: | 2168-9253 |
| Online access: | Full text |
| Abstract | For next-generation exascale supercomputing communication systems, the Dragonfly topology offers strong scalability, low latency, and cost efficiency. Dragonfly networks have already been deployed in current supercomputers and will continue to expand in future systems. Adaptive routing in Dragonfly topologies is critical for network performance. The traditional UGAL routing algorithm, which uses the Valiant mechanism to select non-minimal paths, does not adequately account for the high hop counts of non-minimal paths and often increases the average path length unnecessarily, thereby increasing network latency and load. Furthermore, UGAL inaccurately estimates the congestion of the entire routing path from local information, leading to suboptimal routing decisions that limit the algorithm's performance. In this paper, we propose PIAR, a novel path-improved adaptive routing algorithm. PIAR dynamically selects paths based on the status of local and global channels, prioritizing non-minimal paths with fewer hops to reduce network latency and load, thereby improving network performance. Additionally, we present the microarchitecture of the routing computation unit. Our evaluation results demonstrate that, compared with advanced algorithms such as PAR$_{\text{PH}}$, TPR, and UGAL LE, PIAR achieves an average throughput improvement of 19.2% and reduces latency by up to 13.4% under single synthetic traffic. Under mixed traffic, PIAR achieves an average throughput improvement of 23.6% and reduces latency by up to 33.8%. For application workloads, PIAR achieves an average reduction of 24.0% in packet latency. |
|---|---|
| Author | Xie, Min; Chen, Guo; Wang, Zhenghao; Wang, Qiang; Xu, Jinbo; Lai, Mingche; Xu, Jiaqing |
| Author_xml | 1. Wang, Zhenghao (zh_wang@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 2. Wang, Qiang (qiangwang@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 3. Lai, Mingche (mingchelai@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 4. Xu, Jiaqing (xujiaqing@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 5. Xu, Jinbo (xujinbo@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 6. Xie, Min (xiemin@nudt.edu.cn), College of Computer Science and Technology, National University of Defense Technology, Changsha, China; 7. Chen, Guo (guochen@hnu.edu.cn), College of Computer Science and Electronic Engineering, Hunan University, Changsha, China |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/CLUSTER59342.2025.11186482 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798331530198 |
| EISSN | 2168-9253 |
| EndPage | 11 |
| ExternalDocumentID | 11186482 |
| Genre | orig-research |
| GrantInformation_xml | Funder: National Key Research and Development Program of China; grant ID: 2023YFB4403400; funder ID: 10.13039/501100012166 |
| GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI OCL RIE RIL RNS |
| ID | FETCH-LOGICAL-i93t-37e73417b59e5da457fe66c8ef43be00e08beabc651fe577bf46ffbb7c808cc63 |
| IEDL.DBID | RIE |
| IngestDate | Wed Oct 15 14:21:20 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i93t-37e73417b59e5da457fe66c8ef43be00e08beabc651fe577bf46ffbb7c808cc63 |
| PageCount | 11 |
| ParticipantIDs | ieee_primary_11186482 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-Sept.-2 |
| PublicationDateYYYYMMDD | 2025-09-02 |
| PublicationDate_xml | – month: 09 year: 2025 text: 2025-Sept.-2 day: 02 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / IEEE International Conference on Cluster Computing |
| PublicationTitleAbbrev | CLUSTER |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0037306 |
| Score | 2.3020515 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 1 |
| SubjectTerms | adaptive routing; Adaptive systems; dragonfly topology; Heuristic algorithms; High-performance interconnection network; Network topology; Next generation networking; Routing; Scalability; Supercomputers; Throughput; Topology; Traffic control |
| Title | PIAR: Path-Improved Adaptive Routing for Dragonfly Networks |
| URI | https://ieeexplore.ieee.org/document/11186482 |
| hasFullText | 1 |
| inHoldings | 1 |
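
The abstract contrasts UGAL's local, hop-weighted congestion estimate with PIAR's preference for shorter non-minimal paths. The record does not give PIAR's decision logic, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It uses the commonly cited UGAL-style cost (local queue occupancy multiplied by hop count) and, as an assumption made purely for illustration, breaks cost ties among non-minimal candidates in favor of fewer hops. The names `Candidate`, `ugal_cost`, and `choose_path` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate path as seen from the source router (hypothetical structure)."""
    name: str
    hops: int         # router-to-router hops along the path
    queue_depth: int  # local output-queue occupancy toward the path's first hop


def ugal_cost(path: Candidate) -> int:
    """UGAL-style estimate: local queue occupancy times hop count."""
    return path.queue_depth * path.hops


def choose_path(minimal: Candidate, nonminimal: list[Candidate]) -> Candidate:
    """Pick the minimal path unless a non-minimal candidate looks cheaper.

    Assumption for illustration only: among non-minimal candidates with equal
    estimated cost, the one with fewer hops is preferred.
    """
    best = min(nonminimal, key=lambda p: (ugal_cost(p), p.hops), default=None)
    if best is None or ugal_cost(minimal) <= ugal_cost(best):
        return minimal
    return best
```

For instance, with a congested 3-hop minimal path (queue depth 10, cost 30) and two non-minimal candidates of equal cost 24 (6 hops with depth 4, and 4 hops with depth 6), this sketch selects the 4-hop candidate. Because the estimate relies only on the source router's queues, it also illustrates the limitation the abstract attributes to UGAL: congestion further along the path is invisible to the decision.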