HyperX Topology: First At-Scale Implementation and Comparison to the Fat-Tree

Bibliographic details
Published in: SC19: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-23
Main authors: Domke, Jens; Matsuoka, Satoshi; Ivanov, Ivan R.; Tsushima, Yuki; Yuki, Tomoya; Nomura, Akihiro; Miura, Shin'ichi; McDonald, Nic; Floyd, Dennis L.; Dube, Nicolas
Format: Conference paper
Language: English
Published: ACM, 17 November 2019
ISSN: 2167-4337
Description
Summary: The de facto standard topology for modern HPC systems and data centers is the Folded Clos network, commonly known as the Fat-Tree. The number of network endpoints in these systems is steadily increasing. The switch radix increase is not keeping up, forcing an increased path length in these multi-level trees that will limit gains for latency-sensitive applications. Additionally, today's Fat-Trees force the extensive use of active optical cables, which carries a prohibitive cost structure at scale. To tackle these issues, researchers have proposed various low-diameter topologies, such as Dragonfly. Another novel option, so far studied only theoretically, is the HyperX. We built the world's first 3 Pflop/s supercomputer with two separate networks, a 3-level Fat-Tree and a 12×8 HyperX. This dual-plane system allows us to perform a side-by-side comparison using a broad set of benchmarks. We show that the HyperX, together with our novel communication pattern-aware routing, can challenge the performance of, or even outperform, traditional Fat-Trees.
DOI: 10.1145/3295500.3356140