Distributed Discrete Morse Sandwich: Efficient Computation of Persistence Diagrams for Massive Scalar Data

Bibliographic details
Published in: IEEE Transactions on Parallel and Distributed Systems, Volume 37, Issue 1, pp. 137-154
Main authors: Le Guillou, Eve; Fortin, Pierre; Tierny, Julien
Format: Journal Article
Language: English
Published: IEEE (Institute of Electrical and Electronics Engineers), 01.01.2026
ISSN: 1045-9219, 1558-2183
Description
Summary: The persistence diagram, which describes the topological features of a dataset, is a key descriptor in Topological Data Analysis. The "Discrete Morse Sandwich" (DMS) method has been reported to be the most efficient algorithm for computing persistence diagrams of 3D scalar fields on a single node, using shared-memory parallelism. In this work, we extend DMS to distributed-memory parallelism for the efficient and scalable computation of persistence diagrams for massive datasets across multiple compute nodes. On the one hand, we can leverage the embarrassingly parallel procedure of the first and most time-consuming step of DMS (namely the discrete gradient computation). On the other hand, the efficient distributed computations of the subsequent DMS steps are much more challenging. To address this, we have extensively revised the DMS routines by contributing a new self-correcting distributed pairing algorithm, redesigning key data structures and introducing computation tokens to coordinate distributed computations. We have also introduced a dedicated communication thread to overlap communication and computation. Detailed performance analyses show the scalability of our hybrid MPI+thread approach for strong and weak scaling using up to 16 nodes of 32 cores (512 cores total). Our algorithm outperforms DIPHA, a reference method for the distributed computation of persistence diagrams, with an average speedup of ×8 on 512 cores. We show the practical capabilities of our approach by computing the persistence diagram of a public 3D scalar field of 6 billion vertices in 174 seconds on 512 cores. Finally, we provide a usage example of our open-source implementation at https://github.com/eve-le-guillou/DDMS-example.
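
The central architectural idea mentioned in the abstract is a hybrid MPI+thread design in which a dedicated communication thread overlaps message exchange with ongoing computation. The C++ sketch below illustrates that general pattern only; it is a minimal, hypothetical example, not the authors' DMS implementation, and the ring exchange, batch sizes, and function names are illustrative assumptions.

// Hypothetical sketch of the communication/computation overlap pattern:
// worker threads produce batches of partial results while a dedicated
// communication thread exchanges them with neighboring MPI ranks.
#include <mpi.h>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

constexpr int kBatches = 8;       // batches produced per rank (illustrative)
constexpr int kBatchSize = 1024;  // values per batch (illustrative)

std::mutex mtx;
std::condition_variable cv;
std::queue<std::vector<double>> outbox;  // batches waiting to be sent

// Worker: produces partial results and hands them to the communication
// thread, then immediately continues computing the next batch.
void worker(int rank) {
  for (int b = 0; b < kBatches; ++b) {
    std::vector<double> batch(kBatchSize);
    for (int i = 0; i < kBatchSize; ++i)
      batch[i] = rank * 1000.0 + b + i * 1e-3;  // stand-in for real work
    {
      std::lock_guard<std::mutex> lock(mtx);
      outbox.push(std::move(batch));
    }
    cv.notify_one();
  }
}

// Dedicated communication thread: overlaps MPI traffic with the worker's
// computation by sending each finished batch to the next rank in a ring
// and receiving the corresponding batch from the previous rank.
void communicator(int rank, int size, double* checksum) {
  const int next = (rank + 1) % size;
  const int prev = (rank - 1 + size) % size;
  for (int b = 0; b < kBatches; ++b) {
    std::vector<double> batch;
    {
      std::unique_lock<std::mutex> lock(mtx);
      cv.wait(lock, [] { return !outbox.empty(); });
      batch = std::move(outbox.front());
      outbox.pop();
    }
    std::vector<double> incoming(kBatchSize);
    MPI_Sendrecv(batch.data(), kBatchSize, MPI_DOUBLE, next, 0,
                 incoming.data(), kBatchSize, MPI_DOUBLE, prev, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    for (double v : incoming) *checksum += v;
  }
}

int main(int argc, char** argv) {
  int provided = 0;
  // The communication thread is not the main thread, so request a thread
  // support level that allows it to call MPI (MULTIPLE is the safe choice).
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  double checksum = 0.0;
  std::thread commThread(communicator, rank, size, &checksum);
  worker(rank);  // computation proceeds while commThread communicates
  commThread.join();

  std::printf("rank %d received checksum %.3f\n", rank, checksum);
  MPI_Finalize();
  return 0;
}

Built with an MPI compiler wrapper (e.g., mpicxx with -pthread) and launched with mpirun, each rank keeps computing while its communication thread exchanges batches; in a real setting the payloads would be algorithm-specific messages rather than synthetic data.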
DOI: 10.1109/TPDS.2025.3626047