Efficient persistence landscape generation

Using topological summary tools such as persistence landscapes have greatly enhanced the practical usage of topological data analysis to analyze large-scale, noisy, and complex datasets. A central element of persistence landscape usage involves computing the top- k landscapes. This article presents...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of algorithms & computational technology Jg. 19
Hauptverfasser: Henderson, Taylor, Simon, Robert
Format: Journal Article
Sprache:Englisch
Veröffentlicht: SAGE Publishing 01.06.2025
ISSN:1748-3018, 1748-3026
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Using topological summary tools such as persistence landscapes have greatly enhanced the practical usage of topological data analysis to analyze large-scale, noisy, and complex datasets. A central element of persistence landscape usage involves computing the top- k landscapes. This article presents a novel output-sensitive plane sweep algorithm for computing the top- k persistence landscapes in optimal time and space: significantly outperforming previous algorithms. Our algorithm can determine in optimal O ( n * log ( n ) ) if a given birth-death pair appears in the top- k landscapes. The runtime performance of the approach on a botnet dataset and several synthetically generated point cloud topologies, showing that the algorithm can achieve significant speedups for these datasets due to its better algorithmic design. The speedups seen range from slightly worse (in some extreme examples) to equal compared to previous works while returning exactly the same output and is significantly faster when filtering is used (15x for birth-death pairs when removing 75% of birth-death pairs). Filtering is shown to maintain machine learning performance on both synthetically generated and real world datasets while providing orders of magnitude speedup depending on how intensive of filtering is done. Due to the introduced algorithm’s algorithmic design, the speedup seen is greater when filtering using the introduced birth-death filtering algorithm. The software is freely provided in Rust with Python bindings online.
ISSN:1748-3018
1748-3026
DOI:10.1177/17483026251347091