The fast and the capacious: memory-efficient multi-GPU accelerated explicit state space exploration with GPUexplore 3.0

The GPU acceleration of explicit state space exploration, for explicit-state model checking, has been the subject of previous research, but to date, the tools have been limited in their applicability and in their practical use. Considering this research, to our knowledge, we are the first to use a n...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Frontiers in high performance computing Jg. 2
Hauptverfasser: Wijs, Anton, Osama, Muhammad
Format: Journal Article
Sprache:Englisch
Veröffentlicht: 13.03.2024
ISSN:2813-7337, 2813-7337
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The GPU acceleration of explicit state space exploration, for explicit-state model checking, has been the subject of previous research, but to date, the tools have been limited in their applicability and in their practical use. Considering this research, to our knowledge, we are the first to use a novel tree database for GPUs. This novel tree database allows high-performant, memory-efficient storage of states in the form of binary trees. Besides the tree compression this enables, we also propose two new hashing schemes, compact-cuckoo and compact multiple-functions. These schemes enable the use of Cleary compression to compactly store tree roots. Besides an in-depth discussion of the tree database algorithms, the input language and workflow of our tool, called GPUexplore 3.0, are presented. Finally, we explain how the algorithms can be extended to exploit multiple GPUs that reside on the same machine. Experiments show single-GPU processing speeds of up to 144 million states per second compared to 20 million states achieved by 32-core LTSmin. In the multi-GPU setting, workload and storage distributions are optimal, and, frequently, performance is even positively impacted when the number of GPUs is increased. Overall, a logarithmic acceleration up to 1.9× was achieved with four GPUs, compared to what was achieved with one and two GPUs. We believe that a linear speedup can be easily accomplished with faster P2P communications between the GPUs.
ISSN:2813-7337
2813-7337
DOI:10.3389/fhpcp.2024.1285349