Filling the Void: Data-Driven Machine Learning-based Reconstruction of Sampled Spatiotemporal Scientific Simulation Data

Detailed bibliography
Published in:	SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 290-299
Main authors:	Biswas, Ayan; Mishra, Aditi; Majumder, Meghanto; Hazarika, Subhashis; Most, Alexander; Castorena, Juan; Bryan, Christopher; McCormick, Patrick; Ahrens, James; Lawrence, Earl; Hagberg, Aric
Format:	Conference paper
Language:	English
Publication details:	IEEE, 17 November 2024
Subjects:	Computational modeling; computing methodologies; Data models; data reduction; Deep learning; Exascale computing; Filling; Interpolation; Knowledge transfer; machine learning; machine learning approaches; modeling and simulation; Neural networks; reconstruction; scientific visualization; simulation types and techniques; Spatiotemporal phenomena; Training

Description
Summary:	As high-performance computing systems continue to advance, the gap between computing performance and I/O capabilities is widening. This bottleneck limits the storage capabilities of increasingly large-scale simulations, which generate data at never-before-seen granularities while only being able to store a small subset of the raw data. Recently, strategies for data-driven sampling have been proposed to intelligently sample the data in a way that achieves high data reduction rates while preserving important regions or features with high fidelity. However, a thorough analysis of how such intelligent samples can be used for data reconstruction is lacking. We propose a data-driven machine learning approach based on training neural networks to reconstruct full-scale datasets based on a simulation's sampled output. Compared to current state-of-the-art reconstruction approaches such as Delaunay triangulation-based linear interpolation, we demonstrate that our machine learning-based reconstruction has several advantages, including reconstruction quality, time-to-reconstruct, and knowledge transfer to unseen timesteps and grid resolutions. We propose and evaluate strategies that balance the sampling rates with model training (pretraining and fine-tuning) and data reconstruction time to demonstrate how such machine learning approaches can be tailored for both speed and quality for the reconstruction of grid-based datasets.
DOI:	10.1109/SCW63240.2024.00045