Checkpointing vs. Supervision Resilience Approaches for Dynamic Independent Tasks

With the advent of exascale computing, issues such as application irregularity and permanent hardware failure are growing in importance. Irregularity is often addressed by task-based parallel programming coupled with work stealing. At the task level, resilience can be provided by two principal appro...

Bibliographic Details
Published in: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 556-565
Main Authors: Posner, Jonas; Reitz, Mia; Fohry, Claudia
Format: Conference Proceeding
Language: English
Published: IEEE 01.06.2021