Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI

A long-term trend in high-performance computing is the increasing number of nodes in parallel computing platforms, which entails a higher failure probability. Fault tolerant programming environments should be used to guarantee the safe execution of critical applications. Research in fault tolerant M...

Bibliographic Details
Published in: Conference on High Performance Networking and Computing: Proceedings of the 2006 ACM/IEEE conference on Supercomputing; 11-17 Nov. 2006 pp. 127 - es
Main Authors: Coti, Camille; Herault, Thomas; Lemarinier, Pierre; Pilard, Laurence; Rezmerita, Ala; Rodriguez, Eric; Cappello, Franck
Format: Conference Proceeding
Language: English
Published: New York, NY, USA: ACM, 11.11.2006
Series: ACM Conferences
ISBN: 0769527000, 9780769527000