Checkpointing Orchestration: Toward a Scalable HPC Fault-Tolerant Environment

Checkpointing is widely used in technical computing. However, the overhead of checkpointing has become a subject of increasing concern in recent years, especially for large-scale parallel computer systems. In these systems, checkpointing generates a huge number of concurrent I/O writes. The burst of w...

Bibliographic Details
Published in: 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 276-283
Main Authors: Hui Jin, Tao Ke, Yong Chen, Xian-He Sun
Format: Conference Proceeding
Language: English
Published: IEEE 01.05.2012
ISBN: 1467313955, 9781467313957