Optimization of checkpointing-related I/O for high-performance parallel and distributed computing
Checkpointing, the process of saving program/application state, usually to stable storage, has been the most common fault-tolerance methodology for high-performance applications. The rate of checkpointing (how often) is primarily driven by the failure rate of the system. If the checkpointing rate...
| Published in: | The Journal of Supercomputing, Vol. 46, no. 2, pp. 150-180 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Boston, Springer US, 01.11.2008 |
| Subjects: | |
| ISSN: | 0920-8542, 1573-0484 |