A distributed consistent global checkpoint algorithm for distributed mobile systems
| Published in: | Proceedings. Eighth International Conference on Parallel and Distributed Systems. ICPADS 2001, pp. 125-132 |
|---|---|
| Main author: | |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 2001 |
| Subjects: | |
| ISBN: | 0769511538, 9780769511535 |
| ISSN: | 1521-9097 |
| Summary: | A distributed coordinated checkpointing algorithm for distributed mobile systems is presented. A consistent global checkpoint is a set of states in which no message is recorded as received in one process and as not yet sent in another process. It is used for rollback when process failure occurs. A consistent global checkpoint must be obtained for any checkpoint initiation by any process. This paper shows a checkpoint algorithm in which the amount of information piggybacked on program messages does not depend on the number of mobile processes. The number of checkpoints is minimized under two assumptions: (1) one consistent global checkpoint is taken for concurrent checkpoint initiations and (2) a checkpoint is initiated at each handoff by mobile processes. This algorithm is thus optimal among the generalizations of Chandy and Lamport's distributed snapshot algorithm under the latter assumption. |
|---|---|
| DOI: | 10.1109/ICPADS.2001.934810 |
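
The consistency condition stated in the summary can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only checks whether a given set of local checkpoints contains an orphan message, that is, a message recorded as received by one process but not recorded as sent by its sender. The names `Message`, `LocalCheckpoint`, `orphan_messages`, and `is_consistent` are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical record of one program message.
    msg_id: int
    sender: str
    receiver: str


@dataclass(frozen=True)
class LocalCheckpoint:
    # Hypothetical local state: which message ids this process
    # had recorded as sent and as received when it checkpointed.
    process: str
    sent: frozenset
    received: frozenset


def orphan_messages(checkpoints, messages):
    """Return messages recorded as received but not recorded as sent."""
    by_process = {c.process: c for c in checkpoints}
    return [
        m for m in messages
        if m.msg_id in by_process[m.receiver].received
        and m.msg_id not in by_process[m.sender].sent
    ]


def is_consistent(checkpoints, messages):
    """A global checkpoint is consistent when it has no orphan message."""
    return not orphan_messages(checkpoints, messages)


m1 = Message(msg_id=1, sender="p", receiver="q")

# Sender checkpointed after sending, receiver after receiving: consistent.
good = [
    LocalCheckpoint("p", sent=frozenset({1}), received=frozenset()),
    LocalCheckpoint("q", sent=frozenset(), received=frozenset({1})),
]

# Receiver recorded the message, sender did not: m1 is an orphan.
bad = [
    LocalCheckpoint("p", sent=frozenset(), received=frozenset()),
    LocalCheckpoint("q", sent=frozenset(), received=frozenset({1})),
]

# Sent but not yet received (in transit): still consistent.
in_transit = [
    LocalCheckpoint("p", sent=frozenset({1}), received=frozenset()),
    LocalCheckpoint("q", sent=frozenset(), received=frozenset()),
]
```

Under these assumptions, `is_consistent(good, [m1])` and `is_consistent(in_transit, [m1])` hold, while `bad` is rejected because rolling back to it would leave `q` having received a message that `p` never sent.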

