A survey of recoverable distributed shared virtual memory systems

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, Vol. 8, no. 9, pp. 959-969
Main Authors: Morin, C., Puaut, I.
Format: Journal Article
Language: English
Published: IEEE 01.09.1997
ISSN: 1045-9219
Description
Summary: Distributed Shared Virtual Memory (DSVM) systems provide a shared memory abstraction on distributed memory architectures. Such systems ease parallel application programming because the shared-memory programming model is often more natural than the message-passing paradigm. However, the probability of failure of a DSVM increases with the number of sites. Thus, fault tolerance mechanisms must be implemented in order to allow processes to continue their execution in the event of a failure. This paper gives an overview of recoverable DSVMs (RDSVMs) that provide a checkpointing mechanism to restart parallel computations in the event of a site failure.
DOI: 10.1109/71.615441