A multiprocessor scheduling algorithm for low overhead fault-tolerance

Bibliographic details
Published in: Proceedings - Symposium on Reliable Distributed Systems, pp. 186-194
Main authors: Hashimoto, K., Tsuchiya, T., Kikuno, T.
Format: Conference Paper, Journal Article
Language: English
Published: IEEE, 01.01.1998
ISBN: 0818692189, 9780818692185
ISSN: 1060-9857
Description
Summary: We propose a new scheduling algorithm for achieving fault tolerance in multiprocessor systems. The new algorithm partitions a parallel program into subsets of tasks based on some characteristics of a task graph. Then for each subset, the algorithm duplicates and schedules its tasks successively. Applying the proposed algorithm to three kinds of practical task graphs (Gaussian elimination, Laplace equation solver and LU decomposition), we conduct simulations. Experimental results show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.
DOI: 10.1109/RELDIS.1998.740493