Parallel program debugging by specification

Bibliographic Details
Published in: Concurrency and computation, Vol. 16, no. 6, pp. 551-585
Main Authors: Huband, Simon; McDonald, Chris
Format: Journal Article
Language: English
Published: Chichester, UK: John Wiley & Sons, Ltd, 01.05.2004
ISSN:1532-0626, 1532-0634
Description
Summary: Most message passing parallel programs employ logical process topologies with regular characteristics to support their computation. Since process topologies define the relationship between processes, they present an excellent opportunity for debugging. The primary benefit is that process behaviours can be correlated, allowing expected behaviour to be abstracted and identified, and undesirable behaviour reported. However, topology support is inadequate in most message passing parallel programming environments, including the popular Message Passing Interface (MPI) and the Parallel Virtual Machine (PVM). Programmers are forced to implement topology support themselves, increasing the possibility of introducing errors. This paper proposes a trace‐ and topology‐based approach to parallel program debugging, driven by four distinct types of specifications. Trace specifications allow trace data from a variety of sources and message passing libraries to be interpreted in an abstract manner, and topology specifications address the lack of explicit topology knowledge, whilst also facilitating the construction of user‐consistent views of the debugging activity. Loop specifications express topology‐consistent patterns of expected trace events, allowing conformance testing of associated trace data, and error specifications specify undesirable event interactions, including mismatched message sizes and mismatched communication pairs. Both loop and error specifications are simplified by having knowledge of the actual topologies being debugged. The proposed debugging framework enables a wealth of potential debugging views and techniques. Copyright © 2004 John Wiley & Sons, Ltd.
Bibliography:ark:/67375/WNG-ZF1055NF-C
ArticleID:CPE762
istex:3EF0FD2B53CE8F1D183960983FDDF85CFBB2FB7D
DOI:10.1002/cpe.762
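
Illustration: The summary mentions error specifications that flag mismatched message sizes and mismatched communication pairs, simplified by knowledge of the topology. The Python sketch below shows one way such a check over trace data might look. It is not the authors' specification language or tool. The `Event` record, the `ring_neighbours` helper, the `check_trace` function and the choice of a ring topology are all hypothetical, and the sketch assumes a merged trace in which each send appears before its matching receive.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One hypothetical trace record (not a real MPI or PVM trace format)."""
    kind: str  # "send" or "recv"
    src: int   # rank of the sending process
    dst: int   # rank of the receiving process
    size: int  # message size in bytes, as seen by the process logging the event


def ring_neighbours(rank, nprocs):
    """Neighbours of `rank` in an assumed ring topology of `nprocs` processes."""
    return {(rank - 1) % nprocs, (rank + 1) % nprocs}


def check_trace(events, nprocs):
    """Report undesirable event interactions found in an ordered trace."""
    errors = []
    pending = {}  # (src, dst) -> queue of sends not yet matched by a receive
    for ev in events:
        if ev.kind == "send":
            # Topology knowledge: a send should only target a ring neighbour.
            if ev.dst not in ring_neighbours(ev.src, nprocs):
                errors.append(f"off-topology send {ev.src}->{ev.dst}")
            pending.setdefault((ev.src, ev.dst), deque()).append(ev)
        elif ev.kind == "recv":
            queue = pending.get((ev.src, ev.dst))
            if not queue:
                # Mismatched communication pair: receive with no earlier send.
                errors.append(f"recv without send {ev.src}->{ev.dst}")
                continue
            sent = queue.popleft()  # match sends and receives in FIFO order
            if sent.size != ev.size:
                errors.append(
                    f"size mismatch {ev.src}->{ev.dst}: "
                    f"sent {sent.size}, expected {ev.size}"
                )
    # Mismatched communication pair: sends that were never received.
    for (src, dst), queue in pending.items():
        for _ in queue:
            errors.append(f"send without recv {src}->{dst}")
    return errors


if __name__ == "__main__":
    trace = [
        Event("send", 0, 1, 8),
        Event("recv", 0, 1, 8),
        Event("send", 1, 2, 16),
        Event("recv", 1, 2, 4),
    ]
    for problem in check_trace(trace, nprocs=4):
        print(problem)
```

Matching sends to receives per (source, destination) pair in FIFO order is a simplification chosen for this sketch. A real trace would also need message tags and communicator information to pair events reliably.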