Evaluating high level parallel programming support for irregular applications in ICC++

Bibliographic Details
Published in: Software: Practice & Experience, Vol. 28, No. 11, pp. 1213-1243
Main Authors: Chien, Andrew A., Dolby, Julian, Ganguly, Bishwaroop, Karamcheti, Vijay, Zhang, Xingbin
Format: Journal Article
Language: English
Published: New York: John Wiley & Sons, Ltd, 1 September 1998
ISSN: 0038-0644, 1097-024X
Description
Summary: Object‐oriented techniques have been proffered as aids for managing complexity, enhancing reuse, and improving readability of irregular parallel applications. However, as performance is the major reason for employing parallelism, programmability and high performance must be delivered together. Using a suite of seven challenging irregular applications, the mature Illinois Concert system (a high‐level concurrent object‐oriented programming model), and an aggressive implementation (whole‐program compilation plus microsecond threading and communication primitives in the runtime), we evaluate what programming effort is required to achieve high performance. For all seven applications, we achieve performance comparable to the best achievable via low‐level programming means on large‐scale parallel systems. In general, a high‐level concurrent object‐oriented programming model supported by aggressive implementation techniques can eliminate programmer management of many concerns – procedure and computation granularity, namespace management, and low‐level concurrency management. Our study indicates that these concerns are fully automated for these applications. Decoupling these concerns makes managing the remaining fundamental concerns – data locality and load balance – much easier. In several cases, data locality and load balance for the complex algorithms and pointer data structures are managed automatically by the compiler and runtime, but in general programmer intervention was required. In a few cases, more detailed control is required, specifically explicit task priority, data consistency, and task placement. Our system integrates the expression of such information cleanly into the programming interface. Finally, only small changes to the sequential code were required to express concurrency and performance optimizations; less than 5 per cent of the source code lines were changed in all cases. This bodes well for supporting both sequential and parallel performance in a single code base. © 1998 John Wiley & Sons, Ltd.
Bibliography: istex:518AAF8BD5D3D69F0377944700F2B6F3AD47F3B3
ArticleID: SPE201
ark:/67375/WNG-8RLFN234-M
DOI: 10.1002/(SICI)1097-024X(199809)28:11<1213::AID-SPE201>3.0.CO;2-M