A new thread-level speculative automatic parallelization model and library based on duplicate code execution

Bibliographic Details
Published in:	The Journal of Supercomputing, Vol. 80, no. 10, pp. 13714-13737
Main Authors:	Martínez, Millán A.; Fraguela, Basilio B.; Cabaleiro, José C.; Rivera, Francisco F.
Format:	Journal Article
Language:	English
Published:	New York: Springer US, 01.07.2024; Springer Nature B.V.
Subjects:	C plus plus; Compilers; Computer Science; Interpreters; Libraries; Processor Architectures; Programming Languages; Software; User services; Automatic parallelization; Thread-level speculation; Template metaprogramming; Speculative parallelism
ISSN:	0920-8542, 1573-0484

Description
Summary:	Loop-efficient automatic parallelization has become increasingly relevant due to the growing number of cores in current processors and the programming effort needed to parallelize codes in these systems efficiently. However, automatic tools fail to extract all the available parallelism in irregular loops with indirections, race conditions or potential data dependency violations, among many other possible causes. One of the successful ways to automatically parallelize these loops is the use of speculative parallelization techniques. This paper presents a new model and the corresponding C++ library that supports the speculative automatic parallelization of loops in shared memory systems, seeking competitive performance and scalability while keeping user effort to a minimum. The primary speculative strategy consists of redundantly executing chunks of loop iterations in a duplicate fashion. Namely, each chunk is executed speculatively in parallel to obtain results as soon as possible and sequentially in a different thread to validate the speculative results. The implementation uses C++11 threads and it makes intensive use of templates and advanced multithreading techniques. An evaluation based on various benchmarks confirms that our proposal provides a competitive level of performance and scalability.
DOI:	10.1007/s11227-024-05987-0
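
Illustration:	The duplicate-execution strategy described in the summary can be sketched roughly as follows. This is a minimal illustration written for this record, not the paper's library or its API: the names `run_sequential`, `run_speculative`, `speculative_loop` and `Result`, the example loop `a[i] = a[idx[i]] + 1`, and the use of `std::async` are all assumptions. Each chunk is run twice: once speculatively across several threads, assuming no iteration reads a value written earlier in the same chunk, and once sequentially in a separate thread. The next chunk starts from the unvalidated speculative result, and chunks are committed in order once the sequential duplicate confirms them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <thread>
#include <utility>
#include <vector>

using State = std::vector<long>;

// Reference (sequential) execution of iterations [lo, hi) of the example loop
// a[i] = a[idx[i]] + 1. An iteration may read a value written by an earlier one.
State run_sequential(State a, const std::vector<std::size_t>& idx,
                     std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo; i < hi; ++i) a[i] = a[idx[i]] + 1;
    return a;
}

// Speculative execution of the same chunk. Assumption: no iteration depends on
// a write made earlier in this chunk, so all reads use the chunk-entry snapshot
// `in` and each thread writes a disjoint range of `out` (no data races).
State run_speculative(const State& in, const std::vector<std::size_t>& idx,
                      std::size_t lo, std::size_t hi, unsigned nthreads) {
    State out = in;
    std::vector<std::thread> workers;
    const std::size_t len = hi - lo;
    for (unsigned t = 0; t < nthreads; ++t) {
        const std::size_t b = lo + len * t / nthreads;
        const std::size_t e = lo + len * (t + 1) / nthreads;
        workers.emplace_back([&in, &out, &idx, b, e] {
            for (std::size_t i = b; i < e; ++i) out[i] = in[idx[i]] + 1;
        });
    }
    for (auto& w : workers) w.join();
    return out;
}

struct Result {
    State state;                  // final, validated loop state
    std::size_t misspeculations;  // chunks whose speculative result was rejected
};

// Hypothetical driver: idx.size() is the trip count, every idx[i] < idx.size().
Result speculative_loop(State validated, const std::vector<std::size_t>& idx,
                        std::size_t chunk, unsigned nthreads = 2) {
    assert(chunk > 0 && nthreads > 0 && validated.size() == idx.size());
    const std::size_t n = idx.size();
    std::size_t start = 0, misses = 0;
    while (start < n) {
        std::vector<std::future<State>> checks;  // sequential duplicates
        std::vector<State> guesses;              // speculative results
        State cur = validated;
        for (std::size_t lo = start; lo < n; lo += chunk) {
            const std::size_t hi = std::min(n, lo + chunk);
            // Duplicate 1: sequential run of the chunk in its own thread.
            checks.push_back(std::async(std::launch::async, run_sequential,
                                        cur, std::cref(idx), lo, hi));
            // Duplicate 2: speculative parallel run; the next chunk proceeds
            // from this result before it has been validated.
            cur = run_speculative(cur, idx, lo, hi, nthreads);
            guesses.push_back(cur);
        }
        // Commit in order. The sequential result of a chunk is always correct
        // when its input was validated, so on a mismatch it is kept and the
        // remaining chunks are restarted from it.
        std::size_t lo = start;
        for (std::size_t k = 0; k < checks.size(); ++k, lo += chunk) {
            State truth = checks[k].get();
            const bool match = (truth == guesses[k]);
            validated = std::move(truth);
            start = std::min(n, lo + chunk);
            if (!match) { ++misses; break; }
        }
    }
    return {std::move(validated), misses};
}
```

Calling `speculative_loop(a, idx, 4)` returns the same state as `run_sequential(a, idx, 0, idx.size())` whether or not the speculation holds; only the number of rejected chunks differs. The sketch copies the whole state per chunk and starts one validator thread per chunk, which keeps the logic visible but would be too costly in a real implementation.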