An efficient hardware supported and parallelization architecture for intelligent systems to overcome speculative overheads

Bibliographic details
Published in: International journal of intelligent systems Vol. 37; no. 12; pp. 11764-11790
Main authors: Kumar, Sudhakar; Singh, Sunil K.; Aggarwal, Naveen; Gupta, Brij B.; Alhalabi, Wadee; Band, Shahab S.
Format: Journal Article
Language: English
Published: New York, John Wiley & Sons, Inc, 1 December 2022
ISSN: 0884-8173, 1098-111X
Description
Summary: In the last few decades, technology advancements have paved the way for the creation of intelligent and autonomous systems that utilize complex calculations which are both time‐consuming and central processing unit intensive. As a consequence, parallel processing systems are gaining popularity to enhance overall computer performance. Programmers should be able to efficiently utilize available hardware resources with parallelization in an ideal world. Through the automatic parallelization of sequential code, multithreading can be executed without extra supervision. However, a wide range of software dependencies prevents this from being feasible. An architectural framework for speculative parallelization along with an efficient memory analysis and computational algorithms for the code generation are proposed that can provide optimal performance. Furthermore, a suitable support of hardware design as a runtime library to the proposed architectural framework is presented which can be used to recover misspeculated results during execution to minimize speculative parallelism overhead. The implementation makes use of the Low‐Level Virtual Machine compiler infrastructure and is tested on numerous benchmarks, thus making it highly scalable in terms of programming languages and architectures. According to our experimental results, there is significant potential for speedup increase. In comparison to the overall function speedup, that is, geomean speedup of 5.2× approximately when using the proposed architecture without hardware support, the proposed architectural framework and algorithm with hardware support give an average geomean speedup of 7.0× approximately on the given benchmark which is written in C/C++.
Bibliography: All authors contributed equally to this study.
DOI: 10.1002/int.23062