Enhancing Parallelization with OpenMP through Multi-Modal Transformer Learning

Bibliographic Details
Published in:Proceedings (International Conference on Computer Engineering and Applications. Online) pp. 465 - 469
Main Authors: Chen, Yuehua, Yuan, Huaqiang, Hou, Fengyao, Hu, Peng
Format: Conference Proceeding
Language:English
Published: IEEE 12.04.2024
Subjects:
ISSN:2159-1288
Abstract The popularity of multicore processors and the rise of High Performance Computing as a Service (HPCaaS) have made parallel programming essential to fully utilize the performance of multicore systems. OpenMP, a widely adopted shared-memory parallel programming model, is favored for its ease of use. However, assisting and accelerating the automation of its parallelization remains challenging. Although existing automation tools such as Cetus and DiscoPoP simplify parallelization, they still have limitations when dealing with complex data dependencies and control flows. Inspired by the success of deep learning in the field of Natural Language Processing (NLP), this study adopts a Transformer-based model to tackle the problem of automatic parallelization with OpenMP directives. We propose a novel Transformer-based multimodal model, ParaMP, to improve the accuracy of OpenMP directive classification. The ParaMP model not only takes into account the sequential features of the code text but also incorporates structural features, enriching the model's input by representing the Abstract Syntax Trees (ASTs) corresponding to the code as binary trees. In addition, we built the BTCode dataset, which contains a large number of C/C++ code snippets and their corresponding simplified AST representations, to provide a basis for model training. Experimental evaluation shows that our model outperforms other existing automated tools and models on key performance metrics such as F1 score and recall. This study demonstrates a significant improvement in the accuracy of OpenMP directive classification by combining the sequential and structural features of code text, offering valuable insight into applying deep learning techniques to programming tasks.
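The record does not spell out how ParaMP's "simplified AST representations" are produced; a minimal sketch, assuming the standard left-child/right-sibling (Knuth) transform, of turning an arbitrary-arity AST into a binary tree whose preorder traversal can feed a sequence model. Python's built-in `ast` module stands in for a C/C++ parser here, and the names `BTNode`, `to_binary_tree`, and `preorder` are illustrative, not from the paper:

```python
import ast


class BTNode:
    """Binary-tree node: `left` holds the first child, `right` the next sibling."""

    def __init__(self, label):
        self.label = label
        self.left = None   # first child in the original AST
        self.right = None  # next sibling in the original AST


def to_binary_tree(node):
    """Left-child/right-sibling transform of an arbitrary-arity AST node."""
    bt = BTNode(type(node).__name__)
    children = [to_binary_tree(c) for c in ast.iter_child_nodes(node)]
    # Chain the children together as right-siblings of one another.
    for prev, nxt in zip(children, children[1:]):
        prev.right = nxt
    if children:
        bt.left = children[0]
    return bt


def preorder(bt):
    """Serialize the binary tree as a flat label sequence (candidate model input)."""
    if bt is None:
        return []
    return [bt.label] + preorder(bt.left) + preorder(bt.right)


# A loop of the kind an OpenMP parallelization classifier would examine.
tree = ast.parse("for i in range(n):\n    s += a[i]")
bt = to_binary_tree(tree)
print(preorder(bt))
```

For C/C++ code as in the BTCode dataset, the same transform would be applied to the parser's AST; only the node labels change.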
Author Hou, Fengyao
Hu, Peng
Yuan, Huaqiang
Chen, Yuehua
Author_xml – sequence: 1
  givenname: Yuehua
  surname: Chen
  fullname: Chen, Yuehua
  email: 221115186@dgut.edu.cn
  organization: Dongguan University of Technology,Dongguan,China
– sequence: 2
  givenname: Huaqiang
  surname: Yuan
  fullname: Yuan, Huaqiang
  email: yuanhq@dgut.edu.cn
  organization: Dongguan University of Technology,Dongguan,China
– sequence: 3
  givenname: Fengyao
  surname: Hou
  fullname: Hou, Fengyao
  email: houfy@ihep.ac.cn
  organization: Chinese Academy of Sciences,Institute of High Energy Physics,Beijing,China
– sequence: 4
  givenname: Peng
  surname: Hu
  fullname: Hu, Peng
  email: hup@ihep.ac.cn
  organization: Chinese Academy of Sciences,Institute of High Energy Physics,Beijing,China
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICCEA62105.2024.10603704
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798350386776
EISSN 2159-1288
EndPage 469
ExternalDocumentID 10603704
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
GroupedDBID 6IE
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
OCL
RIE
RIL
IEDL.DBID RIE
ISICitedReferencesCount 0
IngestDate Wed Aug 27 02:36:07 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
PageCount 5
ParticipantIDs ieee_primary_10603704
PublicationCentury 2000
PublicationDate 2024-April-12
PublicationDateYYYYMMDD 2024-04-12
PublicationDate_xml – month: 04
  year: 2024
  text: 2024-April-12
  day: 12
PublicationDecade 2020
PublicationTitle Proceedings (International Conference on Computer Engineering and Applications. Online)
PublicationTitleAbbrev ICCEA
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211696
SourceID ieee
SourceType Publisher
StartPage 465
SubjectTerms Abstract Syntax Trees
Accuracy
Automation
Codes
component
Deep learning
Multicore processing
Natural Language Processing
OpenMP
Parallel programming
Parallelization
Training
Title Enhancing Parallelization with OpenMP through Multi-Modal Transformer Learning
URI https://ieeexplore.ieee.org/document/10603704
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE