Automatic mapping of sequential programs to parallel computers with distributed memory

This work is aimed at creating an automatically parallelizing compiler for a distributed memory computing system. The conditions for the correctness of parallelization of the considered sequential programs on a computing system with distributed memory are described. The article considers the problem...

Published in: Procedia Computer Science, Volume 229, pp. 236-244
Main authors: Bagliy, A.P., Krivosheev, N.M., Steinberg, B.Ya
Format: Journal Article
Language: English
Published: Elsevier B.V., 2023
ISSN: 1877-0509
Abstract This work aims to create an automatically parallelizing compiler for computing systems with distributed memory. It describes the conditions under which the considered sequential programs can be correctly parallelized on such a system. The article addresses the problem of placing data in distributed memory so as to minimize interprocessor transfers, for a program represented as a sequence of parallelizable loops. To solve this problem, an auxiliary bipartite “statements-variables” graph (SVG) is constructed from the program text and analyzed. The article also considers the problem of grouping many transfers of small data volumes into fewer transfers of large data volumes; solving it likewise reduces data transfer time. New chips with thousands of cores and distributed local memory outperform their predecessors, so developing compilers for such chips is necessary and relevant.
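As a rough illustration of the kind of structure the abstract mentions, the sketch below builds a bipartite graph that links each statement to the variables it accesses and then splits the graph into connected components. This is only an assumption about what such a graph might look like; the record does not specify the authors' construction or analysis, and the names `build_svg` and `connected_components` are hypothetical.

```python
from collections import defaultdict


def build_svg(statements):
    """Build a bipartite statements-variables graph (illustrative only).

    `statements` maps a statement label to the set of variable names
    it reads or writes. Nodes are tagged tuples so that a statement and
    a variable with the same name never collide.
    """
    graph = defaultdict(set)
    for stmt, variables in statements.items():
        s_node = ("stmt", stmt)
        graph[s_node]  # make sure statements without variables still appear
        for var in variables:
            v_node = ("var", var)
            graph[s_node].add(v_node)
            graph[v_node].add(s_node)
    return dict(graph)


def connected_components(graph):
    """Return the connected components of an undirected graph as sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


if __name__ == "__main__":
    # Hypothetical program: S1 and S2 share variable b, S3 is independent.
    program = {"S1": {"a", "b"}, "S2": {"b", "c"}, "S3": {"x", "y"}}
    for component in connected_components(build_svg(program)):
        print(sorted(component))
```

In this toy reading, statements that fall into different components share no variables, so their data could in principle be placed on different nodes without any transfers between them. Whether the article uses the graph this way is not stated in the record.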
Author Bagliy, A.P.
Steinberg, B.Ya
Krivosheev, N.M.
Author_xml – sequence: 1
  givenname: A.P.
  surname: Bagliy
  fullname: Bagliy, A.P.
  email: abagly@sfedu.ru
  organization: Southern Federal University, Department of Mathematics, Mechanics and Computer Science
– sequence: 2
  givenname: N.M.
  surname: Krivosheev
  fullname: Krivosheev, N.M.
  organization: Southern Federal University, Department of Mathematics, Mechanics and Computer Science
– sequence: 3
  givenname: B.Ya
  surname: Steinberg
  fullname: Steinberg, B.Ya
  organization: Southern Federal University, Department of Mathematics, Mechanics and Computer Science
ContentType Journal Article
Copyright 2023
DOI 10.1016/j.procs.2023.12.025
Discipline Computer Science
EISSN 1877-0509
EndPage 244
ExternalDocumentID 10_1016_j_procs_2023_12_025
S187705092302015X
ISSN 1877-0509
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords automation of parallelization
data transfers
distributed memory
data placement
program transformations
Language English
License This is an open access article under the CC BY-NC-ND license.
OpenAccessLink https://dx.doi.org/10.1016/j.procs.2023.12.025
PageCount 9
PublicationCentury 2000
PublicationDate 2023
2023-00-00
PublicationDateYYYYMMDD 2023-01-01
PublicationDate_xml – year: 2023
  text: 2023
PublicationDecade 2020
PublicationTitle Procedia computer science
PublicationYear 2023
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
StartPage 236
Title Automatic mapping of sequential programs to parallel computers with distributed memory
URI https://dx.doi.org/10.1016/j.procs.2023.12.025
Volume 229