Automatic mapping of sequential programs to parallel computers with distributed memory

This work is aimed at creating an automatically parallelizing compiler for a distributed memory computing system. The conditions for the correctness of parallelization of the considered sequential programs on a computing system with distributed memory are described. The article considers the problem...

Bibliographic details
Published in:	Procedia Computer Science, Volume 229, pp. 236-244
Main authors:	Bagliy, A.P., Krivosheev, N.M., Steinberg, B.Ya.
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 2023
Subjects:	automation of parallelization, data placement, data transfers, distributed memory, program transformations
ISSN:	1877-0509

Abstract	This work is aimed at creating an automatically parallelizing compiler for a distributed memory computing system. The conditions for the correctness of parallelization of the considered sequential programs on a computing system with distributed memory are described. The article considers the problem of placing data in distributed memory with minimization of interprocessor transfers for a program represented by a sequence of parallelizable loops. To solve this problem, an auxiliary bipartite “statements-variables” graph (SVG) is constructed and analyzed from the text of the program. The problem of grouping transfers of small data volumes into a smaller number of transfers of large data volumes is considered. The solution of this problem also leads to minimization of the time for data transfer. New chips with distributed local memory of thousands of cores have higher performance than the previous ones. Therefore, the development of compilers for the development of such microcircuits is necessary and relevant.
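
The bipartite "statements-variables" graph mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the input format, the function names (build_svg, connected_components, group_transfers), and the use of connected components as a placement hint are all assumptions made for illustration. The sketch links each statement to the variables it reads or writes, finds groups of statements that share no variables (and so could, under this simplification, be placed on different nodes without transfers), and merges small transfers between the same pair of nodes into one larger transfer.

```python
from collections import defaultdict

# Hypothetical toy input: each statement lists the variables it reads and writes.
statements = {
    "S1": {"reads": {"A", "B"}, "writes": {"C"}},
    "S2": {"reads": {"C"}, "writes": {"D"}},
    "S3": {"reads": {"E"}, "writes": {"F"}},
}


def build_svg(statements):
    """Build a bipartite graph as two adjacency maps:
    statement -> variables it touches, and variable -> statements touching it."""
    stmt_to_vars = {}
    var_to_stmts = defaultdict(set)
    for name, access in statements.items():
        used = access["reads"] | access["writes"]
        stmt_to_vars[name] = used
        for var in used:
            var_to_stmts[var].add(name)
    return stmt_to_vars, dict(var_to_stmts)


def connected_components(stmt_to_vars, var_to_stmts):
    """Return (statements, variables) pairs, one per connected component.
    Assumption: statements in different components share no data."""
    seen = set()
    components = []
    for start in stmt_to_vars:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        stmts, variables = set(), set()
        while stack:
            stmt = stack.pop()
            stmts.add(stmt)
            for var in stmt_to_vars[stmt]:
                variables.add(var)
                for other in var_to_stmts[var]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        components.append((stmts, variables))
    return components


def group_transfers(transfers):
    """Merge small (source, destination, size) transfers with the same
    endpoints into a single transfer whose size is the sum."""
    grouped = defaultdict(int)
    for src, dst, size in transfers:
        grouped[(src, dst)] += size
    return dict(grouped)


if __name__ == "__main__":
    stmt_to_vars, var_to_stmts = build_svg(statements)
    for stmts, variables in connected_components(stmt_to_vars, var_to_stmts):
        print(sorted(stmts), sorted(variables))
    print(group_transfers([(0, 1, 8), (0, 1, 8), (1, 2, 4)]))
```

In this toy input, S1 and S2 are linked through the shared variable C, while S3 touches only its own variables. A real compiler would also need dependence analysis and a cost model for the transfers, which this sketch omits.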
Author	Bagliy, A.P. Steinberg, B.Ya Krivosheev, N.M.
Author_xml	– sequence: 1 givenname: A.P. surname: Bagliy fullname: Bagliy, A.P. email: abagly@sfedu.ru organization: Southern federal university, department of Mathematics, mechanics and computer science – sequence: 2 givenname: N.M. surname: Krivosheev fullname: Krivosheev, N.M. organization: Southern federal university, department of Mathematics, mechanics and computer science – sequence: 3 givenname: B.Ya surname: Steinberg fullname: Steinberg, B.Ya organization: Southern federal university, department of Mathematics, mechanics and computer science
CitedBy_id	crossref_primary_10_1145_3716871
Cites_doi	10.1109/ISPAN.1999.778932 10.1145/360827.360844 10.1145/3276496 10.1007/978-3-030-86359-3_8 10.25209/2079-3316-2021-12-1-21-113 10.4208/eajam.300921.210522 10.17587/prin.13.3-16 10.1007/978-3-319-62932-2_24
ContentType	Journal Article
Copyright	2023
Copyright_xml	– notice: 2023
DBID	6I. AAFTH AAYXX CITATION
DOI	10.1016/j.procs.2023.12.025
DatabaseName	ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISSN	1877-0509
EndPage	244
ExternalDocumentID	10_1016_j_procs_2023_12_025 S187705092302015X
GroupedDBID	--K 0R~ 0SF 1B1 457 5VS 6I. 71M AACTN AAEDT AAEDW AAFTH AAIKJ AALRI AAQFI AAXUO ABMAC ACGFS ADBBV ADEZE AEXQZ AFTJW AGHFR AITUG AKRWK ALMA_UNASSIGNED_HOLDINGS AMRAJ E3Z EBS EJD EP3 FDB FNPLU HZ~ IXB KQ8 M41 M~E NCXOZ O-L O9- OK1 P2P RIG ROL SES SSZ 9DU AAYWO AAYXX ABWVN ACRPL ACVFH ADCNI ADNMO ADVLN AEUPX AFPUW AIGII AKBMS AKYEP CITATION ~HD
ID	FETCH-LOGICAL-c2635-81b114581a0e09ee23947cf929f7f935d3e196ebb57d6da853fb26c0bbe8ca463
ISSN	1877-0509
IngestDate	Sat Nov 29 03:07:59 EST 2025 Tue Nov 18 22:32:52 EST 2025 Sat May 25 15:42:01 EDT 2024
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Keywords	automation of parallelization data transfers distributed memory data placement program transformations
Language	English
License	This is an open access article under the CC BY-NC-ND license.
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-c2635-81b114581a0e09ee23947cf929f7f935d3e196ebb57d6da853fb26c0bbe8ca463
OpenAccessLink	https://dx.doi.org/10.1016/j.procs.2023.12.025
PageCount	9
ParticipantIDs	crossref_citationtrail_10_1016_j_procs_2023_12_025 crossref_primary_10_1016_j_procs_2023_12_025 elsevier_sciencedirect_doi_10_1016_j_procs_2023_12_025
PublicationCentury	2000
PublicationDate	2023 2023-00-00
PublicationDateYYYYMMDD	2023-01-01
PublicationDate_xml	– year: 2023 text: 2023
PublicationDecade	2020
PublicationTitle	Procedia computer science
PublicationYear	2023
Publisher	Elsevier B.V
Publisher_xml	– name: Elsevier B.V
SSID	ssj0000388917
Score	2.2730374
SourceID	crossref elsevier
SourceType	Enrichment Source Index Database Publisher
StartPage	236
SubjectTerms	automation of parallelization data placement data transfers distributed memory program transformations
Title	Automatic mapping of sequential programs to parallel computers with distributed memory
URI	https://dx.doi.org/10.1016/j.procs.2023.12.025
Volume	229
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 1877-0509 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000388917 issn: 1877-0509 databaseCode: M~E dateStart: 20100101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre
linkProvider	ISSN International Centre