Letting future programmers experience performance-related tasks

• Presentation of fine-tuned assignments for HPC and parallel programming courses.
• All assignments tested on students in three interconnected courses.
• Covering vectorization, caches, multicore CPUs, GPUs, and multi-node problems.
• Introducing 5 widely used technologies (TBB, OpenMP, CUDA, OpenMPI, and...

Detailed bibliography
Published in:	Journal of parallel and distributed computing, Volume 155, pp. 74-86
Main authors:	Bednárek, David; Kruliš, Martin; Yaghob, Jakub
Format:	Journal Article
Language:	English
Publication details:	Elsevier Inc, 1 September 2021
Subjects:	Assignments; Education; GPU; HPC; Parallel computing
ISSN:	0743-7315

Abstract	• Presentation of fine-tuned assignments for HPC and parallel programming courses.
• All assignments tested on students in three interconnected courses.
• Covering vectorization, caches, multicore CPUs, GPUs, and multi-node problems.
• Introducing 5 widely used technologies (TBB, OpenMP, CUDA, OpenMPI, and Spark).
Programming courses usually focus on software-engineering problems like software decomposition and code maintenance. While computer-science lessons emphasize algorithm complexity, technological problems are usually neglected although they may significantly affect the performance in terms of wall time. As the technological problems are best explained by hands-on experience, we present a set of homework assignments focused on a range of technologies from instruction-level parallelism to GPU programming to cluster computing. These assignments are a product of a decade of development and testing on live subjects – the students of three performance-related software courses at the Faculty of Mathematics and Physics of the Charles University in Prague.
Author	Bednárek, David Kruliš, Martin Yaghob, Jakub
Author_xml	– sequence: 1 givenname: David orcidid: 0000-0001-7740-0158 surname: Bednárek fullname: Bednárek, David email: bednarek@ksi.mff.cuni.cz – sequence: 2 givenname: Martin orcidid: 0000-0002-0985-8949 surname: Kruliš fullname: Kruliš, Martin email: krulis@ksi.mff.cuni.cz – sequence: 3 givenname: Jakub orcidid: 0000-0002-9602-2196 surname: Yaghob fullname: Yaghob, Jakub email: yaghob@ksi.mff.cuni.cz
CitedBy_id	crossref_primary_10_1016_j_parco_2024_103096
Cites_doi	10.1145/1356052.1356053 10.1016/j.patrec.2007.01.001 10.1016/j.is.2016.06.001 10.1145/143103.143132 10.1145/321796.321811
ContentType	Journal Article
Copyright	2021 Elsevier Inc.
Copyright_xml	– notice: 2021 Elsevier Inc.
DBID	AAYXX CITATION
DOI	10.1016/j.jpdc.2021.04.014
DatabaseName	CrossRef
DatabaseTitle	CrossRef
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
Discipline	Education Computer Science
EndPage	86
ExternalDocumentID	10_1016_j_jpdc_2021_04_014 S0743731521000976
ID	FETCH-LOGICAL-c300t-d88812fc048827c5297bbea571d35b8fceef55220452f6e7538d79975add13ae3
ISICitedReferencesCount	3
ISSN	0743-7315
IngestDate	Sat Nov 29 07:15:39 EST 2025 Tue Nov 18 22:13:22 EST 2025 Sun Apr 06 06:54:09 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Keywords	Assignments Parallel computing HPC Education GPU
Language	English
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-c300t-d88812fc048827c5297bbea571d35b8fceef55220452f6e7538d79975add13ae3
ORCID	0000-0001-7740-0158 0000-0002-0985-8949 0000-0002-9602-2196
PageCount	13
ParticipantIDs	crossref_citationtrail_10_1016_j_jpdc_2021_04_014 crossref_primary_10_1016_j_jpdc_2021_04_014 elsevier_sciencedirect_doi_10_1016_j_jpdc_2021_04_014
PublicationCentury	2000
PublicationDate	September 2021 2021-09-00
PublicationDateYYYYMMDD	2021-09-01
PublicationDate_xml	– month: 09 year: 2021 text: September 2021
PublicationDecade	2020
PublicationTitle	Journal of parallel and distributed computing
PublicationYear	2021
Publisher	Elsevier Inc
Publisher_xml	– name: Elsevier Inc
SSID	ssj0011578
Score	2.3126333
SourceID	crossref elsevier
SourceType	Enrichment Source Index Database Publisher
StartPage	74
SubjectTerms	Assignments Education GPU HPC Parallel computing
Title	Letting future programmers experience performance-related tasks
URI	https://dx.doi.org/10.1016/j.jpdc.2021.04.014
Volume	155
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVESC databaseName: Elsevier SD Freedom Collection Journals 2021 issn: 0743-7315 databaseCode: AIEXJ dateStart: 19950101 customDbUrl: isFulltext: true dateEnd: 99991231 titleUrlDefault: https://www.sciencedirect.com omitProxy: false ssIdentifier: ssj0011578 providerName: Elsevier
linkProvider	Elsevier