Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language

High Performance Computing (HPC) systems are nowadays more and more heterogeneous. Different processor types can be found on a single node including accelerators such as Graphics Processing Units (GPUs). To cope with the challenge of programming such complex systems, this work presents a domain-spec...

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, Vol. 74, no. 12, pp. 3191-3201
Main Authors: Membarth, Richard; Reiche, Oliver; Schmitt, Christian; Hannig, Frank; Teich, Jürgen; Stürmer, Markus; Köstler, Harald
Format: Journal Article
Language: English
Published: Elsevier Inc 01.12.2014
Subjects:
ISSN: 0743-7315, 1096-0848
Abstract High Performance Computing (HPC) systems are increasingly heterogeneous. A single node can contain several processor types, including accelerators such as Graphics Processing Units (GPUs). To cope with the challenge of programming such complex systems, this work presents a domain-specific approach that automatically generates code tailored to different processor types. Rather than requiring hand-tuned code for GPU accelerators, the approach generates low-level CUDA and OpenCL code from a high-level description of an algorithm specified in a Domain-Specific Language (DSL). The DSL is part of the Heterogeneous Image Processing Acceleration (HIPAcc) framework and was extended in this work to handle grid hierarchies in order to model different cycle types. Language constructs are introduced to process and represent data at different resolutions. This makes it possible to describe image processing algorithms that work on image pyramids as well as multigrid methods in the stencil domain. By decoupling the algorithm from its schedule, the proposed approach allows efficient stencil code implementations to be generated. Our results show that performance similar to that of hand-tuned codes can be achieved.
Highlights
• DSL extension to handle image pyramids and grid hierarchies.
• DSL extension to model different multigrid cycle types.
• Generated GPU code shows performance similar to that of a hand-tuned implementation.
• We apply the algorithm to high dynamic range compression of 2D X-ray images.
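To make the terms "grid hierarchy" and "cycle type" from the abstract concrete, the following is a minimal sketch of a geometric multigrid cycle in plain Python. It is not HIPAcc code and does not reproduce the paper's DSL. It assumes a 1D Poisson problem with zero Dirichlet boundaries, Gauss-Seidel smoothing, full-weighting restriction and linear interpolation. All function names and parameters (`smooth`, `restrict`, `prolong`, `cycle`, `gamma`, `solve_poisson`) are hypothetical and chosen only for illustration.

```python
import math


def smooth(u, f, h, sweeps):
    """Gauss-Seidel sweeps for -u'' = f with zero Dirichlet boundaries (in place)."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (left + right + h * h * f[i])


def residual(u, f, h):
    """Return r = f - A u for the standard 3-point stencil."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def restrict(r):
    """Full weighting: move a fine-grid vector to the next coarser grid."""
    nc = (len(r) - 1) // 2
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range(nc)]


def prolong(ec, n):
    """Linear interpolation: move a coarse-grid vector to a fine grid of size n."""
    nc = len(ec)
    e = [0.0] * n
    for j in range(nc):
        e[2 * j + 1] = ec[j]          # coarse points coincide with odd fine points
    for j in range(nc + 1):
        left = ec[j - 1] if j > 0 else 0.0
        right = ec[j] if j < nc else 0.0
        e[2 * j] = 0.5 * (left + right)  # in-between points are averaged
    return e


def cycle(u, f, h, gamma=1, pre=2, post=2):
    """One multigrid cycle; gamma=1 gives a V-cycle, gamma=2 a W-cycle."""
    n = len(u)
    if n == 1:                         # coarsest grid: solve exactly
        u[0] = 0.5 * h * h * f[0]
        return u
    smooth(u, f, h, pre)
    rc = restrict(residual(u, f, h))
    ec = [0.0] * len(rc)
    for _ in range(gamma):             # the cycle type is just a recursion count
        cycle(ec, rc, 2.0 * h, gamma, pre, post)
    e = prolong(ec, n)
    for i in range(n):
        u[i] += e[i]                   # coarse-grid correction
    smooth(u, f, h, post)
    return u


def solve_poisson(levels=6, cycles=10, gamma=1):
    """Solve -u'' = pi^2 sin(pi x) on (0, 1); return (max error, max residual)."""
    n = 2 ** levels - 1
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * n
    for _ in range(cycles):
        cycle(u, f, h, gamma)
    err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
    res = max(abs(ri) for ri in residual(u, f, h))
    return err, res
```

In this sketch the algorithm (smooth, restrict, recurse, prolong, smooth) is written once, and the cycle type is a single parameter. The paper's approach goes further by generating CUDA and OpenCL kernels for each stencil operation from the DSL description; none of that code generation is modelled here.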
Author Hannig, Frank
Köstler, Harald
Reiche, Oliver
Membarth, Richard
Teich, Jürgen
Stürmer, Markus
Schmitt, Christian
Author_xml – sequence: 1
  givenname: Richard
  surname: Membarth
  fullname: Membarth, Richard
  email: richard.membarth@dfki.de
  organization: German Research Center for Artificial Intelligence, Germany
– sequence: 2
  givenname: Oliver
  surname: Reiche
  fullname: Reiche, Oliver
  email: oliver.reiche@cs.fau.de
  organization: Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany
– sequence: 3
  givenname: Christian
  surname: Schmitt
  fullname: Schmitt, Christian
  email: christian.schmitt@cs.fau.de
  organization: Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany
– sequence: 4
  givenname: Frank
  surname: Hannig
  fullname: Hannig, Frank
  email: hannig@cs.fau.de
  organization: Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany
– sequence: 5
  givenname: Jürgen
  surname: Teich
  fullname: Teich, Jürgen
  email: teich@cs.fau.de
  organization: Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany
– sequence: 6
  givenname: Markus
  surname: Stürmer
  fullname: Stürmer, Markus
  email: markus.stuermer@cs.fau.de
  organization: System Simulation, Department of Computer Science, University of Erlangen-Nuremberg, Germany
– sequence: 7
  givenname: Harald
  surname: Köstler
  fullname: Köstler, Harald
  email: harald.koestler@cs.fau.de
  organization: System Simulation, Department of Computer Science, University of Erlangen-Nuremberg, Germany
CitedBy_id crossref_primary_10_1109_JPROC_2018_2854229
crossref_primary_10_1145_2936314_2814208
crossref_primary_10_1002_cpe_4062
crossref_primary_10_1016_j_advengsoft_2024_103666
Cites_doi 10.1088/1749-4699/5/1/015003
10.1016/j.parco.2011.10.002
10.1007/s00607-008-0003-x
10.1090/S0025-5718-1977-0431719-X
10.1145/2185520.2185528
10.1145/2010324.1964963
10.1109/TCOM.1983.1095851
ContentType Journal Article
Copyright 2014 Elsevier Inc.
Copyright_xml – notice: 2014 Elsevier Inc.
DOI 10.1016/j.jpdc.2014.08.008
Discipline Computer Science
EISSN 1096-0848
EndPage 3201
ExternalDocumentID 10_1016_j_jpdc_2014_08_008
S0743731514001506
ISICitedReferencesCount 8
ISSN 0743-7315
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords Code generation
CUDA
Domain-specific language
Multigrid
Multiresolution
Image pyramid
Stencil codes
OpenCL
GPU
Language English
PageCount 11
PublicationCentury 2000
PublicationDate 2014-12-01
PublicationDateYYYYMMDD 2014-12-01
PublicationDate_xml – month: 12
  year: 2014
  text: 2014-12-01
  day: 01
PublicationDecade 2010
PublicationTitle Journal of parallel and distributed computing
PublicationYear 2014
Publisher Elsevier Inc
Publisher_xml – name: Elsevier Inc
StartPage 3191
SubjectTerms Accelerators
Algorithms
Code generation
CUDA
Distributed processing
Domain-specific language
GPU
Graphics processing units
Image processing
Image pyramid
Mathematical models
Microprocessors
Multigrid
Multiresolution
OpenCL
Stencil codes
Subscriber line
Title Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language
URI https://dx.doi.org/10.1016/j.jpdc.2014.08.008
https://www.proquest.com/docview/1651391455
Volume 74