Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language
High Performance Computing (HPC) systems are nowadays more and more heterogeneous. Different processor types can be found on a single node including accelerators such as Graphics Processing Units (GPUs). To cope with the challenge of programming such complex systems, this work presents a domain-spec...
| Published in: | Journal of parallel and distributed computing Vol. 74; no. 12; pp. 3191 - 3201 |
|---|---|
| Main Authors: | Richard Membarth, Oliver Reiche, Christian Schmitt, Frank Hannig, Jürgen Teich, Markus Stürmer, Harald Köstler |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Inc, 01.12.2014 |
| Subjects: | |
| ISSN: | 0743-7315, 1096-0848 |
| Abstract | High Performance Computing (HPC) systems are increasingly heterogeneous. Different processor types can be found on a single node, including accelerators such as Graphics Processing Units (GPUs). To cope with the challenge of programming such complex systems, this work presents a domain-specific approach to automatically generate code tailored to different processor types. Instead of requiring hand-tuned code for GPU accelerators, low-level CUDA and OpenCL code is generated from a high-level description of an algorithm specified in a Domain-Specific Language (DSL). The DSL is part of the Heterogeneous Image Processing Acceleration (HIPAcc) framework and was extended in this work to handle grid hierarchies in order to model different cycle types. Language constructs are introduced to process and represent data at different resolutions. This makes it possible to describe image processing algorithms that work on image pyramids as well as multigrid methods in the stencil domain. By decoupling the algorithm from its schedule, the proposed approach makes it possible to generate efficient stencil code implementations. Our results show that performance similar to that of hand-tuned codes can be achieved. • DSL extension to handle image pyramids and grid hierarchies. • DSL extension to model different multigrid cycle types. • Generated GPU code shows performance similar to a hand-tuned implementation. • We apply the algorithm to high dynamic range compression of 2D X-ray images. |
|---|---|
| Author | Hannig, Frank; Köstler, Harald; Reiche, Oliver; Membarth, Richard; Teich, Jürgen; Stürmer, Markus; Schmitt, Christian |
| Author_xml | 1. Richard Membarth (richard.membarth@dfki.de), German Research Center for Artificial Intelligence, Germany; 2. Oliver Reiche (oliver.reiche@cs.fau.de), Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany; 3. Christian Schmitt (christian.schmitt@cs.fau.de), Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany; 4. Frank Hannig (hannig@cs.fau.de), Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany; 5. Jürgen Teich (teich@cs.fau.de), Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany; 6. Markus Stürmer (markus.stuermer@cs.fau.de), System Simulation, Department of Computer Science, University of Erlangen-Nuremberg, Germany; 7. Harald Köstler (harald.koestler@cs.fau.de), System Simulation, Department of Computer Science, University of Erlangen-Nuremberg, Germany |
| CitedBy_id | crossref_primary_10_1109_JPROC_2018_2854229 crossref_primary_10_1145_2936314_2814208 crossref_primary_10_1002_cpe_4062 crossref_primary_10_1016_j_advengsoft_2024_103666 |
| Cites_doi | 10.1088/1749-4699/5/1/015003 10.1016/j.parco.2011.10.002 10.1007/s00607-008-0003-x 10.1090/S0025-5718-1977-0431719-X 10.1145/2185520.2185528 10.1145/2010324.1964963 10.1109/TCOM.1983.1095851 |
| ContentType | Journal Article |
| Copyright | 2014 Elsevier Inc. |
| DOI | 10.1016/j.jpdc.2014.08.008 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts |
| Discipline | Computer Science |
| EISSN | 1096-0848 |
| EndPage | 3201 |
| ExternalDocumentID | 10_1016_j_jpdc_2014_08_008 S0743731514001506 |
| ISICitedReferencesCount | 8 |
| ISSN | 0743-7315 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| Keywords | Code generation; CUDA; Domain-specific language; Multigrid; Multiresolution; Image pyramid; Stencil codes; OpenCL; GPU |
| Language | English |
| PageCount | 11 |
| PublicationDate | 2014-12-01 |
| PublicationTitle | Journal of parallel and distributed computing |
| PublicationYear | 2014 |
| Publisher | Elsevier Inc |
| StartPage | 3191 |
| SubjectTerms | Accelerators; Algorithms; Code generation; CUDA; Distributed processing; Domain-specific language; GPU; Graphics processing units; Image processing; Image pyramid; Mathematical models; Microprocessors; Multigrid; Multiresolution; OpenCL; Stencil codes; Subscriber line |
| Title | Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language |
| URI | https://dx.doi.org/10.1016/j.jpdc.2014.08.008 https://www.proquest.com/docview/1651391455 |
| Volume | 74 |
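
The abstract refers to grid hierarchies and to different multigrid cycle types. As a rough illustration of what a cycle type means, the following is a minimal sketch of a geometric multigrid cycle for the 1D Poisson problem -u'' = f with zero boundary values. It is written in plain Python and is not HIPAcc DSL code or the authors' implementation. All function names (`smooth`, `residual`, `restrict`, `prolong`, `cycle`, `demo`), the weighted Jacobi smoother, and the test problem are assumptions chosen for this example. The parameter `gamma` selects the cycle type: 1 gives a V-cycle, 2 gives a W-cycle.

```python
import math


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps for -u'' = f with fixed boundary values."""
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, len(u) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u


def residual(u, f, h):
    """Residual r = f - A u of the 3-point discretization (zero on the boundary)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def restrict(fine):
    """Full weighting: 2m+1 fine grid points -> m+1 coarse grid points."""
    m = (len(fine) - 1) // 2
    coarse = [0.0] * (m + 1)
    for j in range(1, m):
        coarse[j] = 0.25 * fine[2 * j - 1] + 0.5 * fine[2 * j] + 0.25 * fine[2 * j + 1]
    return coarse


def prolong(coarse):
    """Linear interpolation: m+1 coarse grid points -> 2m+1 fine grid points."""
    m = len(coarse) - 1
    fine = [0.0] * (2 * m + 1)
    for j in range(m + 1):
        fine[2 * j] = coarse[j]
    for j in range(m):
        fine[2 * j + 1] = 0.5 * (coarse[j] + coarse[j + 1])
    return fine


def cycle(u, f, h, gamma=1):
    """One multigrid cycle on u (in place); gamma=1 is a V-cycle, gamma=2 a W-cycle."""
    if len(u) <= 3:
        # Coarsest grid has a single interior point: solve 2*u1/h^2 = f1 exactly.
        u[1] = 0.5 * h * h * f[1]
        return u
    smooth(u, f, h)                                # pre-smoothing
    coarse_rhs = restrict(residual(u, f, h))       # move the residual one level down
    error = [0.0] * len(coarse_rhs)
    for _ in range(gamma):                         # gamma recursive visits per level
        cycle(error, coarse_rhs, 2.0 * h, gamma)
    correction = prolong(error)                    # move the correction one level up
    for i in range(1, len(u) - 1):
        u[i] += correction[i]
    return smooth(u, f, h)                         # post-smoothing


def demo(gamma=1, intervals=64, cycles=10):
    """Solve -u'' = pi^2 sin(pi x) on [0, 1]; return the max error against sin(pi x)."""
    h = 1.0 / intervals
    xs = [i * h for i in range(intervals + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    u = [0.0] * (intervals + 1)
    for _ in range(cycles):
        cycle(u, f, h, gamma)
    return max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
```

Calling `demo(1)` and `demo(2)` runs the same solver as a V-cycle and as a W-cycle. Only the number of recursive coarse-grid visits changes, which is the kind of schedule variation the abstract describes as being modeled by the DSL extension. The number of grid intervals is assumed to be a power of two so that each level can be coarsened by a factor of two.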