Smart Containers and Skeleton Programming for GPU-Based Systems
In this paper, we discuss the role, design and implementation of smart containers in the SkePU skeleton library for GPU-based systems. These containers provide an interface similar to C++ STL containers but internally perform runtime optimization of data transfers and runtime memory management for t...
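The container behaviour summarised in the abstract can be illustrated with a small sketch. The code below is not SkePU's API: `SmartVector`, `device_map`, and `demo_transfer_count` are hypothetical names, and the "device" is simulated with a second `std::vector` rather than real GPU memory. It only shows the general idea of tracking which copy of the data is valid and transferring lazily; it does not model asynchronous execution, multiple GPUs, or page-locked memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not SkePU's API: a vector-like container that keeps a
// host copy and a simulated "device" copy, and copies data between them only
// when the copy about to be accessed is stale.
template <typename T>
class SmartVector {
public:
    explicit SmartVector(std::size_t n, T init = T{})
        : host_(n, init), device_(n), host_valid_(true), device_valid_(false) {}

    std::size_t size() const { return host_.size(); }

    // Host-side read: downloads from the device only if the host copy is stale.
    const T& at(std::size_t i) {
        sync_to_host();
        return host_.at(i);
    }

    // Host-side write: makes the host copy current, then invalidates the device copy.
    void set(std::size_t i, const T& v) {
        sync_to_host();
        host_.at(i) = v;
        device_valid_ = false;
    }

    // Stand-in for a skeleton call (an element-wise map) running on the device.
    template <typename F>
    void device_map(F f) {
        sync_to_device();
        for (auto& x : device_) x = f(x);
        host_valid_ = false;  // the host copy is now out of date
    }

    int uploads() const { return uploads_; }
    int downloads() const { return downloads_; }

private:
    void sync_to_device() {
        if (!device_valid_) { device_ = host_; device_valid_ = true; ++uploads_; }
    }
    void sync_to_host() {
        if (!host_valid_) { host_ = device_; host_valid_ = true; ++downloads_; }
    }

    std::vector<T> host_, device_;
    bool host_valid_, device_valid_;
    int uploads_ = 0, downloads_ = 0;
};

// Chains two device-side maps and one host read, then reports the total
// number of simulated transfers the container performed.
inline int demo_transfer_count() {
    SmartVector<int> v(4, 1);
    v.device_map([](int x) { return x + 1; });   // first use on device: one upload
    v.device_map([](int x) { return x * 10; });  // device copy still valid: no transfer
    int first = v.at(0);                         // host read: one download
    assert(first == 20);
    return v.uploads() + v.downloads();
}
```

In this sketch the two chained device calls cost a single upload, and the result is copied back only when the host actually reads an element, which is the kind of communication saving the abstract attributes to smart containers.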
| Published in: | International journal of parallel programming, Vol. 44, No. 3, pp. 506-530 |
|---|---|
| Main authors: | Dastgeer, Usman; Kessler, Christoph |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: Springer US; Springer Nature B.V; 01.06.2016 |
| Subjects: | |
| ISSN: | 0885-7458, 1573-7640 |
| Online access: | Get full text |
| Abstract | In this paper, we discuss the role, design and implementation of smart containers in the SkePU skeleton library for GPU-based systems. These containers provide an interface similar to C++ STL containers but internally perform runtime optimization of data transfers and runtime memory management for their operand data on the different memory units. We discuss how these containers can help in achieving asynchronous execution for skeleton calls while providing implicit synchronization capabilities in a data consistent manner. Furthermore, we discuss the limitations of the original, already optimizing memory management mechanism implemented in SkePU containers, and propose and implement a new mechanism that provides stronger data consistency and improves performance by reducing communication and memory allocations. With several applications, we show that our new mechanism can achieve significantly (up to 33.4 times) better performance than the initial mechanism for page-locked memory on a multi-GPU based system. |
|---|---|
| Issue Title | Special Issue on High-Level Parallel Programming and Applications |
| Author | Kessler, Christoph; Dastgeer, Usman |
| Author_xml | 1. Dastgeer, Usman (PELAB, Department of Computer and Information Science, Linköping University); 2. Kessler, Christoph (christoph.kessler@liu.se; PELAB, Department of Computer and Information Science, Linköping University) |
| BackLink | https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719 (View record from Swedish Publication Index, Linköpings universitet) |
| CODEN | IJPPE5 |
| CitedBy_id | crossref_primary_10_1007_s10766_021_00704_3 crossref_primary_10_1002_cpe_5003 crossref_primary_10_1007_s11227_019_02894_7 crossref_primary_10_1007_s10766_024_00770_3 crossref_primary_10_1016_j_cl_2017_04_004 crossref_primary_10_1016_j_jlamp_2019_100498 crossref_primary_10_1109_TPDS_2021_3104257 crossref_primary_10_1007_s10766_017_0490_5 crossref_primary_10_1007_s10766_022_00746_1 crossref_primary_10_1155_2022_6335118 crossref_primary_10_1109_JPROC_2018_2856739 crossref_primary_10_1007_s11227_016_1792_x crossref_primary_10_1007_s11227_019_02824_7 |
| Cites_doi | 10.1145/1863482.1863487 10.1007/978-3-642-40447-4_18 10.1109/MM.2011.89 10.1109/HPEC.2014.7040988 10.1109/PDP.2013.29 10.1145/1944862.1944883 10.1109/IPDPS.2011.269 10.1007/s10766-006-0018-x 10.1145/2086696.2086721 10.1504/IJHPCN.2012.046370 10.1007/s00450-011-0157-1 10.1007/978-3-642-40047-6_86 10.1017/CBO9781139051224 |
| ContentType | Journal Article |
| Copyright | Springer Science+Business Media New York 2015; Springer Science+Business Media New York 2016 |
| DOI | 10.1007/s10766-015-0357-6 |
| Discipline | Computer Science |
| EISSN | 1573-7640 |
| EndPage | 530 |
| ExternalDocumentID | oai_DiVA_org_liu_128719 4033358261 10_1007_s10766_015_0357_6 |
| Genre | Feature |
| ISICitedReferencesCount | 23 |
| ISSN | 0885-7458 1573-7640 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Keywords | SkePU; Skeleton programming; Memory management; Smart containers; Runtime optimizations; GPU-based systems |
| Language | English |
| OpenAccessLink | https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719 |
| PQID | 1783853235 |
| PQPubID | 48389 |
| PageCount | 25 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-06-01 |
| PublicationDateYYYYMMDD | 2016-06-01 |
| PublicationDecade | 2010 |
| PublicationPlace | New York |
| PublicationTitle | International journal of parallel programming |
| PublicationTitleAbbrev | Int J Parallel Prog |
| PublicationYear | 2016 |
| Publisher | Springer US; Springer Nature B.V |
| StartPage | 506 |
| SubjectTerms | Analysis; C plus plus; Communication; Computer memory; Computer programming; Computer Science; Consistency; Containers; Design engineering; Interfaces; Libraries; Memory management; Optimization; Processor Architectures; Programming; Run time (computers); Software Engineering/Programming and Operating Systems; Studies; Synchronization; Theory of Computation |
| Title | Smart Containers and Skeleton Programming for GPU-Based Systems |
| URI | https://link.springer.com/article/10.1007/s10766-015-0357-6 https://www.proquest.com/docview/1783853235 https://www.proquest.com/docview/1808111584 https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719 |
| Volume | 44 |