Smart Containers and Skeleton Programming for GPU-Based Systems

Bibliographic details
Published in: International Journal of Parallel Programming, Vol. 44, No. 3, pp. 506-530
Main authors: Dastgeer, Usman; Kessler, Christoph
Format: Journal Article
Language: English
Publication details: New York: Springer US, 1 June 2016
Springer Nature B.V
ISSN: 0885-7458, 1573-7640
Abstract: In this paper, we discuss the role, design and implementation of smart containers in the SkePU skeleton library for GPU-based systems. These containers provide an interface similar to C++ STL containers but internally perform runtime optimization of data transfers and runtime memory management for their operand data on the different memory units. We discuss how these containers can help in achieving asynchronous execution for skeleton calls while providing implicit synchronization capabilities in a data-consistent manner. Furthermore, we discuss the limitations of the original, already optimizing memory management mechanism implemented in SkePU containers, and propose and implement a new mechanism that provides stronger data consistency and improves performance by reducing communication and memory allocations. With several applications, we show that our new mechanism can achieve significantly (up to 33.4 times) better performance than the initial mechanism for page-locked memory on a multi-GPU-based system.
Issue Title: Special Issue on High-Level Parallel Programming and Applications
Author Kessler, Christoph
Dastgeer, Usman
Author_xml – sequence: 1
  givenname: Usman
  surname: Dastgeer
  fullname: Dastgeer, Usman
  organization: PELAB, Department of Computer and Information Science, Linköping University
– sequence: 2
  givenname: Christoph
  surname: Kessler
  fullname: Kessler, Christoph
  email: christoph.kessler@liu.se
  organization: PELAB, Department of Computer and Information Science, Linköping University
BackLink https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719 (View record from Swedish Publication Index, Linköpings universitet)
CODEN IJPPE5
CitedBy_id crossref_primary_10_1007_s10766_021_00704_3
crossref_primary_10_1002_cpe_5003
crossref_primary_10_1007_s11227_019_02894_7
crossref_primary_10_1007_s10766_024_00770_3
crossref_primary_10_1016_j_cl_2017_04_004
crossref_primary_10_1016_j_jlamp_2019_100498
crossref_primary_10_1109_TPDS_2021_3104257
crossref_primary_10_1007_s10766_017_0490_5
crossref_primary_10_1007_s10766_022_00746_1
crossref_primary_10_1155_2022_6335118
crossref_primary_10_1109_JPROC_2018_2856739
crossref_primary_10_1007_s11227_016_1792_x
crossref_primary_10_1007_s11227_019_02824_7
Cites_doi 10.1145/1863482.1863487
10.1007/978-3-642-40447-4_18
10.1109/MM.2011.89
10.1109/HPEC.2014.7040988
10.1109/PDP.2013.29
10.1145/1944862.1944883
10.1109/IPDPS.2011.269
10.1007/s10766-006-0018-x
10.1145/2086696.2086721
10.1504/IJHPCN.2012.046370
10.1007/s00450-011-0157-1
10.1007/978-3-642-40047-6_86
10.1017/CBO9781139051224
ContentType Journal Article
Copyright Springer Science+Business Media New York 2015
Springer Science+Business Media New York 2016
Copyright_xml – notice: Springer Science+Business Media New York 2015
– notice: Springer Science+Business Media New York 2016
DOI 10.1007/s10766-015-0357-6
Discipline Computer Science
EISSN 1573-7640
EndPage 530
ExternalDocumentID oai_DiVA_org_liu_128719
4033358261
10_1007_s10766_015_0357_6
Genre Feature
ISICitedReferencesCount 23
ISSN 0885-7458
1573-7640
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords SkePU
Skeleton programming
Memory management
Smart containers
Runtime optimizations
GPU-based systems
Language English
OpenAccessLink https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719
PQID 1783853235
PQPubID 48389
PageCount 25
ParticipantIDs swepub_primary_oai_DiVA_org_liu_128719
proquest_miscellaneous_1808111584
proquest_journals_1783853235
crossref_primary_10_1007_s10766_015_0357_6
crossref_citationtrail_10_1007_s10766_015_0357_6
springer_journals_10_1007_s10766_015_0357_6
PublicationCentury 2000
PublicationDate 2016-06-01
PublicationDateYYYYMMDD 2016-06-01
PublicationDate_xml – month: 06
  year: 2016
  text: 2016-06-01
  day: 01
PublicationDecade 2010
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle International journal of parallel programming
PublicationTitleAbbrev Int J Parallel Prog
PublicationYear 2016
Publisher Springer US
Springer Nature B.V
Publisher_xml – name: Springer US
– name: Springer Nature B.V
References Aufmann, R., Barker, V., Lockwood, J.: Intermediate Algebra with Applications, Multimedia Edition. Cengage Learning (2008). URL http://books.google.se/books?id=QYfJAxqwDE8C
Diogo, M., Grelck, C.: Towards Heterogeneous Computing without Heterogeneous Programming. In: H.W. Loidl, R. Pena (eds.): 13th Int. Symposium on Trends in Functional Programming (TFP 2012), St. Andrews, UK, Lecture Notes in Computer Science 7829, pp. 279–294, Springer (2013)
Harris, M.: CUDA Unified Memory in CUDA 6. Nvidia, http://devblogs.nvidia.com/parallelforall/unified-memory-in-cuda-6 (2013)
Kicherer, M., Nowak, F., Buchty, R., Karl, W.: Seamlessly portable applications: managing the diversity of modern heterogeneous systems. ACM Trans. Archit. Code Optim. 8(4), 42:1-42:20 (2012). doi:10.1145/2086696.2086721
Landaverde, R., Zhang, T., Coskun, A., Herbordt, M.: An investigation of Unified Memory access performance in CUDA. In: IEEE High Performance Extreme Computing Conference, Waltham, USA (2014)
Enmyren, J., Kessler, C.: SkePU: A Multi-Backend Skeleton Programming Library for Multi-GPU Systems. In: Proceedings of 4th International Workshop on High-Level Parallel Programming and Applications (HLPP-2010), Baltimore, USA, ACM (Sep. 2010)
Dubois, M., Annavaram, M., Stenström, P.: Parallel Computer Organization and Design. Cambridge University Press, Cambridge (2012). doi:10.1017/CBO9781139051224
Dastgeer, U., Kessler, C., Thibault, S.: Flexible runtime support for efficient skeleton programming. In: Advances in Parallel Computing, vol. 22, pp. 159–166. IOS Press (2012). Proc. ParCo conference, Ghent, Belgium (Sep. 2011)
Shainer, G.: The development of Mellanox/NVIDIA GPUDirect over InfiniBand: a new model for GPU to GPU communications. Comput. Sci.-Res. Dev. 26(3-4) (2011). doi:10.1007/s00450-011-0157-1
Cole, M.I.: Algorithmic Skeletons: Structured Management of Parallel Computation. Addison-Wesley, Cambridge (1989). Zbl 0681.68041
Grelck, C., Scholz, S.: SAC: A functional array language for efficient multi-threaded execution. Int. J. Parallel Program. 34(4), 383-427 (2006). doi:10.1007/s10766-006-0018-x. Zbl 1102.68438
Dastgeer, U.: Skeleton programming for heterogeneous GPU-based systems. Licentiate thesis. Thesis No. 1504. Department of Computer and Information Science, Linköping University (2011). URL http://liu.diva-portal.org/smash/record.jsf?pid=diva2:437140
Ciechanowicz, P., Poldner, M., Kuchen, H.: The Münster skeleton library Muesli—a comprehensive overview (2009). ERCIS Working Paper No. 7
Hoberock, J., Bell, N.: Thrust: C++ template library for CUDA (2011). http://code.google.com/p/thrust
Park, J.: Memory optimizations of embedded applications for energy efficiency. Ph.D. thesis, Dept. of Electrical Engineering. University of Stanford (2011)
Goli, M., Gonzalez-Velez, H.: Heterogeneous algorithmic skeletons for FastFlow with seamless coordination over hybrid architectures. In: 21st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 148–156 (2013)
Dastgeer, U.: Performance-aware component composition for GPU-based systems. Ph.D. thesis, Linköping University (2014). URL http://www.diva-portal.org/smash/record.jsf?pid=diva2:712422
Steuwer, M., Kegel, P., Gorlatch, S.: SkelCL—A portable skeleton library for high-level GPU programming. In: 16th International Workshop on High-Level Parallel Programming Models and Supportive Environments, HIPS ’11 (2011)
Kicherer, M., Buchty, R., Karl, W.: Cost-aware function migration in heterogeneous systems. In: 6th International Conference on High Performance and Embedded Architectures and Compilers. HiPEAC ’11, pp. 137–145. ACM, New York, NY, USA (2011)
Alexandrescu, A.: Modern C++ Design, 1st edn. Addison-Wesley Professional, Boston (2001)
Ernsting, S., Kuchen, H.: Algorithmic skeletons for multi-core, multi-GPU systems and clusters. Int. J. High Perform. Comput. Netw. 7, 129-138 (2012). doi:10.1504/IJHPCN.2012.046370
Keckler, S.W., Dally, W.J., Khailany, B., Garland, M., Glasco, D.: GPUs and the future of parallel computing. IEEE Micro 31(5), 7-17 (2011). doi:10.1109/MM.2011.89
Marques, R., Paulino, H., Alexandre, F., Medeiros, P.D.: Algorithmic skeleton framework for the orchestration of GPU computations. In: Euro-Par 2013 Parallel Processing. Lecture Notes in Computer Science, vol. 8097, pp. 874–885. Springer, Berlin Heidelberg (2013)
NVIDIA Corporation: NVIDIA CUDA C Programming Guide (2013). http://docs.nvidia.com/cuda/cuda-c-programming-guide
SourceID swepub
proquest
crossref
springer
SourceType Open Access Repository
Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 506
SubjectTerms Analysis
C plus plus
Communication
Computer memory
Computer programming
Computer Science
Consistency
Containers
Design engineering
Interfaces
Libraries
Memory management
Optimization
Processor Architectures
Programming
Run time (computers)
Software Engineering/Programming and Operating Systems
Studies
Synchronization
Theory of Computation
Title Smart Containers and Skeleton Programming for GPU-Based Systems
URI https://link.springer.com/article/10.1007/s10766-015-0357-6
https://www.proquest.com/docview/1783853235
https://www.proquest.com/docview/1808111584
https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-128719
Volume 44