pocl: A Performance-Portable OpenCL Implementation

Bibliographic Details
Published in: International Journal of Parallel Programming, Vol. 43, no. 5, pp. 752-785
Main Authors: Jääskeläinen, Pekka; de La Lama, Carlos Sánchez; Schnetter, Erik; Raiskila, Kalle; Takala, Jarmo; Berg, Heikki
Format: Journal Article
Language: English
Published: New York, Springer US, 01.10.2015
Springer Nature B.V.
ISSN: 0885-7458, 1573-7640
Abstract: OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear: multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer. The situation is made worse by multiple proprietary vendor implementations with different characteristics and, thus, different required optimization strategies. In this paper, we propose an OpenCL implementation that is both portable and performance portable. At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles. The kernel compiler is modularized to perform target-independent parallel region formation separately from the target-specific parallel mapping of the regions, which enables support for various styles of fine-grained parallel resources such as subword SIMD extensions, SIMD datapaths and static multi-issue. Unlike previous similar techniques that work on the source level, the parallel region formation retains the information of the data parallelism using the LLVM IR and its metadata infrastructure. This data can be exploited by the later generic compiler passes for efficient parallelization. The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both those already commercialized and those still under research. The paper describes how the portability of the implementation is achieved. We test the two aspects of portability by using the kernel compiler and the OpenCL implementation to run OpenCL applications on various platforms with different styles of parallel resources.
The results show that most of the benchmarked applications, when compiled using pocl, were faster than or close to as fast as the best proprietary OpenCL implementation for the platform at hand.
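Illustrative note: the "parallel region formation" mentioned in the abstract can be pictured as splitting a kernel at its work-group barriers and wrapping each resulting region in a loop over the work-items of a work-group. The sketch below is only a conceptual model under stated assumptions. The example kernel (reversing a work-group's elements through local memory) and every function name are hypothetical and are not taken from pocl, whose kernel compiler operates on LLVM IR, not on Python.

```python
# Illustrative sketch only: models how a kernel with one barrier can be
# executed as two "parallel regions", each iterated over all work-items.
# All names here are hypothetical and are not taken from pocl.


def region_before_barrier(local_id, group_input, local_mem):
    """Code preceding the barrier: each work-item stages one element."""
    local_mem[local_id] = group_input[local_id]


def region_after_barrier(local_id, local_mem, group_output):
    """Code following the barrier: each work-item reads another item's element."""
    local_size = len(local_mem)
    group_output[local_id] = local_mem[local_size - 1 - local_id]


def run_work_group(group_input):
    """Run one work-group as two work-item loops split at the barrier."""
    local_size = len(group_input)
    local_mem = [None] * local_size
    group_output = [None] * local_size

    # Region 1: the iterations are independent of each other, so a
    # target-specific stage could vectorize them or run them in any order.
    for local_id in range(local_size):
        region_before_barrier(local_id, group_input, local_mem)

    # The barrier is implicit here: region 2 starts only after every
    # work-item has finished region 1.
    for local_id in reversed(range(local_size)):  # order does not matter
        region_after_barrier(local_id, local_mem, group_output)

    return group_output
```

For instance, `run_work_group([1, 2, 3, 4])` returns `[4, 3, 2, 1]`. The second loop deliberately runs in reverse to show that, within a region, the work-item order is free, which is the property a later mapping stage could exploit for SIMD or multi-issue hardware.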
Issue Title: Includes a Special Section on High-level Heterogeneous and Hierarchical Parallel Systems
Author Affiliations:
1. Jääskeläinen, Pekka (pekka.jaaskelainen@tut.fi): Tampere University of Technology
2. de La Lama, Carlos Sánchez: Knowledge Development for POF
3. Schnetter, Erik: Perimeter Institute for Theoretical Physics, Department of Physics, University of Guelph, Center for Computation and Technology, Louisiana State University
4. Raiskila, Kalle: Nokia Research Center
5. Takala, Jarmo: Tampere University of Technology
6. Berg, Heikki: Nokia Research Center
CODEN: IJPPE5
Copyright: Springer Science+Business Media New York 2014
Springer Science+Business Media New York 2015
DOI: 10.1007/s10766-014-0320-y
Discipline: Computer Science
Peer Reviewed: Yes
Scholarly: Yes
Keywords: Parallel programming; LLVM; OpenCL; SIMD; Heterogeneous platforms; VLIW; GPGPU; Performance portability
References TTA-based codesign environment (TCE). http://tce.cs.tut.fi. Online; Accessed 18 May 2013
Nicolau, A., Li, G., Kejariwal, A.: Techniques for efficient placement of synchronization primitives. In: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’09, pp. 199–208. ACM, New York, NY, USA (2009). doi:10.1145/1504176.1504207
Hecht, M.S., Ullman, J.D.: Flow graph reducibility. In: Proceedings of Annual ACM Symposium on Theory of Computing, pp. 238–250. Denver, CO (1972)
Khronos Group, Beaverton, OR: OpenCL Specification, v1.2r19 edn. (2012)
Shibata, N.: SLEEF (SIMD library for evaluating elementary functions). Web Site (2013). http://shibatch.sourceforge.net
Schnetter, E.: Vecmathlib. http://bitbucket.org/eschnett/vecmathlib. Online; Accessed 5 Feb 2014
Shibata, N.: Efficient evaluation methods of elementary functions suitable for SIMD computation. In: Journal of Computer Science on Research and Development, Proceedings of the International Supercomputing Conference ISC10, vol. 25, pp. 25–32 (2010). doi:10.1007/s00450-010-0108-2
GschwindMHofsteeHPFlachsBHopkinsMWatanabeYYamazakiTSynergistic processing in Cell’s multicore architectureIEEE Micro200626102410.1109/MM.2006.41
AhoAVSethiRUllmanJDCompilers: Principles, Techniques, and Tools1986ReadingAddison-Wesley Longman Publishing Co. Inc.
AllenFEControl flow analysisACM SIGPLAN Not.19705711910.1145/390013.808479
Maher, B.A., Smith, A., Burger, D., McKinley, K.S.: Merging head and tail duplication for convergent hyperblock formation. In: Proceedings of Annual IEEE/ACM International Symposium on Microarchitecture, pp. 65–76. Orlando, FL (2006)
Nvidia Corp., Santa Clara, CA: NVIDIA CUDA Compute Unified Device Architecture: Programming Guide, v2.0 edn. (2008)
GoldbergDWhat every computer scientist should know about floating-point arithmeticACM Comput. Surv.19912354810.1145/103162.103163
Jääskeläinen, P., Sánchez de La Lama, C., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. Trans. HiPEAC 5 (2011). http://www.hipeac.net/node/4310
LLVM compiler infrastructure. http://llvm.org/. Online; Accessed 5 Feb 2014
Karrenberg, R., Hack, S.: Improving performance of OpenCL on CPUs. In: Proceedings of International Conference on Compiler Construction, pp. 1–20. Tallinn, Estonia (2012)
PressWHTeukolskySAVetterlingWTFlanneryBPNumerical Recipes 3rd Edition: The Art of Scientific Computing2007CambridgeCambridge University Press
ARM Ltd.: The ARM NEON™ general-purpose SIMD engine (2012). http://www.arm.com/products/processors/technologies/neon.php
IBM: OpenCL(TM) development kit for Linux on Power, v0.3 (2011)
MullerJMElementary Functions: Algorithms and Implementation2006LondonBirkhäuser
Karrenberg, R., Hack, S.: Whole-function vectorization. In: Proceedings of Annual IEEE/ACM International Symposium Code Generation and Optimization, pp. 141–150. Chamonix, France (2011)
Lee, J., Kim, J., Seo, S., Kim, S., Park, J., Kim, H., Dao, T.T., Cho, Y., Seo, S.J., Lee, S.H., Cho, S.M., Song, H.J., Suh, S.B., Choi, J.D.: An OpenCL framework for heterogeneous multicores with local memory. In: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pp. 193–204. ACM, New York, NY, USA (2010). doi:10.1145/1854273.1854301
Stratton, J.A., Stone, S.S., Hwu, W.M.W.: MCUDA: an efficient implementation of CUDA kernels for multi-core CPUs. In: J.N. Amaral (ed.) Languages and Compilers for Parallel Computing, LNCS, vol. 5335, pp. 16–30. Springer, Berlin (2008). doi:10.1007/978-3-540-89740-8_2
Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: Proceedings of International Conference on Parallel Architectures and Compilation Techniques, pp. 205–216. Vienna, Austria (2010)
Gummaraju, J., Sander, B., Morichetti, L., Gaster, B., Howes, L.: Efficient implementation of GPGPU synchronization primitives on CPUs. In: Proceedings of ACM International Conference on Computing Frontiers, pp. 85–86. Bertinoro, Italy (2010)
freeocl: Multi-platform implementation of OpenCL 1.2 targeting CPUs. http://code.google.com/p/freeocl/. Online; Accessed 18 May 2013
CytronRFerranteJRosenBKWegmanMNZadeckFKEfficiently computing static single assignment form and the control dependence graphACM Trans. Program. Lang. Syst.199113445149010.1145/115372.115320
IEEE, Piscataway, NJ: Standard for floating-point arithmetic (2008). Std 754-2008
Advanced Micro Devices Inc: Accelerated parallel processing (APP) software development kit (SDK) v2.8 (2012)
CorporaalHMicroprocessor Architectures: From VLIW to TTA1997ChichesterWiley
Cammarota, R., Nicolau, A., Veidenbaum, A.V., Kejariwal, A., Donato, D., Madhugiri, M.: On the determination of inlining vectors for program optimization. In: Proceedings of 22nd International Conference on Compiler Construction, CC’13, pp. 164–183. Springer, Berlin (2013). doi:10.1007/978-3-642-37051-9_9
ARM Ltd.: The ARMCortex™ A9 processor (2013). http://www.arm.com/products/processors/cortex-a/cortex-a9.php
IEEE, Piscataway, NJ: IEEE standard for information technology—portable operation system interface (POSIX). Shell and utilities., 2004 edn. (2004). Std 1003.1
Khronos Group: SPIR 1.2 Specification for OpenCL (2014)
Clover Git: OpenCL 1.1 software implementation. http://people.freedesktop.org/steckdenis/clover/index.html. Online; Accessed 18 May 2013
Cocke, J.: Global common subexpression elimination. In: Proceedings of Symposium Compiler Optimization, pp. 20–24. Urbana-Champaign, IL (1970)
Esko, O., Jääskeläinen, P., Huerta, P., de La Lama, C.S., Takala, J., Martinez, J.I.: Customized exposed datapath soft-core design flow with compiler support. In: International Conference on Field Programmable Logic and Applications, pp. 217–222. Milan, Italy (2010)
Intel Corp.: Desktop 4th Gen IntelCore™ Processor Family: Datasheet, Vol. 1 (2013). Doc. No. 328897-004
FisherJTrace scheduling: a technique for global microcode compactionIEEE Trans. Comput.1981C–30747849010.1109/TC.1981.1675827
Kejariwal, A., Nicolau, A., Saito, H., Tian, X., Girkar, M., Banerjee, U., Polychronopoulos, C.D.: A general approach for partitioning N-dimensional parallel nested loops with conditionals. In: Proceedings of 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’06, pp. 49–58. ACM, New York, NY, USA (2006). doi:10.1145/1148109.1148117
Clang: A C language frontend for LLVM. http://clang.llvm.org/. Online; Accessed 5 Feb 2014
Allen, J., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: Proceedings of ACM Symposium Principles of Programming Languages, Austin, TX, pp. 177–189 (1983)
Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis and transformation. In: Proceedings of International Symposium on Code Generation Optimization, p. 75 (2004)
Rotem, N.: Intel OpenCL SDK vectorizer. LLVM Developer’s Meeting (2011)
Nicolau, A., Li, G., Veidenbaum, A.V., Kejariwal, A.: Synchronization optimizations for efficient execution on multi-cores. In: Proceedings of the 23rd International Conference on Supercomputing, ICS ’09, pp. 169–180. ACM, New York, NY, USA (2009). doi:10.1145/1542275.1542303
Clover Git: Implementing barriers. http://people.freedesktop.org/steckdenis/clover/barrier.html. Online; Accessed 18 May 2013
JanssenJCorporaalHMaking graphs reducible with controlled node splittingACM Trans. Program. Lang. Syst.19971961031105210.1145/267959.269971
R Cytron (320_CR16) 1991; 13
320_CR12
D Goldberg (320_CR19) 1991; 23
320_CR34
320_CR13
320_CR35
320_CR10
320_CR32
320_CR11
320_CR33
J Janssen (320_CR29) 1997; 19
320_CR30
320_CR31
320_CR1
AV Aho (320_CR8) 1986
320_CR27
320_CR2
320_CR28
320_CR3
320_CR25
320_CR47
320_CR4
320_CR26
320_CR5
320_CR6
H Corporaal (320_CR15) 1997
320_CR7
FE Allen (320_CR9) 1970; 5
320_CR23
320_CR45
320_CR24
320_CR46
320_CR21
320_CR43
320_CR22
320_CR44
320_CR41
320_CR40
WH Press (320_CR42) 2007
JM Muller (320_CR38) 2006
320_CR17
320_CR39
320_CR14
320_CR36
320_CR37
J Fisher (320_CR18) 1981; C–30
M Gschwind (320_CR20) 2006; 26
References_xml – reference: Hecht, M.S., Ullman, J.D.: Flow graph reducibility. In: Proceedings of Annual ACM Symposium on Theory of Computing, pp. 238–250. Denver, CO (1972)
– reference: Nvidia Corp., Santa Clara, CA: NVIDIA CUDA Compute Unified Device Architecture: Programming Guide, v2.0 edn. (2008)
– reference: JanssenJCorporaalHMaking graphs reducible with controlled node splittingACM Trans. Program. Lang. Syst.19971961031105210.1145/267959.269971
– reference: AllenFEControl flow analysisACM SIGPLAN Not.19705711910.1145/390013.808479
– reference: IBM: OpenCL(TM) development kit for Linux on Power, v0.3 (2011)
– reference: MullerJMElementary Functions: Algorithms and Implementation2006LondonBirkhäuser
– reference: TTA-based codesign environment (TCE). http://tce.cs.tut.fi. Online; Accessed 18 May 2013
– reference: Karrenberg, R., Hack, S.: Whole-function vectorization. In: Proceedings of Annual IEEE/ACM International Symposium Code Generation and Optimization, pp. 141–150. Chamonix, France (2011)
– reference: Intel Corp.: Desktop 4th Gen IntelCore™ Processor Family: Datasheet, Vol. 1 (2013). Doc. No. 328897-004
– reference: Nicolau, A., Li, G., Kejariwal, A.: Techniques for efficient placement of synchronization primitives. In: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’09, pp. 199–208. ACM, New York, NY, USA (2009). doi:10.1145/1504176.1504207
– reference: Gummaraju, J., Sander, B., Morichetti, L., Gaster, B., Howes, L.: Efficient implementation of GPGPU synchronization primitives on CPUs. In: Proceedings of ACM International Conference on Computing Frontiers, pp. 85–86. Bertinoro, Italy (2010)
– reference: Clover Git: OpenCL 1.1 software implementation. http://people.freedesktop.org/steckdenis/clover/index.html. Online; Accessed 18 May 2013
– reference: GschwindMHofsteeHPFlachsBHopkinsMWatanabeYYamazakiTSynergistic processing in Cell’s multicore architectureIEEE Micro200626102410.1109/MM.2006.41
– reference: Rotem, N.: Intel OpenCL SDK vectorizer. LLVM Developer’s Meeting (2011)
– reference: IEEE, Piscataway, NJ: IEEE standard for information technology—portable operation system interface (POSIX). Shell and utilities., 2004 edn. (2004). Std 1003.1
– reference: Karrenberg, R., Hack, S.: Improving performance of OpenCL on CPUs. In: Proceedings of International Conference on Compiler Construction, pp. 1–20. Tallinn, Estonia (2012)
– reference: Shibata, N.: Efficient evaluation methods of elementary functions suitable for SIMD computation. In: Journal of Computer Science on Research and Development, Proceedings of the International Supercomputing Conference ISC10, vol. 25, pp. 25–32 (2010). doi:10.1007/s00450-010-0108-2
– reference: Allen, J., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: Proceedings of ACM Symposium Principles of Programming Languages, Austin, TX, pp. 177–189 (1983)
– reference: CytronRFerranteJRosenBKWegmanMNZadeckFKEfficiently computing static single assignment form and the control dependence graphACM Trans. Program. Lang. Syst.199113445149010.1145/115372.115320
– reference: freeocl: Multi-platform implementation of OpenCL 1.2 targeting CPUs. http://code.google.com/p/freeocl/. Online; Accessed 18 May 2013
– reference: IEEE, Piscataway, NJ: Standard for floating-point arithmetic (2008). Std 754-2008
– reference: Cocke, J.: Global common subexpression elimination. In: Proceedings of Symposium Compiler Optimization, pp. 20–24. Urbana-Champaign, IL (1970)
– reference: Jääskeläinen, P., Sánchez de La Lama, C., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. Trans. HiPEAC 5 (2011). http://www.hipeac.net/node/4310
– reference: Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis and transformation. In: Proceedings of International Symposium on Code Generation Optimization, p. 75 (2004)
– reference: LLVM compiler infrastructure. http://llvm.org/. Online; Accessed 5 Feb 2014
– reference: FisherJTrace scheduling: a technique for global microcode compactionIEEE Trans. Comput.1981C–30747849010.1109/TC.1981.1675827
– reference: CorporaalHMicroprocessor Architectures: From VLIW to TTA1997ChichesterWiley
– reference: Advanced Micro Devices Inc: Accelerated parallel processing (APP) software development kit (SDK) v2.8 (2012)
– reference: PressWHTeukolskySAVetterlingWTFlanneryBPNumerical Recipes 3rd Edition: The Art of Scientific Computing2007CambridgeCambridge University Press
– reference: Shibata, N.: SLEEF (SIMD library for evaluating elementary functions). Web Site (2013). http://shibatch.sourceforge.net/
– reference: ARM Ltd.: The ARM NEON™ general-purpose SIMD engine (2012). http://www.arm.com/products/processors/technologies/neon.php
– reference: Cammarota, R., Nicolau, A., Veidenbaum, A.V., Kejariwal, A., Donato, D., Madhugiri, M.: On the determination of inlining vectors for program optimization. In: Proceedings of 22nd International Conference on Compiler Construction, CC’13, pp. 164–183. Springer, Berlin (2013). doi:10.1007/978-3-642-37051-9_9
– reference: Stratton, J.A., Stone, S.S., Hwu, W.M.W.: MCUDA: an efficient implementation of CUDA kernels for multi-core CPUs. In: J.N. Amaral (ed.) Languages and Compilers for Parallel Computing, LNCS, vol. 5335, pp. 16–30. Springer, Berlin (2008). doi:10.1007/978-3-540-89740-8_2
– reference: Schnetter, E.: Vecmathlib. http://bitbucket.org/eschnett/vecmathlib. Online; Accessed 5 Feb 2014
– reference: Clang: A C language frontend for LLVM. http://clang.llvm.org/. Online; Accessed 5 Feb 2014
– reference: Khronos Group: SPIR 1.2 Specification for OpenCL (2014)
– reference: ARM Ltd.: The ARM Cortex™ A9 processor (2013). http://www.arm.com/products/processors/cortex-a/cortex-a9.php
– reference: Esko, O., Jääskeläinen, P., Huerta, P., de La Lama, C.S., Takala, J., Martinez, J.I.: Customized exposed datapath soft-core design flow with compiler support. In: International Conference on Field Programmable Logic and Applications, pp. 217–222. Milan, Italy (2010)
– reference: Maher, B.A., Smith, A., Burger, D., McKinley, K.S.: Merging head and tail duplication for convergent hyperblock formation. In: Proceedings of Annual IEEE/ACM International Symposium on Microarchitecture, pp. 65–76. Orlando, FL (2006)
– reference: Kejariwal, A., Nicolau, A., Saito, H., Tian, X., Girkar, M., Banerjee, U., Polychronopoulos, C.D.: A general approach for partitioning N-dimensional parallel nested loops with conditionals. In: Proceedings of 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’06, pp. 49–58. ACM, New York, NY, USA (2006). doi:10.1145/1148109.1148117
– reference: Goldberg, D.: What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23, 5–48 (1991). doi:10.1145/103162.103163
– reference: Nicolau, A., Li, G., Veidenbaum, A.V., Kejariwal, A.: Synchronization optimizations for efficient execution on multi-cores. In: Proceedings of the 23rd International Conference on Supercomputing, ICS ’09, pp. 169–180. ACM, New York, NY, USA (2009). doi:10.1145/1542275.1542303
– reference: Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co. Inc., Reading (1986)
– reference: Lee, J., Kim, J., Seo, S., Kim, S., Park, J., Kim, H., Dao, T.T., Cho, Y., Seo, S.J., Lee, S.H., Cho, S.M., Song, H.J., Suh, S.B., Choi, J.D.: An OpenCL framework for heterogeneous multicores with local memory. In: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pp. 193–204. ACM, New York, NY, USA (2010). doi:10.1145/1854273.1854301
– reference: Clover Git: Implementing barriers. http://people.freedesktop.org/steckdenis/clover/barrier.html. Online; Accessed 18 May 2013
– reference: Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: Proceedings of International Conference on Parallel Architectures and Compilation Techniques, pp. 205–216. Vienna, Austria (2010)
– reference: Khronos Group, Beaverton, OR: OpenCL Specification, v1.2r19 edn. (2012)
Issue Title: Includes a Special Section on High-level Heterogeneous and Hierarchical Parallel Systems
StartPage 752
SubjectTerms Analysis
Compilers
Computer programming
Computer Science
Formations
Kernels
Language
Mathematical functions
Parallel programming
Platforms
Portability
Processor Architectures
Proprietary
Software
Software Engineering/Programming and Operating Systems
Source code
Studies
Theory of Computation
SummonAdditionalLinks – databaseName: Springer Journals New Starts & Take-Overs Collection
  dbid: RSV
  link: http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV3fS8MwED50-uCL8ydWp1TwSQlsWdIkvo3h8GGMoUP2Vto0AWF2Yz-E_fdeunadooI-N0nLJV9zH3f3HcCNtQqprBEEqQIlTDtI2SQhjFL0R2XdCpvtdFf0enI4VP28jntWZLsXIcnsT71R7CYCx34ZcU2_yXIbdvC2kw6NT88vpdKuyJpNIno4EYzLIpT53RKfL6PSw_wSFM3umk71X195APu5a-m3VmfhELZMegTVom2Dn6P4GOhkrEf3fsvvl0UDJMsojUfGdxkm7a6fqQa_5YVJ6QkMOg-D9iPJWycQzSSfE8k18hyB_k2iuOHKiEjFCUuoihivNyyNnPRWZJ04e4CwTDTiPtLILxraCNs8hUo6Ts0Z-EnsNP4SbiVOtRFTNOboNUZaC45shHpQL0wY6lxW3HW3GIWlILIzSYgmCZ1JwqUHt-spk5Wmxm-Da8W-hDm8ZmEjUBKJEuVND67XjxEYLtoRpWa8wDFOiV8yFTQ8uCv2amOJn154_qfRF7CHLhRfpffVoDKfLswl7Or3-etsepWdzA-SlNss
  priority: 102
  providerName: Springer Nature
Title pocl: A Performance-Portable OpenCL Implementation
URI https://link.springer.com/article/10.1007/s10766-014-0320-y
https://www.proquest.com/docview/1698576253
https://www.proquest.com/docview/1730084961
Volume 43