pocl: A Performance-Portable OpenCL Implementation
OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings th...
| Published in: | International Journal of Parallel Programming, Vol. 43; No. 5; pp. 752-785 |
|---|---|
| Main authors: | Pekka Jääskeläinen, Carlos Sánchez de La Lama, Erik Schnetter, Kalle Raiskila, Jarmo Takala, Heikki Berg |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer US; Springer Nature B.V, 2015-10-01 |
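| Subjects: | Parallel programming; LLVM; OpenCL; SIMD; Heterogeneous platforms; VLIW; GPGPU; Performance portability |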
| ISSN: | 0885-7458, 1573-7640 |
| Online access: | Full text |
| Abstract | OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear: multiple vendors can provide support for application descriptions written according to the standard, thus reducing the program porting effort. While the standard brings the obvious benefits of platform portability, the performance portability aspects are largely left to the programmer. The situation is made worse by multiple proprietary vendor implementations with different characteristics and, thus, different required optimization strategies. In this paper, we propose an OpenCL implementation that is both portable and performance portable. At its core is a kernel compiler that can be used to exploit the data parallelism of OpenCL programs on multiple platforms with different parallel hardware styles. The kernel compiler is modularized to perform target-independent parallel region formation separately from the target-specific parallel mapping of the regions, which enables support for various styles of fine-grained parallel resources such as subword SIMD extensions, SIMD datapaths and static multi-issue. Unlike previous similar techniques that work on the source level, the parallel region formation retains the information of the data parallelism using the LLVM IR and its metadata infrastructure. This data can be exploited by the later generic compiler passes for efficient parallelization. The proposed open source implementation of OpenCL is also platform portable, enabling OpenCL on a wide range of architectures, both those already commercialized and those still under research. The paper describes how the portability of the implementation is achieved. We test the two aspects of portability by using the kernel compiler and the OpenCL implementation to run OpenCL applications on various platforms with different styles of parallel resources. The results show that most of the benchmarked applications, when compiled using pocl, were faster than or close to as fast as the best proprietary OpenCL implementation for the platform at hand. |
|---|---|
| AbstractList | Issue Title: Includes a Special Section on High-level Heterogeneous and Hierarchical Parallel Systems |
| Author | Raiskila, Kalle de La Lama, Carlos Sánchez Jääskeläinen, Pekka Berg, Heikki Schnetter, Erik Takala, Jarmo |
| Author_xml | – sequence: 1 givenname: Pekka surname: Jääskeläinen fullname: Jääskeläinen, Pekka email: pekka.jaaskelainen@tut.fi organization: Tampere University of Technology – sequence: 2 givenname: Carlos Sánchez surname: de La Lama fullname: de La Lama, Carlos Sánchez organization: Knowledge Development for POF – sequence: 3 givenname: Erik surname: Schnetter fullname: Schnetter, Erik organization: Perimeter Institute for Theoretical Physics, Department of Physics, University of Guelph, Center for Computation and Technology, Louisiana State University – sequence: 4 givenname: Kalle surname: Raiskila fullname: Raiskila, Kalle organization: Nokia Research Center – sequence: 5 givenname: Jarmo surname: Takala fullname: Takala, Jarmo organization: Tampere University of Technology – sequence: 6 givenname: Heikki surname: Berg fullname: Berg, Heikki organization: Nokia Research Center |
| CODEN | IJPPE5 |
| CitedBy_id | crossref_primary_10_1016_j_cageo_2019_04_003 crossref_primary_10_3847_1538_4357_aa6f06 crossref_primary_10_1177_10943420251369350 crossref_primary_10_1145_3315569 crossref_primary_10_1007_s42514_020_00039_4 crossref_primary_10_1109_ACCESS_2025_3546635 crossref_primary_10_1016_j_aam_2021_102229 crossref_primary_10_1016_j_sysarc_2017_10_004 crossref_primary_10_1145_3199610_3199614 crossref_primary_10_1145_3659949 crossref_primary_10_1109_TC_2021_3107196 crossref_primary_10_1016_j_micpro_2023_104772 crossref_primary_10_1007_s11042_018_6532_1 crossref_primary_10_1109_MCSE_2021_3083547 crossref_primary_10_3233_JIFS_200616 crossref_primary_10_1007_s11265_018_1416_1 crossref_primary_10_1007_s42514_024_00181_3 crossref_primary_10_3390_computers13100250 crossref_primary_10_1145_3140582_3081040 crossref_primary_10_1109_TVLSI_2025_3574427 crossref_primary_10_1109_TC_2018_2793919 crossref_primary_10_1007_s11227_023_05879_9 crossref_primary_10_1016_j_combustflame_2018_09_008 crossref_primary_10_1016_j_ijepes_2024_110014 crossref_primary_10_1109_TPDS_2021_3116859 crossref_primary_10_1109_TVLSI_2019_2897508 crossref_primary_10_1145_3434312 crossref_primary_10_1145_3177960 crossref_primary_10_1007_s11265_018_1422_3 crossref_primary_10_1007_s11265_018_1424_1 crossref_primary_10_1016_j_parco_2021_102754 crossref_primary_10_1145_3554736 crossref_primary_10_1631_FITEE_2200359 |
| Cites_doi | 10.1145/115372.115320 10.1145/103162.103163 10.1109/MICRO.2006.34 10.1007/978-3-642-37051-9_9 10.1109/TC.1981.1675827 10.1145/1542275.1542303 10.1145/1854273.1854302 10.1007/s00450-010-0108-2 10.1109/FPL.2010.51 10.1145/1148109.1148117 10.1145/1854273.1854301 10.1007/978-3-642-28652-0_1 10.1145/567067.567085 10.1109/CGO.2004.1281665 10.1145/800152.804919 10.1109/MM.2006.41 10.1145/1787275.1787295 10.1145/800028.808480 10.1145/267959.269971 10.1145/1504176.1504207 10.1007/978-3-540-89740-8_2 10.1109/CGO.2011.5764682 10.1145/390013.808479 |
| ContentType | Journal Article |
| Copyright | Springer Science+Business Media New York 2014 Springer Science+Business Media New York 2015 |
| Copyright_xml | – notice: Springer Science+Business Media New York 2014 – notice: Springer Science+Business Media New York 2015 |
| DOI | 10.1007/s10766-014-0320-y |
| Discipline | Computer Science |
| EISSN | 1573-7640 |
| EndPage | 785 |
| ExternalDocumentID | 3755403461 10_1007_s10766_014_0320_y |
| Genre | Feature |
| ISICitedReferencesCount | 71 |
| ISSN | 0885-7458 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Keywords | Parallel programming LLVM OpenCL SIMD Heterogeneous platforms VLIW GPGPU Performance portability |
| Language | English |
| PageCount | 34 |
| PublicationCentury | 2000 |
| PublicationDate | 2015-10-01 |
| PublicationDateYYYYMMDD | 2015-10-01 |
| PublicationDate_xml | – month: 10 year: 2015 text: 2015-10-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | International journal of parallel programming |
| PublicationTitleAbbrev | Int J Parallel Prog |
| PublicationYear | 2015 |
| Publisher | Springer US Springer Nature B.V |
| Publisher_xml | – name: Springer US – name: Springer Nature B.V |
| References | TTA-based codesign environment (TCE). http://tce.cs.tut.fi. Online; Accessed 18 May 2013 Nicolau, A., Li, G., Kejariwal, A.: Techniques for efficient placement of synchronization primitives. In: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’09, pp. 199–208. ACM, New York, NY, USA (2009). doi:10.1145/1504176.1504207 Hecht, M.S., Ullman, J.D.: Flow graph reducibility. In: Proceedings of Annual ACM Symposium on Theory of Computing, pp. 238–250. Denver, CO (1972) Khronos Group, Beaverton, OR: OpenCL Specification, v1.2r19 edn. (2012) Shibata, N.: SLEEF (SIMD library for evaluating elementary functions). Web Site (2013). http://shibatch.sourceforge.net Schnetter, E.: Vecmathlib. http://bitbucket.org/eschnett/vecmathlib. Online; Accessed 5 Feb 2014 Shibata, N.: Efficient evaluation methods of elementary functions suitable for SIMD computation. In: Journal of Computer Science on Research and Development, Proceedings of the International Supercomputing Conference ISC10, vol. 25, pp. 25–32 (2010). doi:10.1007/s00450-010-0108-2 GschwindMHofsteeHPFlachsBHopkinsMWatanabeYYamazakiTSynergistic processing in Cell’s multicore architectureIEEE Micro200626102410.1109/MM.2006.41 AhoAVSethiRUllmanJDCompilers: Principles, Techniques, and Tools1986ReadingAddison-Wesley Longman Publishing Co. Inc. AllenFEControl flow analysisACM SIGPLAN Not.19705711910.1145/390013.808479 Maher, B.A., Smith, A., Burger, D., McKinley, K.S.: Merging head and tail duplication for convergent hyperblock formation. In: Proceedings of Annual IEEE/ACM International Symposium on Microarchitecture, pp. 65–76. Orlando, FL (2006) Nvidia Corp., Santa Clara, CA: NVIDIA CUDA Compute Unified Device Architecture: Programming Guide, v2.0 edn. (2008) GoldbergDWhat every computer scientist should know about floating-point arithmeticACM Comput. 
Surv.19912354810.1145/103162.103163 Jääskeläinen, P., Sánchez de La Lama, C., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. Trans. HiPEAC 5 (2011). http://www.hipeac.net/node/4310 LLVM compiler infrastructure. http://llvm.org/. Online; Accessed 5 Feb 2014 Karrenberg, R., Hack, S.: Improving performance of OpenCL on CPUs. In: Proceedings of International Conference on Compiler Construction, pp. 1–20. Tallinn, Estonia (2012) PressWHTeukolskySAVetterlingWTFlanneryBPNumerical Recipes 3rd Edition: The Art of Scientific Computing2007CambridgeCambridge University Press ARM Ltd.: The ARM NEON™ general-purpose SIMD engine (2012). http://www.arm.com/products/processors/technologies/neon.php IBM: OpenCL(TM) development kit for Linux on Power, v0.3 (2011) MullerJMElementary Functions: Algorithms and Implementation2006LondonBirkhäuser Karrenberg, R., Hack, S.: Whole-function vectorization. In: Proceedings of Annual IEEE/ACM International Symposium Code Generation and Optimization, pp. 141–150. Chamonix, France (2011) Lee, J., Kim, J., Seo, S., Kim, S., Park, J., Kim, H., Dao, T.T., Cho, Y., Seo, S.J., Lee, S.H., Cho, S.M., Song, H.J., Suh, S.B., Choi, J.D.: An OpenCL framework for heterogeneous multicores with local memory. In: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pp. 193–204. ACM, New York, NY, USA (2010). doi:10.1145/1854273.1854301 Stratton, J.A., Stone, S.S., Hwu, W.M.W.: MCUDA: an efficient implementation of CUDA kernels for multi-core CPUs. In: J.N. Amaral (ed.) Languages and Compilers for Parallel Computing, LNCS, vol. 5335, pp. 16–30. Springer, Berlin (2008). doi:10.1007/978-3-540-89740-8_2 Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. 
In: Proceedings of International Conference on Parallel Architectures and Compilation Techniques, pp. 205–216. Vienna, Austria (2010) Gummaraju, J., Sander, B., Morichetti, L., Gaster, B., Howes, L.: Efficient implementation of GPGPU synchronization primitives on CPUs. In: Proceedings of ACM International Conference on Computing Frontiers, pp. 85–86. Bertinoro, Italy (2010) freeocl: Multi-platform implementation of OpenCL 1.2 targeting CPUs. http://code.google.com/p/freeocl/. Online; Accessed 18 May 2013 CytronRFerranteJRosenBKWegmanMNZadeckFKEfficiently computing static single assignment form and the control dependence graphACM Trans. Program. Lang. Syst.199113445149010.1145/115372.115320 IEEE, Piscataway, NJ: Standard for floating-point arithmetic (2008). Std 754-2008 Advanced Micro Devices Inc: Accelerated parallel processing (APP) software development kit (SDK) v2.8 (2012) CorporaalHMicroprocessor Architectures: From VLIW to TTA1997ChichesterWiley Cammarota, R., Nicolau, A., Veidenbaum, A.V., Kejariwal, A., Donato, D., Madhugiri, M.: On the determination of inlining vectors for program optimization. In: Proceedings of 22nd International Conference on Compiler Construction, CC’13, pp. 164–183. Springer, Berlin (2013). doi:10.1007/978-3-642-37051-9_9 ARM Ltd.: The ARMCortex™ A9 processor (2013). http://www.arm.com/products/processors/cortex-a/cortex-a9.php IEEE, Piscataway, NJ: IEEE standard for information technology—portable operation system interface (POSIX). Shell and utilities., 2004 edn. (2004). Std 1003.1 Khronos Group: SPIR 1.2 Specification for OpenCL (2014) Clover Git: OpenCL 1.1 software implementation. http://people.freedesktop.org/steckdenis/clover/index.html. Online; Accessed 18 May 2013 Cocke, J.: Global common subexpression elimination. In: Proceedings of Symposium Compiler Optimization, pp. 20–24. 
Urbana-Champaign, IL (1970) Esko, O., Jääskeläinen, P., Huerta, P., de La Lama, C.S., Takala, J., Martinez, J.I.: Customized exposed datapath soft-core design flow with compiler support. In: International Conference on Field Programmable Logic and Applications, pp. 217–222. Milan, Italy (2010) Intel Corp.: Desktop 4th Gen IntelCore™ Processor Family: Datasheet, Vol. 1 (2013). Doc. No. 328897-004 FisherJTrace scheduling: a technique for global microcode compactionIEEE Trans. Comput.1981C–30747849010.1109/TC.1981.1675827 Kejariwal, A., Nicolau, A., Saito, H., Tian, X., Girkar, M., Banerjee, U., Polychronopoulos, C.D.: A general approach for partitioning N-dimensional parallel nested loops with conditionals. In: Proceedings of 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’06, pp. 49–58. ACM, New York, NY, USA (2006). doi:10.1145/1148109.1148117 Clang: A C language frontend for LLVM. http://clang.llvm.org/. Online; Accessed 5 Feb 2014 Allen, J., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: Proceedings of ACM Symposium Principles of Programming Languages, Austin, TX, pp. 177–189 (1983) Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis and transformation. In: Proceedings of International Symposium on Code Generation Optimization, p. 75 (2004) Rotem, N.: Intel OpenCL SDK vectorizer. LLVM Developer’s Meeting (2011) Nicolau, A., Li, G., Veidenbaum, A.V., Kejariwal, A.: Synchronization optimizations for efficient execution on multi-cores. In: Proceedings of the 23rd International Conference on Supercomputing, ICS ’09, pp. 169–180. ACM, New York, NY, USA (2009). doi:10.1145/1542275.1542303 Clover Git: Implementing barriers. http://people.freedesktop.org/steckdenis/clover/barrier.html. Online; Accessed 18 May 2013 JanssenJCorporaalHMaking graphs reducible with controlled node splittingACM Trans. Program. Lang. 
Syst.19971961031105210.1145/267959.269971 R Cytron (320_CR16) 1991; 13 320_CR12 D Goldberg (320_CR19) 1991; 23 320_CR34 320_CR13 320_CR35 320_CR10 320_CR32 320_CR11 320_CR33 J Janssen (320_CR29) 1997; 19 320_CR30 320_CR31 320_CR1 AV Aho (320_CR8) 1986 320_CR27 320_CR2 320_CR28 320_CR3 320_CR25 320_CR47 320_CR4 320_CR26 320_CR5 320_CR6 H Corporaal (320_CR15) 1997 320_CR7 FE Allen (320_CR9) 1970; 5 320_CR23 320_CR45 320_CR24 320_CR46 320_CR21 320_CR43 320_CR22 320_CR44 320_CR41 320_CR40 WH Press (320_CR42) 2007 JM Muller (320_CR38) 2006 320_CR17 320_CR39 320_CR14 320_CR36 320_CR37 J Fisher (320_CR18) 1981; C–30 M Gschwind (320_CR20) 2006; 26 |
| References_xml | – reference: Hecht, M.S., Ullman, J.D.: Flow graph reducibility. In: Proceedings of Annual ACM Symposium on Theory of Computing, pp. 238–250. Denver, CO (1972) – reference: Nvidia Corp., Santa Clara, CA: NVIDIA CUDA Compute Unified Device Architecture: Programming Guide, v2.0 edn. (2008) – reference: JanssenJCorporaalHMaking graphs reducible with controlled node splittingACM Trans. Program. Lang. Syst.19971961031105210.1145/267959.269971 – reference: AllenFEControl flow analysisACM SIGPLAN Not.19705711910.1145/390013.808479 – reference: IBM: OpenCL(TM) development kit for Linux on Power, v0.3 (2011) – reference: MullerJMElementary Functions: Algorithms and Implementation2006LondonBirkhäuser – reference: TTA-based codesign environment (TCE). http://tce.cs.tut.fi. Online; Accessed 18 May 2013 – reference: Karrenberg, R., Hack, S.: Whole-function vectorization. In: Proceedings of Annual IEEE/ACM International Symposium Code Generation and Optimization, pp. 141–150. Chamonix, France (2011) – reference: Intel Corp.: Desktop 4th Gen IntelCore™ Processor Family: Datasheet, Vol. 1 (2013). Doc. No. 328897-004 – reference: Nicolau, A., Li, G., Kejariwal, A.: Techniques for efficient placement of synchronization primitives. In: Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’09, pp. 199–208. ACM, New York, NY, USA (2009). doi:10.1145/1504176.1504207 – reference: Gummaraju, J., Sander, B., Morichetti, L., Gaster, B., Howes, L.: Efficient implementation of GPGPU synchronization primitives on CPUs. In: Proceedings of ACM International Conference on Computing Frontiers, pp. 85–86. Bertinoro, Italy (2010) – reference: Clover Git: OpenCL 1.1 software implementation. http://people.freedesktop.org/steckdenis/clover/index.html. 
Online; Accessed 18 May 2013 – reference: GschwindMHofsteeHPFlachsBHopkinsMWatanabeYYamazakiTSynergistic processing in Cell’s multicore architectureIEEE Micro200626102410.1109/MM.2006.41 – reference: Rotem, N.: Intel OpenCL SDK vectorizer. LLVM Developer’s Meeting (2011) – reference: IEEE, Piscataway, NJ: IEEE standard for information technology—portable operation system interface (POSIX). Shell and utilities., 2004 edn. (2004). Std 1003.1 – reference: Karrenberg, R., Hack, S.: Improving performance of OpenCL on CPUs. In: Proceedings of International Conference on Compiler Construction, pp. 1–20. Tallinn, Estonia (2012) – reference: Shibata, N.: Efficient evaluation methods of elementary functions suitable for SIMD computation. In: Journal of Computer Science on Research and Development, Proceedings of the International Supercomputing Conference ISC10, vol. 25, pp. 25–32 (2010). doi:10.1007/s00450-010-0108-2 – reference: Allen, J., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: Proceedings of ACM Symposium Principles of Programming Languages, Austin, TX, pp. 177–189 (1983) – reference: CytronRFerranteJRosenBKWegmanMNZadeckFKEfficiently computing static single assignment form and the control dependence graphACM Trans. Program. Lang. Syst.199113445149010.1145/115372.115320 – reference: freeocl: Multi-platform implementation of OpenCL 1.2 targeting CPUs. http://code.google.com/p/freeocl/. Online; Accessed 18 May 2013 – reference: IEEE, Piscataway, NJ: Standard for floating-point arithmetic (2008). Std 754-2008 – reference: Cocke, J.: Global common subexpression elimination. In: Proceedings of Symposium Compiler Optimization, pp. 20–24. Urbana-Champaign, IL (1970) – reference: Jääskeläinen, P., Sánchez de La Lama, C., Huerta, P., Takala, J.: OpenCL-based design methodology for application-specific processors. Trans. HiPEAC 5 (2011). 
http://www.hipeac.net/node/4310 – reference: Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis and transformation. In: Proceedings of International Symposium on Code Generation Optimization, p. 75 (2004) – reference: LLVM compiler infrastructure. http://llvm.org/. Online; Accessed 5 Feb 2014 – reference: FisherJTrace scheduling: a technique for global microcode compactionIEEE Trans. Comput.1981C–30747849010.1109/TC.1981.1675827 – reference: CorporaalHMicroprocessor Architectures: From VLIW to TTA1997ChichesterWiley – reference: Advanced Micro Devices Inc: Accelerated parallel processing (APP) software development kit (SDK) v2.8 (2012) – reference: PressWHTeukolskySAVetterlingWTFlanneryBPNumerical Recipes 3rd Edition: The Art of Scientific Computing2007CambridgeCambridge University Press – reference: Shibata, N.: SLEEF (SIMD library for evaluating elementary functions). Web Site (2013). http://shibatch.sourceforge.net/ – reference: ARM Ltd.: The ARM NEON™ general-purpose SIMD engine (2012). http://www.arm.com/products/processors/technologies/neon.php – reference: Cammarota, R., Nicolau, A., Veidenbaum, A.V., Kejariwal, A., Donato, D., Madhugiri, M.: On the determination of inlining vectors for program optimization. In: Proceedings of 22nd International Conference on Compiler Construction, CC’13, pp. 164–183. Springer, Berlin (2013). doi:10.1007/978-3-642-37051-9_9 – reference: Stratton, J.A., Stone, S.S., Hwu, W.M.W.: MCUDA: an efficient implementation of CUDA kernels for multi-core CPUs. In: J.N. Amaral (ed.) Languages and Compilers for Parallel Computing, LNCS, vol. 5335, pp. 16–30. Springer, Berlin (2008). doi:10.1007/978-3-540-89740-8_2 – reference: Schnetter, E.: Vecmathlib. http://bitbucket.org/eschnett/vecmathlib. Online; Accessed 5 Feb 2014 – reference: Clang: A C language frontend for LLVM. http://clang.llvm.org/. 
Online; Accessed 5 Feb 2014 – reference: Khronos Group: SPIR 1.2 Specification for OpenCL (2014) – reference: ARM Ltd.: The ARMCortex™ A9 processor (2013). http://www.arm.com/products/processors/cortex-a/cortex-a9.php – reference: Esko, O., Jääskeläinen, P., Huerta, P., de La Lama, C.S., Takala, J., Martinez, J.I.: Customized exposed datapath soft-core design flow with compiler support. In: International Conference on Field Programmable Logic and Applications, pp. 217–222. Milan, Italy (2010) – reference: Maher, B.A., Smith, A., Burger, D., McKinley, K.S.: Merging head and tail duplication for convergent hyperblock formation. In: Proceedings of Annual IEEE/ACM International Symposium on Microarchitecture, pp. 65–76. Orlando, FL (2006) – reference: Kejariwal, A., Nicolau, A., Saito, H., Tian, X., Girkar, M., Banerjee, U., Polychronopoulos, C.D.: A general approach for partitioning N-dimensional parallel nested loops with conditionals. In: Proceedings of 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’06, pp. 49–58. ACM, New York, NY, USA (2006). doi:10.1145/1148109.1148117 – reference: GoldbergDWhat every computer scientist should know about floating-point arithmeticACM Comput. Surv.19912354810.1145/103162.103163 – reference: Nicolau, A., Li, G., Veidenbaum, A.V., Kejariwal, A.: Synchronization optimizations for efficient execution on multi-cores. In: Proceedings of the 23rd International Conference on Supercomputing, ICS ’09, pp. 169–180. ACM, New York, NY, USA (2009). doi:10.1145/1542275.1542303 – reference: AhoAVSethiRUllmanJDCompilers: Principles, Techniques, and Tools1986ReadingAddison-Wesley Longman Publishing Co. Inc. – reference: Lee, J., Kim, J., Seo, S., Kim, S., Park, J., Kim, H., Dao, T.T., Cho, Y., Seo, S.J., Lee, S.H., Cho, S.M., Song, H.J., Suh, S.B., Choi, J.D.: An OpenCL framework for heterogeneous multicores with local memory. 
In: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pp. 193–204. ACM, New York, NY, USA (2010). doi:10.1145/1854273.1854301 – reference: Clover Git: Implementing barriers. http://people.freedesktop.org/steckdenis/clover/barrier.html. Online; Accessed 18 May 2013 – reference: Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: Proceedings of International Conference on Parallel Architectures and Compilation Techniques, pp. 205–216. Vienna, Austria (2010) – reference: Khronos Group, Beaverton, OR: OpenCL Specification, v1.2r19 edn. (2012) |
| Snippet | OpenCL is a standard for parallel programming of heterogeneous systems. The benefits of a common programming standard are clear; multiple vendors can provide... Issue Title: Includes a Special Section on High-level Heterogeneous and Hierarchical Parallel Systems |
| StartPage | 752 |
| SubjectTerms | Analysis; Compilers; Computer programming; Computer Science; Formations; Kernels; Language; Mathematical functions; Parallel programming; Platforms; Portability; Processor Architectures; Proprietary; Software; Software Engineering/Programming and Operating Systems; Source code; Studies; Theory of Computation |
| Title | pocl: A Performance-Portable OpenCL Implementation |
| URI | https://link.springer.com/article/10.1007/s10766-014-0320-y https://www.proquest.com/docview/1698576253 https://www.proquest.com/docview/1730084961 |
| Volume | 43 |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=pocl%3A+A+Performance-Portable+OpenCL+Implementation&rft.jtitle=International+journal+of+parallel+programming&rft.au=J%C3%A4%C3%A4skel%C3%A4inen%2C+Pekka&rft.au=de+La+Lama%2C+Carlos+S%C3%A1nchez&rft.au=Schnetter%2C+Erik&rft.au=Raiskila%2C+Kalle&rft.date=2015-10-01&rft.pub=Springer+US&rft.issn=0885-7458&rft.eissn=1573-7640&rft.volume=43&rft.issue=5&rft.spage=752&rft.epage=785&rft_id=info:doi/10.1007%2Fs10766-014-0320-y&rft.externalDocID=10_1007_s10766_014_0320_y |