Adaptive Runtime-Assisted Block Prefetching on Chip-Multiprocessors

Detailed description

Bibliographic details
Published in:	International Journal of Parallel Programming, Vol. 45, No. 3, pp. 530-550
Main authors:	Garcia, Victor; Rico, Alejandro; Villavieja, Carlos; Carpenter, Paul; Navarro, Nacho; Ramirez, Alex
Format:	Journal Article, Publisher
Language:	English
Published:	New York: Springer US, 2017-06-01; Springer Nature B.V.
Subjects:	Analysis; Arquitectura de computadors; Benchmarks; Cache; Cache memories; Cache memory; Central processing units; Computer programming; Computer programs; Computer Science; CPUs; Dynamical systems; Engine blocks; Engines; Gestió de memòria (Informàtica); Hardware; Informàtica; Memòria cau; Prefetch; Processor Architectures; Schedules; Software; Software Engineering/Programming and Operating Systems; Stall; Studies; Task based programming models; Theory of Computation; Àrees temàtiques de la UPC
ISSN:	0885-7458, 1573-7640
Online access:	Full text

Abstract	Memory stalls are a significant source of performance degradation in modern processors. Data prefetching is a widely adopted and well-studied technique for alleviating this problem. Prefetching can be performed by the hardware or initiated and controlled by software. Software-controlled prefetching covers a wide variety of schemes, including runtime-directed prefetching and, more specifically, runtime-directed block prefetching. This paper proposes a hybrid prefetching mechanism that integrates a software-driven block prefetcher with existing hardware prefetching techniques. Our runtime-assisted software prefetcher brings large blocks of data on-chip with the support of a low-cost hardware engine, and it works in synergy with existing hardware prefetchers that manage locality at a finer granularity. The runtime system that drives the prefetch engine dynamically selects which cache to prefetch to. Our evaluation on a set of scientific benchmarks obtains a maximum speedup of 32 % and an average speedup of 10 % compared to a baseline with hardware prefetching only. As a result, we also reduce energy-to-solution by up to 18 %, and by 3 % on average.
Author_xml	
– sequence: 1; name: Garcia, Victor; email: vgarcia@ac.upc.edu; organization: Universitat Politecnica de Catalunya, Barcelona Supercomputing Center
– sequence: 2; name: Rico, Alejandro; organization: Barcelona Supercomputing Center
– sequence: 3; name: Villavieja, Carlos; organization: Google Inc
– sequence: 4; name: Carpenter, Paul; organization: Barcelona Supercomputing Center
– sequence: 5; name: Navarro, Nacho; organization: Universitat Politecnica de Catalunya, Barcelona Supercomputing Center
– sequence: 6; name: Ramirez, Alex; organization: NVIDIA Corporation
CODEN	IJPPE5
ContentType	Journal Article Publication
Contributor	Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Copyright	Springer Science+Business Media New York 2016; International Journal of Parallel Programming is a copyright of Springer, 2017; info:eu-repo/semantics/openAccess
DOI	10.1007/s10766-016-0431-8
Discipline	Computer Science
EISSN	1573-7640
EndPage	550
Genre	Feature
GrantInformation_xml	– fundername: Ministerio de Educación, Cultura y Deporte; grantid: TIN2012-34557; funderid: http://dx.doi.org/10.13039/501100003176
ISICitedReferencesCount	0
ISSN	0885-7458
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	3
Keywords	Prefetch; Cache memories; Task based programming models
Language	English
OpenAccessLink	https://recercat.cat/handle/2072/273101
PageCount	21
PublicationDate	2017-06-01
PublicationPlace	New York
PublicationTitle	International Journal of Parallel Programming
PublicationTitleAbbrev	Int J Parallel Prog
PublicationYear	2017
Publisher	Springer US; Springer Nature B.V.
SourceID	csuc proquest crossref springer
SourceType	Open Access Repository Aggregation Database Index Database Publisher
StartPage	530
SubjectTerms	Analysis; Computer architecture; Benchmarks; Cache; Cache memories; Cache memory; Central processing units; Computer programming; Computer programs; Computer Science; CPUs; Dynamical systems; Engine blocks; Engines; Memory management (Computing); Hardware; Computing; Prefetch; Processor Architectures; Schedules; Software; Software Engineering/Programming and Operating Systems; Stall; Studies; Task based programming models; Theory of Computation; UPC subject areas
Title	Adaptive Runtime-Assisted Block Prefetching on Chip-Multiprocessors
URI	https://link.springer.com/article/10.1007/s10766-016-0431-8 https://www.proquest.com/docview/1886211092 https://www.proquest.com/docview/1904205074 https://recercat.cat/handle/2072/273101
Volume	45
linkProvider	Springer Nature