Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite


Bibliographic details
Published in:	Computers & electrical engineering, Vol. 38, No. 2, pp. 258-269
Main authors:	González-Domínguez, Jorge; Taboada, Guillermo L.; Fraguela, Basilio B.; Martín, María J.; Touriño, Juan
Format:	Journal Article
Language:	English
Publication details:	Elsevier Ltd, 1 March 2012
Subjects:	Benchmarking; Clusters; Electrical engineering; Hierarchies; Mapping; Mathematical models; Optimization
ISSN:	0045-7906, 1879-0755

Highlights	► Multicore systems require new mapping policies to make the most of the architecture. ► An automatic mapping based on Servet is proposed. ► Servet is a benchmark suite that obtains relevant hardware parameters of clusters. ► Results show a significant performance improvement for parallel applications. ► The proposed mapping technique does not require source code modifications.
Abstract	Servet is a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. These parameters can be used for autotuning codes to increase their performance on multicore clusters. Although Servet has been shown to accurately detect cache hierarchies, bandwidths and bottlenecks in memory accesses, as well as the communication overhead among cores, until now the impact of using this information for application performance optimization had not been assessed. This paper presents a novel algorithm that automatically uses Servet for mapping parallel applications on multicore systems and analyzes its impact on three testbeds using three different parallel programming models: message-passing, shared memory and partitioned global address space (PGAS). Our results show that a suitable mapping policy based on the data provided by this tool can significantly improve the performance of parallel applications without source code modification.
Author_xml	1. González-Domínguez, Jorge (jgonzalezd@udc.es); 2. Taboada, Guillermo L. (taboada@udc.es); 3. Fraguela, Basilio B. (basilio.fraguela@udc.es); 4. Martín, María J. (mariam@udc.es); 5. Touriño, Juan (juan@udc.es)
Copyright	2011 Elsevier Ltd
DOI	10.1016/j.compeleceng.2011.12.007
Discipline	Engineering
IsPeerReviewed	true
IsScholarly	true
License	https://www.elsevier.com/tdm/userlicense/1.0
URI	https://dx.doi.org/10.1016/j.compeleceng.2011.12.007 https://www.proquest.com/docview/1022899776