Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite
► Multicore systems require new mapping policies to make the most of the architecture. ► An automatic mapping based on Servet is proposed. ► Servet is a benchmark suite that obtains relevant hardware parameters of clusters. ► Results show a significant improvement of performance of...
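| Published in: | Computers & electrical engineering, Vol. 38, No. 2, pp. 258-269 |
|---|---|
| Main authors: | González-Domínguez, Jorge; Taboada, Guillermo L.; Fraguela, Basilio B.; Martín, María J.; Touriño, Juan |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier Ltd, 1 March 2012 |
| Subjects: | Benchmarking; Clusters; Electrical engineering; Hierarchies; Mapping; Mathematical models; Optimization |
| ISSN: | 0045-7906, 1879-0755 |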
| Abstract | ► Multicore systems require new mapping policies to make the most of the architecture. ► An automatic mapping based on Servet is proposed. ► Servet is a benchmark suite that obtains relevant hardware parameters of clusters. ► Results show a significant improvement in the performance of parallel applications. ► The proposed mapping technique does not require source code modifications.
Servet is a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. These parameters can be used for autotuning codes to increase their performance on multicore clusters. Although Servet has been shown to accurately detect cache hierarchies, bandwidths and bottlenecks in memory accesses, as well as the communication overhead among cores, the impact of using this information for application performance optimization has not been assessed until now. This paper presents a novel algorithm that automatically uses Servet for mapping parallel applications on multicore systems and analyzes its impact on three testbeds using three different parallel programming models: message-passing, shared memory and partitioned global address space (PGAS). Our results show that a suitable mapping policy based on the data provided by this tool can significantly improve the performance of parallel applications without source code modification. |
|---|---|
| Author | Taboada, Guillermo L.; Martín, María J.; Touriño, Juan; González-Domínguez, Jorge; Fraguela, Basilio B. |
| Author_xml | – sequence: 1 givenname: Jorge surname: González-Domínguez fullname: González-Domínguez, Jorge email: jgonzalezd@udc.es – sequence: 2 givenname: Guillermo L. surname: Taboada fullname: Taboada, Guillermo L. email: taboada@udc.es – sequence: 3 givenname: Basilio B. surname: Fraguela fullname: Fraguela, Basilio B. email: basilio.fraguela@udc.es – sequence: 4 givenname: María J. surname: Martín fullname: Martín, María J. email: mariam@udc.es – sequence: 5 givenname: Juan surname: Touriño fullname: Touriño, Juan email: juan@udc.es |
| CitedBy_id | crossref_primary_10_1002_cpe_1914 crossref_primary_10_1002_cpe_6600 crossref_primary_10_1002_cpe_7419 crossref_primary_10_1016_j_compeleceng_2013_01_008 crossref_primary_10_3390_electronics12010053 crossref_primary_10_1016_j_compeleceng_2013_08_012 |
| Cites_doi | 10.1145/1088149.1088202 10.1016/S0167-8191(00)00087-9 10.1109/JPROC.2004.840301 10.1109/SC.2000.10024 10.1145/1542275.1542344 10.1016/j.compeleceng.2011.05.012 10.1145/331532.331555 10.1016/j.compeleceng.2007.09.007 10.1109/PACT.2009.11 10.1109/JPROC.2004.840306 10.1145/1183401.1183451 10.1109/JPROC.2004.840444 10.1109/PDP.2010.67 10.1109/IPDPS.2010.5470442 10.1109/IPDPS.2010.5470358 |
| ContentType | Journal Article |
| Copyright | 2011 Elsevier Ltd |
| Copyright_xml | – notice: 2011 Elsevier Ltd |
| DBID | AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.compeleceng.2011.12.007 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Discipline | Engineering |
| EISSN | 1879-0755 |
| EndPage | 269 |
| ExternalDocumentID | 10_1016_j_compeleceng_2011_12_007 S0045790611002114 |
| ISICitedReferencesCount | 10 |
| ISSN | 0045-7906 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| License | https://www.elsevier.com/tdm/userlicense/1.0 |
| PQID | 1022899776 |
| PQPubID | 23500 |
| PageCount | 12 |
| ParticipantIDs | proquest_miscellaneous_1022899776 crossref_citationtrail_10_1016_j_compeleceng_2011_12_007 crossref_primary_10_1016_j_compeleceng_2011_12_007 elsevier_sciencedirect_doi_10_1016_j_compeleceng_2011_12_007 |
| PublicationCentury | 2000 |
| PublicationDate | March 2012 2012-3-00 20120301 |
| PublicationDateYYYYMMDD | 2012-03-01 |
| PublicationDate_xml | – month: 03 year: 2012 text: March 2012 |
| PublicationDecade | 2010 |
| PublicationTitle | Computers & electrical engineering |
| PublicationYear | 2012 |
| Publisher | Elsevier Ltd |
| Publisher_xml | – name: Elsevier Ltd |
| SourceID | proquest crossref elsevier |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 258 |
| SubjectTerms | Benchmarking Clusters Electrical engineering Hierarchies Mapping Mathematical models Optimization |
| Title | Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite |
| URI | https://dx.doi.org/10.1016/j.compeleceng.2011.12.007 https://www.proquest.com/docview/1022899776 |
| Volume | 38 |
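
The abstract describes mapping parallel processes onto cores using measured hardware parameters such as the communication overhead among cores. The following is only a minimal sketch of that general idea, not the algorithm from the paper: it assumes a hypothetical `core_latency` matrix (the kind of pairwise core-to-core cost a benchmark suite could report) and a hypothetical `comm_volume` matrix of traffic between processes, and it uses a simple greedy heuristic chosen here purely for illustration.

```python
from itertools import combinations


def mapping_cost(mapping, comm_volume, core_latency):
    """Sum of traffic * latency over all process pairs for a given mapping."""
    return sum(
        comm_volume[p][q] * core_latency[mapping[p]][mapping[q]]
        for p, q in combinations(range(len(mapping)), 2)
    )


def greedy_map(comm_volume, core_latency):
    """Assign each process to a distinct core, keeping heavy communicators close.

    Illustrative heuristic only (an assumption, not the published method).
    Returns a list where entry p is the core chosen for process p.
    """
    n_procs, n_cores = len(comm_volume), len(core_latency)
    if n_procs > n_cores:
        raise ValueError("more processes than cores")

    placement = {}                 # process -> core
    free = set(range(n_cores))     # cores not yet assigned

    # Visit process pairs from heaviest to lightest traffic.
    pairs = sorted(
        combinations(range(n_procs), 2),
        key=lambda pq: comm_volume[pq[0]][pq[1]],
        reverse=True,
    )
    for p, q in pairs:
        if p in placement and q in placement:
            continue
        if p not in placement and q not in placement:
            # Give the pair the two free cores with the lowest mutual latency.
            a, b = min(
                combinations(sorted(free), 2),
                key=lambda ab: core_latency[ab[0]][ab[1]],
            )
            placement[p], placement[q] = a, b
            free -= {a, b}
        else:
            # One endpoint is placed: put the other on the nearest free core.
            placed, other = (p, q) if p in placement else (q, p)
            anchor = placement[placed]
            c = min(sorted(free), key=lambda core: core_latency[anchor][core])
            placement[other] = c
            free.remove(c)

    # Only reached with work to do when there is a single process.
    for p in range(n_procs):
        if p not in placement:
            placement[p] = min(free)
            free.remove(placement[p])
    return [placement[p] for p in range(n_procs)]


# Toy machine: cores (0,1) and (2,3) are cheap pairs, cross pairs are costly.
core_latency = [[0, 1, 5, 5], [1, 0, 5, 5], [5, 5, 0, 1], [5, 5, 1, 0]]
# Toy application: processes 0<->2 and 1<->3 exchange most of the data.
comm_volume = [[0, 1, 100, 1], [1, 0, 1, 90], [100, 1, 0, 1], [1, 90, 1, 0]]

mapping = greedy_map(comm_volume, core_latency)
print(mapping, mapping_cost(mapping, comm_volume, core_latency))
```

On this toy input the heavily communicating pairs end up on the low-latency core pairs, which lowers the modelled cost compared with the identity placement. Because the placement is computed outside the application, it could in principle be applied through whatever process or thread binding mechanism the runtime provides, which is consistent with the abstract's point that no source code modification is needed.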