Sorting on the SGI Origin 2000: comparing MPI and shared memory implementations
| Published in: | Proceedings. SCCC'99 XIX International Conference of the Chilean Computer Science Society, pp. 209-215 |
|---|---|
| Main authors: | , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 1999 |
| Subjects: | |
| ISBN: | 0769502962, 9780769502960 |
| ISSN: | 1522-4902 |
| Summary: | We analyse the C³-Radix (Communication- and Cache-Conscious Radix) sort algorithm under the distributed-memory and shared-memory parallel programming models. C³-Radix was originally proposed as a variant of the classic Radix sort that exploits memory-hierarchy locality and reduces the amount of communication on distributed-memory computers. We implement C³-Radix on the SGI Origin 2000 NUMA multiprocessor, using the Message Passing Interface (MPI) for the distributed-memory model and the machine's native shared-memory directives for the shared-memory model. We give results for up to 16 processors and 64 million 32-bit keys. The results show that for data sets that are small compared to the number of processors the MPI implementation is faster, while for large data sets the shared-memory implementation is faster. We explain the reasons for this dependence on data-set size. |
|---|---|
| DOI: | 10.1109/SCCC.1999.810190 |

