Sorting on the SGI Origin 2000: comparing MPI and shared memory implementations

Bibliographic details
Published in: Proceedings. SCCC'99 XIX International Conference of the Chilean Computer Science Society, pp. 209-215
Main authors: Jimenez-Gonzalez, D., Guinovart, E., Larriba-Pey, J.-L., Navarro, J.J.
Format: Conference paper
Language: English
Published: IEEE, 1999
ISBN: 0769502962, 9780769502960
ISSN: 1522-4902
Description
Summary: Analyses the C³-Radix (Communication- and Cache-Conscious Radix) sort algorithm, using the distributed and the shared memory parallel programming models. C³-Radix was originally proposed based on the idea of the classic Radix sort to exploit the memory hierarchy locality and to reduce the amount of communication for distributed memory computers. We implement C³-Radix on the SGI Origin 2000 NUMA multiprocessor and make use of the Message Passing Interface (MPI) and the native shared memory directives of that computer to implement the two programming models that we want to analyse. We give results for up to 16 processors and 64 million 32-bit keys. The results show that for data sets that are small compared to the number of processors, the MPI implementation is faster, while for data sets that are large, the shared memory implementation is faster. In this paper, we explain the reasons for the different behaviours depending on the size of the data sets.
DOI: 10.1109/SCCC.1999.810190