A Parallel Graph Environment for Real-World Data Analytics Workflows


Bibliographic details
Published in: Proceedings - Design, Automation, and Test in Europe Conference and Exhibition, pp. 1313-1318
Main authors: Castellana, Vito Giovanni, Drocco, Maurizio, Feo, John, Firoz, Jesun, Kanewala, Thejaka, Lumsdaine, Andrew, Manzano, Joseph, Marquez, Andres, Minutoli, Marco, Suetterlein, Joshua, Tumeo, Antonino, Zalewski, Marcin
Format: Conference paper
Language: English
Published: EDAA 01.03.2019
ISSN:1558-1101
Description
Summary: Economic competitiveness and national security depend increasingly on the insightful analysis of large data sets. The diversity of real-world data sources and analytic workflows imposes challenging hardware and software requirements for parallel graph platforms. The irregular nature of graph methods is not supported well by the deep memory hierarchies of conventional distributed systems, requiring new processor and runtime system designs to tolerate memory and synchronization latencies. Moreover, the efficiency of relational table operations and matrix computations is not attainable when data is stored in common graph data structures. In this paper, we present HAGGLE, a high-performance, scalable data analytics platform. The platform's hybrid data model supports a variety of distributed, thread-safe data structures, parallel programming constructs, and persistent and streaming data. An abstract runtime layer enables us to map the stack to conventional, distributed computer systems with accelerators. The runtime uses multithreading, active messages, and data aggregation to hide memory and synchronization latencies on large-scale systems.
DOI:10.23919/DATE.2019.8715196