Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics

Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP) programming and execution model. BSP permits bulk-communication and uses large messages which are supported efficiently by current message transport layers...

Detailed Description

Bibliographic Details
Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 15-28
Main authors: Dathathri, Roshan; Gill, Gurbinder; Hoang, Loc; Jatala, Vishwesh; Pingali, Keshav; Nandivada, V. Krishna; Dang, Hoang-Vu; Snir, Marc
Format: Conference Proceeding
Language: English
Published: IEEE, 1 September 2019
ISSN: 2641-7936
Online access: Full text
Abstract Distributed graph analytics systems for CPUs, like D-Galois and Gemini, and for GPUs, like D-IrGL and Lux, use a bulk-synchronous parallel (BSP) programming and execution model. BSP permits bulk-communication and uses large messages which are supported efficiently by current message transport layers, but bulk-synchronization can exacerbate the performance impact of load imbalance because a round cannot be completed until every host has completed that round. Asynchronous distributed graph analytics systems circumvent this problem by permitting hosts to make progress at their own pace, but existing systems either use global locks and send small messages or send large messages but do not support general partitioning policies such as vertex-cuts. Consequently, they perform substantially worse than bulk-synchronous systems. Moreover, none of their programming or execution models can be easily adapted for heterogeneous devices like GPUs. In this paper, we design and implement a lock-free, non-blocking, bulk-asynchronous runtime called Gluon-Async for distributed and heterogeneous graph analytics. The runtime supports any partitioning policy and uses bulk-communication. We present the bulk-asynchronous parallel (BASP) model which allows the programmer to utilize the runtime by specifying only the abstract communication required. Applications written in this model are compared with the BSP programs written using (1) D-Galois and D-IrGL, the state-of-the-art distributed graph analytics systems (which are bulk-synchronous) for CPUs and GPUs, respectively, and (2) Lux, another (bulk-synchronous) distributed GPU graph analytical system. Our evaluation shows that programs written using BASP-style execution are on average ~1.5x faster than those in D-Galois and D-IrGL on real-world large-diameter graphs at scale. They are also on average ~12x faster than Lux. To the best of our knowledge, Gluon-Async is the first asynchronous distributed GPU graph analytics system.
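The load-imbalance argument in the abstract can be illustrated with a toy cost model. This is only a sketch under simplifying assumptions and is not taken from the paper: communication cost is ignored, each host is assumed to do the same work per round in both models, and the function names `bsp_time` and `no_barrier_time` as well as the `work` matrix are hypothetical. Under BSP every round costs as much as its slowest host, while without barriers a host is limited only by its own total work. Real asynchronous executions may perform a different amount of work, so the second figure is an idealized lower bound rather than a prediction.

```python
from typing import List


def bsp_time(work: List[List[float]]) -> float:
    """Total time with a barrier after every round.

    work[r][h] is the (hypothetical) compute time of host h in round r.
    A round ends only when its slowest host has finished.
    """
    return sum(max(round_times) for round_times in work)


def no_barrier_time(work: List[List[float]]) -> float:
    """Idealized total time when hosts never wait for each other.

    Each host simply runs through its own rounds back to back, so the
    run finishes when the host with the most total work finishes.
    """
    num_hosts = len(work[0])
    return max(
        sum(round_times[h] for round_times in work)
        for h in range(num_hosts)
    )


# Three rounds on three hosts; a different host is the straggler each round.
work = [
    [4.0, 1.0, 1.0],
    [1.0, 4.0, 1.0],
    [1.0, 1.0, 4.0],
]

print(bsp_time(work))         # sum of per-round maxima
print(no_barrier_time(work))  # maximum of per-host totals
```

In this toy instance every host has the same total work, yet the barrier-based schedule pays for the straggler in every round. That gap is the effect the abstract attributes to bulk-synchronization.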
Author_xml – sequence: 1
  givenname: Roshan
  surname: Dathathri
  fullname: Dathathri, Roshan
  organization: University of Texas at Austin
– sequence: 2
  givenname: Gurbinder
  surname: Gill
  fullname: Gill, Gurbinder
  organization: University of Texas at Austin
– sequence: 3
  givenname: Loc
  surname: Hoang
  fullname: Hoang, Loc
  organization: University of Texas at Austin
– sequence: 4
  givenname: Vishwesh
  surname: Jatala
  fullname: Jatala, Vishwesh
  organization: University of Texas at Austin
– sequence: 5
  givenname: Keshav
  surname: Pingali
  fullname: Pingali, Keshav
  organization: University of Texas at Austin
– sequence: 6
  givenname: V. Krishna
  surname: Nandivada
  fullname: Nandivada, V. Krishna
  organization: Indian Institute of Technology Madras
– sequence: 7
  givenname: Hoang-Vu
  surname: Dang
  fullname: Dang, Hoang-Vu
  organization: University of Illinois at Urbana-Champaign
– sequence: 8
  givenname: Marc
  surname: Snir
  fullname: Snir, Marc
  organization: University of Illinois at Urbana-Champaign
ContentType Conference Proceeding
DOI 10.1109/PACT.2019.00010
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 172813613X
9781728136134
EISSN 2641-7936
EndPage 28
ExternalDocumentID 8891625
Genre orig-research
ISICitedReferencesCount 20
IsPeerReviewed false
IsScholarly true
Language English
PageCount 14
ParticipantIDs ieee_primary_8891625
PublicationCentury 2000
PublicationDate 2019-Sept.
PublicationDateYYYYMMDD 2019-09-01
PublicationDate_xml – month: 09
  year: 2019
  text: 2019-Sept.
PublicationDecade 2010
PublicationTitle Proceedings / International Conference on Parallel Architectures and Compilation Techniques
PublicationTitleAbbrev PACT
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 15
SubjectTerms Analytical models
asynchronous parallel execution models
BSP model
Computational modeling
distributed and heterogeneous
graph analytics
Graphics processing units
Load modeling
Mirrors
Partitioning algorithms
Programming
Title Gluon-Async: A Bulk-Asynchronous System for Distributed and Heterogeneous Graph Analytics
URI https://ieeexplore.ieee.org/document/8891625