Fast and Efficient Graph Traversal Algorithm for CPUs: Maximizing Single-Node Efficiency

Bibliographic Details
Published in:	2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 378-389
Main authors:	Chhugani, J.; Satish, N.; Changkyu Kim; Sewall, J.; Dubey, P.
Format:	Conference proceedings
Language:	English
Published:	IEEE, 2012-05-01
Subjects:	Arrays; Bandwidth; efficient; Graph traversal; Instruction sets; multi-socket; Partitioning algorithms; single node; Sockets
ISBN:	1467309753, 9781467309752
ISSN:	1530-2075
Online access:	Full text

Abstract:	Graph-based structures are increasingly used to model data and the relations among data in a number of fields. Graph-based databases are becoming more popular as a means to better represent such data. Graph traversal is a key component in graph algorithms such as reachability and graph matching. Since the scale of data stored and queried in these databases is increasing, it is important to obtain high-performing implementations of graph traversal that can efficiently utilize the processing power of modern processors. In this work, we present a scalable breadth-first search traversal algorithm for modern multi-socket, multi-core CPUs. Our algorithm uses lock- and atomic-free operations on a cache-resident structure for arbitrarily sized graphs to filter out expensive main memory accesses, and completely and efficiently utilizes all available bandwidth resources. We propose a work distribution approach for multi-socket platforms that ensures load balancing while keeping cross-socket communication low. We provide a detailed analytical model that accurately projects the performance of our single- and multi-socket traversal algorithms to within 5-10% of obtained performance. Our analytical model serves as a useful tool to analyze performance bottlenecks on modern CPUs. When measured on various synthetic and real-world graphs with a wide range of graph sizes, vertex degrees and graph diameters, our implementation on a dual-socket Intel® Xeon® X5570 (Intel microarchitecture code name Nehalem) system achieves a 1.5X-13.2X performance speedup over the best reported numbers. We achieve around 1 billion traversed edges per second on a scale-free R-MAT graph with 64M vertices and 2 billion edges on a dual-socket Nehalem system. Our optimized algorithm is useful as a building block for efficient multi-node implementations and future exascale systems, thereby allowing them to ride the trend of increasing per-node compute and bandwidth resources.
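The filtering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is single-threaded, so it does not show the lock- and atomic-free multi-core scheme or the multi-socket work distribution. It only shows a level-synchronous breadth-first search that consults a compact one-bit-per-vertex bitmap before touching the larger per-vertex depth array. The CSR layout and the names `bfs_levels`, `row_offsets`, `columns` and `test_and_set` are assumptions made for illustration.

```python
from typing import List


def bfs_levels(row_offsets: List[int], columns: List[int], source: int) -> List[int]:
    """Level-synchronous BFS over a CSR graph.

    Returns the depth of every vertex from `source` (-1 if unreachable).
    Hypothetical helper for illustration; not taken from the paper.
    """
    num_vertices = len(row_offsets) - 1
    depth = [-1] * num_vertices
    # One bit per vertex. The assumption here is that this bitmap is far
    # smaller than the depth array and so is more likely to stay in cache.
    visited = bytearray((num_vertices + 7) // 8)

    def test_and_set(v: int) -> bool:
        """Return True if v was already marked; otherwise mark it."""
        byte, mask = v >> 3, 1 << (v & 7)
        if visited[byte] & mask:
            return True
        visited[byte] |= mask
        return False

    test_and_set(source)
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        for u in frontier:
            # Neighbours of u occupy columns[row_offsets[u]:row_offsets[u + 1]].
            for idx in range(row_offsets[u], row_offsets[u + 1]):
                v = columns[idx]
                # The bitmap check filters out already-seen vertices before
                # the larger depth array is written.
                if not test_and_set(v):
                    depth[v] = level
                    next_frontier.append(v)
        frontier = next_frontier
    return depth


if __name__ == "__main__":
    # Edges: 0->1, 0->2, 1->3, 2->3; vertex 4 is isolated.
    print(bfs_levels([0, 2, 3, 4, 4, 4], [1, 2, 3, 3], source=0))
```

In a parallel setting the test-and-set step is where threads would contend, which is the part the abstract says the authors handle without locks or atomics; the sketch above makes no attempt to reproduce that.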
DOI:	10.1109/IPDPS.2012.43
EISBN:	0769546757, 9780769546759
URL:	https://ieeexplore.ieee.org/document/6267875