SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction


Bibliographic details
Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 318-330
Main authors: Gui, Chuangyi; Liao, Xiaofei; Zheng, Long; Yao, Pengcheng; Wang, Qinggang; Jin, Hai
Format: Conference proceeding
Language: English
Published: IEEE, 1 September 2021
Abstract: Graph mining aims to explore interesting structural information in a graph. Pattern-centric systems typically transform a general-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space for a single pattern by exploring a highly optimized matching order, but such a matching order still carries inherent computational redundancies, which lead to significant performance degradation. The key innovation of this work is a general redundancy criterion that characterizes the computational redundancies that arise not only in handling a single pattern but also in matching multiple patterns simultaneously. In this paper, we present SumPA, a high-performance pattern-centric graph mining system that can sufficiently remove redundant computations for any complex graph mining problem. SumPA features three key designs: (1) a pattern abstraction technique that simplifies numerous complex patterns into a few simple abstract patterns based on pattern similarity, (2) abstraction-guided pattern matching that eliminates both totally and partially redundant computations during subgraph enumeration, and (3) a suite of system optimizations to maximize storage and computation efficiency. Our evaluation on a wide variety of real-world graphs shows that SumPA outperforms the two state-of-the-art systems Peregrine and GraphPi by up to 61.89× and 8.94×, respectively. For many mining problems on large graphs, Peregrine takes hours or even days, while SumPA finishes in only a few minutes.
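To make the idea of redundancy across multiple patterns concrete, the following is a minimal sketch and not SumPA's actual algorithm. It assumes an undirected graph with integer vertex IDs and matches two patterns, the triangle and the 4-clique, in a single pass. Both patterns share the prefix "edge (u, v) plus common neighbors", so the set intersection is computed once and reused for both. The function names `build_adjacency` and `count_triangles_and_4cliques` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles_and_4cliques(edges):
    """Count triangles and 4-cliques while sharing one intersection.

    Illustrative assumption: vertices are comparable (e.g. ints), and
    the ordering u < v < w < x is used so each match is counted once.
    """
    adj = build_adjacency(edges)
    triangles = 0
    cliques4 = 0
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            # Shared work: common neighbors of the edge (u, v),
            # computed once and reused by both patterns.
            common = adj[u] & adj[v]
            higher = sorted(w for w in common if w > v)
            # Triangle: any common neighbor w > v closes (u, v, w).
            triangles += len(higher)
            # 4-clique: two adjacent common neighbors w < x.
            for w, x in combinations(higher, 2):
                if x in adj[w]:
                    cliques4 += 1
    return triangles, cliques4


# Usage: the complete graph on 4 vertices.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_triangles_and_4cliques(k4))  # (4, 1)
```

Matching each pattern separately would recompute the same intersection for every edge; sharing it is one simple instance of the kind of cross-pattern reuse the abstract describes, under the assumptions stated above.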
Author_xml – sequence: 1
  givenname: Chuangyi
  surname: Gui
  fullname: Gui, Chuangyi
  email: chygui@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 2
  givenname: Xiaofei
  surname: Liao
  fullname: Liao, Xiaofei
  email: xfliao@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 3
  givenname: Long
  surname: Zheng
  fullname: Zheng, Long
  email: longzh@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 4
  givenname: Pengcheng
  surname: Yao
  fullname: Yao, Pengcheng
  email: pcyao@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 5
  givenname: Qinggang
  surname: Wang
  fullname: Wang, Qinggang
  email: qgwang@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 6
  givenname: Hai
  surname: Jin
  fullname: Jin, Hai
  email: hjin@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/PACT52795.2021.00030
EISBN 9781665442787
1665442786
EndPage 330
ExternalDocumentID 9563022
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61825202,62072195,61832006
  funderid: 10.13039/501100001809
– fundername: National Key Research and Development Program of China
  grantid: 2018YFB1003502
  funderid: 10.13039/501100012166
ISICitedReferencesCount 15
IsPeerReviewed false
IsScholarly true
Language English
PageCount 13
PublicationCentury 2000
PublicationDate 2021-Sept.
PublicationDateYYYYMMDD 2021-09-01
PublicationDate_xml – month: 09
  year: 2021
  text: 2021-Sept.
PublicationDecade 2020
PublicationTitle 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)
PublicationTitleAbbrev PACT
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 318
SubjectTerms Computational efficiency
data reuse
Degradation
graph mining
Parallel architectures
Pattern matching
Redundancy
Technological innovation
Transforms
Title SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
URI https://ieeexplore.ieee.org/document/9563022