SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction


Bibliographic Details
Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 318-330
Main Authors: Gui, Chuangyi; Liao, Xiaofei; Zheng, Long; Yao, Pengcheng; Wang, Qinggang; Jin, Hai
Format: Conference Paper
Language: English
Published: IEEE, 1 September 2021
Abstract Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a general-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space for a single pattern by exploring a highly optimized matching order, but such a matching order still suffers from inherent computational redundancies, leading to significant performance degradation. The key innovation of this work lies in a general redundancy criterion that characterizes computational redundancies arising not only in handling a single pattern but also in matching multiple patterns simultaneously. In this paper, we present SumPA, a high-performance pattern-centric graph mining system that can sufficiently remove redundant computations for any complex graph mining problem. SumPA features three key designs: (1) a pattern abstraction technique that can simplify numerous complex patterns into a few simple abstract patterns based on pattern similarity, (2) abstraction-guided pattern matching that eliminates both totally and partially redundant computations during subgraph enumeration, and (3) a suite of system optimizations to maximize storage and computation efficiency. Our evaluation on a wide variety of real-world graphs shows that SumPA outperforms the two state-of-the-art systems Peregrine and GraphPi by up to 61.89× and 8.94×, respectively. For many mining problems on large graphs, Peregrine takes hours or even days while SumPA finishes in only a few minutes.
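To make the multi-pattern redundancy mentioned in the abstract concrete, the toy sketch below counts triangles and 4-cliques either in two independent passes or in one pass that reuses the common-neighbor intersection. It only illustrates the kind of redundancy the abstract refers to; it is not SumPA's pattern abstraction or matching algorithm. The function names (`build_adjacency`, `count_separately`, `count_shared`) and the intersection counter are hypothetical, and integer vertex IDs are assumed.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_separately(adj):
    """Match each pattern in its own pass (illustrative baseline).

    Returns (triangles, four_cliques, set_intersections_performed).
    """
    intersections = 0

    # Pass 1: triangles u < v < w.
    triangles = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                common = adj[u] & adj[v]
                intersections += 1
                triangles += sum(1 for w in common if w > v)

    # Pass 2: 4-cliques u < v < w < x; recomputes the same N(u) & N(v).
    cliques = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                common = adj[u] & adj[v]
                intersections += 1
                for w in common:
                    if w > v:
                        extend = common & adj[w]
                        intersections += 1
                        cliques += sum(1 for x in extend if x > w)

    return triangles, cliques, intersections


def count_shared(adj):
    """Match both patterns in one pass, reusing N(u) & N(v) for both."""
    intersections = 0
    triangles = 0
    cliques = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                common = adj[u] & adj[v]  # shared by both patterns
                intersections += 1
                for w in common:
                    if w > v:
                        triangles += 1
                        extend = common & adj[w]
                        intersections += 1
                        cliques += sum(1 for x in extend if x > w)
    return triangles, cliques, intersections


if __name__ == "__main__":
    # Complete graph on 4 vertices as a tiny test input.
    k4 = build_adjacency(list(combinations(range(4), 2)))
    print(count_separately(k4))
    print(count_shared(k4))
```

Both functions return the same pattern counts; the third element of each result shows how many set intersections were performed, which is where the shared pass saves work in this toy setting.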
Author Yao, Pengcheng
Wang, Qinggang
Gui, Chuangyi
Liao, Xiaofei
Jin, Hai
Zheng, Long
Author_xml – sequence: 1
  givenname: Chuangyi
  surname: Gui
  fullname: Gui, Chuangyi
  email: chygui@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 2
  givenname: Xiaofei
  surname: Liao
  fullname: Liao, Xiaofei
  email: xfliao@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 3
  givenname: Long
  surname: Zheng
  fullname: Zheng, Long
  email: longzh@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 4
  givenname: Pengcheng
  surname: Yao
  fullname: Yao, Pengcheng
  email: pcyao@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 5
  givenname: Qinggang
  surname: Wang
  fullname: Wang, Qinggang
  email: qgwang@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
– sequence: 6
  givenname: Hai
  surname: Jin
  fullname: Jin, Hai
  email: hjin@hust.edu.cn
  organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/PACT52795.2021.00030
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9781665442787
1665442786
EndPage 330
ExternalDocumentID 9563022
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61825202,62072195,61832006
  funderid: 10.13039/501100001809
– fundername: National Key Research and Development Program of China
  grantid: 2018YFB1003502
  funderid: 10.13039/501100012166
GroupedDBID 6IE
6IL
ACM
ALMA_UNASSIGNED_HOLDINGS
APO
CBEJK
LHSKQ
RIE
RIL
ID FETCH-LOGICAL-a285t-388061e8b90be3c9936f2458ab065ad3cac42327e503c6576c6f1e9c5d9c734e3
IEDL.DBID RIE
ISICitedReferencesCount 15
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000758464500023&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Tue May 06 03:33:13 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a285t-388061e8b90be3c9936f2458ab065ad3cac42327e503c6576c6f1e9c5d9c734e3
PageCount 13
ParticipantIDs ieee_primary_9563022
PublicationCentury 2000
PublicationDate 2021-Sept.
PublicationDateYYYYMMDD 2021-09-01
PublicationDate_xml – month: 09
  year: 2021
  text: 2021-Sept.
PublicationDecade 2020
PublicationTitle 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)
PublicationTitleAbbrev PACT
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssib057737291
Score 2.2550955
SourceID ieee
SourceType Publisher
StartPage 318
SubjectTerms Computational efficiency
data reuse
Degradation
graph mining
Parallel architectures
Pattern matching
Redundancy
Technological innovation
Transforms
Title SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
URI https://ieeexplore.ieee.org/document/9563022
WOSCitedRecordID wos000758464500023&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint