SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space...
| Published in: | 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 318-330 |
|---|---|
| Main authors: | Chuangyi Gui, Xiaofei Liao, Long Zheng, Pengcheng Yao, Qinggang Wang, Hai Jin |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 1 September 2021 |
| Abstract | Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space towards a single pattern by exploring a highly-optimized matching order, but such a matching order itself still suffers from inherent computational redundancies, leading to significant performance degradation. The key innovation of this work lies in a general redundancy criterion that characterizes the computational redundancies arising not only in handling a single pattern but also in matching multiple patterns simultaneously. In this paper, we present SumPA, a high-performance pattern-centric graph mining system that can sufficiently remove redundant computations for any complex graph mining problem. SumPA features three key designs: (1) a pattern abstraction technique that can simplify numerous complex patterns into a few simple abstract patterns based on pattern similarity, (2) abstraction-guided pattern matching that completely eliminates (totally and partially) redundant computations during subgraph enumeration, and (3) a suite of system optimizations to maximize storage and computation efficiency. Our evaluation on a wide variety of real-world graphs shows that SumPA outperforms the two state-of-the-art systems Peregrine and GraphPi by up to 61.89× and 8.94×, respectively. For many mining problems on large graphs, Peregrine takes hours or even days while SumPA finishes in only a few minutes. |
|---|---|
| Author | Yao, Pengcheng; Wang, Qinggang; Gui, Chuangyi; Liao, Xiaofei; Jin, Hai; Zheng, Long |
| Author_xml | 1. Gui, Chuangyi (chygui@hust.edu.cn); 2. Liao, Xiaofei (xfliao@hust.edu.cn); 3. Zheng, Long (longzh@hust.edu.cn); 4. Yao, Pengcheng (pcyao@hust.edu.cn); 5. Wang, Qinggang (qgwang@hust.edu.cn); 6. Jin, Hai (hjin@hust.edu.cn). Organization (all authors): National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/PACT52795.2021.00030 |
| EISBN | 9781665442787 1665442786 |
| EndPage | 330 |
| ExternalDocumentID | 9563022 |
| Genre | orig-research |
| GrantInformation_xml | National Natural Science Foundation of China (grant IDs 61825202, 62072195, 61832006; funder ID 10.13039/501100001809); National Key Research and Development Program of China (grant ID 2018YFB1003502; funder ID 10.13039/501100012166) |
| ISICitedReferencesCount | 15 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Sept. |
| PublicationDateYYYYMMDD | 2021-09-01 |
| PublicationDate_xml | – month: 09 year: 2021 text: 2021-Sept. |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 318 |
| SubjectTerms | Computational efficiency; data reuse; Degradation; graph mining; Parallel architectures; Pattern matching; Redundancy; Technological innovation; Transforms |
| Title | SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction |
| URI | https://ieeexplore.ieee.org/document/9563022 |
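
The abstract describes redundancy that arises when several patterns are matched at the same time. The sketch below is only a toy illustration of that general idea and is not SumPA's algorithm: it assumes two hand-picked patterns (triangles and non-induced diamonds, i.e. two triangles sharing an edge) and shows that both counts can be derived from one common-neighbor set intersection per edge instead of one intersection per edge per pattern. All function names are hypothetical, and vertex IDs are assumed to be comparable (for example, integers).

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def undirected_edges(adj):
    """Yield each undirected edge exactly once as (u, v) with u < v."""
    for u in adj:
        for v in adj[u]:
            if u < v:
                yield u, v


def count_separately(adj):
    """Match each pattern in its own pass; every edge is intersected twice."""
    intersections = 0

    triangle_hits = 0
    for u, v in undirected_edges(adj):
        triangle_hits += len(adj[u] & adj[v])
        intersections += 1
    triangles = triangle_hits // 3  # each triangle is seen once per edge

    diamonds = 0
    for u, v in undirected_edges(adj):
        k = len(adj[u] & adj[v])
        intersections += 1
        diamonds += k * (k - 1) // 2  # pick two common neighbors of the shared edge

    return triangles, diamonds, intersections


def count_shared(adj):
    """Match both patterns in one pass, reusing the intersection for each edge."""
    intersections = 0
    triangle_hits = 0
    diamonds = 0
    for u, v in undirected_edges(adj):
        k = len(adj[u] & adj[v])  # computed once, used by both patterns
        intersections += 1
        triangle_hits += k
        diamonds += k * (k - 1) // 2
    return triangle_hits // 3, diamonds, intersections
```

Both functions return the same pattern counts; the third element of the result is the number of set intersections performed, which is halved in the shared version. A real system would have to discover such shared sub-computations automatically for arbitrary pattern sets, which this sketch does not attempt.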