SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space...
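As a rough illustration of what "transforming a mining problem into a series of subgraph matching problems" can mean, the sketch below counts the two connected 3-vertex patterns (triangle and open wedge) by matching each pattern separately. It is not taken from the paper and is not SumPA's algorithm; the function names (`build_adjacency`, `count_triangles`, `count_wedges`, `count_3_motifs`) and the vertex-ID ordering used to avoid duplicate matches are assumptions made for this example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj):
    """Match the triangle pattern in the order v0 -> v1 -> v2.

    Requiring v0 < v1 < v2 reports each triangle exactly once
    (a simple form of symmetry breaking).
    """
    count = 0
    for v0 in adj:
        for v1 in adj[v0]:
            if v1 <= v0:
                continue
            # v2 must be adjacent to both already-matched vertices.
            for v2 in adj[v0] & adj[v1]:
                if v2 > v1:
                    count += 1
    return count


def count_wedges(adj):
    """Match the open wedge a - c - b, where a and b are NOT adjacent."""
    count = 0
    for c, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                count += 1
    return count


def count_3_motifs(edges):
    """A 3-motif counting 'mining problem' posed as two matching problems."""
    adj = build_adjacency(edges)
    return {"triangle": count_triangles(adj), "wedge": count_wedges(adj)}
```

Both matchers start by enumerating a vertex and its neighbors, so running them independently repeats that work. Overlap of this kind between patterns is one plausible reading of the redundancy "in matching multiple patterns simultaneously" that the abstract mentions, though the paper's actual criterion is not reproduced here.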
| Published in: | 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 318-330 |
|---|---|
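| Main authors: | Gui, Chuangyi; Liao, Xiaofei; Zheng, Long; Yao, Pengcheng; Wang, Qinggang; Jin, Hai |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 2021-09-01 |
| Subjects: | Computational efficiency; data reuse; Degradation; graph mining; Parallel architectures; Pattern matching; Redundancy; Technological innovation; Transforms |
| Online access: | Full text |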
| Abstract | Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem into a series of subgraph matching problems for high performance. Existing pattern-centric mining systems reduce the substantial search space towards a single pattern by exploring a highly optimized matching order, but such a matching order still carries severe inherent computational redundancies, leading to significant performance degradation. The key innovation of this work lies in a general redundancy criterion that characterizes computational redundancies arising not only in handling a single pattern but also in matching multiple patterns simultaneously. In this paper, we present SumPA, a high-performance pattern-centric graph mining system that can sufficiently remove redundant computations for any complex graph mining problem. SumPA features three key designs: (1) a pattern abstraction technique that can simplify numerous complex patterns into a few simple abstract patterns based on pattern similarity, (2) abstraction-guided pattern matching that eliminates both totally and partially redundant computations during subgraph enumeration, and (3) a suite of system optimizations to maximize storage and computation efficiency. Our evaluation on a wide variety of real-world graphs shows that SumPA outperforms the two state-of-the-art systems Peregrine and GraphPi by up to 61.89× and 8.94×, respectively. For many mining problems on large graphs, Peregrine takes hours or even days while SumPA finishes in only a few minutes. |
|---|---|
| Author | Yao, Pengcheng; Wang, Qinggang; Gui, Chuangyi; Liao, Xiaofei; Jin, Hai; Zheng, Long |
| Author_xml | – sequence: 1 givenname: Chuangyi surname: Gui fullname: Gui, Chuangyi email: chygui@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 – sequence: 2 givenname: Xiaofei surname: Liao fullname: Liao, Xiaofei email: xfliao@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 – sequence: 3 givenname: Long surname: Zheng fullname: Zheng, Long email: longzh@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 – sequence: 4 givenname: Pengcheng surname: Yao fullname: Yao, Pengcheng email: pcyao@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 – sequence: 5 givenname: Qinggang surname: Wang fullname: Wang, Qinggang email: qgwang@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 – sequence: 6 givenname: Hai surname: Jin fullname: Jin, Hai email: hjin@hust.edu.cn organization: National Engineering Research Center for Big Data Technology and System, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, 430074 |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/PACT52795.2021.00030 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9781665442787 1665442786 |
| EndPage | 330 |
| ExternalDocumentID | 9563022 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: 61825202,62072195,61832006 funderid: 10.13039/501100001809 – fundername: National Key Research and Development Program of China grantid: 2018YFB1003502 funderid: 10.13039/501100012166 |
| GroupedDBID | 6IE 6IL ACM ALMA_UNASSIGNED_HOLDINGS APO CBEJK LHSKQ RIE RIL |
| ID | FETCH-LOGICAL-a285t-388061e8b90be3c9936f2458ab065ad3cac42327e503c6576c6f1e9c5d9c734e3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 15 |
| IngestDate | Tue May 06 03:33:13 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a285t-388061e8b90be3c9936f2458ab065ad3cac42327e503c6576c6f1e9c5d9c734e3 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_9563022 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Sept. |
| PublicationDateYYYYMMDD | 2021-09-01 |
| PublicationDate_xml | – month: 09 year: 2021 text: 2021-Sept. |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib057737291 |
| Score | 2.2554102 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 318 |
| SubjectTerms | Computational efficiency; data reuse; Degradation; graph mining; Parallel architectures; Pattern matching; Redundancy; Technological innovation; Transforms |
| Title | SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction |
| URI | https://ieeexplore.ieee.org/document/9563022 |