Scalable Dynamic Graph Summarization
Large-scale dynamic interaction graphs can be challenging to process and store, due to their size and the continuous change of communication patterns between nodes. In this work, we address the problem of summarizing large-scale dynamic graphs, while maintaining the evolution of their structure and...
| Published in: | IEEE transactions on knowledge and data engineering Vol. 32; no. 2; pp. 360 - 373 |
|---|---|
| Main Authors: | Tsalouchidou, Ioanna; Bonchi, Francesco; Morales, Gianmarco De Francisci; Baeza-Yates, Ricardo |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, IEEE, 01.02.2020, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1041-4347, 1558-2191 |
| Abstract | Large-scale dynamic interaction graphs can be challenging to process and store, due to their size and the continuous change of communication patterns between nodes. In this work, we address the problem of summarizing large-scale dynamic graphs, while maintaining the evolution of their structure and interactions. Our approach is based on grouping the nodes of the graph in supernodes according to their connectivity and communication patterns. The resulting summary graph preserves the information about the evolution of the graph within a time window. We propose two online algorithms for summarizing this type of graph. Our baseline algorithm kC, based on clustering, is fast but rather memory expensive. The second method we propose, named μC, reduces the memory requirements by introducing an intermediate step that keeps statistics of the clustering of the previous rounds. Our algorithms are distributed by design, and we implement them over the Apache Spark framework, so as to address the problem of scalability for large-scale graphs and massive streams. We apply our methods to several dynamic graphs, and show that we can efficiently use the summary graphs to answer temporal and probabilistic graph queries. |
| Author | Tsalouchidou, Ioanna; Baeza-Yates, Ricardo; Morales, Gianmarco De Francisci; Bonchi, Francesco |
| Author_xml | 1. Tsalouchidou, Ioanna (ORCID 0000-0002-8560-9565; ioanna.tsalouchidou@upf.edu; Pompeu Fabra University, Barcelona, Spain); 2. Bonchi, Francesco (ORCID 0000-0001-9464-8315; francesco.bonchi@isi.it; ISI Foundation, Turin, Italy); 3. Morales, Gianmarco De Francisci (ORCID 0000-0002-2415-494X; gdfm@acm.org; ISI Foundation, Turin, Italy); 4. Baeza-Yates, Ricardo (rbaeza@acm.org; Pompeu Fabra University, Barcelona, Spain) |
| CODEN | ITKEEH |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/TKDE.2018.2884471 |
| Discipline | Engineering; Computer Science |
| EISSN | 1558-2191 |
| EndPage | 373 |
| ExternalDocumentID | 10_1109_TKDE_2018_2884471 8554120 |
| Genre | orig-research |
| ISICitedReferencesCount | 16 |
| ISSN | 1041-4347 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-8560-9565 0000-0001-9464-8315 0000-0002-2415-494X |
| PageCount | 14 |
| PublicationDate | 2020-02-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on knowledge and data engineering |
| PublicationTitleAbbrev | TKDE |
| PublicationYear | 2020 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 360 |
| SubjectTerms | Algorithms; Clustering; Clustering algorithms; dynamic graphs; Evolution; Graph summarization; Graphs; Heuristic algorithms; Microsoft Windows; Nodes; Scalability; Silicon; Tensile stress; Windows (intervals) |
| Title | Scalable Dynamic Graph Summarization |
| URI | https://ieeexplore.ieee.org/document/8554120 https://www.proquest.com/docview/2338656709 |
| Volume | 32 |
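
The abstract describes grouping nodes into supernodes according to their connectivity within a time window, and then answering queries on the resulting summary graph. The following is a minimal illustrative sketch of that general idea only, not the authors' kC or μC algorithm and not their Spark implementation. It assumes a plain k-means over concatenated adjacency rows, a farthest-point initialisation, and superedge weights defined as edge densities per snapshot; all function names (`node_features`, `kmeans`, `summarize`) and the toy data are hypothetical.

```python
from collections import defaultdict


def node_features(snapshots, nodes):
    """Per node: its adjacency rows for every snapshot in the window, concatenated."""
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    feats = {v: [0.0] * (n * len(snapshots)) for v in nodes}
    for t, edges in enumerate(snapshots):
        for u, v in edges:  # undirected edges, no self-loops (assumption)
            feats[u][t * n + index[v]] = 1.0
            feats[v][t * n + index[u]] = 1.0
    return feats


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(feats, k, rounds=10):
    """Plain k-means with deterministic farthest-point initialisation (assumption)."""
    nodes = sorted(feats)
    centers = [list(feats[nodes[0]])]
    while len(centers) < k:
        far = max(nodes, key=lambda v: min(sq_dist(feats[v], c) for c in centers))
        centers.append(list(feats[far]))
    assign = {}
    for _ in range(rounds):
        # Assignment step: each node joins its nearest center.
        assign = {
            v: min(range(k), key=lambda c: sq_dist(feats[v], centers[c]))
            for v in nodes
        }
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [feats[v] for v in nodes if assign[v] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def summarize(snapshots, assign):
    """Per snapshot: density of edges between each pair of supernodes."""
    size = defaultdict(int)
    for c in assign.values():
        size[c] += 1
    summary = []
    for edges in snapshots:
        count = defaultdict(int)
        for u, v in edges:
            a, b = sorted((assign[u], assign[v]))
            count[(a, b)] += 1
        dens = {}
        for (a, b), m in count.items():
            possible = size[a] * (size[a] - 1) // 2 if a == b else size[a] * size[b]
            dens[(a, b)] = m / possible
        summary.append(dens)
    return summary


# Toy window of two snapshots over four nodes (hypothetical data).
nodes = ["a", "b", "c", "d"]
snapshots = [
    [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")],
    [("a", "c"), ("b", "c")],
]
assign = kmeans(node_features(snapshots, nodes), k=2)
summary = summarize(snapshots, assign)
print(assign)
print(summary)
```

In this sketch a superedge weight can be read as the probability that a given pair of nodes from the two supernodes is connected at that timestamp, which is one way a summary could support the probabilistic queries mentioned in the abstract. Storing one full adjacency-based vector per node is also what makes a clustering approach of this kind memory hungry on large graphs.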