Scalable Dynamic Graph Summarization

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 32, no. 2, pp. 360-373
Main Authors: Tsalouchidou, Ioanna; Bonchi, Francesco; De Francisci Morales, Gianmarco; Baeza-Yates, Ricardo
Format: Journal Article
Language: English
Published: New York: IEEE, 2020-02-01
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1041-4347, 1558-2191
Abstract: Large-scale dynamic interaction graphs can be challenging to process and store, due to their size and the continuous change of communication patterns between nodes. In this work, we address the problem of summarizing large-scale dynamic graphs while maintaining the evolution of their structure and interactions. Our approach is based on grouping the nodes of the graph into supernodes according to their connectivity and communication patterns. The resulting summary graph preserves the information about the evolution of the graph within a time window. We propose two online algorithms for summarizing this type of graph. Our baseline algorithm, kC, is based on clustering; it is fast but rather memory expensive. The second method we propose, named μC, reduces the memory requirements by introducing an intermediate step that keeps statistics of the clustering of the previous rounds. Our algorithms are distributed by design, and we implement them over the Apache Spark framework, so as to address the problem of scalability for large-scale graphs and massive streams. We apply our methods to several dynamic graphs, and show that we can efficiently use the summary graphs to answer temporal and probabilistic graph queries.
Author Tsalouchidou, Ioanna
Baeza-Yates, Ricardo
De Francisci Morales, Gianmarco
Bonchi, Francesco
Author_xml – sequence: 1
  givenname: Ioanna
  orcidid: 0000-0002-8560-9565
  surname: Tsalouchidou
  fullname: Tsalouchidou, Ioanna
  email: ioanna.tsalouchidou@upf.edu
  organization: Pompeu Fabra University, Barcelona, Spain
– sequence: 2
  givenname: Francesco
  orcidid: 0000-0001-9464-8315
  surname: Bonchi
  fullname: Bonchi, Francesco
  email: francesco.bonchi@isi.it
  organization: ISI Foundation, Turin, Italy
– sequence: 3
  givenname: Gianmarco
  orcidid: 0000-0002-2415-494X
  surname: De Francisci Morales
  fullname: De Francisci Morales, Gianmarco
  email: gdfm@acm.org
  organization: ISI Foundation, Turin, Italy
– sequence: 4
  givenname: Ricardo
  surname: Baeza-Yates
  fullname: Baeza-Yates, Ricardo
  email: rbaeza@acm.org
  organization: Pompeu Fabra University, Barcelona, Spain
CODEN ITKEEH
CitedBy_id crossref_primary_10_1007_s10462_020_09916_4
crossref_primary_10_1007_s13278_024_01314_w
crossref_primary_10_1109_TKDE_2023_3270460
crossref_primary_10_1007_s10618_020_00714_8
crossref_primary_10_1109_TNSE_2022_3227909
crossref_primary_10_1016_j_ins_2024_121624
crossref_primary_10_1007_s11227_020_03290_2
crossref_primary_10_1109_TKDE_2024_3373712
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/TKDE.2018.2884471
Discipline Engineering
Computer Science
EISSN 1558-2191
EndPage 373
ExternalDocumentID 10_1109_TKDE_2018_2884471
8554120
Genre orig-research
ISICitedReferencesCount 16
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-8560-9565
0000-0001-9464-8315
0000-0002-2415-494X
PageCount 14
PublicationCentury 2000
PublicationDate 2020-02-01
PublicationDateYYYYMMDD 2020-02-01
PublicationDate_xml – month: 02
  year: 2020
  text: 2020-02-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on knowledge and data engineering
PublicationTitleAbbrev TKDE
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 360
SubjectTerms Algorithms
Clustering
Clustering algorithms
dynamic graphs
Evolution
Graph summarization
Graphs
Heuristic algorithms
Microsoft Windows
Nodes
Scalability
Silicon
Tensile stress
Windows (intervals)
Title Scalable Dynamic Graph Summarization
URI https://ieeexplore.ieee.org/document/8554120
https://www.proquest.com/docview/2338656709
Volume 32