Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs

Detailed bibliography
Published in: Proceedings of the ... International World-Wide Web Conference. International WWW Conference, Volume 2018; p. 489
Main authors: Yang, Xiaofeng; Nicholson, Patrick K; Ajwani, Deepak; Riedewald, Mirek; Gatterbauer, Wolfgang; Sala, Alessandra
Format: Journal Article
Language: English
Published: Netherlands, 1 April 2018
Abstract Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select value k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) that the users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges.
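The any-k idea described in the abstract can be illustrated with a minimal Python sketch. This is not KARPET and not the authors' code. It assumes a path-shaped pattern (a special case of a tree pattern), one label per node, directed weighted edges, and a ranking that prefers the smallest total edge weight. The names `any_k_paths` and `take_within` and the toy graph are hypothetical. A bottom-up pass first discards nodes that cannot complete a match, then a best-first search emits matches lazily in ranking order, loosely mirroring the pruning and guided-search ingredients the abstract mentions.

```python
import heapq
import time
from itertools import islice


def any_k_paths(labels, edges, pattern):
    """Lazily yield (total_weight, match) for a path pattern, lightest first.

    labels:  dict node -> label
    edges:   iterable of (source, target, weight), treated as directed
    pattern: list of labels, one per position of the path
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))

    n = len(pattern)
    # best[i][u] = cheapest weight needed to finish the pattern when node u
    # sits at position i. Nodes missing from best[i] cannot be completed
    # into a full match and are pruned.
    best = [dict() for _ in range(n)]
    for u, lab in labels.items():
        if lab == pattern[-1]:
            best[n - 1][u] = 0.0
    for i in range(n - 2, -1, -1):
        for u, lab in labels.items():
            if lab != pattern[i]:
                continue
            costs = [w + best[i + 1][v]
                     for v, w in adj.get(u, []) if v in best[i + 1]]
            if costs:
                best[i][u] = min(costs)

    # Best-first search. The priority is the weight so far plus the exact
    # cheapest completion, so complete matches are popped in ranking order.
    heap = [(bound, 0.0, (u,)) for u, bound in best[0].items()]
    heapq.heapify(heap)
    while heap:
        _, cost, prefix = heapq.heappop(heap)
        i = len(prefix) - 1
        if i == n - 1:
            yield cost, prefix
            continue
        for v, w in adj.get(prefix[-1], []):
            if v in best[i + 1]:
                new_cost = cost + w
                heapq.heappush(
                    heap, (new_cost + best[i + 1][v], new_cost, prefix + (v,)))


def take_within(results, budget_s):
    """Collect ranked results until the time budget (in seconds) runs out."""
    deadline = time.monotonic() + budget_s
    out = []
    for item in results:
        out.append(item)
        if time.monotonic() >= deadline:
            break
    return out


# Hypothetical toy graph: users rate movies, movies belong to genres.
LABELS = {"u1": "user", "u2": "user", "m1": "movie", "m2": "movie", "g1": "genre"}
EDGES = [("u1", "m1", 1.0), ("u1", "m2", 4.0), ("u2", "m1", 2.0),
         ("m1", "g1", 1.0), ("m2", "g1", 0.5)]
PATTERN = ["user", "movie", "genre"]

if __name__ == "__main__":
    stream = any_k_paths(LABELS, EDGES, PATTERN)
    print(take_within(stream, 0.01))  # as many top results as the budget allows
    print(list(islice(stream, 2)))    # given more time, continue where it stopped
```

Because the generator keeps its search state between calls, the caller never has to fix k in advance: it can stop after any result and later resume from the next lower-ranked match.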
Author Sala, Alessandra
Ajwani, Deepak
Yang, Xiaofeng
Nicholson, Patrick K
Gatterbauer, Wolfgang
Riedewald, Mirek
Author_xml – sequence: 1
  givenname: Xiaofeng
  surname: Yang
  fullname: Yang, Xiaofeng
  organization: Northeastern University, Boston, MA
– sequence: 2
  givenname: Patrick K
  surname: Nicholson
  fullname: Nicholson, Patrick K
  organization: Nokia Bell Labs, Dublin, Ireland
– sequence: 3
  givenname: Deepak
  surname: Ajwani
  fullname: Ajwani, Deepak
  organization: Nokia Bell Labs, Dublin, Ireland
– sequence: 4
  givenname: Mirek
  surname: Riedewald
  fullname: Riedewald, Mirek
  organization: Northeastern University, Boston, MA
– sequence: 5
  givenname: Wolfgang
  surname: Gatterbauer
  fullname: Gatterbauer, Wolfgang
  organization: Northeastern University, Boston, MA
– sequence: 6
  givenname: Alessandra
  surname: Sala
  fullname: Sala, Alessandra
  organization: Nokia Bell Labs, Dublin, Ireland
BackLink https://www.ncbi.nlm.nih.gov/pubmed/30003197 (View this record in MEDLINE/PubMed)
ContentType Journal Article
DBID NPM
7X8
DOI 10.1145/3178876.3186115
DatabaseName PubMed
MEDLINE - Academic
DatabaseTitle PubMed
MEDLINE - Academic
DatabaseTitleList MEDLINE - Academic
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod no_fulltext_linktorsrc
ExternalDocumentID 30003197
Genre Journal Article
GroupedDBID NPM
7X8
IEDL.DBID 7X8
ISICitedReferencesCount 12
IngestDate Fri Jul 11 14:49:45 EDT 2025
Wed Feb 19 02:43:27 EST 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
OpenAccessLink https://pubmed.ncbi.nlm.nih.gov/PMC6037532
PMID 30003197
PQID 2070247551
PQPubID 23479
ParticipantIDs proquest_miscellaneous_2070247551
pubmed_primary_30003197
PublicationCentury 2000
PublicationDate 20180401
PublicationDateYYYYMMDD 2018-04-01
PublicationDate_xml – month: 4
  year: 2018
  text: 20180401
  day: 1
PublicationDecade 2010
PublicationPlace Netherlands
PublicationPlace_xml – name: Netherlands
PublicationTitle Proceedings of the ... International World-Wide Web Conference. International WWW Conference
PublicationTitleAlternate Proc Int World Wide Web Conf
PublicationYear 2018
Score 1.7608294
SourceID proquest
pubmed
SourceType Aggregation Database
Index Database
StartPage 489
Title Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
URI https://www.ncbi.nlm.nih.gov/pubmed/30003197
https://www.proquest.com/docview/2070247551
Volume 2018