G-ThinkerQ: A General Subgraph Querying System With a Unified Task-Based Programming Model

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3429-3444
Main Authors: Yuan, Lyuheng; Guo, Guimu; Yan, Da; Adhikari, Saugat; Khalil, Jalal; Long, Cheng; Zou, Lei
Format: Journal Article
Language: English
Published: IEEE, 1 June 2025
ISSN: 1041-4347, 1558-2191
Description
Summary: Given a large graph $G$, a subgraph query $Q$ finds the set of all subgraphs of $G$ that satisfy certain conditions specified by $Q$. Examples of subgraph queries include finding a community containing designated members to organize an event, and subgraph matching. To overcome the weakness of existing graph-parallel systems that underutilize CPU cores when finding subgraphs, our prior system, G-thinker, adopts a novel think-like-a-task (TLAT) parallel programming model. However, G-thinker targets offline analytics and cannot support interactive online querying, where users continually submit subgraph queries with different query contents. The challenges here are (i) how to maintain fairness, so that queries are answered in the order in which they are received: a later query is processed only if earlier queries cannot saturate the available computation resources; (ii) how to track the progress of active queries (each with many tasks under computation) so that users are notified as soon as a query completes; and (iii) how to maintain memory boundedness and high task concurrency as in G-thinker. In this article, we propose a novel TLAT programming framework, called G-thinkerQ, for answering online subgraph queries. G-thinkerQ inherits the memory boundedness and high task concurrency of G-thinker by organizing the tasks of each query using a "task capsule" structure, and designs a novel task-capsule list to ensure fairness among queries. A novel lineage-based mechanism is also designed to keep track of when the last task of a query is completed. Parallel counterparts of the state-of-the-art algorithms for 4 recent advanced subgraph queries are implemented on G-thinkerQ to demonstrate its CPU-scalability.
DOI: 10.1109/TKDE.2025.3537964