A Generic Ontology Framework for Indexing Keyword Search on Massive Graphs

Bibliographic Details
Published in: IEEE transactions on knowledge and data engineering, Vol. 33, no. 6, pp. 2322-2336
Main Authors: Jiang, Jiaxin; Choi, Byron; Xu, Jianliang; Bhowmick, Sourav S
Format: Journal Article
Language: English
Published: New York IEEE 01.06.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1041-4347, 1558-2191
Description
Summary: Due to the unstructuredness and the lack of schema information of knowledge graphs, social networks and RDF graphs, keyword search has been proposed for querying such graphs/networks. Recently, various keyword search semantics have been designed. In this paper, we propose a generic ontology-based indexing framework for keyword search, called Bisimulation of Generalized Graph Index (BiG-index), to enhance the search performance. The novelties of BiG-index reside in using an ontology graph $G_{Ont}$ to summarize and index a data graph $G$ iteratively, to form a hierarchical index structure $\mathbb{G}$.
BiG-index is generic since it only requires keyword search algorithms to generate query answers from summary graphs having two simple properties. Regarding query evaluation, we transform a keyword search $q$ into $\mathbb{Q}$ according to $G_{Ont}$ in runtime. The transformed query is searched on the summary graphs in $\mathbb{G}$. The efficiency is due to the small sizes of the summary graphs and the early pruning of semantically irrelevant subgraphs.
To illustrate BiG-index's applicability, we show popular indexing techniques for keyword search (e.g., Blinks and r-clique) can be easily implemented on top of BiG-index.
Our extensive experiments show that BiG-index reduced the runtimes of popular keyword search work Blinks by 50.5 percent and r-clique by 29.5 percent.
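The summary above describes generalizing a data graph with an ontology and then summarizing it by bisimulation. The following is a minimal Python sketch of that general idea for a single index layer, not the authors' algorithm: the function names, the child-to-parent ontology dictionary, the toy graph, and the use of naive forward-bisimulation refinement are all assumptions made for illustration, and the record does not specify the paper's actual construction or the two properties it requires of summary graphs.

```python
from collections import defaultdict


def generalize_labels(labels, ontology_parent):
    """Replace each vertex label by its parent concept, if the (hypothetical)
    ontology mapping has one; otherwise keep the label unchanged."""
    return {v: ontology_parent.get(lab, lab) for v, lab in labels.items()}


def bisimulation_blocks(labels, edges):
    """Partition vertices by forward bisimulation using naive refinement.

    Two vertices end up in the same block when they carry the same label
    and their successors fall into the same set of blocks.
    """
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    block = dict(labels)  # initial partition: one block per label
    while True:
        # Signature = current block plus the set of successor blocks.
        sig = {
            v: (block[v], frozenset(block[w] for w in succ[v]))
            for v in labels
        }
        ids = {}
        new_block = {v: ids.setdefault(sig[v], len(ids)) for v in labels}
        # Each round only splits blocks, so an unchanged count means stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


def summarize(labels, edges):
    """Build a summary graph with one vertex per bisimulation block."""
    block = bisimulation_blocks(labels, edges)
    summary_labels = {block[v]: labels[v] for v in labels}
    summary_edges = {(block[u], block[v]) for u, v in edges}
    return summary_labels, summary_edges


# Toy data graph (assumed for illustration only).
labels = {1: "Professor", 2: "Student", 3: "Lecturer", 4: "Student"}
edges = [(1, 2), (3, 4)]
# Hypothetical ontology: child concept -> parent concept.
ontology_parent = {"Professor": "Academic", "Lecturer": "Academic"}

plain_labels, plain_edges = summarize(labels, edges)
general_labels, general_edges = summarize(
    generalize_labels(labels, ontology_parent), edges
)
print(len(plain_labels), len(general_labels))
```

In this toy case, generalizing Professor and Lecturer to a shared parent concept lets the two vertices merge, so the generalized summary has fewer vertices than the plain one. Repeating the generalize-then-summarize step on each resulting summary would give a stack of progressively smaller graphs, which is one way to picture a hierarchical index.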
DOI: 10.1109/TKDE.2019.2956535