A Generic Ontology Framework for Indexing Keyword Search on Massive Graphs
| Published in: | IEEE transactions on knowledge and data engineering Vol. 33; no. 6; pp. 2322 - 2336 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 01.06.2021. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1041-4347, 1558-2191 |
| Summary: | Due to the unstructuredness and the lack of schema information of knowledge graphs, social networks and RDF graphs, keyword search has been proposed for querying such graphs/networks. Recently, various keyword search semantics have been designed. In this paper, we propose a generic ontology-based indexing framework for keyword search, called Bisimulation of Generalized Graph Index ($\mathsf{BiG\hbox{-}index}$), to enhance the search performance. The novelties of $\mathsf{BiG\hbox{-}index}$ reside in using an ontology graph $G_{Ont}$ to summarize and index a data graph $G$ iteratively, to form a hierarchical index structure $\mathbb{G}$. $\mathsf{BiG\hbox{-}index}$ is generic since it only requires keyword search algorithms to generate query answers from summary graphs having two simple properties. Regarding query evaluation, we transform a keyword search $q$ into $\mathbb{Q}$ according to $G_{Ont}$ in runtime. The transformed query is searched on the summary graphs in $\mathbb{G}$. The efficiency is due to the small sizes of the summary graphs and the early pruning of semantically irrelevant subgraphs. To illustrate $\mathsf{BiG\hbox{-}index}$'s applicability, we show popular indexing techniques for keyword search (e.g., $\mathsf{Blinks}$ and $\mathsf{r\hbox{-}clique}$) can be easily implemented on top of $\mathsf{BiG\hbox{-}index}$. Our extensive experiments show that $\mathsf{BiG\hbox{-}index}$ reduced the runtimes of popular keyword search work $\mathsf{Blinks}$ by 50.5 percent and $\mathsf{r\hbox{-}clique}$ by 29.5 percent. |
|---|---|
| DOI: | 10.1109/TKDE.2019.2956535 |
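
The summary describes three steps: generalizing vertex labels through an ontology, summarizing the graph by bisimulation, and rewriting the query keywords to match each summary level. The Python sketch below is a rough illustration of that idea, not the paper's algorithm. The function names (`generalize`, `summarize`, `build_hierarchy`, `transform_query`), the child-to-parent dictionary used as the ontology, the naive partition refinement, and the toy graph are all assumptions made for this example.

```python
from collections import defaultdict


def generalize(labels, parent):
    """Lift every vertex label one step up the ontology (unchanged if it has no parent)."""
    return {v: parent.get(lab, lab) for v, lab in labels.items()}


def summarize(labels, edges):
    """Merge vertices that agree on label and on the blocks of their successors.

    Naive partition refinement (an assumption, not the paper's procedure).
    Returns (summary_labels, summary_edges, block_of_vertex).
    """
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
    block = dict(labels)  # initial partition: one block per label
    while True:
        ids, refined = {}, {}
        for v in sorted(labels):
            # Signature = own block plus the set of successor blocks.
            sig = (block[v], frozenset(block[w] for w in succ[v]))
            refined[v] = ids.setdefault(sig, len(ids))
        # Refinement only splits blocks, so an equal count means a fixpoint.
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:
            break
    summary_labels = {block[v]: labels[v] for v in labels}
    summary_edges = {(block[u], block[v]) for u, v in edges}
    return summary_labels, summary_edges, block


def build_hierarchy(labels, edges, parent, max_levels=5):
    """Generalize and summarize repeatedly; stop when the graph no longer shrinks."""
    layers = [(labels, set(edges))]
    for _ in range(max_levels):
        cur_labels, cur_edges = layers[-1]
        s_labels, s_edges, _ = summarize(generalize(cur_labels, parent), cur_edges)
        if len(s_labels) >= len(cur_labels):
            break
        layers.append((s_labels, s_edges))
    return layers


def transform_query(keywords, parent, level):
    """Rewrite query keywords to the ontology terms used at the given level."""
    out = list(keywords)
    for _ in range(level):
        out = [parent.get(k, k) for k in out]
    return out


# Hypothetical toy ontology (child -> parent) and data graph.
parent = {"Professor": "Academic", "Lecturer": "Academic",
          "Student": "Person", "Academic": "Person"}
labels = {1: "Professor", 2: "Lecturer", 3: "Student", 4: "Student", 5: "University"}
edges = [(1, 3), (2, 4), (3, 5), (4, 5)]

layers = build_hierarchy(labels, edges, parent)
level1_query = transform_query(["Professor", "Student"], parent, 1)
```

In this toy run the five-vertex graph collapses to a three-vertex summary after one round, and a second round yields no further reduction, so the hierarchy stops at two layers. A keyword search algorithm would then be run on the smaller layer with the rewritten keywords before its answers are refined against the original graph.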