Reliable Keyword Query Interpretation on Summary Graphs

Bibliographic Details
Published in:	IEEE Transactions on Knowledge and Data Engineering, Vol. 35, no. 5, pp. 5187-5202
Main Authors:	Zhong, Ming; Zheng, Yingyi; Xue, Guotong; Liu, Mengchi
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 01.05.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Estimation; Evaluation; graph summarization; Keyword query interpretation; Keywords; knowledge graph; Motion pictures; probabilistic model; Queries; Reliability; Resource description framework; search algorithm; Search algorithms; Semantics; Uncertainty
ISSN:	1041-4347, 1558-2191

Description
Summary:	The semantic gap between keyword queries and the search intents behind them has motivated intensive study of keyword query interpretation, which aims to translate a keyword query into structured queries (a.k.a. patterns) representing the most likely relevant search intents. However, an important issue remains understudied: how to guarantee that the patterns are "reliable", meaning that the structured queries, when evaluated, return results that actually exist. In this paper, we regard reliability as a new metric for ranking patterns, and present a keyword query interpretation approach that finds pattern trees that are both reliable and relevant on an arbitrary summary graph of the underlying data. Specifically, we first propose a reliability estimation model that uses statistics, under reasonable assumptions, to measure how likely a pattern tree is to evaluate to a nonempty result set. Second, we develop constrained top-k search algorithms that are guaranteed to return the optimal pattern trees for a specific keyword query. Moreover, to improve the efficiency of online search, we also design elaborate indexes, search heuristics and pruning strategies. Lastly, we perform comprehensive experiments on two real-world datasets, DBpedia and Yago, with both QALD-9 queries and random queries. The observations indicate that our approach significantly improves the accuracy and overall quality of the top-k results.
DOI:	10.1109/TKDE.2022.3144001