3SEPIAS: A Semi-Structured Search Engine for Personal Information in dAtaspace System


Bibliographic details
Published in: Information sciences, Vol. 218, pp. 31-50
Main authors: Zhong, Ming; Liu, Mengchi; He, Yanxiang
Format: Journal Article
Language: English
Published: Elsevier Inc, 01.01.2013
ISSN:0020-0255, 1872-6291
Online access: Full text
Description
Summary: Personal information is increasingly distributed across heterogeneous sources, which presents a major obstacle to its management and retrieval. To address this problem, this paper presents the blueprint of a novel Personal Information Management (PIM) system named 3SEPIAS (short for Semi-Structured Search Engine for Personal Information in dAtaspace System). 3SEPIAS has three main features: data integration without upfront semantic reconciliation, a flexible query model for data with sparse and evolving schemas, and an efficient best-effort proximity search approach on graphs. To this end, we first propose a semi-structured graph data model called the Interpreted Object Model (IOM), which uniformly represents a user's heterogeneous personal information and loosely integrates it into a dataspace in a schema-later way. A Semi-Structured Search Engine (3SE) can then be used to search over the personal dataspaces. We propose an intuitive 3SE Query Language (3SQL) that enables users to query with varying degrees of structural constraint, according to their knowledge of the underlying schemas. Moreover, a best-effort top-k proximity search optimization strategy and corresponding graph index structures are proposed to improve the efficiency of query processing. We perform comprehensive experiments to test both the effectiveness and the efficiency of our proximity search approach. The results show that 3SE outperforms previous proximity search systems by a large margin with little or no loss of result quality, especially on large graphs.
DOI:10.1016/j.ins.2012.06.013
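
Illustration (not part of the catalog record): the summary mentions top-k proximity search over a graph of personal information. The sketch below shows only the general idea of keyword proximity search on a small graph; it is not the 3SE algorithm, its index structures, or 3SQL, none of which are specified in this record. The function names, the toy graph, the node labels, and the hop cutoff `max_hops` (a crude stand-in for a "best-effort" bound) are all assumptions made for the example.

```python
from collections import deque
import heapq


def bfs_distances(graph, sources, max_hops):
    """Multi-source BFS: hop distance to the nearest source, cut off at max_hops."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop cutoff
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def top_k_proximity(graph, labels, keywords, k=3, max_hops=2):
    """Rank nodes by total hop distance to the nearest match of each keyword."""
    per_keyword = []
    for kw in keywords:
        # Hypothetical matching rule: case-insensitive substring match on the label.
        sources = [n for n, text in labels.items() if kw.lower() in text.lower()]
        per_keyword.append(bfs_distances(graph, sources, max_hops))
    if not per_keyword:
        return []
    # Keep only nodes that are within max_hops of every keyword.
    candidates = set(per_keyword[0])
    for dist in per_keyword[1:]:
        candidates &= set(dist)
    scored = [(sum(d[n] for d in per_keyword), n) for n in candidates]
    return heapq.nsmallest(k, scored)  # smaller total distance ranks higher


# Toy personal dataspace (invented data): undirected adjacency lists.
graph = {
    "alice": ["mail1", "paper1"],
    "bob": ["mail1"],
    "mail1": ["alice", "bob", "paper1"],
    "paper1": ["alice", "mail1"],
}
labels = {
    "alice": "Alice (contact)",
    "bob": "Bob (contact)",
    "mail1": "Email: dataspace draft",
    "paper1": "Paper: dataspace search",
}

result = top_k_proximity(graph, labels, ["alice", "dataspace"], k=3)
print(result)
```

In this sketch a lower score means a node is closer to matches for all keywords, and ties are broken by node name. A real system would presumably replace the per-query BFS with precomputed graph indexes, which is the kind of optimization the summary refers to.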