Database Selection for Processing k Nearest Neighbors Queries in Distributed Environments.

Bibliographic Details
Title: Database Selection for Processing k Nearest Neighbors Queries in Distributed Environments.
Language: English
Authors: Yu, Clement; Sharma, Prasoon; Meng, Weiyi; Qin, Yan
Availability: Association for Computing Machinery, 1515 Broadway, New York NY 10036. Tel: 800-342-6626 (Toll Free); Tel: 212-626-0500; e-mail: acmhelp@acm.org. For full text: http://www1.acm.org/pubs/contents/proceedings/dl/379437/.
Peer Reviewed: N
Page Count: 10
Publication Date: 2001
Sponsoring Agency: National Science Foundation, Arlington, VA.
Document Type: Numerical/Quantitative Data; Reports - Research; Speeches/Meeting Papers
Descriptors: Data Processing, Databases, Electronic Libraries, Information Processing, Information Retrieval, Information Seeking, Online Searching, Online Systems
Abstract: This paper considers the processing of digital library queries that consist of a text component and a structured component in distributed environments, concentrating on the structured component of a distributed query. A method is proposed to identify the databases that are likely to be useful for processing a given query and to determine which tuples from each useful site are needed to answer it. In this way, both the communication cost and the local processing costs are reduced. A common characteristic of these "k" nearest neighbors queries is that it is not necessary to obtain all "k" nearest neighbors; it is often sufficient to get most of them. Experimental results show that the approach obtains most of the "k" nearest neighbors (85% to 100%), with an average accuracy rate of 94.7% when the 20 closest neighbors are desired. (Contains 15 references.) (AEF)
Entry Date: 2002
Accession Number: ED459829
Database: ERIC
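Illustrative sketch: The abstract describes selecting the databases likely to hold a query's "k" nearest neighbors and fetching only the needed tuples from each. The record does not say how the paper summarizes each database, so the Python sketch below substitutes a simple, well-known stand-in and should not be read as the authors' method. It assumes each site publishes a bounding box of its tuples, visits sites in order of their smallest possible distance to the query, and stops once no remaining site can improve the current "k" best. All names (bounding_box, select_and_query, accuracy) are hypothetical. The accuracy helper shows one plausible reading of the reported accuracy rate: the fraction of the true "k" nearest neighbors that were retrieved.

```python
import heapq
import math


def bounding_box(tuples):
    """Per-site summary (an assumption): min and max of each attribute."""
    dims = range(len(tuples[0]))
    lo = tuple(min(t[d] for t in tuples) for d in dims)
    hi = tuple(max(t[d] for t in tuples) for d in dims)
    return lo, hi


def min_dist(query, box):
    """Smallest possible Euclidean distance from the query to the box."""
    lo, hi = box
    return math.sqrt(
        sum(max(l - q, 0.0, q - h) ** 2 for q, l, h in zip(query, lo, hi))
    )


def local_knn(tuples, query, k, radius=math.inf):
    """What a site would return: its k closest tuples within the radius."""
    cands = [(math.dist(query, t), t) for t in tuples]
    return heapq.nsmallest(k, [c for c in cands if c[0] <= radius])


def select_and_query(sites, query, k):
    """sites maps a site name to its list of tuples.

    Returns (neighbors, contacted): the k nearest tuples found and the
    names of the sites that were actually queried.
    """
    boxes = {name: bounding_box(ts) for name, ts in sites.items()}
    order = sorted(sites, key=lambda name: min_dist(query, boxes[name]))
    best = []       # sorted list of (distance, tuple), at most k entries
    contacted = []
    for name in order:
        kth = best[-1][0] if len(best) == k else math.inf
        if min_dist(query, boxes[name]) > kth:
            break   # this site and all later ones cannot improve the answer
        contacted.append(name)
        # Ask the site only for tuples that could beat the current k-th best.
        found = local_knn(sites[name], query, k, radius=kth)
        best = heapq.nsmallest(k, best + found)
    return [t for _, t in best], contacted


def accuracy(retrieved, exact):
    """Fraction of the true k nearest neighbors that were retrieved."""
    return len(set(retrieved) & set(exact)) / len(exact)
```

With three hypothetical sites A = [(0, 0), (1, 1), (2, 2)], B = [(10, 10), (11, 11)], and C = [(1, 0), (50, 50)], a query at (0, 0) with k = 2 contacts only A and C; site B is skipped because its bounding box lies farther away than the second-best tuple already found. This bounding-box variant is exact, whereas the abstract describes an approach that accepts retrieving most, not all, of the "k" neighbors.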