Merging R-Trees: Efficient Strategies for Local Bulk Insertion

Bibliographic details
Title:	Merging R-Trees: Efficient Strategies for Local Bulk Insertion
Authors:	Rupesh Choubey, Li Chen, Elke A. Rundensteiner
Source:	GeoInformatica. 6:7-34
Publisher information:	Springer Science and Business Media LLC, 2002.
Publication year:	2002
Keywords:	Computing methodologies for information systems (hypertext navigation, interfaces, decision support, etc.), multidimensional index structures, bulk loading, Computing methodologies and applications, localized datasets, 0202 electrical engineering, electronic engineering, information engineering, spatial query processing, 02 engineering and technology, bulk insertion, performance evaluation
Description:	Summary: A lot of recent work has focused on bulk loading of data into multidimensional index structures in order to efficiently construct such structures for large datasets. In this paper, we address this problem with particular focus on R-trees, which are an important class of index structures used widely in commercial database systems. We propose a new technique which, as opposed to the current technique of inserting data one by one, bulk inserts entire new datasets into an active R-tree. This technique, called STLT (for Small-Tree-Large-Tree), considers the new dataset as an R-tree itself (small tree), identifies and prepares a suitable location in the original R-tree (large tree) for insertion, and lastly performs the insert of the small tree into the large tree. Besides an analytical cost model of STLT, extensive experimental studies on both synthetic and real GIS data sets are also reported. These experiments not only compare STLT against the conventional technique, but also evaluate the suitability and limitations of STLT under different conditions, such as varying buffer sizes, the ratio between existing and new data sizes, and the skewness of new data with respect to the whole spatial region. We find that, in terms of insertion time, STLT does much better (on average, about 65%) than the existing technique for skewed datasets as well as for large sizes of both the large tree and the small tree, while keeping comparable query tree quality. In all other circumstances, STLT consistently outperforms the alternative technique in terms of bulk insertion time, by up to 2,000% in cases where the area of the new data sets covers up to 4% of the global region covered by the existing index tree, but at the cost of deteriorating quality of the resulting tree.
Publication type:	Article
File description:	application/xml
Language:	English
ISSN:	1573-7624 1384-6175
DOI:	10.1023/a:1013764014000
Access URL:	https://dblp.uni-trier.de/db/journals/geoinformatica/geoinformatica6.html#ChenCR02 https://link.springer.com/article/10.1023/A:1013764014000
Rights:	Springer Nature TDM
Document code:	edsair.doi.dedup.....9c1833f6b8267934a2e94942d0c2ce06
Database:	OpenAIRE
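
The abstract describes STLT only at a high level: treat the new data as a small R-tree, find a suitable location in the large tree, and attach the small tree there. The following is a minimal illustrative sketch of that idea, not the authors' algorithm. All names (`Node`, `make_node`, `entry`, `insert_small_tree`) are hypothetical, the least-enlargement descent is an assumption borrowed from standard R-tree insertion, and the "prepare" step (node overflow handling, splitting) is deliberately omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


@dataclass
class Node:
    mbr: Rect                                              # minimum bounding rectangle
    children: List["Node"] = field(default_factory=list)   # empty for data entries


def entry(x: float, y: float) -> Node:
    """A point data entry (height 0)."""
    return Node((x, y, x, y))


def make_node(children: List[Node]) -> Node:
    """Build a node whose MBR covers all of its children."""
    mbr = children[0].mbr
    for c in children[1:]:
        mbr = union(mbr, c.mbr)
    return Node(mbr, list(children))


def height(node: Node) -> int:
    """Height of a balanced tree: 0 for a data entry, 1 for a leaf node."""
    h = 0
    while node.children:
        node = node.children[0]
        h += 1
    return h


def insert_small_tree(large: Node, small: Node) -> None:
    """Attach `small` as a subtree of `large` so that leaf levels line up.

    Assumption: the target is chosen by least MBR enlargement at each level.
    Overflow of the target node is NOT handled in this sketch.
    """
    hs, hl = height(small), height(large)
    if hs >= hl:
        raise ValueError("small tree must be strictly shorter than large tree")
    node, level, path = large, hl, [large]
    # Descend until `node` is one level above where the small root belongs.
    while level > hs + 1:
        node = min(
            node.children,
            key=lambda c: area(union(c.mbr, small.mbr)) - area(c.mbr),
        )
        path.append(node)
        level -= 1
    node.children.append(small)
    # Enlarge every MBR on the path so that it covers the new subtree.
    for n in path:
        n.mbr = union(n.mbr, small.mbr)


# Usage: a large tree of height 3 and a small tree of height 1.
left = make_node([make_node([entry(0, 0), entry(1, 1)])])
right = make_node([make_node([entry(10, 10), entry(11, 11)])])
large_tree = make_node([left, right])
small_tree = make_node([entry(10.2, 10.2), entry(10.4, 10.4)])
insert_small_tree(large_tree, small_tree)
```

In this sketch the whole small tree is attached with a single descent of the large tree instead of one descent per data item, which is the intuition behind the insertion-time savings the abstract reports; the tree-quality trade-off it mentions would come from the extra MBR overlap such an attachment can introduce.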
