Merging R-Trees: Efficient Strategies for Local Bulk Insertion
| Title: | Merging R-Trees: Efficient Strategies for Local Bulk Insertion |
|---|---|
| Authors: | Rupesh Choubey, Li Chen, Elke A. Rundensteiner |
| Source: | GeoInformatica. 6:7-34 |
| Publisher information: | Springer Science and Business Media LLC, 2002. |
| Publication year: | 2002 |
| Keywords: | Computing methodologies for information systems (hypertext navigation, interfaces, decision support, etc.), multidimensional index structures, bulk loading, Computing methodologies and applications, localized datasets, 0202 electrical engineering, electronic engineering, information engineering, spatial query processing, 02 engineering and technology, bulk insertion, performance evaluation |
| Description: | Summary: Much recent work has focused on bulk loading of data into multidimensional index structures in order to construct such structures efficiently for large datasets. In this paper, we address this problem with particular focus on R-trees, an important class of index structures used widely in commercial database systems. We propose a new technique that, as opposed to the current technique of inserting data one by one, bulk inserts entire new datasets into an active R-tree. This technique, called STLT (for Small-Tree-Large-Tree), considers the new dataset as an R-tree itself (small tree), identifies and prepares a suitable location in the original R-tree (large tree) for insertion, and lastly inserts the small tree into the large tree. Besides an analytical cost model of STLT, extensive experimental studies on both synthetic and real GIS data sets are also reported. These experiments not only compare STLT against the conventional technique, but also evaluate the suitability and limitations of STLT under different conditions, such as varying buffer sizes, the ratio between existing and new data sizes, and the skewness of new data with respect to the whole spatial region. We find that STLT does much better (on average, about 65%) than the existing technique for skewed datasets as well as for large sizes of both the large tree and the small tree in terms of insertion time, while keeping comparable query tree quality. STLT consistently outperforms the alternative technique in all other circumstances in terms of bulk insertion time, most notably by up to 2,000% for cases in which the area of the new data sets covers up to 4% of the global region covered by the existing index tree; however, this comes at the cost of deteriorating resulting tree quality. |
| Publication type: | Article |
| File description: | application/xml |
| Language: | English |
| ISSN: | 1573-7624 1384-6175 |
| DOI: | 10.1023/a:1013764014000 |
| Access URL: | https://dblp.uni-trier.de/db/journals/geoinformatica/geoinformatica6.html#ChenCR02 https://link.springer.com/article/10.1023/A:1013764014000 |
| Rights: | Springer Nature TDM |
| Document code: | edsair.doi.dedup.....9c1833f6b8267934a2e94942d0c2ce06 |
| Database: | OpenAIRE |
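
The summary describes STLT only at a high level: treat the new dataset as a small R-tree, find a suitable location in the large tree, and insert the small tree there. The following Python sketch illustrates that general idea and is not the authors' algorithm. Everything in it is an assumption made for illustration: the `Node` class, the function `insert_small_tree`, and in particular the choice of location by least area enlargement (the classic R-tree heuristic), which the record does not confirm as the criterion used by STLT. The "prepare" step and any handling of node overflow or splitting are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


@dataclass
class Node:
    """Hypothetical R-tree node; data entries inside leaves are not modeled."""
    mbr: Rect
    height: int  # 0 for a leaf node
    children: List["Node"] = field(default_factory=list)


def insert_small_tree(large: Node, small: Node) -> List[Node]:
    """Attach the root of `small` as a subtree of `large`.

    To keep all leaves at the same depth, the small root must become a child
    of a node whose height is exactly small.height + 1. Returns the path
    from the large root to the node that received the subtree.
    """
    if small.height >= large.height:
        raise ValueError("small tree must be strictly shorter than the large tree")

    path = [large]
    node = large
    # Assumed heuristic: descend into the child whose MBR needs the least
    # enlargement to cover the small tree, breaking ties by smaller area.
    while node.height > small.height + 1:
        node = min(
            node.children,
            key=lambda c: (area(union(c.mbr, small.mbr)) - area(c.mbr), area(c.mbr)),
        )
        path.append(node)

    node.children.append(small)  # overflow handling (splits) is omitted here
    for n in path:               # enlarge every MBR on the path
        n.mbr = union(n.mbr, small.mbr)
    return path
```

In this sketch a small tree whose bounding rectangle already lies inside one subtree of the large tree is attached there without enlarging any rectangle, which loosely corresponds to the localized (skewed) datasets the summary reports as the favorable case.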