GRIDEN: An effective grid-based and density-based spatial clustering algorithm to support parallel computing
| Published in: | Pattern Recognition Letters, Vol. 109, pp. 81-88 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier B.V., 15 July 2018; Elsevier Science Ltd |
| Subjects: | |
| ISSN: | 0167-8655, 1872-7344 |
| Summary: | • Proposes GRIDEN, a new effective density-based and grid-based clustering algorithm for massive spatial data. • Introduces the concept of ε-neighbor cells to improve the clustering accuracy of grid-based algorithms. • Presents a parallel computing algorithm for high-dimensional density-based clustering to achieve high performance. • Supports multi-density clustering and incremental density-based clustering.
Density-based clustering is widely used in many fields. This paper proposes GRIDEN, a new effective grid-based and density-based spatial clustering algorithm that supports parallel computing in addition to multi-density clustering. It constructs grids from hyper-square cells and gives users a parameter k that controls the balance between efficiency and accuracy, which makes the algorithm more flexible. Compared with conventional density-based algorithms, it achieves much higher performance by eliminating distance calculations among points, relying instead on the newly proposed concept of ε-neighbor cells. Compared with conventional grid-based algorithms, it uses a set of symmetric (2k+1)^D cells to identify dense cells and the density-connected relationships among cells. The maximum deviation in the computed ε-neighbor points can therefore be kept to an acceptable level through parameter k. Experiments show that GRIDEN produces a reliable clustering result that approaches that of exact DBSCAN arbitrarily closely as k grows, while requiring computational time that is only linear in N. |
|---|---|
| DOI: | 10.1016/j.patrec.2017.11.011 |
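
The summary describes clustering that avoids point-to-point distance calculations by working on grid cells and a symmetric block of (2k+1)^D neighboring cells. The following is a minimal Python sketch of that general idea, not the authors' implementation. The function name `grid_density_clusters`, the cell side length `eps / k`, the `min_pts` threshold, and the treatment of points in non-dense cells as noise are all assumptions made for illustration; the record does not specify them.

```python
from collections import Counter, deque
from itertools import product


def grid_density_clusters(points, eps, k, min_pts):
    """Label points by connected components of dense grid cells (illustrative sketch)."""
    side = eps / k  # ASSUMPTION: cell side length; not given in the summary
    dim = len(points[0])

    # Map every point to the integer coordinates of its hyper-square cell.
    cells = [tuple(int(c // side) for c in p) for p in points]
    counts = Counter(cells)

    # Symmetric block of (2k+1)^D offsets around a cell, including the cell itself.
    offsets = list(product(range(-k, k + 1), repeat=dim))

    def neighbors(cell):
        """Yield the non-empty cells inside the block centered on `cell`."""
        for off in offsets:
            other = tuple(a + b for a, b in zip(cell, off))
            if other in counts:
                yield other

    # ASSUMPTION: a cell is dense when its block holds at least min_pts points.
    dense = {c for c in counts if sum(counts[n] for n in neighbors(c)) >= min_pts}

    # Flood-fill over dense cells that lie inside each other's block.
    cell_label = {}
    next_label = 0
    for start in sorted(dense):  # sorted for deterministic label numbering
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for n in neighbors(current):
                if n in dense and n not in cell_label:
                    cell_label[n] = next_label
                    queue.append(n)
        next_label += 1

    # ASSUMPTION: points in non-dense cells are reported as noise (-1).
    return [cell_label.get(c, -1) for c in cells]


sample = [
    (0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.1, 0.2),  # first group
    (10.0, 10.0), (10.1, 10.1), (10.2, 10.0),        # second group
    (5.0, 5.0),                                      # isolated point
]
labels = grid_density_clusters(sample, eps=1.0, k=2, min_pts=3)
```

In this sketch, the work per occupied cell is bounded by the (2k+1)^D block size, so for fixed k and D the running time grows with the number of points and occupied cells and not with the number of point pairs. A larger k gives smaller cells and a finer approximation of the ε-neighborhood, at the cost of a larger block to scan.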