An Improved Grid Clustering Algorithm for Geographic Data Mining

Detailed bibliography
Title: An Improved Grid Clustering Algorithm for Geographic Data Mining
Authors: Honglei He
Source: Expert Systems. 42
Publisher information: Wiley, 2025.
Publication year: 2025
Description: Grid clustering is a classical clustering approach whose low time complexity makes it suitable for analysing large geographic datasets. However, it is sensitive to the grid division parameter M and the density threshold R, and its clustering accuracy is poor. The article proposes HCA‐BGP, a hybrid clustering algorithm based on grid and division. The algorithm first uses grid clustering to obtain the core part of each cluster, and then uses a division‐based method to obtain the edge part of each cluster. Experiments on simulated datasets and real geographic datasets show that it outperforms existing grid clustering and several other classical algorithms. In terms of clustering accuracy, compared with the classical grid clustering algorithm Clique, the algorithm's clustering F‐value improves by 20.3% on dataset S1, by 81.8% on dataset R15, and by 7.6% on average across the eight geographic datasets. In terms of sensitivity to parameters M and R, compared with Clique, the variance of the algorithm's F‐value is reduced by 89.3% on dataset S1, and the variance of its ARI is reduced by 99.9% on the real geographic dataset Data8. Compared with another grid‐based clustering algorithm, GDB, HCA‐BGP also demonstrates significant advantages.
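The two-stage idea in the description (grid clustering for cluster cores, then a second pass for edge points) can be illustrated with a minimal sketch. This is not the published HCA‐BGP procedure, which this record does not specify. The function name `hybrid_grid_cluster`, the 8-neighbour cell adjacency, and the nearest-centroid rule for edge points are all assumptions made for illustration; only the parameters M (cells per axis, `m`) and R (density threshold, `r`) come from the description.

```python
from collections import defaultdict, deque
from math import dist


def hybrid_grid_cluster(points, m, r):
    """Illustrative two-stage clustering of 2-D points (hypothetical helper).

    points: list of (x, y) tuples
    m: number of grid cells per axis (the grid division parameter M)
    r: minimum number of points for a cell to count as dense (threshold R)
    Returns one integer label per point; -1 only if no dense cell exists.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Cell widths; fall back to 1.0 if all points share a coordinate.
    wx = (max(xs) - x0) / m or 1.0
    wy = (max(ys) - y0) / m or 1.0

    def cell_of(p):
        # Clamp so points on the upper boundary fall into the last cell.
        return (min(int((p[0] - x0) / wx), m - 1),
                min(int((p[1] - y0) / wy), m - 1))

    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[cell_of(p)].append(i)
    dense = {c for c, idx in cells.items() if len(idx) >= r}

    # Stage 1 (grid): join adjacent dense cells into cluster cores via BFS.
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in cell_label:
                        cell_label[nb] = next_label
                        queue.append(nb)
        next_label += 1

    labels = [-1] * len(points)
    core_points = defaultdict(list)
    for c, lab in cell_label.items():
        for i in cells[c]:
            labels[i] = lab
            core_points[lab].append(points[i])

    # Stage 2 (assumed rule): attach each remaining point to the nearest
    # core centroid, in the spirit of a partition-style assignment step.
    centroids = {
        lab: (sum(p[0] for p in pts) / len(pts),
              sum(p[1] for p in pts) / len(pts))
        for lab, pts in core_points.items()
    }
    for i, p in enumerate(points):
        if labels[i] == -1 and centroids:
            labels[i] = min(centroids, key=lambda lab: dist(p, centroids[lab]))
    return labels
```

In this sketch, points in sparse cells are never discarded as noise; they are always attached to some core, which is one plausible reading of "obtaining the edge part" but may differ from the article's actual rule.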
Document type: Article
Language: English
ISSN: 1468-0394, 0266-4720
DOI: 10.1111/exsy.70042
Rights: Wiley Online Library User Agreement
Accession number: edsair.doi...........152d40108d85819e2f9361abdd90c93c
Database: OpenAIRE