Kernel Density Based Spatial Clustering of Applications with Noise


Bibliographic details
Published in: Proceedings of the International Florida Artificial Intelligence Research Society Conference, Volume 38, Issue 1
Main authors: Kalpavruksha, Rohan; Kalpavruksha, Roshan; Cha, Teryn; Cha, Sung-Hyuk
Format: Journal Article
Language: English
Published: LibraryPress@UF, 14 May 2025
ISSN: 2334-0754, 2334-0762
Description
Summary: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a widely used clustering algorithm renowned for its ability to identify clusters of arbitrary shapes and detect noise. However, its reliance on fixed parameters, such as the minimum number of points (MinPts) and the epsilon radius (epsilon), makes it sensitive to variations in sample density. This paper reinterprets DBSCAN as a specific case of kernel density estimation (KDE)-based clustering, where the kernel shape corresponds to a hyper-rectangular pillar or cylindrical kernel, depending on the distance metric. Building on this foundation, we introduce a flexible framework incorporating various kernel functions, including uniform, conical, Epanechnikov, cosine, exponential, and Gaussian kernels, to estimate the density distribution of data points. The threshold values are selected to identify high-density regions by retaining the top 90% of points, while excluding low-density points as noise, thereby enhancing clustering precision. Clusters are adaptively formed by leveraging points within the kernel range, thereby increasing the algorithm's robustness to noise and its adaptability to irregular density patterns. Empirical results demonstrate that the proposed approach outperforms traditional DBSCAN, as evidenced by lower Davies-Bouldin indices and higher silhouette scores. This study highlights the potential of density-driven clustering for practical applications, including social media sentiment analysis, customer segmentation in e-commerce, and medical data analysis, particularly in scenarios involving noise-prone or unevenly distributed datasets.
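
The pipeline outlined in the summary (kernel density estimate, a threshold that keeps the densest 90% of points, then clusters grown from points within kernel range) might be sketched roughly as follows. This is an illustrative reading of the abstract only, not the authors' implementation: the function name `kernel_density_cluster`, the `KERNELS` table, the unnormalized kernel profiles, the Euclidean metric, and the use of the bandwidth as the linking radius are all assumptions made for the example.

```python
import numpy as np

# Unnormalized kernel profiles as functions of u = distance / bandwidth.
# Names and exact forms are assumptions for illustration.
KERNELS = {
    "uniform": lambda u: (u <= 1).astype(float),           # DBSCAN-like flat kernel
    "conical": lambda u: np.clip(1 - u, 0, None),
    "epanechnikov": lambda u: np.clip(1 - u**2, 0, None),
    "gaussian": lambda u: np.exp(-0.5 * u**2),
}

def kernel_density_cluster(X, bandwidth, kernel="gaussian", keep=0.9):
    """Label points by density-connected cluster; -1 marks noise."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances (assumed metric).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Kernel density score of each point: sum of kernel weights over all points.
    density = KERNELS[kernel](dist / bandwidth).sum(axis=1)
    # Retain roughly the densest `keep` fraction; the rest is treated as noise.
    threshold = np.quantile(density, 1 - keep)
    dense = density >= threshold

    labels = np.full(n, -1)
    cluster = 0
    for seed in range(n):
        if not dense[seed] or labels[seed] != -1:
            continue
        # Grow a cluster by repeatedly linking unlabeled dense points
        # that lie within one bandwidth of a point already in the cluster.
        labels[seed] = cluster
        stack = [seed]
        while stack:
            i = stack.pop()
            nbrs = np.flatnonzero(dense & (dist[i] <= bandwidth) & (labels == -1))
            labels[nbrs] = cluster
            stack.extend(nbrs.tolist())
        cluster += 1
    return labels, density
```

With the `"uniform"` kernel the density score reduces to a neighbor count within the bandwidth, which mirrors the abstract's point that DBSCAN is a special case; the difference in this sketch is that the cutoff comes from a quantile of the scores rather than from a fixed MinPts.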
DOI: 10.32473/flairs.38.1.138998