Kernel Density Based Spatial Clustering of Applications with Noise


Bibliographic details
Published in: Proceedings of the International Florida Artificial Intelligence Research Society Conference Vol. 38; No. 1
Main authors: Kalpavruksha, Rohan; Kalpavruksha, Roshan; Cha, Teryn; Cha, Sung-Hyuk
Format: Journal Article
Language: English
Published: LibraryPress@UF, 14 May 2025
ISSN:2334-0754, 2334-0762
Online access: Full text
Description
Summary: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a widely used clustering algorithm renowned for its ability to identify clusters of arbitrary shapes and detect noise. However, its reliance on fixed parameters, such as the minimum number of points (MinPts) and the epsilon radius (epsilon), makes it sensitive to variations in sample density. This paper reinterprets DBSCAN as a specific case of kernel density estimation (KDE)-based clustering, where the kernel shape corresponds to a hyper-rectangular pillar or cylindrical kernel, depending on the distance metric. Building on this foundation, we introduce a flexible framework incorporating various kernel functions, including uniform, conical, Epanechnikov, cosine, exponential, and Gaussian kernels, to estimate the density distribution of data points. The threshold values are selected to identify high-density regions by retaining the top 90% of points, while excluding low-density points as noise, thereby enhancing clustering precision. Clusters are adaptively formed by leveraging points within the kernel range, thereby increasing the algorithm's robustness to noise and its adaptability to irregular density patterns. Empirical results demonstrate that the proposed approach outperforms traditional DBSCAN, as evidenced by lower Davies-Bouldin indices and higher silhouette scores. This study highlights the potential of density-driven clustering for practical applications, including social media sentiment analysis, customer segmentation in e-commerce, and medical data analysis, particularly in scenarios involving noise-prone or unevenly distributed datasets.
DOI:10.32473/flairs.38.1.138998
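
The pipeline outlined in the summary (kernel density estimate, a threshold that keeps the densest 90% of points, then clusters grown from points within kernel range) might be sketched roughly as below. This is an illustrative reading of the summary only, not the authors' implementation. The function name `kernel_density_cluster`, the bandwidth parameter `h`, the unnormalized kernel formulas, the Euclidean metric, and the choice to link dense points lying within distance `h` of each other are all assumptions introduced here. The sketch also leaves every point below the threshold as noise and makes no attempt to attach border points to a cluster.

```python
import numpy as np

# Unnormalized kernel profiles on the scaled distance u = d / h (assumed forms).
KERNELS = {
    "uniform": lambda u: (u <= 1.0).astype(float),
    "conical": lambda u: np.clip(1.0 - u, 0.0, None),
    "epanechnikov": lambda u: np.clip(1.0 - u**2, 0.0, None),
    "cosine": lambda u: np.where(u <= 1.0, np.cos(0.5 * np.pi * u), 0.0),
    "exponential": lambda u: np.exp(-u),
    "gaussian": lambda u: np.exp(-0.5 * u**2),
}


def kernel_density_cluster(X, h, kernel="gaussian", keep=0.9):
    """Hypothetical sketch: label points by density-thresholded linking.

    Returns (labels, density); label -1 marks points treated as noise.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances (O(n^2) memory; fine for small examples).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Kernel density score of each point: sum of kernel weights over all points.
    density = KERNELS[kernel](dist / h).sum(axis=1)
    # Keep roughly the densest `keep` fraction; the rest is treated as noise.
    threshold = np.quantile(density, 1.0 - keep)
    dense = density >= threshold

    labels = np.full(n, -1)
    cluster = 0
    for seed in np.flatnonzero(dense):
        if labels[seed] != -1:
            continue  # already absorbed into an earlier cluster
        labels[seed] = cluster
        stack = [seed]
        while stack:
            i = stack.pop()
            # Unlabelled dense points within distance h of point i (assumed range).
            nbrs = np.flatnonzero(dense & (dist[i] <= h) & (labels == -1))
            labels[nbrs] = cluster
            stack.extend(nbrs.tolist())
        cluster += 1
    return labels, density


# Usage: two small grids of points plus one isolated point.
grid = np.array([[i * 0.5, j * 0.5] for i in range(3) for j in range(3)])
X = np.vstack([grid, grid + 10.0, [[30.0, -30.0]]])
labels, density = kernel_density_cluster(X, h=1.0)
```

With the `uniform` kernel the density score reduces to a count of neighbors within `h`, which is the sense in which a fixed count threshold resembles DBSCAN's MinPts test; swapping in the other kernels weights nearer neighbors more heavily.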