A Topological-Indicators-Based k-Means Clustering Algorithm and Its Application in Time Series Data: A Case Study on Sea Level Variability in Peninsular Malaysia


Bibliographic Details
Published in: IEEE Access, Vol. 13, pp. 46514-46533
Main Authors: Lin, Zixin; Zulkepli, Nur Fariha Syaqina; Bin Mohd Kasihmuddin, Mohd Shareduwan; Gobithaasan, Rudrusamyr
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2169-3536
Description
Summary: Traditional k-means clustering is widely used to analyze regional and temporal variations in time series data, such as sea levels. However, its accuracy can be affected by limitations, particularly when applied to datasets with mixed groups or significant noise. In this study, we analyzed monthly sea level data derived from daily time series at 14 tide gauge stations along the coastline of Peninsular Malaysia. To enhance traditional k-means clustering, we propose a hybrid approach that combines clustering techniques with topological data analysis (TDA). Specifically, we integrate k-means and its variant, k-means++, with persistent homology, the primary tool in TDA, to capture topological insights from the datasets. The proposed approach clusters the 14 tide gauge stations based on predefined topological features, and the probability of data points from each station belonging to specific clusters is computed. The results demonstrate that our approach significantly improves the performance of traditional k-means clustering by incorporating topological information, compared to using clustering without such insights.
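The general idea in the summary (derive topological features per series, then cluster on them with k-means++) can be illustrated with a minimal sketch. This is not the authors' method: the record does not say which filtration or which "predefined topological features" the paper uses. The sketch assumes 0-dimensional persistent homology of the sublevel-set filtration of each series, takes total and maximum bar lifetime as two hypothetical features, and runs on synthetic sine series rather than tide gauge data. All function names are illustrative.

```python
import math
import random

def sublevel_persistence(series):
    """Finite 0-dimensional (birth, death) pairs of the sublevel-set
    filtration of a 1-D series. The infinite bar of the global minimum
    is omitted, as are zero-length bars."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])
    parent = [None] * n   # None means "not yet in the filtration"
    birth = {}            # root index -> birth value of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in order:       # add samples from lowest to highest value
        parent[i] = i
        birth[i] = series[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at this value.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if series[i] > birth[young]:
                    pairs.append((birth[young], series[i]))
                parent[young] = old
    return pairs

def topological_features(series):
    """Hypothetical feature vector: (total persistence, longest bar)."""
    lifetimes = [d - b for b, d in sublevel_persistence(series)]
    return (sum(lifetimes), max(lifetimes, default=0.0))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp(points, k, iters=100, seed=0):
    """Lloyd's k-means with k-means++ seeding, in pure Python."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Sample the next center with probability proportional to the
        # squared distance to the nearest center chosen so far.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in data (not real sea level records): three low-amplitude
# and three high-amplitude oscillations with different phases.
calm = [[0.1 * math.sin(0.5 * t + s) for t in range(48)] for s in range(3)]
rough = [[1.0 * math.sin(0.5 * t + s) for t in range(48)] for s in range(3)]
features = [topological_features(s) for s in calm + rough]
labels, centers = kmeans_pp(features, k=2)
print(labels)
```

Clustering on the feature tuples rather than on the raw samples means series with different phases but a similar oscillation structure end up close together. A real analysis would more likely rely on an established persistent homology library and on the feature set defined in the paper itself.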
DOI: 10.1109/ACCESS.2025.3548558