AD‐KMeans: A Novel Clustering Algorithm for Fraud Detection in Imbalanced Datasets

Bibliographic details
Published in:	Applied computational intelligence and soft computing Volume 2025; Issue 1
Main author:	Ullah, Mohammad Aman
Format:	Journal Article
Language:	English
Published:	Wiley 01.01.2025
ISSN:	1687-9724, 1687-9732

Description
Summary:	Credit card fraud detection remains a critical challenge due to the rarity and evolving nature of fraudulent transactions. Traditional clustering methods, such as K‐means, are limited by fixed cluster numbers and sensitivity to outliers, often missing small or irregular fraud patterns. This research introduces an adaptive clustering algorithm (AD‐KMeans) that dynamically adjusts the number of clusters based on variance and density thresholds, enabling better identification of hidden fraudulent activity. Using the Kaggle Credit Card Fraud Detection Dataset, the proposed method is evaluated against standard K‐means using metrics such as accuracy, precision, recall, F1‐score, silhouette score, Davies–Bouldin index (DBI), and the Calinski–Harabasz index (CHI). Experimental results show that compared to state‐of‐the‐art methods, the proposed AD‐KMeans achieves the highest fraud recall (91.2%) and F1‐score (87.6%), improving recall by 22.3 percentage points over the best hybrid model (hybrid DNN + clustering, 68.9%) and F1‐score by 11.5 points (from 76.1% to 87.6%). Moreover, it identifies distinct fraud‐prone clusters and adapts effectively to data structure variations. These findings highlight the algorithm’s potential as a robust unsupervised approach for improving fraud detection in highly imbalanced financial datasets.
DOI:	10.1155/acis/2070857
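
Illustrative note: the summary describes a K‐means variant whose number of clusters grows according to variance and density thresholds, but it does not state the actual rules. The sketch below is not the AD‐KMeans algorithm from the article. It is a minimal, assumption-laden illustration of the general idea: run plain K‐means, then split any cluster whose mean squared distance to its centroid exceeds a threshold. The names `adaptive_kmeans`, `var_threshold`, `min_size`, and `max_k` are hypothetical, and the minimum-size guard is only a crude stand-in for a density threshold.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def lloyd(points, centroids, n_iter=50):
    """Plain K-means (Lloyd's algorithm) from the given starting centroids."""
    for _ in range(n_iter):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def adaptive_kmeans(points, k_init=2, var_threshold=1.0, min_size=2, max_k=10, seed=0):
    """Hypothetical adaptive variant: split clusters that are too spread out.

    Assumption (not from the article): a cluster is split when its mean
    squared distance to the centroid exceeds var_threshold and it holds at
    least 2 * min_size points. The split seeds a new centroid at the
    cluster's farthest member.
    """
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k_init)]
    clusters = []
    for _ in range(max_k):  # bounded number of refinement rounds
        centroids, clusters = lloyd(points, centroids)
        # Drop centroids that attracted no points.
        kept = [(c, m) for c, m in zip(centroids, clusters) if m]
        centroids = [c for c, _ in kept]
        clusters = [m for _, m in kept]
        new_centroids = []
        for c, members in zip(centroids, clusters):
            variance = sum(sq_dist(p, c) for p in members) / len(members)
            can_grow = len(centroids) + len(new_centroids) < max_k
            if variance > var_threshold and len(members) >= 2 * min_size and can_grow:
                new_centroids.append(max(members, key=lambda p: sq_dist(p, c)))
        if not new_centroids:
            break
        centroids = centroids + new_centroids
    return centroids, clusters


# Toy data: two larger groups plus one small, distant group.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (21, 0)]
centroids, clusters = adaptive_kmeans(points, k_init=1, var_threshold=1.0, min_size=1)
smallest = min(clusters, key=len)  # the small, isolated group
```

On this toy data the procedure starts from a single cluster and keeps splitting until every cluster is tight, so the small distant group ends up isolated in its own cluster. This mirrors, in a very simplified way, the summary's point that a fixed cluster count can miss small or irregular patterns. Real transaction data would also need feature scaling and carefully chosen thresholds, which this sketch does not address.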