AD‐KMeans: A Novel Clustering Algorithm for Fraud Detection in Imbalanced Datasets
| Published in: | Applied computational intelligence and soft computing Vol. 2025; no. 1 |
|---|---|
| Main Author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | Wiley, 01.01.2025 |
| ISSN: | 1687-9724, 1687-9732 |
| Summary: | Credit card fraud detection remains a critical challenge due to the rarity and evolving nature of fraudulent transactions. Traditional clustering methods, such as K‐means, are limited by fixed cluster numbers and sensitivity to outliers, often missing small or irregular fraud patterns. This research introduces an adaptive clustering algorithm (AD‐KMeans) that dynamically adjusts the number of clusters based on variance and density thresholds, enabling better identification of hidden fraudulent activity. Using the Kaggle Credit Card Fraud Detection Dataset, the proposed method is evaluated against standard K‐means using metrics such as accuracy, precision, recall, F1‐score, silhouette score, Davies–Bouldin index (DBI), and the Calinski–Harabasz index (CHI). Experimental results show that compared to state‐of‐the‐art methods, the proposed AD‐KMeans achieves the highest fraud recall (91.2%) and F1‐score (87.6%), improving recall by 22.3 percentage points over the best hybrid model (hybrid DNN + clustering, 68.9%) and F1‐score by 11.5 points (from 76.1% to 87.6%). Moreover, it identifies distinct fraud‐prone clusters and adapts effectively to data structure variations. These findings highlight the algorithm’s potential as a robust unsupervised approach for improving fraud detection in highly imbalanced financial datasets. |
|---|---|
| DOI: | 10.1155/acis/2070857 |
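
The summary describes adjusting the number of clusters from variance and density thresholds but does not give the procedure. The sketch below is one possible reading of that idea, not the article's AD‐KMeans: a cluster is split in two while its variance exceeds a threshold, and a split is accepted only if both halves keep a minimum number of points. The names `adaptive_kmeans`, `var_threshold`, `min_size`, and `max_k` are hypothetical, the minimum size is only a crude stand-in for a density criterion, and the farthest-first initialization is an assumption chosen to keep the example deterministic.

```python
from statistics import fmean


def _dist2(a, b):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(fmean(col) for col in zip(*cluster))


def variance(cluster):
    """Mean squared distance of the cluster's points to its centroid."""
    c = _centroid(cluster)
    return fmean([_dist2(p, c) for p in cluster])


def _farthest_first(points, k):
    """Deterministic init (assumption): start at points[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(_dist2(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns the non-empty clusters as lists of points."""
    centers = _farthest_first(points, k)
    clusters = [list(points)]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: _dist2(p, centers[i]))
            clusters[nearest].append(p)
        clusters = [c for c in clusters if c]  # drop empty clusters
        new_centers = [_centroid(c) for c in clusters]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return clusters


def adaptive_kmeans(points, k_init=2, var_threshold=1.0, min_size=5, max_k=10):
    """Hypothetical adaptive scheme: split any cluster whose variance exceeds
    var_threshold, as long as both halves keep at least min_size points and
    the total number of clusters stays within max_k."""
    pending = kmeans(points, k_init)
    final = []
    while pending:
        cluster = pending.pop()
        total = len(final) + len(pending) + 1  # clusters that currently exist
        if (variance(cluster) > var_threshold
                and len(cluster) >= 2 * min_size
                and total < max_k):
            halves = kmeans(cluster, 2)
            if len(halves) == 2 and min(len(h) for h in halves) >= min_size:
                pending.extend(halves)  # accept the split and re-examine both halves
                continue
        final.append(cluster)  # compact enough, or not safely splittable
    return final


if __name__ == "__main__":
    # Toy data (not the Kaggle dataset): three compact 3x3 grids of points.
    offsets = (-0.5, 0.0, 0.5)
    data = [(cx + dx, cy + dy)
            for cx, cy in [(0, 0), (10, 0), (0, 30)]
            for dx in offsets for dy in offsets]
    found = adaptive_kmeans(data, k_init=1, var_threshold=1.0, min_size=3)
    print(sorted(len(c) for c in found))
```

Starting from a single cluster, the loop keeps splitting until every remaining cluster is either compact or too small to divide, so the final count follows from the data rather than from a fixed k. Farthest-first seeding is itself sensitive to outliers, so a real fraud dataset would likely need a different initialization and thresholds tuned on the data.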