AD‐KMeans: A Novel Clustering Algorithm for Fraud Detection in Imbalanced Datasets

Bibliographic Details
Published in: Applied Computational Intelligence and Soft Computing, Vol. 2025, no. 1
Main Author: Ullah, Mohammad Aman
Format: Journal Article
Language: English
Published: Wiley 01.01.2025
ISSN: 1687-9724, 1687-9732
Description
Summary: Credit card fraud detection remains a critical challenge because fraudulent transactions are rare and their patterns evolve. Traditional clustering methods, such as K‐means, are limited by a fixed number of clusters and by sensitivity to outliers, so they often miss small or irregular fraud patterns. This research introduces an adaptive clustering algorithm (AD‐KMeans) that dynamically adjusts the number of clusters based on variance and density thresholds, enabling better identification of hidden fraudulent activity. Using the Kaggle Credit Card Fraud Detection Dataset, the proposed method is evaluated against standard K‐means using accuracy, precision, recall, F1‐score, silhouette score, the Davies–Bouldin index (DBI), and the Calinski–Harabasz index (CHI). Experimental results show that, compared with state‐of‐the‐art methods, AD‐KMeans achieves the highest fraud recall (91.2%) and F1‐score (87.6%), improving recall by 22.3 percentage points over the best hybrid model (hybrid DNN + clustering, 68.9%) and F1‐score by 11.5 points (from 76.1% to 87.6%). Moreover, it identifies distinct fraud‐prone clusters and adapts effectively to variations in data structure. These findings highlight the algorithm’s potential as a robust unsupervised approach for improving fraud detection in highly imbalanced financial datasets.
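The summary does not specify how AD‐KMeans applies its variance and density thresholds, so the following is only a minimal sketch of the general idea, not the author's algorithm. It assumes a simple split rule: run ordinary K‐means, then split any cluster whose mean squared distance to its center exceeds a variance threshold, provided the cluster holds at least a minimum number of points (a crude stand-in for a density check). The names `adaptive_kmeans`, `var_threshold`, `min_size`, `max_k`, and `demo_points` are hypothetical and do not come from the article.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def variance(group, center):
    """Mean squared distance of a cluster's points to its center."""
    return sum(sq_dist(p, center) for p in group) / len(group)

def lloyd(points, centers, iters=20):
    """Plain K-means (Lloyd) iterations from the given starting centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        groups = [g for g in groups if g]   # drop clusters that lost all points
        centers = [mean(g) for g in groups]
    return centers, groups

def adaptive_kmeans(points, k=1, var_threshold=1.0, min_size=4, max_k=10, seed=0):
    """Assumed split rule (not from the article): repeatedly split the most
    spread-out cluster that exceeds var_threshold and has >= min_size points."""
    rng = random.Random(seed)
    centers, groups = lloyd(points, rng.sample(points, k))
    for _ in range(max_k - k):              # at most one split per round
        candidates = [i for i, g in enumerate(groups)
                      if len(g) >= min_size
                      and variance(g, centers[i]) > var_threshold]
        if not candidates:
            break
        i = max(candidates, key=lambda c: variance(groups[c], centers[c]))
        # Seed the split with the point farthest from the old center and the
        # point farthest from that one, then re-run K-means on all points.
        a = max(groups[i], key=lambda p: sq_dist(p, centers[i]))
        b = max(groups[i], key=lambda p: sq_dist(p, a))
        centers, groups = lloyd(points, centers[:i] + centers[i + 1:] + [a, b])
    return centers, groups

def demo_points(seed=1):
    """Synthetic data: two large blobs plus one small, distant blob."""
    rng = random.Random(seed)
    blobs = [((0.0, 0.0), 40), ((10.0, 0.0), 40), ((30.0, 5.0), 5)]
    return [(cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
            for (cx, cy), n in blobs for _ in range(n)]

centers, groups = adaptive_kmeans(demo_points(), k=1)
print(len(centers), sorted(len(g) for g in groups))
```

Starting from a single cluster, the sketch keeps splitting until every cluster is compact, so the small distant group ends up isolated rather than absorbed into a larger cluster. The synthetic data stands in for real transactions; applying anything like this to the Kaggle dataset would require feature scaling and thresholds chosen for that data.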
DOI: 10.1155/acis/2070857