Generative hybrid models for fraud detection in auto insurance with a comparative analysis of VAE, GAN, and diffusion approaches
| Published in: | Discover Artificial Intelligence, Vol. 5, No. 1, pp. 313 - 23 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Cham, Springer International Publishing, 01.12.2025, Springer Nature B.V, Springer |
| Subjects: | |
| ISSN: | 2731-0809, 2731-0809 |
| Summary: | Fraud claim detection in auto insurance remains a vital yet complex challenge, mainly because of imbalanced data sets, non-linear feature interactions, and the need for explainable predictions. While traditional Machine Learning (ML) approaches show promise, they frequently suffer from poor generalization, limited interpretability, and inadequate treatment of rare fraudulent cases. The present paper proposes a new hybrid approach that combines generative models, namely Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), and Diffusion Models (DMs), with an ensemble of classifiers including eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Light Gradient Boosting Machine (Light GBM), coupled with Isolation Forest (IF) for anomaly detection and oversampling techniques (SMOTE and ADASYN) to improve class balance. In total, 18 hybrid combinations were developed and evaluated on classification performance (AUC-ROC, Accuracy, Precision, Recall, F1-score), probabilistic calibration (Brier Score and Log Loss), and stochastic stability (Monte Carlo Variance and Bootstrap Variance). The experimental findings, supported by graphical analysis based on radar plots, ROC curves, 3D metric visualization, and SHAP explainability, confirm that DM coupled with XGBoost and SMOTE (DM_XGBoost_SMOTE) and DM with Light GBM and SMOTE (DM_Light GBM_SMOTE) outperform the alternative combinations. In particular, DM_XGBoost_SMOTE achieves a well-balanced compromise between accuracy, confidence calibration, and robustness. This work underlines the effectiveness of diffusion-based hybrid models in fraud detection and paves the way for their deployment in high-risk, real-world insurance environments. |
| DOI: | 10.1007/s44163-025-00574-5 |
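
The summary mentions 18 hybrid combinations and two calibration metrics without spelling out how either is formed. The sketch below is one plausible reading, not the authors' code: it assumes the 18 variants are the full grid of 3 generative models × 3 classifiers × 2 oversamplers, with labels following the `DM_XGBoost_SMOTE` pattern quoted in the summary, and it computes the Brier score and log loss from their standard definitions. The function names and the toy data are hypothetical and exist only for illustration.

```python
from itertools import product
from math import log

# Component names come from the summary; the grid structure is an assumption.
GENERATORS = ["VAE", "GAN", "DM"]
CLASSIFIERS = ["XGBoost", "RF", "Light GBM"]
SAMPLERS = ["SMOTE", "ADASYN"]


def hybrid_labels():
    """Return one label per generator/classifier/oversampler combination."""
    return [f"{g}_{c}_{s}" for g, c, s in product(GENERATORS, CLASSIFIERS, SAMPLERS)]


def brier_score(y_true, p_pred):
    """Mean squared gap between predicted fraud probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total -= y * log(p) + (1 - y) * log(1 - p)
    return total / len(y_true)


labels = hybrid_labels()
print(len(labels))  # 18 = 3 generators x 3 classifiers x 2 oversamplers

# Toy labels (1 = fraud) and predicted probabilities, invented for illustration.
y_true = [0, 0, 1, 0, 1]
p_pred = [0.1, 0.2, 0.8, 0.3, 0.6]
print(round(brier_score(y_true, p_pred), 4))
print(round(log_loss(y_true, p_pred), 4))
```

Lower values are better for both metrics. The Brier score is bounded between 0 and 1, whereas log loss is unbounded and penalizes confident wrong predictions much more heavily, which is a reason to report the two together when assessing calibration.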