Improving SMOTE via fusing conditional VAE for data-adaptive noise filtering Improving SMOTE via fusing conditional VAE for data-adaptive noise filtering
Recent advances in a generative neural network model extend the development of data augmentation methods. However, the augmentation methods based on the modern generative models fail to achieve notable improvement in class imbalance data compared to the conventional model, Synthetic Minority Oversam...
Uloženo v:
| Vydáno v: | Applied intelligence (Dordrecht, Netherlands) Ročník 55; číslo 12; s. 841 |
|---|---|
| Hlavní autoři: | , , |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
New York
Springer US
01.08.2025
Springer Nature B.V |
| Témata: | |
| ISSN: | 0924-669X, 1573-7497 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | Recent advances in a generative neural network model extend the development of data augmentation methods. However, the augmentation methods based on the modern generative models fail to achieve notable improvement in class imbalance data compared to the conventional model, Synthetic Minority Oversampling Technique (SMOTE). We investigate the problem of the generative model for imbalanced classification and introduce a framework to enhance the SMOTE algorithm using Variational Autoencoders (VAE s). Our approach systematically quantifies the density of data points in a low-dimensional latent space using the VAE, simultaneously incorporating information on class labels and classification difficulty. Then, the data points potentially degrading the augmentation are systematically excluded, and the neighboring observations are directly augmented on the data space. Empirical studies on several imbalanced datasets represent that this simple process innovatively improves the conventional SMOTE algorithm over the deep learning models. Consequently, we conclude that the selection of minority data and the interpolation in the data space are beneficial for imbalanced classification problems with a relatively small number of data points. |
|---|---|
| Bibliografie: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 0924-669X 1573-7497 |
| DOI: | 10.1007/s10489-025-06692-y |