Image forgery detection by combining Visual Transformer with Variational Autoencoder Network
| Published in: | Applied Soft Computing, Vol. 165, p. 112068 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.11.2024 |
| ISSN: | 1568-4946 |
| Summary: | Recently, applications and artificial intelligence tools used for image manipulation have become quite successful. As a result, the manipulation of personal data can lead to problems of insurmountable magnitude. Such problems not only put personal data at risk, but also lead to unethical practices with potentially irreversible negative consequences. For this reason, the reliability of image and video data is highly questionable. To address this challenging problem, we introduce a Visual Transformer based Variational Autoencoder Network (ViT-VAE Net) model. The model combines the Visual Transformer, one of the state-of-the-art architectures, with a Variational Autoencoder structure. The proposed model is much more effective than models developed with the classical Convolutional Neural Network (CNN). Unlike CNN-based models, it can operate on images of any size without being bound to a standard image resolution. In addition, thanks to the self-attention mechanism of the Visual Transformer architecture, manipulations on an image are detected more easily than with a CNN. The ViT-VAE Net model was trained on a large dataset and tested on 4 different datasets. With a success rate of 67 % on the training dataset, the model provided promising results. Very high rates were also obtained on the test datasets.
•IFDwT is a model developed to detect and localize manipulations made on images. •The model is based on the state-of-the-art Visual Transformer architecture. •The model can make predictions on images of any size. •The model does not need many computational units. |
|---|---|
| DOI: | 10.1016/j.asoc.2024.112068 |
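The abstract's two architectural points — patch-based self-attention and a VAE latent bottleneck — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the record gives no code or hyperparameters); the patch size, latent dimension, and random projections below are illustrative assumptions. It shows why a patch-token pipeline, unlike a fixed-input CNN, accepts any image resolution: larger images simply yield more tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(img, p=4):
    # Split an (H, W) image into flattened p x p patch tokens.
    # Any H, W divisible by p works; the token count varies with size.
    H, W = img.shape
    patches = img.reshape(H // p, p, W // p, p).transpose(0, 2, 1, 3)
    return patches.reshape(-1, p * p)

def self_attention(x):
    # Single-head scaled dot-product self-attention. A real ViT
    # learns Q/K/V projections; here they are omitted for brevity.
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)             # rows sum to 1
    return w @ x

def vae_latent(x, d=8):
    # Toy VAE head: pool tokens, project to latent mean/log-variance,
    # and sample with the reparameterization trick (random, untrained
    # projection weights -- purely illustrative).
    pooled = x.mean(axis=0)
    W_mu = rng.normal(size=(x.shape[1], d)) * 0.1
    W_lv = rng.normal(size=(x.shape[1], d)) * 0.1
    mu, logvar = pooled @ W_mu, pooled @ W_lv
    return mu + np.exp(0.5 * logvar) * rng.normal(size=d)

for size in (16, 32):          # two different input resolutions
    tokens = patchify(rng.normal(size=(size, size)))
    z = vae_latent(self_attention(tokens))
    print(size, tokens.shape, z.shape)
```

The same pipeline handles both resolutions: the 16 x 16 image produces 16 patch tokens and the 32 x 32 image produces 64, while the latent vector stays fixed at 8 dimensions, mirroring the resolution-independence the abstract claims for the ViT-VAE design.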