Effective image compression using transformer and residual network for balanced handling of high and low-frequency information
| Published in: | PLoS ONE, Vol. 20, No. 10, p. e0333376 |
|---|---|
| Main authors: | , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Public Library of Science (PLoS), 3 October 2025 |
| ISSN: | 1932-6203 |
| Summary: | Image compression has advanced significantly in recent years through end-to-end deep-learning approaches. The Transformer network, with its self-attention mechanism, efficiently captures high-frequency features during image compression, but it does not capture the low-frequency information in the image well. To address this issue, the paper introduces a novel end-to-end autoencoder architecture for image compression based on a Transformer and a residual network. The method, called Transformer and Residual Network (TRN), aims to capture essential image content while reducing data size. TRN employs a dual network, comprising a self-attention pathway and a residual-network pathway, designed as a high-low-frequency mixer, so that both high- and low-frequency features are preserved during compression. The model is trained end to end with rate-distortion optimization (RDO). Experimental results show that TRN outperforms the latest deep-learning-based image compression methods, achieving an 8.32% BD-rate (Bjøntegaard delta rate, a measure of rate-distortion performance) improvement on the CLIC dataset. Compared with traditional methods such as JPEG, the proposed method achieves a BD-rate improvement of 70.35% on the CLIC dataset. |
|---|---|
| DOI: | 10.1371/journal.pone.0333376 |
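
The summary states that the model is trained end to end with rate-distortion optimization but does not give the loss itself. As a minimal sketch, assuming the form commonly used in learned image compression (estimated rate in bits per pixel plus a weighted mean squared error), the objective could look like the following. The function names, the `lam` trade-off weight, and the toy values are illustrative assumptions and are not taken from the paper.

```python
import math


def bits_per_pixel(likelihoods, num_pixels):
    """Estimated rate: sum of -log2(p) over latent symbols, divided by pixel count."""
    total_bits = -sum(math.log2(p) for p in likelihoods)
    return total_bits / num_pixels


def mean_squared_error(original, reconstructed):
    """Distortion: mean squared error between original and decoded pixel values."""
    squared = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    return sum(squared) / len(original)


def rd_loss(original, reconstructed, likelihoods, lam):
    """Assumed rate-distortion objective: rate + lam * distortion.

    `lam` (hypothetical name) sets the trade-off: a larger value favours
    reconstruction quality, a smaller value favours a lower bit rate.
    """
    rate = bits_per_pixel(likelihoods, len(original))
    distortion = mean_squared_error(original, reconstructed)
    return rate + lam * distortion


# Toy example: a 4-pixel "image" whose latent is coded with two symbols.
original = [0.0, 0.5, 1.0, 0.25]
reconstructed = [0.0, 0.5, 0.5, 0.25]
likelihoods = [0.5, 0.25]  # entropy-model probabilities of the two symbols

loss = rd_loss(original, reconstructed, likelihoods, lam=4.0)
print(loss)
```

In an actual training loop the likelihoods would come from a learned entropy model and the loss would be minimized by gradient descent over the whole autoencoder; this sketch only shows how the two terms are combined.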