Effective image compression using transformer and residual network for balanced handling of high and low-frequency information

Bibliographic Details
Published in: PLoS ONE, Vol. 20, No. 10, p. e0333376
Main Authors: Hu, Jianhua; Luo, Guixiang; Feng, Xiangfei; Yuan, Zhanjiang; Yang, Jiahui; Nie, Wei
Format: Journal Article
Language: English
Published: United States: Public Library of Science, 03.10.2025
Public Library of Science (PLoS)
ISSN: 1932-6203
Description
Summary: Image compression has made significant progress in recent years through end-to-end deep-learning approaches. The Transformer network, with its self-attention mechanism, efficiently captures high-frequency features during image compression, but it does not capture the low-frequency information in the image well. To address this issue, the paper introduces a novel end-to-end autoencoder architecture for image compression based on a Transformer and a residual network. The method, called Transformer and Residual Network (TRN), aims to capture essential image content while reducing data size. TRN employs a dual network comprising a self-attention pathway and a residual network, designed as a high-low-frequency mixer, so that both high- and low-frequency features are preserved during compression. The model is trained end to end with rate-distortion optimization (RDO). Experimental results show that TRN outperforms the latest deep learning-based image compression methods, achieving an 8.32% BD-rate (Bjøntegaard delta rate) improvement on the CLIC dataset. Compared with traditional methods such as JPEG, the proposed method achieves a BD-rate improvement of 70.35% on the CLIC dataset.
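Note: The BD-rate figures quoted in the summary compare two rate-distortion curves at equal quality. The sketch below illustrates one common way such a figure is computed (a cubic fit of log-bitrate against PSNR, averaged over the overlapping quality range). It is not taken from the article: the function names, the four-point restriction, and the sample points are hypothetical, and the authors' exact evaluation procedure may differ. Under this convention a bitrate saving appears as a negative percentage.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial passing through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_over(xs, ys, lo, hi):
    """Mean value over [lo, hi] of the cubic through four points.

    Simpson's rule integrates a cubic exactly, so three evaluations suffice.
    """
    mid = 0.5 * (lo + hi)
    return (_lagrange(xs, ys, lo)
            + 4.0 * _lagrange(xs, ys, mid)
            + _lagrange(xs, ys, hi)) / 6.0


def bd_rate(anchor, test):
    """Average bitrate change (percent) of `test` relative to `anchor`.

    Each argument is a list of exactly four (bitrate, psnr_db) pairs with
    distinct PSNR values. Negative results mean `test` needs fewer bits.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    a_q = [q for _, q in anchor]
    t_q = [q for _, q in test]
    a_r = [math.log(r) for r, _ in anchor]
    t_r = [math.log(r) for r, _ in test]

    # Compare only over the quality range covered by both curves.
    lo = max(min(a_q), min(t_q))
    hi = min(max(a_q), max(t_q))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in quality")

    diff = _mean_over(t_q, t_r, lo, hi) - _mean_over(a_q, a_r, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Hypothetical curves: the test codec uses half the bits at every quality.
anchor_points = [(0.2, 30.0), (0.4, 33.0), (0.8, 36.0), (1.6, 39.0)]
test_points = [(r / 2.0, q) for r, q in anchor_points]
print(round(bd_rate(anchor_points, test_points), 2))
```

With the made-up points above, the test curve halves the bitrate at every quality level, so the result is a 50% saving.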
DOI: 10.1371/journal.pone.0333376