Effective image compression using transformer and residual network for balanced handling of high and low-frequency information

Bibliographic details
Published in:	PloS one, Volume 20, Issue 10, p. e0333376
Main authors:	Hu, Jianhua; Luo, Guixiang; Feng, Xiangfei; Yuan, Zhanjiang; Yang, Jiahui; Nie, Wei
Format:	Journal Article
Language:	English
Published:	United States: Public Library of Science (PLoS), 3 October 2025
Subjects:	Algorithms; Analysis; Compression; Data compression; Data Compression - methods; Data reduction; Datasets; Deep Learning; Design; Distortion; Entropy; Humans; Image compression; Image processing; Image Processing, Computer-Assisted - methods; Methods; Neural networks; Neural Networks, Computer; China
ISSN:	1932-6203

Description
Abstract:	Image compression has made significant progress through end-to-end deep-learning approaches in recent years. The Transformer network, coupled with self-attention mechanisms, efficiently captures high-frequency features during image compression. However, the Transformer network does not capture the low-frequency information in the image well. To address this issue, the paper introduces a novel end-to-end autoencoder architecture for image compression based on the transformer and residual network. This method, called Transformer and Residual Network (TRN), offers a comprehensive solution for efficient image compression, capturing essential image content while effectively reducing data size. The TRN employs a dual network, comprising a self-attention pathway and a residual network, designed as a high-low-frequency mixer. This dual network can preserve both high- and low-frequency features during image compression. The model is trained end to end with rate-distortion optimization (RDO). Experimental results demonstrate that the proposed TRN method outperforms the latest deep learning-based image compression methods, achieving an 8.32% BD-rate (bit-rate distortion performance) improvement on the CLIC dataset. Compared with traditional methods such as JPEG, the proposed method achieves a BD-rate improvement of 70.35% on the CLIC dataset.
DOI:	10.1371/journal.pone.0333376
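
Note:	The abstract says the model is trained with rate-distortion optimization but does not give the objective. As a rough illustration only, learned image codecs commonly minimize a weighted sum of estimated bit rate and distortion, L = R + lambda * D. The sketch below assumes that generic form, with mean squared error as the distortion; the function names, the trade-off weight `lam`, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def rate_bpp(likelihoods, num_pixels):
    """Estimated rate in bits per pixel: -sum(log2 p) / num_pixels.

    `likelihoods` are the probabilities an entropy model assigns to the
    quantized latent symbols (assumed inputs, not the paper's model).
    """
    return -sum(math.log2(p) for p in likelihoods) / num_pixels


def mse(original, reconstructed):
    """Mean squared error between two equally long pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def rd_loss(likelihoods, original, reconstructed, lam):
    """Generic rate-distortion objective R + lam * D (assumed form)."""
    rate = rate_bpp(likelihoods, len(original))
    distortion = mse(original, reconstructed)
    return rate + lam * distortion


# Toy example: four pixels, two latent symbols.
original = [0.0, 1.0, 0.0, 1.0]
reconstructed = [0.0, 0.5, 0.0, 1.0]
likelihoods = [0.5, 0.25]
loss = rd_loss(likelihoods, original, reconstructed, lam=16.0)
```

A larger `lam` penalizes distortion more heavily and pushes a codec toward higher bit rates; training one model per `lam` value is a common way to trace out a rate-distortion curve.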