Effective image compression using transformer and residual network for balanced handling of high and low-frequency information

Image compression has made significant progress through end-to-end deep-learning approaches in recent years. The Transformer network, coupled with self-attention mechanisms, efficiently captures high-frequency features during image compression. However, the low-frequency information in the image can...

Full description

Saved in:
Bibliographic Details
Published in:PloS one Vol. 20; no. 10; p. e0333376
Main Authors: Hu, Jianhua, Luo, Guixiang, Feng, Xiangfei, Yuan, Zhanjiang, Yang, Jiahui, Nie, Wei
Format: Journal Article
Language:English
Published: United States Public Library of Science 03.10.2025
Public Library of Science (PLoS)
Subjects:
ISSN:1932-6203, 1932-6203
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Image compression has made significant progress through end-to-end deep-learning approaches in recent years. The Transformer network, coupled with self-attention mechanisms, efficiently captures high-frequency features during image compression. However, the low-frequency information in the image cannot be obtained well through the Transformer network. To address this issue, the paper introduces a novel end-to-end autoencoder architecture for image compression based on the transformer and residual network. This method, called Transformer and Residual Network (TRN), offers a comprehensive solution for efficient image compression, capturing essential image content while effectively reducing data size. The TRN employs a dual network, comprising a self-attention pathway and a residual network, intricately designed as a high-low-frequency mixer. This dual-network can preserve both high and low-frequency features during image compression. The end-to-end training of this model employs rate-distortion optimization (RDO methods). Experimental results demonstrate that the proposed TRN method outperforms the latest deep learning-based image compression methods, achieving an impressive 8.32% BD-rate (bit-rate distortion performance) improvement on the CLIC dataset. In comparison to traditional methods like JPEG, the proposed achieves a remarkable BD-rate improvement of 70.35% on the CLIC dataset.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0333376