Frequency-Aware Hierarchical Image Compression for Humans and Machines

To achieve efficient compression for both human vision and machine perception, scalable coding methods have been proposed in recent years. However, existing methods do not fully eliminate the redundancy between features corresponding to different tasks, resulting in suboptimal coding performance. In...

Bibliographic details
Published in: Visual communications and image processing (Online), pp. 1-5
Main authors: Luo, Yue; Zhang, Zixiang; Kuang, Jinhao; Yu, Li
Format: Conference paper
Language: English
Published: IEEE, 8 December 2024
ISSN: 2642-9357
Abstract To achieve efficient compression for both human vision and machine perception, scalable coding methods have been proposed in recent years. However, existing methods do not fully eliminate the redundancy between features corresponding to different tasks, resulting in suboptimal coding performance. In this paper, we propose a frequency-aware hierarchical image compression framework designed for humans and machines. Specifically, we investigate task relationships from a frequency perspective, using only high-frequency (HF) information for machine vision tasks and both HF and low-frequency (LF) features for image reconstruction. In addition, a residual-block-embedded octave convolution module is designed to enhance the information interaction between HF and LF features. A dual-frequency channel-wise entropy model is also applied to exploit the correlation between different tasks, thereby improving multi-task performance. Experiments show that the proposed method offers coding gains of -69.3% to -75.3% on machine vision tasks compared with the relevant benchmarks, and gains of -19.1% over a state-of-the-art scalable image codec in terms of image reconstruction quality.
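The frequency split described in the abstract can be illustrated with a small sketch. This is not the authors' method: a fixed 2x2 average-pool decomposition stands in for the learned octave-convolution features, and the function names (`split_frequencies`, `reconstruct`, `machine_features`) are hypothetical. It only shows the routing idea, where the machine-vision path reads the high-frequency part alone and the reconstruction path combines both parts.

```python
# Illustrative sketch only. A fixed 2x2 average-pool split is an assumed
# stand-in for the learned octave-convolution features in the abstract.


def split_frequencies(image):
    """Split a 2D grid (even height and width) into LF and HF parts.

    LF: half-resolution grid of 2x2 block means (smooth content).
    HF: full-resolution residual left after subtracting the upsampled LF.
    """
    h, w = len(image), len(image[0])
    lf = [
        [
            (image[2 * i][2 * j] + image[2 * i][2 * j + 1]
             + image[2 * i + 1][2 * j] + image[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w // 2)
        ]
        for i in range(h // 2)
    ]
    hf = [[image[y][x] - lf[y // 2][x // 2] for x in range(w)] for y in range(h)]
    return lf, hf


def reconstruct(lf, hf):
    """Human-vision path: add HF to the nearest-neighbour-upsampled LF."""
    return [
        [hf[y][x] + lf[y // 2][x // 2] for x in range(len(hf[0]))]
        for y in range(len(hf))
    ]


def machine_features(hf):
    """Machine-vision path: consume the HF part only (hypothetical stand-in)."""
    return hf


image = [
    [10, 10, 50, 54],
    [10, 10, 50, 54],
    [20, 24, 90, 90],
    [20, 24, 90, 90],
]
lf, hf = split_frequencies(image)
restored = reconstruct(lf, hf)
```

In this toy decomposition the two parts together recover the input, while the HF part alone keeps only local detail. A scalable codec built on that idea could send the HF stream first for analysis tasks and add the LF stream only when a viewable image is needed.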
Author Yu, Li
Kuang, Jinhao
Zhang, Zixiang
Luo, Yue
Author_xml – sequence: 1
  givenname: Yue
  surname: Luo
  fullname: Luo, Yue
  organization: Huazhong University of Science and Technology,Wuhan,China
– sequence: 2
  givenname: Zixiang
  surname: Zhang
  fullname: Zhang, Zixiang
  organization: Huazhong University of Science and Technology,Wuhan,China
– sequence: 3
  givenname: Jinhao
  surname: Kuang
  fullname: Kuang, Jinhao
  organization: Huazhong University of Science and Technology,Wuhan,China
– sequence: 4
  givenname: Li
  surname: Yu
  fullname: Yu, Li
  email: hustlyu@hust.edu.cn
  organization: Huazhong University of Science and Technology,Wuhan,China
ContentType Conference Proceeding
DOI 10.1109/VCIP63160.2024.10849897
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Discipline Applied Sciences
EISBN 9798331529543
EISSN 2642-9357
EndPage 5
ExternalDocumentID 10849897
Genre orig-research
ISICitedReferencesCount 0
IngestDate Wed Feb 12 06:22:40 EST 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 5
ParticipantIDs ieee_primary_10849897
PublicationCentury 2000
PublicationDate 2024-Dec.-8
PublicationDateYYYYMMDD 2024-12-08
PublicationDate_xml – month: 12
  year: 2024
  text: 2024-Dec.-8
  day: 08
PublicationDecade 2020
PublicationTitle Visual communications and image processing (Online)
PublicationTitleAbbrev VCIP
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Codecs
Convolution
Correlation
Entropy
Image coding
image coding for machines
Image reconstruction
learned image compression
Machine vision
Multitasking
Redundancy
scalable image coding
Visual communication
Title Frequency-Aware Hierarchical Image Compression for Humans and Machines
URI https://ieeexplore.ieee.org/document/10849897