Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring

Detailed bibliography
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 5886-5895
Main authors: Kong, Lingshun; Dong, Jiangxin; Ge, Jianjun; Li, Mingqiang; Pan, Jinshan
Format: Conference paper
Language: English
Publication details: IEEE, 2023-06-01
ISSN: 1063-6919
Abstract We present an effective and efficient method that explores the properties of Transformers in the frequency domain for high-quality image deblurring. Our method is motivated by the convolution theorem that the correlation or convolution of two signals in the spatial domain is equivalent to an element-wise product of them in the frequency domain. This inspires us to develop an efficient frequency domain-based self-attention solver (FSAS) to estimate the scaled dot-product attention by an element-wise product operation instead of the matrix multiplication in the spatial domain. In addition, we note that simply using the naive feed-forward network (FFN) in Transformers does not generate good deblurred results. To overcome this problem, we propose a simple yet effective discriminative frequency domain-based FFN (DFFN), where we introduce a gated mechanism in the FFN based on the Joint Photographic Experts Group (JPEG) compression algorithm to discriminatively determine which low- and high-frequency information of the features should be preserved for latent clear image restoration. We formulate the proposed FSAS and DFFN into an asymmetrical network based on an encoder and decoder architecture, where the FSAS is only used in the decoder module for better image deblurring. Experimental results show that the proposed method performs favorably against the state-of-the-art approaches.
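The convolution theorem that the abstract cites as motivation can be checked numerically. The sketch below is a toy 1D illustration only, not the authors' FSAS: this record gives no implementation details, so the function names, the naive DFT, and the use of circular (wrap-around) correlation on real-valued sequences are all assumptions made for the example. It computes the correlation of two short sequences once directly in the spatial domain and once as an element-wise product in the frequency domain, then compares the results.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; inverse when requested."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_correlation_spatial(q, k):
    """Direct circular cross-correlation: c[m] = sum_t q[(t + m) % n] * k[t]."""
    n = len(q)
    return [sum(q[(t + m) % n] * k[t] for t in range(n)) for m in range(n)]


def circular_correlation_frequency(q, k):
    """Same quantity via the convolution theorem (real-valued inputs assumed):
    c = IDFT( DFT(q) * conj(DFT(k)) ), an element-wise product."""
    q_hat, k_hat = dft(q), dft(k)
    product = [a * b.conjugate() for a, b in zip(q_hat, k_hat)]
    return [v.real for v in dft(product, inverse=True)]


# Hypothetical toy inputs standing in for a query and a key sequence.
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 0.0, 2.0]

spatial = circular_correlation_spatial(q, k)
frequency = circular_correlation_frequency(q, k)
max_diff = max(abs(a - b) for a, b in zip(spatial, frequency))
print("largest difference between the two routes:", max_diff)
```

The naive transform here costs O(n^2); with a fast Fourier transform the frequency-domain route costs O(n log n), which is the kind of saving over spatial-domain products that the abstract's efficiency claim is about.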
Author Pan, Jinshan
Ge, Jianjun
Dong, Jiangxin
Kong, Lingshun
Li, Mingqiang
Author_xml – sequence: 1
  givenname: Lingshun
  surname: Kong
  fullname: Kong, Lingshun
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
– sequence: 2
  givenname: Jiangxin
  surname: Dong
  fullname: Dong, Jiangxin
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
– sequence: 3
  givenname: Jianjun
  surname: Ge
  fullname: Ge, Jianjun
  organization: Information Science Academy, China Electronics Technology Group Corporation
– sequence: 4
  givenname: Mingqiang
  surname: Li
  fullname: Li, Mingqiang
  organization: Information Science Academy, China Electronics Technology Group Corporation
– sequence: 5
  givenname: Jinshan
  surname: Pan
  fullname: Pan, Jinshan
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR52729.2023.00570
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9798350301298
EISSN 1063-6919
EndPage 5895
ExternalDocumentID 10204001
Genre orig-research
GroupedDBID 6IE
6IH
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
OCL
RIE
RIL
RIO
ID FETCH-LOGICAL-i204t-b06bb3a949c2565d424623f046d2a2be5491c34a8f05ec6cec25d1649fafa2713
IEDL.DBID RIE
ISICitedReferencesCount 201
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001058542606023&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:56:33 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i204t-b06bb3a949c2565d424623f046d2a2be5491c34a8f05ec6cec25d1649fafa2713
PageCount 10
ParticipantIDs ieee_primary_10204001
PublicationCentury 2000
PublicationDate 2023-June
PublicationDateYYYYMMDD 2023-06-01
PublicationDate_xml – month: 06
  year: 2023
  text: 2023-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211698
Score 2.6424007
SourceID ieee
SourceType Publisher
StartPage 5886
SubjectTerms Computer architecture
Convolution
Frequency estimation
Frequency-domain analysis
Low-level vision
Training
Transform coding
Transformers
Title Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
URI https://ieeexplore.ieee.org/document/10204001
WOSCitedRecordID wos001058542606023
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE