Ensemble Learning-Based Rate-Distortion Optimization for End-to-End Image Compression
End-to-end image compression using trained deep networks as encoding/decoding models has been developed substantially in recent years. Previous work has been limited to using a single encoding/decoding model, whereas we explore the usage of multiple encoding/decoding models as an ensemble. We propose...
| Published in: | IEEE transactions on circuits and systems for video technology, Vol. 31, No. 3, pp. 1193-1207 |
|---|---|
| Main authors: | Wang, Yefei; Liu, Dong; Ma, Siwei; Wu, Feng; Gao, Wen |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 1 March 2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 1051-8215, 1558-2205 |
| Online access: | Full text |
| Abstract | End-to-end image compression using trained deep networks as encoding/decoding models has been developed substantially in recent years. Previous work has been limited to using a single encoding/decoding model, whereas we explore the usage of multiple encoding/decoding models as an ensemble. We propose several methods to obtain multiple models. First, we adopt the boosting strategy to train multiple networks with diversity as an ensemble. Second, we train an ensemble of multiple probability distribution models to reduce the distribution gap for efficient entropy coding. Third, we present a geometric transform-based self-ensemble method. The multiple models can be regarded as multiple coding modes, similar to those in non-deep video coding schemes. We further adopt block-level model/mode selection at the encoder side to pursue rate-distortion optimization, where we use hierarchical block partitioning to improve the adaptation ability. Compared with single-model end-to-end compression, our proposed method improves the compression efficiency significantly, leading to 21% BD-rate reduction on the Kodak dataset, without increasing the decoding complexity. On the other hand, when keeping the same compression efficiency, our method can use much simplified decoding models, where the floating-point operations are reduced by 70%. |
|---|---|
| Author | Wu, Feng Gao, Wen Ma, Siwei Liu, Dong Wang, Yefei |
| Author_xml | – sequence: 1 givenname: Yefei surname: Wang fullname: Wang, Yefei email: wyfei@mail.ustc.edu.cn organization: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China – sequence: 2 givenname: Dong orcidid: 0000-0001-9100-2906 surname: Liu fullname: Liu, Dong email: dongeliu@ustc.edu.cn organization: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China – sequence: 3 givenname: Siwei orcidid: 0000-0002-2731-5403 surname: Ma fullname: Ma, Siwei email: swma@pku.edu.cn organization: National Engineering Laboratory for Video Technology, Peking University, Beijing, China – sequence: 4 givenname: Feng surname: Wu fullname: Wu, Feng email: fengwu@ustc.edu.cn organization: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China – sequence: 5 givenname: Wen surname: Gao fullname: Gao, Wen email: wgao@pku.edu.cn organization: National Engineering Laboratory for Video Technology, Peking University, Beijing, China |
| CODEN | ITCTEM |
| CitedBy_id | crossref_primary_10_1109_TCSVT_2023_3237274 crossref_primary_10_1109_TCSVT_2022_3157074 crossref_primary_10_1109_TCSVT_2023_3300316 crossref_primary_10_1145_3580499 crossref_primary_10_1007_s00530_022_01026_1 crossref_primary_10_1109_TMM_2024_3372352 crossref_primary_10_7780_kjrs_2025_41_1_2 crossref_primary_10_1109_TGRS_2023_3315725 crossref_primary_10_1109_TCSVT_2022_3145024 crossref_primary_10_1109_TCSVT_2021_3104575 crossref_primary_10_1016_j_ipm_2021_102808 crossref_primary_10_1109_TCSVT_2022_3216713 crossref_primary_10_1109_TCSVT_2022_3230843 crossref_primary_10_1016_j_mfglet_2025_06_163 crossref_primary_10_1109_TCSVT_2024_3395481 crossref_primary_10_1109_TCSVT_2021_3082635 crossref_primary_10_1109_TCSVT_2023_3241225 crossref_primary_10_1109_ACCESS_2023_3236086 crossref_primary_10_1145_3652148 crossref_primary_10_1007_s11042_023_15271_7 crossref_primary_10_1109_TCSVT_2022_3231789 crossref_primary_10_1007_s11263_023_01809_7 crossref_primary_10_1109_TCSVT_2021_3133313 crossref_primary_10_1016_j_neucom_2022_08_009 crossref_primary_10_1109_JIOT_2022_3150417 crossref_primary_10_1145_3719011 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
| DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/TCSVT.2020.3000331 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1558-2205 |
| EndPage | 1207 |
| ExternalDocumentID | 10_1109_TCSVT_2020_3000331 9109567 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Key Research and Development Program of China grantid: 2018YFA0701603 funderid: 10.13039/501100002855 – fundername: Natural Science Foundation of China grantid: 61931014; 61772483; 61632001 funderid: 10.13039/501100001809 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 31 |
| ISSN | 1051-8215 |
| IngestDate | Sun Nov 30 05:08:39 EST 2025 Tue Nov 18 22:35:24 EST 2025 Sat Nov 29 01:44:14 EST 2025 Wed Aug 27 02:48:57 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 3 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| ORCID | 0000-0001-9100-2906 0000-0002-2731-5403 |
| PQID | 2498873656 |
| PQPubID | 85433 |
| PageCount | 15 |
| ParticipantIDs | ieee_primary_9109567 crossref_primary_10_1109_TCSVT_2020_3000331 proquest_journals_2498873656 crossref_citationtrail_10_1109_TCSVT_2020_3000331 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-03-01 |
| PublicationDateYYYYMMDD | 2021-03-01 |
| PublicationDate_xml | – month: 03 year: 2021 text: 2021-03-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on circuits and systems for video technology |
| PublicationTitleAbbrev | TCSVT |
| PublicationYear | 2021 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SSID | ssj0014847 |
| Score | 2.5172803 |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 1193 |
| SubjectTerms | Adaptation models Coders Coding Decoding Distortion Encoding-Decoding Ensemble learning Entropy coding Floating point arithmetic Geometric transformation Image coding Image compression Modal choice Optimization Rate-distortion rate-distortion optimization Transforms |
| Title | Ensemble Learning-Based Rate-Distortion Optimization for End-to-End Image Compression |
| URI | https://ieeexplore.ieee.org/document/9109567 https://www.proquest.com/docview/2498873656 |
| Volume | 31 |
| WOSCitedRecordID | wos000626532100028&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1558-2205 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014847 issn: 1051-8215 databaseCode: RIE dateStart: 19910101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |
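
The abstract describes block-level model/mode selection at the encoder side to pursue rate-distortion optimization. The record gives no implementation details, so the following is only a minimal sketch of the general idea, assuming the usual Lagrangian cost J = D + λR and assuming that the distortion and rate of every block under every candidate model are already available. The function name `select_modes`, the array layout, and the toy numbers are hypothetical and not taken from the paper; the hierarchical block partitioning mentioned in the abstract is not modeled here.

```python
import numpy as np


def select_modes(distortion, rate, lam):
    """Choose, per block, the model index with the lowest cost D + lam * R.

    distortion, rate: array-likes of shape (num_blocks, num_models).
    lam: Lagrange multiplier trading rate against distortion (assumed given).
    Returns the chosen model index per block and the summed cost.
    """
    distortion = np.asarray(distortion, dtype=float)
    rate = np.asarray(rate, dtype=float)
    if distortion.shape != rate.shape:
        raise ValueError("distortion and rate must have the same shape")

    # Lagrangian rate-distortion cost of coding each block with each model.
    cost = distortion + lam * rate

    # Independent per-block decision: keep the cheapest model for each block.
    best = np.argmin(cost, axis=1)
    total = cost[np.arange(cost.shape[0]), best].sum()
    return best, total


# Toy example (made-up numbers): 2 blocks, 3 candidate models.
D = np.array([[10.0, 8.0, 9.0],
              [5.0, 6.0, 4.0]])
R = np.array([[10.0, 15.0, 11.0],
              [8.0, 5.0, 12.0]])
modes, total_cost = select_modes(D, R, lam=0.5)
print(modes, total_cost)
```

In this sketch the rate term is assumed to already include any bits needed to signal the chosen model to the decoder; whether and how the paper accounts for that side information is not stated in this record.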