RAFNet: Reparameterizable Across-Resolution Fusion Network for Real-Time Image Semantic Segmentation
Saved in:
| Published in: | IEEE Transactions on Circuits and Systems for Video Technology, Volume 34, Issue 2, pp. 1212-1227 |
|---|---|
| Main authors: | Chen, Lei; Dai, Huhe; Zheng, Yuan |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.02.2024 |
| Subjects: | Coders; Computer networks; Decoding; Electronic devices; encoder-decoder structure; Feature extraction; Feature maps; hardware deployment; Image segmentation; lightweight network; Memory devices; Mobile handsets; Parameterization; Parameters; Real time; Real-time image segmentation; Real-time systems; Semantic segmentation; Semantics; Task analysis; Training |
| ISSN: | 1051-8215 (print), 1558-2205 (electronic) |
| Online access: | Get full text |
| Abstract | The demand to deploy semantic segmentation networks on mobile devices has increased dramatically. However, existing real-time semantic segmentation methods still require a large number of network parameters, which makes them unsuitable for mobile devices with limited memory resources. This is mainly because most existing methods adopt backbone networks (e.g., ResNet-18 and MobileNet) as the encoder. To alleviate this problem, we propose a novel Reparameterizable Channel & Dilation (RCD) block and construct a considerably lightweight yet effective encoder by stacking several RCD blocks according to three guidelines. The proposed encoder not only extracts discriminative feature representations via channel convolutions and dilated convolutions, but also reduces the computational burden while maintaining segmentation accuracy with the help of the re-parameterization technique. Beyond the encoder, we also present a simple but effective decoder that adopts an across-resolution fusion strategy to fuse the multi-scale feature maps generated by the encoder, instead of a bottom-up pathway fusion. With this encoder and decoder, we build a Reparameterizable Across-resolution Fusion Network (RAFNet) for real-time semantic segmentation. Extensive experiments demonstrate that RAFNet achieves a promising trade-off between segmentation accuracy, inference speed, and the number of network parameters. Specifically, with only 0.96M parameters, RAFNet obtains 75.3% mIoU at 107 FPS on the Cityscapes test set and 75.8% mIoU at 195 FPS on the CamVid test set for full-resolution inputs. After quantization and deployment on a Xilinx ZCU104 device, RAFNet achieves favorable segmentation performance with only 1.4 W of power. |
|---|---|
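The key mechanism named in the abstract is structural re-parameterization: train with several parallel convolution branches, then algebraically fold them into a single convolution for inference, so the deployed model keeps the representational benefit of the multi-branch design without its parameter and latency cost. The sketch below illustrates that general mechanism in PyTorch for a hypothetical two-branch block (a dilated 3x3 branch and a 1x1 branch, each followed by BatchNorm). The branch layout, channel counts, dilation rate, and all names such as `ReparamBlock` and `fuse_conv_bn` are illustrative assumptions, not the paper's actual RCD block definition, which is not reproduced in this record.

```python
# Minimal sketch of structural re-parameterization (assumed PyTorch layout,
# not the paper's exact RCD block): two parallel Conv+BN branches at training
# time are folded into a single dilated 3x3 convolution for inference.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """Fold a BatchNorm (running statistics) into the preceding conv's weight and bias."""
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std                          # per-output-channel scale
    weight = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = (conv.bias if conv.bias is not None else 0) - bn.running_mean
    bias = bias * scale + bn.bias
    return weight, bias


class ReparamBlock(nn.Module):
    """Two parallel branches while training, a single fused conv after reparameterize()."""

    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.dilation = dilation
        self.dil3x3 = nn.Conv2d(channels, channels, 3, padding=dilation,
                                dilation=dilation, bias=False)
        self.bn3x3 = nn.BatchNorm2d(channels)
        self.pw1x1 = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn1x1 = nn.BatchNorm2d(channels)
        self.fused = None  # filled in by reparameterize()

    def forward(self, x):
        if self.fused is not None:                   # deploy mode: one conv
            return F.relu(self.fused(x))
        return F.relu(self.bn3x3(self.dil3x3(x)) + self.bn1x1(self.pw1x1(x)))

    @torch.no_grad()
    def reparameterize(self):
        w3, b3 = fuse_conv_bn(self.dil3x3, self.bn3x3)
        w1, b1 = fuse_conv_bn(self.pw1x1, self.bn1x1)
        # Embed the 1x1 kernel as the centre tap of a 3x3 kernel so both
        # branches share one kernel shape; the centre tap is unaffected by dilation.
        w1 = F.pad(w1, [1, 1, 1, 1])
        ch = w3.shape[0]
        self.fused = nn.Conv2d(ch, ch, 3, padding=self.dilation,
                               dilation=self.dilation, bias=True).to(w3.device)
        self.fused.weight.copy_(w3 + w1)
        self.fused.bias.copy_(b3 + b1)
```

After training, calling `reparameterize()` and comparing outputs in eval mode (so BatchNorm uses its running statistics) should leave the block numerically unchanged up to floating-point error, which can be checked with `torch.allclose` on a random input.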
| Authors | Lei Chen (32056105@mail.imu.edu.cn), College of Electronic Information Engineering, Inner Mongolia University, Hohhot, China; Huhe Dai (daihuhe@imu.edu.cn), College of Electronic Information Engineering, Inner Mongolia University, Hohhot, China; Yuan Zheng (zhengyuan@imu.edu.cn, ORCID 0000-0002-7632-6846), National and Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, College of Computer Science, Inner Mongolia University, Hohhot, China |
| CODEN | ITCTEM |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| DOI | 10.1109/TCSVT.2023.3293166 |
| Discipline | Engineering |
| EISSN | 1558-2205 |
| EndPage | 1227 |
| Genre | orig-research |
| GrantInformation | National Natural Science Foundation of China, Grant 61962043; Natural Science Foundation of Inner Mongolia, Grant 2019MS06012 |
| ISICitedReferencesCount | 10 |
| ISSN | 1051-8215 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-7632-6846 |
| PageCount | 16 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-02-01 |
| PublicationDateYYYYMMDD | 2024-02-01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on circuits and systems for video technology |
| PublicationTitleAbbrev | TCSVT |
| PublicationYear | 2024 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1212 |
| SubjectTerms | Coders Computer networks Decoding Electronic devices encoder-decoder structure Feature extraction Feature maps hardware deployment Image segmentation lightweight network Memory devices Mobile handsets Parameterization Parameters Real time Real-time image segmentation Real-time systems Semantic segmentation Semantics Task analysis Training |
| Title | RAFNet: Reparameterizable Across-Resolution Fusion Network for Real-Time Image Semantic Segmentation |
| URI | https://ieeexplore.ieee.org/document/10175551 https://www.proquest.com/docview/2923122821 |
| Volume | 34 |