Deep Road Scene Understanding
| Published in: | IEEE Signal Processing Letters, Vol. 26, No. 4, pp. 587–591 |
|---|---|
| Main authors: | Wujie Zhou, Sijia Lv, Qiuping Jiang, Lu Yu |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 01.04.2019 |
| Subjects: | Road scene understanding; encoder–decoder architecture; skip connection; fusion layer |
| ISSN: | 1070-9908, 1558-2361 |
| Online access: | Full text |
| Abstract | Road scene understanding is a difficult task in autonomous driving. In this letter, we propose a novel deep encoder-decoder architecture for road scene understanding in an end-to-end manner. This core trainable understanding engine includes an encoder network, a decoder network with two streams, and a pixel-level fusion network with a classification layer. The encoder network is composed of the front-end of the classical convolutional neural network VGGNet. The two-stream decoder network includes multi-scale skip connection modules to reduce the down-scaling effect. Finally, a fusion network fuses the two-level information from the two streams of the decoder network for precise pixel-level classification. Additionally, a convolution layer is added to each skip connection module to increase the depth of the architecture. Our architecture achieves outstanding performance on the publicly available CamVid dataset and significantly outperforms previous architectures. This deep architecture is well suited to road scene understanding. (A minimal, illustrative sketch of this architecture is given after the record below.) |
|---|---|
| Authors | Wujie Zhou (School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou, China; wujiezhou@163.com); Sijia Lv (School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou, China; lvsijia@zust.edu.cn); Qiuping Jiang (Faculty of Information Science and Engineering, Ningbo University, Ningbo, China; jiangqiuping@nbu.edu.cn); Lu Yu (Institute of Information and Communication Engineering, Zhejiang University, Hangzhou, China; yul@zju.edu.cn) |
| CODEN | ISPLEM |
| ContentType | Journal Article |
| DOI | 10.1109/LSP.2019.2896793 |
| Discipline | Engineering |
| EISSN | 1558-2361 |
| EndPage | 591 |
| Genre | orig-research |
| Funding | Zhejiang Provincial Natural Science Foundation of China (LY18F020012); China Postdoctoral Science Foundation (2015M581932); National Natural Science Foundation of China (61502429, 61431015, 61501270) |
| ISSN | 1070-9908 |
| Issue | 4 |
| Language | English |
| ORCID | 0000-0002-3055-2493 (Wujie Zhou); 0000-0003-2903-9913 (Lu Yu) |
| PageCount | 5 |
| PublicationDate | April 2019 |
| PublicationTitle | IEEE Signal Processing Letters |
| PublicationTitleAbbrev | LSP |
| PublicationYear | 2019 |
| Publisher | IEEE |
| StartPage | 587 |
| SubjectTerms | Computer architecture; Convolution; Decoding; encoder–decoder architecture; Feature extraction; Fuses; fusion layer; Road scene understanding; Roads; Semantics; skip connection |
| Title | Deep Road Scene Understanding |
| URI | https://ieeexplore.ieee.org/document/8630577 |
| Volume | 26 |
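The abstract only outlines the architecture, so the following is a minimal, hedged sketch of how such a design could be wired up in PyTorch. The encoder follows the VGG16 front-end channel layout; everything else is an assumption made for illustration rather than the paper's exact configuration: the number and placement of skip connection modules, the channel widths, the extra convolution inside each skip module, the concatenate-and-convolve fusion, and the 11-class CamVid output. The names `RoadSceneNet`, `SkipModule`, and `vgg_block` are invented for this sketch.

```python
# Illustrative sketch only: channel widths, skip-module placement, the two-stream
# layout, and the fusion design are assumptions, not the paper's published model.
import torch
import torch.nn as nn
import torch.nn.functional as F


def vgg_block(in_ch, out_ch, n_convs):
    """A VGG16-style block: n_convs 3x3 conv+ReLU layers followed by 2x2 max pooling."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2, 2))
    return nn.Sequential(*layers)


class SkipModule(nn.Module):
    """Skip connection module: upsample a decoder feature map, refine it with an extra
    convolution (the added layer that deepens the architecture), then add the encoder
    feature of matching resolution to reduce the down-scaling effect."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                                  nn.ReLU(inplace=True))

    def forward(self, decoder_feat, encoder_feat):
        x = F.interpolate(decoder_feat, size=encoder_feat.shape[2:],
                          mode="bilinear", align_corners=False)
        return self.conv(x) + encoder_feat


class RoadSceneNet(nn.Module):
    def __init__(self, num_classes=11):  # 11 classes: the usual CamVid protocol (assumed)
        super().__init__()
        # Encoder: front-end of a VGG16-style network (channels follow VGG16).
        self.enc1 = vgg_block(3, 64, 2)
        self.enc2 = vgg_block(64, 128, 2)
        self.enc3 = vgg_block(128, 256, 3)
        self.enc4 = vgg_block(256, 512, 3)
        # Two decoder streams, each a chain of multi-scale skip connection modules.
        self.streams = nn.ModuleList([
            nn.ModuleList([SkipModule(512, 256), SkipModule(256, 128), SkipModule(128, 64)])
            for _ in range(2)
        ])
        # Pixel-level fusion network with a classification layer.
        self.fuse = nn.Sequential(nn.Conv2d(2 * 64, 64, 3, padding=1),
                                  nn.ReLU(inplace=True),
                                  nn.Conv2d(64, num_classes, 1))

    def forward(self, x):
        e1 = self.enc1(x)   # 1/2  resolution, 64 channels
        e2 = self.enc2(e1)  # 1/4  resolution, 128 channels
        e3 = self.enc3(e2)  # 1/8  resolution, 256 channels
        e4 = self.enc4(e3)  # 1/16 resolution, 512 channels
        outs = []
        for stream in self.streams:
            d = e4
            for skip, enc in zip(stream, (e3, e2, e1)):
                d = skip(d, enc)  # step back up through the scales
            outs.append(d)
        fused = self.fuse(torch.cat(outs, dim=1))
        # Upsample back to input resolution for per-pixel classification.
        return F.interpolate(fused, size=x.shape[2:], mode="bilinear", align_corners=False)


if __name__ == "__main__":
    net = RoadSceneNet(num_classes=11)
    logits = net(torch.randn(1, 3, 360, 480))  # CamVid frames are 360x480
    print(logits.shape)                        # torch.Size([1, 11, 360, 480])
```

In this sketch the two streams are structurally identical but carry separate parameters, and the fusion network simply concatenates their outputs before a 1×1 classification layer; the paper's actual two-level fusion and the internal differences between its two decoder streams may well differ from these choices.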