Position Encoding for 3D Lane Detection via Perspective Transformer
| Published in: | IEEE Access, Volume 12; pp. 106480-106487 |
|---|---|
| Main authors: | Li Zhang, Meng; Wei Wang, Ming; Yang Deng, Yan; Yu Lei, Xin |
| Format: | Journal Article |
| Language: | English |
| Publisher details: | IEEE, 2024 |
| ISSN: | 2169-3536 |
| Abstract | 3D lane detection from a monocular image is a basic but indispensable task in environment perception for autonomous driving. Recent work uses modules such as depth estimation, coordinate system transformation, and temporal tracking to establish the correspondence between 2D and 3D information. However, inaccurate depth information caused by perturbations during this conversion poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose PELD, a model that uses the bird's-eye view (BEV) as a proxy transformation to give explicit 3D lane detection results. Specifically, when sampling feature information, feature flipping is proposed to supplement the global context information before view conversion, and 3D position encoding generated from the front-view features enhances the depth information. After the 3D position encoding is combined with the feature information, the result serves as the value in a cross-attention module for adaptive supervision of the BEV queries. On the one hand, we use deformable attention to sample front-view features and generate an explicit lane representation; on the other hand, we strengthen the supervision of lane line generation by supplementing front-view features and enhancing 3D spatial information. PELD achieves more advanced results than previous methods on the OpenLane and Apollo datasets. |
|---|---|
| Author | Yang Deng, Yan; Li Zhang, Meng; Yu Lei, Xin; Wei Wang, Ming |
| Author_xml | 1. Li Zhang, Meng (ORCID 0009-0004-6348-3174); 2. Wei Wang, Ming (ORCID 0000-0001-9285-9071, wangmingwei@sust.edu.cn); 3. Yang Deng, Yan; 4. Yu Lei, Xin (ORCID 0009-0004-2423-0254). All authors: Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China |
| CODEN | IAECCG |
| CitedBy_id | crossref_primary_10_1109_ACCESS_2025_3583341 |
| Cites_doi | 10.1609/aaai.v32i1.12301 10.1109/TITS.2019.2890870 10.1109/ICCV48922.2021.00375 10.1109/CVPR.2018.00472 10.1109/IVS.2018.8500547 10.1109/CVPR52688.2022.01665 10.1109/cvpr52729.2023.00103 10.1007/978-3-031-19812-0_31 10.1609/aaai.v35i4.16469 10.1109/WACV57701.2024.00121 10.1109/taes.2022.3218496 10.1109/ICRA48891.2023.10161160 10.1109/CVPRW56347.2022.00483 10.1109/JAS.2023.123660 10.1007/978-3-031-19839-7_32 10.1109/CVPR52729.2023.01674 10.1109/CVPR52688.2022.00398 10.1109/CVPR46437.2021.00036 10.1109/cvpr.2019.01298 10.1111/mice.12829 10.1109/ICCV.2019.00301 10.1109/CVPR52688.2022.00097 10.1007/978-3-030-58586-0_17 10.1007/978-3-030-58589-1_40 10.1609/aaai.v36i2.20069 10.1109/CVPR52688.2022.01655 10.1109/WACV48630.2021.00374 10.1007/BF00201978 |
| ContentType | Journal Article |
| DOI | 10.1109/ACCESS.2024.3436561 |
| Discipline | Engineering |
| EISSN | 2169-3536 |
| EndPage | 106487 |
| Genre | orig-research |
| GrantInformation_xml | 1. Key Research and Development Plan of Shaanxi Province in 2023 (grant 2023-YBGY-215); 2. Xianyang Science and Technology Bureau (grant L2022-JBGS-GY-01; funder ID 10.13039/501100007765); 3. Shaanxi Provincial Science and Technology Department Qin Chuangyuan “Scientist + Engineer” Team Construction (grant 2024QCY-KXJ-181); 4. Shaanxi Province Technology Innovation Guidance Special (Fund) (grant 2023GXLH-072) |
| ISICitedReferencesCount | 2 |
| ISSN | 2169-3536 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://creativecommons.org/licenses/by-nc-nd/4.0 |
| ORCID | 0009-0004-6348-3174 0000-0001-9285-9071 0009-0004-2423-0254 |
| OpenAccessLink | https://doaj.org/article/f456c06702f84b539ced6eb337ece6c5 |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 20240000 2024-00-00 2024-01-01 |
| PublicationDateYYYYMMDD | 2024-01-01 |
| PublicationDate_xml | – year: 2024 text: 20240000 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE access |
| PublicationTitleAbbrev | Access |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 106480 |
| SubjectTerms | 3D lane detection; autonomous vehicle; Convolution; Decoding; Deep learning; Encoding; Feature extraction; Lane detection; Machine learning; position embedding; Task analysis; Three-dimensional displays; view conversion |
| Title | Position Encoding for 3D Lane Detection via Perspective Transformer |
| URI | https://ieeexplore.ieee.org/document/10620274 https://doaj.org/article/f456c06702f84b539ced6eb337ece6c5 |
| Volume | 12 |
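
The abstract states that the 3D position encoding is combined with the front-view features and that the result serves as the value in a cross-attention module driven by BEV queries. The following is only a minimal illustrative sketch of that idea, not the authors' implementation: the function names (`softmax`, `add_vectors`, `cross_attention`) are hypothetical, the combination is assumed to be element-wise addition, learned projections are omitted, and plain dense attention is used in place of the deformable attention mentioned in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def add_vectors(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def cross_attention(bev_queries, features, pos_encodings):
    """Each BEV query attends over position-encoded front-view tokens.

    Assumption: features and position encodings are combined by addition,
    and the combined tokens act as both keys and values.
    """
    tokens = [add_vectors(f, p) for f, p in zip(features, pos_encodings)]
    dim = len(tokens[0])
    outputs = []
    for query in bev_queries:
        # Scaled dot-product score between the query and every token.
        scores = [
            sum(q * k for q, k in zip(query, token)) / math.sqrt(dim)
            for token in tokens
        ]
        weights = softmax(scores)
        # Weighted sum of the tokens, one output channel at a time.
        outputs.append(
            [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs
```

With two tokens `[1.0, 0.0]` and `[0.0, 1.0]` and position encodings `[0.5, 0.0]` and `[0.0, 0.5]`, an all-zero query weights both tokens equally and returns `[0.75, 0.75]`, while a query aligned with the first token pulls its output toward that token.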