Position Encoding for 3D Lane Detection via Perspective Transformer

Detailed bibliography
Published in: IEEE Access, Volume 12, pp. 106480-106487
Main authors: Li Zhang, Meng; Wei Wang, Ming; Yang Deng, Yan; Yu Lei, Xin
Format: Journal Article
Language: English
Publication details: IEEE, 2024
ISSN: 2169-3536
Abstract 3D lane detection from a monocular input image is a basic but indispensable task in the environment perception of autonomous driving. Recent work uses modules such as depth estimation, coordinate-system transformation, and time-series tracking to establish the correspondence between 2D and 3D information. However, inaccurate depth information introduced by perturbations during this conversion poses a challenge to lane detection methods that rely only on monocular images. To address this problem, we propose the PELD model, which uses a bird's-eye-view (BEV) proxy transformation to produce explicit 3D lane detection results. Specifically, when sampling feature information, feature flipping is proposed to supplement the global context before view conversion, and 3D position encoding generated from the forward-looking features enhances the depth information. After the 3D position encoding is combined with the feature information, it serves as the value of a cross-attention module for adaptive supervision of the BEV queries. On the one hand, we use deformable attention to sample forward-looking features and generate an explicit lane representation; on the other hand, we supplement supervised lane-line generation with forward-looking features and enhanced 3D spatial information. PELD achieves better results than previous methods on the OpenLane and Apollo datasets.
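The following is a minimal, self-contained sketch (PyTorch assumed; not the authors' released code) of the mechanism the abstract describes: forward-looking image features are augmented with a 3D position encoding, and BEV queries are updated through cross-attention over them. All module names, shapes, and hyper-parameters here (PositionAwareBEVCrossAttention, the depth-bin count, the MLP position encoder) are illustrative assumptions, not taken from the paper.

import torch
import torch.nn as nn


class PositionAwareBEVCrossAttention(nn.Module):
    """Hypothetical illustration: BEV queries cross-attend to forward-looking
    features that carry a learned 3D position encoding."""

    def __init__(self, embed_dim=256, num_heads=8,
                 feat_h=24, feat_w=80, num_depth_bins=16):
        super().__init__()
        self.feat_h, self.feat_w = feat_h, feat_w
        self.num_depth_bins = num_depth_bins
        # Maps the (u, v, depth-bin) coordinates of each feature cell,
        # concatenated over all depth bins, to the embedding dimension.
        self.pos_mlp = nn.Sequential(
            nn.Linear(3 * num_depth_bins, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )
        # BEV queries attend to the position-encoded image features.
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def position_encoding(self, device):
        # Normalized pixel coordinates of every cell in the feature map.
        v, u = torch.meshgrid(
            torch.linspace(0.0, 1.0, self.feat_h, device=device),
            torch.linspace(0.0, 1.0, self.feat_w, device=device),
            indexing="ij",
        )
        d = torch.linspace(0.0, 1.0, self.num_depth_bins, device=device)
        n = self.feat_h * self.feat_w
        coords = torch.stack(
            (
                u.reshape(n, 1).expand(n, self.num_depth_bins),
                v.reshape(n, 1).expand(n, self.num_depth_bins),
                d.unsqueeze(0).expand(n, self.num_depth_bins),
            ),
            dim=-1,
        )  # (H*W, D, 3): one (u, v, depth) triple per cell and depth bin
        return self.pos_mlp(coords.reshape(n, -1))  # (H*W, embed_dim)

    def forward(self, bev_queries, img_feats):
        # bev_queries: (B, N_q, C); img_feats: (B, C, H, W) forward-looking features
        b, c, h, w = img_feats.shape
        assert h == self.feat_h and w == self.feat_w
        feats = img_feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
        pos = self.position_encoding(img_feats.device)    # (H*W, C)
        kv = feats + pos.unsqueeze(0)                     # keys/values carrying 3D position info
        updated, _ = self.cross_attn(bev_queries, kv, kv)
        return updated                                    # (B, N_q, C)


if __name__ == "__main__":
    model = PositionAwareBEVCrossAttention()
    bev_q = torch.randn(2, 100, 256)       # 100 BEV queries (illustrative)
    feats = torch.randn(2, 256, 24, 80)    # forward-looking backbone features (illustrative)
    print(model(bev_q, feats).shape)       # torch.Size([2, 100, 256])

In the paper, this stage is complemented by feature flipping before view conversion and by deformable-attention sampling of the forward-looking features, which the sketch omits for brevity.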
Author Yang Deng, Yan
Li Zhang, Meng
Yu Lei, Xin
Wei Wang, Ming
Author_xml – sequence: 1
  givenname: Meng
  orcidid: 0009-0004-6348-3174
  surname: Li Zhang
  fullname: Li Zhang, Meng
  organization: Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
– sequence: 2
  givenname: Ming
  orcidid: 0000-0001-9285-9071
  surname: Wei Wang
  fullname: Wei Wang, Ming
  email: wangmingwei@sust.edu.cn
  organization: Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
– sequence: 3
  givenname: Yan
  surname: Yang Deng
  fullname: Yang Deng, Yan
  organization: Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
– sequence: 4
  givenname: Xin
  orcidid: 0009-0004-2423-0254
  surname: Yu Lei
  fullname: Yu Lei, Xin
  organization: Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
CODEN IAECCG
ContentType Journal Article
DOI 10.1109/ACCESS.2024.3436561
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE/IET Electronic Library (IEL) (UW System Shared)
CrossRef
DOAJ Directory of Open Access Journals
Discipline Engineering
EISSN 2169-3536
EndPage 106487
Genre orig-research
GrantInformation_xml – fundername: Key Research and Development Plan of Shaanxi Province in 2023
  grantid: 2023-YBGY-215
– fundername: Xianyang Science and Technology Bureau
  grantid: L2022-JBGS-GY-01
  funderid: 10.13039/501100007765
– fundername: Shaanxi Provincial Science and Technology Department Qin Chuangyuan “Scientist + Engineer” Team Construction
  grantid: 2024QCY-KXJ-181
– fundername: Shaanxi Province Technology Innovation Guidance Special (Fund)
  grantid: 2023GXLH-072
ISICitedReferencesCount 2
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License https://creativecommons.org/licenses/by-nc-nd/4.0
ORCID 0009-0004-6348-3174
0000-0001-9285-9071
0009-0004-2423-0254
OpenAccessLink https://doaj.org/article/f456c06702f84b539ced6eb337ece6c5
PageCount 8
PublicationCentury 2000
PublicationDate 2024
PublicationDateYYYYMMDD 2024-01-01
PublicationDate_xml – year: 2024
  text: 20240000
PublicationDecade 2020
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 106480
SubjectTerms 3D lane detection
autonomous vehicle
Convolution
Decoding
Deep learning
Encoding
Feature extraction
Lane detection
Machine learning
position embedding
Task analysis
Three-dimensional displays
view conversion
Title Position Encoding for 3D Lane Detection via Perspective Transformer
URI https://ieeexplore.ieee.org/document/10620274
https://doaj.org/article/f456c06702f84b539ced6eb337ece6c5
Volume 12