EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

Locating 3D objects from a single RGB image via Perspective-n-Points (PnP) is a long-standing problem in computer vision. Driven by end-to-end deep learning, recent studies suggest interpreting PnP as a differentiable layer, so that 2D-3D point correspondences can be partly learned by backpropagatin...

Detailed bibliography
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 2771-2780
Main authors: Chen, Hansheng; Wang, Pichao; Wang, Fan; Tian, Wei; Xiong, Lu; Li, Hao
Format: Conference paper
Language: English
Published: IEEE, 1 June 2022
ISSN:1063-6919
Abstract Locating 3D objects from a single RGB image via Perspective-n-Points (PnP) is a long-standing problem in computer vision. Driven by end-to-end deep learning, recent studies suggest interpreting PnP as a differentiable layer, so that 2D-3D point correspondences can be partly learned by backpropagating the gradient w.r.t. object pose. Yet, learning the entire set of unrestricted 2D-3D points from scratch fails to converge with existing approaches, since the deterministic pose is inherently non-differentiable. In this paper, we propose EPro-PnP, a probabilistic PnP layer for general end-to-end pose estimation, which outputs a distribution of pose on the SE(3) manifold, essentially bringing categorical Softmax to the continuous domain. The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distributions. The underlying principle unifies the existing approaches and resembles the attention mechanism. EPro-PnP significantly outperforms competitive baselines, closing the gap between PnP-based methods and the task-specific leaders on the LineMOD 6DoF pose estimation and nuScenes 3D object detection benchmarks.
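The idea in the abstract of a Softmax over poses, trained against a target pose, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a translation-only pose, an ideal pinhole camera, and a small discrete grid of candidate poses standing in for the continuous SE(3) domain. All function names and values below are hypothetical and chosen only for illustration.

```python
import numpy as np

def reprojection_cost(points_3d, points_2d, weights, translation, focal=1.0):
    """Half the sum of squared weighted reprojection errors for one pose.

    Toy assumption: the pose is a pure translation (no rotation) and the
    camera is an ideal pinhole with focal length `focal`.
    """
    cam = points_3d + translation              # (N, 3) points in the camera frame
    proj = focal * cam[:, :2] / cam[:, 2:3]    # (N, 2) pinhole projection
    residual = weights * (proj - points_2d)    # (N, 2) weighted 2D error
    return 0.5 * np.sum(residual ** 2)

def pose_softmax(costs):
    """Softmax over negative costs: a discrete stand-in for a pose density."""
    logits = -np.asarray(costs, dtype=float)
    logits -= logits.max()                     # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def kl_style_loss(cost_gt, candidate_costs):
    """Cost at the target pose plus log-sum-exp of negative candidate costs."""
    neg = -np.asarray(candidate_costs, dtype=float)
    m = neg.max()
    return cost_gt + m + np.log(np.sum(np.exp(neg - m)))

# Hypothetical data: four 3D points and five candidate translations along x.
points_3d = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
weights = np.ones((4, 2))
candidates = [np.array([x, 0.0, 5.0]) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
gt_index = 3                                   # target pose: x = 0.5

# 2D observations are generated from the target pose, so its cost is zero.
cam_gt = points_3d + candidates[gt_index]
points_2d = cam_gt[:, :2] / cam_gt[:, 2:3]

costs = [reprojection_cost(points_3d, points_2d, weights, t) for t in candidates]
probs = pose_softmax(costs)
loss = kl_style_loss(costs[gt_index], costs)
```

When the target pose is one of the candidates, `loss` equals the negative log of the Softmax probability assigned to it, which is the categorical cross-entropy that this toy grid substitutes for the KL divergence over continuous poses.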
Author_xml – sequence: 1
  givenname: Hansheng
  surname: Chen
  fullname: Chen, Hansheng
  email: hanshengchen97@gmail.com
  organization: School of Automotive Studies, Tongji University
– sequence: 2
  givenname: Pichao
  surname: Wang
  fullname: Wang, Pichao
  email: pichao.wang@alibaba-inc.com
  organization: Alibaba Group
– sequence: 3
  givenname: Fan
  surname: Wang
  fullname: Wang, Fan
  email: fan.w@alibaba-inc.com
  organization: Alibaba Group
– sequence: 4
  givenname: Wei
  surname: Tian
  fullname: Tian, Wei
  email: tian_wei@tongji.edu.cn
  organization: School of Automotive Studies, Tongji University
– sequence: 5
  givenname: Lu
  surname: Xiong
  fullname: Xiong, Lu
  email: xiong_lu@tongji.edu.cn
  organization: School of Automotive Studies, Tongji University
– sequence: 6
  givenname: Hao
  surname: Li
  fullname: Li, Hao
  email: lihao.lh@alibaba-inc.com
  organization: Alibaba Group
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR52688.2022.00280
Discipline Applied Sciences
EISBN 1665469463
9781665469463
EISSN 1063-6919
EndPage 2780
ExternalDocumentID 9879345
Genre orig-research
ISICitedReferencesCount 110
IsPeerReviewed false
IsScholarly true
Language English
PageCount 10
ParticipantIDs ieee_primary_9879345
PublicationCentury 2000
PublicationDate 2022-June
PublicationDateYYYYMMDD 2022-06-01
PublicationDate_xml – month: 06
  year: 2022
  text: 2022-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 2771
SubjectTerms 3D from single images; Computer vision theory; Deep learning architectures and techniques; Navigation and autonomous driving; Recognition: detection, categorization, retrieval; Robot vision; Statistical methods
Computer vision
Deep learning
Pose estimation
Probabilistic logic
Robot vision systems
Statistical analysis
Three-dimensional displays
Title EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
URI https://ieeexplore.ieee.org/document/9879345