Learning to Recover 3D Scene Shape from a Single Image
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 204-213 |
|---|---|
| Main authors: | Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 2021-01-01 |
| Subjects: | |
| ISSN: | 1063-6919 |
| Abstract | Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length. We investigate this problem in detail, and propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then uses 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape. In addition, we propose an image-level normalized regression loss and a normal-based geometry loss to enhance depth prediction models trained on mixed datasets. We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot dataset generalization. Code is available at: https://git.io/Depth |
|---|---|
| Author | Niklaus, Simon; Shen, Chunhua; Zhang, Jianming; Wang, Oliver; Chen, Simon; Mai, Long; Yin, Wei |
| Author_xml | 1. Yin, Wei (The University of Adelaide, Australia); 2. Zhang, Jianming (Adobe Research); 3. Wang, Oliver (Adobe Research); 4. Niklaus, Simon (Adobe Research); 5. Mai, Long (Adobe Research); 6. Chen, Simon (Adobe Research); 7. Shen, Chunhua (The University of Adelaide, Australia) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPR46437.2021.00027 |
| Discipline | Applied Sciences |
| EISBN | 1665445092; 9781665445092 |
| EISSN | 1063-6919 |
| EndPage | 213 |
| ExternalDocumentID | 9577432 |
| Genre | orig-research |
| ISICitedReferencesCount | 122 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-01-01 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2021 |
| Publisher | IEEE |
| StartPage | 204 |
| SubjectTerms | Estimation; Geometry; Predictive models; Reconstruction algorithms; Shape; Three-dimensional displays; Training |
| Title | Learning to Recover 3D Scene Shape from a Single Image |
| URI | https://ieeexplore.ieee.org/document/9577432 |
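
The abstract states that a depth map known only up to an unknown shift (and with an unknown focal length) cannot be turned into an accurate 3D shape. The sketch below illustrates why, under simplifying assumptions that are not taken from the paper: a pinhole camera with square pixels, a principal point at the image centre, and a synthetic one-row "image" of a slanted wall. The function names, the `missing_shift` value, and the test scene are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def unproject(depth, focal, shift=0.0):
    """Back-project a depth map (list of rows) to 3D points with a pinhole model.

    Assumption: principal point at the image centre, square pixels.
    """
    h, w = len(depth), len(depth[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    points = []
    for v in range(h):
        for u in range(w):
            z = depth[v][u] + shift  # restore the (possibly unknown) depth shift
            points.append(((u - cx) * z / focal, (v - cy) * z / focal, z))
    return points


def max_deviation_from_line(points):
    """Largest distance of any point from the x-z line through the first and last point."""
    (x0, _, z0), (x1, _, z1) = points[0], points[-1]
    dx, dz = x1 - x0, z1 - z0
    length = math.hypot(dx, dz)
    return max(abs(dx * (z - z0) - dz * (x - x0)) / length for x, _, z in points)


# Synthetic scene: one image row looking at a slanted wall z = 0.5 * x + 2.
focal, width = 4.0, 5
cx = (width - 1) / 2.0
true_depth = [[2.0 / (1.0 - 0.5 * (u - cx) / focal) for u in range(width)]]

# Hypothetical network output: correct except for a missing shift of 0.5.
missing_shift = 0.5
pred_depth = [[z - missing_shift for z in true_depth[0]]]

# Without the shift the flat wall bends; with it the wall is straight again.
dev_without_shift = max_deviation_from_line(unproject(pred_depth, focal))
dev_with_shift = max_deviation_from_line(unproject(pred_depth, focal, shift=missing_shift))
print(f"deviation from a straight wall, shift ignored:  {dev_without_shift:.4f}")
print(f"deviation from a straight wall, shift restored: {dev_with_shift:.4f}")
```

The unknown scale is left out of the sketch on purpose: a global scale only resizes the whole scene, whereas a wrong shift bends flat surfaces, as the first deviation shows. A wrong focal length would similarly distort the result by stretching or compressing the scene along the depth axis, which is why both quantities need to be estimated before unprojection.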