Learning to Recover 3D Scene Shape from a Single Image
Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown...
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 204-213 |
|---|---|
| Main authors: | Yin, Wei; Zhang, Jianming; Wang, Oliver; Niklaus, Simon; Mai, Long; Chen, Simon; Shen, Chunhua |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 01.01.2021 |
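| Subjects: | Estimation; Geometry; Predictive models; Reconstruction algorithms; Shape; Three-dimensional displays; Training |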
| ISSN: | 1063-6919 |
| Online access: | Get full text |
| Abstract | Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length. We investigate this problem in detail, and propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then uses 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape. In addition, we propose an image-level normalized regression loss and a normal-based geometry loss to enhance depth prediction models trained on mixed datasets. We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot dataset generalization. Code is available at: https://git.io/Depth |
|---|---|
| Author | Niklaus, Simon; Shen, Chunhua; Zhang, Jianming; Wang, Oliver; Chen, Simon; Mai, Long; Yin, Wei |
| Author_xml | – sequence: 1 givenname: Wei surname: Yin fullname: Yin, Wei organization: The University of Adelaide, Australia – sequence: 2 givenname: Jianming surname: Zhang fullname: Zhang, Jianming organization: Adobe Research – sequence: 3 givenname: Oliver surname: Wang fullname: Wang, Oliver organization: Adobe Research – sequence: 4 givenname: Simon surname: Niklaus fullname: Niklaus, Simon organization: Adobe Research – sequence: 5 givenname: Long surname: Mai fullname: Mai, Long organization: Adobe Research – sequence: 6 givenname: Simon surname: Chen fullname: Chen, Simon organization: Adobe Research – sequence: 7 givenname: Chunhua surname: Shen fullname: Shen, Chunhua organization: The University of Adelaide, Australia |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/CVPR46437.2021.00027 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Applied Sciences |
| EISBN | 1665445092 9781665445092 |
| EISSN | 1063-6919 |
| EndPage | 213 |
| ExternalDocumentID | 9577432 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO |
| ID | FETCH-LOGICAL-i269t-758fd43cf32d71d032dc5512e7b4b794b9dfa5899c4b71cbab8c2d914b4640b63 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 121 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000739917300021&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:24:15 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i269t-758fd43cf32d71d032dc5512e7b4b794b9dfa5899c4b71cbab8c2d914b4640b63 |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_9577432 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-01-01 |
| PublicationDateYYYYMMDD | 2021-01-01 |
| PublicationDate_xml | – month: 01 year: 2021 text: 2021-01-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0003211698 |
| Score | 2.597798 |
| Snippet | Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 204 |
| SubjectTerms | Estimation; Geometry; Predictive models; Reconstruction algorithms; Shape; Three-dimensional displays; Training |
| Title | Learning to Recover 3D Scene Shape from a Single Image |
| URI | https://ieeexplore.ieee.org/document/9577432 |
| WOSCitedRecordID | wos000739917300021&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=proceeding&rft.title=Proceedings+%28IEEE+Computer+Society+Conference+on+Computer+Vision+and+Pattern+Recognition.+Online%29&rft.atitle=Learning+to+Recover+3D+Scene+Shape+from+a+Single+Image&rft.au=Yin%2C+Wei&rft.au=Zhang%2C+Jianming&rft.au=Wang%2C+Oliver&rft.au=Niklaus%2C+Simon&rft.date=2021-01-01&rft.pub=IEEE&rft.eissn=1063-6919&rft.spage=204&rft.epage=213&rft_id=info:doi/10.1109%2FCVPR46437.2021.00027&rft.externalDocID=9577432 |
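
Note on the abstract: the claim that an unknown depth shift prevents accurate 3D shape recovery, while an unknown scale does not, can be illustrated with a small pinhole back-projection. The sketch below is only an illustration under stated assumptions and is not the authors' method (the paper predicts the shift and focal length with 3D point cloud encoders). The function names `unproject` and `distance_ratios`, the toy depth map, the focal length of 2.0, and the choice of the image centre as principal point are all hypothetical.

```python
import math


def unproject(depth, focal, shift=0.0):
    """Back-project a depth map (list of rows) to 3D points with a pinhole model.

    Assumption: the principal point is the image centre. `shift` is added
    to every depth value before back-projection.
    """
    h, w = len(depth), len(depth[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    points = []
    for v in range(h):
        for u in range(w):
            z = depth[v][u] + shift
            points.append(((u - cx) * z / focal, (v - cy) * z / focal, z))
    return points


def distance_ratios(points_a, points_b):
    """Ratios of distances between consecutive points in two clouds."""
    ratios = []
    for i in range(len(points_a) - 1):
        da = math.dist(points_a[i], points_a[i + 1])
        db = math.dist(points_b[i], points_b[i + 1])
        ratios.append(db / da)
    return ratios


# Toy 3x3 depth map of a slanted surface (hypothetical values).
depth = [[2.0 + 0.5 * u + 0.25 * v for u in range(3)] for v in range(3)]

reference = unproject(depth, focal=2.0)
scaled = unproject([[2.0 * d for d in row] for row in depth], focal=2.0)
shifted = unproject(depth, focal=2.0, shift=1.0)

# A global depth scale multiplies every distance by the same factor,
# so the shape is preserved up to size.
scale_ratios = distance_ratios(reference, scaled)

# A depth shift changes distances by different factors in different
# places, so the recovered shape is distorted.
shift_ratios = distance_ratios(reference, shifted)

print("scale ratios spread:", max(scale_ratios) - min(scale_ratios))
print("shift ratios spread:", max(shift_ratios) - min(shift_ratios))
```

In this toy setting the scaled cloud is a uniformly enlarged copy of the reference, whereas the shifted cloud stretches different regions by different amounts. This is consistent with the abstract's point that depth predicted only up to scale and shift needs the shift (and the focal length) to be estimated before a realistic 3D shape can be reconstructed.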