Neural 3D Scene Reconstruction with the Manhattan-world Assumption

Bibliographic details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), pp. 5501-5510
Main authors: Guo, Haoyu; Peng, Sida; Lin, Haotong; Wang, Qianqian; Zhang, Guofeng; Bao, Hujun; Zhou, Xiaowei
Format: Conference paper
Language: English
Published: IEEE, 1 June 2022
ISSN: 1063-6919
Abstract This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images. Many previous works have shown impressive reconstruction results on textured objects, but they still have difficulty in handling low-textured planar regions, which are common in indoor scenes. An approach to solving this issue is to incorporate planar constraints into the depth map estimation in multi-view stereo-based methods, but the per-view plane estimation and depth optimization lack both efficiency and multi-view consistency. In this work, we show that the planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods. Specifically, we use an MLP network to represent the signed distance function as the scene geometry. Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network. To resolve the inaccurate segmentation, we encode the semantics of 3D points with another MLP and design a novel loss that jointly optimizes the scene geometry and semantics in 3D space. Experiments on ScanNet and 7-Scenes datasets show that the proposed method outperforms previous methods by a large margin on 3D reconstruction quality. The code and supplementary materials are available at https://zju3dv.github.io/manhattan_sdf.
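Illustrative note: the planar constraints mentioned in the abstract can be pictured as penalties on surface normals. The following is a minimal sketch of one plausible form, not the authors' implementation: floor normals are pulled toward an assumed up axis, and wall normals are pulled toward being parallel, antiparallel, or orthogonal to one horizontal reference direction. The names `floor_loss` and `wall_loss`, the fixed up axis `(0, 0, 1)`, and the single-angle parameter `wall_angle` are assumptions made for this example. The semantic weighting described in the abstract is omitted.

```python
import numpy as np


def normalize(v, eps=1e-8):
    """Scale vectors along the last axis to (approximately) unit length."""
    v = np.asarray(v, dtype=float)
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


def floor_loss(normals, up=(0.0, 0.0, 1.0)):
    """Mean penalty for floor normals deviating from the assumed up axis."""
    n = normalize(normals)                      # shape (N, 3)
    cos = n @ np.asarray(up, dtype=float)       # cosine to the up axis, shape (N,)
    return np.abs(1.0 - cos).mean()


def wall_loss(normals, wall_angle):
    """Mean penalty for wall normals whose cosine to a horizontal reference
    direction is far from all of -1, 0 and 1.

    wall_angle (radians) is a hypothetical parameter that fixes the reference
    direction in the horizontal plane; in a learned setting it would be
    optimized together with the geometry.
    """
    n = normalize(normals)
    ref = np.array([np.cos(wall_angle), np.sin(wall_angle), 0.0])
    cos = n @ ref                               # shape (N,)
    targets = np.array([-1.0, 0.0, 1.0])
    # Distance from each cosine to its nearest allowed target value.
    return np.abs(cos[:, None] - targets[None, :]).min(axis=1).mean()
```

In a reconstruction pipeline the normals would typically be the normalized gradients of the signed distance function at surface points. Note that a cosine of 0 in `wall_loss` only says the normal is orthogonal to the reference direction; it does not by itself force the normal to be horizontal.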
Author Peng, Sida
Wang, Qianqian
Zhang, Guofeng
Guo, Haoyu
Bao, Hujun
Lin, Haotong
Zhou, Xiaowei
Author_xml – sequence: 1
  givenname: Haoyu
  surname: Guo
  fullname: Guo, Haoyu
  organization: Zhejiang University
– sequence: 2
  givenname: Sida
  surname: Peng
  fullname: Peng, Sida
  organization: Zhejiang University
– sequence: 3
  givenname: Haotong
  surname: Lin
  fullname: Lin, Haotong
  organization: Zhejiang University
– sequence: 4
  givenname: Qianqian
  surname: Wang
  fullname: Wang, Qianqian
  organization: Cornell University
– sequence: 5
  givenname: Guofeng
  surname: Zhang
  fullname: Zhang, Guofeng
  organization: Zhejiang University
– sequence: 6
  givenname: Hujun
  surname: Bao
  fullname: Bao, Hujun
  organization: Zhejiang University
– sequence: 7
  givenname: Xiaowei
  surname: Zhou
  fullname: Zhou, Xiaowei
  organization: Zhejiang University
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR52688.2022.00543
Discipline Applied Sciences
EISBN 1665469463
9781665469463
EISSN 1063-6919
EndPage 5510
ExternalDocumentID 9880030
Genre orig-research
GrantInformation_xml – fundername: NSFC
  grantid: 62172364
  funderid: 10.13039/501100001809
– fundername: National Key Research and Development Program of China
  grantid: 2020AAA0108901
  funderid: 10.13039/501100012166
ISICitedReferencesCount 110
IsPeerReviewed false
IsScholarly true
Language English
PageCount 10
PublicationCentury 2000
PublicationDate 2022-June
PublicationDateYYYYMMDD 2022-06-01
PublicationDate_xml – month: 06
  year: 2022
  text: 2022-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 5501
SubjectTerms 3D from multi-view and sensors
Estimation
Geometry
Pattern recognition
Reconstruction algorithms
Robustness
Semantics
Three-dimensional displays
Title Neural 3D Scene Reconstruction with the Manhattan-world Assumption
URI https://ieeexplore.ieee.org/document/9880030