3D Snapshot: Invertible Embedding of 3D Neural Representations in a Single Image

Saved in:
Bibliographic Details
Title: 3D Snapshot: Invertible Embedding of 3D Neural Representations in a Single Image
Authors: Yuqin Lu, Bailin Deng, Zhixuan Zhong, Tianle Zhang, Yuhui Quan, Hongmin Cai, Shengfeng He
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, pp. 11524-11531
Publisher Information: Institute of Electrical and Electronics Engineers (IEEE), 2024.
Publication Year: 2024
Keywords: Spherical Harmonics, Artificial Intelligence and Robotics, Neural Networks, Spatial Domain, 3D Scene, Half Of The Channel, Neural Network, Wavelet Transform, Solid Modeling, Neural Coding, 3D Snapshots, Loss Function, Spatial Coordinates, Rendering (Computer Graphics), Compact Network, Short Video, Image Representation, Least Significant Bit, Image Reconstruction, Steganography, Noisy Images, Invertible Image Processing, Model Size, Image Embedding, Metaverse, Volume Density, Neural Representations, Data Storage, Reconstruction Quality, Dynamic Neural Network, View Synthesis, Image Color Analysis, Dynamic Update, Scene Representation, Three-Dimensional Displays, Intermediate Representation, Graphics and Human Computer Interfaces, Single Image, View Direction, Dynamic Network, Neural Image
Description: 3D neural rendering enables photo-realistic reconstruction of a specific scene by encoding discontinuous inputs into a neural representation. Despite the remarkable rendering results, the storage of network parameters is not transmission-friendly and not extendable to metaverse applications. In this paper, we propose an invertible neural rendering approach that enables generating an interactive 3D model from a single image (i.e., 3D Snapshot). Our idea is to distill a pre-trained neural rendering model (e.g., NeRF) into a visualizable image form that can then be easily inverted back to a neural network. To this end, we first present a neural image distillation method to optimize three neural planes for representing the original neural rendering model. However, this representation is noisy and visually meaningless. We thus propose a dynamic invertible neural network to embed this noisy representation into a plausible image representation of the scene. We demonstrate promising reconstruction quality quantitatively and qualitatively, by comparing to the original neural rendering model, as well as video-based invertible methods. On the other hand, our method can store dozens of NeRFs with a compact restoration network (5 MB), and embedding each 3D scene takes up only 160 KB of storage. More importantly, our approach is the first solution that allows embedding a neural rendering model into image representations, which enables applications like creating an interactive 3D model from a printed image in the metaverse.
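To make the two ingredients mentioned in the abstract concrete, the following minimal PyTorch sketch is offered as an illustration only; it is not the authors' code, and all class names, parameter names, and sizes are hypothetical. It shows (a) a tri-plane lookup in which three 2D feature planes stand in for a full NeRF when queried at 3D coordinates, and (b) an additive coupling block that is invertible by construction, which is one plausible way to hide a noisy plane representation inside a viewable image and recover it exactly.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TriPlanes(nn.Module):
    # Three learnable 2D feature planes (XY, XZ, YZ) queried by 3D points in [-1, 1].
    def __init__(self, resolution=128, channels=16):
        super().__init__()
        self.planes = nn.Parameter(0.01 * torch.randn(3, channels, resolution, resolution))

    def forward(self, xyz):  # xyz: (N, 3)
        feats = []
        for i, dims in enumerate([[0, 1], [0, 2], [1, 2]]):
            grid = xyz[:, dims].view(1, -1, 1, 2)              # (1, N, 1, 2) sample grid
            plane = self.planes[i].unsqueeze(0)                # (1, C, R, R)
            sampled = F.grid_sample(plane, grid, align_corners=True)
            feats.append(sampled.squeeze(0).squeeze(-1).t())   # (N, C) per-plane feature
        return sum(feats)  # fused per-point feature, to be decoded by a small MLP

class AdditiveCoupling(nn.Module):
    # y1 = x1, y2 = x2 + t(x1): exactly invertible; t itself never needs an inverse.
    def __init__(self, channels=3):
        super().__init__()
        self.t = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x1, x2):   # x1: scene image branch, x2: noisy plane channels
        return x1, x2 + self.t(x1)

    def inverse(self, y1, y2):   # exact recovery of x2 from the output pair
        return y1, y2 - self.t(y1)

# Quick invertibility check under these assumptions:
host, planes = torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256)
block = AdditiveCoupling()
y1, y2 = block(host, planes)
_, recovered = block.inverse(y1, y2)
assert torch.allclose(recovered, planes, atol=1e-5)

A stack of such coupling blocks, trained so that one output branch stays visually close to the scene image, is a sketch of how "embed this noisy representation into a plausible image representation" could be realized; the paper's dynamic invertible network and distillation losses are not reproduced here.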
Publication Type: Article
File Description: application/pdf
ISSN: 1939-3539
0162-8828
DOI: 10.1109/tpami.2024.3411051
Access URL: https://pubmed.ncbi.nlm.nih.gov/38848236
Rights: IEEE Copyright
CC BY-NC-ND
Document Code: edsair.doi.dedup.....7f49a7b9e1711c62195db3e6a0f98bfd
Database: OpenAIRE