SA3D-L: A lightweight model for 3D object segmentation using neural radiance fields

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 623, p. 129420
Main Authors: Liu, Jian; Yu, Zhen
Format: Journal Article
Language:English
Published: Elsevier B.V., 28.03.2025
ISSN: 0925-2312
Description
Summary: The Segment Anything Model (SAM) has recently made significant progress in object segmentation within 2D images. However, segmenting objects in 3D space remains a primary hurdle in computer vision. The neural radiance field (NeRF) uses a multilayer perceptron (MLP) to learn a continuous representation of a 3D scene. Because NeRF renders consistent views of the scene from arbitrary angles, SAM, originally designed for 2D segmentation, can be extended to 3D object segmentation by incorporating NeRF. A limitation of NeRF, however, is that the MLP encapsulates the whole scene as a single representation without distinguishing individual objects. This study introduces a lightweight 3D Segment Anything Model (SA3D-L), which represents each segmented object in a scene separately by modifying the MLP output. Experimental results on established benchmarks show that 3D segmentation representations of objects can be derived from their 2D masks, allowing each segmented object to be manipulated independently and a new scene to be reconstructed. The code is available at: https://github.com/liujian0819/SA3D-L.

Highlights:
• We proposed a method to learn 3D segmentation representations from 2D masks.
• We utilized a single MLP to represent multiple objects in the scene simultaneously.
• We can manipulate each segmented object separately and reconstruct a new scene.
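To make the core idea concrete, the sketch below shows one way a single NeRF MLP can be widened so that, in addition to density and color, it emits per-object logits at every 3D point, with the logits composited along each ray by the volume-rendering weights and supervised by SAM's 2D masks. This is a minimal, hypothetical PyTorch illustration written for this summary, not the authors' released implementation (see the repository above); the names SegmentationNeRF and rendered_mask_loss, the layer sizes, and the one-label-per-ray supervision are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SegmentationNeRF(nn.Module):
        """NeRF-style MLP with an extra per-object segmentation head (illustrative)."""
        def __init__(self, pos_dim: int = 63, hidden: int = 256, num_objects: int = 3):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Linear(pos_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.sigma_head = nn.Linear(hidden, 1)              # volume density
            self.rgb_head = nn.Linear(hidden, 3)                # color (view direction omitted for brevity)
            self.seg_head = nn.Linear(hidden, num_objects + 1)  # K objects + background

        def forward(self, x):
            # x: (N, pos_dim) positionally encoded 3D sample points
            h = self.trunk(x)
            sigma = F.relu(self.sigma_head(h)).squeeze(-1)      # (N,) density
            rgb = torch.sigmoid(self.rgb_head(h))               # (N, 3) color
            seg_logits = self.seg_head(h)                       # (N, K+1) object logits
            return sigma, rgb, seg_logits

    def rendered_mask_loss(weights, seg_logits, mask_labels):
        # weights:     (R, S) volume-rendering weights for S samples on R rays
        # seg_logits:  (R, S, K+1) per-sample object logits from the MLP
        # mask_labels: (R,) integer object id per ray, taken from a 2D SAM mask
        ray_logits = (weights.unsqueeze(-1) * seg_logits).sum(dim=1)  # (R, K+1)
        return F.cross_entropy(ray_logits, mask_labels)

Under these assumptions, manipulating a single object would amount to gating the density field by the argmax of seg_logits before re-rendering, which is consistent with the abstract's claim that segmented objects can be moved independently and a new scene reconstructed.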
DOI: 10.1016/j.neucom.2025.129420