Point-GSMAE: A graph convolution and scale-based masked autoencoder for 3D point cloud representation

Bibliographic Details
Published in:	Information sciences Vol. 719; p. 122474
Main Authors:	Bai, Yun; Yang, Chaozhi; Li, Guanlin; He, Xiao; Xiao, Qian; Li, Zongmin
Format:	Journal Article
Language:	English
Published:	Elsevier Inc 01.11.2025
Subjects:	Graph convolution; Graph scale; Masked autoencoder; Point cloud
ISSN:	0020-0255

Description
Summary:	Masked Autoencoders (MAEs) have demonstrated considerable potential in advancing self-supervised learning for 3D point cloud representation. Nevertheless, existing MAE-based approaches, predominantly relying on Transformer architectures, struggle to effectively model interactions between points in local neighborhoods. This limitation hinders the ability to capture fine-grained local geometric structures of point clouds, negatively impacting tasks that depend on local geometric relationships. In response to this issue, we present Point-GSMAE, an innovative MAE framework for 3D point clouds. This framework integrates graph convolution and graph scale to enhance local geometric modeling. Specifically, it constructs weighted adjacency matrices to encode relationships between neighboring points, edges, and center points, enabling graph convolution to aggregate neighborhood information into center points for precise modeling of local geometric structures. Furthermore, we introduce a graph scale component as a complementary descriptor capturing both the graph structure and spatial distribution, enriching the representation of local geometric properties. To further refine the learned representations, we incorporate a scale consistency loss function, aligning the reconstructed point clouds with their original structures and improving sensitivity to scale variations within local neighborhoods. Comprehensive experiments on multiple datasets demonstrate the efficacy of Point-GSMAE, which outperforms existing Transformer-based MAE methods while requiring fewer parameters.
DOI:	10.1016/j.ins.2025.122474
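
Illustrative sketch (not from the article): the summary describes weighted adjacency between a center point and its neighbors, with graph convolution aggregating neighborhood information into the center point. The snippet below is only a rough illustration of that general idea under stated assumptions. The Gaussian edge weighting, the mean-edge-length "scale" proxy, and every function name here are hypothetical choices made for this example; the record does not say how Point-GSMAE defines its weights, its graph scale, or its loss.

```python
import math


def knn(points, center, k):
    """Return the k points nearest to `center`, excluding the center itself."""
    others = [p for p in points if p != center]
    return sorted(others, key=lambda p: math.dist(p, center))[:k]


def edge_weights(center, neighbors, sigma=1.0):
    """ASSUMPTION: Gaussian kernel on edge length, normalized to sum to 1.
    The article's actual adjacency weighting is not given in this record."""
    raw = [math.exp(-math.dist(center, p) ** 2 / (2 * sigma ** 2))
           for p in neighbors]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(center, neighbors, weights):
    """Weighted sum of edge vectors (neighbor - center), i.e. a simple
    graph-convolution-style aggregation into the center point."""
    return [sum(w * (p[d] - center[d]) for w, p in zip(weights, neighbors))
            for d in range(len(center))]


def neighborhood_scale(center, neighbors):
    """HYPOTHETICAL scale proxy: mean edge length of the local graph."""
    return sum(math.dist(center, p) for p in neighbors) / len(neighbors)


# Toy point cloud: three unit-distance neighbors and one far-away point.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (5, 5, 5)]
center = points[0]
neighbors = knn(points, center, k=3)
weights = edge_weights(center, neighbors)
feature = aggregate(center, neighbors, weights)
scale = neighborhood_scale(center, neighbors)
```

Under these assumptions, a scale-consistency term could compare `neighborhood_scale` on an original patch and on its reconstruction, but that too is a guess at the general shape rather than the authors' formulation.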