Point-GSMAE: A graph convolution and scale-based masked autoencoder for 3D point cloud representation

Bibliographic details
Published in:	Information Sciences, Vol. 719, p. 122474
Main authors:	Bai, Yun; Yang, Chaozhi; Li, Guanlin; He, Xiao; Xiao, Qian; Li, Zongmin
Format:	Journal Article
Language:	English
Published:	Elsevier Inc, 1 November 2025
Subjects:	Graph convolution; Graph scale; Masked autoencoder; Point cloud
ISSN:	0020-0255

Description
Summary:	Masked Autoencoders (MAEs) have demonstrated considerable potential in advancing self-supervised learning for 3D point cloud representation. Nevertheless, existing MAE-based approaches, predominantly relying on Transformer architectures, struggle to effectively model interactions between points in local neighborhoods. This limitation hinders the ability to capture fine-grained local geometric structures of point clouds, negatively impacting tasks that depend on local geometric relationships. In response to this issue, we present Point-GSMAE, an innovative MAE framework for 3D point clouds. This framework integrates graph convolution and graph scale to enhance local geometric modeling. Specifically, it constructs weighted adjacency matrices to encode relationships between neighboring points, edges, and center points, enabling graph convolution to aggregate neighborhood information into center points for precise modeling of local geometric structures. Furthermore, we introduce a graph scale component as a complementary descriptor capturing both the graph structure and spatial distribution, enriching the representation of local geometric properties. To further refine the learned representations, we incorporate a scale consistency loss function, aligning the reconstructed point clouds with their original structures and improving sensitivity to scale variations within local neighborhoods. Comprehensive experiments on multiple datasets demonstrate the efficacy of Point-GSMAE, outperforming existing Transformer-based MAE methods while requiring fewer parameters.
DOI:	10.1016/j.ins.2025.122474