Point-GSMAE: A graph convolution and scale-based masked autoencoder for 3D point cloud representation

Bibliographic details
Published in: Information Sciences, vol. 719, p. 122474
Main authors: Bai, Yun; Yang, Chaozhi; Li, Guanlin; He, Xiao; Xiao, Qian; Li, Zongmin
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 November 2025
ISSN:0020-0255
Description
Abstract: Masked Autoencoders (MAEs) have demonstrated considerable potential in advancing self-supervised learning for 3D point cloud representation. Nevertheless, existing MAE-based approaches, which rely predominantly on Transformer architectures, struggle to model interactions between points in local neighborhoods effectively. This limitation hinders the ability to capture fine-grained local geometric structures of point clouds, negatively impacting tasks that depend on local geometric relationships. In response to this issue, we present Point-GSMAE, an MAE framework for 3D point clouds that integrates graph convolution and graph scale to enhance local geometric modeling. Specifically, it constructs weighted adjacency matrices to encode relationships between neighboring points, edges, and center points, enabling graph convolution to aggregate neighborhood information into center points for precise modeling of local geometric structures. Furthermore, we introduce a graph scale component as a complementary descriptor that captures both the graph structure and the spatial distribution, enriching the representation of local geometric properties. To further refine the learned representations, we incorporate a scale consistency loss function, which aligns the reconstructed point clouds with their original structures and improves sensitivity to scale variations within local neighborhoods. Comprehensive experiments on multiple datasets demonstrate the efficacy of Point-GSMAE, which outperforms existing Transformer-based MAE methods while requiring fewer parameters.
DOI:10.1016/j.ins.2025.122474
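
Illustrative sketch: The abstract describes weighted neighborhood aggregation into center points, a per-neighborhood scale descriptor, and a scale consistency loss, but this record gives no formulas. The Python below is only a toy illustration of those ideas under stated assumptions; it is not the authors' implementation. The Gaussian distance weighting, the use of mean edge length as a "scale", the squared-error loss, and all function names (`knn`, `edge_weights`, `aggregate`, `local_scale`, `scale_consistency`) are hypothetical choices made for this example.

```python
import math


def knn(center, points, k):
    """Return the k points closest to `center` (Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(center, p))[:k]


def edge_weights(center, neighbors, sigma=1.0):
    """One row of a weighted adjacency matrix for `center`.

    Assumption: Gaussian weights on edge length, normalized to sum to 1.
    The paper's actual weighting is not given in this record.
    """
    raw = [
        math.exp(-math.dist(center, p) ** 2 / (2 * sigma ** 2))
        for p in neighbors
    ]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(center, neighbors, weights):
    """Weighted sum of edge vectors (neighbor - center).

    This pulls neighborhood information into the center point, in the
    spirit of a single graph convolution step without learned parameters.
    """
    return tuple(
        sum(w * (p[d] - center[d]) for w, p in zip(weights, neighbors))
        for d in range(3)
    )


def local_scale(center, neighbors):
    """Mean edge length of the neighborhood.

    Assumption: a simple stand-in for a graph scale descriptor.
    """
    return sum(math.dist(center, p) for p in neighbors) / len(neighbors)


def scale_consistency(original_scales, reconstructed_scales):
    """Mean squared difference between per-neighborhood scales.

    Assumption: one plausible form of a scale consistency penalty.
    """
    pairs = list(zip(original_scales, reconstructed_scales))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)
```

A typical use would pick a center point, call `knn` to form its neighborhood, pass the result to `edge_weights` and `aggregate` to obtain a local feature, and compare `local_scale` values between an original and a reconstructed neighborhood with `scale_consistency`. A real model would replace the fixed Gaussian weights with learned ones and operate on batched tensors.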