Contrastive Semantic-Aware Masked Autoencoder for Point Cloud Self-Supervised Learning

Bibliographic details
Published in: IEEE signal processing letters, Vol. 32, pp. 1760-1764
Main authors: He, Yuan; Hu, Guyue; Yu, Shan
Format: Journal Article
Language: English
Published: New York: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1070-9908, 1558-2361
Description
Summary: Masked autoencoders (MAEs) have shown remarkable potential in self-supervised representation learning for 3D point clouds. However, these methods primarily rely on point-level or low-level feature reconstruction, which forces the model to focus on local regions and leaves the feature representation without sufficient global discriminability. Moreover, conventional masking strategies mask point patches at random, neglecting the semantic structure of the point cloud and hindering a holistic understanding of global information and geometric structure. To address these challenges, we propose a Contrastive Semantic-aware Masked Autoencoder (Point-CSMAE), which is equipped with a semantic-aware masking (SAM) strategy and a contrastive regularization (CR) mechanism. Specifically, the SAM strategy adaptively selects patches with richer semantic information for masking and reconstruction, enhancing the understanding of global geometric structure. Furthermore, the CR mechanism adaptively aligns the global information between the masked and visible parts, improving the learned global semantic representation. In turn, the CR mechanism supplies the SAM strategy with effective global semantic representations. Extensive experiments on various downstream tasks, including shape classification, few-shot classification, and part segmentation, demonstrate the superiority of the proposed approach.
DOI: 10.1109/LSP.2025.3560175
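
Illustrative sketch (not taken from the article): the summary describes two ideas, masking the patches that carry richer semantic information and aligning the global information of the masked and visible parts. The record does not say how semantic richness is scored or which contrastive loss is used, so the NumPy code below is only one possible reading under stated assumptions. It assumes that patch embeddings are already available, that "semantic richness" can be approximated by cosine similarity to the mean patch feature, and that the alignment can be expressed as an InfoNCE-style loss over a batch. All function names and parameters (`semantic_scores`, `semantic_aware_mask`, `contrastive_regularization`, `mask_ratio`, `temperature`) are hypothetical.

```python
import numpy as np


def semantic_scores(patch_feats):
    """Score each patch by cosine similarity to the mean patch feature.

    ASSUMPTION: this scoring rule is a stand-in; the record does not
    describe the scoring used by Point-CSMAE.
    patch_feats: array of shape (B, N, D). Returns scores of shape (B, N).
    """
    global_feat = patch_feats.mean(axis=1, keepdims=True)            # (B, 1, D)
    num = (patch_feats * global_feat).sum(axis=-1)                    # (B, N)
    den = (np.linalg.norm(patch_feats, axis=-1)
           * np.linalg.norm(global_feat, axis=-1) + 1e-8)             # (B, N)
    return num / den


def semantic_aware_mask(patch_feats, mask_ratio=0.6):
    """Return a boolean (B, N) mask; True marks the highest-scoring patches."""
    B, N, _ = patch_feats.shape
    # Keep at least one masked and one visible patch (requires N >= 2).
    n_mask = min(max(int(round(N * mask_ratio)), 1), N - 1)
    order = np.argsort(-semantic_scores(patch_feats), axis=1)         # descending
    mask = np.zeros((B, N), dtype=bool)
    np.put_along_axis(mask, order[:, :n_mask], True, axis=1)
    return mask


def info_nce(a, b, temperature=0.1):
    """InfoNCE loss where (a[i], b[i]) are positives and other rows negatives."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    logits = a @ b.T / temperature                                    # (B, B)
    logits = logits - logits.max(axis=1, keepdims=True)               # stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def contrastive_regularization(patch_feats, mask, temperature=0.1):
    """Align the mean-pooled masked and visible features of each sample.

    ASSUMPTION: mean pooling and InfoNCE are illustrative choices only.
    """
    m = mask[..., None].astype(patch_feats.dtype)                     # (B, N, 1)
    masked_global = (patch_feats * m).sum(axis=1) / m.sum(axis=1)
    visible_global = (patch_feats * (1 - m)).sum(axis=1) / (1 - m).sum(axis=1)
    return info_nce(masked_global, visible_global, temperature)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 10, 8))        # 4 clouds, 10 patches, 8-dim features
    mask = semantic_aware_mask(feats, mask_ratio=0.6)
    print(mask.sum(axis=1))                    # number of masked patches per cloud
    print(contrastive_regularization(feats, mask))
```

In a real pipeline the patch features would come from a learned encoder and the loss would be added to the reconstruction objective; the sketch only shows how the two pieces described in the summary could fit together.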