Contrastive Semantic-Aware Masked Autoencoder for Point Cloud Self-Supervised Learning


Bibliographic details
Published in: IEEE Signal Processing Letters, Vol. 32, pp. 1760-1764
Main authors: He, Yuan; Hu, Guyue; Yu, Shan
Format: Journal Article
Language: English
Published: New York: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1070-9908, 1558-2361
Abstract Masked autoencoders (MAEs) have shown remarkable potential in self-supervised representation learning for 3D point clouds. However, these methods rely primarily on point-level or low-level feature reconstruction, which forces the model to focus on local regions and leaves the learned representation without sufficient global discriminability. Moreover, conventional masking strategies mask point patches at random, neglecting the semantic structure of the point cloud and hindering a holistic understanding of global information and geometric structure. To address these challenges, we propose a Contrastive Semantic-aware Masked Autoencoder (Point-CSMAE), which is equipped with a semantic-aware masking (SAM) strategy and a contrastive regularization (CR) mechanism. Specifically, the SAM strategy adaptively selects patches with richer semantic information for masking and reconstruction, enhancing the understanding of global geometric structure. The CR mechanism adaptively aligns the global information between the masked and visible parts, improving the learned global semantic representation; in turn, it supplies the SAM strategy with effective global semantic representations. Extensive experiments on various downstream tasks, including shape classification, few-shot classification, and part segmentation, demonstrate the superiority of the proposed approach.
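The abstract names two components but this record gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that a patch's semantic richness can be scored by cosine similarity between its token and a mean-pooled global token, that the masked and visible parts are aligned with a standard InfoNCE loss, and that the mask ratio and temperature values are arbitrary placeholders. The function names `semantic_aware_mask` and `info_nce` are made up for this example.

```python
import numpy as np


def semantic_aware_mask(patch_tokens, mask_ratio=0.6):
    """Return a boolean mask over patches (True = masked).

    Assumption: semantic richness is approximated by the cosine similarity
    between each patch token and the mean-pooled global token. The paper
    may score patches differently.
    """
    global_token = patch_tokens.mean(axis=0)
    token_norms = np.linalg.norm(patch_tokens, axis=1) + 1e-8
    global_norm = np.linalg.norm(global_token) + 1e-8
    scores = patch_tokens @ global_token / (token_norms * global_norm)

    # Mask the highest-scoring patches instead of sampling at random.
    num_masked = int(round(mask_ratio * len(patch_tokens)))
    masked_idx = np.argsort(-scores)[:num_masked]
    mask = np.zeros(len(patch_tokens), dtype=bool)
    mask[masked_idx] = True
    return mask


def info_nce(visible_global, masked_global, temperature=0.1):
    """InfoNCE loss over a batch; row i of each array forms a positive pair.

    Assumption: the contrastive regularization is a plain InfoNCE term
    between global features of the visible and masked parts.
    """
    v = visible_global / (np.linalg.norm(visible_global, axis=1, keepdims=True) + 1e-8)
    m = masked_global / (np.linalg.norm(masked_global, axis=1, keepdims=True) + 1e-8)
    logits = v @ m.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

In a training loop, a loss of this kind would be added to the reconstruction loss with some weighting; the record does not state how the paper combines the two terms.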
Author He, Yuan
Yu, Shan
Hu, Guyue
Author_xml – sequence: 1
  givenname: Yuan
  orcidid: 0009-0001-1960-9866
  surname: He
  fullname: He, Yuan
  email: heyuan2017@ia.ac.cn
  organization: Laboratory of Brain Atlas and Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
– sequence: 2
  givenname: Guyue
  orcidid: 0000-0002-6198-8230
  surname: Hu
  fullname: Hu, Guyue
  email: guyue.hu@ahu.edu.cn
  organization: School of Artificial Intelligence, Anhui University, Hefei, China
– sequence: 3
  givenname: Shan
  surname: Yu
  fullname: Yu, Shan
  email: shan.yu@nlpr.ia.ac.cn
  organization: Laboratory of Brain Atlas and Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
CODEN ISPLEM
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
DOI 10.1109/LSP.2025.3560175
Discipline Engineering
EISSN 1558-2361
EndPage 1764
ExternalDocumentID 10_1109_LSP_2025_3560175
10964171
Genre orig-research
ISSN 1070-9908
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-6198-8230
0009-0001-1960-9866
PageCount 5
PublicationDate 2025
PublicationPlace New York
PublicationTitle IEEE signal processing letters
PublicationTitleAbbrev LSP
PublicationYear 2025
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 1760
SubjectTerms 3D point cloud
Artificial intelligence
Autoencoders
Brain
Classification
contrastive regularization
Image reconstruction
Machine learning
masked autoencoder
Masking
masking strategy
Nearest neighbor methods
Point cloud compression
Reconstruction
Regularization
Representations
Self-supervised learning
Semantics
Shape
Three dimensional models
Three-dimensional displays
Title Contrastive Semantic-Aware Masked Autoencoder for Point Cloud Self-Supervised Learning
URI https://ieeexplore.ieee.org/document/10964171
https://www.proquest.com/docview/3202188257
Volume 32