Contrastive Semantic-Aware Masked Autoencoder for Point Cloud Self-Supervised Learning
Masked Autoencoder (MAE) has shown remarkable potential in self-supervised representation learning for 3D point clouds. However, these methods primarily rely on point-level or low-level feature reconstruction, forcing the model to focus on local regions while lacking enough global discriminability i...
| Published in: | IEEE Signal Processing Letters, Vol. 32, pp. 1760-1764 |
|---|---|
| Main authors: | He, Yuan; Hu, Guyue; Yu, Shan |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2025 |
| Subjects: | |
| ISSN: | 1070-9908, 1558-2361 |
| Abstract | Masked autoencoders (MAEs) have shown remarkable potential in self-supervised representation learning for 3D point clouds. However, these methods rely primarily on point-level or low-level feature reconstruction, which forces the model to focus on local regions and leaves the learned representation without sufficient global discriminability. Moreover, conventional masking strategies mask point patches at random, neglecting the semantic structure of the point cloud and hindering a holistic understanding of global information and geometric structure. To address these challenges, we propose a Contrastive Semantic-aware Masked Autoencoder (Point-CSMAE), which is equipped with a semantic-aware masking (SAM) strategy and a contrastive regularization (CR) mechanism. Specifically, the SAM strategy adaptively selects patches with richer semantic information for masking and reconstruction, enhancing the understanding of global geometric structure. The CR mechanism adaptively aligns the global information between the masked and visible parts, improving the learned global semantic representation; in turn, it supplies the SAM strategy with effective global semantic representations. Extensive experiments on various downstream tasks, including shape classification, few-shot classification, and part segmentation, demonstrate the superiority of the proposed approach. |
|---|---|
| Author | He, Yuan; Yu, Shan; Hu, Guyue |
| Author_xml | – sequence: 1 givenname: Yuan orcidid: 0009-0001-1960-9866 surname: He fullname: He, Yuan email: heyuan2017@ia.ac.cn organization: Laboratory of Brain Atlas and Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China – sequence: 2 givenname: Guyue orcidid: 0000-0002-6198-8230 surname: Hu fullname: Hu, Guyue email: guyue.hu@ahu.edu.cn organization: School of Artificial Intelligence, Anhui University, Hefei, China – sequence: 3 givenname: Shan surname: Yu fullname: Yu, Shan email: shan.yu@nlpr.ia.ac.cn organization: Laboratory of Brain Atlas and Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China |
| CODEN | ISPLEM |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| DBID | 97E RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/LSP.2025.3560175 |
| DatabaseName | IEEE Xplore (IEEE); IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE Electronic Library (IEL); CrossRef; Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef; Technology Research Database; Computer and Information Systems Abstracts – Academic; Electronics & Communications Abstracts; ProQuest Computer Science Collection; Computer and Information Systems Abstracts; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1558-2361 |
| EndPage | 1764 |
| ExternalDocumentID | 10_1109_LSP_2025_3560175 10964171 |
| Genre | orig-research |
| GroupedDBID | -~X .DC 0R~ 29I 3EH 4.4 5GY 5VS 6IK 85S 97E AAJGR AARMG AASAJ AAWTH AAYJJ ABAZT ABFSI ABQJQ ABVLG ACGFO ACGFS ACIWK AENEX AETIX AGQYO AGSQL AHBIQ AI. AIBXA AKJIK AKQYR ALLEH ALMA_UNASSIGNED_HOLDINGS ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ CS3 DU5 E.L EBS EJD F5P HZ~ H~9 ICLAB IFIPE IFJZH IPLJI JAVBF LAI M43 O9- OCL P2P RIA RIE RNS TAE TN5 VH1 AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| ID | FETCH-LOGICAL-c247t-f60020160a5aebedba2792d8e20a4afa6ec31d7bf91b54905be326c28e26d2a3 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 0 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001484664000005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1070-9908 |
| IngestDate | Mon Jun 30 07:40:17 EDT 2025 Sat Nov 29 07:58:46 EST 2025 Wed Aug 27 01:53:09 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c247t-f60020160a5aebedba2792d8e20a4afa6ec31d7bf91b54905be326c28e26d2a3 |
| ORCID | 0000-0002-6198-8230 0009-0001-1960-9866 |
| PQID | 3202188257 |
| PQPubID | 75747 |
| PageCount | 5 |
| ParticipantIDs | ieee_primary_10964171 proquest_journals_3202188257 crossref_primary_10_1109_LSP_2025_3560175 |
| PublicationCentury | 2000 |
| PublicationDate | 20250000 2025-00-00 20250101 |
| PublicationDateYYYYMMDD | 2025-01-01 |
| PublicationDate_xml | – year: 2025 text: 20250000 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE signal processing letters |
| PublicationTitleAbbrev | LSP |
| PublicationYear | 2025 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SSID | ssj0008185 |
| Score | 2.4444687 |
| Snippet | Masked Autoencoder (MAE) has shown remarkable potential in self-supervised representation learning for 3D point clouds. However, these methods primarily rely... |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Index Database Publisher |
| StartPage | 1760 |
| SubjectTerms | 3D point cloud; Artificial intelligence; Autoencoders; Brain; Classification; contrastive regularization; Image reconstruction; Machine learning; masked autoencoder; Masking; masking strategy; Nearest neighbor methods; Point cloud compression; Reconstruction; Regularization; Representations; Self-supervised learning; Semantics; Shape; Three dimensional models; Three-dimensional displays |
| Title | Contrastive Semantic-Aware Masked Autoencoder for Point Cloud Self-Supervised Learning |
| URI | https://ieeexplore.ieee.org/document/10964171 https://www.proquest.com/docview/3202188257 |
| Volume | 32 |
| WOSCitedRecordID | wos001484664000005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1558-2361 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0008185 issn: 1070-9908 databaseCode: RIE dateStart: 19940101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |
| linkProvider | IEEE |
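
The abstract names two components, a semantic-aware masking (SAM) strategy and a contrastive regularization (CR) mechanism, but this record does not give their formulas. The following is only a minimal sketch of the general idea, under stated assumptions: per-patch semantic scores are taken as given, "richer semantic information" is approximated by picking the highest-scoring patches, global features are obtained by mean pooling, and the alignment term is written as a standard InfoNCE loss. All function names (`select_masked_patches`, `mean_pool`, `info_nce`) are hypothetical and are not taken from the paper.

```python
import math


def select_masked_patches(scores, mask_ratio):
    """Split patch indices into (masked, visible).

    Assumption: patches with the highest semantic score are masked.
    The paper's actual selection rule may differ.
    """
    num_mask = int(round(len(scores) * mask_ratio))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:num_mask]), sorted(order[num_mask:])


def mean_pool(features, indices):
    """Average the feature vectors of the given patches into one global vector."""
    dim = len(features[0])
    return [sum(features[i][d] for i in indices) / len(indices) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss over a batch: anchors[i] should match positives[i].

    Assumption: anchors are global features of the masked parts and
    positives are global features of the visible parts of the same shape.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Usage: mask the two highest-scoring of four patches, then pool each part.
patch_scores = [0.1, 0.9, 0.4, 0.8]
patch_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
masked_idx, visible_idx = select_masked_patches(patch_scores, mask_ratio=0.5)
masked_global = mean_pool(patch_features, masked_idx)
visible_global = mean_pool(patch_features, visible_idx)
```

In this sketch the loss is small when each masked-part feature is closest to the visible-part feature of the same shape and large otherwise, which is one common way to express "aligning global information" between two views; whether Point-CSMAE uses this exact form is not stated in the record.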