ProteinMAE: masked autoencoder for protein surface self-supervised learning
| Published in: | Bioinformatics (Oxford, England) Vol. 39; no. 12 |
|---|---|
| Main Authors: | Yuan, Mingzhi; Shen, Ao; Fu, Kexue; Guan, Jiaming; Ma, Yingfan; Qiao, Qin; Wang, Manning |
| Format: | Journal Article |
| Language: | English |
| Published: | England: Oxford University Press; Oxford Publishing Limited (England), 01.12.2023 |
| Subjects: | |
| ISSN: | 1367-4803, 1367-4811 |
| Abstract | Abstract
Summary
The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. Recently, with the rapid progress of deep learning, a series of learning-based surface descriptors have been proposed and have achieved impressive performance in many tasks, such as protein design and protein–protein interaction prediction. However, they are still limited by label scarcity, since the labels are typically obtained through wet-lab experiments. Inspired by the great success of self-supervised learning in natural language processing and computer vision, we introduce ProteinMAE, a self-supervised framework specifically designed for protein surface representation to mitigate label scarcity. Specifically, we propose an efficient network and pretrain it by self-supervised learning on a large amount of accessible unlabeled protein data. Then we use the pretrained weights as initialization and fine-tune the network on downstream tasks. To demonstrate the effectiveness of our method, we conduct experiments on three downstream tasks: binding site identification on protein surfaces, ligand-binding protein pocket classification, and protein–protein interaction prediction. The experiments show that our method not only improves the network’s performance on all downstream tasks, but also achieves performance competitive with state-of-the-art methods. Moreover, our proposed network has a significant advantage in computational cost, requiring less than a tenth of the memory of previous methods.
Availability and implementation
https://github.com/phdymz/ProteinMAE. |
|---|---|
| Author | Yuan, Mingzhi Guan, Jiaming Shen, Ao Fu, Kexue Qiao, Qin Ma, Yingfan Wang, Manning |
| Author_xml | – sequence: 1 givenname: Mingzhi orcidid: 0000-0003-1322-7530 surname: Yuan fullname: Yuan, Mingzhi – sequence: 2 givenname: Ao surname: Shen fullname: Shen, Ao – sequence: 3 givenname: Kexue orcidid: 0000-0003-1204-0942 surname: Fu fullname: Fu, Kexue email: qinqiao@fudan.edu.cn – sequence: 4 givenname: Jiaming surname: Guan fullname: Guan, Jiaming – sequence: 5 givenname: Yingfan orcidid: 0000-0002-5436-7997 surname: Ma fullname: Ma, Yingfan – sequence: 6 givenname: Qin orcidid: 0000-0001-8369-6135 surname: Qiao fullname: Qiao, Qin email: qinqiao@fudan.edu.cn – sequence: 7 givenname: Manning surname: Wang fullname: Wang, Manning email: mnwang@fudan.edu.cn |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38019955$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1007_s11704_024_3806_9 crossref_primary_10_1093_bib_bbae695 crossref_primary_10_1093_bib_bbae256 crossref_primary_10_1093_bib_bbae455 crossref_primary_10_1016_j_bbrc_2025_151799 crossref_primary_10_1016_j_biotechadv_2025_108603 crossref_primary_10_1007_s10822_025_00658_5 |
| Cites_doi | 10.1093/nar/28.1.235 10.1093/bioinformatics/bty918 10.1002/prot.21248 10.1093/bioinformatics/btu724 10.1002/pro.3280 10.1073/pnas.0906146106 10.1145/357306.357310 10.1093/bib/bbab087 10.3390/ijms18051029 10.1038/s41586-023-05993-x 10.1016/0022-2836(82)90515-0 10.1016/j.jmb.2013.01.014 10.2174/138920311796957612 10.1186/1471-2105-6-96 10.1093/bioinformatics/btq302 10.1186/1758-2946-1-19 10.1109/TPAMI.2022.3152247 10.1038/s41592-019-0666-6 10.1002/(SICI)1097-0282(199603)38:3<305::AID-BIP4>3.0.CO;2-Y |
| ContentType | Journal Article |
| Copyright | The Author(s) 2023. Published by Oxford University Press. |
| Copyright_xml | – notice: The Author(s) 2023. Published by Oxford University Press. 2023 – notice: The Author(s) 2023. Published by Oxford University Press. |
| DBID | TOX AAYXX CITATION NPM 7QF 7QO 7QQ 7SC 7SE 7SP 7SR 7TA 7TB 7TM 7TO 7U5 8BQ 8FD F28 FR3 H8D H8G H94 JG9 JQ2 K9. KR7 L7M L~C L~D P64 7X8 5PM |
| DOI | 10.1093/bioinformatics/btad724 |
| DatabaseName | Oxford Journals Open Access Collection CrossRef PubMed Aluminium Industry Abstracts Biotechnology Research Abstracts Ceramic Abstracts Computer and Information Systems Abstracts Corrosion Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts Materials Business File Mechanical & Transportation Engineering Abstracts Nucleic Acids Abstracts Oncogenes and Growth Factors Abstracts Solid State and Superconductivity Abstracts METADEX Technology Research Database ANTE: Abstracts in New Technology & Engineering Engineering Research Database Aerospace Database Copper Technical Reference Library AIDS and Cancer Research Abstracts Materials Research Database ProQuest Computer Science Collection ProQuest Health & Medical Complete (Alumni) Civil Engineering Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Biotechnology and BioEngineering Abstracts MEDLINE - Academic PubMed Central (Full Participant titles) |
| DatabaseTitle | CrossRef PubMed Materials Research Database Oncogenes and Growth Factors Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Mechanical & Transportation Engineering Abstracts Nucleic Acids Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts ProQuest Health & Medical Complete (Alumni) Materials Business File Aerospace Database Copper Technical Reference Library Engineered Materials Abstracts Biotechnology Research Abstracts AIDS and Cancer Research Abstracts Advanced Technologies Database with Aerospace ANTE: Abstracts in New Technology & Engineering Civil Engineering Abstracts Aluminium Industry Abstracts Electronics & Communications Abstracts Ceramic Abstracts METADEX Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts Professional Solid State and Superconductivity Abstracts Engineering Research Database Corrosion Abstracts MEDLINE - Academic |
| DatabaseTitleList | Materials Research Database MEDLINE - Academic PubMed |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: TOX name: Oxford Journals Open Access Collection url: https://academic.oup.com/journals/ sourceTypes: Publisher – sequence: 3 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Biology |
| EISSN | 1367-4811 |
| ExternalDocumentID | PMC10713117 38019955 10_1093_bioinformatics_btad724 10.1093/bioinformatics/btad724 |
| Genre | Journal Article |
| GrantInformation_xml | – fundername: Technology Innovation Plan Of Shanghai Science and Technology Commission grantid: 23S41900400 – fundername: ; grantid: 23S41900400 |
| GroupedDBID | --- -E4 -~X .-4 .2P .DC .GJ .I3 0R~ 1TH 23N 2WC 4.4 48X 53G 5GY 5WA 70D AAIJN AAIMJ AAJKP AAJQQ AAKPC AAMDB AAMVS AAOGV AAPQZ AAPXW AAUQX AAVAP AAVLN ABEFU ABEJV ABEUO ABGNP ABIXL ABNGD ABNKS ABPQP ABPTD ABQLI ABQTQ ABWST ABXVV ABZBJ ACGFS ACIWK ACPRK ACUFI ACUKT ACUXJ ACYTK ADBBV ADEYI ADEZT ADFTL ADGKP ADGZP ADHKW ADHZD ADMLS ADOCK ADPDF ADRDM ADRTK ADVEK ADYVW ADZTZ ADZXQ AECKG AEGPL AEJOX AEKKA AEKSI AELWJ AEMDU AENEX AENZO AEPUE AETBJ AEWNT AFFNX AFFZL AFGWE AFIYH AFOFC AFRAH AGINJ AGKEF AGQXC AGSYK AHMBA AHXPO AI. AIJHB AJEEA AJEUX AKHUL AKWXX ALMA_UNASSIGNED_HOLDINGS ALTZX ALUQC AMNDL APIBT APWMN AQDSO ARIXL ASPBG ATTQO AVWKF AXUDD AYOIW AZFZN AZVOD BAWUL BAYMD BHONS BQDIO BQUQU BSWAC BTQHN C1A C45 CAG CDBKE COF CS3 CZ4 DAKXR DIK DILTD DU5 D~K EBD EBS EE~ EJD ELUNK EMOBN F5P F9B FEDTE FHSFR FLIZI FLUFQ FOEOM FQBLK GAUVT GJXCC GROUPED_DOAJ GX1 H13 H5~ HAR HVGLF HW0 HZ~ IOX J21 JXSIZ KAQDR KOP KQ8 KSI KSN M-Z M49 MK~ ML0 N9A NGC NLBLG NMDNZ NOMLY NTWIH NU- NVLIB O0~ O9- OAWHX ODMLO OJQWA OK1 OVD OVEED O~Y P2P PAFKI PB- PEELM PQQKQ Q1. Q5Y R44 RD5 RIG RNI RNS ROL RPM RUSNO RW1 RXO RZF RZO SV3 TEORI TJP TLC TOX TR2 VH1 W8F WOQ X7H YAYTL YKOAZ YXANX ZGI ZKX ~91 ~KM AAYXX CITATION ROX NPM 7QF 7QO 7QQ 7SC 7SE 7SP 7SR 7TA 7TB 7TM 7TO 7U5 8BQ 8FD F28 FR3 H8D H8G H94 JG9 JQ2 K9. KR7 L7M L~C L~D P64 7X8 5PM |
| ID | FETCH-LOGICAL-c485t-cee893c7043731de01ad2c827bd5fc17f5cc7fc21826e87bfda57b764c4f15133 |
| IEDL.DBID | TOX |
| ISICitedReferencesCount | 8 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001122483700006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1367-4811 1367-4803 |
| IngestDate | Thu Aug 21 18:35:51 EDT 2025 Thu Jul 10 17:58:12 EDT 2025 Mon Oct 06 17:40:54 EDT 2025 Thu Apr 03 07:06:17 EDT 2025 Sat Nov 29 03:49:28 EST 2025 Tue Nov 18 21:52:33 EST 2025 Wed Apr 02 07:09:52 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| Language | English |
| License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0 The Author(s) 2023. Published by Oxford University Press. |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c485t-cee893c7043731de01ad2c827bd5fc17f5cc7fc21826e87bfda57b764c4f15133 |
| Notes | Mingzhi Yuan and Ao Shen contributed equally. |
| ORCID | 0000-0003-1322-7530 0000-0003-1204-0942 0000-0002-5436-7997 0000-0001-8369-6135 |
| OpenAccessLink | https://dx.doi.org/10.1093/bioinformatics/btad724 |
| PMID | 38019955 |
| PQID | 3231089690 |
| PQPubID | 36124 |
| ParticipantIDs | pubmedcentral_primary_oai_pubmedcentral_nih_gov_10713117 proquest_miscellaneous_2895702045 proquest_journals_3231089690 pubmed_primary_38019955 crossref_citationtrail_10_1093_bioinformatics_btad724 crossref_primary_10_1093_bioinformatics_btad724 oup_primary_10_1093_bioinformatics_btad724 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-12-01 |
| PublicationDateYYYYMMDD | 2023-12-01 |
| PublicationDate_xml | – month: 12 year: 2023 text: 2023-12-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | England |
| PublicationPlace_xml | – name: England – name: Oxford |
| PublicationTitle | Bioinformatics (Oxford, England) |
| PublicationTitleAlternate | Bioinformatics |
| PublicationYear | 2023 |
| Publisher | Oxford University Press Oxford Publishing Limited (England) |
| Publisher_xml | – name: Oxford University Press – name: Oxford Publishing Limited (England) |
| SSID | ssj0005056 |
| Score | 2.5192347 |
| SourceID | pubmedcentral proquest pubmed crossref oup |
| SourceType | Open Access Repository Aggregation Database Index Database Enrichment Source Publisher |
| SubjectTerms | Binding sites Computer vision Computing costs Deep learning Labels Machine learning Natural language processing Original Paper Proteins Self-supervised learning |
| Title | ProteinMAE: masked autoencoder for protein surface self-supervised learning |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/38019955 https://www.proquest.com/docview/3231089690 https://www.proquest.com/docview/2895702045 https://pubmed.ncbi.nlm.nih.gov/PMC10713117 |
| Volume | 39 |
| WOSCitedRecordID | wos001122483700006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVAON databaseName: DOAJ Directory of Open Access Journals customDbUrl: eissn: 1367-4811 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0005056 issn: 1367-4811 databaseCode: DOA dateStart: 20230101 isFulltext: true titleUrlDefault: https://www.doaj.org/ providerName: Directory of Open Access Journals – providerCode: PRVASL databaseName: Oxford Journals Open Access Collection customDbUrl: eissn: 1367-4811 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0005056 issn: 1367-4811 databaseCode: TOX dateStart: 19850101 isFulltext: true titleUrlDefault: https://academic.oup.com/journals/ providerName: Oxford University Press |
| linkProvider | Oxford University Press |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ProteinMAE%3A+masked+autoencoder+for+protein+surface+self-supervised+learning&rft.jtitle=Bioinformatics+%28Oxford%2C+England%29&rft.au=Yuan%2C+Mingzhi&rft.au=Shen%2C+Ao&rft.au=Fu%2C+Kexue&rft.au=Guan%2C+Jiaming&rft.date=2023-12-01&rft.pub=Oxford+University+Press&rft.issn=1367-4803&rft.eissn=1367-4811&rft.volume=39&rft.issue=12&rft_id=info:doi/10.1093%2Fbioinformatics%2Fbtad724&rft_id=info%3Apmid%2F38019955&rft.externalDocID=PMC10713117 |
| thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1367-4811&client=summon |
| thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1367-4811&client=summon |
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1367-4811&client=summon |
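
The abstract describes pretraining by masked autoencoding: part of the protein surface is hidden and the network learns to reconstruct it. The record does not give the details, so the following is only a minimal sketch of the general idea and not the authors' implementation. The patch layout, the 50% mask ratio, the Chamfer-distance reconstruction loss, and the names `random_patch_mask` and `chamfer_distance` are all assumptions made for illustration; a noisy copy of the hidden patches stands in for a real decoder.

```python
import random


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a list of booleans; True marks a patch hidden from the encoder."""
    num_masked = round(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def chamfer_distance(a, b):
    """Symmetric Chamfer distance (squared Euclidean) between two 3D point sets."""
    def sq(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    a_to_b = sum(min(sq(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(sq(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


rng = random.Random(0)

# Hypothetical input: 16 surface patches with 32 points each (assumed layout).
patches = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(32)]
           for _ in range(16)]

# Hide half of the patches (the ratio is an assumption, not from the paper).
mask = random_patch_mask(len(patches), 0.5, rng)
visible = [p for p, m in zip(patches, mask) if not m]
hidden = [p for p, m in zip(patches, mask) if m]

# A real model would encode `visible` and decode predictions for `hidden`;
# here a slightly perturbed copy plays the role of the decoder output.
predicted = [[[c + rng.gauss(0, 0.01) for c in pt] for pt in patch]
             for patch in hidden]

# Reconstruction loss averaged over the hidden patches.
loss = sum(chamfer_distance(p, h) for p, h in zip(predicted, hidden)) / len(hidden)
```

In the pretrain-then-fine-tune scheme the abstract outlines, a loss of this kind would be minimized on unlabeled surfaces, and the resulting encoder weights would then initialize the network for each labeled downstream task.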