ProteinMAE: masked autoencoder for protein surface self-supervised learning


Bibliographic Details
Published in: Bioinformatics (Oxford, England), Vol. 39, no. 12
Main Authors: Yuan, Mingzhi, Shen, Ao, Fu, Kexue, Guan, Jiaming, Ma, Yingfan, Qiao, Qin, Wang, Manning
Format: Journal Article
Language: English
Published: England Oxford University Press 01.12.2023
Oxford Publishing Limited (England)
Subjects:
ISSN: 1367-4803, 1367-4811
Abstract Summary: The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. Recently, with the rapid progress of deep learning, a series of learning-based surface descriptors have been proposed and have achieved impressive performance in many tasks, such as protein design and protein–protein interaction prediction. However, they are still limited by label scarcity, since labels are typically obtained through wet-lab experiments. Inspired by the success of self-supervised learning in natural language processing and computer vision, we introduce ProteinMAE, a self-supervised framework designed specifically for protein surface representation to mitigate label scarcity. We propose an efficient network and pretrain it by self-supervised learning on a large amount of accessible unlabeled protein data. We then use the pretrained weights as initialization and fine-tune the network on downstream tasks. To demonstrate the effectiveness of our method, we conduct experiments on three downstream tasks: binding site identification on protein surfaces, ligand-binding protein pocket classification, and protein–protein interaction prediction. The experiments show that our method not only improves the network's performance on all downstream tasks but also achieves performance competitive with state-of-the-art methods. Moreover, the proposed network has a significant advantage in computational cost, requiring less than a tenth of the memory of previous methods.
Availability and implementation: https://github.com/phdymz/ProteinMAE.
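Illustrative note: the abstract describes masked-autoencoder pretraining only at a high level. The following is a minimal sketch of the general idea (hide a random subset of surface patches, then score the reconstruction on the hidden patches only), not the authors' implementation. The function names, the patch representation as lists of floats, and the mask ratio are all assumptions made for illustration; the actual method is in the repository linked above.

```python
import random


def split_visible_masked(num_patches, mask_ratio, seed=0):
    """Randomly partition patch indices into visible and masked sets.

    Hypothetical helper: the mask ratio is an assumed hyperparameter,
    not a value taken from the paper.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def masked_reconstruction_loss(targets, predictions, masked):
    """Mean squared error computed over the masked patches only."""
    if not masked:
        return 0.0
    total = 0.0
    count = 0
    for i in masked:
        for t, p in zip(targets[i], predictions[i]):
            total += (t - p) ** 2
            count += 1
    return total / count
```

In a pretraining loop of this kind, an encoder would see only the visible patches and a decoder would predict the masked ones; fine-tuning would then reuse the encoder weights as initialization for a downstream task.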
Author Yuan, Mingzhi
Guan, Jiaming
Shen, Ao
Fu, Kexue
Qiao, Qin
Ma, Yingfan
Wang, Manning
Author_xml – sequence: 1
  givenname: Mingzhi
  orcidid: 0000-0003-1322-7530
  surname: Yuan
  fullname: Yuan, Mingzhi
– sequence: 2
  givenname: Ao
  surname: Shen
  fullname: Shen, Ao
– sequence: 3
  givenname: Kexue
  orcidid: 0000-0003-1204-0942
  surname: Fu
  fullname: Fu, Kexue
  email: qinqiao@fudan.edu.cn
– sequence: 4
  givenname: Jiaming
  surname: Guan
  fullname: Guan, Jiaming
– sequence: 5
  givenname: Yingfan
  orcidid: 0000-0002-5436-7997
  surname: Ma
  fullname: Ma, Yingfan
– sequence: 6
  givenname: Qin
  orcidid: 0000-0001-8369-6135
  surname: Qiao
  fullname: Qiao, Qin
  email: qinqiao@fudan.edu.cn
– sequence: 7
  givenname: Manning
  surname: Wang
  fullname: Wang, Manning
  email: mnwang@fudan.edu.cn
BackLink https://www.ncbi.nlm.nih.gov/pubmed/38019955 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1007_s11704_024_3806_9
crossref_primary_10_1093_bib_bbae695
crossref_primary_10_1093_bib_bbae256
crossref_primary_10_1093_bib_bbae455
crossref_primary_10_1016_j_bbrc_2025_151799
crossref_primary_10_1016_j_biotechadv_2025_108603
crossref_primary_10_1007_s10822_025_00658_5
Cites_doi 10.1093/nar/28.1.235
10.1093/bioinformatics/bty918
10.1002/prot.21248
10.1093/bioinformatics/btu724
10.1002/pro.3280
10.1073/pnas.0906146106
10.1145/357306.357310
10.1093/bib/bbab087
10.3390/ijms18051029
10.1038/s41586-023-05993-x
10.1016/0022-2836(82)90515-0
10.1016/j.jmb.2013.01.014
10.2174/138920311796957612
10.1186/1471-2105-6-96
10.1093/bioinformatics/btq302
10.1186/1758-2946-1-19
10.1109/TPAMI.2022.3152247
10.1038/s41592-019-0666-6
10.1002/(SICI)1097-0282(199603)38:3<305::AID-BIP4>3.0.CO;2-Y
ContentType Journal Article
Copyright The Author(s) 2023. Published by Oxford University Press. 2023
The Author(s) 2023. Published by Oxford University Press.
Copyright_xml – notice: The Author(s) 2023. Published by Oxford University Press. 2023
– notice: The Author(s) 2023. Published by Oxford University Press.
DBID TOX
AAYXX
CITATION
NPM
7QF
7QO
7QQ
7SC
7SE
7SP
7SR
7TA
7TB
7TM
7TO
7U5
8BQ
8FD
F28
FR3
H8D
H8G
H94
JG9
JQ2
K9.
KR7
L7M
L~C
L~D
P64
7X8
5PM
DOI 10.1093/bioinformatics/btad724
DatabaseName Oxford Journals Open Access Collection
CrossRef
PubMed
Aluminium Industry Abstracts
Biotechnology Research Abstracts
Ceramic Abstracts
Computer and Information Systems Abstracts
Corrosion Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
Materials Business File
Mechanical & Transportation Engineering Abstracts
Nucleic Acids Abstracts
Oncogenes and Growth Factors Abstracts
Solid State and Superconductivity Abstracts
METADEX
Technology Research Database
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
Aerospace Database
Copper Technical Reference Library
AIDS and Cancer Research Abstracts
Materials Research Database
ProQuest Computer Science Collection
ProQuest Health & Medical Complete (Alumni)
Civil Engineering Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Biotechnology and BioEngineering Abstracts
MEDLINE - Academic
PubMed Central (Full Participant titles)
DatabaseTitle CrossRef
PubMed
Materials Research Database
Oncogenes and Growth Factors Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Mechanical & Transportation Engineering Abstracts
Nucleic Acids Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
ProQuest Health & Medical Complete (Alumni)
Materials Business File
Aerospace Database
Copper Technical Reference Library
Engineered Materials Abstracts
Biotechnology Research Abstracts
AIDS and Cancer Research Abstracts
Advanced Technologies Database with Aerospace
ANTE: Abstracts in New Technology & Engineering
Civil Engineering Abstracts
Aluminium Industry Abstracts
Electronics & Communications Abstracts
Ceramic Abstracts
METADEX
Biotechnology and BioEngineering Abstracts
Computer and Information Systems Abstracts Professional
Solid State and Superconductivity Abstracts
Engineering Research Database
Corrosion Abstracts
MEDLINE - Academic
DatabaseTitleList Materials Research Database

MEDLINE - Academic
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: TOX
  name: Oxford Journals Open Access Collection
  url: https://academic.oup.com/journals/
  sourceTypes: Publisher
– sequence: 3
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Biology
EISSN 1367-4811
ExternalDocumentID PMC10713117
38019955
10_1093_bioinformatics_btad724
10.1093/bioinformatics/btad724
Genre Journal Article
GrantInformation_xml – fundername: Technology Innovation Plan Of Shanghai Science and Technology Commission
  grantid: 23S41900400
ID FETCH-LOGICAL-c485t-cee893c7043731de01ad2c827bd5fc17f5cc7fc21826e87bfda57b764c4f15133
IEDL.DBID TOX
ISICitedReferencesCount 8
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001122483700006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1367-4811
1367-4803
IngestDate Thu Aug 21 18:35:51 EDT 2025
Thu Jul 10 17:58:12 EDT 2025
Mon Oct 06 17:40:54 EDT 2025
Thu Apr 03 07:06:17 EDT 2025
Sat Nov 29 03:49:28 EST 2025
Tue Nov 18 21:52:33 EST 2025
Wed Apr 02 07:09:52 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
https://creativecommons.org/licenses/by/4.0
The Author(s) 2023. Published by Oxford University Press.
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c485t-cee893c7043731de01ad2c827bd5fc17f5cc7fc21826e87bfda57b764c4f15133
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
Mingzhi Yuan and Ao Shen contributed equally.
ORCID 0000-0003-1322-7530
0000-0003-1204-0942
0000-0002-5436-7997
0000-0001-8369-6135
OpenAccessLink https://dx.doi.org/10.1093/bioinformatics/btad724
PMID 38019955
PQID 3231089690
PQPubID 36124
ParticipantIDs pubmedcentral_primary_oai_pubmedcentral_nih_gov_10713117
proquest_miscellaneous_2895702045
proquest_journals_3231089690
pubmed_primary_38019955
crossref_citationtrail_10_1093_bioinformatics_btad724
crossref_primary_10_1093_bioinformatics_btad724
oup_primary_10_1093_bioinformatics_btad724
PublicationCentury 2000
PublicationDate 2023-12-01
PublicationDateYYYYMMDD 2023-12-01
PublicationDate_xml – month: 12
  year: 2023
  text: 2023-12-01
  day: 01
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: Oxford
PublicationTitle Bioinformatics (Oxford, England)
PublicationTitleAlternate Bioinformatics
PublicationYear 2023
Publisher Oxford University Press
Oxford Publishing Limited (England)
Publisher_xml – name: Oxford University Press
– name: Oxford Publishing Limited (England)
SSID ssj0005056
Score 2.5192347
SourceID pubmedcentral
proquest
pubmed
crossref
oup
SourceType Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
SubjectTerms Binding sites
Computer vision
Computing costs
Deep learning
Labels
Machine learning
Natural language processing
Original Paper
Proteins
Self-supervised learning
Title ProteinMAE: masked autoencoder for protein surface self-supervised learning
URI https://www.ncbi.nlm.nih.gov/pubmed/38019955
https://www.proquest.com/docview/3231089690
https://www.proquest.com/docview/2895702045
https://pubmed.ncbi.nlm.nih.gov/PMC10713117
Volume 39
WOSCitedRecordID wos001122483700006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 1367-4811
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0005056
  issn: 1367-4811
  databaseCode: DOA
  dateStart: 20230101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
– providerCode: PRVASL
  databaseName: Oxford Journals Open Access Collection
  customDbUrl:
  eissn: 1367-4811
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0005056
  issn: 1367-4811
  databaseCode: TOX
  dateStart: 19850101
  isFulltext: true
  titleUrlDefault: https://academic.oup.com/journals/
  providerName: Oxford University Press
linkProvider Oxford University Press
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ProteinMAE%3A+masked+autoencoder+for+protein+surface+self-supervised+learning&rft.jtitle=Bioinformatics+%28Oxford%2C+England%29&rft.au=Yuan%2C+Mingzhi&rft.au=Shen%2C+Ao&rft.au=Fu%2C+Kexue&rft.au=Guan%2C+Jiaming&rft.date=2023-12-01&rft.pub=Oxford+University+Press&rft.issn=1367-4803&rft.eissn=1367-4811&rft.volume=39&rft.issue=12&rft_id=info:doi/10.1093%2Fbioinformatics%2Fbtad724&rft_id=info%3Apmid%2F38019955&rft.externalDocID=PMC10713117