ProteinNet: a standardized data set for machine learning of protein structure

Background Rapid progress in deep learning has spurred its application to bioinformatics problems including protein structure prediction and design. In classic machine learning problems like computer vision, progress has been driven by standardized data sets that facilitate fair assessment of new me...

Detailed description

Bibliographic details
Published in: BMC bioinformatics Vol. 20; No. 1; pp. 311 - 10
Main author: AlQuraishi, Mohammed
Format: Journal Article
Language: English
Published: London BioMed Central 11 June 2019
BioMed Central Ltd
Springer Nature B.V
BMC
Subjects:
ISSN: 1471-2105, 1471-2105
Online access: Full text
Abstract
Background: Rapid progress in deep learning has spurred its application to bioinformatics problems including protein structure prediction and design. In classic machine learning problems like computer vision, progress has been driven by standardized data sets that facilitate fair assessment of new methods and lower the barrier to entry for non-domain experts. While data sets of protein sequence and structure exist, they lack certain components critical for machine learning, including high-quality multiple sequence alignments and insulated training/validation splits that account for deep but only weakly detectable homology across protein space.
Results: We created the ProteinNet series of data sets to provide a standardized mechanism for training and assessing data-driven models of protein sequence-structure relationships. ProteinNet integrates sequence, structure, and evolutionary information in programmatically accessible file formats tailored for machine learning frameworks. Multiple sequence alignments of all structurally characterized proteins were created using substantial high-performance computing resources. Standardized data splits were also generated to emulate the difficulty of past CASP (Critical Assessment of protein Structure Prediction) experiments by resetting protein sequence and structure space to the historical states that preceded six prior CASPs. Utilizing sensitive evolution-based distance metrics to segregate distantly related proteins, we have additionally created validation sets distinct from the official CASP sets that faithfully mimic their difficulty.
Conclusion: ProteinNet represents a comprehensive and accessible resource for training and assessing machine-learned models of protein structure.
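Illustrative note: the abstract's idea of "insulated" training/validation splits can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the ProteinNet pipeline. The similarity function below is a crude string ratio from the standard library standing in for the sensitive evolution-based distance metrics the abstract mentions, and the sequence names, the 0.8 threshold, and the helper names are all hypothetical.

```python
from __future__ import annotations

from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Crude stand-in for an evolutionary distance (assumption, not ProteinNet's metric)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_sequences(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage clustering: link any pair whose similarity >= threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if similarity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


def split_by_cluster(clusters: list[set[str]], n_validation: int) -> tuple[set[str], set[str]]:
    """Hold out whole clusters so no validation entry has a close relative in training."""
    ordered = sorted(clusters, key=lambda c: (len(c), sorted(c)))
    validation = set().union(*ordered[:n_validation])
    training = set().union(*ordered[n_validation:])
    return training, validation


# Hypothetical toy sequences; a1 and a2 differ by one residue.
toy = {
    "a1": "MKTAYIAKQRQISFVKSH",
    "a2": "MKTAYIAKQRQISFVKSY",
    "b1": "GSSGSSGPLVPRGSHMAS",
    "c1": "WWWWCCCCDDDDEEEEFF",
}
toy_clusters = cluster_sequences(toy, threshold=0.8)
train_ids, valid_ids = split_by_cluster(toy_clusters, n_validation=1)
```

The design point is that related entries are assigned to a split as a group rather than individually, so near-duplicates such as a1 and a2 can never straddle the training/validation boundary.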
ArticleNumber 311
Audience Academic
Author AlQuraishi, Mohammed
Author_xml – sequence: 1
  givenname: Mohammed
  orcidid: 0000-0001-6817-1322
  surname: AlQuraishi
  fullname: AlQuraishi, Mohammed
  email: alquraishi@hms.harvard.edu
  organization: Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31185886 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1016_j_csbj_2021_12_030
crossref_primary_10_1038_s42256_023_00647_z
crossref_primary_10_3390_biom12091246
crossref_primary_10_1038_s41467_021_26529_9
crossref_primary_10_1007_s00894_024_06259_7
crossref_primary_10_1021_acs_jcim_5c01281
crossref_primary_10_1038_s41592_019_0598_1
crossref_primary_10_3390_app12178465
crossref_primary_10_1002_wcms_1542
crossref_primary_10_1016_j_jmb_2025_169090
crossref_primary_10_3390_biom10040626
crossref_primary_10_1007_s10930_021_10003_y
crossref_primary_10_1038_s41557_025_01760_9
crossref_primary_10_3389_fphar_2025_1498662
crossref_primary_10_1093_bioinformatics_btaf374
crossref_primary_10_1177_11779322251358314
crossref_primary_10_1038_s41580_021_00407_0
crossref_primary_10_3389_fcimb_2021_610348
crossref_primary_10_4014_jmb_2405_05022
crossref_primary_10_1038_s41598_020_70181_0
crossref_primary_10_1186_s12859_024_05914_3
crossref_primary_10_1007_s13721_021_00311_9
crossref_primary_10_1038_s42256_024_00931_6
crossref_primary_10_1038_s41586_023_06622_3
crossref_primary_10_1038_s41592_021_01100_y
crossref_primary_10_1016_j_isci_2025_112480
crossref_primary_10_1103_PhysRevResearch_6_023006
crossref_primary_10_1038_s42256_025_01011_z
crossref_primary_10_1002_prot_26235
crossref_primary_10_1186_s12859_023_05586_5
crossref_primary_10_1038_s41467_023_38347_2
crossref_primary_10_1016_j_csbj_2022_11_012
crossref_primary_10_1186_s12859_020_03644_w
crossref_primary_10_1007_s42330_025_00377_x
crossref_primary_10_1186_s12859_019_3220_8
crossref_primary_10_1186_s13059_023_02948_3
crossref_primary_10_1109_ACCESS_2024_3444468
crossref_primary_10_1186_s12859_022_05031_z
crossref_primary_10_1109_JESTPE_2024_3413163
crossref_primary_10_1093_bib_bbab502
crossref_primary_10_1002_prot_26169
crossref_primary_10_1016_j_csbj_2020_11_007
crossref_primary_10_1002_jctb_6798
crossref_primary_10_3390_ijms252111649
crossref_primary_10_1016_j_cels_2025_101201
crossref_primary_10_1038_s41592_021_01283_4
crossref_primary_10_1080_13678868_2024_2337964
crossref_primary_10_3390_ijms23137389
crossref_primary_10_1109_TPAMI_2021_3095381
crossref_primary_10_1073_pnas_2300838121
crossref_primary_10_2147_DDDT_S405906
crossref_primary_10_1016_j_compbiomed_2024_108810
crossref_primary_10_1016_j_semcancer_2020_01_010
crossref_primary_10_1051_bioconf_20214104003
crossref_primary_10_1186_s12859_023_05310_3
crossref_primary_10_1109_ACCESS_2023_3282702
crossref_primary_10_1002_cbic_202400092
crossref_primary_10_1002_wcms_1637
crossref_primary_10_1016_j_compbiolchem_2025_108429
crossref_primary_10_1016_j_str_2022_05_001
crossref_primary_10_1038_s41587_022_01432_w
Cites_doi 10.1371/journal.pcbi.1002195
10.1038/nbt.3300
10.1093/bioinformatics/bti125
10.1002/0471721204.ch8
10.1007/s11263-015-0816-y
10.1093/nar/gky448
10.1093/bib/bbw108
10.1038/s41591-018-0029-3
10.1093/bioinformatics/16.1.16
10.1038/nbt.4128
10.1093/nar/gkt1240
10.1038/525172a
10.1038/nmeth.1818
10.1038/nbt.3988
10.1093/nar/gky092
10.1186/s12976-015-0014-1
10.1093/protein/12.2.85
10.12688/f1000research.11543.1
10.1093/bioinformatics/btg224
10.1007/978-3-319-41324-2_22
10.1016/j.cels.2019.03.006
10.1002/prot.25431
10.1093/nar/gkq1105
10.1098/rsif.2017.0387
10.1016/S0022-2836(77)80200-3
10.1126/science.aah4043
10.1038/nature14539
10.1002/prot.25415
10.1126/science.1121018
ContentType Journal Article
Copyright The Author(s). 2019
COPYRIGHT 2019 BioMed Central Ltd.
2019. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1186/s12859-019-2932-0
DatabaseName Springer Nature OA Free Journals
CrossRef
Medline
MEDLINE
MEDLINE (Ovid)
MEDLINE
MEDLINE
PubMed
Gale In Context: Science
ProQuest Central (Corporate)
Biotechnology Research Abstracts
Computer and Information Systems Abstracts
Health & Medical Collection
ProQuest Central (purchase pre-March 2016)
Medical Database (Alumni Edition)
Computing Database (Alumni Edition)
ProQuest Pharma Collection
Technology Research Database
ProQuest SciTech Collection
ProQuest Technology Collection
ProQuest Natural Science Journals
ProQuest Hospital Collection
Hospital Premium Collection (Alumni Edition)
ProQuest Central (Alumni) (purchase pre-March 2016)
ProQuest Central (Alumni)
ProQuest One Sustainability
ProQuest Central UK/Ireland
Advanced Technologies & Computer Science Collection
ProQuest Central Essentials - QC
Biological Science Collection
ProQuest Central
Technology Collection
Natural Science Collection
ProQuest One
ProQuest Central Korea
Engineering Research Database
Health Research Premium Collection
Health Research Premium Collection (Alumni)
ProQuest Central Student
SciTech Premium Collection
ProQuest Computer Science Collection
Computer Science Database
ProQuest Health & Medical Complete (Alumni)
Advanced Technologies Database with Aerospace
Biological Sciences
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Computing Database
ProQuest Health & Medical Collection
Medical Database
Biological Science Database
Advanced Technologies & Aerospace Database
ProQuest Advanced Technologies & Aerospace Collection
Biotechnology and BioEngineering Abstracts
ProQuest Central Premium
ProQuest One Academic
Publicly Available Content Database
ProQuest Health & Medical Research Collection
ProQuest One Academic Middle East (New)
ProQuest One Health & Nursing
ProQuest One Academic Eastern Edition (DO NOT USE)
ProQuest One Applied & Life Sciences
ProQuest One Academic (retired)
ProQuest One Academic UKI Edition
ProQuest Central Basic
MEDLINE - Academic
PubMed Central (Full Participant titles)
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
Publicly Available Content Database
Computer Science Database
ProQuest Central Student
ProQuest Advanced Technologies & Aerospace Collection
ProQuest Central Essentials
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
SciTech Premium Collection
ProQuest One Applied & Life Sciences
ProQuest One Sustainability
Health Research Premium Collection
Natural Science Collection
Health & Medical Research Collection
Biological Science Collection
ProQuest Central (New)
ProQuest Medical Library (Alumni)
Advanced Technologies & Aerospace Collection
ProQuest Biological Science Collection
ProQuest One Academic Eastern Edition
ProQuest Hospital Collection
ProQuest Technology Collection
Health Research Premium Collection (Alumni)
Biological Science Database
ProQuest Hospital Collection (Alumni)
Biotechnology and BioEngineering Abstracts
ProQuest Health & Medical Complete
ProQuest One Academic UKI Edition
Engineering Research Database
ProQuest One Academic
ProQuest One Academic (New)
Technology Collection
Technology Research Database
Computer and Information Systems Abstracts – Academic
ProQuest One Academic Middle East (New)
ProQuest Health & Medical Complete (Alumni)
ProQuest Central (Alumni Edition)
ProQuest One Community College
ProQuest One Health & Nursing
ProQuest Natural Science Collection
ProQuest Pharma Collection
ProQuest Central
ProQuest Health & Medical Research Collection
Biotechnology Research Abstracts
Health and Medicine Complete (Alumni Edition)
ProQuest Central Korea
Advanced Technologies Database with Aerospace
ProQuest Computing
ProQuest Central Basic
ProQuest Computing (Alumni Edition)
ProQuest SciTech Collection
Computer and Information Systems Abstracts Professional
Advanced Technologies & Aerospace Database
ProQuest Medical Library
ProQuest Central (Alumni)
MEDLINE - Academic
DatabaseTitleList MEDLINE
Publicly Available Content Database
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 3
  dbid: PIMPY
  name: Publicly Available Content Database
  url: http://search.proquest.com/publiccontent
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Biology
EISSN 1471-2105
EndPage 10
ExternalDocumentID oai_doaj_org_article_9eba93e027e14eed9782f7c28be49cb9
PMC6560865
A590752068
31185886
10_1186_s12859_019_2932_0
Genre Journal Article
GeographicLocations United States
GeographicLocations_xml – name: United States
GrantInformation_xml – fundername: National Institutes of Health
  grantid: U54-CA225088
  funderid: http://dx.doi.org/10.13039/100000002
– fundername: National Institute of General Medical Sciences
  grantid: P50-GM107618
  funderid: http://dx.doi.org/10.13039/100000057
– fundername: NCI NIH HHS
  grantid: U54 CA225088
– fundername: NIGMS NIH HHS
  grantid: P50 GM107618
ID FETCH-LOGICAL-c637t-f791c4b4345d7524c059e88ab4cf0d5bac652bcb0aba66f215340010dce705b63
IEDL.DBID M7P
ISICitedReferencesCount 91
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000471320900003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1471-2105
IngestDate Mon Nov 10 04:31:15 EST 2025
Tue Nov 04 01:52:12 EST 2025
Wed Oct 01 13:44:05 EDT 2025
Mon Oct 06 18:28:55 EDT 2025
Tue Nov 11 10:07:27 EST 2025
Tue Nov 04 18:02:26 EST 2025
Thu Nov 13 15:32:11 EST 2025
Wed Feb 19 02:31:29 EST 2025
Tue Nov 18 21:55:10 EST 2025
Sat Nov 29 05:40:04 EST 2025
Sat Sep 06 07:27:26 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Proteins
Deep learning
CASP
Machine learning
PSSM
Database
Protein sequence
Protein structure prediction
Co-evolution
Protein structure
Language English
License Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c637t-f791c4b4345d7524c059e88ab4cf0d5bac652bcb0aba66f215340010dce705b63
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ORCID 0000-0001-6817-1322
OpenAccessLink https://www.proquest.com/docview/2242709458?pq-origsite=%requestingapplication%
PMID 31185886
PQID 2242709458
PQPubID 44065
PageCount 10
ParticipantIDs doaj_primary_oai_doaj_org_article_9eba93e027e14eed9782f7c28be49cb9
pubmedcentral_primary_oai_pubmedcentral_nih_gov_6560865
proquest_miscellaneous_2265803155
proquest_journals_2242709458
gale_infotracmisc_A590752068
gale_infotracacademiconefile_A590752068
gale_incontextgauss_ISR_A590752068
pubmed_primary_31185886
crossref_citationtrail_10_1186_s12859_019_2932_0
crossref_primary_10_1186_s12859_019_2932_0
springer_journals_10_1186_s12859_019_2932_0
PublicationCentury 2000
PublicationDate 2019-06-11
PublicationDateYYYYMMDD 2019-06-11
PublicationDate_xml – month: 06
  year: 2019
  text: 2019-06-11
  day: 11
PublicationDecade 2010
PublicationPlace London
PublicationPlace_xml – name: London
– name: England
PublicationTitle BMC bioinformatics
PublicationTitleAbbrev BMC Bioinformatics
PublicationTitleAlternate BMC Bioinformatics
PublicationYear 2019
Publisher BioMed Central
BioMed Central Ltd
Springer Nature B.V
BMC
Publisher_xml – name: BioMed Central
– name: BioMed Central Ltd
– name: Springer Nature B.V
– name: BMC
StartPage 311
SubjectTerms Accessibility
Algorithms
Amino Acid Sequence
Analysis
Artificial intelligence
Automation
Binding sites
Bioinformatics
Biomedical and Life Sciences
CASP
Co-evolution
Coevolution
Computational biology
Computational Biology/Bioinformatics
Computer Appl. in Life Sciences
Computer vision
Databases, Protein
Datasets
Deep learning
Historical structures
Homology
Internet
Learning algorithms
Life Sciences
Machine Learning
Machine Learning and Artificial Intelligence in Bioinformatics
Methods
Microarrays
Natural language processing
Protein research
Protein sequence
Protein structure
Protein structure prediction
Proteins
Proteins - chemistry
Reference Standards
Sequence Alignment
Structure (Literature)
Technology application
Training
Title ProteinNet: a standardized data set for machine learning of protein structure
URI https://link.springer.com/article/10.1186/s12859-019-2932-0
https://www.ncbi.nlm.nih.gov/pubmed/31185886
https://www.proquest.com/docview/2242709458
https://www.proquest.com/docview/2265803155
https://pubmed.ncbi.nlm.nih.gov/PMC6560865
https://doaj.org/article/9eba93e027e14eed9782f7c28be49cb9
Volume 20