ProteinNet: a standardized data set for machine learning of protein structure
| Published in: | BMC Bioinformatics Vol. 20; no. 1; p. 311 - 10 |
|---|---|
| Main author: | AlQuraishi, Mohammed |
| Format: | Journal Article |
| Language: | English |
| Published: | London: BioMed Central, 11.06.2019 (BioMed Central Ltd; Springer Nature B.V; BMC) |
| ISSN: | 1471-2105 |
| Abstract | Background
Rapid progress in deep learning has spurred its application to bioinformatics problems including protein structure prediction and design. In classic machine learning problems like computer vision, progress has been driven by standardized data sets that facilitate fair assessment of new methods and lower the barrier to entry for non-domain experts. While data sets of protein sequence and structure exist, they lack certain components critical for machine learning, including high-quality multiple sequence alignments and insulated training/validation splits that account for deep but only weakly detectable homology across protein space.
Results
We created the ProteinNet series of data sets to provide a standardized mechanism for training and assessing data-driven models of protein sequence-structure relationships. ProteinNet integrates sequence, structure, and evolutionary information in programmatically accessible file formats tailored for machine learning frameworks. Multiple sequence alignments of all structurally characterized proteins were created using substantial high-performance computing resources. Standardized data splits were also generated to emulate the difficulty of past CASP (Critical Assessment of protein Structure Prediction) experiments by resetting protein sequence and structure space to the historical states that preceded six prior CASPs. Utilizing sensitive evolution-based distance metrics to segregate distantly related proteins, we have additionally created validation sets distinct from the official CASP sets that faithfully mimic their difficulty.
Conclusion
ProteinNet represents a comprehensive and accessible resource for training and assessing machine-learned models of protein structure. |
|---|---|
| ArticleNumber | 311 |
| Audience | Academic |
| Author | AlQuraishi, Mohammed |
| Author_xml | – sequence: 1 givenname: Mohammed orcidid: 0000-0001-6817-1322 surname: AlQuraishi fullname: AlQuraishi, Mohammed email: alquraishi@hms.harvard.edu organization: Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31185886$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1016_j_csbj_2021_12_030 crossref_primary_10_1038_s42256_023_00647_z crossref_primary_10_3390_biom12091246 crossref_primary_10_1038_s41467_021_26529_9 crossref_primary_10_1007_s00894_024_06259_7 crossref_primary_10_1021_acs_jcim_5c01281 crossref_primary_10_1038_s41592_019_0598_1 crossref_primary_10_3390_app12178465 crossref_primary_10_1002_wcms_1542 crossref_primary_10_1016_j_jmb_2025_169090 crossref_primary_10_3390_biom10040626 crossref_primary_10_1007_s10930_021_10003_y crossref_primary_10_1038_s41557_025_01760_9 crossref_primary_10_3389_fphar_2025_1498662 crossref_primary_10_1093_bioinformatics_btaf374 crossref_primary_10_1177_11779322251358314 crossref_primary_10_1038_s41580_021_00407_0 crossref_primary_10_3389_fcimb_2021_610348 crossref_primary_10_4014_jmb_2405_05022 crossref_primary_10_1038_s41598_020_70181_0 crossref_primary_10_1186_s12859_024_05914_3 crossref_primary_10_1007_s13721_021_00311_9 crossref_primary_10_1038_s42256_024_00931_6 crossref_primary_10_1038_s41586_023_06622_3 crossref_primary_10_1038_s41592_021_01100_y crossref_primary_10_1016_j_isci_2025_112480 crossref_primary_10_1103_PhysRevResearch_6_023006 crossref_primary_10_1038_s42256_025_01011_z crossref_primary_10_1002_prot_26235 crossref_primary_10_1186_s12859_023_05586_5 crossref_primary_10_1038_s41467_023_38347_2 crossref_primary_10_1016_j_csbj_2022_11_012 crossref_primary_10_1186_s12859_020_03644_w crossref_primary_10_1007_s42330_025_00377_x crossref_primary_10_1186_s12859_019_3220_8 crossref_primary_10_1186_s13059_023_02948_3 crossref_primary_10_1109_ACCESS_2024_3444468 crossref_primary_10_1186_s12859_022_05031_z crossref_primary_10_1109_JESTPE_2024_3413163 crossref_primary_10_1093_bib_bbab502 crossref_primary_10_1002_prot_26169 crossref_primary_10_1016_j_csbj_2020_11_007 crossref_primary_10_1002_jctb_6798 crossref_primary_10_3390_ijms252111649 crossref_primary_10_1016_j_cels_2025_101201 crossref_primary_10_1038_s41592_021_01283_4 
crossref_primary_10_1080_13678868_2024_2337964 crossref_primary_10_3390_ijms23137389 crossref_primary_10_1109_TPAMI_2021_3095381 crossref_primary_10_1073_pnas_2300838121 crossref_primary_10_2147_DDDT_S405906 crossref_primary_10_1016_j_compbiomed_2024_108810 crossref_primary_10_1016_j_semcancer_2020_01_010 crossref_primary_10_1051_bioconf_20214104003 crossref_primary_10_1186_s12859_023_05310_3 crossref_primary_10_1109_ACCESS_2023_3282702 crossref_primary_10_1002_cbic_202400092 crossref_primary_10_1002_wcms_1637 crossref_primary_10_1016_j_compbiolchem_2025_108429 crossref_primary_10_1016_j_str_2022_05_001 crossref_primary_10_1038_s41587_022_01432_w |
| Cites_doi | 10.1371/journal.pcbi.1002195 10.1038/nbt.3300 10.1093/bioinformatics/bti125 10.1002/0471721204.ch8 10.1007/s11263-015-0816-y 10.1093/nar/gky448 10.1093/bib/bbw108 10.1038/s41591-018-0029-3 10.1093/bioinformatics/16.1.16 10.1038/nbt.4128 10.1093/nar/gkt1240 10.1038/525172a 10.1038/nmeth.1818 10.1038/nbt.3988 10.1093/nar/gky092 10.1186/s12976-015-0014-1 10.1093/protein/12.2.85 10.12688/f1000research.11543.1 10.1093/bioinformatics/btg224 10.1007/978-3-319-41324-2_22 10.1016/j.cels.2019.03.006 10.1002/prot.25431 10.1093/nar/gkq1105 10.1098/rsif.2017.0387 10.1016/S0022-2836(77)80200-3 10.1126/science.aah4043 10.1038/nature14539 10.1002/prot.25415 10.1126/science.1121018 |
| ContentType | Journal Article |
| Copyright | The Author(s). 2019; COPYRIGHT 2019 BioMed Central Ltd.; 2019. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| Copyright_xml | – notice: The Author(s). 2019 – notice: COPYRIGHT 2019 BioMed Central Ltd. – notice: 2019. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1186/s12859-019-2932-0 |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Biology |
| EISSN | 1471-2105 |
| EndPage | 10 |
| ExternalDocumentID | oai_doaj_org_article_9eba93e027e14eed9782f7c28be49cb9 PMC6560865 A590752068 31185886 10_1186_s12859_019_2932_0 |
| Genre | Journal Article |
| GeographicLocations | United States |
| GeographicLocations_xml | – name: United States |
| GrantInformation_xml | – fundername: National Institutes of Health grantid: U54-CA225088 funderid: http://dx.doi.org/10.13039/100000002 – fundername: National Institute of General Medical Sciences grantid: P50-GM107618 funderid: http://dx.doi.org/10.13039/100000057 – fundername: NCI NIH HHS grantid: U54 CA225088 – fundername: NIGMS NIH HHS grantid: P50 GM107618 |
| IEDL.DBID | DOA |
| ISICitedReferencesCount | 91 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000471320900003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1471-2105 |
| IngestDate | Mon Nov 10 04:31:15 EST 2025 Tue Nov 04 01:52:12 EST 2025 Wed Oct 01 13:44:05 EDT 2025 Mon Oct 06 18:28:55 EDT 2025 Tue Nov 11 10:07:27 EST 2025 Tue Nov 04 18:02:26 EST 2025 Thu Nov 13 15:32:11 EST 2025 Wed Feb 19 02:31:29 EST 2025 Tue Nov 18 21:55:10 EST 2025 Sat Nov 29 05:40:04 EST 2025 Sat Sep 06 07:27:26 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Proteins; Deep learning; CASP; Machine learning; PSSM; Database; Protein sequence; Protein structure prediction; Co-evolution; Protein structure |
| Language | English |
| License | Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
| LinkModel | DirectLink |
| ORCID | 0000-0001-6817-1322 |
| OpenAccessLink | https://doaj.org/article/9eba93e027e14eed9782f7c28be49cb9 |
| PMID | 31185886 |
| PQID | 2242709458 |
| PQPubID | 44065 |
| PageCount | 10 |
| ParticipantIDs | doaj_primary_oai_doaj_org_article_9eba93e027e14eed9782f7c28be49cb9 pubmedcentral_primary_oai_pubmedcentral_nih_gov_6560865 proquest_miscellaneous_2265803155 proquest_journals_2242709458 gale_infotracmisc_A590752068 gale_infotracacademiconefile_A590752068 gale_incontextgauss_ISR_A590752068 pubmed_primary_31185886 crossref_citationtrail_10_1186_s12859_019_2932_0 crossref_primary_10_1186_s12859_019_2932_0 springer_journals_10_1186_s12859_019_2932_0 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-06-11 |
| PublicationDateYYYYMMDD | 2019-06-11 |
| PublicationDate_xml | – month: 06 year: 2019 text: 2019-06-11 day: 11 |
| PublicationDecade | 2010 |
| PublicationPlace | London |
| PublicationPlace_xml | – name: London – name: England |
| PublicationTitle | BMC bioinformatics |
| PublicationTitleAbbrev | BMC Bioinformatics |
| PublicationTitleAlternate | BMC Bioinformatics |
| PublicationYear | 2019 |
| Publisher | BioMed Central; BioMed Central Ltd; Springer Nature B.V; BMC |
| Publisher_xml | – name: BioMed Central – name: BioMed Central Ltd – name: Springer Nature B.V – name: BMC |
| StartPage | 311 |
| SubjectTerms | Accessibility; Algorithms; Amino Acid Sequence; Analysis; Artificial intelligence; Automation; Binding sites; Bioinformatics; Biomedical and Life Sciences; CASP; Co-evolution; Coevolution; Computational biology; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Computer vision; Databases, Protein; Datasets; Deep learning; Historical structures; Homology; Internet; Learning algorithms; Life Sciences; Machine Learning; Machine Learning and Artificial Intelligence in Bioinformatics; Methods; Microarrays; Natural language processing; Protein research; Protein sequence; Protein structure; Protein structure prediction; Proteins; Proteins - chemistry; Reference Standards; Sequence Alignment; Structure (Literature); Technology application; Training |
| Title | ProteinNet: a standardized data set for machine learning of protein structure |
| URI | https://link.springer.com/article/10.1186/s12859-019-2932-0 https://www.ncbi.nlm.nih.gov/pubmed/31185886 https://www.proquest.com/docview/2242709458 https://www.proquest.com/docview/2265803155 https://pubmed.ncbi.nlm.nih.gov/PMC6560865 https://doaj.org/article/9eba93e027e14eed9782f7c28be49cb9 |
| Volume | 20 |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=ProteinNet%3A+a+standardized+data+set+for+machine+learning+of+protein+structure&rft.jtitle=BMC+bioinformatics&rft.au=Mohammed+AlQuraishi&rft.date=2019-06-11&rft.pub=BMC&rft.eissn=1471-2105&rft.volume=20&rft.issue=1&rft.spage=1&rft.epage=10&rft_id=info:doi/10.1186%2Fs12859-019-2932-0&rft.externalDBID=DOA&rft.externalDocID=oai_doaj_org_article_9eba93e027e14eed9782f7c28be49cb9 |