Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine

The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools...

Bibliographic Details
Published in:PLoS computational biology Vol. 12; no. 11; p. e1005017
Main Authors: Singhal, Ayush, Simmons, Michael, Lu, Zhiyong
Format: Journal Article
Language:English
Published: United States Public Library of Science 01.11.2016
Public Library of Science (PLoS)
ISSN:1553-7358, 1553-734X
Abstract The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. 
Across all diseases, our approach returned 272 triplets (disease-gene-variant) that overlapped with entries in UniProt and 5,384 triplets without overlap in UniProt. Analysis of the overlapping triplets and of a stratified sample of the non-overlapping triplets revealed accuracies of 93% and 80% for the respective categories (cumulative accuracy, 77%). We conclude that our process represents an important and broadly applicable improvement to the state of the art for curation of disease-gene-variant relationships.
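As a rough illustration of the two evaluations described in the abstract, the following Python sketch computes an F1-measure from precision and recall, the relative improvement between two F1 scores, and the split of extracted triplets into those that overlap a curated reference set and those that do not. The function names, the tuple representation of a triplet, and the placeholder triplets are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of a triplet: (disease, gene, variant).
Triplet = Tuple[str, str, str]


def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline: float, new: float) -> float:
    """Change from baseline to new, expressed as a fraction of baseline."""
    return (new - baseline) / baseline


def split_by_overlap(
    extracted: Iterable[Triplet], curated: Iterable[Triplet]
) -> Tuple[Set[Triplet], Set[Triplet]]:
    """Return (overlapping, non_overlapping) extracted triplets."""
    extracted_set, curated_set = set(extracted), set(curated)
    return extracted_set & curated_set, extracted_set - curated_set


if __name__ == "__main__":
    # F1 scores quoted in the abstract (rounded values).
    print(f"relative improvement: {relative_improvement(0.62, 0.79):.1%}")

    # Placeholder triplets, not real curated data.
    mined = [("disease A", "GENE1", "p.A1B"), ("disease A", "GENE2", "p.C2D")]
    reference = [("disease A", "GENE1", "p.A1B")]
    overlapping, novel = split_by_overlap(mined, reference)
    print(len(overlapping), len(novel))
```

With the rounded scores 0.62 and 0.79, this calculation gives roughly 27%; the 28% quoted in the abstract may reflect unrounded scores. Exact tuple matching is the simplest possible overlap criterion, and a real comparison would likely need to normalize disease names, gene symbols, and variant notation first.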
To provide personalized health care it is important to understand patients’ genomic variations and the effect these variants have in protecting patients from, or predisposing them to, disease. Several projects aim at providing this information by manually curating such genotype-phenotype relationships in organized databases using data from clinical trials and biomedical literature. However, the exponentially increasing size of biomedical literature and the limited ability of manual curators to discover the genotype-phenotype relationships “hidden” in text have led to delays in keeping such databases updated with the current findings. The result is a bottleneck in leveraging valuable information that is currently available to develop personalized health care solutions. In the past, a few computational techniques have attempted to speed up the curation efforts by using text mining techniques to automatically mine genotype-phenotype information from biomedical literature. However, such computational approaches have not been able to achieve accuracy levels sufficient to make them appealing for practical use. In this work, we present a highly accurate machine-learning-based text mining approach for mining complete genotype-phenotype relationships from biomedical literature. We test the performance of this approach on ten well-known diseases and demonstrate the validity of our approach and its potential utility for practical purposes.
We are currently working towards generating genotype-phenotype relationships for all PubMed data with the goal of developing an exhaustive database of all the known diseases in life science. We believe that this work will provide very important and needed support for implementation of personalized health care using genomic data.
Audience Academic
Author Singhal, Ayush
Simmons, Michael
Lu, Zhiyong
AuthorAffiliation University of Chicago, UNITED STATES
National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States of America
BackLink https://www.ncbi.nlm.nih.gov/pubmed/27902695 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright COPYRIGHT 2016 Public Library of Science
2016 Public Library of Science. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Singhal A, Simmons M, Lu Z (2016) Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine. PLoS Comput Biol 12(11): e1005017. doi:10.1371/journal.pcbi.1005017
DOI 10.1371/journal.pcbi.1005017
Discipline Biology
Medicine
DocumentTitleAlternate Text Mining Genotype-Phenotype Relationships
EISSN 1553-7358
EndPage e1005017
ExternalDocumentID 1849657283
oai_doaj_org_article_c4cfb70119804a3baec9efae0c1a5615
PMC5130168
4280509761
A478272466
27902695
10_1371_journal_pcbi_1005017
Genre Journal Article
GeographicLocations United States--US
Maryland
ISICitedReferencesCount 79
ISSN 1553-7358
1553-734X
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 11
Language English
License This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Creative Commons Attribution License
Notes The authors have no competing interests to declare.
Conceived and designed the experiments: ZL. Performed the experiments: AS MS. Analyzed the data: AS MS. Contributed reagents/materials/analysis tools: AS. Wrote the paper: AS MS ZL.
ORCID 0000-0002-2378-3795
PMID 27902695
PublicationCentury 2000
PublicationDate 2016-11-01
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Francisco
– name: San Francisco, CA USA
PublicationTitle PLoS computational biology
PublicationTitleAlternate PLoS Comput Biol
PublicationYear 2016
Publisher Public Library of Science
Public Library of Science (PLoS)
SourceID plos
doaj
pubmedcentral
proquest
gale
pubmed
crossref
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
StartPage e1005017
SubjectTerms Accuracy
Alzheimer's disease
Automation
Biology and Life Sciences
Biotechnology
Cancer
Charitable foundations
Computer and Information Sciences
Data collection
Data mining
Data Mining - methods
Database Management Systems
Databases, Genetic
Full text
Genetic Predisposition to Disease - epidemiology
Genetic Predisposition to Disease - genetics
Genetic research
Genome, Human - genetics
Genotype & phenotype
Genotypes
High-Throughput Nucleotide Sequencing - methods
Humans
Leukemia
Library collections
Lung cancer
Macular degeneration
Medicine
Medicine and Health Sciences
Methods
Mutation
National libraries
Natural Language Processing
Pancreatic cancer
Patients
Periodicals as Topic
Phenotypes
Precision medicine
Precision Medicine - methods
Prostate cancer
Quality
Research and Analysis Methods
Validation studies
Title Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine
URI https://www.ncbi.nlm.nih.gov/pubmed/27902695
https://www.proquest.com/docview/1849657283
https://www.proquest.com/docview/1845256453
https://www.proquest.com/docview/1859481500
https://pubmed.ncbi.nlm.nih.gov/PMC5130168
https://doaj.org/article/c4cfb70119804a3baec9efae0c1a5615
http://dx.doi.org/10.1371/journal.pcbi.1005017
Volume 12