BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale

Bibliographic Details
Published in: PLoS computational biology, Vol. 16, no. 4, p. e1007617
Main Authors: Chen, Qingyu; Lee, Kyubum; Yan, Shankai; Kim, Sun; Wei, Chih-Hsuan; Lu, Zhiyong
Format: Journal Article
Language: English
Published: United States, Public Library of Science (PLoS), 01.04.2020
ISSN: 1553-734X, 1553-7358
Abstract A massive number of biological entities, such as genes and mutations, are mentioned in the biomedical literature. Capturing the semantic relatedness of these entities is vital to many biological applications, such as protein-protein interaction prediction and literature-based discovery. Concept embeddings, which are vector representations of concepts learned with machine learning models, have been employed to capture the semantics of concepts. To develop concept embeddings, named-entity recognition (NER) tools are first used to identify and normalize concepts from the literature, and then different machine learning models are used to train the embeddings. Despite multiple attempts, existing biomedical concept embeddings generally suffer from suboptimal NER tools, small-scale evaluation, and limited availability. In response, we employed high-performance machine learning-based NER tools for concept recognition and trained our concept embeddings, BioConceptVec, via four different machine learning models on ~30 million PubMed abstracts. BioConceptVec covers over 400,000 biomedical concepts mentioned in the literature and is among the largest publicly available biomedical concept embeddings to date. To evaluate the validity and utility of BioConceptVec, we performed two intrinsic evaluations (identifying related concepts based on drug-gene and gene-gene interactions) and two extrinsic evaluations (protein-protein interaction prediction and drug-drug interaction extraction), collectively using over 25 million instances from nine independent datasets (17 million instances from six intrinsic evaluation tasks and 8 million instances from three extrinsic evaluation tasks), which is, to the best of our knowledge, by far the most comprehensive evaluation to date. The intrinsic evaluation results demonstrate that BioConceptVec consistently outperforms existing concept embeddings, by a large margin, in identifying similar and related concepts. More importantly, the extrinsic evaluation results demonstrate that using BioConceptVec with advanced deep learning models can significantly improve performance in downstream bioinformatics studies and biomedical text-mining applications. Our BioConceptVec embeddings and benchmarking datasets are publicly available at https://github.com/ncbi-nlp/BioConceptVec.
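The intrinsic evaluation described in the abstract asks whether concepts known to be related also lie close together in the embedding space. The sketch below illustrates that idea with cosine similarity over a few toy vectors. It is a minimal illustration and not the authors' evaluation code: the concept identifiers (`Gene_A`, `Gene_B`, `Chemical_C`), the three-dimensional vectors, and the helper names are all hypothetical, and the file format of the released embeddings is not assumed here.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings, top_k=2):
    """Rank every other concept by cosine similarity to the query concept."""
    query_vec = embeddings[query]
    scores = [
        (concept_id, cosine_similarity(query_vec, vec))
        for concept_id, vec in embeddings.items()
        if concept_id != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy embeddings; real concept vectors have far more dimensions.
TOY_EMBEDDINGS = {
    "Gene_A": [0.9, 0.1, 0.0],
    "Gene_B": [0.8, 0.2, 0.1],
    "Chemical_C": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for concept_id, score in most_similar("Gene_A", TOY_EMBEDDINGS):
        print(f"{concept_id}\t{score:.3f}")
```

In this toy setup, `Gene_B` ranks above `Chemical_C` for the query `Gene_A` because its vector points in nearly the same direction. An intrinsic evaluation of the kind the abstract describes would compare such rankings against curated drug-gene or gene-gene interaction data.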
AbstractList Capturing the semantics of related biological concepts, such as genes and mutations, is of significant importance to many research tasks in computational biology, such as protein-protein interaction detection, gene-drug association prediction, and biomedical literature-based discovery. Here, we propose to leverage state-of-the-art text mining tools and machine learning models to learn the semantics via vector representations (also known as embeddings) of over 400,000 biological concepts mentioned in all PubMed abstracts. Our learned embeddings, namely BioConceptVec, can capture related concepts based on their surrounding contextual information in the literature, which goes beyond exact term match or co-occurrence-based methods. BioConceptVec has been thoroughly evaluated in multiple bioinformatics tasks consisting of over 25 million instances from nine different biological datasets. The evaluation results demonstrate that BioConceptVec has better performance than existing methods in all tasks. Finally, BioConceptVec is made freely available to the research community and general public.
Audience Academic
Author Wei, Chih-Hsuan
Chen, Qingyu
Lu, Zhiyong
Kim, Sun
Yan, Shankai
Lee, Kyubum
AuthorAffiliation National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States of America
University of Maryland Baltimore County, UNITED STATES
Author_xml – sequence: 1
  givenname: Qingyu
  orcidid: 0000-0002-6036-1516
  surname: Chen
  fullname: Chen, Qingyu
– sequence: 2
  givenname: Kyubum
  orcidid: 0000-0003-2015-3939
  surname: Lee
  fullname: Lee, Kyubum
– sequence: 3
  givenname: Shankai
  orcidid: 0000-0003-0369-4979
  surname: Yan
  fullname: Yan, Shankai
– sequence: 4
  givenname: Sun
  surname: Kim
  fullname: Kim, Sun
– sequence: 5
  givenname: Chih-Hsuan
  surname: Wei
  fullname: Wei, Chih-Hsuan
– sequence: 6
  givenname: Zhiyong
  orcidid: 0000-0001-9998-916X
  surname: Lu
  fullname: Lu, Zhiyong
BackLink https://www.ncbi.nlm.nih.gov/pubmed/32324731 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1038_s41551_024_01284_6
crossref_primary_10_3389_fphar_2023_1205144
crossref_primary_10_1093_jamia_ocaa151
crossref_primary_10_1038_s41467_025_56989_2
crossref_primary_10_1371_journal_pone_0276539
crossref_primary_10_1002_asi_25005
crossref_primary_10_1038_s41598_025_01418_z
crossref_primary_10_1186_s12859_022_05083_1
crossref_primary_10_2196_27386
crossref_primary_10_3390_genes11111264
crossref_primary_10_2478_dim_2021_0007
crossref_primary_10_2196_29667
crossref_primary_10_1109_TCBB_2022_3173562
crossref_primary_10_1007_s00779_021_01595_4
crossref_primary_10_1371_journal_pntd_0008895
crossref_primary_10_1016_j_drudis_2021_06_009
crossref_primary_10_1093_nar_gkaa952
crossref_primary_10_1007_s12539_024_00605_2
crossref_primary_10_3390_ijerph18178985
crossref_primary_10_1371_journal_pone_0253847
crossref_primary_10_1093_bib_bbab282
crossref_primary_10_1186_s13326_021_00247_z
crossref_primary_10_1007_s12559_021_09903_z
crossref_primary_10_1016_j_eswa_2022_118930
crossref_primary_10_1371_journal_pone_0258623
crossref_primary_10_1017_S0021859623000618
crossref_primary_10_1371_journal_pone_0248663
crossref_primary_10_1109_ACCESS_2021_3130956
crossref_primary_10_7717_peerj_cs_1085
crossref_primary_10_1016_j_ins_2023_01_007
crossref_primary_10_2174_1574893618666230612161210
Cites_doi 10.1136/amiajnl-2013-002544
10.3115/v1/D14-1162
10.1016/j.jbi.2019.103182
10.1371/journal.pcbi.1006390
10.1093/nar/gkx1037
10.1109/BIBE.2018.00073
10.2196/medinform.4321
10.1093/nar/gkz289
10.1093/nar/gky868
10.1093/nar/gky1131
10.1093/bioinformatics/btx541
10.1186/s12911-018-0654-2
10.1093/nar/gkt441
10.18653/v1/N18-1202
10.1371/journal.pone.0038460
10.1007/978-3-319-53817-4_4
10.1038/s41597-019-0055-0
10.1109/ICHI.2019.8904728
10.1038/35011540
10.1186/s12864-018-5370-x
10.1093/bioinformatics/btw343
10.1093/bioinformatics/bty933
10.24963/ijcai.2018/554
10.18653/v1/P17-1161
10.1145/2939672.2939823
10.1186/1472-6947-15-S1-S4
10.18653/v1/W16-2922
10.18653/v1/W19-5004
10.1016/j.jbi.2013.07.011
10.18653/v1/D15-1036
10.1186/s12911-020-1044-0
10.1093/bioinformatics/btr260
10.1142/9789811215636_0027
10.1007/978-3-319-69751-2_1
10.1145/3307339.3342162
10.1002/lnco.362
10.1162/tacl_a_00051
10.1093/bioinformatics/btx659
10.1038/nrg1272
10.1145/2661829.2661974
10.1007/978-1-4614-3223-4
10.18653/v1/D18-1349
10.1016/j.cell.2015.06.043
10.1093/bioinformatics/bty259
10.1016/j.jbi.2017.08.011
10.1016/j.jbi.2019.103118
10.1093/nar/gkh061
10.1186/s12911-019-0766-3
10.1186/s12859-018-2543-1
ContentType Journal Article
Copyright COPYRIGHT 2020 Public Library of Science
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication: https://creativecommons.org/publicdomain/zero/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1371/journal.pcbi.1007617
Discipline Biology
Medicine
DocumentTitleAlternate Biomedical concept embeddings in bioinformatics and biomedical text mining applications
EISSN 1553-7358
ExternalDocumentID 2403774308
oai_doaj_org_article_672492f8e3f84727b53d4b19eac823a1
PMC7237030
A632940667
32324731
10_1371_journal_pcbi_1007617
Genre Journal Article
GeographicLocations United States
United States--US
Maryland
ISICitedReferencesCount 31
ISSN 1553-7358
1553-734X
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
License This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Creative Commons CC0 public domain
Notes The authors have declared that no competing interests exist.
ORCID 0000-0003-0369-4979
0000-0002-6036-1516
0000-0003-2015-3939
0000-0001-9998-916X
OpenAccessLink https://doaj.org/article/672492f8e3f84727b53d4b19eac823a1
PMID 32324731
PublicationCentury 2000
PublicationDate 2020-04-01
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Francisco
– name: San Francisco, CA USA
PublicationTitle PLoS computational biology
PublicationTitleAlternate PLoS Comput Biol
PublicationYear 2020
Publisher Public Library of Science
Public Library of Science (PLoS)
References_xml – volume: 22
  start-page: 143
  issue: 1
  year: 2014
  ident: pcbi.1007617.ref028
  article-title: Evaluating the state of the art in disorder recognition and normalization of the clinical narrative
  publication-title: Journal of the American Medical Informatics Association
  doi: 10.1136/amiajnl-2013-002544
– ident: pcbi.1007617.ref015
  doi: 10.3115/v1/D14-1162
– ident: pcbi.1007617.ref043
– start-page: 103182
  year: 2019
  ident: pcbi.1007617.ref009
  article-title: Concept Embedding to Measure Semantic Relatedness for Biomedical Information Ontologies
  publication-title: Journal of Biomedical Informatics
  doi: 10.1016/j.jbi.2019.103182
– volume: 14
  start-page: e1006390
  issue: 8
  year: 2018
  ident: pcbi.1007617.ref018
  article-title: Scaling up data curation using deep learning: An application to literature triage in genomic variation resources
  publication-title: PLoS computational biology
  doi: 10.1371/journal.pcbi.1006390
– volume: 46
  start-page: D1074
  issue: D1
  year: 2017
  ident: pcbi.1007617.ref058
  article-title: DrugBank 5.0: a major update to the DrugBank database for 2018
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gkx1037
– ident: pcbi.1007617.ref066
  doi: 10.1109/BIBE.2018.00073
– volume: 3
  start-page: e19
  issue: 2
  year: 2015
  ident: pcbi.1007617.ref027
  article-title: Benchmarking clinical speech recognition and information extraction: new data, methods, and evaluations
  publication-title: JMIR medical informatics
  doi: 10.2196/medinform.4321
– ident: pcbi.1007617.ref014
– year: 2019
  ident: pcbi.1007617.ref019
  article-title: LitSense: making sense of biomedical literature at sentence level
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gkz289
– volume: 47
  start-page: D948
  issue: D1
  year: 2018
  ident: pcbi.1007617.ref051
  article-title: The comparative toxicogenomics database: Update 2019
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gky868
– ident: pcbi.1007617.ref030
– volume: 47
  start-page: D607
  issue: D1
  year: 2018
  ident: pcbi.1007617.ref033
  article-title: STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gky1131
– volume: 34
  start-page: 80
  issue: 1
  year: 2017
  ident: pcbi.1007617.ref037
  article-title: tmVar 2.0: integrating genomic variant information from literature with dbSNP and ClinVar for precision medicine
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btx541
– ident: pcbi.1007617.ref024
– volume: 18
  start-page: 74
  issue: 3
  year: 2018
  ident: pcbi.1007617.ref026
  article-title: Comparison of MetaMap and cTAKES for entity extraction in clinical notes
  publication-title: BMC medical informatics and decision making
  doi: 10.1186/s12911-018-0654-2
– volume: 41
  start-page: W518
  issue: W1
  year: 2013
  ident: pcbi.1007617.ref032
  article-title: PubTator: a web-based text mining tool for assisting biocuration
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gkt441
– ident: pcbi.1007617.ref060
  doi: 10.18653/v1/N18-1202
– volume: 7
  start-page: e38460
  issue: 6
  year: 2012
  ident: pcbi.1007617.ref038
  article-title: SR4GN: a species recognition software tool for gene normalization
  publication-title: PLoS One
  doi: 10.1371/journal.pone.0038460
– start-page: 83
  volume-title: Guide to Big Data Applications
  year: 2018
  ident: pcbi.1007617.ref008
  doi: 10.1007/978-3-319-53817-4_4
– year: 2012
  ident: pcbi.1007617.ref002
  article-title: Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II
  publication-title: Database
– volume-title: PubTator central: automated concept annotation for biomedical full text articles
  year: 2019
  ident: pcbi.1007617.ref039
– volume: 6
  start-page: 52
  issue: 1
  year: 2019
  ident: pcbi.1007617.ref053
  article-title: BioWordVec, improving biomedical word embeddings with subword information and MeSH
  publication-title: Scientific data
  doi: 10.1038/s41597-019-0055-0
– ident: pcbi.1007617.ref034
  doi: 10.1109/ICHI.2019.8904728
– volume: 2015
  year: 2015
  ident: pcbi.1007617.ref036
  article-title: GNormPlus: an integrative approach for tagging genes, gene families, and protein domains
  publication-title: BioMed research international
– volume: 402
  start-page: C47
  issue: 6761supp
  year: 1999
  ident: pcbi.1007617.ref050
  article-title: From molecular to modular cell biology
  publication-title: Nature
  doi: 10.1038/35011540
– volume: 20
  start-page: 82
  issue: 1
  year: 2019
  ident: pcbi.1007617.ref022
  article-title: Gene2vec: distributed representation of genes based on co-expression
  publication-title: BMC genomics
  doi: 10.1186/s12864-018-5370-x
– volume: 32
  start-page: 2839
  issue: 18
  year: 2016
  ident: pcbi.1007617.ref035
  article-title: TaggerOne: joint named entity recognition and normalization with semi-Markov Models
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btw343
– year: 2019
  ident: pcbi.1007617.ref006
  article-title: Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine
  publication-title: Database: the journal of biological databases and curation
– ident: pcbi.1007617.ref057
  doi: 10.1093/bioinformatics/bty933
– ident: pcbi.1007617.ref004
– ident: pcbi.1007617.ref016
– ident: pcbi.1007617.ref065
  doi: 10.24963/ijcai.2018/554
– ident: pcbi.1007617.ref062
  doi: 10.18653/v1/P17-1161
– ident: pcbi.1007617.ref064
  doi: 10.1145/2939672.2939823
– volume: 15
  start-page: S4
  issue: 1
  year: 2015
  ident: pcbi.1007617.ref005
  article-title: Entity linking for biomedical literature
  publication-title: BMC medical informatics and decision making
  doi: 10.1186/1472-6947-15-S1-S4
– ident: pcbi.1007617.ref046
  doi: 10.18653/v1/W16-2922
– ident: pcbi.1007617.ref061
  doi: 10.18653/v1/W19-5004
– ident: pcbi.1007617.ref031
– volume: 46
  start-page: 914
  issue: 5
  year: 2013
  ident: pcbi.1007617.ref059
  article-title: The DDI corpus: An annotated corpus with pharmacological substances and drug–drug interactions
  publication-title: Journal of biomedical informatics
  doi: 10.1016/j.jbi.2013.07.011
– volume: 2016
  start-page: 41
  year: 2016
  ident: pcbi.1007617.ref012
  article-title: Learning low-dimensional representations of medical concepts
  publication-title: AMIA Summits on Translational Science Proceedings
– ident: pcbi.1007617.ref029
  doi: 10.18653/v1/D15-1036
– year: 2016
  ident: pcbi.1007617.ref001
  article-title: Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges
  publication-title: Database
– ident: pcbi.1007617.ref044
  doi: 10.1186/s12911-020-1044-0
– volume: 27
  start-page: 1739
  issue: 12
  year: 2011
  ident: pcbi.1007617.ref052
  article-title: Molecular signatures database (MSigDB) 3.0
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btr260
– ident: pcbi.1007617.ref011
  doi: 10.1142/9789811215636_0027
– ident: pcbi.1007617.ref025
  doi: 10.1007/978-3-319-69751-2_1
– ident: pcbi.1007617.ref045
– ident: pcbi.1007617.ref021
  doi: 10.1145/3307339.3342162
– volume: 6
  start-page: 635
  issue: 10
  year: 2012
  ident: pcbi.1007617.ref007
  article-title: Vector space models of word meaning and phrase meaning: A survey
  publication-title: Language and Linguistics Compass
  doi: 10.1002/lnco.362
– ident: pcbi.1007617.ref042
  doi: 10.1162/tacl_a_00051
– volume: 34
  start-page: 828
  issue: 5
  year: 2017
  ident: pcbi.1007617.ref063
  article-title: Drug–drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btx659
– year: 2018
  ident: pcbi.1007617.ref048
  article-title: A comparison of word embeddings for the biomedical natural language processing
  publication-title: Journal of biomedical informatics
– ident: pcbi.1007617.ref013
– volume: 5
  start-page: 101
  issue: 2
  year: 2004
  ident: pcbi.1007617.ref049
  article-title: Network biology: understanding the cell's functional organization
  publication-title: Nature reviews genetics
  doi: 10.1038/nrg1272
– ident: pcbi.1007617.ref047
  doi: 10.1145/2661829.2661974
– ident: pcbi.1007617.ref054
– volume-title: Mining text data
  year: 2012
  ident: pcbi.1007617.ref017
  doi: 10.1007/978-1-4614-3223-4
– ident: pcbi.1007617.ref041
  doi: 10.18653/v1/D18-1349
– volume: 162
  start-page: 425
  issue: 2
  year: 2015
  ident: pcbi.1007617.ref055
  article-title: The BioPlex network: a systematic exploration of the human interactome
  publication-title: Cell
  doi: 10.1016/j.cell.2015.06.043
– volume: 34
  start-page: i52
  issue: 13
  year: 2018
  ident: pcbi.1007617.ref056
  article-title: Onto2vec: joint vector-based representation of biological entities and their ontology-based annotations
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/bty259
– volume: 74
  start-page: 20
  year: 2017
  ident: pcbi.1007617.ref003
  article-title: Literature based discovery: models, methods, and trends
  publication-title: Journal of biomedical informatics
  doi: 10.1016/j.jbi.2017.08.011
– volume: 92
  start-page: 103118
  year: 2019
  ident: pcbi.1007617.ref020
  article-title: Word embeddings and external resources for answer processing in biomedical factoid question answering
  publication-title: Journal of biomedical informatics
  doi: 10.1016/j.jbi.2019.103118
– volume: 32
  start-page: D267
  issue: suppl_1
  year: 2004
  ident: pcbi.1007617.ref023
  article-title: The unified medical language system (UMLS): integrating biomedical terminology
  publication-title: Nucleic acids research
  doi: 10.1093/nar/gkh061
– volume: 19
  start-page: 58
  issue: 2
  year: 2019
  ident: pcbi.1007617.ref010
  article-title: Time-sensitive clinical concept embeddings learned from large electronic health records
  publication-title: BMC medical informatics and decision making
  doi: 10.1186/s12911-019-0766-3
– volume: 19
  start-page: 507
  issue: 20
  year: 2018
  ident: pcbi.1007617.ref040
  article-title: Bidirectional long short-term memory with CRF for detecting biomedical event trigger in FastText semantic space
  publication-title: BMC bioinformatics
  doi: 10.1186/s12859-018-2543-1
Snippet A massive number of biological entities, such as genes and mutations, are mentioned in the biomedical literature. The capturing of the semantic relatedness of...
SourceID plos
doaj
pubmedcentral
proquest
gale
pubmed
crossref
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
StartPage e1007617
SubjectTerms Bioinformatics
Biology and Life Sciences
Biotechnology
Computational biology
Computer and Information Sciences
Concept mapping
Data mining
Datasets
Deep learning
Drug interaction
Drug interactions
Electronic health records
Evaluation
Heart failure
Influence
Kinases
Knowledge
Learning algorithms
Machine learning
Medical literature
Medicine
Medicine and Health Sciences
Methods
Mutation
National libraries
OLE (Standard)
Performance enhancement
Principal components analysis
Protein interaction
Proteins
Recognition
Semantics
Social Sciences
Software
Studies
Title BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale
URI https://www.ncbi.nlm.nih.gov/pubmed/32324731
https://www.proquest.com/docview/2403774308
https://www.proquest.com/docview/2394890191
https://pubmed.ncbi.nlm.nih.gov/PMC7237030
https://doaj.org/article/672492f8e3f84727b53d4b19eac823a1
http://dx.doi.org/10.1371/journal.pcbi.1007617
Volume 16