A survey of word embeddings for clinical text
• We survey methods of representing clinical text using neural networks. • We provide a “how-to” guide for training these representations on clinical text. • We describe word models, corpora, evaluation methods, and applications. Representing words as numerical vectors based on the cont...
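As a rough illustration of what "representing words as numerical vectors based on context" can mean, the sketch below builds simple count-based co-occurrence vectors and compares them with cosine similarity. It is a minimal toy, not the neural methods the survey covers: the example notes, the function names `cooccurrence_vectors` and `cosine`, and the window size are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy "notes"; real clinical corpora are far larger and access-controlled.
toy_notes = [
    "patient denies chest pain",
    "patient denies chest pressure",
    "patient reports mild fever",
]

vectors = cooccurrence_vectors(toy_notes)
print(cosine(vectors["pain"], vectors["pressure"]))
print(cosine(vectors["pain"], vectors["fever"]))
```

In this toy corpus, "pain" and "pressure" occur in the same contexts and so score higher than "pain" and "fever", which share none. Neural embeddings replace the raw counts with learned dense vectors, but the underlying intuition is the same.
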
| Published in: | Journal of biomedical informatics, Volume 100; p. 100057 |
|---|---|
| Main authors: | Khattak, Faiza Khan; Jeblee, Serena; Pou-Prom, Chloé; Abdalla, Mohamed; Meaney, Christopher; Rudzicz, Frank |
| Format: | Journal Article |
| Language: | English |
| Publisher details: | Elsevier Inc, 01.01.2019 |
| ISSN: | 1532-0464, 1532-0480 |
| Abstract | • We survey methods of representing clinical text using neural networks. • We provide a “how-to” guide for training these representations on clinical text. • We describe word models, corpora, evaluation methods, and applications. Representing words as numerical vectors based on the contexts in which they appear has become the de facto method of analyzing text with machine learning. In this paper, we provide a guide for training these representations on clinical text data, using a survey of relevant research. Specifically, we discuss different types of word representations, clinical text corpora, available pre-trained clinical word vector embeddings, intrinsic and extrinsic evaluation, applications, and limitations of these approaches. This work can be used as a blueprint for clinicians and healthcare workers who may want to incorporate clinical text features in their own models and applications. |
|---|---|
| ArticleNumber | 100057 |
| Author | Abdalla, Mohamed Khattak, Faiza Khan Pou-Prom, Chloé Jeblee, Serena Meaney, Christopher Rudzicz, Frank |
| Author_xml | – sequence: 1 givenname: Faiza Khan surname: Khattak fullname: Khattak, Faiza Khan email: faizakk@cs.toronto.edu organization: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada – sequence: 2 givenname: Serena surname: Jeblee fullname: Jeblee, Serena email: sjeblee@cs.toronto.edu organization: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada – sequence: 3 givenname: Chloé surname: Pou-Prom fullname: Pou-Prom, Chloé email: poupromc@smh.ca organization: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada – sequence: 4 givenname: Mohamed surname: Abdalla fullname: Abdalla, Mohamed email: mohamed.abdalla@mail.utoronto.ca organization: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada – sequence: 5 givenname: Christopher surname: Meaney fullname: Meaney, Christopher email: christopher.meaney@utoronto.ca organization: Department of Biostatistics, University of Toronto, Toronto, Ontario, Canada – sequence: 6 givenname: Frank orcidid: 0000-0002-9902-0583 surname: Rudzicz fullname: Rudzicz, Frank email: frank@cs.toronto.edu organization: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada |
| CitedBy_id | crossref_primary_10_1016_j_jbi_2021_103982 crossref_primary_10_3390_info12120491 crossref_primary_10_1109_ACCESS_2024_3460976 crossref_primary_10_1016_j_engappai_2025_110827 crossref_primary_10_1007_s00521_020_05211_z crossref_primary_10_3390_computers13090236 crossref_primary_10_1109_TCSS_2023_3322002 crossref_primary_10_1371_journal_pone_0283800 crossref_primary_10_1016_j_neucom_2025_129638 crossref_primary_10_1055_a_2521_4372 crossref_primary_10_1146_annurev_biodatasci_030421_030931 crossref_primary_10_3390_su13179775 crossref_primary_10_1007_s42979_021_00656_y crossref_primary_10_1109_TR_2024_3513834 crossref_primary_10_1016_j_engappai_2025_110142 crossref_primary_10_1515_cllt_2024_0070 crossref_primary_10_1016_j_neucom_2024_128263 crossref_primary_10_1016_j_neucom_2025_130575 crossref_primary_10_1007_s41870_022_01123_4 crossref_primary_10_2196_43014 crossref_primary_10_1109_ACCESS_2023_3326757 crossref_primary_10_3390_app131910725 crossref_primary_10_1093_jamia_ocab236 crossref_primary_10_1186_s40537_021_00429_7 crossref_primary_10_3390_ijerph17197054 crossref_primary_10_1016_j_jbi_2023_104403 crossref_primary_10_1038_s41598_025_04651_8 crossref_primary_10_1007_s41666_023_00125_6 crossref_primary_10_1007_s41870_023_01338_z crossref_primary_10_3389_fdgth_2021_778305 crossref_primary_10_2196_45171 crossref_primary_10_3390_app12042179 crossref_primary_10_1007_s40593_023_00375_w crossref_primary_10_1515_bams_2021_0117 crossref_primary_10_1093_comjnl_bxae004 crossref_primary_10_1145_3524887 crossref_primary_10_1038_s41598_021_93018_w crossref_primary_10_1145_3626523 crossref_primary_10_3389_fgene_2021_569120 crossref_primary_10_1093_jssam_smad015 crossref_primary_10_1155_2022_3524090 crossref_primary_10_7717_peerj_cs_1985 crossref_primary_10_1016_j_jbi_2021_103902 crossref_primary_10_1186_s12911_022_01850_5 crossref_primary_10_1371_journal_pone_0248663 crossref_primary_10_1007_s11277_022_09646_6 crossref_primary_10_1109_ACCESS_2021_3115617 
crossref_primary_10_1109_TETC_2020_2983404 crossref_primary_10_1109_ACCESS_2023_3335196 crossref_primary_10_1080_00051144_2021_1922150 crossref_primary_10_24054_rcta_v2i44_3018 crossref_primary_10_1007_s11192_023_04689_3 crossref_primary_10_3390_bdcc7010046 crossref_primary_10_1016_j_inffus_2025_103503 crossref_primary_10_1109_ACCESS_2023_3268165 crossref_primary_10_1016_j_patrec_2020_12_013 crossref_primary_10_1109_ACCESS_2024_3521279 crossref_primary_10_1109_ACCESS_2025_3532397 crossref_primary_10_1016_j_rser_2024_114705 crossref_primary_10_2478_ijssis_2022_0002 crossref_primary_10_1371_journal_pone_0276539 crossref_primary_10_1186_s12911_022_01924_4 crossref_primary_10_1016_j_procs_2021_05_078 crossref_primary_10_2196_22651 crossref_primary_10_1007_s10579_022_09620_5 crossref_primary_10_1007_s41666_024_00159_4 crossref_primary_10_1177_00031348221117039 crossref_primary_10_1016_j_compbiomed_2021_104433 crossref_primary_10_1038_s41598_024_75331_2 crossref_primary_10_3390_app11052045 crossref_primary_10_1007_s11042_022_14043_z crossref_primary_10_3389_fpsyg_2024_1401084 crossref_primary_10_1001_jamanetworkopen_2025_26339 crossref_primary_10_3390_app10217557 crossref_primary_10_1016_j_jbi_2021_103971 crossref_primary_10_1145_3611651 crossref_primary_10_2196_31063 crossref_primary_10_1016_j_surg_2024_03_006 crossref_primary_10_3390_su15054216 crossref_primary_10_1007_s13042_025_02627_8 crossref_primary_10_1016_j_jbi_2021_103898 crossref_primary_10_1186_s13040_024_00373_1 crossref_primary_10_1177_20552076231212296 crossref_primary_10_1016_j_eswa_2022_118034 crossref_primary_10_1007_s42979_020_00164_5 crossref_primary_10_1177_20563051231186368 crossref_primary_10_1051_e3sconf_201914003006 crossref_primary_10_2196_24020 crossref_primary_10_1080_08839514_2024_2423326 crossref_primary_10_2196_21679 crossref_primary_10_1016_j_nlp_2023_100026 crossref_primary_10_1109_ACCESS_2024_3409818 crossref_primary_10_1038_s41531_022_00422_8 crossref_primary_10_7717_peerj_cs_1163 
crossref_primary_10_1093_jamia_ocac216 |
| Cites_doi | 10.3115/v1/P14-1023 10.3115/v1/P14-2050 10.18653/v1/P18-1031 10.1109/JBHI.2016.2633963 10.1016/j.jbi.2006.06.004 10.1109/TASLP.2018.2837384 10.2139/ssrn.3064761 10.1006/jbin.2001.1029 10.1016/j.jbi.2010.10.004 10.1038/sdata.2016.35 10.1371/journal.pone.0192360 10.1136/amiajnl-2011-000203 10.1162/COLI_a_00237 10.1073/pnas.1516047113 10.1186/gb-2008-9-s2-s2 10.1197/jamia.M2408 10.1016/j.jbi.2015.07.010 10.1186/s12911-017-0498-1 10.3115/1620754.1620758 10.1109/GRC.2006.1635880 |
| ContentType | Journal Article |
| Copyright | 2019 The Author(s) Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved. |
| Copyright_xml | – notice: 2019 The Author(s) – notice: Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved. |
| DBID | 6I. AAFTH AAYXX CITATION 7X8 |
| DOI | 10.1016/j.yjbinx.2019.100057 |
| DatabaseName | ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef MEDLINE - Academic |
| DatabaseTitle | CrossRef MEDLINE - Academic |
| DatabaseTitleList | MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Medicine Engineering Public Health |
| EISSN | 1532-0480 |
| ExternalDocumentID | 10_1016_j_yjbinx_2019_100057 S2590177X19300563 |
| ID | FETCH-LOGICAL-c3007-989848887886d207a91202987fee08607c7fbaf42b622a6c79fd4030886477a03 |
| ISSN | 1532-0464 1532-0480 |
| IngestDate | Sun Sep 28 09:31:52 EDT 2025 Sat Nov 29 06:51:52 EST 2025 Tue Nov 18 21:46:49 EST 2025 Fri Feb 23 02:44:22 EST 2024 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Clinical data Natural language processing Word embeddings |
| Language | English |
| License | This is an open access article under the CC BY-NC-ND license. |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c3007-989848887886d207a91202987fee08607c7fbaf42b622a6c79fd4030886477a03 |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 ObjectType-Review-3 content type line 23 |
| ORCID | 0000-0002-9902-0583 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.yjbinx.2019.100057 |
| PQID | 2561487008 |
| PQPubID | 23479 |
| ParticipantIDs | proquest_miscellaneous_2561487008 crossref_citationtrail_10_1016_j_yjbinx_2019_100057 crossref_primary_10_1016_j_yjbinx_2019_100057 elsevier_sciencedirect_doi_10_1016_j_yjbinx_2019_100057 |
| PublicationCentury | 2000 |
| PublicationDate | 20190101 |
| PublicationDateYYYYMMDD | 2019-01-01 |
| PublicationDate_xml | – month: 01 year: 2019 text: 20190101 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | Journal of biomedical informatics |
| PublicationYear | 2019 |
| Publisher | Elsevier Inc |
| Publisher_xml | – name: Elsevier Inc |
| References_xml | – reference: H. Zhu, I.C. Paschalidis, A. Tahmasebi, Clinical concept extraction with contextual word embedding, arXiv preprint arXiv:1810.10566. – reference: A. Hliaoutakis, Semantic similarity measures in mesh ontology and their application to information retrieval on medline, Master’s thesis, 2005. – volume: 44 start-page: 251 year: 2011 end-page: 265 ident: b0380 article-title: Towards a framework for developing semantic relatedness reference standards publication-title: J. Biomed. Informat. – reference: Y. Choi, C.Y.-I. Chiu, D. Sontag, Learning Low-Dimensional Representations of Medical Concepts, vol. 2016, American Medical Informatics Association, 2016, pp. 41. – start-page: 1188 year: 2014 end-page: 1196 ident: b0035 article-title: Distributed representations of sentences and documents publication-title: International Conference on Machine Learning – reference: W. Boag, H. Kané, AWE-CM Vectors: Augmenting Word Embeddings with a Clinical Metathesaurus arXiv:1712.01460. – reference: E. Agirre, E. Alfonseca, K. Hall, J. Kravalova, M. Pasca, A. Soroa, A study on similarity and relatedness using distributional and wordnet-based approaches, in: Proceedings of NAACL-HLT 2009, (2009). – reference: C. Culnane, B.I.P. Rubinstein, V. Teague, Health data in an open world, CoRR abs/1712.05627. arXiv:1712.05627. – reference: T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781. – volume: 2017 start-page: 302 year: 2017 end-page: 306 ident: b0175 article-title: Adapting pre-trained word embeddings for use in medical coding publication-title: BioNLP – reference: J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C.H. So, J. Kang, Biobert: pre-trained biomedical language representation model for biomedical text mining, arXiv preprint arXiv:1901.08746. – reference: E. Craig, C. Arias, D. Gillman, Predicting readmission risk from doctors’ notes, arXiv preprint arXiv:1711.10663. 
|
| SecondaryResourceType | review_article |
| StartPage | 100057 |
| SubjectTerms | Clinical data; Natural language processing; Word embeddings |
| Title | A survey of word embeddings for clinical text |
| URI | https://dx.doi.org/10.1016/j.yjbinx.2019.100057 https://www.proquest.com/docview/2561487008 |
| Volume | 100 |