Mining peripheral arterial disease cases from narrative clinical notes using natural language processing
| Published in: | Journal of vascular surgery, Vol. 65, No. 6, p. 1753 |
|---|---|
| Main authors: | Afzal, Naveed; Sohn, Sunghwan; Abram, Sara; Scott, Christopher G; Chaudhry, Rajeev; Liu, Hongfang; Kullo, Iftikhar J; Arruda-Olson, Adelaide M |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 2017-06-01 |
| ISSN: | 1097-6809 |
| Abstract | Objective: Lower extremity peripheral arterial disease (PAD) is highly prevalent and affects millions of individuals worldwide. We developed a natural language processing (NLP) system for automated ascertainment of PAD cases from clinical narrative notes and compared the performance of the NLP algorithm with billing code algorithms, using ankle-brachial index test results as the gold standard.
Methods: We compared the performance of the NLP algorithm to (1) results of gold standard ankle-brachial index; (2) previously validated algorithms based on relevant International Classification of Diseases, Ninth Revision diagnostic codes (simple model); and (3) a combination of International Classification of Diseases, Ninth Revision codes with procedural codes (full model). A dataset of 1569 patients with PAD and controls was randomly divided into training (n = 935) and testing (n = 634) subsets.
Results: We iteratively refined the NLP algorithm in the training set, including narrative note sections, note types, and service types, to maximize its accuracy. In the testing dataset, when compared with both simple and full models, the NLP algorithm had better accuracy (NLP, 91.8%; full model, 81.8%; simple model, 83%; P < .001), positive predictive value (NLP, 92.9%; full model, 74.3%; simple model, 79.9%; P < .001), and specificity (NLP, 92.5%; full model, 64.2%; simple model, 75.9%; P < .001).
Conclusions: A knowledge-driven NLP algorithm for automatic ascertainment of PAD cases from clinical notes had greater accuracy than billing code algorithms. Our findings highlight the potential of NLP tools for rapid and efficient ascertainment of PAD cases from electronic health records to facilitate clinical investigation and eventually improve care by clinical decision support. |
|---|---|
| Author | Arruda-Olson, Adelaide M; Chaudhry, Rajeev; Scott, Christopher G; Afzal, Naveed; Kullo, Iftikhar J; Sohn, Sunghwan; Abram, Sara; Liu, Hongfang |
| Author_xml | 1. Afzal, Naveed (Department of Health Sciences Research, Mayo Clinic, Rochester, Minn); 2. Sohn, Sunghwan (Department of Health Sciences Research, Mayo Clinic, Rochester, Minn); 3. Abram, Sara (Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn); 4. Scott, Christopher G (Department of Health Sciences Research, Mayo Clinic, Rochester, Minn); 5. Chaudhry, Rajeev (Division of Primary Care Medicine, Knowledge Delivery Center and Center for Innovation, Mayo Clinic, Rochester, Minn); 6. Liu, Hongfang (Department of Health Sciences Research, Mayo Clinic, Rochester, Minn); 7. Kullo, Iftikhar J (Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn); 8. Arruda-Olson, Adelaide M (Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn; electronic address: olson.adelaide@mayo.edu) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/28189359$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1016_j_jvs_2020_12_090 crossref_primary_10_1136_svn_2017_000101 crossref_primary_10_3390_biomimetics9080465 crossref_primary_10_1016_j_jvs_2020_09_032 crossref_primary_10_2196_12239 crossref_primary_10_1371_journal_pone_0247872 crossref_primary_10_1111_1754_9485_12861 crossref_primary_10_34248_bsengineering_1499831 crossref_primary_10_1016_j_mayocpiqo_2020_09_012 crossref_primary_10_1161_JAHA_123_031880 crossref_primary_10_3390_jcdd10050202 crossref_primary_10_1007_s40290_017_0216_4 crossref_primary_10_1016_j_amjcard_2024_03_035 crossref_primary_10_1016_j_jvs_2021_05_054 crossref_primary_10_1109_ACCESS_2019_2923583 crossref_primary_10_1109_ACCESS_2025_3588303 crossref_primary_10_1177_00033197241310572 crossref_primary_10_1186_s12913_020_4925_0 crossref_primary_10_1038_s41746_019_0208_8 crossref_primary_10_1146_annurev_biodatasci_080917_013315 crossref_primary_10_1016_j_mri_2023_11_014 crossref_primary_10_57120_yalvac_1536202 crossref_primary_10_3390_w14040674 crossref_primary_10_1177_00033197251324630 crossref_primary_10_1016_j_ijmedinf_2017_12_024 crossref_primary_10_1145_3511020 crossref_primary_10_1177_1536012120914773 crossref_primary_10_1016_j_jbi_2017_11_011 crossref_primary_10_3390_healthcare11020207 crossref_primary_10_1093_jamia_ocad202 crossref_primary_10_1109_ACCESS_2021_3119621 crossref_primary_10_1161_CIRCINTERVENTIONS_120_009447 crossref_primary_10_1161_CIRCRESAHA_121_318224 crossref_primary_10_1016_j_jvs_2024_02_024 crossref_primary_10_1161_CIRCRESAHA_120_316401 crossref_primary_10_1161_JAHA_118_009680 crossref_primary_10_3389_fcvm_2022_949454 crossref_primary_10_56294_pod2025152 crossref_primary_10_1016_j_jvs_2022_07_160 crossref_primary_10_1016_j_avsg_2023_11_057 crossref_primary_10_1161_CIRCINTERVENTIONS_121_011092 crossref_primary_10_3390_app9112331 crossref_primary_10_1109_ACCESS_2020_3007939 crossref_primary_10_1093_jamia_ocaf155 crossref_primary_10_1016_j_mayocp_2020_01_038 
crossref_primary_10_1186_s12911_022_02017_y crossref_primary_10_2196_40964 crossref_primary_10_3390_biomedicines13081836 crossref_primary_10_1016_j_future_2020_07_053 crossref_primary_10_1016_j_ijmedinf_2019_05_008 crossref_primary_10_1038_s41598_022_17180_5 crossref_primary_10_1016_j_ejvsvf_2023_09_002 crossref_primary_10_2196_18542 crossref_primary_10_1038_s41598_025_11870_6 crossref_primary_10_1016_j_imu_2024_101529 crossref_primary_10_1161_CIRCIMAGING_122_014533 crossref_primary_10_1016_j_jss_2024_09_062 crossref_primary_10_1177_15266028231187599 crossref_primary_10_1093_jamiaopen_ooae152 crossref_primary_10_1177_1358863X221094082 crossref_primary_10_2196_42477 crossref_primary_10_2196_27333 crossref_primary_10_2196_23934 crossref_primary_10_1016_j_jvsvi_2024_100111 |
| ContentType | Journal Article |
| Copyright | Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved. |
| DOI | 10.1016/j.jvs.2016.11.031 |
| Discipline | Medicine |
| EISSN | 1097-6809 |
| ExternalDocumentID | 28189359 |
| Genre | Journal Article; Comparative Study; Research Support, N.I.H., Extramural |
| GrantInformation_xml | NHLBI NIH HHS, grant K01 HL124045; NHGRI NIH HHS, grant U01 HG006379; NIA NIH HHS, grant R01 AG034676; NHGRI NIH HHS, grant U01 HG004599 |
| ISICitedReferencesCount | 63 |
| ISSN | 1097-6809 |
| IngestDate | Sun Sep 28 05:45:12 EDT 2025 Thu Apr 03 07:04:09 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| License | Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved. |
| LinkModel | DirectLink |
| OpenAccessLink | https://www.clinicalkey.com/#!/content/1-s2.0-S0741521416318444 |
| PMID | 28189359 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-06-01 |
| PublicationDateYYYYMMDD | 2017-06-01 |
| PublicationDate_xml | – month: 06 year: 2017 text: 2017-06-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Journal of vascular surgery |
| PublicationTitleAlternate | J Vasc Surg |
| PublicationYear | 2017 |
| SubjectTerms | Administrative Claims, Healthcare; Algorithms; Ankle Brachial Index; Data Mining - methods; Databases, Factual; Electronic Health Records; Humans; International Classification of Diseases; Lower Extremity - blood supply; Minnesota; Models, Statistical; Natural Language Processing; Peripheral Arterial Disease - classification; Peripheral Arterial Disease - diagnosis; Retrospective Studies |
| Title | Mining peripheral arterial disease cases from narrative clinical notes using natural language processing |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/28189359 https://www.proquest.com/docview/1867984044 |
| Volume | 65 |
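
The abstract reports accuracy, positive predictive value, and specificity for each algorithm against the ankle-brachial index reference standard. As a reader's aid, the following minimal Python sketch shows how those three measures are conventionally computed from a 2x2 confusion table. It is not the authors' code: the names `ConfusionCounts` and `tabulate` are hypothetical, and the counts in the example are invented for illustration and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm calls versus the reference standard (hypothetical helper)."""
    tp: int  # algorithm: PAD, reference: PAD
    fp: int  # algorithm: PAD, reference: no PAD
    tn: int  # algorithm: no PAD, reference: no PAD
    fn: int  # algorithm: no PAD, reference: PAD


def tabulate(predicted: Iterable[bool], reference: Iterable[bool]) -> ConfusionCounts:
    """Build the 2x2 table from paired per-patient labels (True means PAD)."""
    tp = fp = tn = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif not pred and not ref:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Share of algorithm-positive patients who truly have PAD."""
    return c.tp / (c.tp + c.fp)


def specificity(c: ConfusionCounts) -> float:
    """Share of reference-negative patients the algorithm calls negative."""
    return c.tn / (c.tn + c.fp)


# Invented counts for illustration only; NOT data from the study.
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
print(accuracy(example), positive_predictive_value(example), specificity(example))
```

Note that positive predictive value depends on how common PAD is in the tested cohort, so values from a case-control dataset such as the one described may not transfer directly to an unselected population.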