Mining peripheral arterial disease cases from narrative clinical notes using natural language processing

Bibliographic details
Published in: Journal of vascular surgery, Vol. 65, no. 6, p. 1753
Main authors: Afzal, Naveed; Sohn, Sunghwan; Abram, Sara; Scott, Christopher G; Chaudhry, Rajeev; Liu, Hongfang; Kullo, Iftikhar J; Arruda-Olson, Adelaide M
Format: Journal Article
Language: English
Published: United States, 2017-06-01
ISSN: 1097-6809
Abstract
Objective: Lower extremity peripheral arterial disease (PAD) is highly prevalent and affects millions of individuals worldwide. We developed a natural language processing (NLP) system for automated ascertainment of PAD cases from clinical narrative notes and compared the performance of the NLP algorithm with billing code algorithms, using ankle-brachial index test results as the gold standard.
Methods: We compared the performance of the NLP algorithm to (1) results of the gold standard ankle-brachial index; (2) previously validated algorithms based on relevant International Classification of Diseases, Ninth Revision diagnostic codes (simple model); and (3) a combination of International Classification of Diseases, Ninth Revision codes with procedural codes (full model). A dataset of 1569 patients with PAD and controls was randomly divided into training (n = 935) and testing (n = 634) subsets.
Results: We iteratively refined the NLP algorithm in the training set, including narrative note sections, note types, and service types, to maximize its accuracy. In the testing dataset, when compared with both simple and full models, the NLP algorithm had better accuracy (NLP, 91.8%; full model, 81.8%; simple model, 83%; P < .001), positive predictive value (NLP, 92.9%; full model, 74.3%; simple model, 79.9%; P < .001), and specificity (NLP, 92.5%; full model, 64.2%; simple model, 75.9%; P < .001).
Conclusions: A knowledge-driven NLP algorithm for automatic ascertainment of PAD cases from clinical notes had greater accuracy than billing code algorithms. Our findings highlight the potential of NLP tools for rapid and efficient ascertainment of PAD cases from electronic health records to facilitate clinical investigation and eventually improve care through clinical decision support.
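The abstract reports accuracy, positive predictive value, and specificity for each algorithm against the ankle-brachial index gold standard. As a reading aid, the sketch below shows the standard definitions of those three metrics computed from a 2x2 confusion table. The class name, function names, and counts are illustrative assumptions; the counts are made up and are not taken from the study, and this is not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm output versus the gold standard (hypothetical container)."""
    tp: int  # algorithm says PAD, gold standard says PAD
    fp: int  # algorithm says PAD, gold standard says no PAD
    tn: int  # algorithm says no PAD, gold standard says no PAD
    fn: int  # algorithm says no PAD, gold standard says PAD


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of algorithm-positive patients who truly have the disease."""
    return c.tp / (c.tp + c.fp)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of disease-free patients the algorithm labels negative."""
    return c.tn / (c.tn + c.fp)


# Made-up counts for illustration only; NOT the study's data.
example = ConfusionCounts(tp=90, fp=10, tn=80, fn=20)

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(example):.3f}")
    print(f"PPV         = {positive_predictive_value(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
```

Both positive predictive value and specificity fall as false positives rise, which is consistent with the billing code models trailing the NLP algorithm on both measures in the reported results.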
Author_xml – sequence: 1
  givenname: Naveed
  surname: Afzal
  fullname: Afzal, Naveed
  organization: Department of Health Sciences Research, Mayo Clinic, Rochester, Minn
– sequence: 2
  givenname: Sunghwan
  surname: Sohn
  fullname: Sohn, Sunghwan
  organization: Department of Health Sciences Research, Mayo Clinic, Rochester, Minn
– sequence: 3
  givenname: Sara
  surname: Abram
  fullname: Abram, Sara
  organization: Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn
– sequence: 4
  givenname: Christopher G
  surname: Scott
  fullname: Scott, Christopher G
  organization: Department of Health Sciences Research, Mayo Clinic, Rochester, Minn
– sequence: 5
  givenname: Rajeev
  surname: Chaudhry
  fullname: Chaudhry, Rajeev
  organization: Division of Primary Care Medicine, Knowledge Delivery Center and Center for Innovation, Mayo Clinic, Rochester, Minn
– sequence: 6
  givenname: Hongfang
  surname: Liu
  fullname: Liu, Hongfang
  organization: Department of Health Sciences Research, Mayo Clinic, Rochester, Minn
– sequence: 7
  givenname: Iftikhar J
  surname: Kullo
  fullname: Kullo, Iftikhar J
  organization: Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn
– sequence: 8
  givenname: Adelaide M
  surname: Arruda-Olson
  fullname: Arruda-Olson, Adelaide M
  email: olson.adelaide@mayo.edu
  organization: Department of Cardiovascular Diseases, Mayo Clinic, Rochester, Minn
BackLink https://www.ncbi.nlm.nih.gov/pubmed/28189359$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
DOI 10.1016/j.jvs.2016.11.031
Discipline Medicine
EISSN 1097-6809
ExternalDocumentID 28189359
Genre Journal Article
Comparative Study
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NHLBI NIH HHS
  grantid: K01 HL124045
– fundername: NHGRI NIH HHS
  grantid: U01 HG006379
– fundername: NIA NIH HHS
  grantid: R01 AG034676
– fundername: NHGRI NIH HHS
  grantid: U01 HG004599
ISICitedReferencesCount 63
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
OpenAccessLink https://www.clinicalkey.com/#!/content/1-s2.0-S0741521416318444
PMID 28189359
PublicationDate 2017-06-01
PublicationPlace United States
PublicationTitle Journal of vascular surgery
PublicationTitleAlternate J Vasc Surg
StartPage 1753
SubjectTerms Administrative Claims, Healthcare
Algorithms
Ankle Brachial Index
Data Mining - methods
Databases, Factual
Electronic Health Records
Humans
International Classification of Diseases
Lower Extremity - blood supply
Minnesota
Models, Statistical
Natural Language Processing
Peripheral Arterial Disease - classification
Peripheral Arterial Disease - diagnosis
Retrospective Studies
URI https://www.ncbi.nlm.nih.gov/pubmed/28189359
https://www.proquest.com/docview/1867984044
Volume 65