Heterogeneity of diagnosis and documentation of post-COVID conditions in primary care: A machine learning analysis.
Saved in:
| Title: | Heterogeneity of diagnosis and documentation of post-COVID conditions in primary care: A machine learning analysis. |
|---|---|
| Authors: | Hendrix N; Center for Professionalism and Value in Health Care, American Board of Family Medicine, Washington, District of Columbia, United States of America., Parikh RV; Department of Epidemiology and Population Health, Stanford School of Medicine, Palo Alto, California, United States of America., Taskier M; Center for Professionalism and Value in Health Care, American Board of Family Medicine, Washington, District of Columbia, United States of America., Walter G; Robert Graham Center, American Academy of Family Physicians, Washington, District of Columbia, United States of America., Rochlin I; Inform and Disseminate Division, Office of Public Health Data, Surveillance, and Technology, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America., Saydah S; Coronavirus and Other Respiratory Viruses Division, National Center for Immunizations and Respiratory Disease, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America., Koumans EH; Coronavirus and Other Respiratory Viruses Division, National Center for Immunizations and Respiratory Disease, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America., Rincón-Guevara O; Inform and Disseminate Division, Office of Public Health Data, Surveillance, and Technology, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America., Rehkopf DH; Department of Epidemiology and Population Health, Stanford School of Medicine, Palo Alto, California, United States of America., Phillips RL; Center for Professionalism and Value in Health Care, American Board of Family Medicine, Washington, District of Columbia, United States of America. |
| Source: | PloS one [PLoS One] 2025 May 16; Vol. 20 (5), pp. e0324017. Date of Electronic Publication: 2025 May 16 (Print Publication: 2025). |
| Publication Type: | Journal Article; Multicenter Study; Observational Study |
| Language: | English |
| Journal Info: | Publisher: Public Library of Science Country of Publication: United States NLM ID: 101285081 Publication Model: eCollection Cited Medium: Internet ISSN: 1932-6203 (Electronic) Linking ISSN: 19326203 NLM ISO Abbreviation: PLoS One Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: San Francisco, CA : Public Library of Science |
| MeSH Terms: | COVID-19*/diagnosis , COVID-19*/complications , COVID-19*/epidemiology , Machine Learning* , Primary Health Care*, Adult ; Aged ; Female ; Humans ; Male ; Middle Aged ; Documentation ; Natural Language Processing ; Retrospective Studies ; SARS-CoV-2/isolation & purification ; United States/epidemiology |
| Abstract: | Competing Interests: The authors have declared that no competing interests exist. Background: Post-COVID conditions (PCC) have proven difficult to diagnose. In this retrospective observational study, we aimed to characterize the level of variation in PCC diagnoses observed across clinicians from a number of methodological angles and to determine whether natural language classifiers trained on clinical notes can reconcile differences in diagnostic definitions. Methods: We used data from 519 primary care clinics around the United States who were in the American Family Cohort registry between October 1, 2021 (when the ICD-10 code for PCC was activated) and November 1, 2023. There were 6,116 patients with a diagnostic code for PCC (U09.9), and 5,020 with diagnostic codes for both PCC and COVID-19. We explored these data using 4 different outcomes: 1) Time between COVID-19 and PCC diagnostic codes; 2) Count of patients with PCC diagnostic codes per clinician; 3) Patient-specific probability of PCC diagnostic code based on patient and clinician characteristics; and 4) Performance of a natural language classifier trained on notes from 5,000 patients annotated by two physicians to indicate probable PCC. Results: Of patients with diagnostic codes for PCC and COVID-19, 61.3% were diagnosed with PCC less than 12 weeks after initial recorded COVID-19. Clinicians in the top 1% of diagnostic propensity accounted for more than a third of all PCC diagnoses (35.8%). Comparing LASSO logistic regressions predicting documentation of PCC diagnosis, a log-likelihood test showed significantly better fit when clinician and practice site indicators were included (p < 0.0001). Inter-rater agreement between physician annotators on PCC diagnosis was moderate (Cohen's kappa: 0.60), and performance of the natural language classifiers was marginal (best AUC: 0.724, 95% credible interval: 0.555-0.878). Conclusion: We found evidence of substantial disagreement between clinicians on diagnostic criteria for PCC. The variation in diagnostic rates across clinicians points to the possibilities of under- and over-diagnosis for patients. (Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.) |
| References: | Open Forum Infect Dis. 2022 Nov 24;9(12):ofac640. (PMID: 36570972) J Biomed Inform. 2017 Aug;72:85-95. (PMID: 28694119) Med. 2021 May 14;2(5):501-504. (PMID: 33786465) J Intensive Med. 2021 Oct 22;1(2):110-116. (PMID: 36785563) Int J Med Inform. 2022 Mar 7;162:104736. (PMID: 35316697) EBioMedicine. 2023 Jan;87:104413. (PMID: 36563487) Eur J Radiol. 2022 Mar;148:110164. (PMID: 35114535) N Engl J Med. 2024 Nov 7;391(18):1746-1753. (PMID: 39083764) Pediatr Infect Dis J. 2022 May 1;41(5):424-426. (PMID: 35213866) JAMA Netw Open. 2022 Oct 3;5(10):e2238804. (PMID: 36301542) BMJ. 2020 Dec 23;371:m4938. (PMID: 33361141) Lancet Digit Health. 2022 Jul;4(7):e532-e541. (PMID: 35589549) Nat Rev Microbiol. 2023 Mar;21(3):133-146. (PMID: 36639608) Br J Gen Pract. 2021 Oct 28;71(712):e806-e814. (PMID: 34340970) JAMA Netw Open. 2022 Jul 1;5(7):e2224359. (PMID: 35904783) JAMA. 2023 May 23;329(20):1727-1729. (PMID: 37133827) |
| Entry Date(s): | Date Created: 20250516 Date Completed: 20250516 Latest Revision: 20250527 |
| Update Code: | 20250527 |
| PubMed Central ID: | PMC12083802 |
| DOI: | 10.1371/journal.pone.0324017 |
| PMID: | 40378166 |
| Database: | MEDLINE |
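Note: the abstract reports moderate inter-rater agreement between the two physician annotators (Cohen's kappa: 0.60). Kappa discounts the agreement expected by chance from the raw observed agreement, which is why it can be markedly lower than simple percent agreement. A minimal pure-Python sketch of the calculation (the function name and toy labels below are illustrative, not data from the study):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' labels over the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is the agreement expected by chance from each annotator's
    marginal label frequencies.
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of marginal probabilities, summed over labels.
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[label] / n) * (cb[label] / n) for label in set(a) | set(b))
    return (p_o - p_e) / (1 - p_e)

# Toy example: two annotators label 4 notes as PCC (1) or not (0).
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # → 0.5
```

On conventional benchmarks (e.g., Landis and Koch), values in the 0.41-0.60 range are read as "moderate" agreement, consistent with how the abstract characterizes its kappa of 0.60.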