Predicting treatment retention in medication for opioid use disorder: a machine learning approach using NLP and LLM-derived clinical features.

Bibliographic Details
Title: Predicting treatment retention in medication for opioid use disorder: a machine learning approach using NLP and LLM-derived clinical features.
Authors: Nateghi Haredasht F; Stanford Center for Biomedical Informatics Research, Stanford, CA 94305, United States., Lopez I; Stanford University School of Medicine, Stanford, CA 94305, United States.; Department of Biomedical Data Science, Stanford, CA 94305, United States., Tate S; Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States., Ashtari P; Department of Electrical Engineering (ESAT), STADIUS Center, KU Leuven, 3001 Leuven, Belgium., Chan MM; KKT Technologies, Pte. Ltd., 139951, Singapore.; Holmusk Technologies, Inc., NY 10012, United States., Kulkarni D; KKT Technologies, Pte. Ltd., 139951, Singapore.; Holmusk Technologies, Inc., NY 10012, United States., Chen CA; Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, United States., Vangala M; Holmusk Technologies, Inc., NY 10012, United States., Griffith K; Holmusk Europe, Ltd., London, United Kingdom., Bunning B; Department of Biomedical Data Science, Stanford, CA 94305, United States., Miner AS; Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States., Hernandez-Boussard T; Stanford Center for Biomedical Informatics Research, Stanford, CA 94305, United States., Humphreys K; Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States.; Veterans Affairs Health Care System, Palo Alto, CA 94304, United States., Lembke A; Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA 94305, United States., Vance LA; Holmusk Technologies, Inc., NY 10012, United States., Chen JH; Stanford Center for Biomedical Informatics Research, Stanford, CA 94305, United States.; Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA 94305, United States.; Clinical Excellence Research Center, Stanford School of Medicine, Stanford, CA 94305, United States.; Department of Medicine, Stanford, CA 94305, United States.
Source: Journal of the American Medical Informatics Association : JAMIA [J Am Med Inform Assoc] 2025 Dec 01; Vol. 32 (12), pp. 1865-1876.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Oxford University Press Country of Publication: England NLM ID: 9430800 Publication Model: Print Cited Medium: Internet ISSN: 1527-974X (Electronic) Linking ISSN: 10675027 NLM ISO Abbreviation: J Am Med Inform Assoc Subsets: MEDLINE
Imprint Name(s): Publication: 2015- : Oxford : Oxford University Press
Original Publication: Philadelphia, PA : Hanley & Belfus, c1993-
MeSH Terms: Machine Learning* , Opioid-Related Disorders*/drug therapy , Natural Language Processing* , Buprenorphine, Naloxone Drug Combination*/therapeutic use , Opiate Substitution Treatment* , Narcotic Antagonists*/therapeutic use, Humans ; Electronic Health Records ; Female ; Male ; ROC Curve ; Adult ; Logistic Models ; Buprenorphine/therapeutic use
Abstract: Objective: Building upon our previous work on predicting treatment retention in medications for opioid use disorder, we aimed to improve 6-month retention prediction in buprenorphine-naloxone (BUP-NAL) therapy by incorporating features derived from large language models (LLMs) applied to unstructured clinical notes.
Materials and Methods: We used de-identified electronic health record (EHR) data from Stanford Health Care (STARR) for model development and internal validation, and the NeuroBlu behavioral health database for external validation. Structured features were supplemented with 13 clinical and psychosocial features extracted from free-text notes using the CLinical Entity Augmented Retrieval pipeline, which combines named entity recognition with LLM-based classification to provide contextual interpretation. We trained classification (Logistic Regression, Random Forest, XGBoost) and survival models (CoxPH, Random Survival Forest, Survival XGBoost), evaluated using Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and C-index.
Results: XGBoost achieved the highest classification performance (ROC-AUC = 0.65). Incorporating LLM-derived features improved model performance across all architectures, with the largest gains observed in simpler models such as Logistic Regression. In time-to-event analysis, Random Survival Forest and Survival XGBoost reached the highest C-index (≈0.65). SHapley Additive exPlanations analysis identified LLM-extracted features like Chronic Pain, Liver Disease, and Major Depression as key predictors. We also developed an interactive web tool for real-time clinical use.
Discussion: Features extracted using NLP and LLM-assisted methods improved model accuracy and interpretability, revealing valuable psychosocial risks not captured in structured EHRs.
Conclusion: Combining structured EHR data with LLM-extracted features moderately improves BUP-NAL retention prediction, enabling personalized risk stratification and advancing AI-driven care for substance use disorders.
(© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association.)
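Illustrative note (not part of the published record): the abstract reports two evaluation metrics, ROC-AUC for the classification models and the C-index for the survival models. The sketch below shows how each can be computed from first principles on a small made-up dataset. The function names, the toy values, and the simplified handling of tied event times are assumptions for illustration only; they are not the authors' code or data.

```python
from itertools import combinations


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    A pair is comparable when the shorter observed time is an actual event.
    Simplification (assumption): pairs with tied observed times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher predicted risk, earlier event
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: 1 = retained at 6 months, score = predicted probability.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]

# Hypothetical toy data: days to discontinuation, 0 = censored at 180 days.
toy_times = [30, 90, 180, 60, 180]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.5, 0.2, 0.4, 0.3]

if __name__ == "__main__":
    print("ROC-AUC:", round(roc_auc(toy_labels, toy_scores), 3))
    print("C-index:", round(concordance_index(toy_times, toy_events, toy_risks), 3))
```

Both metrics are rank-based: a value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the reported values of about 0.65 should be read.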
Grant Information: 1R01AI17812101 National Institute of Allergy and Infectious Diseases; Google, Inc.; #12409 Betty Moore Foundation; Human-Centered Artificial Intelligence; NIH; UG1DA015815 NIH/National Institute on Drug Abuse Clinical Trials Network; Stanford Artificial Intelligence in Medicine and Imaging-Human-Centered Artificial Intelligence (AIMI-HAI) Partnership; 1R01AI17812101 NIH/National Institute of Allergy and Infectious Diseases; #12409 Gordon and Betty Moore Foundation; UL1TR003142 NIH-NCATS-CTSA; American Heart Association-Strategically Focused Research Network on Diversity in Clinical Trials; CTN-0136 NIH/National Institute on Drug Abuse Clinical Trials Network; UG1DA015815 - CTN-0136 National Institute on Drug Abuse Clinical Trials Network; Stanford Artificial Intelligence in Medicine and Imaging; UL1 TR003142 United States TR NCATS NIH HHS; UG1 DA015815 United States DA NIDA NIH HHS; UL1TR003142 United States GF NIH HHS
Contributed Indexing: Keywords: electronic health records; large language models; machine learning; natural language processing; opioid use disorder; predictive modeling; treatment attrition
Substance Nomenclature: 0 (Buprenorphine, Naloxone Drug Combination)
0 (Narcotic Antagonists)
40D3SCR4GZ (Buprenorphine)
Entry Date(s): Date Created: 20250922 Date Completed: 20251125 Latest Revision: 20251128
Update Code: 20251128
PubMed Central ID: PMC12646374
DOI: 10.1093/jamia/ocaf157
PMID: 40977375
Database: MEDLINE