Evaluating motivational interview quality using large language models and hidden Markov models.

Bibliographic Details
Title: Evaluating motivational interview quality using large language models and hidden Markov models.
Authors: Lim K (Department of Psychiatry, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea; Institute of Behavioral Sciences in Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea); Jung YC (Department of Psychiatry, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea; Institute of Behavioral Sciences in Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea; eugenejung@yuhs.ac); Kim BH (Department of Psychiatry, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea; Institute of Behavioral Sciences in Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea; Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea; Institute for Innovation in Digital Healthcare, Yonsei University, Seoul, Republic of Korea; egyptdj@yonsei.ac.kr)
Source: BMC psychiatry [BMC Psychiatry] 2025 Oct 01; Vol. 25 (1), pp. 908. Date of Electronic Publication: 2025 Oct 01.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: BioMed Central; Country of Publication: England; NLM ID: 100968559; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1471-244X (Electronic); Linking ISSN: 1471244X; NLM ISO Abbreviation: BMC Psychiatry; Subsets: MEDLINE
Imprint Name(s): Original Publication: London : BioMed Central, [2001-
MeSH Terms: Motivational Interviewing*/standards; Motivational Interviewing*/methods; Markov Chains*; Humans; Female; Male; Adult; Motivation; Large Language Models; Hidden Markov Models
Declarations: Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
Abstract:
Background: Motivational Interviewing (MI) is a counseling approach that promotes behavior change by eliciting "change talk" and minimizing "sustain talk." Traditional methods for assessing MI quality, such as manual coding, are labor-intensive, subjective, and difficult to scale. This study introduces an automated framework integrating large language models (LLMs) and Hidden Markov Models (HMMs) for evaluation of MI session quality.
Aims: This study evaluates the effectiveness of an LLM-HMM framework in predicting MI session quality and examines motivational state transitions in high- and low-quality sessions.
Method: A dataset of 40 MI sessions was analyzed. Client utterances were classified and numerically scored by an LLM based on their intention toward or away from change. With HMMs, we used these scores to examine the motivational state transitions across each session. Differences between high- and low-quality sessions were quantified by comparing transition matrices using Frobenius norms. Statistical significance was assessed via a permutation test. Predictive performance was evaluated using logistic regression with leave-one-out cross-validation (LOOCV), where transition matrix elements served as independent variables and interview quality as the dependent variable.
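The group comparison described in the Method can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each session has already been reduced to a row-stochastic transition matrix, that the test statistic is the Frobenius norm of the difference between the two group-mean matrices, and that label 1 means high quality. The function names and the 2-state toy matrices are hypothetical and were made up for this example.

```python
import math
import random


def mean_matrix(mats):
    """Element-wise mean of a list of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
            for i in range(n)]


def frobenius_distance(a, b):
    """Frobenius norm of (a - b): square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def group_distance(mats, labels):
    """Distance between the mean matrix of label-1 and label-0 sessions."""
    high = [m for m, lab in zip(mats, labels) if lab == 1]
    low = [m for m, lab in zip(mats, labels) if lab == 0]
    return frobenius_distance(mean_matrix(high), mean_matrix(low))


def permutation_test(mats, labels, n_perm=2000, seed=0):
    """Shuffle the quality labels to build a null distribution of distances."""
    rng = random.Random(seed)
    observed = group_distance(mats, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # group sizes stay fixed, assignment changes
        if group_distance(mats, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 4 "high" sessions with mixed transitions,
# 4 "low" sessions that tend to stay in the same state.
toy_mats = [
    [[0.5, 0.5], [0.5, 0.5]], [[0.6, 0.4], [0.4, 0.6]],
    [[0.4, 0.6], [0.6, 0.4]], [[0.5, 0.5], [0.4, 0.6]],
    [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.1, 0.9]],
    [[0.9, 0.1], [0.2, 0.8]], [[0.85, 0.15], [0.1, 0.9]],
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

observed, p_value = permutation_test(toy_mats, toy_labels)
print(f"Frobenius distance: {observed:.3f}, permutation p-value: {p_value:.4f}")
```

With only eight toy sessions the smallest achievable p-value is bounded by the number of distinct label assignments, so the output here says nothing about the reported p < 0.001.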
Results: High-quality MI sessions exhibited fluid transitions between motivational states, whereas low-quality sessions showed persistence in resistance-oriented states. A statistically significant difference in transition matrices was observed between session groups (p < 0.001). The framework achieved a mean LOOCV accuracy of 0.80, demonstrating strong predictive performance in identifying MI session quality.
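The LOOCV accuracy reported above can be illustrated with a small sketch. It assumes the features are the flattened elements of each session's transition matrix and that label 1 means high quality; the record does not state the solver, regularization, or feature scaling, so the hand-rolled gradient-descent logistic regression, its hyperparameters, and the toy data below are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=2000, l2=0.01):
    """Batch gradient descent on the mean log-loss with an L2 penalty on weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predict 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def loocv_accuracy(X, y):
    """Hold out each session once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(X)):
        w, b = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += int(predict(w, b, X[i]) == y[i])
    return correct / len(X)


# Hypothetical features: flattened 2x2 transition matrices, one row per session.
toy_X = [
    [0.5, 0.5, 0.5, 0.5], [0.6, 0.4, 0.4, 0.6],
    [0.4, 0.6, 0.6, 0.4], [0.5, 0.5, 0.4, 0.6],
    [0.9, 0.1, 0.1, 0.9], [0.8, 0.2, 0.1, 0.9],
    [0.9, 0.1, 0.2, 0.8], [0.85, 0.15, 0.1, 0.9],
]
toy_y = [1, 1, 1, 1, 0, 0, 0, 0]

print(f"LOOCV accuracy on toy data: {loocv_accuracy(toy_X, toy_y):.2f}")
```

Leave-one-out is a natural choice for a dataset of 40 sessions because every fold trains on all but one session, at the cost of refitting the model once per session.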
Conclusions: This study presents a scalable, objective alternative to manual MI evaluation. Future applications may include real-time therapist support, training, and prognosis prediction, pending further validation on field-collected data.
(© 2025. The Author(s).)
Grant Information: HI22C0404 Ministry of Health & Welfare, Republic of Korea; 6-2024-0133 Institute for Innovation in Digital Healthcare; RS-2024-00509289 Ministry of Science and ICT, South Korea; NRF-2022R1I1A1A01069589 National Research Foundation of Korea
Contributed Indexing: Keywords: Hidden Markov models; Interview analysis; Interview quality assessment; Large language model; Motivational interview
Entry Date(s): Date Created: 20251002 Date Completed: 20251002 Latest Revision: 20251005
Update Code: 20251005
PubMed Central ID: PMC12487504
DOI: 10.1186/s12888-025-07391-1
PMID: 41034852
Database: MEDLINE