Influence of medical educational background on the diagnostic quality of ChatGPT‐4 responses in internal medicine: A pilot study

Bibliographic Details
Title: Influence of medical educational background on the diagnostic quality of ChatGPT‐4 responses in internal medicine: A pilot study
Authors: Gilardi, Nicolò; Ballabio, Massimo; Ravera, Francesco; Ferrando, Lorenzo; Stabile, Mario; Bellodi, Andrea; Talerico, Giovanni; Cigolini, Benedetta; Genova, Carlo; Carbone, Federico; Montecucco, Fabrizio; Bracco, Christian; Ballestrero, Alberto; Zoppoli, Gabriele
Source: European Journal of Clinical Investigation.
Publisher Information: Wiley, 2025.
Publication Year: 2025
Subject Terms: ChatGPT‐4, artificial intelligence, clinical decision making, diagnostic ranking, internal medicine, large language models
Description: This pilot study evaluated the influence of medical background on the diagnostic quality of ChatGPT-4's responses in Internal Medicine. Third-year students, residents and specialists summarised five complex NEJM clinical cases before querying ChatGPT-4. Diagnostic ranking, assessed by independent experts, revealed that residents significantly outperformed students (OR 2.33, p = .007), although overall performance was low. These findings indicate that user expertise and concise case summaries are critical for optimising AI diagnostics, highlighting the need for enhanced AI training and user interaction strategies.
Document Type: Article
Language: English
ISSN: 1365-2362, 0014-2972
DOI: 10.1111/eci.70113
Rights: CC BY
Accession Number: edsair.doi.dedup.....35896daab55d5c043f5ef4245c1c56b6
Database: OpenAIRE