New generative artificial intelligence model: ScholarGPT's performance on dental avulsion.
Saved in:
| Title: | New generative artificial intelligence model: ScholarGPT's performance on dental avulsion. |
|---|---|
| Authors: | Kaplan TT; Karabuk University, Faculty of Dentistry, Department of Pediatric Dentistry, Karabük, Turkey. Electronic address: taibetokgozkaplan@karabuk.edu.tr. |
| Source: | International journal of medical informatics [Int J Med Inform] 2025 Dec; Vol. 204, pp. 106080. Date of Electronic Publication: 2025 Aug 13. |
| Publication Type: | Journal Article |
| Language: | English |
| Journal Info: | Publisher: Elsevier Science Ireland Ltd; Country of Publication: Ireland; NLM ID: 9711057; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1872-8243 (Electronic); Linking ISSN: 1386-5056; NLM ISO Abbreviation: Int J Med Inform; Subsets: MEDLINE |
| Imprint Name(s): | Original Publication: Shannon, Co. Clare, Ireland: Elsevier Science Ireland Ltd., c1997- |
| MeSH Terms: | Artificial Intelligence*; Tooth Injuries*; Humans; Surveys and Questionnaires; Generative Artificial Intelligence |
| Abstract: | Competing Interests: Declaration of competing interest: The author declares that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Background: This study aims to evaluate the performance of ScholarGPT, a Large Language Model (LLM) developed for academic purposes, on questions related to dental avulsion, and to compare the results with those of a previous study evaluating the performance of ChatGPT and Gemini. Method: A total of 22 technical questions (11 multiple-choice questions (MCQs), 11 true/false (T/F)) were posed to ScholarGPT. Responses were assessed using a modified Global Quality Scale (GQS) and were randomized with an online randomizer (www.randomizer.org) before scoring. A single researcher carried out the assessments at three different times, two weeks apart, with a new randomization performed before each scoring round. Results: When ScholarGPT's answers were analyzed by question group using the Mann-Whitney U test, the mean score was 4.64 for MCQs and 4.82 for T/F questions; ScholarGPT provided similarly high-quality and consistent answers for both question types (p = 0.590). When ScholarGPT's performance was compared with that of Gemini and ChatGPT via the Friedman test, the mean score of ScholarGPT's responses was significantly higher than both (mean difference with Gemini = 0.75; mean difference with ChatGPT = 1.62; p < 0.001); ScholarGPT thus produced statistically significantly more consistent and higher-quality responses than ChatGPT and Gemini. Conclusion: ScholarGPT showed high performance on technical questions related to dental avulsion and produced more consistent and higher-quality answers than ChatGPT and Gemini. These findings suggest that LLMs grounded in academic databases can provide more accurate and reliable information. In the future, LLMs developed for specific branches of dentistry may allow artificial intelligence systems to produce higher-quality and more consistent information. (Copyright © 2025 Elsevier B.V. All rights reserved.) |
| Contributed Indexing: | Keywords: Artificial intelligence; ChatGPT; Dental avulsion; Gemini; Large Language Models; ScholarGPT |
| Entry Date(s): | Date Created: 20250815 Date Completed: 20250907 Latest Revision: 20250907 |
| Update Code: | 20250908 |
| DOI: | 10.1016/j.ijmedinf.2025.106080 |
| PMID: | 40816034 |
| Database: | MEDLINE |
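
The Method in the abstract above describes a blinded, repeated scoring protocol: the 22 ScholarGPT responses were re-randomized before each of three scoring rounds spaced two weeks apart. A minimal Python sketch of that workflow, with `random.shuffle` as a local stand-in for www.randomizer.org and purely hypothetical response labels (the study's actual questions are not reproduced here):

```python
import random

# Hypothetical labels for the 22 responses (11 MCQ + 11 T/F);
# invented for illustration only.
responses = [f"MCQ-{i}" for i in range(1, 12)] + [f"TF-{i}" for i in range(1, 12)]

# Three scoring rounds, two weeks apart, each preceded by a fresh
# randomization so the single rater sees the items in a new order.
for round_no in range(1, 4):
    order = responses[:]       # copy; keep the master list intact
    random.shuffle(order)      # local stand-in for www.randomizer.org
    print(f"Round {round_no} presentation order: {order[:5]} ...")
    # ...the rater then assigns each response a modified GQS score (1-5)...
```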
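The Results in the abstract rest on two nonparametric tests: a Mann-Whitney U test comparing ScholarGPT's scores across the two question types, and a Friedman test comparing the three models on the same 22 questions. A minimal sketch of both calls with `scipy.stats`; every score vector below is an invented placeholder, not the study's data:

```python
import random

from scipy.stats import friedmanchisquare, mannwhitneyu

random.seed(0)  # reproducible placeholder data

# Invented modified-GQS scores (1-5) for illustration only.
scholargpt_mcq = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5, 4]   # 11 MCQ items
scholargpt_tf = [5, 5, 5, 4, 5, 5, 5, 5, 4, 5, 5]    # 11 T/F items

# Within-model comparison across question types: the MCQ and T/F item
# sets are disjoint, so an unpaired Mann-Whitney U test is appropriate.
u_stat, p_types = mannwhitneyu(scholargpt_mcq, scholargpt_tf)
print(f"MCQ vs T/F: U = {u_stat:.1f}, p = {p_types:.3f}")

# Between-model comparison: the same 22 questions are scored under all
# three models (related samples), hence the Friedman test.
scholargpt = scholargpt_mcq + scholargpt_tf
chatgpt = [random.randint(2, 5) for _ in range(22)]  # placeholder scores
gemini = [random.randint(3, 5) for _ in range(22)]   # placeholder scores
chi2, p_models = friedmanchisquare(scholargpt, chatgpt, gemini)
print(f"ScholarGPT vs ChatGPT vs Gemini: chi2 = {chi2:.2f}, p = {p_models:.3f}")
```

The pairing is what separates the two tests: the Friedman test handles repeated measurement of the same questions across models, while the unpaired Mann-Whitney U fits the disjoint MCQ and T/F item sets.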