Performance of three artificial intelligence (AI)‐based large language models in standardized testing; implications for AI‐assisted dental education

Bibliographic Details
Published in: Journal of periodontal research, Vol. 60, No. 2, pp. 121-133
Main authors: Sabri, Hamoun; Saleh, Muhammad H. A.; Hazrati, Parham; Merchant, Keith; Misch, Jonathan; Kumar, Purnima S.; Wang, Hom‐Lay; Barootchi, Shayan
Format: Journal Article
Language: English
Published: United States: Wiley Subscription Services, Inc / John Wiley and Sons Inc, 01.02.2025
ISSN: 0022-3484; EISSN: 1600-0765
Online access: Full text
Abstract

Introduction: The emerging rise in novel computer technologies and automated data analytics has the potential to change the course of dental education. In line with our long‐term goal of harnessing the power of AI to augment didactic teaching, the objective of this study was to quantify the accuracy of responses provided by ChatGPT (GPT‐4 and GPT‐3.5) and Google Gemini, the three primary large language models (LLMs), to the annual in‐service examination questions posed by the American Academy of Periodontology (AAP), and to compare it with the accuracy of human graduate students (control group).

Methods: Under a comparative cross‐sectional study design, a corpus of 1312 questions from the annual in‐service examinations of the AAP administered between 2020 and 2023 was presented to the LLMs. Their responses were analyzed using chi‐square tests, and their performance was juxtaposed with the scores of periodontal residents from the corresponding years, who served as the human control group. Additionally, two sub‐analyses were performed: one on the performance of the LLMs on each section of the exam, and one on their performance on the most difficult questions.

Results: ChatGPT‐4 (total average: 79.57%) outperformed all human control groups as well as GPT‐3.5 and Google Gemini in all exam years (p < .001). This chatbot showed an accuracy range between 78.80% and 80.98% across the various exam years. Gemini consistently recorded superior performance to ChatGPT‐3.5, with scores of 70.65% (p = .01), 73.29% (p = .02), 75.73% (p < .01), and 72.18% (p = .0008) for the exams from 2020 to 2023, compared with ChatGPT‐3.5's 62.5%, 68.24%, 69.83%, and 59.27%, respectively. Google Gemini (72.86%) surpassed the average scores achieved by first‐year (63.48% ± 31.67) and second‐year residents (66.25% ± 31.61) when all exam years were combined; however, it could not surpass that of third‐year residents (69.06% ± 30.45).

Conclusions: Within the confines of this analysis, ChatGPT‐4 exhibited a robust capability in answering AAP in‐service exam questions in terms of accuracy and reliability, while Gemini and ChatGPT‐3.5 showed a weaker performance. These findings underscore the potential of deploying LLMs as an educational tool in the periodontics and oral implantology domains. However, the current limitations of these models, such as the inability to effectively process image‐based inquiries, the propensity for generating inconsistent responses to the same prompts, and high (80% by GPT‐4) but not absolute accuracy rates, should be considered. An objective comparison of their capability versus their capacity is required to further develop this field of study.

Summary: ChatGPT‐4 outperforms ChatGPT‐3.5, Google Gemini, and human periodontics residents in standardized testing (AAP in‐service exams, 2020‐2023). This highlights the potential future role of AI in enhancing dental education.
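As a reading aid: the chi‐square analysis described in Methods amounts to testing whether two models' proportions of correctly answered questions differ. Below is a minimal Python sketch of such a test using scipy; this is not the authors' code, and since the record does not give per‐model correct/incorrect counts, the numbers are illustrative placeholders chosen only to echo the reported percentages on a hypothetical 328‐question exam year.

# Minimal sketch: chi-square test of independence on a 2x2 table of
# [correct, incorrect] answer counts for two models. All counts here
# are hypothetical placeholders, not the study's data.
from scipy.stats import chi2_contingency

observed = [
    [261, 67],  # placeholder "GPT-4" counts for one exam year (~79.6%)
    [232, 96],  # placeholder "Gemini" counts for the same year (~70.7%)
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# A p-value below .05 would indicate the two accuracy rates differ,
# mirroring the per-year model-versus-model p-values quoted in Results.

On a 2x2 table this is equivalent to the standard two-proportion chi-square comparison, which matches the kind of per-year significance tests reported in the abstract.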
Authors
1. Hamoun Sabri (ORCID 0000-0001-6581-2104; hsabri@umich.edu), Center for Clinical Research and Evidence Synthesis in Oral Tissue Regeneration (CRITERION), Ann Arbor, Michigan, USA
2. Muhammad H. A. Saleh, University of Michigan, Ann Arbor, Michigan, USA
3. Parham Hazrati (ORCID 0000-0002-8362-3208), University of Michigan, Ann Arbor, Michigan, USA
4. Keith Merchant, Naval Post‐Graduate Dental School, Bethesda, Maryland, USA
5. Jonathan Misch, Private Practice, Ann Arbor, Michigan, USA
6. Purnima S. Kumar (ORCID 0000-0001-5844-1341), University of Michigan, Ann Arbor, Michigan, USA
7. Hom‐Lay Wang (ORCID 0000-0003-4238-1799), University of Michigan, Ann Arbor, Michigan, USA
8. Shayan Barootchi (ORCID 0000-0002-5347-6577; shbaroot@umich.edu), Harvard School of Dental Medicine, Boston, Massachusetts, USA

Author Affiliations
1. Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA
2. Center for Clinical Research and Evidence Synthesis in Oral Tissue Regeneration (CRITERION), Ann Arbor, Michigan, USA
3. Naval Post‐Graduate Dental School, Bethesda, Maryland, USA
4. Private Practice, Ann Arbor, Michigan, USA
5. Division of Periodontology, Department of Oral Medicine, Infection, and Immunity, Harvard School of Dental Medicine, Boston, Massachusetts, USA
ContentType Journal Article
Copyright 2024 The Author(s). Journal of Periodontal Research published by John Wiley & Sons Ltd. This article is published under the Creative Commons BY‐NC‐ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/); notwithstanding the ProQuest Terms and Conditions, the content may be used in accordance with the terms of the License.
DOI 10.1111/jre.13323
Discipline Dentistry
DocumentTitleAlternate Sabri et al
EISSN 1600-0765
EndPage 133
ExternalDocumentID PMC11873669
39030766
10_1111_jre_13323
JRE13323
Genre researchArticle
Journal Article
Comparative Study
ISICitedReferencesCount 27
ISSN 0022-3484
1600-0765
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Keywords American Academy of Periodontology
dental education
ChatGPT‐4
ChatGPT
ChatGPT‐3.5
Google Gemini
periodontal education
artificial intelligence
Gemini
Google Bard
Language English
License Attribution-NonCommercial-NoDerivs
2024 The Author(s). Journal of Periodontal Research published by John Wiley & Sons Ltd.
This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
OpenAccessLink https://onlinelibrary.wiley.com/doi/abs/10.1111%2Fjre.13323
PMID 39030766
PageCount 13
PublicationDate February 2025
PublicationPlace United States
PublicationTitle Journal of periodontal research
PublicationTitleAlternate J Periodontal Res
PublicationYear 2025
Publisher Wiley Subscription Services, Inc
John Wiley and Sons Inc
StartPage 121
SubjectTerms Accuracy
American Academy of Periodontology
Artificial Intelligence
Chatbots
ChatGPT
ChatGPT‐3.5
ChatGPT‐4
Clinical Research
Cross-Sectional Studies
dental education
Education, Dental - methods
Educational Measurement - methods
Educational Measurement - standards
Gemini
Google Bard
Google Gemini
Humans
Large Language Models
periodontal education
Periodontics
Periodontics - education
URI https://onlinelibrary.wiley.com/doi/abs/10.1111%2Fjre.13323
https://www.ncbi.nlm.nih.gov/pubmed/39030766
https://www.proquest.com/docview/3172906008
https://www.proquest.com/docview/3082959701
https://pubmed.ncbi.nlm.nih.gov/PMC11873669
Volume 60