Performance of three artificial intelligence (AI)‐based large language models in standardized testing; implications for AI‐assisted dental education
| Published in: | Journal of Periodontal Research, Vol. 60, no. 2, pp. 121-133 |
|---|---|
| Main Authors: | Sabri, Hamoun; Saleh, Muhammad H. A.; Hazrati, Parham; Merchant, Keith; Misch, Jonathan; Kumar, Purnima S.; Wang, Hom‐Lay; Barootchi, Shayan |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Wiley Subscription Services, Inc / John Wiley and Sons Inc, 01.02.2025 |
| Subjects: | Artificial Intelligence; Chatbots; Cross-Sectional Studies; Education, Dental; Educational Measurement; Humans; Large Language Models; Periodontics |
| ISSN: | 0022-3484 (print); 1600-0765 (online) |
| Online Access: | https://doi.org/10.1111/jre.13323 |
Abstract

Introduction
The rise of novel computer technologies and automated data analytics has the potential to change the course of dental education. In line with our long‐term goal of harnessing the power of AI to augment didactic teaching, the objective of this study was to quantify and compare the accuracy of responses provided by ChatGPT (GPT‐4 and GPT‐3.5) and Google Gemini, three leading large language models (LLMs), with that of human graduate students (control group) on the annual in‐service examination questions posed by the American Academy of Periodontology (AAP).
Methods
Under a comparative cross‐sectional study design, a corpus of 1312 questions from the AAP annual in‐service examinations administered between 2020 and 2023 was presented to the LLMs. Their responses were analyzed using chi‐square tests, and their performance was compared with the scores of periodontal residents from the corresponding years, who served as the human control group. Additionally, two sub‐analyses were performed: one on the LLMs' performance on each section of the exam, and one on their accuracy in answering the most difficult questions.
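The chi‐square analysis described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the abstract reports only percentages, so the per‐year question total (assumed here as roughly 1312 / 4 ≈ 328) and the correct‐answer counts are hypothetical, and Python with scipy is assumed as tooling.

```python
# Minimal sketch of a chi-square comparison of two models' accuracy on
# one exam year. All counts are hypothetical, back-calculated from
# accuracies reported in the Results; scipy is an assumed tool, not the
# authors' actual analysis environment.
from scipy.stats import chi2_contingency

n_questions = 328                             # hypothetical: 1312 questions / 4 exam years
gpt4_correct = round(0.8098 * n_questions)    # ~80.98% accuracy (best year, per Results)
gpt35_correct = round(0.625 * n_questions)    # ~62.5% accuracy (2020, per Results)

# 2x2 contingency table: rows = models, columns = [correct, incorrect]
table = [
    [gpt4_correct, n_questions - gpt4_correct],
    [gpt35_correct, n_questions - gpt35_correct],
]

chi2, p, dof, _expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.4g}")
```

With proportions this far apart at this sample size, the test rejects the null hypothesis of equal accuracy, consistent with the significant differences reported below.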
Results
ChatGPT‐4 (total average: 79.57%) outperformed all human control groups, as well as GPT‐3.5 and Google Gemini, in all exam years (p < .001), with accuracy ranging between 78.80% and 80.98% across the exam years. Gemini consistently outperformed ChatGPT‐3.5, scoring 70.65% (p = .01), 73.29% (p = .02), 75.73% (p < .01), and 72.18% (p = .0008) on the 2020-2023 exams, versus 62.5%, 68.24%, 69.83%, and 59.27%, respectively. With all exam years combined, Google Gemini (72.86%) surpassed the average scores of first‐year (63.48% ± 31.67) and second‐year residents (66.25% ± 31.61), but not that of third‐year residents (69.06% ± 30.45).
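As a quick illustrative check, the pooled figure reported for Gemini (72.86%) differs slightly from the unweighted mean of its four per‐year scores, which suggests (an assumption, not stated in the abstract) that the pooled value is weighted by each year's question count:

```python
# Illustrative check: unweighted mean of Gemini's per-year scores vs the
# reported pooled accuracy. The per-year values come from the Results;
# the question-count-weighting interpretation is an assumption.
gemini_by_year = [70.65, 73.29, 75.73, 72.18]   # 2020-2023 exam scores (%)
unweighted_mean = sum(gemini_by_year) / len(gemini_by_year)
print(f"unweighted mean = {unweighted_mean:.2f}%  (reported pooled: 72.86%)")
# -> unweighted mean = 72.96%
```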
Conclusions
Within the confines of this analysis, ChatGPT‐4 exhibited a robust capability in answering AAP in‐service exam questions in terms of accuracy and reliability, while Gemini and ChatGPT‐3.5 showed weaker performance. These findings underscore the potential of deploying LLMs as an educational tool in the periodontics and oral implantology domains. However, the current limitations of these models should be considered: the inability to process image‐based questions effectively, the propensity to generate inconsistent responses to identical prompts, and accuracy that is high (80% for GPT‐4) but not absolute. An objective comparison of their capability versus their capacity is required to further develop this field of study.
ChatGPT‐4 outperforms ChatGPT‐3.5, Google Gemini, and human periodontics residents in standardized testing (AAP in‐service exams, 2020-2023), highlighting the potential future role of AI in enhancing dental education.
| Author Affiliations: | 1. Department of Periodontics and Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, Michigan, USA; 2. Center for Clinical Research and Evidence Synthesis in Oral Tissue Regeneration (CRITERION), Ann Arbor, Michigan, USA; 3. Naval Post‐Graduate Dental School, Bethesda, Maryland, USA; 4. Private Practice, Ann Arbor, Michigan, USA; 5. Division of Periodontology, Department of Oral Medicine, Infection, and Immunity, Harvard School of Dental Medicine, Boston, Massachusetts, USA |
| Copyright: | © 2024 The Author(s). Journal of Periodontal Research published by John Wiley & Sons Ltd. Open access under the CC BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
| DOI: | 10.1111/jre.13323 |
| Keywords: | American Academy of Periodontology; artificial intelligence; ChatGPT; ChatGPT‐3.5; ChatGPT‐4; dental education; Gemini; Google Bard; Google Gemini; periodontal education |
| PMID: | 39030766 |