Análise das Respostas de LLMs em Relação ao Conteúdo Introdutório de Programação: um Comparativo entre o ChatGPT e o Gemini.
| Title: | Análise das Respostas de LLMs em Relação ao Conteúdo Introdutório de Programação: um Comparativo entre o ChatGPT e o Gemini. (Portuguese) |
|---|---|
| Alternate Title: | Analysis of Responses from LLMs Regarding Introductory Programming Content: A Comparative Study between ChatGPT and Gemini. (English); Análisis de Respuestas de LLMs en Relación al Contenido Introductorio de Programación: Un Estudio Comparativo entre ChatGPT y Gemini. (Spanish) |
| Authors: | Pereira Filho, Luiz Carlos; de Paula Cypriano de Souza, Talita; de Paula, Luciano Bernardes |
| Source: | Revista Brasileira de Informática na Educação; 2025, Vol. 33, p722-747, 26p |
| Subject Terms: | CHATGPT, GEMINI (Chatbot), CODE generators, COMPARATIVE studies, COMPUTER programming education, COMPUTER programming, LANGUAGE models, ALGORITHMS |
| Company/Entity: | GOOGLE Inc. |
| Abstract (English): | Recently, Large Language Models for Natural Language Processing have stood out among current technologies. This technology has opened up a range of possibilities for use in various areas, including programming education, as these models can create program code. Among these models, two are well-known: OpenAI’s ChatGPT and Google’s Gemini, both demonstrating abilities to create, correct, and explain programming code in various languages. In a previous work, tests were conducted and the responses of ChatGPT were analyzed regarding introductory programming content from the perspective of beginners in the subject. This work extends the previous research and adds tests with Gemini, also concerning the same content. The goal is to determine whether these models are suitable for beginner programming students and whether they can be used for learning this content. As in the previous work, qualitative tests were conducted, in which some interactions with the model were made if the initial response was unsatisfactory, and quantitative tests, in which these interactions were not made. All tests were conducted on both ChatGPT and Gemini, and their responses were analyzed. Both showed potential to correctly respond to and explain generated code, but there are caveats. The overall performance of the tested LLMs, in terms of correct responses, was ∼78.2% for ChatGPT and ∼69.6% for Gemini. Even with this potential to assist in the programming learning process, the responses generated by LLMs should not be considered entirely correct, demanding prior knowledge from those who use them to analyze and make use of them. [ABSTRACT FROM AUTHOR] |
| Abstract (Spanish): | (Spanish translation of the English abstract above.) [ABSTRACT FROM AUTHOR] |
| Abstract (Portuguese): | (Portuguese translation of the English abstract above.) [ABSTRACT FROM AUTHOR] |
| Copyright: | Copyright of Revista Brasileira de Informática na Educação is the property of Revista Brasileira de Informatica na Educacao and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Complementary Index |
| ISSN: | 1414-5685 |
| DOI: | 10.5753/rbie.2025.4477 |