Zero-Shot Prompting Strategies for Table Question Answering in Portuguese: An Exploration of Prompt-Based Approaches for Text2SQL in the Portuguese Language
| Title: | Zero-Shot Prompting Strategies for Table Question Answering in Portuguese: An Exploration of Prompt-Based Approaches for Text2SQL in the Portuguese Language |
|---|---|
| Authors: | Jannuzzi, Marcelo Poles |
| Contributors: | Castelli, Mauro; Peres, Fernando Augusto Junqueira |
| Publication Year: | 1483 |
| Collection: | Repositório da Universidade Nova de Lisboa (UNL) |
| Subject Terms: | Natural Language Processing, Text2SQL, Table Question Answering, Large Language Models, GPT-3, GPT-4, Zero-Shot Prompting, Portuguese Language, Spider Benchmark, Natural Language Interface for Databases, Processamento de Linguagem Natural, Língua Portuguesa, Conjunto de Dados Spider, Interface de Linguagem Natural para Bancos de Dados, Domínio/Área Científica::Ciências Naturais::Ciências da Computação e da Informação |
| Description: | Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science ; This thesis explores the application of zero-shot prompting strategies for table question answering (TQA) in Portuguese, focusing specifically on the Text2SQL task. This task involves translating questions posed in natural language into SQL queries that can be executed against a database to answer the original question. Given the popularity of relational databases across various domains, advancements in this field can substantially impact the accessibility and democratization of data as simpler and more intuitive interfaces for database interaction are developed. Despite this significant potential, progress in developing Portuguese TQA solutions remains limited. We propose a previously unexplored approach to the Text2SQL task in Portuguese by leveraging Large Language Models (LLMs), specifically the GPT-3.5 and GPT-4 models, through zero-shot prompting. The primary objectives are to assess the effectiveness of such LLMs in this task and to identify the most suitable prompt styles. These are evaluated using a Portuguese translation of the popular Spider Text2SQL benchmark. Results from this work reveal that, while not outperforming state-of-the-art models, our proposed approach can generate adequate SQL queries to answer Portuguese-language questions about various databases, particularly when using GPT-4. The findings suggest that including schema information and database content in the prompts is critical for satisfactory outcomes. Furthermore, we point out issues with the automatic evaluation process used in the Spider benchmark, which may lead to underestimating the performance of the approaches tested here. ; This thesis explores the application of zero-shot prompting strategies for answering questions in Portuguese about information contained in tables (an area known as Table Question Answering, TQA), with a specific focus on Text2SQL. ... |
| Document Type: | master thesis |
| Language: | English |
| Relation: | http://hdl.handle.net/10362/159406; 203377222 |
| Availability: | http://hdl.handle.net/10362/159406 |
| Rights: | embargoedAccess ; http://creativecommons.org/licenses/by/4.0/ |
| Accession Number: | edsbas.F509F23 |
| Database: | BASE |
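
The description above notes that including schema information and database content in the prompt was critical for satisfactory outcomes. As a rough illustration only, the following sketch assembles such a zero-shot prompt from a SQLite database. It is not the prompt format used in the thesis: the `build_prompt` helper, the `cantor` table, its rows, and the Portuguese instruction text are all hypothetical, and the "model output" is a hand-written query standing in for an LLM response.

```python
import sqlite3


def build_prompt(conn, question, sample_rows=2):
    """Assemble a zero-shot Text2SQL prompt from schema DDL and sample rows.

    Hypothetical helper for illustration; not the thesis's prompt style.
    """
    parts = []
    # sqlite_master stores the original CREATE TABLE text for each table.
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for name, ddl in tables:
        parts.append(ddl + ";")
        # Add a few rows so the model can see how values are actually stored.
        rows = conn.execute(
            f'SELECT * FROM "{name}" LIMIT ?', (sample_rows,)
        ).fetchall()
        for row in rows:
            parts.append(f"-- sample row from {name}: {row}")
    parts.append(f"-- Pergunta: {question}")
    parts.append("-- Responda com uma única consulta SQL.")
    return "\n".join(parts)


# Toy database (assumed schema and data, loosely in the spirit of Spider).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cantor (id INTEGER PRIMARY KEY, nome TEXT, pais TEXT)")
conn.executemany(
    "INSERT INTO cantor (nome, pais) VALUES (?, ?)",
    [("Ana", "Brasil"), ("Rui", "Portugal"), ("Lia", "Brasil")],
)

prompt = build_prompt(conn, "Quantos cantores são do Brasil?")
print(prompt)

# Hand-written query standing in for the SQL an LLM might return.
candidate_sql = "SELECT COUNT(*) FROM cantor WHERE pais = 'Brasil'"
answer = conn.execute(candidate_sql).fetchone()[0]
print(answer)
```

Executing the candidate query against the database, as in the last lines, is what turns the generated SQL into an answer to the original question.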