Zero-Shot Prompting Strategies for Table Question Answering in Portuguese: An Exploration of Prompt-Based Approaches for Text2SQL in the Portuguese Language
| Title: | Zero-Shot Prompting Strategies for Table Question Answering in Portuguese: An Exploration of Prompt-Based Approaches for Text2SQL in the Portuguese Language |
|---|---|
| Authors: | Jannuzzi, Marcelo Poles |
| Contributors: | Castelli, Mauro; Peres, Fernando Augusto Junqueira |
| Publication year: | 1483 |
| Collection: | Repositório da Universidade Nova de Lisboa (UNL) |
| Subjects: | Natural Language Processing, Text2SQL, Table Question Answering, Large Language Models, GPT-3, GPT-4, Zero-Shot Prompting, Portuguese Language, Spider Benchmark, Natural Language Interface for Databases, Processamento de Linguagem Natural, Língua Portuguesa, Conjunto de Dados Spider, Interface de Linguagem Natural para Bancos de Dados, Domínio/Área Científica::Ciências Naturais::Ciências da Computação e da Informação |
| Description: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science ; This thesis explores the application of zero-shot prompting strategies for table question answering (TQA) in Portuguese, focusing specifically on the Text2SQL task. This task involves translating questions posed in natural language into SQL queries that can be executed against a database to answer the original question. Given the popularity of relational databases across various domains, advancements in this field can substantially impact the accessibility and democratization of data as simpler and more intuitive interfaces for database interaction are developed. Despite this significant potential, progress in developing Portuguese TQA solutions remains limited. We propose a previously unexplored approach to the Text2SQL task in Portuguese by leveraging Large Language Models (LLMs), specifically the GPT-3.5 and GPT-4 models, through zero-shot prompting. The primary objectives are to assess the effectiveness of such LLMs in this task and to identify the most suitable prompt styles. These are evaluated using a Portuguese translation of the popular Spider Text2SQL benchmark. Results from this work reveal that, while not outperforming state-of-the-art models, our proposed approach can generate adequate SQL queries to answer Portuguese-language questions about various databases, particularly when using GPT-4. The findings suggest that including schema information and database content in the prompts is critical for satisfactory outcomes. Furthermore, we point out issues with the automatic evaluation process used in the Spider benchmark, which may lead to underestimating the performance of the approaches tested here. ; Abstract in Portuguese (translated): This thesis explores the application of zero-shot prompting strategies for answering questions in Portuguese about information contained in tables (an area known as Table Question Answering, or TQA), with a specific focus on Text2SQL. ... |
| Document type: | master thesis |
| Language: | English |
| Relation: | http://hdl.handle.net/10362/159406; 203377222 |
| Availability: | http://hdl.handle.net/10362/159406 |
| Rights: | embargoedAccess ; http://creativecommons.org/licenses/by/4.0/ |
| Accession number: | edsbas.F509F23 |
| Database: | BASE |
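
The description above notes that including schema information and database content in the prompt was critical for satisfactory results. The following is a minimal Python sketch of that idea, not the prompts used in the thesis: it assumes an SQLite database, and the `cantor` table, the function names, and the prompt wording are all hypothetical. No model is called; the sketch only assembles the prompt text and shows that a returned query can be executed against the same database.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection, sample_rows: int = 2) -> str:
    """Render every table's CREATE statement plus a few example rows."""
    parts = []
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for name, create_sql in tables:
        parts.append(create_sql.strip() + ";")
        cursor = conn.execute(f'SELECT * FROM "{name}" LIMIT ?', (sample_rows,))
        columns = [col[0] for col in cursor.description]
        parts.append("-- sample rows (" + ", ".join(columns) + ")")
        for row in cursor.fetchall():
            parts.append("-- " + repr(row))
    return "\n".join(parts)


def build_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Assemble a zero-shot prompt: schema, sample content, then the question."""
    return (
        "### SQLite tables, with sample rows:\n"
        f"{describe_schema(conn)}\n"
        f"### Question (in Portuguese): {question}\n"
        "### Write a single SQLite query that answers the question.\n"
        "SELECT"
    )


def demo_connection() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cantor (id INTEGER PRIMARY KEY, nome TEXT, pais TEXT);
        INSERT INTO cantor VALUES (1, 'Ana', 'Brasil'), (2, 'Rui', 'Portugal');
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(build_prompt(conn, "Quantos cantores existem?"))
    # A query returned by a model would then be run against the database:
    print(conn.execute("SELECT COUNT(*) FROM cantor").fetchall())
```

Ending the prompt with `SELECT` is one common way to nudge a completion toward SQL; whether that or any other prompt style works best for Portuguese questions is exactly what the thesis evaluates, so this layout should be read as an illustration only.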