Transqlate: translating enriched natural language sentences to SQL queries using transformers

Bibliographic Details
Title: Transqlate: translating enriched natural language sentences to SQL queries using transformers
Authors: Farshkar Azari, Mousa
Contributors: Ulusoy, Özgür
Publisher Information: Bilkent University, 2022.
Publication Year: 2022
Subject Terms: Natural language interface to databases, Natural language processing, Structured query language, Deep learning, Relational database systems, Neural networks
Description: Cataloged from PDF version of article. Thesis (Master's): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2022. Includes bibliographical references (leaves 41-46). A large amount of the structured data owned by enterprises is typically stored in Relational Database Management Systems, and a decent knowledge of Structured Query Language (SQL) is required to extract the desired information from relational databases. Many non-expert users need to access information in databases but lack the necessary skills or knowledge. Additionally, even some expert users might find it challenging to write complex SQL queries when they do not know the schema underlying the database. To this end, a considerable amount of research has recently been conducted on translating queries formulated by users in natural language into SQL queries to be processed by database systems. In this thesis, we present deep learning-based strategies for natural language to SQL translation. We propose TranSQLate, a novel method that enriches the input sequences to provide more effective Natural Language Interface to Database (NLIDB) systems. We apply our strategies to the vanilla transformer and T5 transformer models in three different ways. With enriched inputs, we achieve up to a 16.7% improvement in translation accuracy, 6.5 points in SacreBLEU score, and 18 points in n-gram precision, compared to the non-enriched versions. Our method surpasses the strategies used in the state-of-the-art systems NALIR, TEMPLAR, and DBTagger in terms of translation accuracy over the IMDB, Scholar, and Yelp datasets. By Mousa Farshkar Azari, M.S.
Document Type: Thesis
File Description: x, 46 leaves : charts; 30 cm.; application/pdf
Language: English
Access URL: https://hdl.handle.net/11693/110500
Accession Number: edsair.dedup.wf.002..fda2e4d98db2cfca8bb0726b4d44ca97
Database: OpenAIRE