Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked

Saved in:
Bibliographic Details
Title: Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked
Authors: Ra Giordani, Ro Moschitti
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://aclweb.org/anthology/C/C12/C12-2040.pdf.
Collection: CiteSeerX
Subject Terms: Natural Language Interface to Databases, Semantic Parsing, Reranking. 401 Proceedings of COLING 2012
Description: In this paper, we define models for automatically translating a factoid question in natural language to an SQL query that retrieves the correct answer from a target relational database (DB). We exploit the DB structure to generate a set of candidate SQL queries, which we rerank with an SVM-ranker based on tree kernels. In particular, in the generation phase, we use (i) lexical dependencies in the question and (ii) the DB metadata, to build a set of plausible SELECT, WHERE and FROM clauses enriched with meaningful joins. We combine the clauses by means of rules and a heuristic weighting scheme, which allows for generating a ranked list of candidate SQL queries. This approach can be recursively applied to deal with complex questions, requiring nested SELECT instructions. Finally, we apply the reranker to reorder the list of question and SQL candidate pairs, whose members are represented as syntactic trees. The F1 of our model derived on standard benchmarks, 87 % on the first question, is in line with the best models using external and expensive hand-crafted resources such as the question meaning interpretation. Moreover, our system shows a Recall of the correct answer of about 94 % and 98 % on the first 2 and 5 candidates, respectively. This is an interesting outcome considering that we only need pairs of questions and answers concerning a target DB (no SQL query is needed) to train our model.
Document Type: text
File Description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.364.651
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.364.651
http://aclweb.org/anthology/C/C12/C12-2040.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number: edsbas.5D553D9E
Database: BASE
Description
Abstract:In this paper, we define models for automatically translating a factoid question in natural language to an SQL query that retrieves the correct answer from a target relational database (DB). We exploit the DB structure to generate a set of candidate SQL queries, which we rerank with an SVM-ranker based on tree kernels. In particular, in the generation phase, we use (i) lexical dependencies in the question and (ii) the DB metadata, to build a set of plausible SELECT, WHERE and FROM clauses enriched with meaningful joins. We combine the clauses by means of rules and a heuristic weighting scheme, which allows for generating a ranked list of candidate SQL queries. This approach can be recursively applied to deal with complex questions, requiring nested SELECT instructions. Finally, we apply the reranker to reorder the list of question and SQL candidate pairs, whose members are represented as syntactic trees. The F1 of our model derived on standard benchmarks, 87 % on the first question, is in line with the best models using external and expensive hand-crafted resources such as the question meaning interpretation. Moreover, our system shows a Recall of the correct answer of about 94 % and 98 % on the first 2 and 5 candidates, respectively. This is an interesting outcome considering that we only need pairs of questions and answers concerning a target DB (no SQL query is needed) to train our model.