SEMPRE: Semantic Parsing with Execution

SEMPRE is a toolkit for training semantic parsers, which map natural language utterances to denotations (answers) via intermediate logical forms. Typical applications include querying databases and programming via natural language.

SEMPRE has the following functionality:


You can download all the code and documentation for SEMPRE from GitHub. To learn more about the system, walk through our tutorial.


In our EMNLP 2013 paper, we created a new dataset, WebQuestions, which is released under the CC BY 4.0 license. Here are the train and test splits. You can also view the leaderboard, upload your predictions, and evaluate your system in this CodaLab worksheet.

In addition, we preprocessed the Free917 dataset (Cai & Yates, 2013) to work with our system. Here are the train and test splits.

Both datasets are provided in JSON format. WebQuestions contains 3,778 training examples and 2,032 test examples. Free917 contains 641 training examples and 276 test examples.
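As a quick sanity check on a downloaded split, the sketch below loads a file and reports how many examples it holds and which fields appear in them. It assumes each file is a single top-level JSON array of objects, which is not stated on this page; `load_examples` and `summarize` are illustrative helpers, not part of SEMPRE.

```python
import json
from collections import Counter


def load_examples(path):
    """Read a dataset file, assumed to hold one top-level JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)
    if not isinstance(examples, list):
        raise ValueError(f"expected a JSON array in {path}")
    return examples


def summarize(examples):
    """Return (number of examples, {field name: number of examples containing it})."""
    field_counts = Counter()
    for example in examples:
        field_counts.update(example.keys())
    return len(examples), dict(field_counts)
```

For instance, `summarize(load_examples("train.json"))` (a hypothetical file name) should return a count of 3,778 for the WebQuestions training split if the file matches the description above, along with the field names found in it.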

On WebQuestions, each example contains three fields:

On Free917, each example contains two fields:


SEMPRE was used in the following papers:

Jonathan Berant, Andrew Chou, Roy Frostig, Percy Liang. Semantic Parsing on Freebase from Question-Answer Pairs. Empirical Methods in Natural Language Processing (EMNLP), 2013.
Jonathan Berant, Percy Liang. Semantic Parsing via Paraphrasing. Association for Computational Linguistics (ACL), 2014.
Yushi Wang, Jonathan Berant, Percy Liang. Building a Semantic Parser Overnight. Association for Computational Linguistics (ACL), 2015.
Panupong Pasupat, Percy Liang. Compositional Semantic Parsing on Semi-Structured Tables. Association for Computational Linguistics (ACL), 2015.
Jonathan Berant, Percy Liang. Imitation Learning of Agenda-based Semantic Parsers. Transactions of the Association for Computational Linguistics (TACL), 2015.

SEMPRE supports lambda DCS logical forms, the default formalism for querying Freebase:

Percy Liang. Lambda Dependency-Based Compositional Semantics. arXiv report.