KQA Pro

KQA Pro^[1] is a large-scale dataset for Complex KBQA. It defines a compositional and highly interpretable formal format, named Program, to represent the reasoning process behind complex questions. Compositional strategies generate questions, their corresponding SPARQL queries, and Programs from a small number of templates; the generated questions are then paraphrased into natural language questions (NLQ) by crowdsourcing, giving rise to around 120K diverse instances. SPARQL and Program offer two complementary ways to answer complex questions, which can benefit a wide range of QA methods. Besides the QA task, this dataset can also serve the semantic parsing task. In addition, it is currently the largest corpus for NLQ-to-SPARQL and NLQ-to-Program.
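To make the idea of a compositional Program more concrete, here is a minimal Python sketch of a step-by-step executor over a toy knowledge base. The triples, the function names (`Find`, `Relate`, `QueryAttr`), and the step layout (`function`, `dependencies`, `inputs`) are assumptions chosen for illustration; they may not match the dataset's actual schema or function set.

```python
# Toy knowledge base as (subject, predicate, object) triples.
# Illustrative data only, not taken from KQA Pro.
TRIPLES = [
    ("Inception", "director", "Christopher Nolan"),
    ("Interstellar", "director", "Christopher Nolan"),
    ("Inception", "publication_year", 2010),
    ("Interstellar", "publication_year", 2014),
]


def find(name):
    """Return the set containing the entity with the given name."""
    return {name}


def relate(entities, predicate):
    """Return subjects linked to any of `entities` through `predicate`."""
    return {s for (s, p, o) in TRIPLES if p == predicate and o in entities}


def query_attr(entities, key):
    """Return the sorted values of attribute `key` for the given entities."""
    return sorted(o for (s, p, o) in TRIPLES if p == key and s in entities)


# Hypothetical function registry; the names are assumptions.
FUNCTIONS = {"Find": find, "Relate": relate, "QueryAttr": query_attr}

# "In which years were the films directed by Christopher Nolan released?"
# Each step names a function, the earlier steps it depends on, and its
# literal inputs (assumed layout).
program = [
    {"function": "Find", "dependencies": [], "inputs": ["Christopher Nolan"]},
    {"function": "Relate", "dependencies": [0], "inputs": ["director"]},
    {"function": "QueryAttr", "dependencies": [1], "inputs": ["publication_year"]},
]


def execute(steps):
    """Run steps in order, feeding each one the results it depends on."""
    results = []
    for step in steps:
        args = [results[i] for i in step["dependencies"]]
        results.append(FUNCTIONS[step["function"]](*args, *step["inputs"]))
    return results[-1]


answer = execute(program)
print(answer)
```

Because every step records its intermediate result, the reasoning chain can be inspected step by step, which is the kind of interpretability the Program format is meant to provide.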

This dataset can be downloaded via the link.

Leaderboard

Year	Type	F1	Acc	Reported by	Official Repo
2020	SP-based	89.7	-	Cao Y. et al.	Repo

References

[1] Shi, Jiaxin, Shulin Cao, Liangming Pan, Yutong Xiang, Lei Hou, Juanzi Li, Hanwang Zhang, and Bin He. KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base. arXiv preprint arXiv:2007.03875 (2020).

