Books | Movies | Soccer | Music | TV series |
---|---|---|---|---|
When was the first book of the book series The Dwarves published? | Who played the joker in The Dark Knight? | Which European team did Diego Costa represent in the year 2018? | Led Zeppelin had how many band members? | Who is the actor of James Gordon in Gotham? |
2003 | Heath Ledger | Atletico Madrid | 4 | Ben McKenzie |
What is the name of the second book? | When did he die? | Did they win the Super Cup the previous year? | Which was released first: Houses of the Holy or Physical Graffiti? | What about Bullock? |
The War of the Dwarves | 22 January 2008 | No | Houses of the Holy | Donal Logue |
Who is the author? | Batman actor? | Which club was the winner? | Is the rain song and immigrant song there? | Creator? |
Markus Heitz | Christian Bale | Real Madrid C.F. | No | Bruno Heller |
In which city was he born? | Director? | Which English club did Costa play for before returning to Atletico Madrid? | Who wrote those songs? | Married to in 2017? |
Homburg | Christopher Nolan | Chelsea F.C. | Jimmy Page | Miranda Cowley |
When was he born? | Sequel name? | Which stadium is this club's home ground? | Name of his previous band? | Wedding date first wife? |
10 October 1971 | The Dark Knight Rises | Stamford Bridge | The Yardbirds | 19 June 1993 |
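Programmatically, each conversation is a sequence of question-answer turns in which follow-up questions leave the topic entity implicit. Below is a minimal sketch of this structure; the `Turn` and `Conversation` classes are hypothetical, not the official data format, and are populated with the Books conversation from the table above.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    question: str   # possibly incomplete follow-up, e.g. "Who is the author?"
    answer: str     # gold answer provided by the same Turker

@dataclass
class Conversation:
    domain: str
    turns: list[Turn]

# The Books conversation from the example table above.
books = Conversation(
    domain="Books",
    turns=[
        Turn("When was the first book of the book series The Dwarves published?", "2003"),
        Turn("What is the name of the second book?", "The War of the Dwarves"),
        Turn("Who is the author?", "Markus Heitz"),
        Turn("In which city was he born?", "Homburg"),
        Turn("When was he born?", "10 October 1971"),
    ],
)

# Only the first turn names the topic entity; every later turn must be
# resolved against the conversational context.
assert all("Dwarves" not in t.question for t in books.turns[1:])
```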
ConvQuestions is the first realistic benchmark for conversational question answering over knowledge graphs. It contains 11,200 conversations that can be evaluated over Wikidata, compiled from the inputs of 70 Master crowdworkers on Amazon Mechanical Turk and spanning five domains: Books, Movies, Soccer, Music, and TV Series. The questions feature a variety of complex phenomena such as comparisons, aggregations, compositionality, and temporal reasoning, and answers are grounded in Wikidata entities to enable fair comparison across diverse methods.

The data-gathering setup was kept as natural as possible: annotators selected entities of their choice from each of the five domains and formulated an entire conversation in one session. All questions in a conversation come from the same Turker, who also provided the gold answers. To keep the questions suitable for knowledge graphs, they were constrained to be objective or factoid in nature, but no other restrictive guidelines were imposed.

A notable property of ConvQuestions is that several questions are not answerable by Wikidata alone (as of September 2019), but the required facts can, for example, be found on the open Web or in Wikipedia. For details, please refer to our CIKM 2019 full paper.
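Because answers are grounded in Wikidata entities, a gold answer can (where coverage permits) be checked against the public SPARQL endpoint. A minimal sketch for the first Books answer follows; matching by the English label "The Dwarves" is an assumption for illustration, since the benchmark itself ships entity IDs.

```python
import requests

# Look up the publication date (P577) of the item labeled "The Dwarves".
QUERY = """
SELECT ?item ?pubDate WHERE {
  ?item rdfs:label "The Dwarves"@en ;
        wdt:P577 ?pubDate .
}
LIMIT 5
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "convquestions-example/0.1"},  # WDQS asks for a UA
)
resp.raise_for_status()

for row in resp.json()["results"]["bindings"]:
    print(row["item"]["value"], row["pubDate"]["value"])
```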
Model | MRR | P@1 |
---|---|---|
EXPLAIGNN* (heterogeneous sources) Christmann et al. '23 | 0.447 | 0.363 |
EXPLAIGNN (KB-only) Christmann et al. '23 | 0.399 | 0.330 |
KRR† (gold seed entity) Ke et al. '22 | 0.397 | 0.397 |
PRALINE Kacupaj et al. '22 | 0.373 | 0.294 |
CONQUER Kaiser et al. '21 | 0.279 | 0.240 |
OAT† (gold seed entity) Marion et al. '21 | 0.260 | 0.250 |
Focal Entity Model Lan et al. '21 | 0.248 | 0.248 |
CONVEX Christmann et al. '19 | 0.200 | 0.184 |
OAT Marion et al. '21 | 0.175 | 0.166 |
Star Model | 0.175 | 0.175 |
Chain Model | 0.075 | 0.075 |
D2A Guo et al. '18 | 0.061 | 0.061 |
* EXPLAIGNN makes use of heterogeneous sources (KB, text, tables, and infoboxes).
† This variant assumes that the gold seed entity for each conversation is given.
Results on the leaderboard are for the (incomplete) follow-up questions only.
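For reference, the two leaderboard metrics can be reproduced from per-question answer rankings: MRR averages the reciprocal rank of the first correct answer, and P@1 is the fraction of questions whose top-ranked answer is correct. The sketch below makes the simplifying assumption that each question's ranked answers are given as a list of correctness flags.

```python
def mrr(rankings: list[list[bool]]) -> float:
    """Mean reciprocal rank: 1/rank of the first correct answer per
    question (0 if no correct answer appears), averaged over questions."""
    total = 0.0
    for ranked in rankings:
        for i, correct in enumerate(ranked, start=1):
            if correct:
                total += 1.0 / i
                break
    return total / len(rankings)

def p_at_1(rankings: list[list[bool]]) -> float:
    """Precision at 1: fraction of questions whose top answer is correct."""
    hits = sum(1 for ranked in rankings if ranked and ranked[0])
    return hits / len(rankings)

# Three toy questions: correct answer at rank 1, at rank 2, and missing.
example = [[True, False], [False, True], [False, False]]
print(mrr(example))     # (1 + 0.5 + 0) / 3 = 0.5
print(p_at_1(example))  # 1/3 ≈ 0.333
```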