QALD-7
Question Answering over Linked Data Challenge
Presenter: Giulio Napolitano
QALD-7 @ ESWC 2017
Portoroz, Slovenia
Horizon 2020, GA No 688227
May 30th, 2017
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 1 / 16
Overview
Question answering systems mediate between
a user expressing an information need in natural language
and RDF-modelled data
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 2 / 16
Overview
QALD is a series of evaluation campaigns that provide a benchmark for
comparing different approaches and systems
get a picture of their strengths and shortcomings
gain insight into how we can develop approaches that deal with Semantic
Web data as a knowledge source
QALD-1 @ ESWC 2011 (3)
QALD-2 @ ESWC 2012 (4)
QALD-3 @ CLEF 2013 (6)
QALD-4 @ CLEF 2014 QA track (9)
QALD-5 @ CLEF 2015 QA track (7)
QALD-6 @ ESWC 2016 (13)
QALD-7 @ ESWC 2017 (3)
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 3 / 16
Tasks
Overall task Given a natural language question, retrieve the correct
answer(s) from a given RDF repository.
Types of challenges (specific tasks):
1 Multilingual
2 Hybrid
3 Large scale
4 Wikidata
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 4 / 16
Task 1 - Multilingual questions
Dataset: DBpedia 2016-04 (with multilingual labels)
Questions: 215 training, 50 test
provided in 8 languages: English, German, Spanish, Italian, French, Dutch,
Romanian, Farsi
can be answered with respect to the provided RDF data
annotated with corresponding SPARQL queries and answers
Challenge: Lexical and structural gap between natural language expressions and
data, e.g.
high → elevation
have inhabitants → populationTotal
graduate from → almaMater
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 5 / 16
Example
Which book has the most pages?
Welches Buch hat die meisten Seiten?
Quale libro ha il maggior numero di pagine?
Quel livre a le plus de pages?
¿Que libro tiene el mayor numero de paginas?
. . .
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 6 / 16
Task 2 - Hybrid questions
Dataset: DBpedia 2016-04 (with free text abstracts)
Questions: 105 training, 50 test
provided in English
can be answered only by integrating structured data (RDF) and unstructured
data (free text abstracts)
annotated with pseudo-queries and answers
Challenge: find information in several sources, process both structured and
unstructured information, and combine them into one answer.
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 7 / 16
Example
Who is the front man of the band that wrote Coffee & TV?
PREFIX r e s : <http :// dbpedia . org / r e s o u r c e/>
PREFIX dbo : <http :// dbpedia . org / ontology/>
SELECT DISTINCT ? u r i
WHERE {
r e s : Coffee_&_TV dbo : m u s i c a l A r t i s t ?x .
?x dbo : bandMember ? u r i .
? u r i t e x t : " i s " t e x t : " frontman " .
}
http://dbpedia.org/resource/Damon_Albarn
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 8 / 16
Task 3 - Large scale
Dataset: DBpedia 2016-04
Questions: 100 training, 2M test
provided in English
automatically generated
questions sent every minute, n+1 questions asked at minute n
Challenge: deal with high volume requests in a short time
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 9 / 16
Task 4 - Wikidata
Dataset: Wikidata 2017-01
Questions: 100 training, 50 test
provided in English
questions based on DBpedia but performed on Wikidata
Challenge: formulate generic approaches, adapting to new data sources
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 10 / 16
Participants
Dennis Diefenbach, Kamal Singh, Pierre Maret
WDAqua-core0: A Question Answering Component for the
Research Community
Task 1 and Task 4
Nikolay Radoev, Mathieu Tremblay, Michel Gagnon, Amal Zouaq
Answering Natural Language Questions on RDF Knowledge base
in French
Task 1
Daniil Sorokin, Iryna Gurevych
End-to-end Representation Learning for Question Answering with
Weak Supervision
Task 4
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 11 / 16
Organization committee
Ricardo Usbeck
Universität Leipzig, Germany
Axel-Cyrille Ngonga Ngomo
Universität Leipzig, Germany
Bastian Haarmann
Fraunhofer Institute IAIS, Germany
Anastasia Krithara
National Center for Scientic Research “Demokritos", Greece
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 12 / 16
Data experts
Harsh Takkar
Universität Bonn, Germany
Henning Petzka
Fraunhofer Institute IAIS, Germany
Jens Jehmann
Fraunhofer Institute IAIS, Germany
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 13 / 16
Program committee
Corina Forascu - Alexandru Ioan Cuza University, Iasi, Romania
Sebastian Walter - CITEC, Universität Bielefeld, Germany
Bernd Müller - ZBMed, Germany
Christoph Lange - Fraunhofer Gesellschaft, Germany
Dennis Diefenbach - Université de Saint-Étienne, France
Edgard Marx - Universität Leipzig, Germany
Hady Elsahar - Université de Saint-Étienne, France
Ioanna Lytra - Universität Bonn, Germany
John McCrae - INSIGHT - The Centre for Data Analytics, Ireland
Konrad Höffner - Universität Leipzig, Germany
Kuldeep Singh - Universität Bonn, Germany
Saeedeh Shekarpour - Kno.e.sis Center, Ohio Center of Excellence in
Knowledge-enabled Computing, USA
Sherzod Hakimov - CITEC, Universität Bielefeld, Germany
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 14 / 16
The End
Thank You!
Thanks to Christina Unger for sharing her slides!!!!
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 15 / 16
Don’t forget
Thursday 1st June
9:00-11:00 Poster session
17:30 Awards at closing ceremony
Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 16 / 16

Qald 7 at ESWC2017

  • 1.
    QALD-7 Question Answering overLinked Data Challenge Presenter: Giulio Napolitano QALD-7 @ ESWC 2017 Portoroz, Slovenia Horizon 2020, GA No 688227 May 30th, 2017 Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 1 / 16
  • 2.
    Overview Question answering systemsmediate between a user expressing an information need in natural language and RDF-modelled data Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 2 / 16
  • 3.
    Overview QALD is aseries of evaluation campaigns that provide a benchmark for comparing different approaches and systems get a picture of their strengths and shortcomings gain insight into how we can develop approaches that deal with Semantic Web data as a knowledge source QALD-1 @ ESWC 2011 (3) QALD-2 @ ESWC 2012 (4) QALD-3 @ CLEF 2013 (6) QALD-4 @ CLEF 2014 QA track (9) QALD-5 @ CLEF 2015 QA track (7) QALD-6 @ ESWC 2016 (13) QALD-7 @ ESWC 2017 (3) Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 3 / 16
  • 4.
    Tasks Overall task Givena natural language question, retrieve the correct answer(s) from a given RDF repository. Types of challenges (specific tasks): 1 Multilingual 2 Hybrid 3 Large scale 4 Wikidata Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 4 / 16
  • 5.
    Task 1 -Multilingual questions Dataset: DBpedia 2016-04 (with multilingual labels) Questions: 215 training, 50 test provided in 8 languages: English, German, Spanish, Italian, French, Dutch, Romanian, Farsi can be answered with respect to the provided RDF data annotated with corresponding SPARQL queries and answers Challenge: Lexical and structural gap between natural language expressions and data, e.g. high → elevation have inhabitants → populationTotal graduate from → almaMater Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 5 / 16
  • 6.
    Example Which book hasthe most pages? Welches Buch hat die meisten Seiten? Quale libro ha il maggior numero di pagine? Quel livre a le plus de pages? ¿Que libro tiene el mayor numero de paginas? . . . Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 6 / 16
  • 7.
    Task 2 -Hybrid questions Dataset: DBpedia 2016-04 (with free text abstracts) Questions: 105 training, 50 test provided in English can be answered only by integrating structured data (RDF) and unstructured data (free text abstracts) annotated with pseudo-queries and answers Challenge: find information in several sources, process both structured and unstructured information, and combine them into one answer. Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 7 / 16
  • 8.
    Example Who is thefront man of the band that wrote Coffee & TV? PREFIX r e s : <http :// dbpedia . org / r e s o u r c e/> PREFIX dbo : <http :// dbpedia . org / ontology/> SELECT DISTINCT ? u r i WHERE { r e s : Coffee_&_TV dbo : m u s i c a l A r t i s t ?x . ?x dbo : bandMember ? u r i . ? u r i t e x t : " i s " t e x t : " frontman " . } http://dbpedia.org/resource/Damon_Albarn Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 8 / 16
  • 9.
    Task 3 -Large scale Dataset: DBpedia 2016-04 Questions: 100 training, 2M test provided in English automatically generated questions sent every minute, n+1 questions asked at minute n Challenge: deal with high volume requests in a short time Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 9 / 16
  • 10.
    Task 4 -Wikidata Dataset: Wikidata 2017-01 Questions: 100 training, 50 test provided in English questions based on DBpedia but performed on Wikidata Challenge: formulate generic approaches, adapting to new data sources Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 10 / 16
  • 11.
    Participants Dennis Diefenbach, KamalSingh, Pierre Maret WDAqua-core0: A Question Answering Component for the Research Community Task 1 and Task 4 Nikolay Radoev, Mathieu Tremblay, Michel Gagnon, Amal Zouaq Answering Natural Language Questions on RDF Knowledge base in French Task 1 Daniil Sorokin, Iryna Gurevych End-to-end Representation Learning for Question Answering with Weak Supervision Task 4 Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 11 / 16
  • 12.
    Organization committee Ricardo Usbeck UniversitätLeipzig, Germany Axel-Cyrille Ngonga Ngomo Universität Leipzig, Germany Bastian Haarmann Fraunhofer Institute IAIS, Germany Anastasia Krithara National Center for Scientic Research “Demokritos", Greece Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 12 / 16
  • 13.
    Data experts Harsh Takkar UniversitätBonn, Germany Henning Petzka Fraunhofer Institute IAIS, Germany Jens Jehmann Fraunhofer Institute IAIS, Germany Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 13 / 16
  • 14.
    Program committee Corina Forascu- Alexandru Ioan Cuza University, Iasi, Romania Sebastian Walter - CITEC, Universität Bielefeld, Germany Bernd Müller - ZBMed, Germany Christoph Lange - Fraunhofer Gesellschaft, Germany Dennis Diefenbach - Université de Saint-Étienne, France Edgard Marx - Universität Leipzig, Germany Hady Elsahar - Université de Saint-Étienne, France Ioanna Lytra - Universität Bonn, Germany John McCrae - INSIGHT - The Centre for Data Analytics, Ireland Konrad Höffner - Universität Leipzig, Germany Kuldeep Singh - Universität Bonn, Germany Saeedeh Shekarpour - Kno.e.sis Center, Ohio Center of Excellence in Knowledge-enabled Computing, USA Sherzod Hakimov - CITEC, Universität Bielefeld, Germany Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 14 / 16
  • 15.
    The End Thank You! Thanksto Christina Unger for sharing her slides!!!! Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 15 / 16
  • 16.
    Don’t forget Thursday 1stJune 9:00-11:00 Poster session 17:30 Awards at closing ceremony Napolitano (Fraunhofer IAIS) Plenary 3 May 30th, 2017 16 / 16