Report on the First Knowledge Graph Reasoning Challenge 2018 -Toward the eXplainable AI System-

Report on the First Knowledge Graph
Reasoning Challenge 2018
– Toward the eXplainable AI System –
Takahiro Kawamura*1, Shusaku Egami*2, Koutarou Tamura*3,
Yasunori Hokazono*4, Takanori Ugai*5, Yusuke Koyanagi*5,
Fumihito Nishino*5, Seiji Okajima*5, Katsuhiko Murakami*5,
Kunihiko Takamatsu*6, Aoi Sugiura*7, Shun Shiramatsu*8,
Shawn Zhang*8, Kouji Kozaki*9
1.National Agriculture and Food Research Organization, Japan
2.National Institute of Maritime, Port and Aviation Technology, Japan
3. NRI digital, Ltd. 4.Nomura Research Institute, Ltd.
5. Fujitsu Laboratories Ltd. 6. Kobe Tokiwa University 7. Kobe City Nishi-Kobe Medical Center
8.Nagoya Institute of Technology 9. Osaka Electro-Communication University

Investig
ation
strategy
Criminal
motive ….
Summary of Knowledge Graph Reasoning Challenge
A Contest to develop AI systems which have
abilities for “Reasoning” and “Explanation”
such like Sherlock Holmes.
Sherlock
Holmes
mystery story
Open Knowledge
Graph(OKG) AI system that estimate criminals
with reasonable explanations using
the OKG and other knowledge
The motive is …
Trick is …
The criminal is
XX Because …

Agenda of Talk
• Summary of Knowledge Graph Reasoning Challenge
by K. Kozaki
• Knowledge Graph Construction
by S. Egami
• Approach for estimation and reasoning techniques
by T. Ugai
• Evaluation / Conclusion and Current Work
by T. Kawamra
3

Knowledge Graph Construction
4

Knowledge Graph construction process
Extract scenes
(Manual)
Sentence
Simplification
(Manual)
Semantic Role
Annotating
(Manual)
Translation
(Auto)
RDF
construction
(Auto)
Add object types
(Manual)
Add absolute
time
(Manual)
Scene Linking
(Manual)
• Discussion about schema design and methodology
• We first held five open workshops from November 2017 to April 2018.
• The total number of participants in the workshops was 110.
• After the preliminary experiment of knowledge graph construction cooperating with the participants, we
finally adopted the below process.
5

• Basic policy
– Focus on scenes in a novel and the relationship of those scenes, including the
characters, objects, places, etc., with related scenes
• A scene ID (IRI) has subjects, verbs, objects, etc.
• Edges mainly represent five Ws (When, Where, Who, What, and Why).
Architecture of the Knowledge Graph
Scene1 Scene2 Scene3
Scene4
Scene5
Resource
Literal
type subject
source
source
subject subject
hasPredicate
source
hasPredicate
subject
hasPredicate
then
therefore because
then
6

Scene ID
Source text
Subject
Predicate
Object
subject
hasPredicate
source
5W1H
Scene ID
Relatio
n
• Properties representing scenes
– subject (who)： A person or object representing the subject of the scene
– hasPredicate： A predicate representing the content of the scene
– 5W1H： what, where, when, whom, why, how.
– Relation among scenes： then, if, because, …etc.
– time： Absolute time（xsd:DateTime）
– source： Source text（EN/JP literal）
subject object
predicate
Scene
subject objectpredicate
Schema (Scene)
7

Original Sentence (EN|JA)
Absolute Time
Property values are defined as
resources to be referred in the
other scene
Predicate
Subject
Relationship to other Scene ID
Scene Type
- Situation： Fact
- Statement：Remark by A
- Talk： Remark by A to B
- Thought： Idea of A
Schema (Scene): Example
Unique ID (IRI)
of Each scene
8

Visualization tool
• We provided a knowledge graph visualization tool for applicants
Visualization
SPARQL query text
Keyword search
http://knowledge-graph.jp/visualization/ 10

Approach for estimation and
reasoning techniques
11

First Prized: NRI
• formalized the problem
as a constraint
satisfaction problem and
solved with a
lightweight formal
method
12

Second Prized: Fujitsu Laboratories
• used SPARQL and rules
13

Best Idea: Nagoya Institute of Technology
• discusses multi-agents model
14

Best Resource: Fujitsu Laboratories
• constructed word embedding of characters from all sentences of
Sherlock Holmes novels
15

Evaluation
• For evaluating estimation and reasoning techniques with explainability,
– metrics design for explainability, utility, novelty, and performance is required,
– but evaluation is also based on a qualitative comparison through discussion and
peer reviews.
• DARPA XAI states that
– the current AI techniques have a trade-off between accuracy and explainability,
so both properties should be measured.
– In particular, to measure the effectiveness of the explainability,
DARPA XAI rates user satisfaction regarding its clarity and utility.
• We first share the basic information of the proposed approaches, and then discuss the
evaluation of experts and of the general public.
17

Sharing Basic Information for Preparation
• The basic info. Was investigated and shared with experts in advance,
who were 7 board members of the SIG on Semantic Web and Ontology in JSAI
– Correctness of the answer
Check if the resulting criminal was correct or not, regardless of the approach.
The criminal, in this case, is the one designated in the novel or story.
– Feasibility of the program
Check if the submitted program correctly worked and the results were reproduced
(excluding idea-only submissions).
– Performance of the program
Referential information on the system environment and performance of the submitted
program, except for the idea only.
– Amount of data/knowledge to be used
How much did the approach use the knowledge graph (the total num. of scene IDs used)?
If the approach used external knowledge and data, we noted information about them.
18

Expert Evaluation
• Over more than a week, the experts evaluated the following aspects
according to five grades (1–5).
• For Estimation and/or reasoning methods,
– Significance
Novelty and technical improvement of the method.
– Applicability
Is the approach applicable to the other problems?
3 : applicable to the other novels and stories
5 : applicable to other domains.
– Extensibility
Is the approach expected to have a further technical extension?
19

Expert Evaluation cont'd
• For Knowledge and data,
– Originality of knowledge/data construction
E.g., how much external knowledge and data were prepared?
– Originality of knowledge/data use
E.g., was a small set of knowledge used efficiently, or
was a large set of knowledge used to simplify the process.
• and…
– Feasibility of idea (for idea only)
Feasibility of idea including algorithms and data/knowledge construction.
– Logical explainability
Is an explanation logically persuadable?
1 : no explanation and evidence
3 : some evidence in any form is provided
5 : there is an explanation that is consistent with the estimation and reasoning process.
– Effort
Amount of effort required for the submission (knowledge/data/system).
20

Public Evaluation
• Although the experts determine if a logical explanation could be held,
the public eval. focused on the psychological aspect of the explanations, that is,
the satisfaction with the explanation.
• The 45 applicants of SIGSWO meeting, Nov. 2018 answered to
after 15 min presentation
– Total score
– Explainability
• We added the total score to include psychological impressions other than explainability,
– such as presentation quality and entertainment aspects.
– If we only had a score for explainability, such aspects could be mixed in the explainability.
21

Public evaluation results
• In the public eval.,
ave., med., and sd
of the total scores and
the explainability
were compared.
• Comparing 1st and 2nd prizes, we found
– Ave. of both the total score and the explainability were higher for 1st prize.
– Med. of the total score of 1st was higher than 2nd prize, but
Med. of the explainability of 1st was the same as 2nd prize.
– Sd of both the total score and the explainability were also bigger for 1st prize
than for 2nd prize. (smaller the better)
• The paired t-test (α = 0.05) indicated that
– diff. in total score had a statistically significant difference bet. 1st and 2nd., but
– diff. in the explainability was not significantly different.
>
>
>
>
=
>
22

Expert evaluation results
• In the expert evaluation,
– Ave. of each metric in 1st prize
were higher than 2nd prize, but
the explainability was statistically
significantly higher for 2nd prize.
– We should note that sd of ave.
for each metric were less than 0.1
• Therefore, the final decision was left
to the expert peer review.
• As a result, the prize order was
determined, since
– the metrics other than the explainability
of 1st prize were > or = 2nd
• The eval. including explainability was left
to the future challenge…
2nd 1st
>
<
23

Conclusion and Current Work
• The 1st knowledge graph reasoning challenge in 2018 was summarized
for the development of AI techniques integrating ML and reasoning.
• The 2nd challenge started in June 2019 !
– 4 knowledge graphs from 4 mystery novels were added and published on
https://github.com/KnowledgeGraphJapan/KGRC-RDF/tree/master/2019
(Speckled Band + A Case Of Identity / Crooked Man / Dancing Men / Devils Foot)
– In addition to guessing criminal with explanation,
– it is better to be commonly applied to as many novels as possible.
– New tool/utility creation task is also added.
• The 3rd international challenge will open call for application in 2020!
• The 4th challenge will have knowledge graphs of real social problems, e.g., books listing
best practices of social problem solving, etc. 24

Report on the First Knowledge Graph Reasoning Challenge 2018 -Toward the eXplainable AI System-

More Related Content

What's hot

Similar to Report on the First Knowledge Graph Reasoning Challenge 2018 -Toward the eXplainable AI System-

More from KnowledgeGraph

Recently uploaded

Report on the First Knowledge Graph Reasoning Challenge 2018 -Toward the eXplainable AI System-