Towards Knowledge Graphs Validation through Weighted Knowledge Sources

Elwin Huaman, Amar Tauqeer, and Anna Fensel
Semantic Technology Institute (STI) Innsbruck
Department of Computer Science,
University of Innsbruck, Austria
KGSWC 2021 Next Generation

Elwin Huaman | KGSWC2021 | 23/11/2021
Outline
● What?
Basics - Research questions
● How?
Approach - Solution
● Why?
Use cases

What?
Basics - Research questions

What?
Weighted Knowledge Sources
Weighted knowledge sources are data sources that have different weights (or degrees of importance) for different application scenarios.
Which KG is best for me?
● Quality - fitness for use
● Whether the data complies with the user's needs
● Dependent on the task

What?
Knowledge Graphs
Knowledge Graphs are very large semantic nets that integrate various and heterogeneous information sources to represent knowledge about certain domains of discourse.
● Basic statement (or triple)
● We can add more statements
● Forming a graph
● Graphs can be created independently
● … and can be integrated
Figure: example knowledge graph with prefix : <http://example.org/>. Entities: :anna, :carol, :luis, :cs101, :cs102, :Puno. Literals: Anna, 21, Carol, Luis, None, Programming, Algebra, Puno. Relationships: :name, :age, :enrolledIn, :knows, :subject, :birthPlace. The two :Puno nodes are linked by a sameAs relationship. Legend: Entity, Literal, Relationship, sameAs relationship.

What?
Knowledge Graphs Validation
The Knowledge Graph validation task aims to measure whether statements from KGs are semantically correct and correspond to the so-called "real" world.
A simple statement, or triple: The University of Innsbruck is located in the city of Innsbruck.
A triple = (subject, predicate, object)
A triple: (University of Innsbruck, is located in, City of Innsbruck)
An RDF triple: (http://example.com/University_of_Innsbruck, http://schema.org/containedInPlace, http://example.com/Innsbruck)
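The triple above can be written down in code. The following is a minimal sketch, not part of the original slides: the `Triple` type and the `to_ntriples` helper are hypothetical names, and the helper only handles the case where all three terms are IRIs, as in the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A (subject, predicate, object) statement; all terms are IRIs here."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize an all-IRI triple as a single N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# The example statement from the slide.
located_in = Triple(
    "http://example.com/University_of_Innsbruck",
    "http://schema.org/containedInPlace",
    "http://example.com/Innsbruck",
)
line = to_ntriples(located_in)
print(line)
```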

What?
Knowledge Graphs Validation
What needs to be fixed?
● Wrong instance assertion
E.g. :anna is a Person, not a Product
● Wrong property value assertion
E.g. so:knows is semantically wrong
● Wrong equality assertion
E.g. :Puno and e:Puno are related, but not the same
● …
prefix : <http://example.org/>
prefix e: <http://example.com/>
prefix so: <http://schema.org/>
Figure: the example knowledge graph with errors. Entities: :anna, :carol, :cs101, :Puno, e:luis, e:Puno. Types: so:Course, so:Product, so:Place. Relationships: so:name, so:age, so:knows, so:teaches, so:birthPlace, so:courseCode, sameAs. Legend: Type, Entity, Literal, Relationship, sameAs relationship.
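To make the "wrong property value assertion" case concrete, here is a small illustrative sketch. It is not the validation method presented in these slides (which compares values across weighted sources); it only shows how a semantically wrong so:knows edge could be flagged with a simple range check. The toy triples, the type table, and the `range_violations` helper are assumptions made for illustration; the only external fact used is that schema.org lists Person as the expected value type of knows.

```python
SCHEMA = "http://schema.org/"

# Assumption: a tiny hand-written range table for this example only.
EXPECTED_RANGE = {SCHEMA + "knows": SCHEMA + "Person"}

# Hypothetical toy data, loosely following the figure.
types = {
    ":anna": SCHEMA + "Person",
    ":carol": SCHEMA + "Person",
    ":cs101": SCHEMA + "Course",
}
triples = [
    (":anna", SCHEMA + "knows", ":cs101"),   # a person "knows" a course
    (":carol", SCHEMA + "knows", ":anna"),   # a person knows a person
]


def range_violations(triples, types, expected_range):
    """Return triples whose object type differs from the property's expected range."""
    bad = []
    for s, p, o in triples:
        expected = expected_range.get(p)
        if expected is not None and types.get(o) != expected:
            bad.append((s, p, o))
    return bad


violations = range_violations(triples, types, EXPECTED_RANGE)
print(violations)
```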

What?
Towards Knowledge Graphs Validation through Weighted Knowledge Sources
Compute a confidence score for every triple (or statement) and instance in KGs. The computed score is based on finding the same instances across different weighted knowledge sources and comparing their features.
Figure: Validator overview (KG, Reliable KGs, Weights, score [0.1]).

How?
Approach - Solution

How?
Input: The user has two options: a) provide a SPARQL endpoint from which to fetch the data, or b) load a dataset in Turtle format.
Figure: Validator (KG, Reliable KGs).

How?
Mapping: The Validator maps the input KG and the external sources to a common format.
Figure: Validator (KG, Reliable KGs, Mapping, DS).
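As a rough illustration of the mapping step, the sketch below renames source-specific property names to a shared vocabulary. The source names (`source_a`, `source_b`), their field names, the sample records, and the `to_common` helper are all hypothetical; the slides do not specify how the mapping is implemented.

```python
# Assumption: each external source uses its own property names, and the
# common format uses the schema.org-style names on the right-hand side.
PROPERTY_MAP = {
    "source_a": {"label": "name", "phone": "telephone", "addr": "address"},
    "source_b": {"title": "name", "tel": "telephone", "street": "address"},
}


def to_common(source, record):
    """Rename known properties to the common format; drop unmapped ones."""
    mapping = PROPERTY_MAP[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


a = to_common("source_a", {"label": "Hotel Example", "phone": "+43 000", "stars": 4})
b = to_common("source_b", {"title": "Hotel Example", "tel": "+43 000"})
print(a)
print(b)
```

Once every source is expressed with the same property names, values can be compared property by property in the later steps.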

How?
Instance Matching: The Validator asks the user to define at least two properties (e.g., name and geo coordinates) that are to be used for the instance matching process.
Figure: Validator (KG, Reliable KGs, Mapping, DS, Instance matching).
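A minimal sketch of matching on two properties, assuming name and geo coordinates as in the example above. The similarity measure (`difflib.SequenceMatcher`), the thresholds, the sample hotels, and the helper names are illustrative assumptions, not the Validator's actual matching logic.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def geo_close(a, b, tolerance=0.001):
    """True if both latitude and longitude differ by at most `tolerance` degrees."""
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


def is_match(x, y, min_name_sim=0.8):
    """Two instances match only if BOTH properties agree."""
    return (name_similarity(x["name"], y["name"]) >= min_name_sim
            and geo_close(x["geo"], y["geo"]))


def find_matches(instance, candidates):
    return [c for c in candidates if is_match(instance, c)]


# Hypothetical data: one instance from the user's KG, three from a source.
kg_hotel = {"name": "Hotel Alpenblick", "geo": (47.2692, 11.4041)}
candidates = [
    {"id": "s1", "name": "hotel alpenblick", "geo": (47.2693, 11.4040)},
    {"id": "s2", "name": "Hotel Alpenblick", "geo": (48.2082, 16.3738)},
    {"id": "s3", "name": "Pension Sonne", "geo": (47.2692, 11.4041)},
]
matched_ids = [c["id"] for c in find_matches(kg_hotel, candidates)]
print(matched_ids)
```

Requiring agreement on both properties is what rules out the same-named hotel in another location and the differently named place at the same coordinates.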

How?
Confidence Measurement / Triple validation: Computes a confidence score indicating whether a property value in the various external sources matches the property value in the user’s KG.
Figure: Validator (KG, Reliable KGs, Mapping, DS, Instance matching, Triple validation, Weights, score [0.1]).
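One way to picture a weighted triple score is as the share of source weight that agrees with the user's value. The sketch below is an assumption for illustration only: the slides do not give the formula, and the weights, the exact-match comparison after normalization, and the names `normalize` and `triple_confidence` are hypothetical.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(str(value).lower().split())


def triple_confidence(kg_value, source_values, weights):
    """Weighted share of sources whose value agrees with the KG value.

    source_values: {source_name: value found in that source}
    weights:       {source_name: weight of that source}
    """
    total = sum(weights[s] for s in source_values)
    if total == 0:
        return 0.0
    agreeing = sum(
        weights[s]
        for s, v in source_values.items()
        if normalize(v) == normalize(kg_value)
    )
    return agreeing / total


# Hypothetical weights and values for one address triple.
weights = {"wikidata": 0.6, "dbpedia": 0.4}
score = triple_confidence(
    "Innrain 52",
    {"wikidata": "innrain 52", "dbpedia": "Innrain 25"},
    weights,
)
print(score)
```

Here only the higher-weighted source agrees, so the score equals that source's share of the total weight.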

How?
Confidence Measurement / Instance validation: Computes the aggregated score from the attribute space of an instance.
Figure: Validator (KG, Reliable KGs, Mapping, DS, Instance matching, Triple validation, Instance validation, Weights, scores [0.1]).
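The aggregation over an instance's attributes could, for example, be a plain average of its per-triple scores. This is a sketch under that assumption; the slides do not state which aggregation function is used, and `instance_confidence` and the sample scores are hypothetical.

```python
def instance_confidence(triple_scores):
    """Aggregate per-property triple scores into one instance score (mean)."""
    if not triple_scores:
        return 0.0
    return sum(triple_scores.values()) / len(triple_scores)


# Hypothetical triple scores for one hotel instance.
hotel_scores = {"name": 1.0, "address": 0.6, "telephone": 0.5}
instance_score = instance_confidence(hotel_scores)
print(round(instance_score, 2))
```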

How?
Output: The computed scores for triples and instances are shown in a graphical user interface.
Figure: Validator (KG, Reliable KGs, Mapping, DS, Instance matching, Triple validation, Instance validation, Weights, scores [0.1]).


How?
Evaluation I:
● Dataset: A subset of the Tirol Knowledge Graph (~15 billion statements).
○ 50 hotel instances
● Baseline: We performed a manual validation.
○ Precision, recall, and F-measure
● Result: F-measure of at least 75% on the address, name, and phone properties.
Figure: Comparison of precision, recall, and F-measure scores over the manual and semi-automatic validation.
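For reference, the three evaluation measures can be computed from counts of true positives, false positives, and false negatives as sketched below. The counts in the example are made up for illustration and are not the results of this evaluation.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F-measure (harmonic mean of the two)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, comparing automatic decisions against a manual baseline.
p, r, f = precision_recall_f1(tp=30, fp=10, fn=10)
print(p, r, f)
```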

How?
Evaluation II:
● Dataset: Pantheon dataset (11,341 famous biographies)
○ 2,530 politician instances
● Setup: We defined two external sources.
○ Wikidata and DBpedia
● Result: ~15 minutes.
○ Overall recall scores are
■ 0.36% (DBpedia)
■ 0.49% (Wikidata)
Figure: The recall score results of the validation of politician instances.

Why?
Use cases

Why?
Use cases:
● Semantic correctness of a triple.
E.g. validating whether the displayed data about a person or business is correct, based on different sources.
● Linking different knowledge sources.
E.g. linking an instance of the user’s KG with the matched instance in Wikidata.
● Validating static data.
E.g. checking whether the addresses of hotels are up to date and correctly shown by external sources.

Insights & Limitations
❏ Assessment
❏ Automation
❏ Cost-effectiveness
❏ Dynamic-data
❏ Scalability

Summary
● A validation framework
○ Mapping
○ Instance matching
○ Confidence measurement
■ Triple validation
■ Instance validation
○ GUI
● Use cases
● Insights and limitations

Acknowledgement
STI
Univ.-Prof. Dr. Dieter Fensel
Assoc.-Prof. Dr. Anna Fensel
Amar Tauqeer, M.Sc.
Projects
MindLab (mindlab.ai)
WordLiftNG (wordlift.io/ng/)
