EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

EARL
Joint Entity and Relation Linking for
Question Answering over Knowledge Graphs
Mohnish Dubey1,2
, Debayan Banerjee1,2
, Debanjan Chaudhuri1,2
,Jens Lehmann1,2
1
University of Bonn, Bonn, Germany
2
Fraunhofer IAIS, St. Augustin, Germany

Question Answering
Entity linking in a minute
EARL Overview
Approach 1: GTSP
Approach 2: Connection Density
Comparison of the two approach
Evaluation
Conclusion
Outline

Question Answering over KG
Understand the intent of a factual question, and return the
implicit KB resource.
Typically treated as a translation problem from natural
language to formal language.
Seen major advancement in the past five years.
4

Question Answering over Knowledge Graph
end-to-end
machine learning
fully rule based
systems
5

dbr:Angela_Merkel
dbo:birthPlace“Angela Merkel”
Select right template for
the type of question
Where was Angela Merkel born?
NER
“born”Relation Linking
“Where/WRB was/VBD
Angela/NNP Merkel/NNP
born/VBN ?/.”
NLP
Preprocessing
Entity Linking
Relation Linking
SPARQL type
SELECT DISTINCT ?uri WHERE {
dbr:Angela_Merkel dbo:birthPlace ?uri }
Question Entity ; Relation Extracted ; Expected Answer type
Question Understanding
6
Refer : [1][2]

Key components in QA pipeline
1. Entity Identification and Linking
2. Relation Identification and Linking
3. Query Building
7

Key components in QA pipeline
1. Entity Identification and Linking
2. Relation Identification and Linking
3. Query Building
} EARL
8

Anatomy of a Entity Linking
System
9
Mention Detection Candidate Selection Disambiguation
Refer : http://qatutorial.sda.tech/

Mention Detection
Anatomy
10
Robert John Downey Jr. is an American actor
and singer. Downey was born April 4, 1965 in
Manhattan, New York, the son of writer,
director and filmographer Robert Downey Sr.
and actress Elsie Downey . Beginning in 2008,
Downey began portraying the role of Marvel
Comics superhero Iron Man in the Marvel
Cinematic Universe, appearing in several films
as either the lead role
Candidate Selection Disambiguation

Anatomy
11
Cinematic Universe, appearing in several
films as either the lead role
Mention Detection Candidate Selection Disambiguation

Anatomy
12
Candidate
Selection
DisambiguationMention Detection
Cinematic Universe, appearing in several
films as either the lead role

Anatomy
13
Candidate Selection DisambiguationMention Detection
Cinematic Universe, appearing in several films
as either the lead role

But what about questions
● Single sentence
● Mostly with One Entity
● No previous context
- Disambiguation is problem
14

Relation Linking
● String Similarity
● Dictionary
● WordNet
● Word Embedding
● Machine learning approaches
15
.. written by ..
.. wrote ..
.. author ...
.. screenwriter..
dbo:writer

Advantage Disadvantage
Sequential - Reduces candidate search
space for Relation Linking
- Allows schema verification
- Relation Linking information cannot be
exploited in Entity Linking process
- Errors in Entity Linking cannot be overcome
Parallel - Lower runtime
- Re-ranking of Entities possible
based on Relation Linking
- Entity Linking process cannot use information
from Relation Linking process and vice versa
- Does not allow schema verification
Joint - Potentially high accuracy
- Reduces error propagation
- Better disambiguation
- Allows re-ranking
- Complexity increase
- Larger search space
State of the art for Entity and Relation linking in QA
16

Preliminaries
Transform KG to the Subdivision Graph
S(G) of a graph G is the graph obtained from G by replacing each
edge e = (u, v) of G by a new vertex we
and 2 new edges (u, we
) and (v, we
) .
17

Postulates
H1:Given candidate lists of entities and relations from a question, the
correct solution is a cycle of minimal cost that visits exactly one
candidate from each list.
H2: Given candidate lists of entities and relations from a question, the
correct candidates exhibit relatively dense and short-hop connections
among themselves in the knowledge graph compared to wrong
candidate sets.
H3: Jointly linking entity and relation leads to higher accuracy compared
to performing these tasks separately.
18

Preprocessing
Word Tokenizer
- "Where was the founder of Tesla and SpaceX born?" we identify
<founder, Tesla, SpaceX, born> as our keyword phrases.
- Stop word removal
21
Refer : [3]
"Where was the founder of Tesla and SpaceX born?"

Preprocessing
Entity Relation Predictor
- Predict whether each keyword phrase is an entity or a relation
- Character embedding based long-short term memory network (LSTM)
- RDF2Vec[4] provides latent representation for entities and relations in RDF
graphs
22
Refer : [4]

Preprocessing
Candidate Generation
- To retrieve the top candidates for a keyword EARL uses Elasticsearch index of
URI-label pairs
- Expanded labels - Wikidata, Oxford Dictionary and fastText
23
Refer : https://www.elastic.co/products/elasticsearch
https://developer.oxforddictionaries.com/
https://fasttext.cc/

Approach 1: GTSP
● f(spot) which maps a string s (the input question) to a set K of substrings of s.
● f(cand) which maps each keyword to a set of candidate node
● f(cost) which maps how closely nodes are related
● Goal: is to find combinations of candidates, which are closely related
25

Approach 1: GTSP
26
Who is the husband of the leader of Germany?
Refer : [5]

Approach 1: GTSP
27

Approach 1: GTSP
28
…
….
…..
…
….
…..
…
….
…..

Approach 1: GTSP
Each node set cand(k) is called a cluster in this vertex set.
“The GTSP problem is to find a subset V’ = (v1
, . . . , vn
) of
V which contains exactly one node from each cluster and the
total cost n−1
∑i=1
cost(vi
, vi+1
) is minimal with respect to all such
subsets.”
29

30

31

Approximate GTSP Solvers
The GTSP is NP-hard and hence it is intractable.
GTSP can be reduced to standard TSP.
state-of-the-art approximate GTSP solver is the Lin–Kernighan–Helsgaun algorithm
32
Refer : [6]

Drawbacks of GTSP solution
Only provide the best candidate for each keyword given the list of candidates
Approximate GTSP solutions do not explore all possible paths and nodes
33

Connection Density
Connection Density is based on the three features:
1. Text similarity based initial Rank of the List item (Ri
)
2. Connection-Count (C)
3. Hop-Count (H)
35

Connection Density
)
Initial Rank of the List (Ri
), is generated by retrieving the candidates
from the search index via text search. This is achieved in the
preprocessing steps
3. Hop-Count (H)
36

Connection Density
)
The Connection-Count C for an candidate c, is the number of
connections from c to candidates in all the other lists divided by
the total number n of keywords spotted.
3. Hop-Count (H)
37

Connection Density
)
The Connection-Count C for an candidate c, is the number of
connections from c to candidates in all the other lists divided by
the total number n of keywords spotted.
3. Hop-Count (H)
38
…
….
…..
…
….
…..
…
….
…..

Connection Density
)
3. Hop-Count (H)
The Hop-Count H for a candidate c, is the sum of distances
from c to all the other candidates in all the other lists divided
by the total number of keywords spotted.
39

Connection Density
)
3. Hop-Count (H)
Final rank list ( Rf
) based on the the three features
40

Connection Density
)
3. Hop-Count (H)
Classifier
Ri
C
H
Rf
41

Connection Density
)
3. Hop-Count (H)
Classifier
xgboost
Ri
C
H
Rf
42

Connection Density
43

GTSP vs Connection Density
GTSP Connection Density
Requires no training data Requires data to train the XGBoost classifier
The approximate GSTP LKH solution is only
able to return the top result as not all
possible paths are explored.
Returns a list of all possible candidates in
order of score
Time complexity of LKH is O(nL2
)
where n = number of nodes in graph,
L = number of clusters in graph
Time complexity is O(N2
L2
)
where N = number of nodes per cluster,
L = number of clusters in graph
Relies on identifying the path with minimum
cost
Depends on identifying dense and short-hop
connections
45

Postulates ( a look back)
H1:Given candidate lists of entities and relations from a question, the
correct solution is a cycle of minimal cost that visits exactly one
candidate from each list.
H2: Given candidate lists of entities and relations from a question, the
correct candidates exhibit relatively dense and short-hop connections
among themselves in the knowledge graph compared to wrong
candidate sets.
H3: Jointly linking entity and relation leads to higher accuracy compared
to performing these tasks separately.
46

Evaluation
Contribution 1 : create a gold label data set for entity and relation linking over
LC-QuAD[7] dataset of 5000 questions
Contribution 2 : Extended label set for Dbpedia entity and relations
Experiment 1: hypothesis H1 and H2
Experiment 2: hypothesis H2
Experiment 3 & 4 : hypothesis H3
47
Refer : [7]

Experiment 1: GTSP vs Connection density
Approach Accuracy (K=30) Accuracy (K=10) Time Complexity
Brute Force GTSP 0.61 0.62 O(n2
2n
)
LKH- GTSP 0.59 0.58 O(nLn
)
Connection Density 0.61 0.62 O(N2
L2
)
Aim: To compare the time complexity and accuracy of different approaches
48

Experiment 2: Reranking
Value of k Rf
based on Ri
Rf
based on C, H Rf
based on Ri
, C, H
k = 10 0.543 0.689 0.708
k = 30 0.544 0.666 0.735
k = 50 0.543 0.617 0.737
k* = 10 0.568 0.864 0.905
k* = 30 0.554 0.779 0.864
k* = 50 0.549 0.603 0.852
Aim: Evaluating Connection Density for predicting the correct entity and
relation candidates from a set of candidates (Metric is MRR scores)
49k* - artificially insert the correct candidate into each list to purely test re-ranking abilities of our system

Experiment 3 : Entity Linking
System Accuracy LC-QuAD Accuracy QALD
FOX[8] + AGDISTIS[9] 0.36 0.30
DBpediaSpotlight[10] 0.40 0.42
TextRazor[11] 0.52 0.53
Babelfy[12] 0.56 0.56
EARL without adaptive learning 0.61 0.55
EARL with adaptive learning 0.65 0.57
Aim: To evaluate EARL with other state-of-the-art systems on the entity
linking task (Metric is Accuracy scores (top-1))
50
Refer : [8,9,10,11,12]

Experiment 4 : Relation Linking
System Accuracy LC-QuAD Accuracy QALD
ReMatch[13] 0.12 0.31
RelMatch[14] 0.15 0.29
EARL without adaptive learning 0.32 0.45
EARL with adaptive learning 0.36 0.47
Aim: To evaluate EARL with other Relation linking system
(Metric is Accuracy scores (top-1))
51
Refer : [13,14]

Discussion
We experimentally achieve similar accuracy as the exact GTSP solution with
both LKH-GTSP and Connection Density with better time complexity.
The current approach does not tackle questions with hidden relations
EARL cannot be used as inference tool for entities
52

Conclusion
We provided two strategies for joint linking:
- one based on reducing the problem to an instance of the Generalised
Travelling Salesman problem.
- other based on a connection density based machine learning approach.
53

54
This work was supported by grants from the EU H2020 Framework Programme
provided for the project HOBBIT (GA no. 688227).
Acknowledgement

55
This work was supported by grants from the EU H2020 Framework Programme
provided for the project WDAqua (ITN, GA. 642795)
Acknowledgement

References
56
[1] K. Höffner, S. Walter, E. Marx, R. Usbeck, J. Lehmann, and A.-C. Ngonga Ngomo. Survey on challenges of
question answering in the semantic web. Semantic Web, 2017
[2] K. Singh, A. S. Radhakrishna, A. Both, S. Shekarpour, I. Lytra, R. Usbeck, et al. Why reinvent the wheel: Let’s build
question answering systems together. In Proceedings of the 2018 World Wide Web Conference on World Wide Web.
[3] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost)
from scratch. Journal of Machine Learning Research, 2(Aug):2493–2537, 2011.
[4] P. Ristoski and H. Paulheim. Rdf2vec: Rdf graph embeddings for data mining. In International Semantic Web
Conference, pages 498–514. Springer, 2016.
[5] G. Laporte, H. Mercure, and Y. Nobert. Generalized travelling salesman problem through n sets of nodes: the
asymmetrical case. Discrete Applied Mathematics, 18(2):185-197, 1987.
[6] K. Helsgaun. Solving the equality generalized traveling salesman problem using the lin–kernighan–helsgaun
algorithm. Mathematical Programming Computation, 2015.
[7] P. Trivedi, G. Maheshwari, M. Dubey, and J. Lehmann. Lc-quad: A corpus for complex question answering over
knowledge graphs. In International Semantic Web Conference, 2017.

References
57
[8] R. Speck and A.-C. Ngonga Ngomo. Ensemble learning for named entity recognition. In The Semantic Web –
International Semantic Web Conference, 2014.
[9] R. Usbeck, A.-C. N. Ngomo, M. Röder, et al. Agdistis-graph-based disambiguation of named entities using
linked data. In International Semantic Web Conference, 2014.
[10] P. N. Mendes, M. Jakob, A. García-Silva, and C. Bizer. Dbpedia spotlight: shedding lighton the web of
documents. In Proceedings of the 7th international conference on semantic systems, 2011.
[11] https://www.textrazor.com/
[12] A. Moro, A. Raganato, and R. Navigli. Entity linking meets word sense disambiguation: a unified approach.
Transactions of the Association for Computational Linguistics, 2014.
[13] I. O. Mulang, K. Singh, and F. Orlandi. Matching natural language relations to knowledge graph properties for
question answering. In Proceedings of the 13th International Conference on Semantic Systems, pages 89–96.
ACM, 2017.
[14] K. Singh, I. O. Mulang, I. Lytra, M. Y. Jaradeh, A. Sakor, M.-E. Vidal, C. Lange, and S. Auer. Capturing
knowledge in semantically-typed relational patterns to enhance relation linking. In Proceedings of the Knowledge
Capture Conference, page 31. ACM, 2017.

Questions
Suggestions
Discussions
59

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Recommended

Recommended

More Related Content

What's hot

What's hot (10)

Similar to EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Similar to EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs (20)

More from Holistic Benchmarking of Big Linked Data

More from Holistic Benchmarking of Big Linked Data (20)

Recently uploaded

Recently uploaded (20)

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs