This document presents an algorithm called CORE for generating compact yet relaxable answers to keyword queries over knowledge graphs. CORE aims to balance answer compactness, defined as having a bounded diameter, with answer completeness, defined as covering the most query keywords. It provides theoretical foundations for the existence of such answers and uses a best-first search approach. An evaluation shows CORE efficiently computes answers that are more complete than alternatives while remaining compact.
Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Graphs
1. Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Graphs
Gong Cheng¹, Shuxin Li¹, Ke Zhang¹, Chengkai Li²
¹ State Key Laboratory for Novel Software Technology, Nanjing University, China
² Department of Computer Science and Engineering, University of Texas at Arlington, United States
ISWC 2020 1
2. Keyword Search over Knowledge Graphs
[Figure: example knowledge graph G. Entities: United States (US), United Kingdom (UK), Montana (MT), Ohio, London, Yellowstone National Park (YSNP), Yellowstone (YS), The Trip (TT), BBC, G7. Relations: isLocatedIn, country, headquarters, producedBy, member.]
Q: united states yellowstone park trip
3. Two Paradigms
For lookup tasks: semantic parsing (keyword query → SPARQL query)
4. Two Paradigms
For lookup tasks: semantic parsing (keyword query → SPARQL query)
For exploratory tasks: answer subgraph extraction (keyword query → GST)
Keyword Search over Knowledge Graphs
[Figure: the example knowledge graph with a Group Steiner Tree (GST) answer to Q: united states yellowstone park trip]
5. Answer completeness?
Covering all the query keywords
Answer compactness?
Having a compact structure (e.g., a small diameter)
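The two measures above can be made concrete with a small sketch. The Python below is illustrative only: the three-vertex answer, its edges, and the keyword labels are assumptions modeled on entity names from the slides, not data taken from them. It treats an answer as an undirected, unweighted graph, computes its diameter by breadth-first search, and lists the query keywords that no vertex label covers.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Largest shortest-path distance between two vertices of a connected graph."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


def uncovered_keywords(query, labels, vertices):
    """Query keywords that match no vertex label in the answer."""
    covered = set()
    for v in vertices:
        covered |= query & labels[v]
    return query - covered


# Hypothetical answer subgraph (edges assumed): YSNP - MT - US.
answer_adj = {"YSNP": {"MT"}, "MT": {"YSNP", "US"}, "US": {"MT"}}
# Hypothetical keyword labels per vertex.
labels = {
    "US": {"united", "states"},
    "MT": {"montana"},
    "YSNP": {"yellowstone", "national", "park"},
}
query = {"united", "states", "yellowstone", "park", "trip"}

answer_diameter = diameter(answer_adj)
missing = uncovered_keywords(query, labels, answer_adj)
print(answer_diameter, sorted(missing))
```

On this toy answer the diameter is 2 and only "trip" is left uncovered, which is the kind of trade-off between compactness and completeness that the question on this slide raises.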
Motivation --- Pros and Cons of GST
[Figure: Group Steiner Tree (GST) answer over the example knowledge graph; diameter: 7, uncovered keywords: 0]
6. Computing compact but relaxable subgraphs
Guaranteed answer compactness: having a bounded diameter (D)
Maximized answer completeness: covering the largest number of query keywords
Main Idea
[Figure: Group Steiner Tree (GST) answer over the example knowledge graph; diameter: 7, uncovered keywords: 0]
7. Computing compact but relaxable subgraphs
[Figure: Minimally Relaxed Answer (MRA) over the example knowledge graph under D=2; diameter: 2, uncovered keywords: 1]
8. A necessary and sufficient condition for the existence of a compactness-bounded complete answer to a keyword query
Approach --- Theoretical Foundations
[Figure: fragment of the example knowledge graph showing United States (US), Montana (MT), and Yellowstone National Park (YSNP), with relations isLocatedIn and country]
We refer to a vertex v that satisfies this condition as a certificate vertex for Q.
• E.g., Montana for "united states yellowstone park" under D=2
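The Montana example can be checked with a short sketch. The slides do not reproduce the exact condition, so the code below rests on a simplifying assumption made here for illustration: for an even bound D, a vertex counts as a certificate vertex when it lies within D/2 hops of some matching vertex for every query keyword. The edge list, the keyword-to-vertex matches, and the function names are likewise hypothetical.

```python
from collections import deque


def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def hops_to_nearest(adj, sources):
    """Multi-source BFS: hop distance from each reachable vertex to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def certificate_vertices(adj, keyword_vertices, D):
    """Vertices within D // 2 hops of a match for every keyword (assumed even-D reading)."""
    if D % 2 != 0:
        raise ValueError("this simplified check assumes an even D")
    radius = D // 2
    candidates = set(adj)
    for matches in keyword_vertices.values():
        dist = hops_to_nearest(adj, matches)
        candidates &= {v for v, d in dist.items() if d <= radius}
    return candidates


# Hypothetical edges loosely modeled on the entities named in the slides.
adj = build_adj([
    ("YSNP", "MT"), ("MT", "US"), ("US", "G7"), ("G7", "UK"),
    ("UK", "BBC"), ("BBC", "YS"), ("BBC", "TT"),
])
# Hypothetical keyword matches.
short_query = {
    "united": {"US", "UK"},
    "states": {"US"},
    "yellowstone": {"YSNP", "YS"},
    "park": {"YSNP"},
}
full_query = dict(short_query, trip={"TT"})

print(certificate_vertices(adj, short_query, 2))
print(certificate_vertices(adj, full_query, 2))
```

Under these assumptions MT is the only certificate vertex for the four-keyword query at D=2, and adding "trip" leaves no certificate vertex at all, which is the situation where relaxing the query becomes necessary.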
9. Approach --- Algorithm CORE
A best-first search algorithm
• one independent search starting from each keyword vertex
• a shared priority queue keeping search frontiers (priority: potentially uncovered keywords, based on distances)
• a more complete answer which the current vertex is a certificate vertex for
• early stop
• unvisited neighbors
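The annotations above can be pictured with a heavily simplified sketch. This is not the authors' CORE implementation (see the repository linked in the conclusion for that). It is a minimal best-first search written under assumptions made here: hop distances on an undirected graph, a search radius of D // 2 (an odd D is rounded down, which keeps the answer within the bound but may miss keywords), and a priority equal to the number of keywords that had not reached a vertex when it was queued. The graph, the keyword matches, and all names are hypothetical.

```python
import heapq


def best_first_relaxed_search(adj, keyword_vertices, D):
    """Return a vertex with the most keywords within D // 2 hops, plus those keywords."""
    radius = D // 2
    keywords = set(keyword_vertices)
    dist = {k: {} for k in keywords}  # one independent search per keyword
    reached = {}                      # vertex -> keywords whose search reached it
    best_vertex, best_cover = None, frozenset()
    heap = []                         # shared priority queue of search frontiers

    def label(k, v, d):
        nonlocal best_vertex, best_cover
        dist[k][v] = d
        cover = reached.setdefault(v, set())
        cover.add(k)
        if len(cover) > len(best_cover):  # a more complete answer was found
            best_vertex, best_cover = v, frozenset(cover)
        heapq.heappush(heap, (len(keywords) - len(cover), d, k, v))

    for k, matches in keyword_vertices.items():
        for v in matches:
            label(k, v, 0)

    # Early stop as soon as some vertex is reached by every keyword.
    while heap and len(best_cover) < len(keywords):
        _, d, k, v = heapq.heappop(heap)
        if d > dist[k][v] or d == radius:
            continue  # stale entry, or frontier already at the radius
        for w in adj[v]:
            # Expand neighbors that are unvisited or reachable by a shorter path.
            if d + 1 < dist[k].get(w, radius + 1):
                label(k, w, d + 1)
    return best_vertex, set(best_cover)


# Hypothetical graph and keyword matches modeled on the slides' entity names.
edges = [("YSNP", "MT"), ("MT", "US"), ("US", "G7"), ("G7", "UK"),
         ("UK", "BBC"), ("BBC", "YS"), ("BBC", "TT")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)
keyword_vertices = {
    "united": {"US", "UK"}, "states": {"US"},
    "yellowstone": {"YSNP", "YS"}, "park": {"YSNP"}, "trip": {"TT"},
}

vertex, covered = best_first_relaxed_search(adj, keyword_vertices, 2)
print(vertex, sorted(covered))
```

On this toy input with D=2 the search settles on MT and covers every keyword except "trip". Because shorter paths found later overwrite earlier labels, the result does not depend on the order in which the queue happens to serve the frontiers; the priority only affects how quickly a complete answer triggers the early stop.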
10. Running example
Approach --- Algorithm CORE
[Figure: running example of CORE on the example knowledge graph G for Q: united states yellowstone park trip]
15. Main finding
Trading off answer completeness for compactness is necessary.
Results --- Compactness of GST-Based Answers
(doc = diameter; |K| = max keyword hits)
16. Main finding
The completeness of our computed answers is very high.
Results --- Completeness of Relaxable Answers
(dor = number of uncovered keywords; |K| = max keyword hits)
17. Main finding
CORE is efficient and significantly outperforms CertQR+.
Results --- Efficiency of CORE
18. Take-home messages
Necessity of trading off answer completeness for compactness
Polynomial-time algorithm for generating compact but relaxable answers
https://github.com/nju-websoft/CORE
Future work
Vertex and/or edge weights
Conclusion