Paper presented at European Semantic Web Conference ESWC, 3-7 June 2018, held in Heraklion, Crete, Greece (Aldemar Knossos Royal & Royal Villa).
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
1. Dynamic Planning for Link Discovery
Kleanthi Georgala and Daniel Obraczka and Axel-Cyrille Ngonga Ngomo
AKSW Research Group, University of Leipzig, Germany
Data Science Group (DICE), Paderborn University, Germany
June 2nd, 2018
Heraklion, Crete, Greece
Georgala, Obraczka & Ngonga Ngomo (DICE) CONDOR June 5, 2018 1 / 29
2. Overview
1 Motivation
2 Approach
3 Evaluation
4 Conclusions and Future Work
5. What is Link Discovery
4th Linked Data principle: Include links to other
URIs so that they can discover more things.
Definition (Link Discovery)
Given sets S and T of resources and relation R
Find M = {(s, t) ∈ S × T : R(s, t)}
Example: R = :failureType
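The definition above can be read as a filter over the Cartesian product S × T. The following is a minimal sketch of that reading, not the method used by Condor or LIMES; the toy identifiers and the `same_local_name` relation are hypothetical and exist only for illustration.

```python
from itertools import product


def discover_links(sources, targets, relation):
    """Return M = {(s, t) in S x T : R(s, t)} by testing every pair."""
    return {(s, t) for s, t in product(sources, targets) if relation(s, t)}


# Hypothetical toy data: resources are plain strings, and the relation
# holds when two identifiers end in the same local name.
S = ["ex:a/berlin", "ex:a/paris"]
T = ["ex:b/paris", "ex:b/rome"]


def same_local_name(s, t):
    return s.rsplit("/", 1)[-1] == t.rsplit("/", 1)[-1]


M = discover_links(S, T, same_local_name)
print(M)
```

The sketch tests all |S| · |T| pairs, which illustrates why time efficiency matters once the input sets are large.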
9. Declarative Link Discovery
M is difficult to compute directly
compute M = {(s, t) ∈ S × T : σ(s, t) ≥ θ}
use Link Specification (LS)
describe conditions for which R(s, t) holds
Similarity measure m : S × T → [0, 1]
Specification operators op: ⊓, ⊔, \
Example LS (tree):
Root: (θ, 0.73)
Left Child: trigrams(:type, :type), 0.87
Right Child: cosine(:label, :label), 0.46
Atomic LS: similarity measure cosine(:label, :label), threshold 0.46
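A link specification of this shape can be modelled as a small tree whose leaves are atomic LSs (a similarity measure plus a threshold) and whose inner nodes combine the mappings of their children. The sketch below makes several assumptions that are not stated on the slide: operators are treated as plain set intersection, union and difference of mappings, the threshold attached to an operator node is ignored, and the class names, the operator labels `and`/`or`/`minus` and the `trigram_jaccard` measure are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple, Union

Pair = Tuple[str, str]


@dataclass(frozen=True)
class AtomicLS:
    measure: Callable[[str, str], float]  # m : S x T -> [0, 1]
    threshold: float


@dataclass(frozen=True)
class ComplexLS:
    op: str  # hypothetical labels: "and", "or", "minus"
    left: "LS"
    right: "LS"


LS = Union[AtomicLS, ComplexLS]


def trigram_jaccard(a: str, b: str) -> float:
    """Hypothetical measure: Jaccard overlap of character trigrams."""
    ga = {a[i:i + 3] for i in range(len(a) - 2)}
    gb = {b[i:i + 3] for i in range(len(b) - 2)}
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def evaluate(ls: LS, sources, targets) -> Set[Pair]:
    """Compute the mapping of an LS bottom-up."""
    if isinstance(ls, AtomicLS):
        return {(s, t) for s in sources for t in targets
                if ls.measure(s, t) >= ls.threshold}
    left = evaluate(ls.left, sources, targets)
    right = evaluate(ls.right, sources, targets)
    if ls.op == "and":
        return left & right
    if ls.op == "or":
        return left | right
    if ls.op == "minus":
        return left - right
    raise ValueError(f"unknown operator: {ls.op}")


sources = ["condor", "helios"]
targets = ["condors", "helium"]
strict = AtomicLS(trigram_jaccard, 0.5)
loose = AtomicLS(trigram_jaccard, 0.3)
print(evaluate(ComplexLS("and", strict, loose), sources, targets))
print(evaluate(ComplexLS("minus", loose, strict), sources, targets))
```

Evaluating both children in full before combining them is the simplest possible strategy; a planner decides in which order, and with which shortcuts, these sub-specifications are actually executed.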
10. Why is it difficult?
Accuracy: correct links
Genetic programming
Refinement operators
. . .
Time efficiency: fast and scalable linking
Runtime reduction of the atomic similarity measures
Planning algorithms (e.g. HELIOS [1])
Use of cost functions to approximate runtime of LS
No exploitation of global knowledge about the LS
11. Our Contributions
Intuition
The execution engine knows more about runtimes than the planner once it has
executed a portion of the specification.
First dynamic planner for LD (Condor)
Mutable plans by re-shaping
Feedback loop between the planner and
the engine
Duplicated steps are executed once
Dependencies between steps of the plan
33. Dynamic execution of LS
Execution Engine:
Execute Left Child
Cache intermediate results
Replace the estimated costs with the real costs
Set Left Child and its sub-LSs as executed
Condor:
Receive feedback from Execution Engine
Re-evaluate plan
Executed plans are more important than runtime estimations
Re-plan remaining steps
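The interplay between engine and planner listed above can be sketched as a loop. This is a deliberately simplified illustration and not Condor's actual algorithm: the plan is flattened into a list of named steps instead of an LS tree, and the `Engine`, `Planner` and `run` names, the cost values and the toy steps are all hypothetical.

```python
import time
from typing import Callable, Dict, List


class Engine:
    """Executes steps, caches their results and records measured runtimes."""

    def __init__(self) -> None:
        self.cache: Dict[str, object] = {}
        self.real_cost: Dict[str, float] = {}

    def execute(self, name: str, step: Callable[[], object]) -> object:
        if name in self.cache:  # a duplicated step is executed only once
            return self.cache[name]
        start = time.perf_counter()
        result = step()
        self.real_cost[name] = time.perf_counter() - start
        self.cache[name] = result
        return result


class Planner:
    """Picks the next step, preferring steps that were already executed."""

    def __init__(self, estimated_cost: Dict[str, float]) -> None:
        self.cost = dict(estimated_cost)

    def feedback(self, engine: Engine) -> None:
        # Replace estimated costs with the real costs seen by the engine.
        self.cost.update(engine.real_cost)

    def next_step(self, remaining: List[str], engine: Engine) -> str:
        def price(name: str) -> float:
            # Cached results are free, so executed steps win over estimates.
            return 0.0 if name in engine.cache else self.cost[name]
        return min(remaining, key=price)


def run(steps, estimated_cost):
    functions = dict(steps)
    remaining = [name for name, _ in steps]
    engine, planner = Engine(), Planner(estimated_cost)
    order = []
    while remaining:
        name = planner.next_step(remaining, engine)  # re-plan remaining steps
        engine.execute(name, functions[name])
        planner.feedback(engine)                     # feedback loop
        remaining.remove(name)
        order.append(name)
    return order, engine


# Hypothetical toy plan in which step "a" occurs twice.
calls = {"a": 0, "b": 0}


def step_a():
    calls["a"] += 1
    return {("s1", "t1")}


def step_b():
    calls["b"] += 1
    return {("s2", "t2")}


order, engine = run([("a", step_a), ("b", step_b), ("a", step_a)],
                    {"a": 5.0, "b": 1.0})
print(order, calls)
```

In this toy run the cheaper step `b` is chosen first, and the second occurrence of `a` is answered from the cache instead of being executed again.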
41. Experiment set-up
Datasets:
4 benchmark datasets: Abt-Buy, Amazon-GP, DBLP-ACM and DBLP-Scholar
Scalability: MOVIES, TOWNS and VILLAGES
Input LS:
100 LSs for each dataset by Eagle
Unsupervised version
High accuracy in LSs
Comparison with Canonical and Helios
All planners achieved 100% F-measure
Evaluation metric: Runtime
43. Experiment 1
Q1 : Does Condor achieve better runtimes for LSs?
Condor outperforms both static planners in all datasets
Wilcoxon signed-rank test on cumulative runtimes: statistically significant
differences
45. Experiment 2
Q2 : How much time does Condor spend planning?
Condor needs less than 10ms for planning
Best average performance in Amazon-GP
4.6 times faster than Canonical
8 times faster than Helios
0.1% of overall runtime used in planning
Highest absolute difference in DBLP-Scholar
600s less runtime than Canonical
110s less runtime than Helios
0.0005% of overall runtime used in planning
47. Experiment 3
Q3 : How do the different sizes of LSs affect Condor's runtime?
LSs of size 1: same results for all planners
LSs of size 3: 7.5% faster than static planners
LSs of size 5++: 30.5% and 55.7% less time than Canonical and Helios,
respectively
49. Conclusions and Future Work
Condor, a dynamic planner for link discovery:
Combination of dynamic planning with subsumption and result caching
Comparison with state-of-the-art: Canonical and Helios
Evaluation:
Experiments on 7 datasets: variety in size and classes
Significantly better runtimes than existing planning solutions
Up to 2 orders of magnitude faster
Requires less than 0.1% of the total runtime of a given LS for plan generation
Future Work:
Improvement of the cost function
Parallel execution of plans
50. Thank you!
Visit http://aksw.org/Projects/LIMES.html
https://twitter.com/DiceResearch
Questions?
Kleanthi Georgala
georgala@informatik.uni-leipzig.de
AKSW Research Group at Leipzig University
DICE Group at Paderborn University
http://aksw.org/KleanthiGeorgala.html
This work has been supported by H2020 projects SLIPO (GA no. 731581) and HOBBIT (GA no. 688227) as well as the DFG project LinkingLOD (project
no. NG 105/3-2) and the BMWI Project GEISER (project no. 01MD16014)
51. References
[1] A.-C. Ngonga Ngomo.
HELIOS - Execution Optimization for Link Discovery.
In The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference,
Riva del Garda, Italy, October 19-23, 2014. Proceedings, Part I, pages 17–32.
Springer, 2014.