1. Using Caching for Local Link Discovery on Large
Data Sets
Mofeed Hassan, René Speck and Axel-Cyrille Ngonga Ngomo
Agile Knowledge Engineering and Semantic Web
Department of Computer Science
University of Leipzig
Augustusplatz 10, 04109 Leipzig
{mounir,speck,ngonga}@informatik.uni-leipzig.de
June 25, 2015
ICWE-2015
2. Data Web and Link Discovery
1 Web of Data
2 Fourth Linked Data principle
3 Links are central for
    Cross-ontology QA
    Data Integration
    Reasoning
    Federated Queries
    ...
4 Linked Data on the Web
    10+ thousand datasets
    89+ billion triples
    ≈ 500+ million links
M. Hassan, R. Speck and A. Ngonga June 25, 2015 Caching in Link Discovery
2 / 15
12. ORCHID
Idea
How to cache the closest segments to compare?
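One way to picture the question is a small least-recently-used (LRU) cache placed in front of segment loading. The sketch below is illustrative only and is not the ORCHID implementation: the class name `LRUSegmentCache`, the segment ids, and the `load_segment` callback are all hypothetical.

```python
from collections import OrderedDict


class LRUSegmentCache:
    """Tiny LRU cache mapping a (hypothetical) segment id to its loaded data."""

    def __init__(self, capacity, load_segment):
        self.capacity = capacity          # assumed to be at least 1
        self.load_segment = load_segment  # called only on a cache miss
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, segment_id):
        if segment_id in self.entries:
            self.hits += 1
            self.entries.move_to_end(segment_id)  # mark as most recently used
            return self.entries[segment_id]
        self.misses += 1
        data = self.load_segment(segment_id)
        self.entries[segment_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return data


# Hypothetical access pattern over three segments with room for two.
cache = LRUSegmentCache(capacity=2, load_segment=lambda sid: f"segment-{sid}")
for sid in ["a", "b", "a", "c", "b"]:
    cache.get(sid)
print(cache.hits, cache.misses)
```

The hit and miss counters are kept on the cache so that different eviction policies could be compared on the same access pattern.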
15. Experiment Set-up
Datasets: LinkedGeoData
The experiment has two phases
Phase I: same cache size, different distance thresholds
Phase II: different cache sizes, same distance threshold
Two-phase set-up
          Data size   Cache size                      Dist. threshold
Phase I   10^4        10^3                            0, 0.1, 0.3, 0.5
Phase II  10^5        10^1, 10^2, 10^3, 10^4, 10^5    0.5
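The two-phase set-up can be written down as a small parameter grid. This is a sketch, assuming the table's data and cache sizes are powers of ten (10^4, 10^3 and so on) whose exponents were flattened during extraction; the function and key names are hypothetical.

```python
def phase_one_runs():
    """Phase I: fixed cache size, varying distance threshold."""
    return [
        {"data_size": 10**4, "cache_size": 10**3, "threshold": t}
        for t in (0, 0.1, 0.3, 0.5)
    ]


def phase_two_runs():
    """Phase II: varying cache size, fixed distance threshold."""
    return [
        {"data_size": 10**5, "cache_size": 10**k, "threshold": 0.5}
        for k in range(1, 6)
    ]


for run in phase_one_runs() + phase_two_runs():
    print(run)
```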
16. Results-Phase I
Figure: Cache hits for different distance thresholds (10^4 resources). x-axis: caching approaches (Fifo, Fifo2ndChance, LRU, Slru, Lfu, LfuDA); y-axis: cache hits, ticks from 10^3 to 10^7; one series per distance threshold (0, 0.1, 0.3, 0.5).
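Cache hits under different policies can be counted by replaying one access trace against each policy. The sketch below does this for simplified FIFO and LFU caches only; the trace, the capacity, and the function names are made up for illustration and are not the implementations or data used in the experiment.

```python
from collections import deque


def fifo_hits(trace, capacity):
    """Count hits for a FIFO cache: evict the key that entered first."""
    cached, order, hits = set(), deque(), 0
    for key in trace:
        if key in cached:
            hits += 1
            continue
        if len(cached) >= capacity:
            cached.discard(order.popleft())
        cached.add(key)
        order.append(key)
    return hits


def lfu_hits(trace, capacity):
    """Count hits for an LFU cache: evict the key with the fewest accesses."""
    counts, hits = {}, 0  # access counts for currently cached keys only
    for key in trace:
        if key in counts:
            hits += 1
            counts[key] += 1
            continue
        if len(counts) >= capacity:
            victim = min(counts, key=counts.get)  # ties: earliest inserted key
            del counts[victim]
        counts[key] = 1
    return hits


# Hypothetical trace of resource ids; capacity is assumed to be at least 1.
trace = ["a", "b", "a", "c", "a", "b", "d", "a"]
for name, policy in (("Fifo", fifo_hits), ("Lfu", lfu_hits)):
    hits = policy(trace, capacity=2)
    print(name, hits, hits / len(trace))
```

On a trace with one frequently reused key, the frequency-based policy keeps that key cached while FIFO evicts it. Whether such a gap appears in practice depends on the access pattern.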
17. Results-Phase I
Figure: Run times for different distance thresholds (10^4 resources). x-axis: caching approaches (Fifo, Fifo2ndChance, LRU, Slru, Lfu, LfuDA); y-axis: run time in milliseconds, ticks from 10^3 to 10^8; one series per distance threshold (0, 0.1, 0.3, 0.5).
20. Conclusion and Future Work
Experiment findings:
Preliminary results on caching in Link Discovery
Most of the caching approaches performed similarly
The caching approaches achieved relatively low cache hit rates
There is a need for a dedicated caching approach for Link Discovery.
21. Thank you!
Questions?
Mofeed Hassan
University of Leipzig
AKSW Research Group
Augustusplatz 10, Room P616
04109 Leipzig, Germany
mounir@informatik.uni-leipzig.de
@akswgroup