Parallel Datalog Reasoning in RDFox (presentation, DBOnto)
Abstract:
We present a novel approach to parallel materialisation (i.e.,
fixpoint computation) of datalog programs in centralised,
main-memory, multi-core RDF systems. Our approach comprises an algorithm that evenly distributes the workload to cores, and an RDF indexing data structure that supports efficient, ‘mostly’ lock-free parallel updates. Our empirical evaluation shows that our approach parallelises computation very well: with 16 physical cores, materialisation can be up to 13.9 times faster than with just one core.
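The fixpoint computation the abstract refers to can be illustrated with a toy example. This is not RDFox's parallel algorithm, just a minimal single-threaded semi-naive sketch: each round joins only the previous round's new facts (the "delta") against an indexed base relation, and materialisation stops when a round derives nothing new.

```python
# Toy semi-naive materialisation of a single datalog rule:
#     reachable(x, z) :- reachable(x, y), edge(y, z)
# Each round joins only the facts derived in the previous round (the
# "delta") against the edge index, so every derivation is tried once;
# the fixpoint is reached when a round derives nothing new.
def materialise(edges):
    facts = set(edges)   # reachable starts as a copy of edge
    delta = set(edges)   # facts that are new in the current round
    by_src = {}
    for y, z in edges:   # index edge by its first component
        by_src.setdefault(y, set()).add(z)
    while delta:
        new = set()
        for x, y in delta:
            for z in by_src.get(y, ()):
                if (x, z) not in facts:
                    new.add((x, z))
        facts |= new
        delta = new
    return facts

# A chain 1 -> 2 -> 3 -> 4 materialises to all six reachable pairs.
closure = materialise({(1, 2), (2, 3), (3, 4)})
```

The parallelism described in the abstract comes from partitioning this work across threads over a mostly lock-free index; the sketch shows only the sequential fixpoint.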
An Approach for the Incremental Export of Relational Databases into RDF Graphs (Nikolaos Konstantinou)
Several approaches have been proposed in the literature for offering RDF views over databases. In addition, a variety of tools exist for exporting database contents into RDF graphs. Approaches in the latter category have often been shown to perform better than those in the former. However, when database contents are exported into RDF, it is not always optimal, or even necessary, to export (or "dump", as this procedure is often called) the whole database contents every time. This paper investigates the problem of incrementally generating and storing the RDF graph that results from exporting relational database contents. To express the mappings that associate tuples in the source database with triples in the resulting RDF graph, an implementation of the R2RML standard is tested. Next, a methodology is proposed that enables incremental generation and storage of the RDF graph originating from the source relational database contents. The performance of this methodology is assessed through an extensive set of measurements, and the paper concludes with a discussion of the authors' most important findings.
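As a toy illustration of the incremental idea (not the paper's R2RML-based methodology), treating both the previously stored graph and the freshly mapped output as sets of triples reduces an incremental dump to two set differences:

```python
# Hypothetical sketch of incremental RDF dumping: instead of re-exporting
# the whole database, compare the freshly mapped triples against the
# previously stored graph and apply only the difference.
def incremental_update(stored, fresh):
    """Both arguments are sets of (subject, predicate, object) triples."""
    to_add = fresh - stored       # triples for rows that are new or changed
    to_remove = stored - fresh    # triples whose source rows disappeared
    updated = (stored - to_remove) | to_add
    return updated, to_add, to_remove

# Example: one row's value changed between exports.
stored = {("ex:1", "ex:name", "Alice"), ("ex:2", "ex:name", "Bob")}
fresh = {("ex:1", "ex:name", "Alice"), ("ex:2", "ex:name", "Robert")}
graph, added, removed = incremental_update(stored, fresh)
```

Only `added` and `removed` need to be written to the triple store, which is the source of the savings the paper measures.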
Presented at JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: The Semantic Web and Linked Open Data (LOD) are powerful technologies for knowledge management, and explicit knowledge is expected to be published in RDF (Resource Description Framework); yet ordinary users keep their distance from RDF because of the technical skills it requires. Since a concept map or node-link diagram can enhance learning from beginner to advanced level, RDF graph visualization is a suitable tool for familiarizing users with Semantic Web technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected, like a hairball, and poorly organized. To make a graph that presents knowledge more readable, this research introduces an approach that sparsifies a graph by combining three main functions: graph simplification, triple ranking, and property selection. These functions are largely based on interpreting RDF data as knowledge units, together with statistical analysis, in order to deliver an easily readable graph to users. A prototype demonstrates the suitability and feasibility of the approach: the simple, flexible graph visualization is easy to read and leaves a strong impression on users, and the tool helps inspire users to recognize the advantages of linked data in knowledge management.
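As a toy illustration only (the abstract does not spell out its ranking measure), a statistical triple ranking could score each triple by the degree of its endpoint nodes and keep the top-k triples to sparsify the graph:

```python
# Hypothetical degree-based triple ranking: score a triple by the degree
# of its subject and object, then keep the k highest-scoring triples.
# This is a stand-in for the statistical ranking the abstract mentions.
from collections import Counter

def sparsify(triples, k):
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
    # Stable sort: ties keep their original order.
    ranked = sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]],
                    reverse=True)
    return ranked[:k]

# "h" is a hub, so triples touching it rank first.
top2 = sparsify([("h", "p", "a"), ("h", "p", "b"),
                 ("h", "p", "c"), ("a", "p", "b")], 2)
```

The prototype combines such a ranking with simplification and property selection; this sketch shows only the ranking step.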
TripleWave: a step towards RDF Stream Processing on the Web (Daniele Dell'Aglio)
The slides of my talk at INSIGHT Centre for Data Analytics (in NUI Galway) where I presented TripleWave (http://streamreasoning.github.io/TripleWave/), an open-source framework to create and publish streams of RDF data.
Apache Hivemall is a scalable machine learning library for Apache Hive, Apache Spark, and Apache Pig.
Hivemall provides a number of machine learning functionalities across classification, regression, ensemble learning, and feature engineering through UDFs/UDAFs/UDTFs of Hive.
We made the first Apache release (v0.5.0-incubating) on March 5, 2018, and the project plans to release v0.5.2 in Q2 2018.
We will first give a quick walk-through of the features, usage, what's new in v0.5.0, and the future roadmap of Apache Hivemall. Next, we will introduce Hivemall on Apache Spark in depth, covering DataFrame integration and Spark 2.3 support.
Analysis of Air Pollution in Nova Scotia (presentation, Carlo Carandang)
This presentation is an analysis of air pollution in Nova Scotia. We detail how we obtain the dataset, how we clean it, how we process and analyze it, and then we visualize the results of the analysis.
Accelerating NLP with Dask and Saturn Cloud (Sujit Pal)
Slides for a talk delivered at the NY NLP Meetup. Abstract -- Python has a great ecosystem of tools for natural language processing (NLP) pipelines, but challenges arise when data sizes and computational complexity grow. Best case, a pipeline is left to run overnight or even over several days; worst case, certain analyses or computations are simply not possible. Dask is a Python-native parallel processing tool that enables Python users to easily scale their code across a cluster of machines. This talk presents an example of an NLP entity-extraction pipeline that uses SciSpaCy with Dask for parallelization. The pipeline extracts named entities from the CORD-19 dataset, using trained models from the SciSpaCy project, and makes them available for downstream tasks as structured Parquet files. The pipeline was built and executed on Saturn Cloud, a platform that makes it easy to launch and manage Dask clusters. The talk presents an introduction to Dask and explains how users can easily accelerate Python and NLP code across clusters of machines.
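The data-parallel shape of such a pipeline, a function mapped over document partitions, can be sketched with the standard library alone; here `concurrent.futures` stands in for a Dask cluster, and `extract_entities` is a trivial stand-in for a real SciSpaCy model call:

```python
# Sketch of the pipeline's data-parallel shape (not the talk's actual
# code): map an entity-extraction function over documents in parallel.
from concurrent.futures import ThreadPoolExecutor

def extract_entities(doc):
    # Stand-in for a SciSpaCy model: treat capitalised tokens as entities.
    return [w for w in doc.split() if w[:1].isupper()]

def run_pipeline(docs, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_entities, docs))

ents = run_pipeline(["Dask scales Python code", "CORD-19 has many papers"])
```

With Dask the same `map` runs across a cluster's worker processes instead of local threads, which is what makes the overnight-scale runs described in the talk tractable.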
Full version of http://www.slideshare.net/valexiev1/gvp-lodcidocshort. The same is available at http://vladimiralexiev.github.io/pres/20140905-CIDOC-GVP/index.html
CIDOC Congress, Dresden, Germany
2014-09-05: International Terminology Working Group: full version.
2014-09-09: Getty special session: short version
Linked geospatial data has recently received attention, as researchers and practitioners have started tapping the wealth of geospatial information available on the Web. Incomplete geospatial information, although appearing often in the applications captured by such datasets, is not represented and queried properly due to the lack of appropriate data models and query languages. We discuss our recent work on the model RDFi, an extension of RDF with the ability to represent property values that exist, but are unknown or partially known, using constraints, and an extension of the query language SPARQL with qualitative and quantitative geospatial querying capabilities. We demonstrate the usefulness of RDFi in geospatial Semantic Web applications by giving examples and comparing the modeling capabilities of RDFi with the ones of related Semantic Web systems.
In this paper, we propose the problem of implementing an efficient query processing system for incomplete temporal and geospatial information in RDFi as a challenge to the SSTD community.
DGraph: Introduction to Basics & Quick Start w/ Ratel (Knoldus Inc.)
The presentation introduces you to DGraph and explains its data types, indexes, edges, facets, and the types of mutation using RDF triples or JSON. It also takes you through the GQL+/- functions, filters, connectives, reverse edges, facets, and complex graph queries for DGraph using GQL+/-.
The aim of the EU FP7 Large-Scale Integrating Project LarKC is to develop the Large Knowledge Collider (LarKC for short, pronounced "lark"), a platform for massive distributed incomplete reasoning that will remove the scalability barriers of currently existing reasoning systems for the Semantic Web. The LarKC platform is available at larkc.sourceforge.net. This talk is part of a tutorial for early users of the LarKC platform and describes the data model used within LarKC.
... or how to query an RDF graph with 28 billion triples on a standard laptop
These slides correspond to my talk at the Stanford Center for Biomedical Informatics on 25th April 2018.
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF ... (DBOnto)
Franz et al. 2012. Reconciling Succeeding Classifications, ESA 2012 (taxonbytes)
Presentation on reconciling taxonomic concepts using the Euler approach, given at the 2012 Annual Meeting of Entomological Society of America, Knoxville, TN.
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge (Rathachai Chawuthai)
I used these slides for an introductory lecture (90 min) in a seminar on SPARQL. The slide set introduces the semantics of the RDF query language SPARQL.
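The heart of SPARQL's semantics is basic graph pattern (BGP) matching: a solution is a mapping from variables to RDF terms under which every triple pattern becomes a triple of the graph. A toy evaluator over an in-memory set of triples (a sketch of the semantics, not a conformant implementation):

```python
# Toy BGP matcher: enumerate all variable bindings under which every
# triple pattern in `patterns` is a triple of `graph`.
def match(patterns, graph, binding=None):
    binding = binding or {}
    if not patterns:
        yield binding          # all patterns satisfied: emit the solution
        return
    (s, p, o), rest = patterns[0], patterns[1:]
    for triple in graph:
        b = dict(binding)      # fresh copy so failed attempts don't leak
        if all(_unify(term, value, b)
               for term, value in zip((s, p, o), triple)):
            yield from match(rest, graph, b)

def _unify(term, value, binding):
    if term.startswith("?"):              # variable: bind or check
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value                  # constant: must match exactly

graph = {(":alice", ":knows", ":bob"), (":bob", ":knows", ":carol")}
# SELECT * WHERE { ?x :knows ?y . ?y :knows ?z }
rows = list(match([("?x", ":knows", "?y"), ("?y", ":knows", "?z")], graph))
```

Real engines evaluate the same semantics with indexes and join reordering rather than this nested enumeration.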
CIDOC Congress, Dresden, Germany
2014-09-05: International Terminology Working Group: full version (http://vladimiralexiev.github.io/pres/20140905-CIDOC-GVP/index.html)
2014-09-09: Getty special session: short version (http://VladimirAlexiev.github.io/pres/20140905-CIDOC-GVP/GVP-LOD-CIDOC-short.pdf)
Wi2015 - Clustering of Linked Open Data - the LODeX tool (Laura Po)
Presentation of the tool LODeX (http://www.dbgroup.unimore.it/lodex2/testCluster) at the 2015 IEEE/WIC/ACM International Conference on Web Intelligence, Singapore, December 6-8, 2015
Similar to Parallel and Incremental Materialisation of RDF/Datalog in RDFox
8th TUC Meeting - Juan Sequeda (Capsenta). Integrating Data using Graphs and ... (LDBC council)
Juan Sequeda, co-founder of Capsenta, gave an interesting talk on how we can integrate data using graphs and semantics (semantic data virtualization). As Mr. Sequeda said, the idea is to integrate data without needing to move it around. Juan started off his presentation by talking about the huge gap that exists between IT departments, the guardians of the data, and business development departments, which try to extract insights from the data.
8th TUC Meeting - Zhe Wu (Oracle USA). Bridging RDF Graph and Property Graph... (LDBC council)
During the 8th TUC Meeting, held at Oracle's facilities in Redwood City, California, Zhe Wu, Software Architect at Oracle Spatial and Graph, explained how his team is trying to bridge the RDF graph and property graph data models.
8th TUC Meeting – Yinglong Xia (Huawei), Big Graph Analytics Engine (LDBC council)
Yinglong started his talk with an introduction to his new position at Huawei, what the company is doing, and more specifically how it is involved in big data research and graphs. He also explained that his research center is currently working on big data analytics and management from four sides: natural language processing, graph analytics, machine learning, and deep learning.
8th TUC Meeting – George Fletcher (TU Eindhoven), gMark: Schema-driven data a... (LDBC council)
George Fletcher, Associate Professor at the Eindhoven University of Technology, presented gMark, an open-source framework for generating synthetic graph instances and workloads. The main focus of gMark has been to tailor different graph data management scenarios, often driven by query workloads, such as multi-query optimization, workload-driven graph database physical design, and mapping discovery and query rewriting in data integration systems.
8th TUC Meeting – Marcus Paradies (SAP), Social Network Benchmark (LDBC council)
Marcus Paradies, software developer at SAP, extended the talk Arnau Prats gave about the SNB, in this case covering the Business Intelligence workload. In contrast with the 17+4 queries of the Interactive workload, the Business Intelligence (BI) workload consists of 24 queries that can be seen as OLAP-style, against the OLTP style of the Interactive one. The BI workload focuses on analytic queries that touch the whole graph.
Sergey Edunov, software engineer at Facebook, gave a great talk on how and why his company generates large-scale social graphs. The underlying reasons to start such an ambitious project are capacity planning, to make sure that their system will be able to handle a graph that keeps growing year after year, and fair evaluation of their system against the ones implemented by other companies.
Weining Qian (ECNU). On Statistical Characteristics of Real-Life Knowledge Gr... (LDBC council)
Weining Qian, professor at East China Normal University, presented his talk on the statistical characteristics of real-life knowledge graphs during the 8th TUC Meeting, held at Oracle's facilities in Redwood City, California.
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status (LDBC council)
Peter Boncz, research scientist at the Centrum Wiskunde & Informatica in the Netherlands, gave an update on the Graph Query Language Task Force after its first year. This Task Force was created to address an issue detected during the benchmark meetings: all workloads are specified in English text because there is no common graph query language.
8th TUC Meeting | Lijun Chang (University of New South Wales). Efficient Subg... (LDBC council)
Lijun Chang, DECRA Fellow at the University of New South Wales, talked about how to make subgraph matching more efficient by postponing Cartesian products.
Jerven Bolleman, Lead Software Developer at the Swiss-Prot Group, explained why they offer a free SPARQL and RDF endpoint for the world to use and why it is hard to optimize.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyper-personalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! (SOFTTECHHUB)
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Parallel and incremental materialisation of RDF/DATALOG in RDFOX
1. PARALLEL AND INCREMENTAL MATERIALISATION OF
RDF/DATALOG IN RDFOX
Boris Motik
University of Oxford
March 20, 2015
2. TABLE OF CONTENTS
1 INTRODUCTION
2 PARALLEL MATERIALISATION OF DATALOG
3 HANDLING owl:sameAs VIA REWRITING
4 INCREMENTAL MATERIALISATION MAINTENANCE
5 CONCLUSION
Boris Motik Parallel and Incremental Materialisation of RDF/Datalog in RDFox 0/13
3. Introduction
4. Introduction
RDFOX SUMMARY
RDFox: a new RDF store and reasoner
http://www.cs.ox.ac.uk/isg/tools/RDFox/
Features:
RAM-based storage of RDF data
Currently centralised, but a distributed system is in the works
Datalog reasoning via materialisation
Can handle arbitrary (recursive) datalog rules, not just OWL 2 RL
Very effective parallelisation
Efficient reasoning with owl:sameAs via rewriting
Known and widely-used technique, but correctness not trivial
The Backward-Forward (B/F) incremental maintenance algorithm
Considerably improves on DRed
Compatible with rewriting of owl:sameAs
SPARQL query answering
Most of SPARQL 1.0 and some of SPARQL 1.1
5. Parallel Materialisation of Datalog
6. Parallel Materialisation of Datalog
MAIN CHALLENGES
1 Assign workload to threads evenly
Rules are generally not independent due to recursion
Static assignment of rule instances can become unbalanced due to data skew
⇒ Dynamic assignment with low overhead needed
2 Efficiently interleave . . .
. . . querying (during evaluation of rule bodies)
. . . updates (during updates of derived facts)
3 Provide indexes for efficient rule body evaluation
Crucial for elimination of duplicate triples ⇒ ensures termination
Usually sorted (and clustered) to allow for merge joins
Hash indexes can also be used
Individual (i.e., not bulk) index updates are inefficient
B. Motik, Y. Nenov, R. Piro, I. Horrocks, and D. Olteanu. Parallel Materialisation of Datalog Programs in
Centralised, Main-Memory RDF Systems. AAAI 2014, pages 129–137
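Sorted, clustered indexes make the joins in rule bodies answerable by merge joins: two key-sorted lists are scanned in lockstep rather than with nested loops. A minimal sketch of such a join over sorted (key, value) lists (my own illustrative Python, not RDFox code):

```python
def merge_join(left, right):
    """Join two lists sorted by key; yields (key, left_value, right_value)."""
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = left[i][0], right[j][0]
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            j0 = j                       # start of the run of equal keys
            while j < len(right) and right[j][0] == kl:
                yield (kl, left[i][1], right[j][1])
                j += 1
            i += 1
            if i < len(left) and left[i][0] == kl:
                j = j0                   # rewind for the next left tuple
```

Both inputs are consumed in a single pass, which is why sorted indexes pay off for rule evaluation.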
7. Parallel Materialisation of Datalog
SOLUTION PART I: ALGORITHM
R(a,b)
R(a,c)
R(b,d)
R(b,e)
A(a)
R(c,f)
R(c,g)
A(x) ∧ R(x, y) → A(y)
For each fact:
match the fact to all body atoms to obtain subqueries
evaluate subqueries w.r.t. all previous facts
add results to the table
Current subquery:
8.–20. Parallel Materialisation of Datalog
SOLUTION PART I: ALGORITHM (animation steps, collapsed)
The slides step through the fact list R(a,b), R(a,c), R(b,d), R(b,e), A(a), R(c,f), R(c,g),
applying A(x) ∧ R(x, y) → A(y) to one fact at a time:
R(a,b), R(a,c): subquery A(a) finds nothing among the previous facts
R(b,d), R(b,e): subquery A(b) finds nothing yet
A(a): subquery R(a,y) matches R(a,b) and R(a,c) ⇒ A(b) and A(c) are added
R(c,f), R(c,g): subquery A(c) finds nothing, since A(c) occurs later in the list
A(b): subquery R(b,y) matches R(b,d) and R(b,e) ⇒ A(d) and A(e) are added
A(c): subquery R(c,y) matches R(c,f) and R(c,g) ⇒ A(f) and A(g) are added
A(d)–A(g): subqueries R(d,y)–R(g,y) return no results ⇒ the computation terminates
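The fact-at-a-time loop shown on these slides can be written out as follows (my own illustrative Python for the single rule A(x) ∧ R(x, y) → A(y); RDFox itself is written in C++ and handles arbitrary datalog rules):

```python
def materialise(facts):
    """Fact-at-a-time materialisation of A(x) ∧ R(x, y) → A(y)."""
    table = list(facts)          # derived facts are appended at the end
    seen = set(table)            # duplicate elimination ensures termination
    i = 0
    while i < len(table):
        fact, prev = table[i], table[:i]
        if fact[0] == "A":       # fact matches body atom A(x): subquery R(x, y)
            derived = [("A", t[2]) for t in prev if t[0] == "R" and t[1] == fact[1]]
        else:                    # fact R(x, y) matches R(x, y): subquery A(x)
            derived = [("A", fact[2])] if ("A", fact[1]) in prev else []
        for f in derived:
            if f not in seen:    # never derive the same fact twice
                seen.add(f)
                table.append(f)
        i += 1
    return table
```

Evaluating each subquery only against *previous* facts is what prevents a rule instance from being derived twice: each instance fires exactly when its latest fact is processed.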
21. Parallel Materialisation of Datalog
PARALLELISING COMPUTATION
Each thread extracts facts and evaluates subqueries independently
The number of subqueries is determined by the number of facts,
which in practice ensures that threads are equally loaded
Requires no thread synchronisation
⇒ We partition rule instances dynamically and with little overhead
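The dynamic partitioning idea can be sketched as follows (my own illustrative Python; worker threads claim the next unprocessed fact as their unit of work, so no static division of rule instances is needed; a single lock stands in for RDFox's lock-free table, which this sketch does not attempt to reproduce):

```python
import threading

def rule(fact, prev):
    # A(x) ∧ R(x, y) → A(y): match 'fact' against each body atom in turn
    if fact[0] == "A":                           # subquery R(x, y)
        return [("A", t[2]) for t in prev if t[0] == "R" and t[1] == fact[1]]
    if ("A", fact[1]) in prev:                   # fact is R(x, y): subquery A(x)
        return [("A", fact[2])]
    return []

def parallel_materialise(facts, rule, n_threads=4):
    table = list(facts)
    seen = set(table)
    lock = threading.Lock()
    state = {"next": 0, "active": 0}

    def worker():
        while True:
            with lock:
                if state["next"] < len(table):
                    i = state["next"]            # dynamically claim a fact
                    state["next"] += 1
                    state["active"] += 1
                    fact, prev = table[i], table[:i]
                elif state["active"] > 0:
                    continue                     # someone may still add facts
                else:
                    return                       # no pending work: terminate
            derived = rule(fact, prev)           # the parallel part: no lock
            with lock:
                for f in derived:
                    if f not in seen:            # duplicate elimination
                        seen.add(f)
                        table.append(f)
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table
```

The fixpoint is unique, so the resulting set of facts is deterministic even though the interleaving of workers is not.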
22. Parallel Materialisation of Datalog
SOLUTION PART II: INDEXING RDF DATA IN MAIN MEMORY
The algorithm critically depends on:
matching atoms t1, t2, t3, where each ti is a constant or a variable
continuous concurrent updates
Our RDF storage data structure:
Hash-based indexes ⇒ naturally parallel data structure
‘Mostly’ lock-free: at least one thread makes progress most of the time
compare: if a thread acquires a lock and dies, other threads are blocked
main benefit: performance is less susceptible to scheduling decisions
Main technical challenge: reduce thread interference
When A writes to a location cached by B, the cache of B is invalidated
Our updates ensure that threads (typically) write to different locations
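The access patterns such a structure must support can be sketched with hash-based indexes, one per bound-position pattern (my own illustrative Python; RDFox's actual data structure is a lock-free C++ design, which this single-threaded sketch does not attempt to reproduce):

```python
from collections import defaultdict

class TripleIndex:
    """Hash-indexed triple table answering atoms with constants/wildcards."""

    def __init__(self):
        self.triples = set()               # for duplicate elimination
        self.by_s = defaultdict(list)      # pattern (s, ?, ?)
        self.by_p = defaultdict(list)      # pattern (?, p, ?)
        self.by_o = defaultdict(list)      # pattern (?, ?, o)
        self.by_sp = defaultdict(list)     # pattern (s, p, ?)
        self.by_po = defaultdict(list)     # pattern (?, p, o)

    def add(self, s, p, o):
        """Insert a triple; returns False for duplicates (ensures termination)."""
        if (s, p, o) in self.triples:
            return False
        self.triples.add((s, p, o))
        self.by_s[s].append((s, p, o))
        self.by_p[p].append((s, p, o))
        self.by_o[o].append((s, p, o))
        self.by_sp[s, p].append((s, p, o))
        self.by_po[p, o].append((s, p, o))
        return True

    def match(self, s=None, p=None, o=None):
        """Answer an atom: None means variable, anything else is a constant."""
        if s is not None and p is not None:
            cands = self.by_sp[s, p]
        elif p is not None and o is not None:
            cands = self.by_po[p, o]
        elif s is not None:
            cands = self.by_s[s]
        elif p is not None:
            cands = self.by_p[p]
        elif o is not None:
            cands = self.by_o[o]
        else:
            cands = list(self.triples)
        return [t for t in cands
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]
```

Hash lookups parallelise naturally because unrelated keys hash to different buckets, which is the property the slide's "naturally parallel data structure" refers to.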
23. Parallel Materialisation of Datalog
EVALUATION I: PARALLELISATION SPEEDUP
RDFox: an RDF store developed at Oxford University
http://www.cs.ox.ac.uk/isg/tools/RDFox/
[Two speedup plots: x-axis threads (8–32), y-axis speedup (2–20).
Left plot: ClarosL, ClarosLE, DBpediaL, DBpediaLE, LUBML-1K, LUBMU-1K.
Right plot: UOBML-1K, UOBMU-10, LUBMLE-1K, LUBML-5K, LUBMLE-5K, LUBMU-5K.]
Small concurrency overhead; parallelisation pays off already with two threads
Speedup continues to increase after we exhaust all physical cores
⇒ hyperthreading and parallelism can compensate for CPU cache misses
24. Parallel Materialisation of Datalog
EVALUATION II: ORACLE’S SPARC T5
Machine specification:
4 TB of RAM
128 physical cores that support 1024 virtual cores via hyperthreading
Name        Triples                 Time (s)                Speedup   Inference rate
            Initial    Resulting    1 thread   1024 thr.              (triples/s)
ClarosLE    18.8 M     533.7 M      7484       74           101       7.2 M
LUBML-1K    133.6 M    182.4 M      511        10           51        4.9 M
LUBMLE-1K   133.6 M    332.6 M      5267       37           142       5.4 M
LUBML-140k: 8 G triples, materialised to 10.9 G triples
20 threads: 2000 s, inference rate 1.45 M triples/s
128 threads: 599 s, inference rate 4.84 M triples/s
Materialised dataset used about half of RAM (2 TB)
Thanks to Jay Banerjee, Brian Whitney, Hassan Chafi, and Zhe Wu @ Oracle
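The quoted LUBML-140k inference rates follow directly from the triple counts: derived triples divided by wall-clock time. A quick check:

```python
# rate = (materialised triples - initial triples) / wall-clock seconds
derived = 10.9e9 - 8.0e9      # 2.9 G triples derived during materialisation

rate_20 = derived / 2000      # 20 threads, 2000 s  -> ~1.45 M triples/s
rate_128 = derived / 599      # 128 threads, 599 s  -> ~4.84 M triples/s
```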
25. Handling owl:sameAs via Rewriting
26. Handling owl:sameAs via Rewriting
HANDLING owl:sameAs VIA REWRITING
Rewriting: replace all equal constants with one representative
Well-known approach; used in GraphDB, Oracle, WebPIE, . . .
Much more efficient than direct materialisation
Open question I: Effective parallelisation
Lock-free maintenance of representatives
Care needed to ensure correctness and nonrepetition of derivations
Open question II: Query evaluation
Could expand rewritten data before query evaluation, but that is inefficient
Better: evaluate queries on rewritten data and expand the answer
Such a straightforward approach is incorrect:
Result cardinalities might be wrong
The presence of FILTER and BIND can make results incorrect
B. Motik, Y. Nenov, R. Piro, and I. Horrocks. Handling owl:sameAs via Rewriting. AAAI 2015
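The core of rewriting is maintaining a map from each constant to its representative; a union-find structure is a common way to do this (my own illustrative Python; the slide's contribution is maintaining representatives lock-free and in parallel without repeating derivations, which this single-threaded sketch does not attempt):

```python
class Representatives:
    """Union-find over constants: equal constants share one representative."""

    def __init__(self):
        self.parent = {}

    def find(self, c):
        """Return the representative of c, compressing paths as we go."""
        self.parent.setdefault(c, c)
        while self.parent[c] != c:
            self.parent[c] = self.parent[self.parent[c]]
            c = self.parent[c]
        return c

    def merge(self, a, b):
        """Record 'a owl:sameAs b' by merging their equivalence classes."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def rewrite(triples, same_as_pairs):
    """Replace every equal constant with its representative."""
    reps = Representatives()
    for a, b in same_as_pairs:
        reps.merge(a, b)
    return {tuple(reps.find(t) for t in triple) for triple in triples}
```

Note how the rewritten set can shrink: two triples that differ only in equal constants collapse into one, which is exactly why answer cardinalities must be restored carefully at query time.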
27. Handling owl:sameAs via Rewriting
EVALUATION OF REWRITING
[Fragment of a table of triple and derivation counts under AX and REW; the legible reduction factors are 3.2x, 3.2x, 9.9x, and 3.8x]
Table 3: Materialisation Times with Axiomatisation (AX) and Rewriting (REW)
Claros:
Threads   AX sec    spd    REW sec   spd    AX/REW
1         2042.9    1.0      65.8    1.0    31.1
2          969.7    2.1      35.2    1.9    27.6
4          462.0    4.4      18.1    3.6    25.5
8          237.2    8.6       9.9    6.7    24.1
12         184.9   11.1       7.9    8.3    23.3
16         153.4   13.3       6.9    9.6    22.3
DBpedia:
Threads   AX sec    spd    REW sec   spd    AX/REW
1          219.8    1.0      31.7    1.0     6.9
2          114.6    1.9      17.6    1.8     6.5
4           66.3    3.3      10.7    3.0     6.2
8           36.1    6.1       5.2    6.0     6.9
12          31.9    6.9       4.1    7.7     7.7
16          27.5    8.0       3.6    8.8     7.7
OpenCyc:
Threads   AX sec    spd    REW sec   spd    AX/REW
1         2093.7    1.0     119.9    1.0    17.5
2         1326.5    1.6      78.3    1.5    16.9
4          692.6    3.0      40.5    3.0    17.1
8          351.3    6.0      23.0    5.2    15.2
12         291.8    7.2      56.2    2.1     5.5
16         254.0    8.2      52.3    2.3     4.9
UniProt:
Threads   AX sec    spd    REW sec   spd    AX/REW
1          370.6    1.0     143.4    1.0     2.6
2          232.3    1.6      86.7    1.7     2.7
4          129.2    2.9      46.5    3.1     2.8
8           74.7    5.0      25.1    5.7     3.0
12          61.0    6.1      19.9    7.2     3.1
16          61.9    6.0      17.1    8.4     3.6
UOBM:
Threads   AX sec    spd    REW sec   spd    AX/REW
1         2696.7    1.0    1152.7    1.0     2.3
2         1524.6    1.8     599.6    1.9     2.5
4          813.3    3.3     318.3    3.6     2.6
8          439.9    6.1     177.7    6.5     2.5
12         348.9    7.7     152.7    7.6     2.3
16         314.4    8.6     137.9    8.4     2.3
[...] mode takes less than ten seconds, these results are difficult to measure and are susceptible to skew.
Our results confirm that rewriting can significantly reduce materialisation times. RDFox was consistently faster in the REW mode than in the AX mode even on UniProt, where the reduction in the number of triples is negligible. This is due to the reduction in the number of derivations, mainly involving rules (π1)–(π5), which is still significant on UniProt. In all cases, the speedup of rewriting is typically much larger than [...] connected by the :hasSameHomeTownWith property. This property is also symmetric and transitive so, for each pair of connected resources, the number of times each triple is derived by the transitivity rule is quadratic in the number of connected resources. This leads to a large number of duplicate derivations that do not involve equality. Thus, although it is helpful, rewriting does not reduce the number of derivations in the same way as, for example, on Claros, which explains the relatively modest speedup of REW over AX.
Speedup is bigger than the reduction in the number of triples
The number of derivations is the determining factor
Contrary to popular belief!
28. Incremental Materialisation Maintenance
29.–33. Incremental Materialisation Maintenance
THE DRED ALGORITHM AT A GLANCE (animation steps, collapsed)
Delete/Rederive (DRed): the state-of-the-art incremental maintenance algorithm
EXAMPLE
C0(x) ← A(x)
C0(x) ← B(x)
Ci(x) ← Ci−1(x) for 1 ≤ i ≤ n
C0(x) ← Cn(x)
Explicit facts: A(a), B(a)
Materialising the initial facts derives C0(a), C1(a), . . . , Cn(a)
Delete A(a) using DRed:
1 Delete all facts with a derivation from A(a), using the ‘delta’ rules
C0(x)D ← A(x)D
C0(x)D ← B(x)D
Ci(x)D ← Ci−1(x)D for 1 ≤ i ≤ n
C0(x)D ← Cn(x)D
⇒ C0(a), C1(a), . . . , Cn(a) are all (over)deleted
2 Rederive facts that have an alternative derivation
C0(x) ← C0(x)D ∧ A(x)
C0(x) ← C0(x)D ∧ B(x)
Ci(x) ← Ci(x)D ∧ Ci−1(x) for 1 ≤ i ≤ n
C0(x) ← C0(x)D ∧ Cn(x)
⇒ since B(a) is untouched, C0(a), C1(a), . . . , Cn(a) are all rederived
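The two DRed phases on this example can be sketched as follows (my own illustrative Python with the slide's rules hard-coded; facts are (predicate, constant) pairs, and since each rule body here has a single atom, overdeletion is simply the forward closure of the removed facts):

```python
def step(facts, n):
    """One round of C0 ← A, C0 ← B, Ci ← Ci−1 (1 ≤ i ≤ n), C0 ← Cn."""
    out = set()
    for (p, x) in facts:
        if p in ("A", "B"):
            out.add(("C0", x))
        elif p.startswith("C"):
            i = int(p[1:])
            if i < n:
                out.add(("C%d" % (i + 1), x))
            if i == n:
                out.add(("C0", x))
    return out

def materialise_cn(explicit, n):
    """Compute the full materialisation of the explicit facts."""
    facts = set(explicit)
    while True:
        new = step(facts, n) - facts
        if not new:
            return facts
        facts |= new

def dred(explicit, facts, removed, n):
    """Delete `removed` from the materialisation `facts` via DRed."""
    # Phase 1: overdelete every fact with some derivation from a removed fact
    deleted = set(removed)
    while True:
        new = (step(deleted, n) & facts) - deleted
        if not new:
            break
        deleted |= new
    # Phase 2: rederive deleted facts that still have an alternative derivation
    kept = (facts - deleted) | (set(explicit) - set(removed))
    while True:
        new = step(kept, n) & deleted
        if not new:
            break
        kept |= new
        deleted -= new
    return kept
```

On this example DRed first wipes out the whole C0(a), . . . , Cn(a) chain and then rebuilds it from B(a), which is exactly the wasted work B/F avoids.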
34.–42. Incremental Materialisation Maintenance
IMPROVEMENT: THE B/F ALGORITHM (animation steps, collapsed)
In RDF, a fact often has many alternative derivations
⇒ Many facts get deleted in the first step
The Backward/Forward (B/F) algorithm: look for alternatives immediately
Facts: A(a), B(a), C0(a), C1(a), . . . , Cn(a)
Delete A(a) using B/F:
1 Is A(a) derivable in any other way?
2 No ⇒ delete
3 As in DRed, identify C0(a) as derivable from A(a)
4 Apply the rules to C0(a) ‘backwards’ ⇒ by C0(x) ← B(x), we get B(a)
5 B(a) is explicit so it is derivable
6 So C0(a) is derivable too
7 Stop propagation and terminate
B. Motik, Y. Nenov, R. Piro, and I. Horrocks. Incremental Update of Datalog Materialisation: the Backward/Forward Algorithm. AAAI 2015
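The backward search of B/F can be sketched in the same toy setting (my own illustrative Python; the real algorithm interleaves backward checks with forward propagation and caches results, while this sketch just asks, per fact, whether a non-cyclic proof from surviving explicit facts still exists):

```python
def supporters(fact, n):
    """Bodies of the rules that can derive `fact` (single-atom bodies here)."""
    p, x = fact
    if p == "C0":
        return [("A", x), ("B", x), ("C%d" % n, x)]   # C0←A, C0←B, C0←Cn
    if p.startswith("C"):
        return [("C%d" % (int(p[1:]) - 1), x)]        # Ci←Ci−1
    return []                                          # A, B are only explicit

def bf_delete(explicit, facts, removed, n):
    """Delete `removed`, keeping every fact that is still derivable."""
    def derivable(fact, visited):
        if fact in visited:               # a cyclic support is not a proof
            return False
        if fact in explicit and fact not in removed:
            return True                   # surviving explicit fact: proved
        return any(s in facts and derivable(s, visited | {fact})
                   for s in supporters(fact, n))
    return {f for f in facts
            if f not in removed and derivable(f, frozenset())}
```

As on the slides, the check for C0(a) immediately finds the alternative proof via B(a), so the C1(a), . . . , Cn(a) chain is never deleted in the first place.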
44. Conclusion
45. Conclusion
RESEARCH DIRECTIONS
Add a data/query/reasoning distribution layer:
Initial results very promising
Implementation in progress
Future work:
Investigate potential for data compression
Improve join cardinality estimation
Improve query planning