MOCHA 2018 Challenge @ ESWC2018

MOCHA 2018
Mighty Storage Challenge II
Kleanthi Georgala, Mirko Spasić, Vassilis Papakonstantinou, Claus Stadler
MOCHA @ ESWC 2018
Heraklion, Crete
Horizon 2020, GA No 688227
Georgala (InfAI), MOCHA 2018, June 5th, 2018

Organization
Axel-Cyrille Ngonga Ngomo, Department of Computer Science, Paderborn,
Germany
Irini Fundulaki, Foundation for Research and Technology – Hellas (FORTH),
Greece
Mirko Spasić, OpenLink, UK
Vassiliki Rentoumi, National Center for Scientific Research “Demokritos”,
Greece
Kleanthi Georgala, Universität Leipzig, AKSW, Institut für Angewandte
Informatik (InfAI)
250 Euro for the winner of most tasks

Overview
Triple store performance evaluation:
1 Sensor Streams Benchmark
2 Data Storage Benchmark
3 Versioning RDF Data Benchmark
4 Faceted Browsing Benchmark
Carried out using HOBBIT benchmarking platform
Public results

Task 1 - Sensor Streams Benchmark
Goal of Task 1: Storage and Retrieval of Streamed Data from triple stores
Choke points:
Scalability
Time complexity
Input: RDF triples describing events in a production system via mimicking

Task 1 - Sensor Streams Benchmark cont.
Divide triples into streams based on generation time stamp
Perform INSERT queries against triple store
SELECT queries against triple store after each stream
Create reference set for each SELECT query (Jena TDB)
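
The streaming steps above can be sketched as follows. This is only a minimal illustration, not the benchmark's actual implementation: the (timestamp, triple) layout, the fixed time window, the example.org terms, and the helper names split_into_streams and to_insert_query are all assumptions made for the example.

```python
from collections import defaultdict

def split_into_streams(timed_triples, window_seconds):
    """Group (timestamp, triple) pairs into streams by generation time window.

    Hypothetical helper: the benchmark's real grouping rule is not given here.
    """
    streams = defaultdict(list)
    for timestamp, triple in timed_triples:
        streams[int(timestamp // window_seconds)].append(triple)
    # Return the streams in chronological order.
    return [streams[key] for key in sorted(streams)]

def to_insert_query(triples):
    """Build one SPARQL INSERT DATA query from already-serialized RDF terms."""
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    return f"INSERT DATA {{\n{body}\n}}"

# Made-up sample data: three timestamped triples about two machines.
timed_triples = [
    (0.5, ("<http://example.org/m1>", "<http://example.org/state>", '"on"')),
    (1.2, ("<http://example.org/m2>", "<http://example.org/state>", '"off"')),
    (0.9, ("<http://example.org/m1>", "<http://example.org/temp>", '"71"')),
]
streams = split_into_streams(timed_triples, window_seconds=1.0)
queries = [to_insert_query(stream) for stream in streams]
```

Each element of queries would then be sent to the triple store under test, followed by the SELECT queries for that stream.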
KPIs:
1 Correctness
2 Efficiency
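
Correctness against the reference set can be illustrated with macro-averaged precision, recall and F-measure over the SELECT queries. The sketch below assumes that each answer is a set of rows and that the macro-average F-measure is the mean of the per-query F-measures; the benchmark may define these details differently.

```python
def precision_recall_f1(expected, received):
    """Precision, recall and F-measure of one answer set against its reference."""
    expected, received = set(expected), set(received)
    if not expected and not received:
        # Assumed convention: an empty answer to an empty reference is perfect.
        return 1.0, 1.0, 1.0
    true_positives = len(expected & received)
    precision = true_positives / len(received) if received else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def macro_average(results):
    """Average the per-query scores so every SELECT query counts equally."""
    scores = [precision_recall_f1(exp, rec) for exp, rec in results]
    n = len(scores)
    return tuple(sum(score[i] for score in scores) / n for i in range(3))

# Made-up (reference, received) answer pairs for two SELECT queries.
results = [
    ({"a", "b"}, {"a", "b"}),  # complete and correct answer
    ({"a", "b"}, {"a"}),       # one expected row is missing
]
macro_p, macro_r, macro_f = macro_average(results)
```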

Task 2 - Data Storage Benchmark (DSB)
Goal of Task 2: To measure how data stores perform with different types of
queries and bulk-loading
Synthetic Dataset
Social Network scenario
1.4 billion triples
Query selection performed based on the choke-points relevant for query
executions (subquery unnesting, complex aggregate performance, etc)
Complex SPARQL SELECT queries (14 different types)
Simple SPARQL SELECT queries - lookups (7 different types)
SPARQL INSERT queries (8 different types)

Task 2 - Data Storage Benchmark (DSB) cont.
Workload consists of:
Bulk loading of the dataset
Warm-up phase
20,000 queries mimicking a real-world scenario regarding:
Distributions of the queries
Frequencies of the queries
Equal influence of each query type on the overall performance
KPIs:
Average query execution time
Loading time
Number of query failures
Average query execution time per query type
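
As a rough illustration of the query KPIs listed above, the sketch below derives them from a log of query executions. The record layout, the exclusion of failed queries from the timings, and the choice to average the per-type averages (so that each query type has equal influence) are assumptions, not the benchmark's documented formulas.

```python
from collections import defaultdict
from statistics import mean

def summarize(executions):
    """Compute query KPIs from (query_type, elapsed_ms, succeeded) records."""
    by_type = defaultdict(list)
    failures = 0
    for query_type, elapsed_ms, succeeded in executions:
        if succeeded:
            by_type[query_type].append(elapsed_ms)
        else:
            failures += 1
    per_type = {qtype: mean(times) for qtype, times in by_type.items()}
    # Averaging the per-type averages weights every query type equally,
    # regardless of how often that type was executed.
    overall = mean(list(per_type.values())) if per_type else None
    return {"average_ms": overall, "per_type_ms": per_type, "failures": failures}

# Made-up execution log.
executions = [
    ("complex", 120.0, True),
    ("complex", 80.0, True),
    ("lookup", 10.0, True),
    ("insert", 5.0, False),
]
kpis = summarize(executions)
```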

Task 3 - Versioning RDF Data
Goal of Task 3: test the ability of versioning systems to efficiently manage
evolving datasets and queries evaluated across multiple versions of those
datasets
Dataset
produced using real DBpedia data and the data generator of LDBC’s SPB
configurable in terms of:
dataset size
number of versions
insertion/deletion ratios from version to version
generated data form (independent copies, change-sets)
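
The two generated data forms relate to each other as sketched below: applying change-sets to an initial version yields independent copies, and the difference between two copies yields a change-set (a delta). This is a toy in-memory illustration with made-up triples and hypothetical helper names, not the generator's code.

```python
def materialize(initial, change_sets):
    """Turn change-sets into independent copies, one full triple set per version."""
    versions = [frozenset(initial)]
    for added, deleted in change_sets:
        versions.append(frozenset((versions[-1] - set(deleted)) | set(added)))
    return versions

def delta(old, new):
    """Return the change-set (added, deleted) that turns `old` into `new`."""
    return new - old, old - new

# Made-up data: an initial version followed by two change-sets.
v0 = {("s1", "p", "o1"), ("s2", "p", "o2")}
change_sets = [
    ({("s3", "p", "o3")}, {("s1", "p", "o1")}),  # version 1: one insert, one delete
    ({("s4", "p", "o4")}, set()),                # version 2: one insert
]
versions = materialize(v0, change_sets)
added, deleted = delta(versions[0], versions[2])
```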

Task 3 - Versioning RDF Data cont.
Eight different query types are supported
Partially based on a subset of the 25 query templates defined in the context of
the DBpedia SPARQL Benchmark
Queries on single versions, multiple versions, deltas (difference of two
versions), materialization queries on versions/deltas etc.
KPIs (in order of importance)
Throughput (in queries per second)
Query failures
Initial version ingestion speed (in triples per second)
Applied changes speed (in triples per second)
Storage space cost (in MB)
Average query execution time (in ms)

Task 4 - Faceted Browsing
Goal of Task 4: test a system's ability to support faceted browsing over
structured datasets.
Dataset: HOBBIT’s PoDiGG simulator
public transport dataset
train connections between stations
routes
trips

Task 4 - Faceted Browsing cont.
Workload consists of 11 browsing scenarios comprising 172 SPARQL queries
categorized into 14 choke points.
Instance retrievals - return the instances of a state within a browsing scenario
Facet counts - return the count for a suggested facet for a transition in a
browsing scenario
Choke points consist of types of transitions from one state to the other
KPIs are correctness and performance
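
The two query categories can be illustrated on a small in-memory triple list. The sketch below is an assumption-laden toy: a real system would answer these with SPARQL queries, and the trip data, property names and helper functions are invented for the example.

```python
from collections import Counter

def instances(triples, selected):
    """Instance retrieval: subjects matching every selected (property, value) facet."""
    subjects = {s for s, _, _ in triples}
    for prop, value in selected:
        subjects &= {s for s, p, o in triples if p == prop and o == value}
    return subjects

def facet_counts(triples, selected, prop):
    """Facet counts: number of current instances per value of a candidate facet."""
    current = instances(triples, selected)
    return Counter(o for s, p, o in triples if p == prop and s in current)

# Made-up public transport data.
triples = [
    ("trip1", "route", "R1"), ("trip1", "delayed", "yes"),
    ("trip2", "route", "R1"), ("trip2", "delayed", "no"),
    ("trip3", "route", "R2"), ("trip3", "delayed", "yes"),
]
state = [("route", "R1")]                        # current browsing state
matching = instances(triples, state)             # instances of that state
counts = facet_counts(triples, state, "delayed") # counts for a suggested facet
```

Selecting one of the counted values, for example ("delayed", "yes"), and appending it to state corresponds to a transition to the next state.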

Participants
Virtuoso v8.0 (Spasić et al.)
OSTRICH only for task 3 (Taelman et al.)
Baseline Virtuoso Open Source
Blazegraph
GraphDB Free 8.5
Apache Jena Fuseki 3.6.0

Task 1 - Sensor Streams Benchmark
[Figure: bar chart of Macro-Average-Recall and Macro-Average-F-Measure (scale 0 to 1) for Systems 1 to 5]
Top 2 KPIs: Macro-Average-F-Measure and Macro-Average-Recall
System 1 and System 2: similar top results for Macro-Average-F-Measure
System 2: best results for Macro-Average-Recall

Task 2 - Data Storage Benchmark
[Figure: bar chart of KPI1 (in ms) for Systems 1 to 5; bar labels in extracted order: 95.5, 64.93, 50, 50, 50]
[Figure: bar chart of KPI2 (in min) for Systems 1 to 5; bar labels in extracted order: 58.55, 55.02, 54, 54, 54]
Top KPIs:
KPI1: Average Query Execution Time
KPI2: Bulk Loading Time
System 2: best results for KPI1
System 2: best results for KPI2

Task 3 - Versioning RDF Data Benchmark
[Figure: bar chart of Final Score [0-1] for Systems 1 to 6; bar labels in extracted order: 0.669, 0.697, 0.494, 0.395, 0.674, 0.093]
For the 4 most important KPIs:
Results normalized to [0-1] range
Weights assigned to them:
Throughput: 0.4
Queries failed: 0.3
Initial version ingestion speed: 0.15
Applied changes speed: 0.15
Final Score: the sum of the normalized weighted results
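
The final score computation can be sketched as a weighted sum using the four weights given above. The slide does not say how results are normalized, so the min-max scaling across systems, the inversion of the failure count (fewer failures is better), and the sample values below are assumptions for illustration only.

```python
WEIGHTS = {
    "throughput": 0.4,
    "queries_failed": 0.3,
    "initial_ingestion_speed": 0.15,
    "applied_changes_speed": 0.15,
}
# Assumption: fewer failed queries is better, so that KPI is inverted.
LOWER_IS_BETTER = {"queries_failed"}

def normalize(values, lower_is_better=False):
    """Min-max scale raw KPI values to [0, 1], where 1 is always the best."""
    low, high = min(values), max(values)
    if high == low:
        return [1.0 for _ in values]
    scaled = [(v - low) / (high - low) for v in values]
    return [1.0 - x for x in scaled] if lower_is_better else scaled

def final_scores(raw):
    """raw maps system name -> {KPI name -> raw value}; returns name -> score."""
    systems = list(raw)
    scores = dict.fromkeys(systems, 0.0)
    for kpi, weight in WEIGHTS.items():
        values = [raw[s][kpi] for s in systems]
        for system, value in zip(systems, normalize(values, kpi in LOWER_IS_BETTER)):
            scores[system] += weight * value
    return scores

# Made-up raw results for two hypothetical systems.
raw = {
    "A": {"throughput": 10.0, "queries_failed": 0,
          "initial_ingestion_speed": 1000.0, "applied_changes_speed": 500.0},
    "B": {"throughput": 5.0, "queries_failed": 2,
          "initial_ingestion_speed": 2000.0, "applied_changes_speed": 500.0},
}
scores = final_scores(raw)
```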

Task 4 - Faceted Browsing Benchmark
[Figure: bar chart of Final Score [Queries/Second] for Systems 1 to 5; bar labels in extracted order: 2.327, 1.575, 1.437, 1.28, 1.969]
Top KPIs:
KPI1: F-Measure
KPI2: Performance
All systems work correctly with regard to KPI1
System 1: best results for KPI2

The End
Nearer future: Results at the closing session
Thank You!
Questions?
