MOCHA 2018
Mighty Storage Challenge II
Kleanthi Georgala, Mirko Spasić, Vassilis Papakonstantinou, Claus Stadler
MOCHA @ ESWC 2018
Heraklion, Crete
Horizon 2020, GA No 688227
Georgala (InfAI) MOCHA2018 June 5th, 2018 1 / 17
Organization
Axel-Cyrille Ngonga Ngomo, Department of Computer Science, Paderborn,
Germany
Irini Fundulaki, Foundation for Research and Technology – Hellas (FORTH),
Greece
Mirko Spasić, OpenLink, UK
Vassiliki Rentoumi, National Center for Scientific Research, “Demokritos”,
Greece
Kleanthi Georgala, Universität Leipzig, AKSW, Institut für Angewandte
Informatik (InfAI)
Prize: 250 Euro for the winner of the most tasks
Overview
Triple store performance evaluation:
1 Sensor Streams Benchmark
2 Data Storage Benchmark
3 Versioning RDF Data Benchmark
4 Faceted Browsing Benchmark
Carried out using the HOBBIT benchmarking platform
Public results
Task 1 - Sensor Streams Benchmark
Goal of Task 1: evaluate storage and retrieval of streamed data in triple stores
Choke points:
Scalability
Time complexity
Input: RDF triples describing events in a production system, generated via mimicking algorithms
Task 1 - Sensor Streams Benchmark cont.
Divide triples into streams based on their generation timestamp
Perform INSERT queries against the triple store
Perform SELECT queries against the triple store after each stream
Create a reference set for each SELECT query (Jena TDB)
KPIs:
1 Correctness
2 Efficiency
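The first two steps above can be sketched as follows. This is a minimal illustration, not the benchmark's actual generator: the fixed time window, the helper names `split_into_streams` and `to_insert_query`, and the `urn:` identifiers are all assumptions made for the example.

```python
from collections import defaultdict

def split_into_streams(timed_triples, window_seconds):
    """Group (timestamp, triple) pairs into streams by generation time window.

    The fixed-size window is an assumption; the slides only say that
    streams are formed from the generation timestamp.
    """
    streams = defaultdict(list)
    for timestamp, triple in timed_triples:
        streams[int(timestamp // window_seconds)].append(triple)
    # Return the streams in chronological order of their window.
    return [streams[key] for key in sorted(streams)]

def to_insert_query(triples):
    """Render one stream as a single SPARQL INSERT DATA update."""
    body = " .\n  ".join(" ".join(t) for t in triples)
    return "INSERT DATA {\n  " + body + " .\n}"

# Hypothetical sensor events: (generation time in seconds, triple).
events = [
    (0.5, ("<urn:m1>", "<urn:hasValue>", '"17"')),
    (1.2, ("<urn:m2>", "<urn:hasValue>", '"4"')),
    (0.9, ("<urn:m1>", "<urn:hasValue>", '"18"')),
]
streams = split_into_streams(events, window_seconds=1.0)
queries = [to_insert_query(s) for s in streams]
```

Each entry of `queries` would be sent to the triple store in order, with a SELECT query issued after each one.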
Task 2 - Data Storage Benchmark (DSB)
Goal of Task 2: To measure how data stores perform with different types of
queries and bulk-loading
Synthetic Dataset
Social Network scenario
1.4 billion triples
Queries selected based on the choke points relevant to query execution
(subquery unnesting, complex aggregate performance, etc.)
Complex SPARQL SELECT queries (14 different types)
Simple SPARQL SELECT queries - lookups (7 different types)
SPARQL INSERT queries (8 different types)
Task 2 - Data Storage Benchmark (DSB)
Workload consists of:
Bulk loading of the dataset
Warm-up phase
20,000 queries mimicking a real-world scenario with respect to:
Distributions of the queries
Frequencies of the queries
Equal influence of each query type on the overall performance
KPIs:
Average query execution time
Loading time
Number of query failures
Average query execution time per query type
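A rough sketch of how the query-related KPIs above could be derived from a query log. The log format and the function name `summarize` are hypothetical, and averaging the per-type averages is only one possible way to give each query type equal weight; the slides do not state how the benchmark aggregates.

```python
from collections import defaultdict
from statistics import mean

def summarize(query_log):
    """Summarize (query_type, elapsed_ms, succeeded) records.

    Returns (overall average, average per query type, failure count).
    Failed queries are counted but excluded from the timings.
    """
    per_type = defaultdict(list)
    failures = 0
    for query_type, elapsed_ms, succeeded in query_log:
        if succeeded:
            per_type[query_type].append(elapsed_ms)
        else:
            failures += 1
    avg_per_type = {t: mean(times) for t, times in per_type.items()}
    # Assumption: average the per-type averages so that a frequent
    # query type does not dominate the overall figure.
    overall = mean(list(avg_per_type.values()))
    return overall, avg_per_type, failures

# Hypothetical log entries, not measured results.
log = [
    ("complex", 120.0, True),
    ("complex", 80.0, True),
    ("lookup", 10.0, True),
    ("insert", 5.0, False),
]
overall, avg_per_type, failures = summarize(log)
```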
Task 3 - Versioning RDF Data
Goal of Task 3: test the ability of versioning systems to efficiently manage
evolving datasets and queries evaluated across multiple versions of those
datasets
Dataset
produced using real DBpedia data and the data generator of LDBC’s SPB
configurable in terms of:
dataset size
number of versions
insertion/deletion ratios from version to version
generated data form (independent copies, change-sets)
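The two generated data forms relate to each other as sketched below: independent copies hold every version in full, while change-sets hold only the added and deleted triples per version. The helper names and the toy triples are illustrative assumptions, not part of the benchmark's data generator.

```python
def apply_change_set(version, added, deleted):
    """Return the next version as a new set of triples."""
    return (version - deleted) | added

def delta(old, new):
    """Return the (added, deleted) triples between two versions."""
    return new - old, old - new

# Version 0 as a full copy; later versions given as change-sets.
v0 = frozenset({("s1", "p", "o1"), ("s2", "p", "o2")})
change_sets = [
    ({("s3", "p", "o3")}, {("s1", "p", "o1")}),  # (added, deleted) for v1
    ({("s4", "p", "o4")}, set()),                # (added, deleted) for v2
]

# Materialize independent copies from the change-sets.
versions = [v0]
for added, deleted in change_sets:
    versions.append(apply_change_set(versions[-1], added, deleted))

# Delta between the first and the last version.
added_total, deleted_total = delta(versions[0], versions[-1])
```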
Task 3 - Versioning RDF Data
Eight different query types are supported
Partially based on a subset of the 25 query templates defined in the context of
the DBpedia SPARQL Benchmark
Queries on single versions, multiple versions, deltas (difference of two
versions), materialization queries on versions/deltas etc.
KPIs (in order of importance)
Throughput (in queries per second)
Query failures
Initial version ingestion speed (in triples per second)
Applied changes speed (in triples per second)
Storage space cost (in MB)
Average query execution time (in ms)
Task 4 - Faceted Browsing
Goal of Task 4: test a system's ability to support faceted browsing over
structured datasets.
Dataset: HOBBIT’s PoDiGG simulator
public transport dataset
train connections between stations
routes
trips
Task 4 - Faceted Browsing
Workload consists of 11 browsing scenarios comprising 172 SPARQL queries
categorized into 14 choke points.
Instance retrievals - return the instances of a state within a browsing scenario
Facet counts - return the count for a suggested facet for a transition in a
browsing scenario
Choke points correspond to types of transitions from one state to another
KPIs are correctness and performance
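The two query kinds can be illustrated on a tiny in-memory dataset. This is only a sketch: the predicate and resource names below are hypothetical and do not come from the PoDiGG vocabulary, and the benchmark itself issues SPARQL queries rather than Python calls.

```python
from collections import Counter

# Hypothetical transport triples (subject, predicate, object).
TRIPLES = [
    ("trip1", "route", "routeA"),
    ("trip2", "route", "routeA"),
    ("trip3", "route", "routeB"),
    ("trip1", "departsFrom", "station1"),
    ("trip2", "departsFrom", "station2"),
    ("trip3", "departsFrom", "station1"),
]

def instances(triples, selections):
    """Instance retrieval: subjects matching every selected (facet, value)."""
    subjects = {s for s, _, _ in triples}
    for facet, value in selections:
        subjects &= {s for s, p, o in triples if p == facet and o == value}
    return subjects

def facet_counts(triples, state, facet):
    """Facet counts: instances of the current state per value of a facet."""
    return Counter(o for s, p, o in triples if p == facet and s in state)

# Browsing state: all trips departing from station1.
state = instances(TRIPLES, [("departsFrom", "station1")])
# Counts shown next to the suggested "route" facet for the next transition.
counts = facet_counts(TRIPLES, state, "route")
```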
Participants
Virtuoso v8.0 (Spasić et al.)
OSTRICH, Task 3 only (Taelman et al.)
Baseline Virtuoso Open Source
Blazegraph
GraphDB Free 8.5
Apache Jena Fuseki 3.6.0
Task 1 - Sensor Streams Benchmark
[Figure: bar chart of Macro-Average-Recall and Macro-Average-F-Measure (scale 0 to 1) for System 1 to System 5]
Top 2 KPIs: Macro-Average-F-Measure and Macro-Average-Recall
System 1 and System 2: similar top results for Macro-Average-F-Measure
System 2: best results for Macro-Average-Recall
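A minimal sketch of how the two macro-averaged KPIs could be computed from SELECT results and their reference sets. It assumes that the macro-average F-measure is the mean of the per-query F-measures, which is one common definition; the slides do not say which variant the benchmark uses. Function names and the sample data are illustrative.

```python
def precision_recall_f1(expected, received):
    """Compare one SELECT result set with its reference set."""
    true_positives = len(expected & received)
    precision = true_positives / len(received) if received else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def macro_average(pairs):
    """Average recall and F-measure over all (expected, received) pairs."""
    scores = [precision_recall_f1(e, r) for e, r in pairs]
    n = len(scores)
    macro_recall = sum(s[1] for s in scores) / n
    macro_f1 = sum(s[2] for s in scores) / n
    return macro_recall, macro_f1

# Hypothetical result sets for two SELECT queries.
pairs = [
    ({"a", "b"}, {"a", "b"}),  # every expected row returned
    ({"a", "b"}, {"a"}),       # one expected row missing
]
macro_recall, macro_f1 = macro_average(pairs)
```

Note that this simplified version scores an empty result against an empty reference set as 0; a real evaluation would need to decide how to treat that case.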
Task 2 - Data Storage Benchmark
[Figure: bar chart of KPI1 (in ms) for System 1 to System 5; labeled values in chart order: 95.5, 64.93, 50, 50, 50]
[Figure: bar chart of KPI2 (in min) for System 1 to System 5; labeled values in chart order: 58.55, 55.02, 54, 54, 54]
Top KPIs:
KPI1: Average Query Execution Time
KPI2: Bulk Loading Time
System 2: best results for KPI1
System 2: best results for KPI2
Task 3 - Versioning RDF Data Benchmark
[Figure: bar chart of Final Score [0-1] for System 1 to System 6; labeled values in chart order: 0.669, 0.697, 0.494, 0.395, 0.674, 0.093]
For the 4 most important KPIs:
Results normalized to [0-1] range
Weights assigned to them
Throughput: 0.4
Queries failed: 0.3
Initial version ingestion speed: 0.15
Applied changes speed: 0.15
Final Score: The sum of the normalized weighted results
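The scoring scheme can be sketched as follows. Only the four weights come from the slide; the min-max normalization, the inversion for the failure count (where lower is better), the key names, and the sample values are assumptions made for illustration.

```python
# Weights as listed on the slide.
WEIGHTS = {
    "throughput": 0.4,
    "queries_failed": 0.3,
    "initial_ingestion_speed": 0.15,
    "applied_changes_speed": 0.15,
}
# Assumption: fewer failed queries should yield a higher normalized value.
LOWER_IS_BETTER = {"queries_failed"}

def normalize(values, lower_is_better=False):
    """Min-max normalize values to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all systems tie on this KPI
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if lower_is_better else scaled

def final_scores(results):
    """Map {system: {kpi: raw value}} to {system: weighted score}."""
    systems = list(results)
    scores = dict.fromkeys(systems, 0.0)
    for kpi, weight in WEIGHTS.items():
        raw = [results[s][kpi] for s in systems]
        column = normalize(raw, kpi in LOWER_IS_BETTER)
        for system, value in zip(systems, column):
            scores[system] += weight * value
    return scores

# Hypothetical raw KPI values, not the challenge results.
results = {
    "A": {"throughput": 10.0, "queries_failed": 0,
          "initial_ingestion_speed": 1000.0, "applied_changes_speed": 500.0},
    "B": {"throughput": 5.0, "queries_failed": 4,
          "initial_ingestion_speed": 2000.0, "applied_changes_speed": 500.0},
}
scores = final_scores(results)
```

Because the weights sum to 1, the resulting score also falls in the [0-1] range.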
Task 4 - Faceted Browsing Benchmark
[Figure: bar chart of Final Score [Queries/Second] for System 1 to System 5; labeled values in chart order: 2.327, 1.575, 1.437, 1.28, 1.969]
Top KPIs:
KPI1: F-Measure
KPI2: Performance
All systems produced correct results with regard to KPI1
System 1: best results for KPI2
The End
Near future: results at the closing session
Thank You!
Questions?