FEASIBLE-Benchmark-Framework-ISWC2015

FEASIBLE: A Feature-Based SPARQL Benchmark
Generation Framework
Muhammad Saleem1, Qaiser Mehmood2, Axel-Cyrille Ngonga Ngomo1
http://feasible.aksw.org/
1Agile Knowledge Engineering and Semantic Web (AKSW), University of Leipzig, Germany
2Insight Center for Data Analytics, National University of Ireland, Galway
International Semantic Web Conference, Bethlehem, USA, 2015
10/14/2015 1

Triple Stores Benchmarks
• Synthetic Benchmarks
• Make use of the synthetic queries and/or data
• Benchmarks of different data sizes possible
• Suitable to test the scalability
• Often fail to reflect the reality
• For example, LUBM, SP2Bench, BSBM, WatDiv etc.
• Queries Log Benchmarks
• Make use of the real queries from queries log
• Can be more close to the reality
• Can be used with different data sizes
• Scalability can be tested
• For example, DBPSB, FEASIBLE
10/14/2015 2

DBpedia SPARQL Benchmark
• Based on real DBpedia queries log
• Benchmarks of different data sizes possible
• Suitable to test the scalability
• Only Considers SPARQL SELECT
• Does not consider Important query features
• For example, number of join vertices, triple patterns selectivities
• Not customizable for given use cases or needs of an application
10/14/2015 3

FEASIBLE SPARQL Benchmark
• Can be applied to any SPARQL queries log
• Considers SPARQL SELECT, ASK, DESCRIBE, CONSTRUCT
• Considers Important query features
• For example, number of join vertices, triple patterns selectivities,
query runtime, resultset size, number of BGPs, Mean join vertices
degree, number of triple patterns etc.
• Customizable for given use cases or needs of an application
10/14/2015 4

FEASIBLE SPARQL Benchmark
• Dataset cleaning
• Feature vectors and normalization
• Selection of exemplars
• Selection of benchmark queries
10/14/2015 5

Dataset Cleaning
• Remove syntactically incorrect queries
• Remove zero result size queries
• It is an optional step
• Not of theoretical necessity
• Leads to practically reliable benchmarks
10/14/2015 6

Feature Vectors and Normalization
SELECT DISTINCT ?entita ?nome
WHERE
{
?entita rdf:type dbo:VideoGame .
?entita rdfs:label ?nome
FILTER regex(?nome, "konami", "i")
}
LIMIT 100
Query Type: SELECT
Results Size: 13
Basic Graph Patterns (BGPs): 1
Triple Patterns: 2
Join Vertices: 1
Mean Join Vertices Degree: 2.0
Mean triple patterns selectivity: 0.01709761619798973
UNION: No
DISTINCT: Yes
ORDER BY: No
REGEX: Yes
LIMIT: Yes
OFFSET: No
OPTIONAL: No
FILTER: Yes
GROUP BY: No
Runtime (ms): 65
13 1 2 1 2 0.017 0 1 0 1 1 0 0 1 0 65
0.11 0.53 0.67 0.14 0.08 0.017 0 1 0 1 1 0 0 1 0 0.14
Feature Vector
Normalized Feature Vector
10/14/2015 7

Selection of exemplars
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 8
Plot feature vectors in a multidimensional space
Query Feature 1 Feature 2
Q1 0.2 0.2
Q2 0.5 0.3
Q3 0.8 0.3
Q4 0.9 0.1
Q5 0.5 0.5
Q6 0.2 0.7
Q7 0.1 0.8
Q8 0.13 0.65
Q9 0.9 0.5
Q10 0.1 0.5
Suppose we need a benchmark of 3 queries

10/14/2015 9
Calculate average point
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 10
Select point of minimum Euclidean distance to avg. point
*Red is our first exemplar

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 11
Select point that is farthest to exemplars

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 12

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 13
Select point that is farthest to exemplars

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 14

Selection of Benchmark Queries
10/14/2015 15
Calculate distance from Q1 to each exemplars
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

10/14/2015 16
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Assign Q1 to the minimum distance exemplar

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 17
Repeat the process for Q2

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 18

10/14/2015 19
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 20

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 21

Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
10/14/2015 22

10/14/2015 23
Calculate Average across each cluster
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
Avg.
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

10/14/2015 24
Calculate distance of each point in cluster to the average
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
Avg.
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

10/14/2015 25
Select minimum distance query as the final benchmark
query from that cluster
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
Avg.
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Black, i.e., Q2 is the final selected query from yellow cluster

10/14/2015 26
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
Avg.
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Black, i.e., Q8 is the final selected query from brown cluster

10/14/2015 27
Q1
Q2 Q3
Q4
Q5
Q6
Q7
Q8
Q9Q10
Avg.
Avg.
Avg.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Black, i.e., Q3 is the final selected query from green cluster
Our benchmark queries are Q2, Q3, and Q8

Experimental Setup
• Composite Error Estimation
• L is the query log, B is the benchmark and K is the set of all features
10/14/2015 28

Experimental Setup
• Virtuoso Open-Source Edition version 7.2
• NumberOfBuffers = 680000, MaxDirtyBuffers = 500000
• Sesame Version 2.7.8
• Tomcat 7 as HTTP interface and native storage layout.
• Set the spoc, posc, opsc indices to those specified in the native storage configuration
• The Java heap size was set to 6GB
• Jena-TDB (Fuseki) Version 2.0
• Java heap size set to 6GB
• OWLIM-SE Version 6.1
• Tomcat 7.0 as HTTP interface
• Set the entity index size to 45,000,000 and enabled the predicate list
• Rule set was empty and the Java heap size was set to 6GB.
• We configured all triple stores to use 6GB of memory and used default values
otherwise.
10/14/2015 29

Comparison of Composite Error
10/14/2015 30
FEASIBLE’s composite error is 54.9% less than DBPSB

Comparison of Triple Stores: QpS
10/14/2015 31
0
50
100
150
200
250
Sesame
Virtuoso
OWLIM-SE
Fuseki
Sesame
Virtuoso
OWLIM-SE
Fuseki
SWDF DBpedia
QpS
0
0.5
1
1.5
2
2.5
3
Sesame
Virtuoso
OWLIM-SE
Fuseki
Sesame
Virtuoso
OWLIM-SE
Fuseki
SWDF DBpedia
QpS
0
10
20
30
40
50
60
70
Sesame
Virtuoso
OWLIM-SE
Fuseki
Sesame
Virtuoso
OWLIM-SE
Fuseki
SWDF DBpedia
QpS
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
Sesame
Virtuoso
OWLIM-SE
Fuseki
Sesame
Virtuoso
OWLIM-SE
Fuseki
SWDF DBpedia
QpS
SPARQL ASK SPARQL CONSTRUCT
SPARQL DESCRIBE SPARQL SELECT

Comparison of Triple Stores: Mix Queries
10/14/2015 32
0
5
10
15
20
25
30
35
40
Sesame Virtuoso OWLIM-SE Fuseki
SWDF
QMpH
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
DBpedia
QMpH
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
SWDF
QpS
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
DBpedia
QpS

Rank-wise Ranking of Triple Stores
10/14/2015 33
All values are in percentages
• None of the system is sole winner or loser for a particular rank
• Virtuoso mostly lies in the higher ranks, i.e., rank 1 and 2 (68.29%)
• Fuseki mostly in the middle ranks, i.e., rank 2 and 3 (65.14%)
• OWLIM-SE usually on the slower side, i.e., rank 3 and 4 (60.86 %)
• Sesame is either fast or slow. Rank 1 (31.71% of the queries) and rank 4 (23.14%)

Thanks
saleem@informatik.uni-Leipzig.de
Try Yourself
http://feasible.aksw.org/
10/14/2015 34

FEASIBLE-Benchmark-Framework-ISWC2015

Recommended

Recommended

More Related Content

Similar to FEASIBLE-Benchmark-Framework-ISWC2015

Similar to FEASIBLE-Benchmark-Framework-ISWC2015 (20)

More from Muhammad Saleem

More from Muhammad Saleem (15)

Recently uploaded

Recently uploaded (20)

FEASIBLE-Benchmark-Framework-ISWC2015