May 14, 2014
RDF Analytics
Lenses over Semantic Graphs
Dario Colazzo 3,1
François Goasdoué 4,1
Ioana Manolescu 1,2
Alexandra Roatiş 2,1
1 OAK - Inria, France
2 LRI - Université Paris-Sud, France
3 LAMSADE - Université Paris Dauphine, France
4 PILGRIM - Université Rennes 1, France
RDF Analytics: Lenses over Semantic Graphs May 14, 2014 - 2
RDF data warehousing scenario
Alice
software engineer
IT company
builds user applications
open RDF data (Grenoble)
worksFor
DS: Restaurants
(i) heterogeneous data
App: clickable map m
#restaurants
region & average rating
type of cuisine
build
RDW: relational data warehouse
extract tabular data (SPARQL queries)
merge
(ii) new central concepts
DS2: Shops
DS3: Museums
RDW2 RDW3
(iii) other missing relationships?
Bug: landmarks museums
find
redesign
Feature: query relationships
region famous people
(iv) query schema
add
Feature: new type of aggregation
for each landmark, show how many restaurants are nearby
(v) impossible! (separate star schemas; restaurants and landmarks are both central entities)
add
RDF data warehousing
Application needs:
(i) support of heterogeneous data
(ii) multiple central concepts
(iii) support for RDF semantics when querying
(iv) possibility to query the relationships between entities (the schema)
(v) flexible choice of aggregation dimensions
This work:
redesign the core data analytics concepts and tools for RDF
formal framework for warehouse-style analytics on RDF data
suited to heterogeneous, semantic-rich corpora of Linked Data
Summary
1. RDF Graphs & BGP Queries
2. RDF Graph Analysis
3. On-Line Analytical Processing
4. Empirical Evaluation
5. Sum Up
RDF Graphs & BGP Queries
- recall -
The Resource Description Framework (RDF)
RDF graph: a set of triples

Assertion   Triple          Relational notation
Class       s rdf:type o    o(s)
Property    s p o           p(s, o)

[Figure: example RDF graph over user1, user2, blog1, the blank node _:b1 and the literals "Bill", "28" and "Madrid", with edges worksWith, hasName, hasAge, inCity, rdf:type (to Student), wrote and inBlog. Legend: resource (URI), blank node, literal (string), property.]
RDF Schema (RDFS)
declares semantic constraints between classes and properties

Constraint      Triple                    Relational notation
Subclass        s rdfs:subClassOf o       s ⊆ o
Subproperty     s rdfs:subPropertyOf o    s ⊆ o
Domain typing   s rdfs:domain o           Π_domain(s) ⊆ o
Range typing    s rdfs:range o            Π_range(s) ⊆ o

[Figure: example RDFS constraints over the classes Person and Student and the properties knows and worksWith, with edges rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain and rdfs:range.]
Open-world assumption and RDF entailment
The RDF data model is based on the open-world assumption.
→ constraints are deductive: they implicitly propagate tuples
Entailment: the reasoning mechanism
set of explicit triples + some entailment rules → derive implicit triples
Exhaustive application of entailment → saturation (closure)
The semantics of an RDF graph is its saturation.
[Figure: user1 rdf:type Student and Student rdfs:subClassOf Person entail the implicit triple user1 rdf:type Person.]
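The saturation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it covers just the four RDFS constraint kinds listed earlier (subclass, subproperty, domain, range), represents triples as plain string tuples, and uses a small hypothetical graph. The function name `saturate` and the sample triples are not from the talk.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def saturate(graph):
    """Return the closure of `graph` under four RDFS instance-level rules."""
    closure = set(graph)
    while True:
        derived = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p2 == SUBCLASS and p == RDF_TYPE and o == s2:
                    derived.add((s, RDF_TYPE, o2))   # s type C, C subClassOf D
                elif p2 == SUBPROP and p == s2:
                    derived.add((s, o2, o))          # s p o, p subPropertyOf q
                elif p2 == DOMAIN and p == s2:
                    derived.add((s, RDF_TYPE, o2))   # s p o, p domain C
                elif p2 == RANGE and p == s2:
                    derived.add((o, RDF_TYPE, o2))   # s p o, p range C
        if derived <= closure:                       # fixpoint: nothing new
            return closure
        closure |= derived


# Hypothetical explicit triples, chosen for illustration only.
graph = {
    ("user1", RDF_TYPE, "Student"),
    ("Student", SUBCLASS, "Person"),
    ("user1", "worksWith", "user2"),
    ("worksWith", SUBPROP, "knows"),
    ("knows", DOMAIN, "Person"),
    ("knows", RANGE, "Person"),
}

implicit = saturate(graph) - graph
for triple in sorted(implicit):
    print(triple)
```

On this sample the implicit triples are `user1 rdf:type Person`, `user1 knows user2` and `user2 rdf:type Person`. The loop stops at a fixpoint, which is what "exhaustive application" means here.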
Basic Graph Pattern (BGP) queries
→ subset of SPARQL; a BGP is a conjunction of triple patterns
q(y) :- x rdf:type Person, x hasName y
query evaluation vs. query answering
the evaluation of a query only uses the graph's explicit triples
(complete) answer set: evaluate q against the graph's saturation
[Figure: user1 rdf:type Student, Student rdfs:subClassOf Person, user1 hasName Bill, and the implicit triple user1 rdf:type Person.]
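The difference between evaluation and answering can be made concrete with a small BGP matcher. The sketch below is a naive nested-loop matcher written for illustration, not an optimized engine. Variables are written with a leading `?`, which is a convention assumed here, and the implicit triple is added by hand instead of being computed.

```python
def is_var(term):
    return term.startswith("?")


def match_bgp(patterns, graph):
    """Return every variable binding that maps all patterns onto triples of graph."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if is_var(term):
                        # Bind the variable, or check an earlier binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        bindings = extended
    return bindings


def answers(head, patterns, graph):
    """Project the bindings on the head variables (set semantics)."""
    return {tuple(b[v] for v in head) for b in match_bgp(patterns, graph)}


explicit = {
    ("user1", "rdf:type", "Student"),
    ("Student", "rdfs:subClassOf", "Person"),
    ("user1", "hasName", "Bill"),
}
# Saturation, written out by hand for this tiny graph.
saturated = explicit | {("user1", "rdf:type", "Person")}

# q(y) :- x rdf:type Person, x hasName y
head = ("?y",)
body = [("?x", "rdf:type", "Person"), ("?x", "hasName", "?y")]

evaluation = answers(head, body, explicit)    # explicit triples only
answer_set = answers(head, body, saturated)   # against the saturation
print(evaluation, answer_set)
```

Evaluation on the explicit triples returns nothing, because `user1 rdf:type Person` is only implicit; the answer set over the saturation contains `Bill`.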
RDF Graph Analysis
- formal framework for warehousing RDF data -
Analytical schema (AnS) and instance (I)
RDF graph:
[Figure: user1 and user2 have rdf:type Person; user1 hasName "Bill"; the users wrote post1 and post2; both posts are inBlog blog1; blog1 hasName "Code Blog".]
Analytical schema:
→ labeled directed graph
n1
  λ(n1) ← Blogger
  δ(n1) ← q(x) :- x rdf:type Person, x wrote y, y inBlog z
n2
  λ(n2) ← Name
  δ(n2) ← q(x) :- y hasName x
e2
  λ(e2) ← identifiedBy
  δ(e2) ← q(x, y) :- x rdf:type Person, x hasName y
Instance of the analytical schema w.r.t. the graph
x rdf:type λ(n1)
  user1 rdf:type Blogger
  user2 rdf:type Blogger
x λ(e2) y
  user1 identifiedBy Bill
x rdf:type λ(n2)
  Bill rdf:type Name
  Code Blog rdf:type Name
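To make the construction of the instance concrete, here is a minimal Python sketch that materializes it from the node and edge queries. It assumes the triples read off the figure (which user wrote which post is a guess), writes each δ query directly as a Python function rather than going through a generic query engine, and uses helper names (`pairs`, `materialize`) invented for this illustration. The sample graph has no schema constraints, so plain evaluation is enough here.

```python
RDF_TYPE = "rdf:type"

# Triples assumed from the figure; the wrote edges are a guess.
graph = {
    ("user1", RDF_TYPE, "Person"), ("user2", RDF_TYPE, "Person"),
    ("user1", "hasName", "Bill"),
    ("user1", "wrote", "post1"), ("user2", "wrote", "post2"),
    ("post1", "inBlog", "blog1"), ("post2", "inBlog", "blog1"),
    ("blog1", "hasName", "Code Blog"),
}


def pairs(g, prop):
    """All (subject, object) pairs connected by `prop` in g."""
    return {(s, o) for s, p, o in g if p == prop}


def bloggers(g):
    # δ(n1): q(x) :- x rdf:type Person, x wrote y, y inBlog z
    persons = {s for s, o in pairs(g, RDF_TYPE) if o == "Person"}
    in_a_blog = {post for post, _ in pairs(g, "inBlog")}
    return {x for x, y in pairs(g, "wrote") if x in persons and y in in_a_blog}


def names(g):
    # δ(n2): q(x) :- y hasName x
    return {o for _, o in pairs(g, "hasName")}


def identified_by(g):
    # δ(e2): q(x, y) :- x rdf:type Person, x hasName y
    persons = {s for s, o in pairs(g, RDF_TYPE) if o == "Person"}
    return {(x, y) for x, y in pairs(g, "hasName") if x in persons}


def materialize(g, nodes, edges):
    """Build the instance: one rdf:type triple per node answer, one edge triple per edge answer."""
    instance = set()
    for label, query in nodes.items():
        instance |= {(x, RDF_TYPE, label) for x in query(g)}
    for label, query in edges.items():
        instance |= {(x, label, y) for x, y in query(g)}
    return instance


instance = materialize(
    graph,
    nodes={"Blogger": bloggers, "Name": names},
    edges={"identifiedBy": identified_by},
)
for triple in sorted(instance):
    print(triple)
```

The result is itself a set of triples, which is why heterogeneity survives: user2 is a Blogger even though it has no name, and "Code Blog" is a Name even though it does not belong to a person.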
! data heterogeneity preserved !
! easy to extend !
Analytical query (AnQ)
Analytical schema:
  nodes: n1: Blogger, n2: City, n3: Value, n4: BlogPost, n5: Site
  edges: e2: from (n1 → n2), e3: age (n1 → n3), e4: posted (n1 → n4), e5: on (n4 → n5)
Instance:
[Figure: instance graph over user1, user2 and user3, the ages 28, 40 and 35, the cities Madrid and New York, post1 to post4 (posted), and blog1 and blog2 (on).]
Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
c(x, d1, d2) :- x age d1, x from d2
  { ⟨user1, "28", "Madrid"⟩, ⟨user3, "35", "New York"⟩ }
m(x, v) :- x posted y, y on v
  { ⟨user1, blog1⟩, ⟨user1, blog2⟩, ⟨user2, blog2⟩, ⟨user3, blog2⟩ }
count
  { ⟨"28", "Madrid", 2⟩, ⟨"35", "New York", 1⟩ }
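The three parts of the analytical query (classifier, measure, aggregate) can be traced in Python as follows. This is a sketch under assumptions: the assignment of posts to bloggers and sites is reconstructed so that it reproduces the measure answers shown above, the measure is treated with set semantics, and the helper names are invented for this illustration.

```python
from collections import defaultdict

# Instance triples; the post-level edges are an assumed reconstruction.
instance = {
    ("user1", "age", "28"), ("user1", "from", "Madrid"),
    ("user2", "age", "40"),
    ("user3", "age", "35"), ("user3", "from", "New York"),
    ("user1", "posted", "post1"), ("user1", "posted", "post2"),
    ("user2", "posted", "post3"), ("user3", "posted", "post4"),
    ("post1", "on", "blog1"), ("post2", "on", "blog2"),
    ("post3", "on", "blog2"), ("post4", "on", "blog2"),
}


def pairs(prop):
    return {(s, o) for s, p, o in instance if p == prop}


def classifier():
    # c(x, d1, d2) :- x age d1, x from d2
    return {(x, d1, d2)
            for x, d1 in pairs("age")
            for x2, d2 in pairs("from") if x2 == x}


def measure():
    # m(x, v) :- x posted y, y on v
    return {(x, v)
            for x, y in pairs("posted")
            for y2, v in pairs("on") if y2 == y}


def answer(classifier_rows, measure_rows, aggregate=len):
    """Group the measure values of each fact by its dimension values, then aggregate."""
    groups = defaultdict(list)
    for x, *dims in classifier_rows:
        for x2, v in measure_rows:
            if x2 == x:
                groups[tuple(dims)].append(v)
    return {dims + (aggregate(values),) for dims, values in groups.items()}


cube = answer(classifier(), measure())
print(sorted(cube))
```

user2 has an age but no city, so it is absent from the classifier and contributes to no group, which matches the answers listed above.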
Analytical query answering
through analytical schema materialization
through analytical query reformulation
Analytical schema:
n1
  λ(n1) ← Blogger
  δ(n1) ← q(x) :- x rdf:type Person, x wrote y, y inBlog z
e1
  λ(e1) ← acquaintedWith
  δ(e1) ← q(x, y) :- z rdfs:subPropertyOf knows, x z y
Query:
c(x, d) :- x rdf:type Blogger, x acquaintedWith d
Reformulated query:
c'(x, d) :- x rdf:type Person, x wrote y1, y1 inBlog y2, z1 rdfs:subPropertyOf knows, x z1 d
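Reformulation amounts to unfolding each atom of the query with the definition of the schema node or edge it mentions. The Python sketch below illustrates that idea only: the data layout (`NODES`, `EDGES`), the `?` prefix for variables and the fresh-variable naming scheme `?v1, ?v2, ...` are assumptions of this example, and it presumes the input query does not already use such names.

```python
from itertools import count

RDF_TYPE = "rdf:type"

# label -> (head variables, body) of the defining query δ
NODES = {
    "Blogger": (("?x",), [("?x", RDF_TYPE, "Person"),
                          ("?x", "wrote", "?y"),
                          ("?y", "inBlog", "?z")]),
}
EDGES = {
    "acquaintedWith": (("?x", "?y"), [("?z", "rdfs:subPropertyOf", "knows"),
                                      ("?x", "?z", "?y")]),
}


def unfold(head, body, args, fresh):
    """Instantiate a defining query: head variables become args, the others get fresh names."""
    renaming = dict(zip(head, args))

    def rename(term):
        if not term.startswith("?"):
            return term                      # constants are kept as they are
        if term not in renaming:
            renaming[term] = f"?v{next(fresh)}"
        return renaming[term]

    return [tuple(rename(t) for t in atom) for atom in body]


def reformulate(query_body):
    """Rewrite a query over the analytical schema into a query over the base graph."""
    fresh = count(1)
    result = []
    for s, p, o in query_body:
        if p == RDF_TYPE and o in NODES:
            head, body = NODES[o]
            result += unfold(head, body, (s,), fresh)
        elif p in EDGES:
            head, body = EDGES[p]
            result += unfold(head, body, (s, o), fresh)
        else:
            result.append((s, p, o))
    return result


# c(x, d) :- x rdf:type Blogger, x acquaintedWith d
query = [("?x", RDF_TYPE, "Blogger"), ("?x", "acquaintedWith", "?d")]
rewritten = reformulate(query)
for atom in rewritten:
    print(atom)
```

Up to variable names, the output is the reformulated query shown above. The trade-off between the two strategies is the usual one: materialization pays once at load time, reformulation pays at each query but needs no stored instance.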
On-Line Analytical Processing
- applying OLAP operations -
Slice, dice, drill-in and drill-out
Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
c(x, d1, d2) :- x age d1, x from d2
m(x, v) :- x posted y, y on v
count
Slice: bind an aggregation dimension to a single value
c_Σ(x, d1, d2) :- x age d1, x from d2
Σ = { d1 ← "35" }
Dice: bind several aggregation dimensions to sets of values
c_Σ(x, d1, d2) :- x age d1, x from d2
Σ = { d1 ← {"28"}, d2 ← {"Madrid", "Kyoto"} }
Drill-in: remove a dimension from the classifier
c'(x, d2) :- x from d2
Drill-out: add a dimension to the classifier
c'(x, d1, d2, d3) :- x age d1, x from d2, x acquaintedWith d3
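The four operations can be tried out on the running example with the sketch below. It is a simplification, assuming that every classifier atom has the form `x p d` so that a classifier is just a list of properties, and that each operation is answered by re-evaluating the rewritten classifier on the instance. The instance repeats the assumed blogger data, the `acquaintedWith` edge is hypothetical, and `classify` and `cube` are names invented here.

```python
from collections import defaultdict
from itertools import product

INSTANCE = {
    ("user1", "age", "28"), ("user1", "from", "Madrid"),
    ("user2", "age", "40"),
    ("user3", "age", "35"), ("user3", "from", "New York"),
    ("user1", "acquaintedWith", "user3"),           # hypothetical edge
    ("user1", "posted", "post1"), ("user1", "posted", "post2"),
    ("user2", "posted", "post3"), ("user3", "posted", "post4"),
    ("post1", "on", "blog1"), ("post2", "on", "blog2"),
    ("post3", "on", "blog2"), ("post4", "on", "blog2"),
}


def objects(subject, prop):
    return {o for s, p, o in INSTANCE if s == subject and p == prop}


def classify(props, sigma=None):
    """c(x, d1..dn) :- x p1 d1, ..., x pn dn, with sigma restricting some dimensions."""
    sigma = sigma or {}
    rows = set()
    for x in {s for s, _, _ in INSTANCE}:
        choices = []
        for prop in props:
            values = objects(x, prop)
            if prop in sigma:
                values &= sigma[prop]               # keep only the allowed values
            choices.append(values)
        rows |= {(x, *dims) for dims in product(*choices)}
    return rows


def cube(props, sigma=None):
    """Count the (blogger, site) measure pairs of each group of dimension values."""
    found = defaultdict(set)
    for x, *dims in classify(props, sigma):
        for post in objects(x, "posted"):
            for site in objects(post, "on"):
                found[tuple(dims)].add((x, site))
    return {dims: len(group) for dims, group in found.items()}


base = cube(["age", "from"])
sliced = cube(["age", "from"], {"age": {"35"}})
diced = cube(["age", "from"], {"age": {"28"}, "from": {"Madrid", "Kyoto"}})
drilled_in = cube(["from"])
drilled_out = cube(["age", "from", "acquaintedWith"])
print(base, sliced, diced, drilled_in, drilled_out, sep="\n")
```

Re-evaluating the classifier, instead of projecting the earlier result, matters for drill-in: a blogger with a city but no age would be missing from a projection of `c(x, d1, d2)` yet belongs in `c'(x, d2)`.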
Roll-up and drill-down
Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
c(x, d1, d2) :- x age d1, x from d2
m(x, v) :- x posted y, y on v
count
nextLevel relationship: hierarchies among nodes or edges
Analytical schema:
  nodes: n1: Blogger, n2: City, n3: Value, n4: BlogPost, n5: Site, n6: State
  edges: e2: from (n1 → n2), e3: age (n1 → n3), e4: posted (n1 → n4), e5: on (n4 → n5), e6: nextLevel (n2 → n6)
Roll-up: along the City dimension to the State level
c'(x, d1, d3) :- x age d1, x from d2, d2 nextLevel d3
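A roll-up can be sketched the same way, by evaluating the rewritten classifier that follows the nextLevel edge. Everything in the sample data below is hypothetical (user4, Buffalo and the state names do not appear in the talk); it only serves to show two cities collapsing into one state-level group.

```python
# Hypothetical instance extended with a City -> State hierarchy.
INSTANCE = {
    ("user1", "age", "28"), ("user1", "from", "Madrid"),
    ("user3", "age", "35"), ("user3", "from", "New York"),
    ("user4", "age", "35"), ("user4", "from", "Buffalo"),
    ("Madrid", "nextLevel", "Community of Madrid"),
    ("New York", "nextLevel", "New York State"),
    ("Buffalo", "nextLevel", "New York State"),
    ("user1", "posted", "post1"), ("user1", "posted", "post2"),
    ("user3", "posted", "post4"), ("user4", "posted", "post5"),
    ("post1", "on", "blog1"), ("post2", "on", "blog2"),
    ("post4", "on", "blog2"), ("post5", "on", "blog1"),
}


def pairs(prop):
    return {(s, o) for s, p, o in INSTANCE if p == prop}


def rolled_up_classifier():
    # c'(x, d1, d3) :- x age d1, x from d2, d2 nextLevel d3
    return {(x, d1, d3)
            for x, d1 in pairs("age")
            for x2, d2 in pairs("from") if x2 == x
            for city, d3 in pairs("nextLevel") if city == d2}


def count_sites(classifier_rows):
    """Count the (blogger, site) measure pairs falling in each group."""
    # m(x, v) :- x posted y, y on v
    measure = {(x, v)
               for x, y in pairs("posted")
               for y2, v in pairs("on") if y2 == y}
    counts = {}
    for x, *dims in classifier_rows:
        n = sum(1 for x2, _ in measure if x2 == x)
        if n:
            counts[tuple(dims)] = counts.get(tuple(dims), 0) + n
    return counts


rolled_up = count_sites(rolled_up_classifier())
print(rolled_up)
```

Here the bloggers from New York and Buffalo end up in the same ("35", "New York State") group, with their measure pairs counted together.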
Empirical Evaluation
- experiments and demo -
Experiments
Settings: kdb+ v3.0 (64 bits), a highly efficient in-memory column store
          q, an interpreted programming language
Dataset: DBpedia Download 3.8
         Ontology and Ontology Infobox datasets
Hardware: 8-core DELL server at 2.13 GHz
          16 GB of RAM
          running Linux 2.6.31.14
Results: linear scale-up w.r.t. the data size
         for instance materialization and query answering
Analytical query answering
12 patterns, 1,097 queries
  c: number of triple patterns in the classifier query
  v: number of dimension variables in the classifier query
  m: number of triple patterns in the measure query
[Figure: two per-pattern charts on logarithmic axes, one for evaluation time (s) and one for number of results, each showing the average, minimum and maximum.]
Number of queries per pattern: c1v1m1 (73), c1v1m2 (53), c1v1m3 (62), c2v1m3 (71), c3v2m3 (76), c4v3m3 (130), c5v1m3 (144), c5v2m3 (216), c5v3m3 (144), c5v4m1 (28), c5v4m2 (64), c5v4m3 (36)
Java GUI using the Prefuse toolkit
(collaboration with Tushar Ghosh)
Sum Up
Related works
Graph cube: on warehousing and OLAP multidimensional networks [SIGMOD 2011]
→ does not handle heterogeneous graphs or data semantics, both central in RDF
→ focuses only on counting edges, in contrast with our flexible analytical queries
Business intelligence on complex graph data [EDBT/ICDT 2012 Workshops]
→ graph data aggregated in a spatial fashion (connected nodes grouped into regions)
→ our framework is RDF-specific and supports more general aggregation
No Size Fits All: Running the Star Schema Benchmark with SPARQL and RDF Aggregate Views [ESWC 2013]
→ techniques for transforming OLAP queries into SPARQL
→ could be used to further optimize analytical query answering in our framework
The MD-join: An Operator for Complex OLAP [ICDE 2001]
→ the separation between grouping and aggregation in our analytical queries is similar to the MD-join operator for RDWs
W3C's SPARQL 1.1 Query Language
→ features SQL-style grouping and aggregation
→ efficient SPARQL 1.1 platforms are ideal for deploying our framework
Sum up and perspectives
Sum up:
Approach for specifying and exploiting an RDF data warehouse
define an analytical schema that captures the information of interest
formalize analytical queries (or cubes) over the analytical schema
Instances of analytical schemas are RDF graphs themselves, which makes it possible to exploit their rich semantics and heterogeneous structure.
Perspectives:
semi-automatic analytical schema design
optimized OLAP operations on analytical query results
efficient methods for deploying analytical schemas and analytical queries in parallel contexts
Questions?
[Figure: closing RDF graph over the nodes I, You, Attention and Question and the blank nodes _:b1, _:b2 and _:b3, with edges thank, payed, ask (three times) and rdf:type (three times).]
alexandra.roatis@inria.fr
https://team.inria.fr/oak/warg/

More Related Content

Similar to RDF Analytics: Lenses over Semantic Graphs

Michael mrissa c aise
Michael mrissa c aiseMichael mrissa c aise
Michael mrissa c aise
caise2013vlc
Ā 
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
Marta Villegas
Ā 

Similar to RDF Analytics: Lenses over Semantic Graphs (20)

FIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software ComponentsFIWARE Global Summit - IDS Implementation with FIWARE Software Components
FIWARE Global Summit - IDS Implementation with FIWARE Software Components
Ā 
Modern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative studyModern PHP RDF toolkits: a comparative study
Modern PHP RDF toolkits: a comparative study
Ā 
Lemmens kessler-agile-linked data v3-slideshare
Lemmens kessler-agile-linked data v3-slideshareLemmens kessler-agile-linked data v3-slideshare
Lemmens kessler-agile-linked data v3-slideshare
Ā 
Sigma EE: Reaping low-hanging fruits in RDF-based data integration
Sigma EE: Reaping low-hanging fruits in RDF-based data integrationSigma EE: Reaping low-hanging fruits in RDF-based data integration
Sigma EE: Reaping low-hanging fruits in RDF-based data integration
Ā 
Explicit Semantics in Graph DBs Driving Digital Transformation With Neo4j
Explicit Semantics in Graph DBs Driving Digital Transformation With Neo4jExplicit Semantics in Graph DBs Driving Digital Transformation With Neo4j
Explicit Semantics in Graph DBs Driving Digital Transformation With Neo4j
Ā 
What Factors Influence the Design of a Linked Data Generation Algorithm?
What Factors Influence the Design of a Linked Data Generation Algorithm?What Factors Influence the Design of a Linked Data Generation Algorithm?
What Factors Influence the Design of a Linked Data Generation Algorithm?
Ā 
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
Ā 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
Ā 
Geo know general presentation 2013
Geo know general presentation 2013Geo know general presentation 2013
Geo know general presentation 2013
Ā 
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Linked Open Graph: browsing multiple SPARQL entry points to build your own LO...
Ā 
Going for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial MetadataGoing for GOLD - Adventures in Open Linked Geospatial Metadata
Going for GOLD - Adventures in Open Linked Geospatial Metadata
Ā 
20140902 LinDa Workshop Semantincs2014 - LinDA Project Overview
20140902 LinDa Workshop Semantincs2014 - LinDA Project Overview20140902 LinDa Workshop Semantincs2014 - LinDA Project Overview
20140902 LinDa Workshop Semantincs2014 - LinDA Project Overview
Ā 
TEAMS 6, 7 and 8
TEAMS 6, 7 and 8TEAMS 6, 7 and 8
TEAMS 6, 7 and 8
Ā 
Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
Ā 
LOD(Linked Open Data) Recommendations
LOD(Linked Open Data) RecommendationsLOD(Linked Open Data) Recommendations
LOD(Linked Open Data) Recommendations
Ā 
Research Plan 2014
Research Plan 2014Research Plan 2014
Research Plan 2014
Ā 
Michael mrissa c aise
Michael mrissa c aiseMichael mrissa c aise
Michael mrissa c aise
Ā 
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
ā€œPublishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...
Ā 
RDF Analytics: Lenses over Semantic Graphs

  • 1. RDF Analytics: Lenses over Semantic Graphs. May 14, 2014.
      Dario Colazzo (3,1), François Goasdoué (4,1), Ioana Manolescu (1,2), Alexandra Roatiş (2,1)
      (1) OAK – Inria, France
      (2) LRI – Université Paris-Sud, France
      (3) LAMSADE – Université Paris Dauphine, France
      (4) PILGRIM – Université Rennes 1, France
  • 2. RDF data warehousing scenario
      Alice: software engineer, works for an IT company that builds user applications over open RDF data (Grenoble).
      DS: Restaurants. Build an app: a clickable map showing #restaurants, region & average rating, type of cuisine. Extract tabular data (SPARQL queries) and build RDW, a relational data warehouse. (i) heterogeneous data
      DS2: Shops, DS3: Museums. Merge RDW2 and RDW3. (ii) new central concepts
      Bug: landmarks, museums. Find and redesign. (iii) other missing relationships?
      Feature to add: query relationships (region, famous people). (iv) query the schema
      Feature to add: new type of aggregation: for each landmark, show how many restaurants are nearby. (v) impossible! (separate star schemas; restaurants and landmarks are both central entities)
  • 12. RDF data warehousing
      Application needs:
      (i) support for heterogeneous data
      (ii) multiple central concepts
      (iii) support for RDF semantics when querying
      (iv) possibility to query the relationships between entities (the schema)
      (v) flexible choice of aggregation dimensions
      This work: redesign the core data analytics concepts and tools for RDF; a formal framework for warehouse-style analytics on RDF data, suited to heterogeneous, semantic-rich corpora of Linked Data.
  • 13. Summary: 1. RDF Graphs & BGP Queries; 2. RDF Graph Analysis; 3. On-Line Analytical Processing; 4. Empirical Evaluation; 5. Sum Up
  • 14. RDF Graphs & BGP Queries – recall –
  • 15. The Resource Description Framework (RDF)
      RDF graph – set of triples
      Assertion   Triple          Relational notation
      Class       s rdf:type o    o(s)
      Property    s p o           p(s, o)
      Example graph (figure): nodes user1, user2, Bill, 28, Madrid, Student, _:b1, blog1; edges worksWith, hasName, hasAge, inCity, rdf:type, wrote, inBlog.
      Legend: resource (URI), blank node, literal (string), property.
  • 16. RDF Schema (RDFS) – declare semantic constraints between classes and properties
      Constraint      Triple                     Relational notation
      Subclass        s rdfs:subClassOf o        s ⊆ o
      Subproperty     s rdfs:subPropertyOf o     s ⊆ o
      Domain typing   s rdfs:domain o            Π_domain(s) ⊆ o
      Range typing    s rdfs:range o             Π_range(s) ⊆ o
      Example (figure): classes Person and Student (Student rdfs:subClassOf Person); properties knows and worksWith (worksWith rdfs:subPropertyOf knows), with rdfs:domain and rdfs:range edges.
  • 17. Open-world assumption and RDF entailment
      RDF data model – based on the open-world assumption → deductive constraints – implicitly propagate tuples.
      Entailment – reasoning mechanism: set of explicit triples + some entailment rules → derive implicit triples.
      Exhaustive application of entailment → saturation (closure).
      The semantics of an RDF graph is its saturation.
      Example (figure): user1 rdf:type Student and Student rdfs:subClassOf Person entail user1 rdf:type Person.
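The saturation step on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: triples are plain tuples of strings, only two RDFS rules are applied (subclass transitivity and type propagation along rdfs:subClassOf), and the function name `saturate` is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def saturate(triples):
    """Apply two RDFS entailment rules until no new triple appears (fixpoint)."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        derived = set()
        for (s, p, o) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                # Rule 1: subclass transitivity (s ⊆ o and o ⊆ o2 give s ⊆ o2).
                if p2 == SUBCLASS and s2 == o:
                    derived.add((s, SUBCLASS, o2))
                # Rule 2: type propagation (s2 is an s, and s ⊆ o, so s2 is an o).
                if p2 == RDF_TYPE and o2 == s:
                    derived.add((s2, RDF_TYPE, o))
        if not derived <= closure:
            closure |= derived
            changed = True
    return closure


# Explicit triples from the slide's example.
graph = {
    ("user1", RDF_TYPE, "Student"),
    ("Student", SUBCLASS, "Person"),
}
saturated = saturate(graph)
implicit = saturated - graph  # triples that hold only through entailment
print(implicit)
```

A full RDFS reasoner would also cover rdfs:subPropertyOf, rdfs:domain and rdfs:range; they are left out here to keep the fixpoint loop readable.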
  • 18. Basic Graph Pattern (BGP) queries
      → subset of SPARQL; BGP – conjunctions of triple patterns
      q(y) :- x rdf:type Person, x hasName y
      Query evaluation: the evaluation of a query only uses the graph's explicit triples.
      Query answering: (complete) answer set – evaluate q against the graph's saturation.
      Example (figure): user1 rdf:type Student; Student rdfs:subClassOf Person; user1 rdf:type Person (implicit); user1 hasName Bill.
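The difference between query evaluation and query answering can be made concrete with a small BGP matcher. The sketch below is illustrative only: variables are written with a leading `?` (a convention assumed here, not taken from the slides), the helper names `match_bgp` and `answers` are hypothetical, and the saturation is written out by hand rather than computed.

```python
def is_var(term):
    """Variables are marked with a leading '?' in this sketch."""
    return term.startswith("?")


def match_bgp(patterns, triples, binding=None):
    """Yield every variable binding that maps all triple patterns into `triples`."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match_bgp(rest, triples, candidate)


def answers(head, patterns, triples):
    """Project each binding on the head variables."""
    return {tuple(b[v] for v in head) for b in match_bgp(patterns, triples)}


explicit = {
    ("user1", "rdf:type", "Student"),
    ("Student", "rdfs:subClassOf", "Person"),
    ("user1", "hasName", "Bill"),
}
# Saturation written by hand: the only implicit triple in this example.
saturation = explicit | {("user1", "rdf:type", "Person")}

# q(y) :- x rdf:type Person, x hasName y
head = ["?y"]
body = [("?x", "rdf:type", "Person"), ("?x", "hasName", "?y")]

evaluation = answers(head, body, explicit)    # explicit triples only
answer_set = answers(head, body, saturation)  # complete answer set
print(evaluation, answer_set)
```

Evaluation misses Bill because user1 is only explicitly a Student; answering against the saturation finds him.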
  • 19. RDF Graph Analysis – formal framework for warehousing RDF data –
  • 20. Analytical schema (AnS) and instance (I)
      RDF graph (figure): user1 and user2 are of rdf:type Person; user1 hasName Bill; posts post1 and post2 (wrote edges), both inBlog blog1; a further hasName edge to the literal Code Blog.
      Analytical schema: → labeled directed graph
      n1: λ(n1) ← Blogger; δ(n1) ← q(x) :- x rdf:type Person, x wrote y, y inBlog z
      n2: λ(n2) ← Name; δ(n2) ← q(x) :- y hasName x
      e2: λ(e2) ← identifiedBy; δ(e2) ← q(x, y) :- x rdf:type Person, x hasName y
      ! data heterogeneity preserved !
      ! easy to extend !
      Instance of the analytical schema w.r.t. the graph:
      x rdf:type λ(n1): user1 rdf:type Blogger; user2 rdf:type Blogger
      x λ(e2) y: user1 identifiedBy Bill
      x rdf:type λ(n2): Bill rdf:type Name; Code Blog rdf:type Name
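How the instance follows from the schema definitions can be sketched as below. The code evaluates each δ query with set comprehensions instead of a general BGP engine. Two details of the base graph are assumptions, since the flattened figure does not state them: which user wrote which post, and that the Code Blog name belongs to blog1.

```python
graph = {
    ("user1", "rdf:type", "Person"),
    ("user2", "rdf:type", "Person"),
    ("user1", "hasName", "Bill"),
    ("user1", "wrote", "post1"),        # assumed author of post1
    ("user2", "wrote", "post2"),        # assumed author of post2
    ("post1", "inBlog", "blog1"),
    ("post2", "inBlog", "blog1"),
    ("blog1", "hasName", "Code Blog"),  # assumed subject of this hasName edge
}


def pairs(prop):
    """All (subject, object) pairs connected by property `prop`."""
    return {(s, o) for (s, p, o) in graph if p == prop}


persons = {s for (s, o) in pairs("rdf:type") if o == "Person"}
posts_in_blogs = {s for (s, _blog) in pairs("inBlog")}

# δ(n1): q(x) :- x rdf:type Person, x wrote y, y inBlog z
bloggers = {x for (x, y) in pairs("wrote") if x in persons and y in posts_in_blogs}
# δ(n2): q(x) :- y hasName x
names = {o for (_s, o) in pairs("hasName")}
# δ(e2): q(x, y) :- x rdf:type Person, x hasName y
identified_by = {(x, y) for (x, y) in pairs("hasName") if x in persons}

# The instance is itself a set of RDF triples labeled with λ(n1), λ(n2), λ(e2).
instance = (
    {(x, "rdf:type", "Blogger") for x in bloggers}
    | {(x, "rdf:type", "Name") for x in names}
    | {(x, "identifiedBy", y) for (x, y) in identified_by}
)
for triple in sorted(instance):
    print(triple)
```

Note that Code Blog becomes a Name even though no Person is identified by it, which is the heterogeneity the slide points out.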
  • 26. Analytical query (AnQ)
      Analytical schema: n1: Blogger; n2: City (e2: from); n3: Value (e3: age); n4: BlogPost (e4: posted); n5: Site (e5: on)
      Instance (figure): user1 age 28, from Madrid; user2 age 40; user3 age 35, from New York; posts post1 to post4 (posted edges), each on blog1 or blog2.
      Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
      Classifier: c(x, d1, d2) :- x age d1, x from d2
      Classifier answer: { ⟨user1, “28”, “Madrid”⟩, ⟨user3, “35”, “New York”⟩ }
      Measure: m(x, v) :- x posted y, y on v
      Measure answer: { ⟨user1, blog1⟩, ⟨user1, blog2⟩, ⟨user2, blog2⟩, ⟨user3, blog2⟩ }
      Aggregate: count
      Result: { ⟨“28”, “Madrid”, 2⟩, ⟨“35”, “New York”, 1⟩ }
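The classifier, measure and count steps can be traced with a short script. This is a sketch, not the authors' system: the assignment of posts to bloggers and sites is an assumption chosen to be consistent with the measure answer shown on the slide, and the measure is treated as a set of pairs, as on the slide.

```python
from collections import defaultdict

instance = {
    ("user1", "age", "28"), ("user1", "from", "Madrid"),
    ("user2", "age", "40"),
    ("user3", "age", "35"), ("user3", "from", "New York"),
    # Assumed posted/on edges, consistent with the slide's measure answer.
    ("user1", "posted", "post1"), ("user1", "posted", "post2"),
    ("user2", "posted", "post3"), ("user3", "posted", "post4"),
    ("post1", "on", "blog1"), ("post2", "on", "blog2"),
    ("post3", "on", "blog2"), ("post4", "on", "blog2"),
}


def pairs(prop):
    """All (subject, object) pairs connected by property `prop`."""
    return {(s, o) for (s, p, o) in instance if p == prop}


# c(x, d1, d2) :- x age d1, x from d2
classifier = {
    (x, d1, d2)
    for (x, d1) in pairs("age")
    for (x2, d2) in pairs("from")
    if x == x2
}
# m(x, v) :- x posted y, y on v
measure = {
    (x, v)
    for (x, y) in pairs("posted")
    for (y2, v) in pairs("on")
    if y == y2
}

# Group the measure values of each fact x by its dimension values, then count.
cube = defaultdict(int)
for (x, d1, d2) in classifier:
    for (x2, _site) in measure:
        if x == x2:
            cube[(d1, d2)] += 1

print(dict(cube))
```

user2 has a measure value but no city, so it does not appear in the classifier answer and contributes to no cell.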
  • 30. Analytical query answering
      Two options: through analytical schema materialization, or through analytical query reformulation.
      Analytical schema:
      n1: λ(n1) ← Blogger; δ(n1) ← q(x) :- x rdf:type Person, x wrote y, y inBlog z
      e1: λ(e1) ← acquaintedWith; δ(e1) ← q(x, y) :- z rdfs:subPropertyOf knows, x z y
      Query: c(x, d) :- x rdf:type Blogger, x acquaintedWith d
      Reformulated query: c′(x, d) :- x rdf:type Person, x wrote y1, y1 inBlog y2, z1 rdfs:subPropertyOf knows, x z1 d
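The reformulation shown on this slide amounts to replacing each schema-level atom by the body of its δ definition. A minimal sketch of that unfolding follows; it is an illustration of the idea rather than the authors' algorithm. Assumptions: variables carry a leading `?`, fresh variables are named `?v1`, `?v2`, ... (and are assumed not to clash with query variables), and the names `NODE_DEFS`, `EDGE_DEFS`, `unfold` and `reformulate` are hypothetical.

```python
from itertools import count


def is_var(term):
    return term.startswith("?")


# δ definitions from the slide: (head variables, body triple patterns).
NODE_DEFS = {
    "Blogger": (["?x"], [("?x", "rdf:type", "Person"),
                         ("?x", "wrote", "?y"),
                         ("?y", "inBlog", "?z")]),
}
EDGE_DEFS = {
    "acquaintedWith": (["?x", "?y"], [("?z", "rdfs:subPropertyOf", "knows"),
                                      ("?x", "?z", "?y")]),
}


def unfold(definition, args, fresh):
    """Instantiate a definition body: head variables take `args`, others get fresh names."""
    head, body = definition
    renaming = dict(zip(head, args))

    def rename(term):
        if not is_var(term):
            return term
        if term not in renaming:
            renaming[term] = f"?v{next(fresh)}"
        return renaming[term]

    return [tuple(rename(t) for t in atom) for atom in body]


def reformulate(body):
    """Rewrite a query over the analytical schema into a query over the base graph."""
    fresh = count(1)
    out = []
    for (s, p, o) in body:
        if p == "rdf:type" and o in NODE_DEFS:
            out += unfold(NODE_DEFS[o], [s], fresh)
        elif p in EDGE_DEFS:
            out += unfold(EDGE_DEFS[p], [s, o], fresh)
        else:
            out.append((s, p, o))
    return out


# c(x, d) :- x rdf:type Blogger, x acquaintedWith d
query = [("?x", "rdf:type", "Blogger"), ("?x", "acquaintedWith", "?d")]
reformulated = reformulate(query)
for atom in reformulated:
    print(atom)
```

Up to variable names, the output matches the reformulated query c′ on the slide; the rewritten query can then be answered on the base RDF graph without materializing the instance.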
  • 34. On-Line Analytical Processing – applying OLAP operations –
  • 35. Slice, dice, drill-in and drill-out
      Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
      c(x, d1, d2) :- x age d1, x from d2
      m(x, v) :- x posted y, y on v
      Aggregate: count
      Slice: bind an aggregation dimension to a single value.
      cΣ(x, d1, d2) :- x age d1, x from d2, with Σ = { d1 ← “35” }
      Dice: bind several aggregation dimensions to sets of values.
      cΣ(x, d1, d2) :- x age d1, x from d2, with Σ = { d1 ← {“28”}, d2 ← {“Madrid”, “Kyoto”} }
      Drill-in: remove a dimension from the classifier.
      c′(x, d2) :- x from d2
      Drill-out: add a dimension to the classifier.
      c′(x, d1, d2, d3) :- x age d1, x from d2, x acquaintedWith d3
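Slice, dice and drill-in can be tried on the classifier and measure answers from the analytical query slide. The sketch below is only an illustration: `restrict` and `count_cube` are hypothetical helper names, Σ is encoded as a mapping from tuple position to a set of allowed values, and the drill-in classifier answer is written out by hand. Drill-out is omitted because the slides give no acquaintedWith data.

```python
from collections import Counter

# Classifier and measure answers as listed on the analytical query slide.
classifier = {("user1", "28", "Madrid"), ("user3", "35", "New York")}
measure = {("user1", "blog1"), ("user1", "blog2"),
           ("user2", "blog2"), ("user3", "blog2")}


def count_cube(classifier_rows, measure_rows):
    """Count measure values per combination of dimension values."""
    cube = Counter()
    for (x, *dims) in classifier_rows:
        n = sum(1 for (x2, _site) in measure_rows if x2 == x)
        if n:
            cube[tuple(dims)] += n
    return dict(cube)


def restrict(classifier_rows, sigma):
    """Keep rows whose value at each position in `sigma` is among the allowed ones."""
    return {row for row in classifier_rows
            if all(row[pos] in allowed for pos, allowed in sigma.items())}


# Slice: Σ = { d1 ← "35" } (d1 sits at position 1 of the classifier tuple).
slice_cube = count_cube(restrict(classifier, {1: {"35"}}), measure)

# Dice: Σ = { d1 ← {"28"}, d2 ← {"Madrid", "Kyoto"} }.
dice_cube = count_cube(
    restrict(classifier, {1: {"28"}, 2: {"Madrid", "Kyoto"}}), measure)

# Drill-in: c'(x, d2) :- x from d2, answer written by hand for this instance.
drill_in_classifier = {("user1", "Madrid"), ("user3", "New York")}
drill_in_cube = count_cube(drill_in_classifier, measure)

print(slice_cube, dice_cube, drill_in_cube)
```

Drill-in is shown as a new classifier rather than a projection of the old answer: removing the city dimension instead, for example, would let user2 in, since user2 has an age but no city.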
  • 40. Roll-up and drill-down
      Query: Find the number of sites where each blogger posts, classified by the blogger's age and city.
      c(x, d1, d2) :- x age d1, x from d2
      m(x, v) :- x posted y, y on v
      Aggregate: count
      nextLevel relationship – hierarchies among nodes or edges
      Analytical schema: n1: Blogger; n2: City (e2: from); n6: State (e6: nextLevel, from City); n3: Value (e3: age); n4: BlogPost (e4: posted); n5: Site (e5: on)
      Roll-up: along the City dimension to the State level.
      c′(x, d1, d3) :- x age d1, x from d2, d2 nextLevel d3
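The roll-up classifier can be sketched by joining the earlier classifier answer with nextLevel edges. The slides give no nextLevel data, so the two City-to-State pairs below are hypothetical placeholders, as is the variable name `next_level`.

```python
from collections import Counter

# Classifier and measure answers as listed on the analytical query slide.
classifier = {("user1", "28", "Madrid"), ("user3", "35", "New York")}
measure = {("user1", "blog1"), ("user1", "blog2"),
           ("user2", "blog2"), ("user3", "blog2")}

# Hypothetical nextLevel edges (City -> State); not given on the slides.
next_level = {("Madrid", "Community of Madrid"), ("New York", "New York State")}

# c'(x, d1, d3) :- x age d1, x from d2, d2 nextLevel d3
rolled_up = {
    (x, d1, d3)
    for (x, d1, d2) in classifier
    for (city, d3) in next_level
    if city == d2
}

# Same count aggregation as before, now grouped by (age, state).
cube = Counter()
for (x, d1, d3) in rolled_up:
    cube[(d1, d3)] += sum(1 for (x2, _site) in measure if x2 == x)

print(dict(cube))
```

With more cities per state, several city-level cells would merge into one state-level cell; on this tiny instance the counts stay the same and only the dimension values change.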
  • 41. Empirical Evaluation – experiments and demo –
  • 42. Experiments
      Settings: kdb+ v3.0 (64 bits) – highly efficient in-memory column store; q interpreted programming language
      Dataset: DBpedia Download 3.8, Ontology and Ontology Infobox datasets
      Hardware: 8-core DELL server at 2.13 GHz, 16 GB of RAM, running Linux 2.6.31.14
      Results: linear scale-up w.r.t. the data size for instance materialization and query answering
  • 43. Analytical query answering
      12 patterns, 1,097 queries. Pattern names use: c = number of triple patterns in the classifier query; v = number of dimension variables in the classifier query; m = number of triple patterns in the measure query.
      Queries per pattern: c1v1m1 (73), c1v1m2 (53), c1v1m3 (62), c2v1m3 (71), c3v2m3 (76), c4v3m3 (130), c5v1m3 (144), c5v2m3 (216), c5v3m3 (144), c5v4m1 (28), c5v4m2 (64), c5v4m3 (36)
      [Figure: per-pattern plots of evaluation time (s) and number of results, each showing average, minimum and maximum.]
  • 44. Java GUI using the Prefuse toolkit (collaboration with Tushar Ghosh)
  • 47. Sum Up
  • 48. Related works
      Graph cube: on warehousing and OLAP multidimensional networks [SIGMOD 2011]
      → do not handle heterogeneous graphs, nor data semantics, both central in RDF
      → only focus on counting edges, in contrast with our flexible analytical queries
      Business intelligence on complex graph data [EDBT/ICDT 2012 Workshops]
      → graph data aggregated in a spatial fashion (group connected nodes into regions)
      → our framework – RDF-specific + more general aggregation
      No Size Fits All – Running the Star Schema Benchmark with SPARQL and RDF Aggregate Views [ESWC 2013]
      → techniques for transforming OLAP queries into SPARQL
      → could be used to further optimize analytical query answering in our framework
      The MD-join: An Operator for Complex OLAP [ICDE 2001]
      → separation between grouping and aggregation present in our analytical queries is similar to the MD-join operator for RDWs
      W3C's SPARQL 1.1 Query Language
      → features SQL-style grouping and aggregation
      → efficient SPARQL 1.1 platforms – ideal for deploying our framework
  • 49. Sum up and perspectives
      Sum up: an approach for specifying and exploiting an RDF data warehouse:
      define an analytical schema that captures the information of interest;
      formalize analytical queries (or cubes) over the analytical schema.
      Instances of analytical schemas are RDF graphs themselves, which makes it possible to exploit their rich semantics and heterogeneous structure.
      Perspectives:
      semi-automatic analytical schema design;
      optimized OLAP operations on analytical query results;
      efficient methods for deploying analytical schemas and analytical queries in parallel contexts.