This document discusses access control for RDF graphs using abstract models. It presents an abstract access control model defined using abstract tokens and operators to model the computation of access labels for inferred RDF triples. The model supports dynamic datasets and policies. Experiments show that annotation time increases with the number of implied triples, while evaluation time increases linearly with the total number of triples. The abstract model approach allows different concrete access control policies to be applied to the same dataset.
This document outlines the topics and subtopics that will be covered in an HP Education Services SCJP Oriented Core Java Program. The program will cover core Java concepts like OOP, generics, collections, threads, I/O and more across 36 topics. Each topic includes multiple subtopics that will be discussed to provide an in-depth understanding of Java programming.
The document discusses key concepts in object-oriented programming including classes, methods, interfaces, properties, and nested classes. It provides examples of class definitions in various languages like Java, C++, and C# to illustrate concepts like encapsulation, visibility modifiers, constructors, and accessor/mutator methods. It also covers topics like separation of definition and implementation, interfaces, properties, and class data fields.
Managing Binary Compatibility in Scala (Scala Lift Off 2011) by mircodotta
Slides of my Scala Lift Off 2011 talk. The content is largely the same as the Scala Days 2011 presentation, with a few additions; in particular, lazy values are discussed.
Presented at: JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: Semantic Web and Linked Open Data (LOD) are known to be powerful technologies for knowledge management, and explicit knowledge is expected to be published in RDF (Resource Description Framework), but ordinary users stay away from RDF because of the technical skills it requires. Since a concept map or node-link diagram can enhance learning for users from beginner to advanced level, RDF graph visualization is a suitable tool for familiarizing users with Semantic Web technology. However, an RDF graph generated from a whole query result is not suitable for reading, because it is highly connected, like a hairball, and poorly organized. To make a graph that presents knowledge easier to read, this research introduces an approach to sparsify a graph using a combination of three main functions: graph simplification, triple ranking, and property selection. These functions are largely based on interpreting RDF data as knowledge units, combined with statistical analysis, in order to deliver an easily readable graph to users. A prototype is implemented to demonstrate the suitability and feasibility of the approach. It shows that the simple and flexible graph visualization is easy to read and leaves a positive impression on users. In addition, the attractive tool helps inspire users to recognize the advantageous role of linked data in knowledge management.
The document provides examples of representing data in RDF formats including RDF/XML, Notation 3, Turtle and triples. It shows how to represent basic statements and relationships between resources as well as more complex data structures like bags, sequences and collections. Examples are given for converting between the different RDF syntaxes and representing graphs in RDF/XML.
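As a minimal, illustrative sketch (not taken from the summarized document), the rdflib Python library can build a small graph and serialize it in two of these syntaxes; the example resources are invented:

```python
# Illustrative only: a tiny graph serialized in two RDF syntaxes with rdflib.
# Assumes rdflib (https://rdflib.readthedocs.io) is installed; names are made up.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

g = Graph()
g.bind("ex", EX)
g.add((EX.alice, EX.firstName, Literal("Alice")))   # one statement: (alice, firstName, "Alice")
g.add((EX.alice, EX.knows, EX.bob))

print(g.serialize(format="turtle"))   # Turtle syntax
print(g.serialize(format="nt"))       # N-Triples, one triple per line
# Note: older rdflib versions return bytes rather than str from serialize().
```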
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono by Achim Friedland
This document discusses developing a common graph database access layer for .NET and Mono. It proposes a property graph model interface that allows adding vertices and edges to an in-memory graph and setting their properties. Pipes are introduced as a way to query and transform graph elements in a data flow framework. An ad hoc query language is suggested as a user-friendly way to explore a property graph. Finally, exposing graphs over HTTP/REST is covered as a method for accessing remote graphs.
Linked data presentation to AALL 2012 Boston by Diane Hillmann
The document discusses how traditional cataloging practices assume a "closed world" approach while the semantic web assumes an "open world". It notes that digital identities need clearer definition than physical resources. It also discusses bridging XML and RDF approaches, expressing AACR2 in technical ways, and ensuring RDA vocabularies can be extended and mapped to other schemas in specialized domains.
This document discusses ontology mapping. It begins with an introduction to the semantic web and ontologies. Ontology mapping is important for allowing different ontologies to be aligned and related. There are different types of ontology mapping including alignment, merging, and mapping. The document then surveys some popular ontology mapping techniques including GLUE, PROMPT, and QOM. It evaluates these techniques and discusses their inputs, outputs, and approaches. The document concludes that semantic web research is important for advancing web technologies and realizing the goals of web 3.0. Future work could involve developing new ontology mapping techniques and publishing research on existing mapping methods.
The formulation of constraints and the validation of RDF data against these constraints is a common requirement and a much sought-after feature, particularly as this is taken for granted in the XML world. Recently, RDF validation as a research field gained speed due to shared needs of data practitioners from a variety of domains. For constraint formulation and RDF data validation, several languages exist or are currently being developed. Yet, none of the languages is able to meet all requirements raised by data professionals.
We have published a set of constraint types that are required by diverse stakeholders for data applications. We use these constraint types to gain a better understanding of the expressiveness of solutions, investigate the role that reasoning plays in practical data validation, and give directions for the further development of constraint languages.
We introduce a validation framework that makes it possible to consistently execute RDF-based constraint languages on RDF data and to formulate constraints of any type, in such a way that mappings from high-level constraint languages to an intermediate generic representation can be created straightforwardly. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, and consists of a very simple conceptual model with a small lightweight vocabulary. We demonstrate that using another layer on top of SPARQL ensures consistency regarding validation results and enables constraint transformations for each constraint type across RDF-based constraint languages.
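Purely as a generic illustration of layering constraint checking on top of SPARQL (this is not the framework described above; the constraint, names, and data are invented), a cardinality constraint can be evaluated as a SPARQL query with rdflib:

```python
# Illustration only: check one invented constraint ("every ex:Person has exactly
# one ex:firstName") by evaluating a SPARQL query over an RDF graph with rdflib.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.alice, RDF.type, EX.Person))
g.add((EX.alice, EX.firstName, Literal("Alice")))
g.add((EX.bob, RDF.type, EX.Person))          # bob violates the constraint: no firstName

violations = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?p WHERE {
        ?p a ex:Person .
        OPTIONAL { ?p ex:firstName ?n }
    }
    GROUP BY ?p
    HAVING (COUNT(?n) != 1)
""")
for row in violations:
    print("constraint violated by", row.p)    # prints the URI of ex:bob
```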
The document provides an overview of the semantic web including:
1. It describes the key technologies that power the semantic web such as RDF, RDFS, OWL, and SPARQL which allow data to be shared and reused across applications.
2. It discusses semantic web themes like linked data, vocabularies, and inference which enable data from multiple sources to be integrated and new insights to be discovered.
3. It outlines current and future applications of the semantic web such as in e-commerce, online advertising, and government where semantic technologies can enhance search, personalization and data sharing.
This document discusses several inference engines that can be used for semantic web applications: Pellet, FaCT, FaCT++, RacerPro, Kaon2, and HermiT. It analyzes and compares these inference engines based on their expressivity, algorithms, interfaces, and other features. The key purpose of inference engines is to infer new knowledge and relationships from existing semantic data using rules and ontologies. The document concludes that a comparative analysis of inference engines can help select the most appropriate one for a given semantic web application or research.
Towards Virtual Knowledge Graphs over Web APIs by Speck&Tech
ABSTRACT: Knowledge Graphs (KGs) are an emerging, highly flexible and Web-friendly technology for integrating, representing, and querying semi-structured data in a semantically rich model formalized by an Ontology. KGs may be built using specialized data management software (e.g., triplestores) or, by leveraging suitable mappings and query rewriting techniques, as "Virtual Knowledge Graph" (VKG) views over some legacy data source, such as a relational database. In this talk, we provide background information on VKGs and their underlying technologies, with particular emphasis on the open-source Ontop VKG engine, and we discuss ongoing research and development efforts towards their extension to Web APIs as a non-relational data source of practical relevance. This extension, supported by the HIVE and OntoCRM projects, would also enable transparent access to both static relational data and dynamically-computed Web API data as part of a regular VKG query.
BIO: Francesco Corcoglioniti is a researcher at the Free University of Bozen-Bolzano, Italy, where he contributes to research, development, and project collaborations related to Virtual Knowledge Graphs (VKG), their extensions, and their implementation in the open-source Ontop system.
This document provides an overview of a presentation on representing and connecting language data and metadata using linked data. It discusses the technological background of linked data and the collaborative research opportunities it provides for linguistics. It also outlines prospects for using linked data in linguistics by connecting annotated corpora, lexical-semantic resources, and linguistic databases to build a linguistic linked open data cloud.
The document summarizes Ivan Herman's presentation on semantic technology and business applications at the 5th June 2012 Semantic Technology & Business Conference in San Francisco. The presentation covered several topics relating to semantic technologies including knowledge graphs, linked data, ontologies, semantic search, semantic data integration, standards like RDF, OWL, and SPARQL, and applications of semantic technologies in domains like life sciences, publishing, and government. It also discussed ongoing and future work at the W3C relating to areas like provenance, access control, and constraints on semantic web data.
The document discusses Chapter 21 of an object database course. It provides an overview of object database standards, languages, and conceptual design. Specifically, it outlines the Object Data Management Group (ODMG) standard, which includes the Object Definition Language (ODL) and Object Query Language (OQL). It also describes the ODMG object model and how relationships, inheritance, and operations are handled differently in object databases compared to relational databases.
The document discusses object-oriented programming concepts in Java, including classes, objects, inheritance, encapsulation, and polymorphism. It provides examples and definitions of key OOP concepts like class, object, inheritance, abstraction, encapsulation, polymorphism, and the SOLID principles (single responsibility, open/closed, Liskov substitution, interface segregation, and dependency inversion). It also covers Java specifics like access modifiers, variables, and how to create objects in Java.
Mapping of extensible markup language-to-ontology representation for effectiv... by IAESIJAI
Extensible markup language (XML) is well known as the standard for data exchange over the internet. It is flexible and highly expressive in describing relationships between the stored data. Yet, the structural complexity and the semantic relationships are not well expressed. On the other hand, ontology models structural, semantic and domain knowledge effectively. By combining ontology with visualization, one can obtain a closer view based on the respective user requirements. In this paper, we propose several mapping rules for the transformation of XML into an ontology representation. Subsequently, we show how the ontology is constructed based on the proposed rules using the sample domain ontology from the University of Wisconsin-Milwaukee (UWM) and Mondial datasets. We also look at the schemas, query workload, and evaluation to derive extended knowledge from the existing ontology. The correctness of the ontology representation has proven effective in supporting various types of complex queries in the simple protocol and resource description framework query language (SPARQL).
The document discusses the DCMI Metadata Framework, which includes the DCMI Abstract Model (DCAM) and other components. DCAM defines core metadata constructs like properties and vocabularies, and provides a basis for defining interoperable vocabularies, profiles, and syntaxes. It also discusses challenges around cross-framework interoperability and terminology used for different types of metadata specifications.
This document provides an overview of RDF, RDFS, and OWL, which are graph data models used to represent data on the Semantic Web. It describes the core components of RDF, including URIs, triples, and data types. It also explains how RDF graphs can be represented in N-Triples format or XML. Additionally, it covers RDF Schema (RDFS) and how it adds a type system to RDF through classes, subclasses, domains, and ranges of properties. The document concludes by noting some limitations of RDF and RDFS in modeling complex constraints and relationships.
This document provides an introduction to the Semantic Web, covering topics such as what the Semantic Web is, how semantic data is represented and stored, querying semantic data using SPARQL, and who is implementing Semantic Web technologies. The presentation includes definitions of key concepts, examples to illustrate technical aspects, and discussions of how the Semantic Web compares to other technologies. Major companies implementing aspects of the Semantic Web are highlighted.
The JISC DC Application Profiles: Some thoughts on requirements and scope by Eduserv Foundation
- The JISC has funded the development of Dublin Core Application Profiles (DCAPs) for specific resource types like scholarly works, images, and geospatial data.
- There is a tension between creating DCAPs that are highly specific to resource types versus more general profiles that allow for linking and querying across types.
- Existing conceptual models like FRBR provide a possible "core" model that DCAPs could harmonize with to facilitate integration and querying across resource types.
A Hands On Overview Of The Semantic Web by Shamod Lacoul
The document provides an overview of the Semantic Web and introduces key concepts such as RDF, RDFS, SPARQL, OWL, and Linked Open Data. It begins with defining what the Semantic Web is, why it is useful, and how it differs from the traditional web by linking data rather than documents. It then covers RDF for representing data, RDFS for defining schemas, and SPARQL for querying RDF data. The document also discusses OWL for building ontologies and Linked Open Data initiatives that have published billions of RDF triples on the web.
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016) by Dr.-Ing. Thomas Hartmann
In this thesis, a validation framework is introduced that makes it possible to consistently execute RDF-based constraint languages on RDF data and to formulate constraints of any type. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, consists of a small lightweight vocabulary, ensures consistency regarding validation results, and enables constraint transformations for each constraint type across RDF-based constraint languages.
The document discusses the semantic web and case-based reasoning. It provides an overview of key concepts like ontology languages, RDF, OWL, and describes how case-based reasoning works and how it can be applied to the semantic web through a conversational case-based reasoning approach and prototype. The document also includes references for further information.
This document describes a Contextualized Knowledge Repository (CKR) framework that allows for representing and reasoning with contextual knowledge on the Semantic Web. The CKR extends the description logic SROIQ-RL to include defeasible axioms in the global context. Defeasible axioms can be overridden by local contexts, allowing exceptions. The CKR is composed of two layers: a global context containing metadata and defeasible axioms, and local contexts containing object knowledge with references. An interpretation of a CKR maps local contexts to description logic interpretations over the object vocabulary, respecting references between contexts.
The document describes a Contextualized Knowledge Repository (CKR) framework for representing and reasoning with contextual knowledge on the Semantic Web. It discusses the need to make context explicit in the Semantic Web in order to represent knowledge that holds in specific contextual spaces like time, location, or topic. The CKR is presented as a formalism based on description logics that defines contexts as first-class objects and allows associating knowledge with contexts. It describes a prototype CKR implementation and outlines how a CKR could be used to represent open data about the Trentino region with contextual metadata.
More Related Content
Similar to Access Control for RDF graphs using Abstract Models
This document discusses leveraging crowdsourcing techniques and consistency constraints to optimize the reconciliation of schema matching networks. It proposes:
1) Defining consistency constraints within schema matching networks and designing validation questions for crowdsourced workers.
2) Using consistency constraints to reduce reconciliation error rates and the monetary cost of asking additional validation questions.
3) Modeling a crowdsourcing process for schema matching networks that aims to minimize cost while maximizing accuracy through the application of consistency constraints.
This document discusses privacy-preserving schema reuse. It introduces the challenges of defining privacy constraints, generating an anonymized schema from multiple schemas while satisfying privacy constraints, defining a utility function for anonymized schemas, and solving the optimization problem of finding the anonymized schema with the highest utility that satisfies all privacy constraints. Experimental results demonstrate the trade-off between privacy enforcement and utility loss. The solution presents an approach for generating anonymized schemas from multiple schemas in a privacy-preserving manner.
Authors: Nguyen Quoc Viet Hung (1), Nguyen Thanh Tam (1), Zoltán Miklós (2), Karl Aberer (1), Avigdor Gal (3), and Matthias Weidlich (4)
1 École Polytechnique Fédérale de Lausanne
2 Université de Rennes 1
3 Technion – Israel Institute of Technology
4 Imperial College London
This document summarizes a demo of using SPARQLstream and Morphstreams to visualize transport data from Madrid's public transport company (EMT) in a tablet application. Static EMT data like bus stop locations are extracted and mapped to RDF, while live bus waiting time data streams are transformed and queried in real-time. This allows a Map4RDF iOS app to retrieve bus stop information and lookup estimated arrival times using SPARQL and SPARQLstream queries. The demo illustrates how standards like SSN and R2RML can integrate static and streaming sensor data for web-based applications.
The document discusses the need for a W3C community group on RDF stream processing. It notes there is currently heterogeneity in RDF stream models, query languages, implementations, and operational semantics. The speaker proposes creating a W3C community group to better understand these differences, requirements, and potentially develop recommendations. The group's mission would be to define common models for producing, transmitting, and continuously querying RDF streams. The presentation provides examples of use cases and outlines a template for describing them to collect more cases to understand requirements.
by Irene Celino, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Thorsten Krüger
CEFRIEL – Politecnico di Milano – SIEMENS
This document describes SciQL, a language that bridges the gap between science and relational database management systems (DBMS). SciQL allows for the seamless integration of relational and array paradigms within DBMSs. It defines arrays and tables as first-class citizens and supports named dimensions, flexible structure-based grouping, and the distinction between arrays and tables. SciQL aims to lower the barrier for scientists to use DBMSs for array-based data while revealing new optimization opportunities for databases.
by G. Larkou, J. Metochi, G. Chatzimilioudis and D. Zeinalipour-Yazti
Presented at: 1st IEEE International Workshop on Mobile Data Management Mining and Computing on Social Networks, collocated with IEEE MDM'13
This document summarizes research on implementing defeasible logic, a non-monotonic reasoning method, in a distributed manner using the MapReduce framework. Defeasible logic allows commonsense reasoning over low-quality data and has low computational complexity. However, existing implementations did not scale to huge datasets. The researchers developed a multi-argument MapReduce implementation of defeasible logic that distributes the reasoning process. Experimental evaluation on large datasets showed this approach provides scalable defeasible reasoning over distributed data. Future work will address challenges with non-stratified rulesets and test the approach on additional real-world applications and knowledge representation methods.
This document discusses data and knowledge evolution on the semantic web. It begins by explaining the limitations of the current web in representing semantic content and introduces the semantic web as a way to give data well-defined meaning. It then discusses how ontologies and datasets are used to describe semantic data and how datasets are dynamic and change over time. It also introduces linked open data as a way to interconnect datasets and the challenges this presents. Finally, it outlines the scope of the talk, which is to survey research areas related to managing dynamic linked datasets, including remote change management, repair, and data/knowledge evolution.
This document discusses evolving workflow provenance information in the presence of custom inference rules. It presents three inference rules for provenance data, including that actors are associated with all subactivities of an activity, that objects and their parts are used together, and that information objects are present where the physical objects carrying them are. It examines handling updates to provenance knowledge bases under these rules, either by deleting all inferred facts or only as needed, and considers the complexity of the different approaches.
Here are a few ways SciQL could help with this seismology use case:
1. The mseed array allows storing and querying the large seismic data in an efficient columnar format.
2. Window-based aggregation with dimensional grouping enables filtering signals by station/LTA ratios over time windows.
3. Views and queries on dimensional groups facilitate removing false positives by comparing signals across nearby stations over time.
4. Further window-based grouping and UDFs can extract signal windows for additional heuristic analysis.
By integrating the array and relational models, SciQL provides a declarative way to analyze large multidimensional scientific datasets like seismic signals interactively.
This talk was given by FORTH, Greece, at the European Data Forum (EDF) 2012, which took place on June 6-7, 2012 in Copenhagen (Denmark) at the Copenhagen Business School (CBS).
Abstract:
Given the increasing amount of sensitive RDF data available on the Web, it becomes increasingly critical to guarantee secure access to this content. Access control is complicated when RDFS inference rules and other dependencies between access permissions of triples need to be considered; this is necessary, e.g., when we want to associate the access permissions of inferred triples with the ones that implied them. In this paper we advocate the use of abstract provenance models, defined by means of abstract tokens and operators, to support fine-grained access control for RDF graphs. The access label of a triple is a complex expression that encodes how said label was produced (i.e., the triples that contributed to its computation). This feature allows us to know exactly the effects of any possible change, thereby avoiding a complete recomputation of the labels when a change occurs. In addition, the same application can choose to enforce different access control policies, or different applications can enforce different policies on the same data, without recomputing the label of a triple. Preliminary experiments have shown the applicability and benefits of our approach.
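As a rough illustration of this idea (a sketch of mine, not the formal model from the paper: the token names, the And operator, and the two policies below are invented), an access label can be stored as an unevaluated expression over abstract tokens and only evaluated once a concrete policy is chosen:

```python
# Minimal sketch (not the paper's actual formalism): keep the access label of an
# inferred triple as an abstract expression over the tokens of the triples that
# implied it, and evaluate it only once a concrete policy is chosen.
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:              # abstract access token attached to an explicit triple
    name: str

@dataclass(frozen=True)
class And:                # abstract operator combining the labels of the premises
    left: object
    right: object

def evaluate(expr, assignment, combine):
    """Evaluate an abstract label under a concrete policy.

    assignment: maps token names to concrete labels, e.g. "accessible"/"inaccessible"
    combine:    concrete meaning of the abstract operator
    """
    if isinstance(expr, Token):
        return assignment[expr.name]
    return combine(evaluate(expr.left, assignment, combine),
                   evaluate(expr.right, assignment, combine))

# An inferred triple obtained from two explicit triples labelled t1 and t2
# stores the unevaluated expression And(t1, t2) as its label.
label_of_inferred = And(Token("t1"), Token("t2"))

conservative = lambda a, b: "accessible" if a == b == "accessible" else "inaccessible"
liberal      = lambda a, b: "accessible" if "accessible" in (a, b) else "inaccessible"

assignment = {"t1": "accessible", "t2": "inaccessible"}
print(evaluate(label_of_inferred, assignment, conservative))  # inaccessible
print(evaluate(label_of_inferred, assignment, liberal))       # accessible
# Changing the policy, or one token's assignment, does not require recomputing
# which triples implied which: the stored expression already encodes that.
```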
This talk was given at the 13th International Conference on Principles of Knowledge Representation and Reasoning (KR 2012), held in Rome, Italy, June 10-14, 2012, by Ilias Tahmazidis (FORTH).
Abstract:
We are witnessing an explosion of available data from the Web, government authorities, scientific databases, sensors and more. Such datasets could benefit from the introduction of rule sets encoding commonly accepted rules or facts, application- or domain-specific rules, commonsense knowledge etc. This raises the question of whether, how, and to what extent knowledge representation methods are capable of handling the vast amounts of data for these applications. In this paper, we consider nonmonotonic reasoning, which has traditionally focused on rich knowledge structures. In particular, we consider defeasible logic, and analyze how parallelization, using the MapReduce framework, can be used to reason with defeasible rules over huge data sets. Our experimental results demonstrate that defeasible reasoning over billions of facts is performant, and has the potential to scale to trillions of facts.
The presentation was delivered during the 1st International Conference on Health Information Science (HIS 2012) on April 9th, 2012 in Beijing, China.
Abstract:
In cytomics bookkeeping of the data generated during lab experiments is crucial. The current approach in cytomics is to conduct High-Throughput Screening (HTS) experiments so that cells can be tested under many different experimental conditions. Given the large amount of different conditions and the readout of the conditions through images, it is clear that the HTS approach requires a proper data management system to reduce the time needed for experiments and the chance of man-made errors. As different types of data exist, the experimental conditions need to be linked to the images produced by the HTS experiments with their metadata and the results of further analysis. Moreover, HTS experiments never stand by themselves, as more experiments are lined up, the amount of data and computations needed to analyze these increases rapidly. To that end cytomic experiments call for automated and systematic solutions that provide convenient and robust features for scientists to manage and analyze their data. In this paper, we propose a platform for managing and analyzing HTS images resulting from cytomics screens taking the automated HTS workflow as a starting point. This platform seamlessly integrates the whole HTS workflow into a single system. The platform relies on a modern relational database system to store user data and process user requests, while providing a convenient web interface to end-users. By implementing this platform, the overall workload of HTS experiments, from experiment design to data analysis, is reduced significantly. Additionally, the platform provides the potential for data integration to accomplish genotype-to-phenotype modeling studies.
The talk was given at the 15th International Conference on Extending Database Technology (EDBT 2012) on March 29, 2012 in Berlin, Germany.
Abstract:
Query optimization in RDF stores is a challenging problem, as SPARQL queries typically contain many more joins than equivalent relational plans and hence lead to a large join-order search space. In such cases, cost-based query optimization often is not possible. One practical reason for this is that statistics are typically missing in web-scale settings such as the Linked Open Datasets (LOD). The more profound reason is that, due to the absence of schematic structure in RDF, join-hit ratio estimation requires complicated forms of correlated join statistics, and currently there are no methods to identify the relevant correlations beforehand. For this reason, the use of good heuristics is essential in SPARQL query optimization, even when they are partially combined with cost-based statistics (i.e., hybrid query optimization). In this paper we describe a set of useful heuristics for SPARQL query optimizers. We present these in the context of a new Heuristic SPARQL Planner (HSP) that is capable of exploiting the syntactic and structural variations of the triple patterns in a SPARQL query in order to choose an execution plan without the need for any cost model. For this, we define the variable graph and show a reduction of the SPARQL query optimization problem to the maximum weight independent set problem. We implemented our planner on top of the MonetDB open-source column store and evaluated its effectiveness against the state-of-the-art RDF-3X engine, as well as comparing plan quality with a relational (SQL) equivalent of the benchmarks.
Access Control for RDF graphs using Abstract Models
1. ACCESS CONTROL FOR RDF GRAPHS USING ABSTRACT MODELS
Vassilis Papakonstantinou (papv@ics.forth.gr)
Joint work with: Maria Michou, Irini Fundulaki, Giorgos Flouris, Grigoris Antoniou
SACMAT 2012, June 20-22, 2012
2. MOTIVATION
Why RDF Data?
RDF is the de-facto standard for publishing data in the Linked Open Data Cloud
- E-Science (astronomy, life sciences, earth sciences)
- Public Government Data (US, UK, The Netherlands, …)
- Social Networks
- DBPedia, CIA World FactBook, …
Why Access Control?
Crucial for sensitive content since it ensures the selective exposure of information to different classes of users
3. MAIN CONTRIBUTIONS
Fine-grained Access Control Model for RDF
- defined at the level of RDF triples
- focus on read-only permissions
- with support for RDFS inference to infer new knowledge
- encodes how an access label has been computed
Supports dynamic datasets
Supports dynamic access control policies
Implementation and experiments on top of MonetDB and PostgreSQL
4. OUTLINE
Preliminaries: RDF and RDF Schema
Current models: Access Control Annotations
Our approach: Abstract Access Control Models
Implementation
Experiments
5. RESOURCE DESCRIPTION FRAMEWORK (RDF)
General-purpose language for representing information in the Semantic Web
Information is represented using triples (s, p, o) [subject, predicate, object]
- s, p, o: URIs or literals
Example: (&a, firstName, “Alice”)
- &a [subject]: the entity being described
- firstName [predicate]: a property of the entity (the first name)
- “Alice” [object]: the value of the predicate
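As an illustration only (not from the original slides), a minimal Python sketch of triples as plain tuples:

```python
# A triple is just (subject, predicate, object); "&a" stands for a resource URI,
# as in the slide's example.
graph = {
    ("&a", "firstName", "Alice"),
    ("&a", "type", "Student"),
}

# List every property of the resource &a.
for s, p, o in graph:
    if s == "&a":
        print(p, "=", o)
```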
6. RDF SCHEMA
RDF Schema is a Vocabulary Description Language
Used to define the vocabulary used in an RDF graph (Class, Property, subClassOf, subPropertyOf, domain, range)
Semantics add simple reasoning capabilities
- e.g. inference rules for subClass or subProperty relations
(sc = rdfs:subClassOf)
[Figure: example class hierarchy Student sc Person sc Agent]
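A hedged Python sketch (my own, not from the slides) of the subClassOf and type inference rules just mentioned, run to a fixpoint over the example hierarchy:

```python
sc = "sc"   # shorthand for rdfs:subClassOf, as in the slides
triples = {
    ("Student", sc, "Person"),
    ("Person", sc, "Agent"),
    ("&a", "type", "Student"),
}

def rdfs_closure(triples):
    """Apply subClassOf transitivity and type propagation until nothing new is derived."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, p1, b) in closure:
            for (c, p2, d) in closure:
                if p1 == sc and p2 == sc and b == c:        # (A sc B), (B sc C) => (A sc C)
                    new.add((a, sc, d))
                if p1 == "type" and p2 == sc and b == c:    # (x type A), (A sc B) => (x type B)
                    new.add((a, "type", d))
        if not new <= closure:
            closure |= new
            changed = True
    return closure

print(rdfs_closure(triples))   # includes ('Student', 'sc', 'Agent') and ('&a', 'type', 'Agent')
```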
7. CURRENT MODELS: ACCESS CONTROL ANNOTATIONS
Access control is provided at the level of RDF triples, represented by RDF quadruples (s, p, o, l):

subject | predicate | object | label
Student | sc        | Person | Accessible
Person  | sc        | Agent  | Inaccessible

For implied triples, the semantics are applied directly to give them labels:

subject | predicate | object | label
Student | sc        | Agent  | Acc. ∧ Inacc. = Inaccessible
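A small sketch of the annotation idea above: each quadruple carries a concrete label, and an implied triple receives the conjunction of the labels of the triples that produced it (the conjunctive combination follows the slide's Acc. ∧ Inacc. example):

```python
# Sketch: concrete annotations as quadruples (s, p, o, label).
ACCESSIBLE, INACCESSIBLE = "Accessible", "Inaccessible"

quads = [
    ("Student", "sc", "Person", ACCESSIBLE),
    ("Person",  "sc", "Agent",  INACCESSIBLE),
]

def combine(l1, l2):
    # The implied triple is accessible only if both premises are accessible.
    return ACCESSIBLE if l1 == ACCESSIBLE and l2 == ACCESSIBLE else INACCESSIBLE

# (Student sc Person), (Person sc Agent) => (Student sc Agent)
s1, s2 = quads
implied = (s1[0], "sc", s2[2], combine(s1[3], s2[3]))
print(implied)   # ('Student', 'sc', 'Agent', 'Inaccessible')
```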
8. PROBLEMS OF ACCESS CONTROL ANNOTATIONS
Easy, but not amenable to changes
If the access label of one triple changes, it has cascading effects on the implied labels of other triples
- Cannot know which labels/triples are affected
Re-computation of access labels is necessary (for the entire dataset):
- if the access label of one triple changes
- if a triple is deleted, modified or added
- if the semantics according to which labels of inferred triples are computed change
- if the policy changes (e.g. a liberal policy becomes conservative)
9. OUR APPROACH: ABSTRACT ACCESS CONTROL MODELS
An Abstract Access Control Model is defined by a set of abstract tokens and abstract operators that model
- the computation of access labels of implied RDF triples
- the propagation of access labels
Access Control Authorizations associate triples in the RDF/S graph with abstract tokens, yielding quadruples
RDFS inference rules compute the access labels of implied quadruples
Propagation rules specify how access labels are propagated along the subClassOf and subPropertyOf relations
10. ABSTRACT ACCESS CONTROL MODELS
Abstract Access Control Model defined by a set of abstract tokens and abstract operators
⊙: binary operator over access tokens to model RDFS inference
- computes the label of implied RDF triples for the subClassOf/subPropertyOf and type hierarchies
Example rule:
(A1, sc, A2, l1), (A2, sc, A3, l2) ⇒ (A1, sc, A3, l1 ⊙ l2)
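A possible encoding (my own, not the paper's) that keeps l1 ⊙ l2 as an unevaluated, symbolic expression rather than a concrete value:

```python
# Abstract labels: a token is a string; a compound label is a nested tuple
# ("⊙", left, right) that records *how* the label was computed.
def infer_label(l1, l2):
    """Label of an implied triple: l1 ⊙ l2, kept symbolic."""
    return ("⊙", l1, l2)

# (A1, sc, A2, at1) and (A2, sc, A3, at2)  =>  (A1, sc, A3, at1 ⊙ at2)
print(infer_label("at1", "at2"))   # ('⊙', 'at1', 'at2')
```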
11. ABSTRACT ACCESS CONTROL MODELS
Abstract Access Control Model defined by a set of abstract tokens and abstract operators
⊗: unary operator over multi-sets of access tokens to model the propagation of access labels
- propagates the access labels along the subclass/subproperty and type hierarchies
- the subclasses of a class inherit the label of their superclass, the instances of a class inherit the label of that class, etc.
Example rule:
(A1, type, class, l1), (A2, sc, A1, l2), (A2, type, class, l3) ⇒ (A2, type, class, ⊗(l1))
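The propagation operator can be added to the same symbolic encoding; again, this is an illustrative sketch rather than the paper's implementation:

```python
# Propagation wraps the propagated label(s) with ⊗, kept symbolic.
def propagate_label(labels):
    """⊗ over a (multi)set of labels."""
    return ("⊗", tuple(labels))

# (A1, type, class, l1), (A2, sc, A1, l2), (A2, type, class, l3)
#   =>  an extra label ⊗(l1) for (A2, type, class)
print(propagate_label(["at3"]))   # ('⊗', ('at3',))
```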
12. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (1/3)
Apply the authorizations: we go from triples to quadruples

Authorizations (query, access token):
A1: (construct {?x sc ?y}, at1)
A2: (construct {?x type Student}, at2)
A3: (construct {?x type class}, at3)
A4: (construct {?x ?p Person}, at4)

Triples:
id | s       | p        | o
t1 | Student | sc       | Person
t2 | Person  | sc       | Agent
t3 | &a      | type     | Student
t4 | &a      | lastName | “Smith”
t5 | Agent   | type     | Class

Resulting quadruples:
id | s       | p        | o       | l
q1 | Student | sc       | Person  | at1
q2 | Person  | sc       | Agent   | at1
q3 | &a      | type     | Student | at2
q4 | &a      | lastName | “Smith” | ⊥
q5 | Agent   | type     | Class   | at3
q6 | Student | sc       | Person  | at4
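A minimal sketch of this annotation step, with simple wildcard patterns standing in for the SPARQL construct queries (the pattern syntax and helper names are my own):

```python
# Authorizations as (pattern, token); "?" matches anything, mirroring ?x / ?y / ?p.
authorizations = [
    (("?", "sc", "?"),         "at1"),  # A1
    (("?", "type", "Student"), "at2"),  # A2
    (("?", "type", "class"),   "at3"),  # A3
    (("?", "?", "Person"),     "at4"),  # A4
]

def matches(pattern, triple):
    return all(p == "?" or p == t for p, t in zip(pattern, triple))

def annotate(triples):
    """Turn triples into quadruples; unmatched triples get the bottom token ⊥."""
    quads = []
    for triple in triples:
        tokens = [tok for pat, tok in authorizations if matches(pat, triple)]
        for tok in tokens or ["⊥"]:
            quads.append((*triple, tok))
    return quads

triples = [("Student", "sc", "Person"), ("&a", "lastName", "Smith")]
print(annotate(triples))
# [('Student', 'sc', 'Person', 'at1'), ('Student', 'sc', 'Person', 'at4'),
#  ('&a', 'lastName', 'Smith', '⊥')]
```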
13. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (2/3)
Apply the RDFS inference rules; new quadruples are produced

R1: (A, sc, B, l1), (B, sc, C, l2) ⇒ (A, sc, C, l1 ⊙ l2)
R2: (x, type, A, l1), (A, sc, B, l2) ⇒ (x, type, B, l1 ⊙ l2)

id  | s       | p    | o       | l
q1  | Student | sc   | Person  | at1
q2  | Person  | sc   | Agent   | at1
q3  | &a      | type | Student | at2
q6  | Student | sc   | Person  | at4
…
q7  | Student | sc   | Agent   | q1 ⊙ q2
q8  | Student | sc   | Agent   | q6 ⊙ q2
q9  | &a      | type | Person  | q3 ⊙ q1
q10 | &a      | type | Agent   | q3 ⊙ (q1 ⊙ q2)
…
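A sketch of one pass of rules R1 and R2 over the annotated quadruples, reusing the nested-tuple encoding of abstract expressions from the earlier sketches:

```python
# One pass of R1 and R2 (a fixpoint loop would repeat this until nothing new appears).
def apply_inference_rules(quads):
    derived = []
    for (a, p1, b, l1) in quads:
        for (c, p2, d, l2) in quads:
            if b != c or p2 != "sc":
                continue
            if p1 == "sc":      # R1: (A sc B, l1), (B sc C, l2) => (A sc C, l1 ⊙ l2)
                derived.append((a, "sc", d, ("⊙", l1, l2)))
            elif p1 == "type":  # R2: (x type A, l1), (A sc B, l2) => (x type B, l1 ⊙ l2)
                derived.append((a, "type", d, ("⊙", l1, l2)))
    return derived

quads = [
    ("Student", "sc", "Person", "at1"),     # q1
    ("Person",  "sc", "Agent",  "at1"),     # q2
    ("&a",      "type", "Student", "at2"),  # q3
]
for q in apply_inference_rules(quads):
    print(q)
# ('Student', 'sc', 'Agent', ('⊙', 'at1', 'at1'))   <- q7
# ('&a', 'type', 'Person', ('⊙', 'at2', 'at1'))     <- q9
```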
14. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (3/3)
Apply the propagation rules: new labels are added to existing triples
e.g. classes propagate labels to their instances and subclasses

R5: (B, type, class, l1), (A, sc, B, l2), (A, type, class, l3) ⇒ (A, type, class, ⊗l1)
R6: (A, type, class, l1), (x, type, A, l2) ⇒ (x, type, A, ⊗l1)

id  | s     | p    | o     | l
q5  | Agent | type | Class | at3
q10 | &a    | type | Agent | q3 ⊙ (q1 ⊙ q2)
…
q11 | &a    | type | Agent | ⊗q5
…
15. EVALUATION - DETERMINING ACCESSIBILITY
We have to define:
- a set of concrete tokens and a mapping from abstract to concrete tokens
- a set of concrete operators that implement the abstract ones
- a conflict resolution operator to resolve ambiguous labels
- an access function to decide when a triple is accessible
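For illustration, these four ingredients can be bundled into a small Python object; the concrete values below anticipate the boolean policy CP1 of the next slides, and the dict packaging (including the handling of ⊥) is my own assumption:

```python
# A concrete policy: concrete tokens, a token mapping, concrete operators,
# a conflict-resolution operator, and an access function.
CP1 = {
    "tokens": {True, False},
    "mapping": {"at1": True, "at2": True, "at3": False, "at4": False, "⊥": False},
    "infer_op": lambda l1, l2: l1 and l2,      # concrete ⊙ : boolean AND
    "prop_op": lambda labels: labels[0],       # concrete ⊗ : identity on a single label (IDL)
    "resolve_op": lambda labels: all(labels),  # concrete ⊕ : AND over ambiguous labels
    "is_accessible": lambda label: label is True,
}
```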
17. EVALUATION FOR CP1 (1/3) - COMPUTE LABELS
Concrete policy CP1:
- concrete tokens LP = {true, false}
- ⊙ implemented as ∧
- ⊗ implemented as IDL
- ⊕ (conflict resolution) implemented as ∧
Map abstract tokens to concrete ones: at1, at2 → true; at3, at4 → false

id  | s       | p    | o      | abstract label | concrete label
q1  | Student | sc   | Person | at1            | true
q2  | Person  | sc   | Agent  | at1            | true
q5  | Agent   | type | Class  | at3            | false
q6  | Student | sc   | Person | at4            | false
q7  | Student | sc   | Agent  | q1 ⊙ q2        | true ∧ true = true
q8  | Student | sc   | Agent  | q6 ⊙ q2        | false ∧ true = false
q11 | &a      | type | Agent  | ⊗q5            | false
18. EVALUATION FOR CP1 (2/3) - AMBIGUOUS LABELS REMOVAL
Back from quadruples to triples:

subject | predicate | object | label
Student | sc        | Person | true
Student | sc        | Person | false

After conflict resolution (⊕ = ∧):
subject | predicate | object | label
Student | sc        | Person | true ∧ false = false
19. EVALUATION FOR CP1 (3/3) - DETERMINING ACCESSIBILITY
The essence of access control:

subject | predicate | object  | label | accessibility
Student | sc        | Person  | false | Inaccessible
Person  | sc        | Agent   | true  | Accessible
&a      | type      | Student | true  | Accessible
&a      | lastName  | “Smith” | false | Inaccessible
Agent   | type      | Class   | false | Inaccessible
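Tying the three evaluation steps together, a hedged end-to-end sketch that evaluates abstract expressions under CP1, resolves ambiguous labels per triple, and applies the access function (CP1 is repeated here so the block runs on its own):

```python
# CP1 as sketched earlier: ⊙ -> AND, ⊗ -> identity, ⊕ -> AND, accessible iff True.
CP1 = {
    "mapping": {"at1": True, "at2": True, "at3": False, "at4": False, "⊥": False},
    "infer_op": lambda a, b: a and b,
    "prop_op": lambda labels: labels[0],
    "resolve_op": lambda labels: all(labels),
    "is_accessible": lambda label: label is True,
}

def evaluate(expr, policy):
    """Recursively turn an abstract expression into a concrete label under a policy."""
    if isinstance(expr, str):                 # an abstract token such as 'at1'
        return policy["mapping"][expr]
    op, *args = expr
    if op == "⊙":
        return policy["infer_op"](evaluate(args[0], policy), evaluate(args[1], policy))
    if op == "⊗":
        return policy["prop_op"]([evaluate(a, policy) for a in args[0]])
    raise ValueError(f"unknown operator {op}")

def accessible_triples(quads, policy):
    """Evaluate every quadruple, resolve conflicts per triple, apply the access function."""
    per_triple = {}
    for (s, p, o, expr) in quads:
        per_triple.setdefault((s, p, o), []).append(evaluate(expr, policy))
    return {
        t: policy["is_accessible"](policy["resolve_op"](labels))
        for t, labels in per_triple.items()
    }

quads = [
    ("Student", "sc", "Person", "at1"),        # q1
    ("Student", "sc", "Person", "at4"),        # q6 -> conflict on the same triple
    ("&a", "type", "Agent", ("⊗", ("at3",))),  # q11
]
print(accessible_triples(quads, CP1))
# {('Student', 'sc', 'Person'): False, ('&a', 'type', 'Agent'): False}
```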
20. PROS & CONS OF ABSTRACT ACCESS CONTROL MODELS
Pros:
- The same application can experiment with different concrete policies over the same dataset (e.g. liberal vs conservative policies for different classes of users)
- Different applications can experiment with different concrete policies for the same data
- In the case of updates there is no need to re-compute the inferred triples
Cons:
- Overhead in the required storage space
- Algebraic expressions can become complex depending on the structure of the dataset
21. IMPLEMENTATION
Used a relational schema to store quadruples and their labels (including abstract expressions)
Annotation and evaluation are performed through the stored procedure mechanisms of:
- MonetDB
- PostgreSQL
22. EXPERIMENTS
Experiment 1: annotation time (the time required to compute the inferred triples with their labels and the propagated labels)
Experiment 2: evaluation time (a) (the time needed to compute, for a concrete policy, the concrete access labels of all RDF triples)
Experiment 3: evaluation time (b) (the time needed to compute, for a concrete policy, the concrete access label of a percentage of the RDF triples)
Datasets:
- Synthetic: schemas produced with PowerGen
- Real: CIDOC, GO
23. EXPERIMENTAL RESULTS - ANNOTATION TIME – MONETDB (SYNTHETIC)
Annotation time increases as the number of implied triples increases
Plunges are due to changes in the structure of the ontology (reduction of the depth)
152 synthetic ontologies:
- 100-1000 classes, 113-1635 properties, 124-50295 class instances and 110-1321 property instances
- different depths for the sc and sp hierarchies (from 4 to 8)
24. EXPERIMENTAL RESULTS - EVALUATION TIME (FULL)
Evaluation time increases linearly as the number of total triples increases
MonetDB outperforms PostgreSQL
Some of the synthetic datasets could not be evaluated
25. EXPERIMENTAL RESULTS - EVALUATION TIME (DATASET PERCENTAGE) - MONETDB
Evaluation time for the largest dataset that was evaluated successfully in Experiment 2
Similar conclusions as in Experiment 2
26. EXPERIMENTAL RESULTS - REAL DATASETS
CIDOC
- Annotation time: MonetDB 69ms, PostgreSQL 4000ms
- Evaluation time (full): MonetDB – CP1: 7775ms, MonetDB – CP2: 3923ms
GO
- Annotation time: MonetDB 32s, PostgreSQL 844s
- Evaluation time (full): exceeded our set timeout
27. CONCLUSIONS
Proposed a new paradigm based on abstract models and operators
Advantages:
- flexibility and easy adaptation to change (no re-computation necessary)
- easy experimentation with different access control policies
Disadvantages:
- increased space requirements
- overhead at query time (for evaluation)
Suitable for dynamic datasets
29. EXPERIMENTAL RESULTS - ANNOTATION TIME – POSTGRESQL (SYNTHETIC)
Annotation time increases as the number of implied triples increases
One plunge is due to a change in the structure of the ontology (reduction of the depth)
Up to 1000 classes, 1635 properties, 50167 class instances and 95 property instances before reaching the timeout
30. IMPLEMENTATION
Used a relational schema to store quadruples
Quad(qid, s, p, o, propop, inferop, label)
- inferop, propop: boolean values indicating whether the label is obtained through inference or propagation, respectively
LabelStore(qid, qid_uses)
- stores the access label of a triple
- qid: the quadruple whose label is stored
- qid_uses: the explicit quadruple’s qid through which qid was produced
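A hedged sketch of this schema using SQLite from Python; the paper used MonetDB and PostgreSQL with stored procedures, so the column types and the in-memory setup below are assumptions for illustration only:

```python
import sqlite3

# In-memory stand-in for the relational schema described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quad (
    qid      TEXT PRIMARY KEY,
    s        TEXT, p TEXT, o TEXT,
    inferop  BOOLEAN,        -- label obtained through inference
    propop   BOOLEAN,        -- label obtained through propagation
    label    TEXT            -- explicit abstract token, NULL for derived quadruples
);
CREATE TABLE LabelStore (
    qid      TEXT,           -- quadruple whose label is stored
    qid_uses TEXT            -- explicit quadruple used to produce it
);
""")

# A couple of rows from the motivating example.
conn.executemany("INSERT INTO Quad VALUES (?,?,?,?,?,?,?)", [
    ("q1", "Student", "sc", "Person", 0, 0, "at1"),
    ("q7", "Student", "sc", "Agent",  1, 0, None),
])
conn.executemany("INSERT INTO LabelStore VALUES (?,?)", [("q7", "q1"), ("q7", "q2")])

# Reconstruct which explicit quadruples q7's label depends on.
print(conn.execute("SELECT qid_uses FROM LabelStore WHERE qid = 'q7'").fetchall())
# [('q1',), ('q2',)]
```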
31. IMPLEMENTATION
Quadruples (motivating example):
id  | s       | p    | o       | l
q1  | Student | sc   | Person  | at1
q2  | Person  | sc   | Agent   | at1
q3  | &a      | type | Student | at2
q5  | Agent   | type | Class   | at3
q6  | Student | sc   | Person  | at4
q7  | Student | sc   | Agent   | at1 ⊙ at1
q10 | &a      | type | Agent   | at2 ⊙ (at1 ⊙ at1)
q11 | &a      | type | Agent   | ⊗at3

Quad(qid, s, p, o, propop, inferop, label):
qid | s       | p    | o       | iop | pop | label
q1  | Student | sc   | Person  | f   | f   | at1
q2  | Person  | sc   | Agent   | f   | f   | at1
q3  | &a      | type | Student | f   | f   | at2
q5  | Agent   | type | Class   | f   | f   | at3
q6  | Student | sc   | Person  | f   | f   | at4
q7  | Student | sc   | Agent   | t   | f   | null
q9  | &a      | type | Agent   | t   | f   | null
q10 | Person  | sc   | Agent   | f   | t   | null
32. IMPLEMENTATION
Quadruples (motivating example):
id  | s       | p    | o       | l
q1  | Student | sc   | Person  | at1
q2  | Person  | sc   | Agent   | at1
q3  | &a      | type | Student | at2
q5  | Agent   | type | Class   | at3
q6  | Student | sc   | Person  | at4
q7  | Student | sc   | Agent   | at1 ⊙ at1
q10 | &a      | type | Agent   | at2 ⊙ (at1 ⊙ at1)
q11 | &a      | type | Agent   | ⊗at3

LabelStore(qid, qid_uses):
qid | qid_uses
q7  | q1
q7  | q2
q10 | q3
q10 | q1
q10 | q2
q11 | q5