Access Control for RDF graphs using Abstract Models

This paper was presented by Vassilis Papakonstantinou at the 17th ACM Symposium on Access Control Models and Technologies (ACM SACMAT 2012) in Newark, USA, June 20 - 22, 2012.

Abstract:
The Resource Description Framework (RDF) has become the de facto standard for representing information in the Semantic Web. Given the increasing amount of sensitive RDF data available on the Web, it becomes increasingly critical to guarantee secure access to this content. In this paper we advocate the use of an abstract access control model to ensure the selective exposure of RDF information. The model is defined by a set of abstract operators. Tokens are used to label RDF triples with access information. Abstract operators model RDF Schema inference rules and the propagation of labels along the RDF Schema (RDFS) class and property hierarchies. In this way, the access label of a triple is a complex expression that involves the labels of the triples and the operators applied to obtain said label. Different applications can then adopt different concrete access policies that encode an assignment of the abstract tokens and operators to concrete (specific) values. Following this approach, changes in the interpretation of abstract tokens and operators can be easily implemented, resulting in a very flexible mechanism that allows one to easily experiment with different concrete access policies (defined per context or user). To demonstrate the feasibility of the approach, we implemented our ideas on top of the MonetDB and PostgreSQL open source database systems. We conducted an initial set of experiments which showed that the overhead for using abstract expressions is roughly linear in the number of triples considered; performance is also affected by the characteristics of the dataset, such as the size and depth of class and property hierarchies, as well as by the considered concrete policy.

  1. ACCESS CONTROL FOR RDF GRAPHS USING ABSTRACT MODELS
     Vassilis Papakonstantinou (papv@ics.forth.gr)
     Joint work with: Maria Michou, Irini Fundulaki, Giorgos Flouris, Grigoris Antoniou
     SACMAT 2012
  2. MOTIVATION
     Why RDF data?
       • RDF is the de facto standard for publishing data in the Linked Open Data Cloud
           • E-Science (astronomy, life sciences, earth sciences)
           • Public Government Data (US, UK, The Netherlands, …)
           • Social Networks
           • DBPedia, CIA World FactBook, …
     Why access control?
       • Crucial for sensitive content, since it ensures the selective exposure of information to different classes of users
  3. MAIN CONTRIBUTIONS
     Fine-grained access control model for RDF
       • defined at the level of RDF triples
       • focus on read-only permissions
       • with support for RDFS inference to infer new knowledge
       • encodes how an access label has been computed
     Supports dynamic datasets
     Supports dynamic access control policies
     Implementation and experiments on top of MonetDB and PostgreSQL
  4. OUTLINE
       • Preliminaries: RDF and RDF Schema
       • Current models: Access Control Annotations
       • Our approach: Abstract Access Control Models
       • Implementation
       • Experiments
  5. RESOURCE DESCRIPTION FRAMEWORK (RDF)
     General-purpose language for representing information in the Semantic Web
     Information is represented using triples
       • (s, p, o) [subject, predicate, object]
       • s, p, o: URIs or literals
       • Example: (&a, firstName, “Alice”)
           &a          [subject]    an entity being described
           firstName   [predicate]  a property of the entity (first name)
           “Alice”     [object]     the value of the predicate (the first name)
  6. RDF SCHEMA
     RDF Schema is a Vocabulary Description Language
       • Used to define the vocabulary used in an RDF graph (Class, Property, subClassOf, subPropertyOf, domain, range)
     Semantics add simple reasoning capabilities
       • e.g. inference rules for subClass or subProperty relations
     [Figure: class hierarchy Student sc Person sc Agent, where sc denotes rdfs:subClassOf]
  7. CURRENT MODELS: ACCESS CONTROL ANNOTATIONS
     Access control is provided at the level of RDF triples, represented by RDF quadruples (s, p, o, l)
       subject   predicate   object   label
       Student   sc          Person   Accessible
       Person    sc          Agent    Inaccessible
     For implied triples, the semantics are applied directly to give them labels
       subject   predicate   object   label
       Student   sc          Agent    Acc. ∧ Inacc. = Inaccessible
  8. PROBLEMS OF ACCESS CONTROL ANNOTATIONS
     Easy, but not amenable to changes
       • If the access label of one triple changes, it has cascading effects on the implied labels of other triples
       • Cannot know which labels/triples are affected
       • Re-computation of access labels is necessary (for the entire dataset)
           • If the access label of one triple changes
           • If a triple is deleted, modified or added
           • If the semantics according to which labels of inferred triples are computed change
           • If the policy changes (e.g. a liberal policy becomes conservative)
  9. OUR APPROACH: ABSTRACT ACCESS CONTROL MODELS
     Abstract Access Control Model defined by a set of abstract tokens and abstract operators to model
       • computation of access labels of implied RDF triples
       • propagation of access labels
     Access Control Authorizations associate triples in the RDF/S graph with abstract tokens: quadruples
     RDFS inference rules for computing the access labels of implied quadruples
     Propagation rules to specify how access labels are propagated along the subClassOf and subPropertyOf relations
  10. ABSTRACT ACCESS CONTROL MODELS
      Abstract Access Control Model defined by a set of abstract tokens and abstract operators
        • ⊙: binary operator over access tokens to model RDFS inference
            • computes the label of implied RDF triples for the subClassOf/subPropertyOf and type hierarchies
      Rule:
        (A1, sc, A2, l1), (A2, sc, A3, l2)
        ⟹ (A1, sc, A3, l1 ⊙ l2)
  11. ABSTRACT ACCESS CONTROL MODELS
      Abstract Access Control Model defined by a set of abstract tokens and abstract operators
        • ⊗: unary operator over multi-sets of access tokens to model propagation of access labels
            • propagates the access labels along the subclass/subproperty and type hierarchies
            • the subclasses of a class inherit the label of their superclass, the instances of a class inherit the label of that class, etc.
      Rule:
        (A1, type, class, l1), (A2, sc, A1, l2), (A2, type, class, l3)
        ⟹ (A2, type, class, ⊗(l1))
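
One way to picture the two abstract operators is as constructors of symbolic label expressions that are stored as-is and only interpreted later. The following is a minimal sketch under that reading; the class names `Token`, `Infer` and `Propagate` and the helper `infer_subclass` are hypothetical and not taken from the authors' implementation, and ⊗ is simplified here to take a single label rather than a multi-set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Token:
    """An abstract access token such as at1 (hypothetical class name)."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Infer:
    """l1 ⊙ l2: label of a triple implied by two labelled triples."""
    left: "Label"
    right: "Label"

    def __str__(self) -> str:
        return f"({self.left} ⊙ {self.right})"


@dataclass(frozen=True)
class Propagate:
    """⊗(l): label propagated down a hierarchy (simplified to one operand)."""
    operand: "Label"

    def __str__(self) -> str:
        return f"⊗({self.operand})"


Label = Union[Token, Infer, Propagate]


def infer_subclass(q1, q2):
    """(A1, sc, A2, l1), (A2, sc, A3, l2) => (A1, sc, A3, l1 ⊙ l2)."""
    a1, p1, a2, l1 = q1
    b2, p2, a3, l2 = q2
    if p1 == p2 == "sc" and a2 == b2:
        return (a1, "sc", a3, Infer(l1, l2))
    return None  # the rule does not apply to this pair


implied = infer_subclass(
    ("Student", "sc", "Person", Token("at1")),
    ("Person", "sc", "Agent", Token("at1")),
)
print(implied[:3], implied[3])
```

Because the expression keeps its structure, nothing about the meaning of ⊙ or ⊗ is fixed at this point; that choice is deferred to a concrete policy.
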
  12. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (1/3)
      Apply authorizations
        • we are going from triples to quadruples
      Authorizations (query, access token)
        A1: (construct {?x sc ?y}, at1)
        A2: (construct {?x type Student}, at2)
        A3: (construct {?x type class}, at3)
        A4: (construct {?x ?p Person}, at4)
      Triples
        id   s         p          o
        t1   Student   sc         Person
        t2   Person    sc         Agent
        t3   &a        type       Student
        t4   &a        lastName   “Smith”
        t5   Agent     type       Class
      Quadruples
        id   s         p          o         l
        q1   Student   sc         Person    at1
        q2   Person    sc         Agent     at1
        q3   &a        type       Student   at2
        q4   &a        lastName   “Smith”   ⊥
        q5   Agent     type       Class     at3
        q6   Student   sc         Person    at4
  13. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (2/3)
      Apply RDFS inference rules
        • New quadruples produced
      R1: (A, sc, B, l1), (B, sc, C, l2) ⟹ (A, sc, C, l1 ⊙ l2)
      R2: (x, type, A, l1), (A, sc, B, l2) ⟹ (x, type, B, l1 ⊙ l2)
        id    s         p      o         l
        q1    Student   sc     Person    at1
        q2    Person    sc     Agent     at1
        q3    &a        type   Student   at2
        q6    Student   sc     Person    at4
        …
        q7    Student   sc     Agent     q1 ⊙ q2
        q8    Student   sc     Agent     q6 ⊙ q2
        q9    &a        type   Person    q3 ⊙ q1
        q10   &a        type   Agent     q3 ⊙ (q1 ⊙ q2)
        …
  14. ANNOTATION - DETERMINING THE ABSTRACT EXPRESSIONS (3/3)
      Apply propagation rules
        • Add new labels to existing triples
        • e.g. classes propagate labels to their instances and subclasses
      R5: (B, type, class, l1), (A, sc, B, l2), (A, type, class, l3) ⟹ (A, type, class, ⊗l1)
      R6: (A, type, class, l1), (x, type, A, l2) ⟹ (x, type, A, ⊗l1)
        id    s       p      o       l
        q5    Agent   type   Class   at3
        q10   &a      type   Agent   q3 ⊙ (q1 ⊙ q2)
        …
        q11   &a      type   Agent   ⊗q5
        …
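
The three annotation steps (authorizations, RDFS inference, propagation) can be sketched end to end on the motivating example. This is an illustrative sketch only, not the authors' stored procedures: authorizations are reduced to simple triple patterns with `None` as a wildcard, labels are plain strings built from tokens rather than quadruple ids, only rules R1, R2 and R6 are covered, and the class hierarchy is assumed to be acyclic (otherwise the fixpoint loop would not terminate). All function names are hypothetical.

```python
from itertools import product

TRIPLES = [
    ("Student", "sc", "Person"),
    ("Person", "sc", "Agent"),
    ("&a", "type", "Student"),
    ("&a", "lastName", "Smith"),
    ("Agent", "type", "Class"),
]

# Authorizations A1..A4 as (pattern, token); None plays the role of a variable.
AUTHORIZATIONS = [
    ((None, "sc", None), "at1"),
    ((None, "type", "Student"), "at2"),
    ((None, "type", "Class"), "at3"),
    ((None, None, "Person"), "at4"),
]
DEFAULT_TOKEN = "⊥"  # assigned when no authorization matches


def matches(pattern, triple):
    return all(p is None or p == t for p, t in zip(pattern, triple))


def annotate(triples, authorizations):
    """Step 1: turn triples into quadruples, one per matching authorization."""
    quads = []
    for triple in triples:
        tokens = [tok for pat, tok in authorizations if matches(pat, triple)]
        for tok in tokens or [DEFAULT_TOKEN]:
            quads.append((*triple, tok))
    return quads


def infer(quads):
    """Step 2: apply R1 (sc) and R2 (type) until no new quadruple appears."""
    known = set(quads)
    changed = True
    while changed:
        changed = False
        for (s1, p1, o1, l1), (s2, p2, o2, l2) in product(list(known), repeat=2):
            if p2 != "sc" or o1 != s2 or p1 not in ("sc", "type"):
                continue
            new = (s1, p1, o2, f"({l1} ⊙ {l2})")
            if new not in known:
                known.add(new)
                changed = True
    return known


def propagate(quads):
    """Step 3 (R6 only): instances receive the propagated label of their class."""
    class_labels = [(s, l) for s, p, o, l in quads if p == "type" and o == "Class"]
    extra = {
        (x, "type", cls, f"⊗{l1}")
        for cls, l1 in class_labels
        for x, p, o, _ in quads
        if p == "type" and o == cls
    }
    return set(quads) | extra


quads = propagate(infer(annotate(TRIPLES, AUTHORIZATIONS)))
for quad in sorted(quads):
    print(quad)
```

With tokens substituted for quadruple ids, the output contains the counterparts of q7 (`(at1 ⊙ at1)`), q10 (`(at2 ⊙ (at1 ⊙ at1))`) and q11 (`⊗at3`).
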
  15. EVALUATION - DETERMINING ACCESSIBILITY
      We have to define
        • Set of concrete tokens and a mapping from abstract to concrete tokens
        • Set of concrete operators that implement the abstract ones
        • Conflict resolution operator to resolve ambiguous labels
        • Access function to decide when a triple is accessible
  16. CONCRETE ACCESS CONTROL POLICY
      Example: Concrete Policy 1
        • Concrete tokens: LP = {true, false}
        • Inference operator ⊙: conjunction (∧)
        • Propagation operator ⊗: identity function (IDL)
        • Conflict resolution operator ⊕: conjunction (∧)
        • Access function: triples with label true are accessible; all others are inaccessible
  17. EVALUATION FOR CP1 (1/3): COMPUTE LABELS
      Concrete policy 1
        • LP = {true, false}
        • ⊙: ∧
        • ⊗: IDL
        • ⊕: ∧
      Map abstract tokens to concrete ones
        • at1, at2 ↦ true
        • at3, at4 ↦ false
        id    s         p      o        abstract label   concrete label
        q1    Student   sc     Person   at1              true
        q2    Person    sc     Agent    at1              true
        q5    Agent     type   Class    at3              false
        q6    Student   sc     Person   at4              false
        q7    Student   sc     Agent    q1 ⊙ q2          true ∧ true = true
        q8    Student   sc     Agent    q6 ⊙ q2          false ∧ true = false
        q11   &a        type   Agent    ⊗q5              false
  18. EVALUATION FOR CP1 (2/3): AMBIGUOUS LABELS REMOVAL
      Back from quadruples to triples
        subject   predicate   object   label
        Student   sc          Person   true
        Student   sc          Person   false
      After conflict resolution
        subject   predicate   object   label
        Student   sc          Person   true ∧ false = false
  19. EVALUATION FOR CP1 (3/3): DETERMINING ACCESSIBILITY
      The essence of access control:
        subject   predicate   object    label   result
        Student   sc          Person    false   Inaccessible
        Person    sc          Agent     true    Accessible
        &a        type        Student   true    Accessible
        &a        lastName    “Smith”   false   Inaccessible
        Agent     type        Class     false   Inaccessible
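
The evaluation steps for Concrete Policy 1 (map tokens, apply concrete operators, resolve conflicts, decide accessibility) can be sketched as follows. This is a hedged illustration: the nested-tuple encoding of abstract labels, the dictionary layout of a policy and the function names are assumptions made for the example, and the default token ⊥ is assumed to map to false, which is consistent with the result shown for the lastName triple. The `LIBERAL` variant is a hypothetical policy added only to show that swapping an operator needs no re-annotation; it is not the CP2 mentioned in the experiments.

```python
from collections import defaultdict
from functools import reduce

# Abstract labels: a token name, ("infer", l1, l2) for ⊙, or ("prop", l) for ⊗.
QUADS = [
    ("Student", "sc", "Person", "at1"),
    ("Student", "sc", "Person", "at4"),
    ("Person", "sc", "Agent", "at1"),
    ("&a", "type", "Student", "at2"),
    ("&a", "lastName", "Smith", "⊥"),
    ("Agent", "type", "Class", "at3"),
    ("Student", "sc", "Agent", ("infer", "at1", "at1")),
    ("Student", "sc", "Agent", ("infer", "at4", "at1")),
    ("&a", "type", "Agent", ("prop", "at3")),
]

CP1 = {
    # Mapping of abstract tokens to concrete ones (⊥ -> False is an assumption).
    "tokens": {"at1": True, "at2": True, "at3": False, "at4": False, "⊥": False},
    "infer": lambda a, b: a and b,      # ⊙ as conjunction
    "prop": lambda a: a,                # ⊗ as identity
    "resolve": lambda a, b: a and b,    # ⊕ as conjunction
    "accessible": lambda label: label is True,
}


def evaluate(label, policy):
    """Interpret an abstract label expression under a concrete policy."""
    if isinstance(label, str):
        return policy["tokens"][label]
    op, *args = label
    values = [evaluate(arg, policy) for arg in args]
    return policy[op](*values)


def accessible_triples(quads, policy):
    """Group quadruples by triple, resolve conflicts, apply the access function."""
    by_triple = defaultdict(list)
    for s, p, o, label in quads:
        by_triple[(s, p, o)].append(evaluate(label, policy))
    return {
        triple: policy["accessible"](reduce(policy["resolve"], values))
        for triple, values in by_triple.items()
    }


result = accessible_triples(QUADS, CP1)

# Hypothetical liberal variant: conflicts are resolved by disjunction instead.
LIBERAL = dict(CP1, resolve=lambda a, b: a or b)
liberal_result = accessible_triples(QUADS, LIBERAL)
```

Under `CP1`, `(Student, sc, Person)` resolves to inaccessible because true ∧ false is false; under the liberal variant the same stored expressions make it accessible.
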
  20. PROS & CONS OF ABSTRACT ACCESS CONTROL MODELS
      Pros:
        • The same application can experiment with different concrete policies over the same dataset
            • liberal vs conservative policies for different classes of users
        • Different applications can experiment with different concrete policies for the same data
        • In the case of updates there is no need to re-compute the inferred triples
      Cons:
        • overhead in the required storage space
        • algebraic expressions can become complex depending on the structure of the dataset
  21. IMPLEMENTATION
      Used a relational schema to store quadruples and their labels (including abstract expressions)
      Annotation and evaluation are performed through stored procedures
        • MonetDB
        • PostgreSQL
  22. EXPERIMENTS
      Experiments
        • Experiment 1: annotation time (the time required to compute the inferred triples with their labels and the propagated labels)
        • Experiment 2: evaluation time (a) (the time needed to compute, for a concrete policy, the concrete access labels of all RDF triples)
        • Experiment 3: evaluation time (b) (the time needed to compute, for a concrete policy, the concrete access label of a percentage of the RDF triples)
      Datasets:
        • Synthetic: schemas produced with PowerGen
        • Real: CIDOC, GO
  23. EXPERIMENTAL RESULTS: ANNOTATION TIME – MONETDB (SYNTHETIC)
        • Annotation time increases as the number of implied triples increases
        • Sudden drops are due to changes in the structure of the ontology (reduction of the depth)
      152 synthetic ontologies
        • 100-1000 classes, 113-1635 properties, 124-50295 class instances and 110-1321 property instances
      Different depth for the sc and sp hierarchies (from 4 to 8)
  24. EXPERIMENTAL RESULTS: EVALUATION TIME (FULL)
        • Evaluation time increases linearly as the total number of triples increases
        • MonetDB outperforms PostgreSQL
        • Some of the synthetic datasets could not be evaluated
  25. EXPERIMENTAL RESULTS: EVALUATION TIME (DATASET PERCENTAGE) - MONETDB
        • Evaluation time for the largest dataset that was evaluated successfully in Experiment 2
        • Similar conclusions as with Experiment 2
  26. EXPERIMENTAL RESULTS - REAL DATASETS
      CIDOC
        • Annotation time
            • MonetDB: 69 ms
            • PostgreSQL: 4000 ms
        • Evaluation time (full)
            • MonetDB, CP1: 7775 ms
            • MonetDB, CP2: 3923 ms
      GO
        • Annotation time
            • MonetDB: 32 s
            • PostgreSQL: 844 s
        • Evaluation time (full)
            • Exceeded our set timeout
  27. CONCLUSIONS
      Proposed a new paradigm based on abstract models and operators
      Advantages
        • Flexibility and easy adaptation to change (no re-computation necessary)
        • Easy experimentation with different access control policies
      Disadvantages
        • Increased space requirements
        • Overhead at query time (for evaluation)
      Suitable for dynamic datasets
  28. Thank you!
  29. EXPERIMENTAL RESULTS: ANNOTATION TIME – POSTGRESQL (SYNTHETIC)
        • Annotation time increases as the number of implied triples increases
        • The one sudden drop is due to a change in the structure of the ontology (reduction of the depth)
      Up to 1000 classes, 1635 properties, 50167 class instances and 95 property instances before reaching the timeout.
  30. IMPLEMENTATION
      Used a relational schema to store quadruples
        • Quad(qid, s, p, o, propop, inferop, label)
            • inferop, propop: boolean values indicating whether the label is obtained through propagation or inference
        • LabelStore(qid, qid_uses)
            • stores the access label of a triple
            • qid: the quadruple whose label is stored
            • qid_uses: the qid of the explicit quadruple through which qid was produced
  31. IMPLEMENTATION
      Quadruples (motivating example)
        id    s         p      o         l
        q1    Student   sc     Person    at1
        q2    Person    sc     Agent     at1
        q3    &a        type   Student   at2
        q5    Agent     type   Class     at3
        q6    Student   sc     Person    at4
        q7    Student   sc     Agent     at1 ⊙ at1
        q10   &a        type   Agent     at2 ⊙ (at1 ⊙ at1)
        q11   &a        type   Agent     ⊗at3
      Quad(qid, s, p, o, propop, inferop, label)
        id    s         p      o         iop   pop   l
        q1    Student   sc     Person    f     f     at1
        q2    Person    sc     Agent     f     f     at1
        q3    &a        type   Student   f     f     at2
        q5    Agent     type   Class     f     f     at3
        q6    Student   sc     Person    f     f     at4
        q7    Student   sc     Agent     t     f     null
        q9    &a        type   Agent     t     f     null
        q10   Person    sc     Agent     f     t     null
  32. IMPLEMENTATION
      Quadruples (motivating example)
        id    s         p      o         l
        q1    Student   sc     Person    at1
        q2    Person    sc     Agent     at1
        q3    &a        type   Student   at2
        q5    Agent     type   Class     at3
        q6    Student   sc     Person    at4
        q7    Student   sc     Agent     at1 ⊙ at1
        q10   &a        type   Agent     at2 ⊙ (at1 ⊙ at1)
        q11   &a        type   Agent     ⊗at3
      LabelStore(qid, qid_uses)
        qid   qid_uses
        q7    q1
        q7    q2
        q10   q3
        q10   q1
        q10   q2
        q11   q5
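
The relational layout can be tried out with a small in-memory database. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for MonetDB or PostgreSQL, and plain Python functions instead of stored procedures; the column types, the helper names `tokens_used` and `concrete_label`, and the use of 0/1 for the boolean flags are assumptions. Row ids for derived quadruples follow the LabelStore table (q7, q10, q11). The shortcut of taking a conjunction over all used tokens is only valid for a policy like Concrete Policy 1, where ⊙ is conjunction and ⊗ is the identity; other policies would need the expression structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Quad (
    qid TEXT PRIMARY KEY, s TEXT, p TEXT, o TEXT,
    propop INTEGER, inferop INTEGER, label TEXT
);
CREATE TABLE LabelStore (qid TEXT, qid_uses TEXT);
""")

# Explicit quadruples carry a token; derived ones have a NULL label.
conn.executemany("INSERT INTO Quad VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("q1", "Student", "sc", "Person", 0, 0, "at1"),
    ("q2", "Person", "sc", "Agent", 0, 0, "at1"),
    ("q3", "&a", "type", "Student", 0, 0, "at2"),
    ("q5", "Agent", "type", "Class", 0, 0, "at3"),
    ("q6", "Student", "sc", "Person", 0, 0, "at4"),
    ("q7", "Student", "sc", "Agent", 0, 1, None),
    ("q10", "&a", "type", "Agent", 0, 1, None),
    ("q11", "&a", "type", "Agent", 1, 0, None),
])

# Each derived quadruple points at the explicit quadruples it was built from.
conn.executemany("INSERT INTO LabelStore VALUES (?, ?)", [
    ("q7", "q1"), ("q7", "q2"),
    ("q10", "q3"), ("q10", "q1"), ("q10", "q2"),
    ("q11", "q5"),
])


def tokens_used(qid):
    """Abstract tokens of the explicit quadruples that produced `qid`."""
    rows = conn.execute(
        "SELECT q.label FROM LabelStore ls "
        "JOIN Quad q ON q.qid = ls.qid_uses "
        "WHERE ls.qid = ? ORDER BY ls.rowid",
        (qid,),
    ).fetchall()
    return [label for (label,) in rows]


def concrete_label(qid, mapping):
    """Concrete label under a CP1-like policy (⊙ = conjunction, ⊗ = identity)."""
    (label,) = conn.execute(
        "SELECT label FROM Quad WHERE qid = ?", (qid,)
    ).fetchone()
    if label is not None:
        return mapping[label]
    return all(mapping[token] for token in tokens_used(qid))


CP1_TOKENS = {"at1": True, "at2": True, "at3": False, "at4": False}
print(tokens_used("q10"), concrete_label("q10", CP1_TOKENS))
```

Changing the token of an explicit quadruple only requires updating its row in `Quad`; derived labels are recomputed from `LabelStore` at evaluation time, which matches the storage-for-flexibility trade-off described in the pros and cons.
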
