ProvAbs: model, policy, and tooling for abstracting PROV graphs

IPAW'14 talk for our paper: http://arxiv.org/abs/1406.1998


  1. IPAW2014–P.Missier
     ProvAbs: model, policy, and tooling for abstracting PROV graphs
     Paolo Missier, Jeremy Bryans, Carl Gamble (School of Computing Science, Newcastle University)
     Vasa Curcin, Roxana Danger (Imperial College, London)
     IPAW’14, Köln, June 10th, 2014
  2. Motivation: partial disclosure of provenance
     Consumer:
       • Motivated to acquire and act upon analysis
       But: expects supporting evidence, to mitigate the risk of acting upon inaccurate information
     Provider:
       • Motivated to provide accurate analysis to Public Agencies
       • Enhances communication by using provenance metadata as evidence
       But: cannot fully disclose sources, analysis methods, etc.
  3. Provenance-enabled data exchanges
  4. Provenance exchange as part of data exchange
  5. Provenance abstraction
     What:
       • Abstraction model for PROV
       • Policy model and language to drive the abstraction
       • Implementation: the ProvAbs tool
     Why:
       • To enable data exchanges with partial disclosure of the data provenance
       • To simplify understanding of provenance traces by humans
     How:
       • Graph rewriting, from valid PROV to valid PROV
       • A node grouping operator
  6. Provenance views
     Motivation similar to the UserViews model (*).
     Goals:
       1. Construct relevant user views
       2. The answer to a provenance query depends on the workflow view
     The UserViews model:
       • Is heavily focused on workflows and their provenance
       • Scenario: one (or more) workflows, multiple users/viewers
       • Relies on “composite modules” (sub-workflow structuring)
       • Real workflow → induced workflow
     In contrast, in our work: no assumption on any process specification (formal or not) driving the views on provenance.
     (*) Biton, O, S Cohen Boulakia, S B Davidson, and C S Hara. “Querying and Managing Provenance through User Views in Scientific Workflows.” In ICDE, 1072–1081, 2008. doi:http://dx.doi.org/10.1109/ICDE.2008.4497516.
  7. History of an analyst’s report
     Document produced by the “incident room analysts”
  8. 1 – Define policy to assign sensitivity to graph nodes
     [Figure: provenance graph. Nodes: report-1, report-2, report-3, analytics1, analytics2, analytics3, X-summary, Y-summary, consolidate-X, consolidate-Y, report-editing; edges labelled use and gen; Status annotations: Secret (two nodes), confidential (three nodes).]
     Policy:
       list classifications [protect, restricted, confidential, secret, topSecret];
       for all (activity used data)
         where (data.Status > confidential in classifications)
         setSensitivity(activity, 7);
       for all (activity used data)
         where (data.Status <= confidential in classifications)
         setSensitivity(activity, 5);
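The two policy rules on this slide can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the edge-list encoding, the names `rank` and `assign_sensitivity`, and the example edges are hypothetical (the figure's exact edges are not recoverable from the transcript), and the rule that the higher value wins when both rules match the same activity is an assumption, not something the slide states.

```python
# Ordered classification list from the slide, lowest to highest.
CLASSIFICATIONS = ["protect", "restricted", "confidential", "secret", "topSecret"]

def rank(status):
    """Position of a status label in the classification order (case-insensitive)."""
    return [c.lower() for c in CLASSIFICATIONS].index(status.lower())

def assign_sensitivity(used_edges, status):
    """Give an activity sensitivity 7 if it used data above 'confidential', else 5.

    Assumption: if both rules match the same activity, the higher value wins.
    """
    threshold = rank("confidential")
    sensitivity = {}
    for activity, data in used_edges:
        value = 7 if rank(status[data]) > threshold else 5
        sensitivity[activity] = max(sensitivity.get(activity, 0), value)
    return sensitivity

# Hypothetical (activity, data) usage edges and Status annotations.
used_edges = [
    ("analytics1", "report-1"),
    ("analytics2", "report-2"),
    ("report-editing", "X-summary"),
]
status = {"report-1": "Secret", "report-2": "Secret", "X-summary": "confidential"}

sensitivity = assign_sensitivity(used_edges, status)
print(sensitivity)
```

Keeping the classification order in a single list means the comparison `data.Status > confidential` reduces to comparing list positions.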
  9. 2 – Node selection
     Select nodes for abstraction based on the receiver’s clearance level.
     Receiver’s clearance level: 6
     [Figure: the same provenance graph as on the previous slide, with sensitivity values 7, 7, 7 and 5 attached to nodes, and ✔/✗ marks (one ✔, four ✗) next to nodes.]
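A minimal sketch of the selection step follows. The slide gives the sensitivity values (7, 7, 7, 5) and the clearance level (6) but not the comparison itself, so the rule "select a node when its sensitivity is strictly greater than the clearance" is an assumption, as are the function name and the node names attached to each value.

```python
def select_for_abstraction(sensitivity, clearance):
    """Return the nodes whose sensitivity is above the receiver's clearance level."""
    return {node for node, level in sensitivity.items() if level > clearance}

# Sensitivity values as on the slide; which node carries which value is assumed.
sensitivity = {
    "analytics1": 7,
    "analytics2": 7,
    "analytics3": 7,
    "report-editing": 5,
}

selected = select_for_abstraction(sensitivity, clearance=6)
print(sorted(selected))
```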
  10. 3 – Abstraction
      Apply the abstraction operator.
      [Figure: the provenance graph before and after abstraction. Before: the graph with sensitivity values and ✔/✗ marks from the previous slide. After: the nodes analytics1, analytics2 and analytics3 no longer appear and a node abs is present; consolidate-X, consolidate-Y, report-editing, report-1, report-2, report-3, X-summary and Y-summary remain.]
  11. Abstracting over sets of nodes
      General abstraction idea: replace a group of (possibly non-contiguous) nodes with a new node
  12. Naïve node group replacement: introducing cycles
      [Figure: a graph over entities e1–e6 and activities a1–a4 with used and wgBy edges, before and after a group of nodes is replaced by a new node e'.]
      Generation-usage cycles are legal in PROV.
      Note: initial focus on vanilla PROV: usage-generation / entity-activity
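The effect named in this slide's title can be reproduced on a toy graph. The sketch below is illustrative only: the four-edge graph, the grouping set {e1, e3} and the helper names `naive_replace` and `has_cycle` are hypothetical, not the example drawn on the slide. Edges point from the later node to the earlier one, as in PROV (`a used e`, `e wgBy a`).

```python
def naive_replace(edges, group, new_node):
    """Replace every node in `group` by `new_node`; drop edges internal to the group."""
    rewritten = set()
    for src, label, dst in edges:
        s = new_node if src in group else src
        d = new_node if dst in group else dst
        if s != d:
            rewritten.add((s, label, d))
    return rewritten

def has_cycle(edges):
    """Detect a directed cycle with a depth-first search."""
    graph = {}
    for src, _, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {}  # node -> "active" (on the DFS stack) or "done"

    def visit(node):
        state[node] = "active"
        for nxt in graph[node]:
            if state.get(nxt) == "active":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in graph if n not in state)

# Hypothetical chain: e3 wgBy a2, a2 used e2, e2 wgBy a1, a1 used e1.
edges = {
    ("e3", "wgBy", "a2"),
    ("a2", "used", "e2"),
    ("e2", "wgBy", "a1"),
    ("a1", "used", "e1"),
}

# Grouping the two ends of the chain while leaving the middle outside.
abstracted = naive_replace(edges, {"e1", "e3"}, "e'")
print(has_cycle(edges), has_cycle(abstracted))
```

The group {e1, e3} is not contiguous: the path between its members runs through nodes outside the group, so collapsing it folds the chain back onto itself.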
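  13. What’s wrong with cycles?
      New cycles introduce new constraints on the temporal ordering of events.
      [Figure: activity a and entities e1, e2 with usage event u1 and generation event g2, drawn as a graph and on a timeline between the start s and end e of a; after abstraction, a and e' with events u' and g'.]
      Mapping: e' ← {e1, e2}, u' ← u1, g' ← g2
      Constraint: s(a) ≤ u' ≤ g' ≤ e(a); u', g' simultaneous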
  14. More generally: mapping concrete to abstract events
      Abstract graph nodes should be characterised by abstract events:
        • Generation is the completion of production of a new entity (PROV-DM Sec. 5.1.3)
        • Usage is the beginning of utilizing an entity (PROV-DM Sec. 5.1.4)
      g' = max { g1, g2 }
      u' = min { u3, u4 }
      [Figure: activity a and entities e1–e4 with generation events g1, g2 and usage events u3, u4, drawn as a graph and on a timeline between the start s and end e of a; after abstraction, a and e' with events g' and u'.]
  15. Usage-follows-generation
      Abstract graphs with abstract usage-generation events correspond to a specific class of base graphs with the pattern: <all generations> -- <all usages>
      Given a grouping set of entities {e1…en} such that ei wasGeneratedBy a or a used ei: all generation events for all ei must precede all usage events for all ei.
      [Figure: activity a and entities e1–e4 with events g1, g2, u3, u4; the timeline between the start s and end e of a is split into a generation phase followed by a usage phase.]
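The usage-follows-generation pattern reduces to one comparison once the abstract events are defined as on the preceding slide (g' = max of the generation events, u' = min of the usage events). The sketch below assumes events are represented by numeric timestamps and reads "precede" as ≤; both choices, and the function names, are assumptions made for illustration.

```python
def abstract_events(generation_times, usage_times):
    """Map concrete events to abstract ones: g' is the latest generation, u' the earliest usage."""
    return max(generation_times), min(usage_times)

def usage_follows_generation(generation_times, usage_times):
    """True when every generation event precedes (or coincides with) every usage event."""
    g_abs, u_abs = abstract_events(generation_times, usage_times)
    return g_abs <= u_abs

# Hypothetical timestamps for g1, g2 and u3, u4.
ok = usage_follows_generation([1, 2], [3, 4])   # generation phase, then usage phase
bad = usage_follows_generation([1, 5], [3, 4])  # one generation happens after a usage
print(ok, bad)
```

Comparing only the latest generation with the earliest usage is enough: if those two are in order, every other generation-usage pair is too.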
  16. Naïve node group replacement (2): type violations
      [Figure: a graph over entities e1, e2, e4, e5 and activities a1–a4 with used and wgBy edges; after a group of nodes is replaced by e', one edge incident to e' has no valid relationship type (marked “??”).]
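A type violation of this kind can be detected mechanically. In vanilla PROV, `used` goes from an activity to an entity and `wasGeneratedBy` (wgBy) from an entity to an activity; the check below flags any edge that breaks this. The helper name `type_violations` and the two-edge example are hypothetical.

```python
ENTITY, ACTIVITY = "entity", "activity"

# Relationship signatures considered in the talk: (source type, target type).
SIGNATURES = {"used": (ACTIVITY, ENTITY), "wgBy": (ENTITY, ACTIVITY)}

def type_violations(edges, node_type):
    """Return the edges whose endpoint types do not match the relationship's signature."""
    return sorted(
        (src, label, dst)
        for src, label, dst in edges
        if (node_type[src], node_type[dst]) != SIGNATURES[label]
    )

# Hypothetical result of naively collapsing the group {e1, a1} into an entity e',
# starting from: a2 used e1, e1 wgBy a1, a1 used e0.
edges_after = [("a2", "used", "e'"), ("e'", "used", "e0")]
node_type = {"a2": ACTIVITY, "e'": ENTITY, "e0": ENTITY}

violations = type_violations(edges_after, node_type)
print(violations)
```

The edge inherited from `a1 used e0` now starts at an entity, so no vanilla PROV relationship fits it.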
  17. Criteria for abstraction
      1. No new generation-usage cycles
      2. No new dependencies (but: ok to remove some dependencies)
      3. Satisfy type constraints on relationships
      Operations: convexity by closure; extension; replacement and rewiring
  18. Convexity by path closure
      [Figure: (a) a graph over entities e1–e6 and activities a1–a5 with used and wgBy edges; (b) the same graph after the close step.]
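One way to compute a path closure is sketched below, under the assumption that "closure" means adding every node lying on a directed path between two members of the grouping set. The edge-pair encoding, the example chain and the names `reachable` and `path_closure` are hypothetical; the slide's own graph is not recoverable from the transcript.

```python
def reachable(adjacency, sources):
    """All nodes reachable from `sources` (inclusive) by following `adjacency`."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def path_closure(edges, group):
    """Extend `group` with every node on a directed path between two group members."""
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)
    # A node is on such a path iff it is reachable from the group
    # and the group is reachable from it.
    return reachable(forward, group) & reachable(backward, group)

# Hypothetical chain e3 -> a2 -> e2 -> a1 -> e1 -> a0 (edge labels omitted).
edges = [("e3", "a2"), ("a2", "e2"), ("e2", "a1"), ("a1", "e1"), ("e1", "a0")]

closed = path_closure(edges, {"e1", "e3"})
print(sorted(closed))
```

Nodes outside every path between group members, such as `a0` here, stay out of the closed group.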
  19. Replacement, rewiring
      [Figure: the graph over e1–e6 and a1–a5 before the replace step, and (c) the graph after it, with a new node e' alongside e2, e6, a2, a4 and a5.]
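The replace-and-rewire step can be sketched as follows, assuming it drops the edges internal to the group and redirects every edge that crosses the group boundary to the new node. The function name `replace_group` and the example graph are hypothetical; the group used here is already closed under paths.

```python
def replace_group(edges, group, new_node):
    """Collapse `group` into `new_node`, rewiring the edges that cross its boundary."""
    rewired = set()
    for src, label, dst in edges:
        if src in group and dst in group:
            continue  # internal edge: disappears together with the group
        rewired.add((
            new_node if src in group else src,
            label,
            new_node if dst in group else dst,
        ))
    return rewired

# Hypothetical graph: a4 used e3, e3 wgBy a2, a2 used e2, e2 wgBy a1,
# a1 used e1, e1 wgBy a0.
edges = {
    ("a4", "used", "e3"),
    ("e3", "wgBy", "a2"),
    ("a2", "used", "e2"),
    ("e2", "wgBy", "a1"),
    ("a1", "used", "e1"),
    ("e1", "wgBy", "a0"),
}
group = {"e3", "a2", "e2", "a1", "e1"}  # closed: contains every path between its members

rewired = replace_group(edges, group, "e'")
print(sorted(rewired))
```

Because both boundary nodes of this group (e3 and e1) are entities, replacing it with an entity leaves every remaining edge type-correct.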
  20. Extension: restore type correctness
      [Figure: the graph after replacement (nodes e2, e6, a2, a4, a5 and e', with used edges) and after extension, where the group node is e''.]
  21. t-grouping
      Nodes in the grouping set can be a mix of Entities and Activities.
        • When all boundary nodes are of the same type: grouping creates a node of that type
            • e-grouping: new Entity node
            • a-grouping: new Activity node
        • Boundary nodes of mixed types: grouping can introduce a node of either type
      t-grouping: creates a new node of type t ∈ { En, Act }
      Note: grouping is commutative and closed wrt composition
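The rule for choosing the type of the new node can be sketched as below. The slide does not define "boundary node"; the sketch assumes it means a group member with at least one edge to or from a node outside the group. The function names and the example graph are hypothetical.

```python
ENTITY, ACTIVITY = "entity", "activity"

def boundary_nodes(edges, group):
    """Group members that have an edge to or from a node outside the group (assumed definition)."""
    boundary = set()
    for src, dst in edges:
        if (src in group) != (dst in group):
            boundary.add(src if src in group else dst)
    return boundary

def allowed_group_types(edges, group, node_type):
    """Types the new node may take: the boundary type if uniform, otherwise either type."""
    types = {node_type[n] for n in boundary_nodes(edges, group)}
    return types if len(types) == 1 else {ENTITY, ACTIVITY}

# Hypothetical chain a4 -> e3 -> a2 -> e2 -> a1 -> e1 -> a0 (edge labels omitted).
edges = [("a4", "e3"), ("e3", "a2"), ("a2", "e2"), ("e2", "a1"), ("a1", "e1"), ("e1", "a0")]
nodes = {n for edge in edges for n in edge}
node_type = {n: ENTITY if n.startswith("e") else ACTIVITY for n in nodes}

uniform = allowed_group_types(edges, {"e3", "a2", "e2", "a1", "e1"}, node_type)
mixed = allowed_group_types(edges, {"a2", "e2"}, node_type)
print(uniform, mixed)
```

In the first group both boundary nodes are entities, so only e-grouping applies; in the second the boundary holds one activity and one entity, so either type is possible.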
  22. t-grouping
      [Figure: worked example over entities e4, e5 and activities a1–a4 with usage events u42, u52, u54 and generation events g41, g53. a-grouping panels (a-1), (a-2): a replace step yields a new activity aN. e-grouping panels (e-1), (e-2), (e-3): replace, and extend and replace, yield a new entity eN.]
  23. The ProvAbs tool
      • A tool that lets a policy designer explore partial disclosure options by experimenting with policy settings and clearance thresholds
      • Accepts graphs in PROV-N format
      • Policy specified interactively, or loaded from file
      Demo available!
  24. Summary
      • A model for abstracting PROV graphs by (recursively) replacing sets of nodes with new nodes
          • Maps valid PROV to valid PROV (ref.: PROV-CONSTRAINTS)
          • No false dependencies introduced
      • Abstract nodes → abstract events
      • Extended to Agents (see TechReport)
      • Need to extend to more PROV relationship types
      See also: Missier, P., Gamble, C., Bryans, J.: Provenance graph abstraction by node grouping. Technical report, Newcastle University (2013) http://www.ncl.ac.uk/computing/research/publication/194432
