ProvAbs: model, policy, and tooling for abstracting PROV graphs

IPAW'14 talk for our paper: http://arxiv.org/abs/1406.1998

  • Reference scenario: intelligence analysis, two parties
    Incident Room specialists issue analyses to law enforcement agencies; agencies may act upon such analyses.

    [describe incentives/hurdles for each actor here]
  • Ref. to Zoom
  • Complete running example trace
  • Note that one cannot simply replace association with attribution, i.e., replace relation waw(a, ag) (wasAssociatedWith) with wat(eN, ag) (wasAttributedTo), because there is no guarantee that any of the entities represented by the new eN had been attributed to ag in the original graph.

Transcript

  • 1. IPAW2014–P.Missier
    ProvAbs: model, policy, and tooling for abstracting PROV graphs
    Paolo Missier, Jeremy Bryans, Carl Gamble (School of Computing Science, Newcastle University)
    Vasa Curcin, Roxana Danger (Imperial College, London)
    IPAW’14, Köln, June 10th, 2014
  • 2. Motivation: partial disclosure of provenance
    Consumer:
      • Motivated to acquire and act upon analysis
      But: expects supporting evidence, to mitigate the risk of acting upon inaccurate information
    Provider:
      • Motivated to provide accurate analysis to Public Agencies
      • Enhances communication by using provenance metadata as evidence
      But: cannot fully disclose sources, analysis methods, etc.
  • 3. Provenance-enabled data exchanges
  • 4. Provenance exchange as part of data exchange
  • 5. Provenance abstraction
    What:
      • Abstraction model for PROV
      • Policy model and language to drive the abstraction
      • Implementation: the ProvAbs tool
    Why:
      • To enable data exchanges with partial disclosure of the data provenance
      • To simplify understanding of provenance traces by humans
    How:
      • Graph rewriting, from valid PROV to valid PROV
      • A node grouping operator
  • 6. Provenance views
    Motivation similar to the UserViews model (*):
      • Heavily focused on workflows and their provenance
      • Scenario: one (or more) workflows, multiple users/viewers
      • Relies on “composite modules” (sub-workflow structuring):
        • Real workflow → induced workflow
    Goals:
      1. Construct relevant user views
      2. The answer to a provenance query depends on the workflow view
    In contrast, in our work: no assumption of any process specification (formal or not) driving the views on provenance
    (*) Biton, O., S. Cohen-Boulakia, S. B. Davidson, and C. S. Hara. “Querying and Managing Provenance through User Views in Scientific Workflows.” In ICDE, 1072–1081, 2008. doi:http://dx.doi.org/10.1109/ICDE.2008.4497516.
  • 7. History of an analyst’s report
    Document produced by the “incident room analysts”
  • 8. 1 – Define policy to assign sensitivity to graph nodes
    [Figure: provenance graph of the report history. Node labels: consolidate-X, consolidate-Y, report-editing, report-1, report-2, report-3, analytics1, analytics2, analytics3, X-summary, Y-summary. Edge labels: use, gen. Annotations: Status: Secret (twice), Status: confidential (three times).]
    Policy:
      list classifications [protect, restricted, confidential, secret, topSecret];
      for all (activity used data) where (data.Status > confidential in classifications) setSensitivity(activity, 7);
      for all (activity used data) where (data.Status <= confidential in classifications) setSensitivity(activity, 5);
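The policy on this slide can be read as a small rule evaluator. The following is a minimal Python sketch, not the ProvAbs implementation: the function name `assign_sensitivity`, the representation of `used` edges as `(activity, entity)` pairs, the case-insensitive status lookup, and the decision to keep the higher value when both rules match the same activity are all assumptions made for illustration. The toy data is hypothetical and is not the graph on the slide.

```python
# Ordered list taken from the policy: later entries are more restrictive.
CLASSIFICATIONS = ["protect", "restricted", "confidential", "secret", "topSecret"]
RANK = {name.lower(): i for i, name in enumerate(CLASSIFICATIONS)}


def assign_sensitivity(used, status):
    """Map each activity to 7 if it used data above 'confidential', else 5.

    used   -- iterable of (activity, entity) pairs, one per 'used' edge
    status -- dict: entity -> classification name (its Status attribute)
    """
    threshold = RANK["confidential"]
    sensitivity = {}
    for activity, entity in used:
        level = 7 if RANK[status[entity].lower()] > threshold else 5
        # Assumption: if both rules fire for one activity, keep the higher value.
        sensitivity[activity] = max(level, sensitivity.get(activity, 0))
    return sensitivity


# Hypothetical toy data (not the slide's graph).
used = [("a1", "d1"), ("a2", "d2"), ("a3", "d1"), ("a3", "d2")]
status = {"d1": "Secret", "d2": "confidential"}
sensitivity = assign_sensitivity(used, status)
```

Here `a3` uses both a secret and a confidential entity; the slide does not say which rule should win in that case, so the sketch makes the conservative choice explicit.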
  • 9. 2 – Node selection
    Select nodes for abstraction based on the receiver’s clearance level
    [Figure: the same graph, with nodes annotated with sensitivity values (7, 7, 7, 5) and marked ✔ or ✗.]
    Receiver’s clearance level: 6
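Node selection then reduces to a threshold test. A minimal sketch follows, assuming that a node is selected for abstraction when its sensitivity is strictly greater than the receiver's clearance; the slide does not state whether the comparison is strict, and the node names `n1`..`n4` are hypothetical. The sensitivity values and the clearance level of 6 are the ones shown on the slide.

```python
def select_for_abstraction(sensitivity, clearance):
    """Return the nodes the receiver is not cleared to see.

    Assumption: a node is hidden when its sensitivity exceeds the clearance.
    """
    return {node for node, level in sensitivity.items() if level > clearance}


# Sensitivities 7, 7, 7, 5 and clearance 6 as on the slide; node names are made up.
sensitivity = {"n1": 7, "n2": 7, "n3": 7, "n4": 5}
selected = select_for_abstraction(sensitivity, clearance=6)
```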
  • 10. 3 – Abstraction
    Apply abstraction operator
    [Figure: the graph before and after abstraction. In the abstracted graph the nodes analytics1, analytics2 and analytics3 no longer appear and a new node abs is present.]
  • 11. Abstracting over sets of nodes
    General abstraction idea: replace a group of (possibly non-contiguous) nodes with a new node
  • 12. Naïve node group replacement: introducing cycles
    [Figure: a graph over entities e1–e6 and activities a1–a4 with used and wgBy edges, before and after a group of entities is replaced by a single node e'.]
    Generation-usage cycles are legal in PROV
    Note: initial focus on vanilla PROV: usage-generation/entity-activity
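The effect of naïve replacement can be reproduced on a three-node graph. The sketch below is illustrative only: edges are plain `(source, target)` pairs pointing from effect to cause (so `a used e` is `(a, e)` and `e wgBy a` is `(e, a)`), and the helper names `collapse` and `has_cycle` are hypothetical, not part of ProvAbs.

```python
def collapse(edges, group, new_node):
    """Naively replace every node in `group` by `new_node`, dropping internal edges."""
    out = set()
    for src, dst in edges:
        s = new_node if src in group else src
        d = new_node if dst in group else dst
        if s != d:  # an edge with both ends in the group disappears
            out.add((s, d))
    return out


def has_cycle(edges):
    """Depth-first search for a directed cycle."""
    graph = {}
    for s, d in edges:
        graph.setdefault(s, set()).add(d)
        graph.setdefault(d, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = dict.fromkeys(graph, WHITE)

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# e3 wgBy a1, a1 used e1: an acyclic chain e3 -> a1 -> e1.
base = {("e3", "a1"), ("a1", "e1")}
# Grouping the non-contiguous entities e1 and e3 makes a1 both use and generate e'.
naive = collapse(base, {"e1", "e3"}, "e'")
```

The base graph is acyclic, but `naive` contains both `("e'", "a1")` and `("a1", "e'")`, which is the generation-usage cycle the slide warns about.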
  • 13. What’s wrong with cycles?
    New cycles introduce new constraints on the temporal ordering of events
    [Figure: activity a with entities e1, e2 and events u1, g2, shown as a graph and as a timeline from start s to end e; after abstraction, a is connected to a single entity e' through events u' and g'.]
    e' ← {e1, e2}, u' ← u1, g' ← g2
    s(a) ≤ u' ≤ g' ≤ e(a)
    u’, g’ simultaneous
  • 14. More generally: mapping concrete to abstract events
    Abstract graph nodes should be characterised by abstract events
      • Generation is the completion of production of a new entity (PROV-DM Sec. 5.1.3)
      • Usage is the beginning of utilizing an entity (PROV-DM Sec. 5.1.4)
    g’ = max { g1, g2 }
    u’ = min { u3, u4 }
    [Figure: activity a with entities e1–e4, generation events g1, g2 and usage events u3, u4, shown as a graph and as a timeline; after abstraction, a is connected to e' through u' and g'.]
  • 15. Usage-follows-generation
    Abstract graphs with abstract usage-generation events correspond to a specific class of base graphs with pattern: <all generations> -- <all usages>
    Given a grouping set of entities {e1…en} such that ei wasGeneratedBy a or a used ei: all generation events for all ei must precede all usage events for all ei.
    [Figure: timeline of activity a from s to e, split into a generation phase followed by a usage phase for entities e1–e4.]
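The abstract events g’ = max{…} and u’ = min{…} and the usage-follows-generation pattern can be checked together. This is a minimal sketch under stated assumptions: event times are plain numbers, "precede" is read as "less than or equal", and the function names are hypothetical.

```python
def abstract_events(generation_times, usage_times):
    """Abstract generation is the latest concrete generation (production is
    complete only then); abstract usage is the earliest concrete usage."""
    return max(generation_times), min(usage_times)


def usage_follows_generation(generation_times, usage_times):
    """True when every generation event precedes every usage event,
    i.e. when the abstract events satisfy g' <= u'."""
    g_abs, u_abs = abstract_events(generation_times, usage_times)
    return g_abs <= u_abs


# Hypothetical timestamps: generations at 1 and 3, usages at 4 and 6.
ok = usage_follows_generation([1, 3], [4, 6])
# Here one generation (time 5) happens after a usage (time 4).
bad = usage_follows_generation([1, 5], [4, 6])
```

Checking the single inequality g’ ≤ u’ is enough, because the maximum generation time preceding the minimum usage time implies that every generation precedes every usage.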
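  • 16. Naïve node group replacement -2: Type violations
    [Figure: a graph over entities e1, e2, e4, e5 and activities a1–a4 with used and wgBy edges; after replacing a group of nodes with the entity e', one edge next to e' is marked "??" because its relationship type is no longer valid.]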
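A type violation of this kind can be detected mechanically from the PROV signatures of the two relations in scope (used goes from an activity to an entity, wasGeneratedBy from an entity to an activity). The sketch below is illustrative: the edge representation `(relation, source, target)`, the helper names, and the toy graph are assumptions, not the example on the slide.

```python
# Expected (source type, target type) for each relation.
SIGNATURES = {
    "used": ("activity", "entity"),
    "wasGeneratedBy": ("entity", "activity"),
}


def replace_group(edges, group, new_node):
    """Naively rewire every edge touching `group` to `new_node`."""
    out = []
    for rel, src, dst in edges:
        s = new_node if src in group else src
        d = new_node if dst in group else dst
        if s != d:  # edges internal to the group are dropped
            out.append((rel, s, d))
    return out


def type_violations(edges, node_type):
    """Return the edges whose endpoint types do not match the relation."""
    return [
        (rel, s, d)
        for rel, s, d in edges
        if (node_type[s], node_type[d]) != SIGNATURES[rel]
    ]


# Hypothetical chain: a2 used e1, e1 wgBy a1, a1 used e0.
edges = [("used", "a2", "e1"), ("wasGeneratedBy", "e1", "a1"), ("used", "a1", "e0")]
node_type = {"a1": "activity", "a2": "activity", "e0": "entity", "e1": "entity"}

# Replace the mixed group {e1, a1} with a new *entity* e'.
rewired = replace_group(edges, {"e1", "a1"}, "e'")
violations = type_violations(rewired, {**node_type, "e'": "entity"})
```

The edge inherited from `a1` now reads "entity e' used entity e0", which is not a valid PROV usage.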
  • 17. Criteria for abstraction
      1. No new generation-usage cycles
      2. No new dependencies (but: ok to remove some dependencies)
      3. Satisfy type constraints on relationships
    Convexity by closure
    Extension
    Replacement, rewiring
  • 18. Convexity by path closure
    [Figure: graph (a) over entities e1–e6 and activities a1–a5 with used and wgBy edges, and graph (b) after the close step.]
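One way to read "convexity by path closure" is: add to the grouping set every node that lies on a directed path between two of its members. The sketch below implements that reading for an acyclic graph as the intersection of forward and backward reachability; the function names and the toy graph are hypothetical, and the paper's exact definition of the closure may differ.

```python
def reachable(adjacency, sources):
    """All nodes reachable from `sources` (sources included)."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def path_closure(edges, group):
    """Group plus every node on a directed path between two group members."""
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)
    # A node is on such a path iff it is reachable from the group
    # and the group is reachable from it.
    return reachable(forward, group) & reachable(backward, group)


# Hypothetical graph: e3 wgBy a1, a1 used e1, a1 used e0.
edges = {("e3", "a1"), ("a1", "e1"), ("a1", "e0")}
closed = path_closure(edges, {"e1", "e3"})
```

`a1` joins the group because it sits between `e3` and `e1`; `e0` does not, because no group member depends on it through a path that returns to the group. Replacing the closed set by one node can therefore no longer create the cycle shown for naïve replacement.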
  • 19. Replacement, rewiring
    [Figure: the replace step, giving graph (c), in which a new node e' appears alongside e2, e6, a2, a4 and a5.]
  • 20. Extension – restore type correctness
    [Figure: a graph containing e2, e6, a2, a4, a5 and e', and the graph after extension, containing a2, a4, a5 and e''.]
  • 21. t-grouping
    Nodes in the grouping set can be a mix of Entities and Activities
      • When all boundary nodes are of the same type: → grouping creates a node of that type
        • e-grouping: new Entity node
        • a-grouping: new Activity node
      • Boundary nodes of mixed types: → grouping can introduce a node of either type
    t-grouping: creates a new node of type t ∈ { En, Act }
    Note: grouping is commutative and closed wrt composition
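The rule for choosing the type of the new node can be sketched as a function of the boundary of the grouping set. Two assumptions are made here and are not confirmed by the slides: a boundary node is taken to be a group member with at least one edge to or from a node outside the group, and a group with an empty boundary is treated like the mixed case. All names are hypothetical.

```python
def boundary(edges, group):
    """Group members that have an edge crossing the group border."""
    nodes = set()
    for src, dst in edges:
        if (src in group) != (dst in group):  # exactly one end is inside
            nodes.add(src if src in group else dst)
    return nodes


def allowed_group_types(edges, group, node_type):
    """Types t for which a t-grouping of `group` is permitted by the rule above."""
    types = {node_type[n] for n in boundary(edges, group)}
    if len(types) == 1:
        return types  # uniform boundary: the new node takes that type
    return {"entity", "activity"}  # mixed (or empty) boundary: either type


# Hypothetical graph: a2 used e3, e3 wgBy a1, a1 used e1, e1 wgBy a0.
edges = {("a2", "e3"), ("e3", "a1"), ("a1", "e1"), ("e1", "a0")}
node_type = {"a0": "activity", "a1": "activity", "a2": "activity",
             "e1": "entity", "e3": "entity"}

uniform = allowed_group_types(edges, {"e3", "a1", "e1"}, node_type)
mixed = allowed_group_types(edges, {"a1", "e1"}, node_type)
```

In the first call the boundary is {e3, e1}, so only an e-grouping applies; in the second it is {a1, e1}, so either an e-grouping or an a-grouping could be used.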
  • 22. t-grouping
    [Figure: worked examples of a-grouping and e-grouping on a graph with entities e4, e5, activities a1–a4, usage events u42, u52, u54 and generation events g41, g53. a-grouping: steps (a-1), (a-2), replace, producing a new activity aN. e-grouping: steps (e-1), (e-2), (e-3), extend and replace, producing a new entity eN.]
  • 23. The ProvAbs tool
      • A tool to let a policy designer explore partial disclosure options by experimenting with policy settings and clearance thresholds
      • Accepts graphs in PROV-N format
      • Policy specified interactively, or loaded from file
    Demo available!
  • 24. Summary
      • A model for abstracting PROV graphs by (recursively) replacing sets of nodes with new nodes
        • Maps valid PROV to valid PROV (ref.: PROV-CONSTRAINTS)
        • No false dependencies introduced
      • Abstract nodes → abstract events
      • Extended to Agents (see TechReport)
      • Need to extend to more PROV relationship types
    See also: Missier, P., Gamble, C., Bryans, J.: Provenance graph abstraction by node grouping. Technical report, Newcastle University (2013) http://www.ncl.ac.uk/computing/research/publication/194432