Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz is a visualisation service for provenance graphs expressed using the W3C PROV vocabulary. It uses the Sankey-style visualisation from D3.js.

See http://provoviz.org

Published in: Technology, Education
Transcript of "Prov-O-Viz: Interactive Provenance Visualization"

  1. PROV-O-Viz: Interactive Provenance Visualization
     Rinke Hoekstra and Paul Groth, VU University Amsterdam / University of Amsterdam, rinke.hoekstra@vu.nl
     Data2Semantics: From Data to Semantics for Scientific Data Publishers
     Many slides courtesy of Paul Groth
  2. Provenance?
  3. Provenance, by Jennifer Compton
     http://stillcraic.blogspot.nl/2014/01/tuesday-poem-provenance-by-jennifer.html
  4. Definition (Oxford English Dictionary)
     • The fact of coming from some particular source or quarter; origin, derivation;
     • the history or pedigree of a work of art, manuscript, rare book, etc.;
     • concretely, a record of the passage of an item through its various owners.
  15. Provenance
     • Making trust judgements on the Web
     • Licensing and attribution of combined information
     • Liability, trust and privacy in open government data
     • Compliance and auditing of business processes
     • Safeguarding quality, reproducibility and integrity of the scientific process
  16. “Web Design Issues”
     “At the toolbar (menu, whatever) associated with a document there is a button marked “Oh, yeah?”. You press it when you lose that feeling of trust. It says to the Web, “so how do I know I can trust this information?”. The software then goes directly or indirectly back to metainformation about the document, which suggests a number of reasons.”
     Tim Berners-Lee, Web Design Issues, September 1997
  18. Provenance in Web Documents
     • Standards for ethical aggregation?
     • Curator’s code for attributing discovery?
  19. Provenance in Open Government
     Need provenance for data integration and reuse
     • diversity of data sources
     • varying quality
     • different scope
     • different assumptions
     “Provenance is the number one issue that we face when publishing government data in data.gov.uk” John Sheridan, UK National Archives, data.gov.uk
  20. Provenance in Science
     “We need a paradigm that makes it simple […] to perform and publish reproducible computational research. […] a Reproducible Research Environment (RRE) […] provides computational tools together with the ability to automatically track the provenance of data, analysis, and results and to package them (or pointers to persistent versions of them) for redistribution.” Jill Mesirov, Chief Informatics Officer of the MIT/Harvard Broad Institute, in Science, January 2010
     Need provenance for reproducibility and verification of processes
  21. W3C Working Group
     Provenance is a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing.
     http://www.w3.org/TR/prov-overview
     Luc Moreau & Paul Groth
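The W3C definition on slide 21 can be made concrete with a toy record. The sketch below is illustrative only: it stores provenance statements as plain Python tuples and uses three PROV-O property names (prov:used, prov:wasGeneratedBy, prov:wasAssociatedWith). Every ex: resource and the helper functions `objects`, `sources_of` and `agents_behind` are hypothetical and not part of the slides.

```python
# Toy provenance record as (subject, property, object) statements.
# The property names come from PROV-O; every "ex:" resource is made up.
STATEMENTS = [
    ("ex:chart", "prov:wasGeneratedBy", "ex:plotting"),
    ("ex:plotting", "prov:used", "ex:cleaned_data"),
    ("ex:plotting", "prov:wasAssociatedWith", "ex:alice"),
    ("ex:cleaned_data", "prov:wasGeneratedBy", "ex:cleaning"),
    ("ex:cleaning", "prov:used", "ex:raw_data"),
    ("ex:cleaning", "prov:wasAssociatedWith", "ex:bob"),
]


def objects(subject, prop, statements=STATEMENTS):
    """Return all objects o such that (subject, prop, o) is recorded."""
    return [o for s, p, o in statements if s == subject and p == prop]


def sources_of(entity, statements=STATEMENTS):
    """Collect every entity that directly or indirectly fed into `entity`."""
    found = []
    todo = [entity]
    while todo:
        current = todo.pop()
        for activity in objects(current, "prov:wasGeneratedBy", statements):
            for used in objects(activity, "prov:used", statements):
                if used not in found:
                    found.append(used)
                    todo.append(used)
    return found


def agents_behind(entity, statements=STATEMENTS):
    """List the agents associated with the activity that generated `entity`."""
    agents = []
    for activity in objects(entity, "prov:wasGeneratedBy", statements):
        agents.extend(objects(activity, "prov:wasAssociatedWith", statements))
    return agents
```

Calling `sources_of("ex:chart")` walks back through the generating activities to the inputs they used, which is the kind of "where did this come from?" question a provenance record is meant to answer.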
  22. Provenance?
     • Provenance = Metadata?
       Provenance can be seen as metadata, but not all metadata is provenance
     • Provenance = Trust?
       Provenance provides a substrate for deriving different trust metrics
     • Provenance = Authentication?
       Provenance records can be used to verify and authenticate amongst users
  33. Three Dimensions
     • Content
       Capturing and representing provenance information
     • Management
       Storing, querying, and accessing provenance information
     • Use
       Interpreting and understanding provenance in practice
     Keywords: recording, annotating, workflow systems, scalability, interoperability, trust, accountability, compliance, explanation, debugging
  34. Basic Idea
  35. What you can do…
  36. Warning: provenance is about history!
  37. Visualization Anyone?
  38. Naive Approaches
     Source: InProv: Visualizing Provenance Graphs with Radial Layouts and Time-Based Hierarchical Grouping, Madelaine D. Boyd, http://www.seas.harvard.edu/sites/default/files/files/archived/Boyd.pdf
     Excerpts:
     Orbiter has several limitations. It does not have capabilities for query subgraph highlighting, regular expression filters, process grouping, annotations, or programmable views[16]. Furthermore, the structure of each summary node, where child nodes are grouped within parents and are hidden until the parent is expanded, benefits queries earlier in the dependency chain. Initial overviews often correspond with system bootup, and appear very similar across different traces (time slices of system activity).
     Figure 10: In these screenshots of Orbiter, the presence of edges overwhelms the visibility of nodes. By relying on a node-link graph layout and using spatial location to encode object relationships, Orbiter’s graph layout algorithm must draw many long edges to communicate node connections. Without edge bundling or opacity variation, the meanings of these relationships are obscured.
     Another one of Orbiter’s weaknesses is its node-link diagram layout. As a result, each node’s position in the X-Y plane and the length and angle of connecting lines are wasted attributes. The chosen graph layout algorithm (dot by default) arranges nodes to minimize […]
     Figure 11: (Top): A screenshot of the portion of the graph generated by GraphViz for a trace of the third provenance challenge. (Bottom): A zoomed-in view of the same graph. The horizontal black bars across the images are dense collections of edges.
     Effective large graph visualizations present the user with a summary view that can be explored, filtered, and expanded interactively.
     2.5 Tree Visualization
     While trees are a subcategory of graphs, because of their hierarchical composition, tree visualization forms its own subfield of research. A survey of over two-hundred tree visualizations is given at Hans-Jörg Schulz’s treevis.net. Visitors can narrow down by dimensionality (2D, 3D, or mixed), representation (explicit node-link diagram, implicit treemap, or combination), alignment (XY plot, radial layout, or free diagram)[55]. These categories are shown […]
     Figure 12: Left: Pajek uses various summary node-link and matrix-based representations depending on the structure of the supplied data set. Pictured is a main core subgraph extracted from routing data on the Internet. Right: TopoLayout optimizes the choice of visualization display depending on the underlying graph structure. The right column is TopoLayout’s output, while the left and middle columns are the outputs of the GRIP and FM graph layout algorithms.
     Figure 13: treevis.net defines different categories for tree maps. Tree maps can be categorized by dimensionality (2D, 3D, or mixed), representation (explicit, implicit, or mixed), or alignment (XY, radial, or spring).
     Tree visualizations are either explicit or implicit. Explicit representations resemble node-link diagrams. An example of an implicit representation is a tree map, a diagram where the entire tree is inscribed in a rectangle representing the root node. This root is subdivided hierarchically into more rectangles, which represent child nodes, and each child node is subdivided into more child nodes. Treemaps are excellent for displaying hierarchical or categorical data[57]. One famous example, shown in Figure 14, is the “Map of the Market” from SmartMoney.com, which displays in red and green the changes in market value of publicly-traded companies, grouped by market sector, with cell size proportional to market capitalization[64].
     TreePlus is an example of a tree-inspired graph visualization tool (Figure 15). It uses the guiding metaphor of “plant a seed to watch it grow” to summarize navigation of its tree-[…]
  39. InProv
     Source: InProv: Visualizing Provenance Graphs with Radial Layouts and Time-Based Hierarchical Grouping, Madelaine D. Boyd, http://www.seas.harvard.edu/sites/default/files/files/archived/Boyd.pdf
     Excerpt (6 Final Design):
     Figure 30: A view of a cluster of system activity. This particular timeslice shows the activity of the init.sh and mount processes.
     This visualization was designed with the Visual Information-Seeking Mantra in mind: “overview first, zoom and filter, then details-on-demand”[56].
  40. D3.js: Visualize the magnitude of flow between nodes in a network
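To illustrate what "magnitude of flow between nodes" means as data, here is a minimal sketch that turns a list of directed edges into a nodes/links structure with a numeric `value` per link, which is the general shape Sankey layouts such as the d3-sankey plugin consume. The function name `to_sankey` and the sample edges are assumptions for illustration, and counting parallel edges is just one possible way to derive a flow value.

```python
import json
from collections import Counter


def to_sankey(edges):
    """Turn (source, target) pairs into a nodes/links dict.

    Parallel edges are merged; their count becomes the link's value.
    """
    counts = Counter(edges)
    names = []
    for source, target in counts:
        for name in (source, target):
            if name not in names:
                names.append(name)
    index = {name: i for i, name in enumerate(names)}
    links = [
        {"source": index[s], "target": index[t], "value": n}
        for (s, t), n in counts.items()
    ]
    return {"nodes": [{"name": name} for name in names], "links": links}


# Hypothetical flow: two raw files feed one activity, which yields a report.
EDGES = [
    ("raw_a", "analysis"),
    ("raw_b", "analysis"),
    ("raw_b", "analysis"),
    ("analysis", "report"),
]

if __name__ == "__main__":
    print(json.dumps(to_sankey(EDGES), indent=2))
```

Links refer to nodes by their position in the `nodes` list, so the resulting JSON can be handed to a front-end layout without further lookups.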
  42. PROV-O-Viz (http://provoviz.org)
     • Insert any PROV-O RDF
     • Or connect to a SPARQL endpoint
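As a rough idea of what connecting to a SPARQL endpoint could involve, the sketch below builds a SPARQL-protocol GET URL for a query over two PROV-O relations (prov:used and prov:wasGeneratedBy). It is not the query PROV-O-Viz itself runs; the query text, the function `build_request_url`, and any endpoint address passed to it are assumptions. Nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit

# Query for two core PROV-O relations between activities and entities.
PROV_QUERY = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?activity ?relation ?entity WHERE {
  { ?activity prov:used ?entity . BIND("used" AS ?relation) }
  UNION
  { ?entity prov:wasGeneratedBy ?activity . BIND("wasGeneratedBy" AS ?relation) }
}
"""


def build_request_url(endpoint, query=PROV_QUERY):
    """Build a SPARQL-protocol GET URL; nothing is sent over the network."""
    # Append with "&" if the endpoint URL already carries a query string.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query.strip()})
```

The returned URL can be fetched with any HTTP client; each result row then corresponds to one activity-entity edge in the provenance graph.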
  43. Width of activities and entities is based on information flow
     Activities and entities are extracted from an ego graph
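The two statements on slide 43 can be sketched in a few lines. This is one possible reading, not PROV-O-Viz's actual implementation: an ego graph is taken to be all nodes within a fixed number of hops of a chosen node (ignoring edge direction), and a node's width is taken to be the larger of its total inflow and outflow, as is usual in Sankey diagrams. The function names, the `radius` parameter and the edge weights are assumptions.

```python
from collections import defaultdict, deque


def ego_graph(edges, center, radius=2):
    """Keep the weighted edges whose endpoints lie within `radius` hops of `center`.

    Hops ignore edge direction, so both upstream and downstream nodes are kept.
    """
    neighbours = defaultdict(set)
    for source, target, _ in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    # Breadth-first search outwards from the centre node.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for other in neighbours[node]:
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    return [(s, t, w) for s, t, w in edges if s in distance and t in distance]


def node_widths(edges):
    """Sankey-style width: the larger of a node's total inflow and outflow."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, weight in edges:
        outflow[source] += weight
        inflow[target] += weight
    nodes = set(inflow) | set(outflow)
    return {n: max(inflow[n], outflow[n]) for n in nodes}
```

Restricting the view to an ego graph keeps the diagram small enough to read, while the width rule lets heavily used activities and entities stand out.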
  44. Move activities and entities around
     Hover over interesting dependencies
  45. Embed graph into your own webpage
  46. Tom de Nies (Ghent University)
     Sara Magliacane (VU University Amsterdam)
  47. Discussion
     • Provenance is vital in many areas
       government, science, industry, …
     • PROV is the W3C standard for expressing provenance
     • Provenance graphs can be overwhelming and complex
     • PROV-O-Viz builds intuitive Sankey-style visualizations
     • … for any provenance trace expressed using PROV
     Data2Semantics: From Data to Semantics for Scientific Data Publishers
     http://semweb.cs.vu.nl/provoviz
     Thanks to: Paul Groth, Provenance XG, WG, Luc Moreau, James Cheney, Paolo Missier, Olaf Hartig, Satya Sahoo