The Symbiotic Nature of Provenance and Workflow




    Eric Stephan, Todd Halter

    Pacific Northwest National Laboratory

The Systems Science Challenge
•   Studying complex systems typically has the following characteristics:
     •    Interdisciplinary problems involving various stakeholders
     •    Leverages multiple tools, algorithms, data products, and sensors
     •    Reliant on highly iterative and repetitive techniques
     •    Steps are difficult to document and are often committed to memory or notes
•   The solution is to provide:
     •    ‘Plumbing’ to more easily configure and automate integration, calculation, analysis, and visualization
     •    A historical explanation of what occurred

Active Computer Science Research Areas
    •   Workflows – plumbing
    •   Provenance – explanation
    •   Without a historical explanation, workflows provide capability but neglect a documentation trail of what transpired.

    •   Without plumbing, provenance is difficult to introduce generically or to support legacy applications.

Example Workflow Products

    •   Creating executable workflows from schematic drawings
    I. Altintas, O. Barney, Z. Cheng, T. Critchlow, B. Ludaescher, S. Parker, A. Shoshani, M. Vouk, “Accelerating the Scientific Exploration Process with Scientific Workflows”, In Journal of Physics: Conference Series SciDAC 2006 proceedings. June 2006.




Example Workflow Products

    •   Constructing component-based analytical pipelines on enterprise service bus technology
    [Figure: MeDICi: Middleware for Data-Intensive Computing]
    Gorton I, AS Wynne, JP Almquist, and J Chatterton. 2008. “The MeDICi Integration Framework: A Platform for High Performance Data Streaming Applications.” In WICSA 2008. 7th IEEE/IFIP Working Conference on Software Architecture, Feb. 18-22, 2008, Vancouver, Canada, pp. 95-104. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/WICSA.2008.21




Example of Provenance

    •   Digital Library, Lineage

    •   Extensible open model: Open Provenance Model
       Moreau L, B Clifford, J Freire, J Futrelle, Y Gil, P Groth, N Kwasnikowska, S Miles, P Missier, J Myers, BA Plale, YL Simmhan, EG Stephan, and J Van den Bussche. 2010. "The Open Provenance Model Core Specification (v1.1)." Future Generation Computer Systems. doi:10.1016/j.future.2010.07.005

    •   Semantic web-based models: Proof Markup Language
         W3C Incubator Group, http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki




Examples of Creating Connectivity…

    •   Workflows
       •   Event listeners
       •   Self-describing workflow components, flow

    •   Provenance
       •   Formally described
       •   Support for reasoning, transitive closure, etc.
       •   Semantically relevant to provenance consumers
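
As an illustration of the "transitive closure" bullet above, the following sketch computes the full lineage of a data product from direct effect-to-cause edges. The edge list and file names are invented for the example, and `lineage` is a hypothetical helper, not part of any provenance toolkit named in these slides.

```python
from collections import defaultdict


def lineage(edges, start):
    """Return every node reachable from `start` along effect -> cause edges.

    This is the transitive closure of the direct-cause relation, restricted
    to a single starting node.
    """
    causes = defaultdict(list)
    for effect, cause in edges:
        causes[effect].append(cause)

    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for cause in causes[node]:
            if cause not in seen:  # guards against revisiting nodes and cycles
                seen.add(cause)
                stack.append(cause)
    return seen


# Direct (effect, cause) pairs for a small, made-up analysis chain.
edges = [
    ("plot.png", "render"),
    ("render", "summary.csv"),
    ("summary.csv", "aggregate"),
    ("aggregate", "raw.dat"),
]

print(sorted(lineage(edges, "plot.png")))
```

Only direct causes are recorded, yet a consumer can still ask which raw inputs a final plot depends on. That is the practical benefit of a formally described provenance model.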




Existing Deficiencies
    •   Workflows
       •   Listeners only reporting syntactic events
             •   Deluge of atomic transactions

       •   Inability to convey logical constructs
             •   E.g. initialization stage

       •   Lack of support to collect logs from legacy applications
    •   Provenance
       •   Collecting naïve provenance – big graph dilemma
       •   Hardcoded – risk being out of sync with workflow
       •   Collection without end user requirements
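
One way to picture the "deluge of atomic transactions" problem above is a listener that folds low-level component events into logical stages such as initialization. The sketch below assumes the workflow engine can call a listener with a component name and an event name. `StageListener`, the component-to-stage mapping, and the component names are all hypothetical.

```python
class StageListener:
    """Hypothetical listener that summarizes atomic events per logical stage."""

    def __init__(self, stage_of):
        # stage_of: component name -> logical stage (assumed to be supplied
        # by whoever knows the workflow's intent, e.g. the end user).
        self.stage_of = stage_of
        self.stages = {}  # stage -> {"components": [...], "events": count}

    def on_event(self, component, event):
        # The event name is ignored here; a fuller listener could keep it.
        stage = self.stage_of.get(component, "unclassified")
        record = self.stages.setdefault(stage, {"components": [], "events": 0})
        if component not in record["components"]:
            record["components"].append(component)
        record["events"] += 1

    def summary(self):
        """One record per logical stage, in the order stages first appeared."""
        return [(stage, rec["components"], rec["events"])
                for stage, rec in self.stages.items()]


listener = StageListener({
    "load_config": "initialization",
    "open_db": "initialization",
    "run_model": "execution",
})

# Six atomic events as a syntactic listener would report them.
for component, event in [
    ("load_config", "start"), ("load_config", "end"),
    ("open_db", "start"), ("open_db", "end"),
    ("run_model", "start"), ("run_model", "end"),
]:
    listener.on_event(component, event)

print(listener.summary())
```

Six raw events collapse into two stage records, which keeps the provenance graph smaller and closer to what an end user would recognize.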


Interoperability Aids
    •   Applying provenance execution models to workflow listeners
       •       E.g. Describe Anything DaAPI
      Wynne AS, I Gorton, JM Chase, and EG Stephan. 2009. MeDICi: An Open Platform for Sensor Integration. PNNL-18716, Pacific Northwest National Laboratory, Richland, WA.

    •   Incorporating provenance in workflow framework
       •       Semantic Abstract Workflow (SAW)
      Leonardo Salayandia and Paulo Pinheiro da Silva. On the Use of Semantic Abstract Workflows Rooted on Provenance Concepts. Provenance and Annotation of Data and Processes. Lecture Notes in Computer Science, 2010, Volume 6378/2010, 216-220, DOI: 10.1007/978-3-642-17819-1_24




Interoperability Aids
     •   Advanced storage
               •       Grids, Semantic Wikis

     •   New Provenance Model Abstractions
     Stephan EG, TD Halter, and BD Ermold. 2010. "Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate Research." In The 3rd International Provenance and Annotation Workshop (IPAW'2010).

     Gibson TD, KL Schuchardt, and EG Stephan. 2009. "Application of Named Graphs Towards Custom Provenance Views." In 1st Workshop on the Theory and Practice of Provenance (TaPP '09), Paper No. 5. USENIX, Berkeley, CA.


Conclusions

     •   Good news: workflow and provenance interoperability is evolving.

     •   Challenge #1: Recognizing the existence of a symbiotic relationship between workflow and provenance.

     •   Challenge #2: Finding new ways to harness this relationship to advance systems science research.

Questions?

     •   Contact: eric.stephan@pnl.gov




[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

The Symbiotic Nature of Provenance and Workflow

  • 1. The Symbiotic Nature of Provenance and Workflow. Eric Stephan, Todd Halter. Pacific Northwest National Laboratory.
  • 2. The Systems Science Challenge
      • Studying complex systems typically has the following characteristics:
          • Interdisciplinary problem involving various stakeholders
          • Leverages multiple tools, algorithms, data products, and sensors
          • Reliant on highly iterative and repetitive techniques
          • Steps are difficult to document and are often committed to memory or notes
      • The solution is to provide:
          • "Plumbing" to more easily configure and automate integration, calculation, analysis, and visualization
          • A historical explanation of what occurred
  • 3. Active Computer Science Research Areas
      • Workflows – plumbing
      • Provenance – explanation
      • Without a historical explanation, workflows provide capability but neglect a documentation trail of what transpired.
      • Without plumbing, provenance is difficult to introduce generically or to support legacy applications.
  • 4. Example Workflow Products
      • Creating executable workflows from schematic drawings
      • I. Altintas, O. Barney, Z. Cheng, T. Critchlow, B. Ludaescher, S. Parker, A. Shoshani, M. Vouk, "Accelerating the Scientific Exploration Process with Scientific Workflows", In Journal of Physics: Conference Series, SciDAC 2006 proceedings. June 2006.
  • 5. Example Workflow Products
      • Constructing component-based analytical pipelines on enterprise service bus technology (MeDICi: Middleware for Data-Intensive Computing)
      • Gorton I, AS Wynne, JP Almquist, and J Chatterton. 2008. "The MeDICi Integration Framework: A Platform for High Performance Data Streaming Applications." In WICSA 2008. 7th IEEE/IFIP Working Conference on Software Architecture, Feb. 18-22, 2008, Vancouver, Canada, pp. 95-104. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/WICSA.2008.21
  • 6. Example of Provenance
      • Digital Library, Lineage
      • Extensible open model: Open Provenance Model
          • Moreau L, B Clifford, J Freire, J Futrelle, Y Gil, P Groth, N Kwasnikowska, S Miles, P Missier, J Myers, BA Plale, YL Simmhan, EG Stephan, and J Van den Bussche. 2010. "The Open Provenance Model Core Specification (v1.1)." Future Generation Computer Systems. doi:10.1016/j.future.2010.07.005
      • Semantic web-based models: Proof Markup Language
          • W3C Incubator Group, http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki
  • 7. Examples of Creating Connectivity…
      • Workflows
          • Event listeners
          • Self-describing workflow components, flow
      • Provenance
          • Formally described
          • Support for reasoning, transitive closure, etc.
          • Semantically relevant to provenance consumers
  • 8. Existing Deficiencies
      • Workflows
          • Listeners only reporting syntactic events
          • Deluge of atomic transactions
          • Inability to convey logical constructs (e.g. initialization stage)
          • Lack of support to collect logs from legacy applications
      • Provenance
          • Collecting naïve provenance: the big graph dilemma
          • Hardcoded: risks being out of sync with the workflow
          • Collection without end user requirements
  • 9. Interoperability Aids
      • Applying provenance execution models to workflow listeners
          • E.g. Describe Anything DaAPI
          • Wynne AS, I Gorton, JM Chase, and EG Stephan. 2009. MeDICi: An Open Platform for Sensor Integration. PNNL-18716, Pacific Northwest National Laboratory, Richland, WA.
      • Incorporating provenance in the workflow framework
          • Semantic Abstract Workflow (SAW)
          • Leonardo Salayandia and Paulo Pinheiro da Silva. On the Use of Semantic Abstract Workflows Rooted on Provenance Concepts. Provenance and Annotation of Data and Processes. Lecture Notes in Computer Science, 2010, Volume 6378/2010, 216-220, DOI: 10.1007/978-3-642-17819-1_24
  • 10. Interoperability Aids
      • Advanced storage: Grids, Semantic Wikis
      • New provenance model abstractions
          • Stephan EG, TD Halter, and BD Ermold. 2010. "Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate Research." In The 3rd International Provenance and Annotation Workshop (IPAW'2010).
          • Gibson TD, KL Schuchardt, and EG Stephan. 2009. "Application of Named Graphs Towards Custom Provenance Views." In 1st Workshop on the Theory and Practice of Provenance (TaPP '09), p. Paper No. 5. USENIX, Berkeley, CA.
  • 11. Conclusions
      • Good news: workflow and provenance interoperability is evolving.
      • Challenge #1: Recognizing the existence of a symbiotic relationship between workflow and provenance.
      • Challenge #2: Finding new ways to harness this relationship to advance systems science research.
  • 12. Questions? Contact: eric.stephan@pnl.gov
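
Slides 7 and 8 mention workflow event listeners, logical constructs such as an initialization stage, and provenance reasoning via transitive closure. The sketch below illustrates how those pieces might fit together. It is not taken from MeDICi, DaAPI, SAW, or the Open Provenance Model: the class `ProvenanceListener`, the callback `on_step_finished`, the stage labels, and the file names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class ProvenanceListener:
    """Hypothetical workflow listener that records provenance as a graph."""

    def __init__(self):
        # artifact -> set of artifacts it was directly derived from
        self.derived_from = defaultdict(set)
        # logical stage name -> list of step names reported under that stage
        self.stages = defaultdict(list)

    def on_step_finished(self, step, stage, inputs, outputs):
        """Callback a workflow engine is assumed to invoke when a step completes."""
        # Tagging each step with a stage conveys a logical construct,
        # not just a syntactic "step finished" event.
        self.stages[stage].append(step)
        for out in outputs:
            self.derived_from[out].update(inputs)

    def lineage(self, artifact):
        """Return every upstream artifact (transitive closure of derived_from)."""
        seen = set()
        stack = [artifact]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries in the defaultdict
            for parent in self.derived_from.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative run with made-up step and file names.
listener = ProvenanceListener()
listener.on_step_finished("load", "initialization", ["raw.nc"], ["clean.nc"])
listener.on_step_finished("regrid", "analysis", ["clean.nc", "grid.cfg"], ["regridded.nc"])
listener.on_step_finished("plot", "visualization", ["regridded.nc"], ["figure.png"])

print(sorted(listener.lineage("figure.png")))
print(dict(listener.stages))
```

Because the listener builds the graph from the events the workflow itself emits, the recorded provenance cannot drift out of sync with the workflow the way a hardcoded description could, and the stage labels give consumers a coarser view than the raw stream of atomic events.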