What can provenance do for me?
Stian Soiland-Reyes
myGrid, University of Manchester
Ocean Sampling Day planning, Bremen, 2013-03-21

This work is licensed under a Creative Commons Attribution 3.0 Unported License

Provenance of Stian Soiland-Reyes
• Developer/researcher in myGrid team, School of Computer
  Science, University of Manchester since 2006
• Involved with:
  • Taverna – scientific workflow system
  • myExperiment – sharing workflows and artefacts
  • Wf4Ever – digital preservation (of workflows and workflow runs)
  • W3C Provenance WG – standards for describing provenance
  • Open Annotation – standard for tracking who said what about
    something
http://soiland-reyes.com/stian/work/

Overview
What is provenance?
• Attribution
• Derivation
• Activities
• PROV model
Aggregating and sharing
Why you want provenance

What is provenance?
• Attribution: who did it?
• Activity: what happens to it?
• Date and tool: when was it made? using what?
• Derivation: how did it change?
• Origin: where is it from?
• Aggregation: what is it part of?
• Annotations: what do others say about it?
• Attributes: what is it?
• Licensing: can I use it?
• Abstraction levels: shallots, sign, photo or flickr page?
By Dr Stephen Dann
licensed under Creative Commons Attribution-ShareAlike 2.0 Generic
http://www.flickr.com/photos/stephendann/3375055368/

Attribution
• Who collected this sample? Who helped?
• Which lab performed the sequencing?
• Who did the data analysis?
• Who curated the results?
• Who produced the raw data this analysis is based on?
• Who wrote the analysis workflow?

Diagram: Data wasAttributedTo Alice; Alice actedOnBehalfOf The lab

Why do I need this?
i. To be recognized for my work
ii. Who should I give credit to?
iii. Who should I complain to?
iv. Can I trust them?
v. Who should I make friends with?

Agent types: Person, Organization, SoftwareAgent
Roles:
  prov:wasAttributedTo
  prov:actedOnBehalfOf
  dct:creator
  dct:publisher
  pav:authoredBy
  pav:contributedBy
  pav:curatedBy
  pav:createdBy
  pav:importedBy
  pav:providedBy
  ...

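The attribution diagram can be read as a handful of statements. Below is a minimal Python sketch, assuming the statements are kept as plain (subject, relation, object) tuples rather than in a real RDF store. The `ex:` identifiers and the helper names `objects` and `credit_chain` are hypothetical; only the `prov:` relation and agent type names come from the slide.

```python
# Hypothetical identifiers (ex:...); relation names are taken from the slide.
triples = [
    ("ex:data", "prov:wasAttributedTo", "ex:alice"),
    ("ex:alice", "prov:actedOnBehalfOf", "ex:the-lab"),
    ("ex:alice", "rdf:type", "prov:Person"),
    ("ex:the-lab", "rdf:type", "prov:Organization"),
]


def objects(subject, predicate):
    """Return every object o for which (subject, predicate, o) is stated."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def credit_chain(entity):
    """List the agents an entity is attributed to, followed by
    whoever those agents acted on behalf of."""
    seen = []
    queue = objects(entity, "prov:wasAttributedTo")
    while queue:
        agent = queue.pop(0)
        if agent not in seen:
            seen.append(agent)
            queue.extend(objects(agent, "prov:actedOnBehalfOf"))
    return seen


print(credit_chain("ex:data"))
```

Walking `actedOnBehalfOf` is what answers both "who should I give credit to?" and "who should I complain to?": the person and the organization behind them.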
Derivation
• Which sample was this metagenome sequenced from?
• Which metagenomes was this sequence extracted from?
• Which sequence was the basis for the results?
• What is the previous revision of the new results?

Diagram labels: Sample, Metagenome, Sequence, Old results, New results
Relations: wasDerivedFrom, wasQuotedFrom, wasInfluencedBy, wasRevisionOf

Why do I need this?
i. To verify consistency (did I use the correct sequence?)
ii. To find the latest revision
iii. To backtrack where a diversion appeared after a change
iv. To credit work I depend on
v. Auditing and defence for peer review

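Two of the uses listed for derivation, finding what a result depends on and finding the latest revision, amount to walking derivation links. The sketch below is one way to do that in plain Python. The exact edges between the diagram's nodes are an assumption (the slide's layout does not survive extraction), and the `ex:` identifiers and function names are hypothetical. In PROV-O, `wasQuotedFrom` and `wasRevisionOf` are subproperties of `wasDerivedFrom`, so all three are followed when backtracking.

```python
# Relations that count as derivation when backtracking.
DERIVATION_KINDS = {
    "prov:wasDerivedFrom",
    "prov:wasQuotedFrom",
    "prov:wasRevisionOf",
}

# Assumed edges between the nodes named on the slide.
statements = [
    ("ex:metagenome", "prov:wasDerivedFrom", "ex:sample"),
    ("ex:sequence", "prov:wasQuotedFrom", "ex:metagenome"),
    ("ex:old-results", "prov:wasDerivedFrom", "ex:sequence"),
    ("ex:new-results", "prov:wasRevisionOf", "ex:old-results"),
]


def lineage(entity):
    """Walk derivation links backwards and list every source reached."""
    sources = []
    stack = [entity]
    while stack:
        current = stack.pop()
        for s, p, o in statements:
            if s == current and p in DERIVATION_KINDS and o not in sources:
                sources.append(o)
                stack.append(o)
    return sources


def latest_revision(entity):
    """Follow wasRevisionOf forwards until no newer revision is recorded.
    Assumes the revision chain has no cycles."""
    newer = {o: s for s, p, o in statements if p == "prov:wasRevisionOf"}
    while entity in newer:
        entity = newer[entity]
    return entity


print(lineage("ex:new-results"))
print(latest_revision("ex:old-results"))
```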
Activities
• What happened? When? Who?
• What was used and generated?
• Why was this workflow started?
• Which workflow ran? Where?

Diagram labels: Sample, Sequencing, Metagenome, Alice, Lab, technician, "2012-06-21", Workflow server, Workflow run, Workflow definition, Results
Relations: used, wasGeneratedBy, wasAssociatedWith, hadRole, wasStartedAt, wasInformedBy, wasStartedBy, hadPlan

Why do I need this?
i. To see which analysis was performed
ii. To find out who did what
iii. What was the metagenome used for?
iv. To understand the whole process (“make me a Methods section”)
v. To track down inconsistencies

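The "make me a Methods section" idea can be illustrated with a small sketch: given activity records with their inputs, outputs and agents, order them by `wasInformedBy` and print one sentence per activity. This is only an illustration under stated assumptions. The record layout, the `ex:` keys and the function names are hypothetical, and the link "Workflow run used Metagenome" is assumed rather than read off the diagram.

```python
# Hypothetical activity records loosely mirroring the slide's diagram.
activities = {
    "ex:workflow-run": {
        "label": "Workflow run",
        "used": ["Metagenome"],          # assumed link
        "generated": ["Results"],
        "agent": "Workflow server",
        "role": None,
        "started": None,
        "informed_by": "ex:sequencing",
    },
    "ex:sequencing": {
        "label": "Sequencing",
        "used": ["Sample"],
        "generated": ["Metagenome"],
        "agent": "Alice",
        "role": "technician",
        "started": "2012-06-21",
        "informed_by": None,
    },
}


def ordered(acts):
    """Order activities so each follows the activity that informed it."""
    done = []
    remaining = dict(acts)
    while remaining:
        ready = [key for key, act in remaining.items()
                 if act["informed_by"] is None or act["informed_by"] in done]
        if not ready:
            raise ValueError("cyclic or dangling wasInformedBy link")
        for key in ready:
            done.append(key)
            del remaining[key]
    return done


def methods_section(acts):
    """Render one sentence per activity, in process order."""
    sentences = []
    for key in ordered(acts):
        act = acts[key]
        who = act["agent"] + (f" as {act['role']}" if act["role"] else "")
        when = f", started {act['started']}" if act["started"] else ""
        sentences.append(
            f"{act['label']} used {', '.join(act['used'])} and generated "
            f"{', '.join(act['generated'])} (by {who}{when})."
        )
    return " ".join(sentences)


print(methods_section(activities))
```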
Core PROV model
Provenance Working Group
Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved.
http://www.w3.org/TR/prov-primer/

How to find provenance data
• Tracking provenance data
• Querying provenance “What was derived from X?”
• Pingback of provenance “Here’s new provenance data about X”

Diagram labels: resource, provenance, Provenance service
Relations: has_provenance, has_query_service

Why do I need this?
i. To propagate provenance data (e.g. when integrating data)
ii. To include external provenance (e.g. for reference datasets)
iii. To avoid black-box provenance (e.g. in workflows)
iv. To merge provenance at different abstraction levels
v. To see what has used the data (“Has someone done the analysis?”)
http://www.w3.org/TR/prov-aq/

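PROV-AQ lets a server advertise the `has_provenance` and `has_query_service` relations in an HTTP `Link` header. The sketch below picks those two relations out of a header string. It is a deliberately simplified parser (quoted parameters only, no full Web Linking grammar), the `example.org` URIs are made up, and the function name `provenance_links` is hypothetical.

```python
import re

PROV_NS = "http://www.w3.org/ns/prov#"

# Simplified patterns: <uri> followed by ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def provenance_links(link_header):
    """Map each PROV-AQ relation name to the URIs advertised for it."""
    found = {"has_provenance": [], "has_query_service": []}
    for uri, params in LINK_RE.findall(link_header):
        rels = dict(PARAM_RE.findall(params)).get("rel", "").split()
        for rel in rels:
            if rel.startswith(PROV_NS):
                name = rel[len(PROV_NS):]
                if name in found:
                    found[name].append(uri)
    return found


# Made-up header value for illustration.
header = (
    '<http://example.org/prov/sample1>; '
    'rel="http://www.w3.org/ns/prov#has_provenance", '
    '<http://example.org/sparql>; '
    'rel="http://www.w3.org/ns/prov#has_query_service"'
)

print(provenance_links(header))
```

In practice the header would come from a HEAD or GET request on the resource; a query such as “What was derived from X?” would then go to the advertised query service.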
Let’s talk about it
Open Annotation Data Model
Copyright © 2012-2013 the Contributors to the Open Annotation Core Data Model Specification,
published by the Open Annotation Community Group under the W3C Community Contributor License
Agreement (CLA).

• The body is somewhat about or related to the target
• Provenance: Who said that? When? Why?
• E.g. describing, commenting, highlighting, bookmarking,
  tagging, classifying, identifying
http://www.openannotation.org/spec/core/

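An annotation in this model ties a body to a target and records who said it, when and why. A rough Python sketch of such a record follows. The `oa:` property names reflect my reading of the 2013 Open Annotation core specification and should be checked against it; the dictionary is JSON-LD-like but carries no `@context`, and the `annotate` helper and all URIs are hypothetical.

```python
import json

# Motivations listed on the slide.
MOTIVATIONS = {
    "describing", "commenting", "highlighting", "bookmarking",
    "tagging", "classifying", "identifying",
}


def annotate(target, body, who, when, why):
    """Build one annotation record: body is about target,
    plus who said it, when, and with what motivation."""
    if why not in MOTIVATIONS:
        raise ValueError(f"unknown motivation: {why}")
    return {
        "@type": "oa:Annotation",
        "oa:hasTarget": target,
        "oa:hasBody": body,
        "oa:annotatedBy": who,
        "oa:annotatedAt": when,
        "oa:motivatedBy": "oa:" + why,
    }


# Made-up example values.
note = annotate(
    target="http://example.org/results/1",
    body="This value looks too high for a surface sample.",
    who="http://example.org/people/alice",
    when="2013-03-21T10:00:00Z",
    why="commenting",
)
print(json.dumps(note, indent=2))
```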
Gathering everything
 • Research Objects (RO) aggregate related resources, their
   provenance and annotations
 • Conveys “everything you need to know” about a
   study/experiment/analysis/dataset/workflow
 • Shareable, evolvable, contributable, citable
 • ROs have their own provenance and lifecycles

Diagram: a Research Object aggregates Hypothesis, Raw data, Workflow, Analysis tools, Results, Paper, Reference literature, Provenance and Annotations
http://purl.org/wf4ever/model

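The aggregation idea can be sketched as a small container that lists its resources and only accepts annotations about things it contains. This is an illustration, not the Wf4Ever RO model itself: the class, its methods and the manifest layout are hypothetical, and `ore:aggregates` is used only as a familiar label for the aggregation link.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Toy container for aggregated resources and their annotations."""
    uri: str
    aggregates: list = field(default_factory=list)
    annotations: list = field(default_factory=list)

    def aggregate(self, resource_uri):
        """Add a resource once; repeated calls are ignored."""
        if resource_uri not in self.aggregates:
            self.aggregates.append(resource_uri)

    def annotate(self, target, body):
        """Attach a note to the RO itself or to one of its resources."""
        if target != self.uri and target not in self.aggregates:
            raise ValueError(f"{target} is not part of this research object")
        self.annotations.append({"target": target, "body": body})

    def manifest(self):
        """Return a plain dictionary describing the aggregation."""
        return {
            "@id": self.uri,
            "ore:aggregates": list(self.aggregates),
            "annotations": list(self.annotations),
        }


# Made-up relative paths for illustration.
ro = ResearchObject("ex:study-ro")
for resource in ["hypothesis.txt", "raw-data.csv", "workflow.t2flow", "results.csv"]:
    ro.aggregate(resource)
ro.annotate("results.csv", "Generated by the workflow run")
print(ro.manifest())
```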
Research Objects
Diagram: a Research Object aggregates Hypothesis, Raw data, Workflow, Analysis tools, Results, Paper, Reference literature, Provenance and Annotations

Why do I need them?
i. To share your research materials (RO as a social object)
ii. To facilitate reproducibility and reuse of methods
iii. To be recognized and cited (even for constituent resources)
iv. To preserve results and prevent decay (curation of
     workflow definition; using provenance for partial rerun)

myExperiment Research Objects




Why you want provenance
i. To acknowledge sources you have based your work on
ii. To receive credit when others use your work
iii. To build trust (who did it?) and verify consistency (was it done
      correctly?)
iv. To audit your work and defend it in peer review
v. To keep track of resources that change over time (versioning)
vi. To investigate and compare data (where did that strange value
      come from?)
vii. To gather everything you need for that Methods section
viii. To facilitate reproducibility by tracking activities and their
      outcomes
ix. To prevent decay by aggregating related resources and their
      descriptions

Thank you
Questions?

Twitter: @soilandreyes
Skype: soiland
http://soiland-reyes.com/stian/work/
http://www.wf4ever-project.org/

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

2013-03-21 What can provenance do for me?

  • 1. What can provenance do for me? Stian Soiland-Reyes, myGrid, University of Manchester. Ocean Sampling Day planning, Bremen, 2013-03-21. This work is licensed under a Creative Commons Attribution 3.0 Unported License.
  • 2. Provenance of Stian Soiland-Reyes
      • Developer/researcher in myGrid team, School of Computer Science, University of Manchester since 2006
      • Involved with:
        • Taverna - scientific workflow system
        • myExperiment - sharing workflows and artefacts
        • Wf4Ever - digital preservation (of workflows and workflow runs)
        • W3C Provenance WG - standards for describing provenance
        • Open Annotation - standard for tracking who said what about something
      http://soiland-reyes.com/stian/work/
  • 3. Overview
      What is provenance?
        • Attribution
        • Derivation
        • Activities
        • PROV model
      Aggregating and sharing
      Why you want provenance
  • 4. What is provenance?
      • Attribution: who did it?
      • Abstraction levels: shallots, sign, photo or flickr page?
      • Activity: what happens to it?
      • Date and tool: when was it made? using what?
      • Derivation: how did it change?
      • Origin: where is it from?
      • Aggregation: what is it part of?
      • Annotations: what do others say about it?
      • Attributes: what is it?
      • Licensing: can I use it?
      Photo: By Dr Stephen Dann, licensed under Creative Commons Attribution-ShareAlike 2.0 Generic, http://www.flickr.com/photos/stephendann/3375055368/
  • 5. Attribution
      Diagram: Data wasAttributedTo Alice; Alice actedOnBehalfOf The lab
      • Who collected this sample? Who helped?
      • Which lab performed the sequencing?
      • Who did the data analysis?
      • Who curated the results?
      • Who produced the raw data this analysis is based on?
      • Who wrote the analysis workflow?
      Why do I need this?
        i. To be recognized for my work
        ii. Who should I give credits to?
        iii. Who should I complain to?
        iv. Can I trust them?
        v. Who should I make friends with?
      Roles: prov:wasAttributedTo, prov:actedOnBehalfOf, dct:creator, dct:publisher, pav:authoredBy, pav:contributedBy, pav:curatedBy, pav:createdBy, pav:importedBy, pav:providedBy, ...
      Agent types: Person, Organization, SoftwareAgent
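The attribution relations on this slide can be sketched as plain subject-predicate-object triples. This is only an illustration in Python, not a PROV library: the identifiers `data`, `alice` and `lab`, and the helper `credit_chain`, are hypothetical names chosen to mirror the diagram.

```python
# Hypothetical provenance statements as (subject, predicate, object) triples.
TRIPLES = [
    ("data", "prov:wasAttributedTo", "alice"),
    ("alice", "prov:actedOnBehalfOf", "lab"),
]


def credit_chain(entity, triples):
    """Return the agents to credit for an entity: the agents it is
    attributed to, followed by whoever they acted on behalf of."""
    agents = [o for s, p, o in triples
              if s == entity and p == "prov:wasAttributedTo"]
    seen = set(agents)
    queue = list(agents)
    while queue:
        agent = queue.pop(0)
        for s, p, o in triples:
            if s == agent and p == "prov:actedOnBehalfOf" and o not in seen:
                seen.add(o)
                agents.append(o)
                queue.append(o)
    return agents


print(credit_chain("data", TRIPLES))
```

Following `actedOnBehalfOf` is one way to answer "who should I give credits to?" for both the person and the organisation behind them.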
  • 6. Derivation
      Diagram: Sample, Metagenome, Sequence, Old results and New results, linked by wasDerivedFrom, wasQuotedFrom, wasInfluencedBy and wasRevisionOf
      • Which sample was this metagenome sequenced from?
      • Which meta-genomes was this sequence extracted from?
      • Which sequence was the basis for the results?
      • What is the previous revision of the new results?
      Why do I need this?
        i. To verify consistency (did I use the correct sequence?)
        ii. To find the latest revision
        iii. To backtrack where a diversion appeared after a change
        iv. To credit work I depend on
        v. Auditing and defence for peer review
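Backtracking through derivations, as asked for on this slide, amounts to following links from a result to its sources. The sketch below assumes a simplified single-parent chain; the entity names and the `lineage` helper are hypothetical, and the specific relations (wasQuotedFrom, wasRevisionOf) are all treated as kinds of derivation, which is how PROV defines them.

```python
# Hypothetical derivation links: each entity maps to the entity it was
# derived from (revisions and quotations are treated as derivations).
DERIVED_FROM = {
    "new_results": "old_results",
    "old_results": "sequence",
    "sequence": "metagenome",
    "metagenome": "sample",
}


def lineage(entity, derived_from):
    """Walk the derivation links back from an entity to its origin."""
    chain = [entity]
    while chain[-1] in derived_from:
        parent = derived_from[chain[-1]]
        if parent in chain:  # stop if the data accidentally contains a cycle
            break
        chain.append(parent)
    return chain


print(" <- ".join(reversed(lineage("new_results", DERIVED_FROM))))
```

Real provenance graphs allow several sources per entity, so a full implementation would walk a graph rather than a single chain.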
  • 7. Activities
      Diagram: Alice (Lab technician), Sample, Sequencing ("2012-06-21"), Metagenome, Workflow server, Workflow run, Workflow definition and Results, linked by hadRole, wasAssociatedWith, used, wasStartedAt, wasGeneratedBy, wasInformedBy, wasStartedBy and hadPlan
      • What happened? When? Who?
      • What was used and generated?
      • Why was this workflow started?
      • Which workflow ran? Where?
      Why do I need this?
        i. To see which analysis was performed
        ii. To find out who did what
        iii. What was the metagenome used for?
        iv. To understand the whole process (“make me a Methods section”)
        v. To track down inconsistencies
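The "make me a Methods section" idea can be illustrated by turning activity records into sentences. This is a rough sketch under assumptions: the record layout, the exact edges between the diagram's nodes, and the helpers `methods_section` and `used_by` are hypothetical, not part of PROV or Taverna.

```python
# Hypothetical activity records, loosely following the slide's diagram.
ACTIVITIES = [
    {"name": "Sequencing", "used": ["Sample"], "generated": ["Metagenome"],
     "agent": "Alice", "started": "2012-06-21"},
    {"name": "Workflow run", "used": ["Metagenome"], "generated": ["Results"],
     "agent": "Workflow server", "started": None},
]


def methods_section(activities):
    """Write one sentence per activity from its used/generated records."""
    sentences = []
    for act in activities:
        when = f" on {act['started']}" if act["started"] else ""
        sentences.append(
            f"{' and '.join(act['generated'])} was generated by "
            f"{act['name']}{when}, which used {' and '.join(act['used'])} "
            f"and was associated with {act['agent']}."
        )
    return sentences


def used_by(entity, activities):
    """Answer 'what was this entity used for?'."""
    return [act["name"] for act in activities if entity in act["used"]]


for sentence in methods_section(ACTIVITIES):
    print(sentence)
print(used_by("Metagenome", ACTIVITIES))
```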
  • 8. Core PROV model (Provenance Working Group). Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. http://www.w3.org/TR/prov-primer/
  • 9. How to find provenance data
      Diagram: resource, provenance and Provenance service, linked by has_provenance and has_query_service
      • Tracking provenance data
      • Querying provenance: “What was derived from X?”
      • Pingback of provenance: “Here’s new provenance data about X”
      Why do I need this?
        i. To propagate provenance data (e.g. when integrating data)
        ii. To include external provenance (e.g. for reference datasets)
        iii. To avoid black-box provenance (e.g. in workflows)
        iv. To merge provenance at different abstraction levels
        v. To see what has used the data (“Has someone done the analysis?”)
      http://www.w3.org/TR/prov-aq/
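PROV-AQ lets a server advertise provenance through HTTP `Link` headers using the `has_provenance` and `has_query_service` relations. The sketch below shows how a client might pick those links out of a header value. It assumes a simplified header (no commas or semicolons inside URIs or parameters); the `example.org` URLs and the helper `links_by_rel` are hypothetical.

```python
import re

HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"
HAS_QUERY_SERVICE = "http://www.w3.org/ns/prov#has_query_service"

# Hypothetical Link header value as a server might return for a resource.
EXAMPLE_HEADER = (
    '<http://example.org/provenance/x>; rel="' + HAS_PROVENANCE + '", '
    '<http://example.org/sparql>; rel="' + HAS_QUERY_SERVICE + '"'
)


def links_by_rel(link_header):
    """Map each rel value to the URIs that carry it (simplified parsing)."""
    found = {}
    pattern = r'<([^>]*)>((?:\s*;\s*[^,;]+)*)'
    for uri, params in re.findall(pattern, link_header):
        match = re.search(r'rel="([^"]*)"', params)
        if match:
            for rel in match.group(1).split():
                found.setdefault(rel, []).append(uri)
    return found


links = links_by_rel(EXAMPLE_HEADER)
print(links.get(HAS_PROVENANCE))
print(links.get(HAS_QUERY_SERVICE))
```

A production client would use a full Link header parser, since real headers may carry additional parameters such as `anchor`.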
  • 10. Let’s talk about it
      Open Annotation Data Model. Copyright © 2012-2013 the Contributors to the Open Annotation Core Data Model Specification, published by the Open Annotation Community Group under the W3C Community Contributor License Agreement (CLA).
      • The body is somewhat about or related to the target
      • Provenance: Who said that? When? Why?
      • E.g. describing, commenting, highlighting, bookmarking, tagging, classifying, identifying
      http://www.openannotation.org/spec/core/
  • 11. Gathering everything
      • Research Objects (RO) aggregate related resources, their provenance and annotations
      • Conveys “everything you need to know” about a study/experiment/analysis/dataset/workflow
      • Shareable, evolvable, contributable, citable
      • ROs have their own provenance and lifecycles
      Diagram: Research Object aggregates Hypothesis, Provenance, Raw data, Annotations, Workflow, Analysis tools, Results, Paper, Reference literature
      http://purl.org/wf4ever/model
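The aggregation idea can be sketched as a small container that records which resources a Research Object holds and reports what is still missing. This is not the Wf4Ever RO model itself: the class `ResearchObject`, its fields, and the file names are hypothetical, chosen only to illustrate "gathering everything".

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Toy aggregation: maps each aggregated resource to its kind."""
    aggregates: dict = field(default_factory=dict)
    annotations: list = field(default_factory=list)

    def add(self, resource, kind):
        self.aggregates[resource] = kind

    def missing_kinds(self, expected):
        """Return the expected kinds that no aggregated resource covers."""
        present = set(self.aggregates.values())
        return sorted(set(expected) - present)


ro = ResearchObject()
ro.add("analysis.t2flow", "workflow")        # hypothetical file names
ro.add("run-2013-03-21.prov", "provenance")
print(ro.missing_kinds(["workflow", "provenance", "raw data", "results"]))
```

A check like `missing_kinds` hints at how an aggregation can help prevent decay: gaps in the set of related resources become visible before they are lost.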
  • 12. Research Objects
      Diagram: Research Object aggregates Hypothesis, Provenance, Raw data, Annotations, Workflow, Analysis tools, Results, Paper, Reference literature
      Why do I need them?
        i. To share your research materials (RO as a social object)
        ii. To facilitate reproducibility and reuse of methods
        iii. To be recognized and cited (even for constituent resources)
        iv. To preserve results and prevent decay (curation of workflow definition; using provenance for partial rerun)
  • 13. myExperiment Research Objects
  • 14. Why you want provenance
        i. To acknowledge sources you have based your work on
        ii. Receive credit when others use your work
        iii. Build trust (who did it?) and verify consistency (was it done correctly?)
        iv. To audit and defend for peer review
        v. Keep track of resources that change over time (versioning)
        vi. Investigate and compare data (where did that strange value come from?)
        vii. Gather everything you need for that Methods section
        viii. Facilitate reproducibility by tracking activities and their outcomes
        ix. To prevent decay by aggregating related resources and their descriptions
  • 15. Thank you. Questions?
      Twitter: @soilandreyes
      Skype: soiland
      http://soiland-reyes.com/stian/work/
      http://www.wf4ever-project.org/