SlideShare a Scribd company logo
1 of 25
myExperiment Research Objects:
Beyond Workflows and Packs
Stian Soiland-Reyes
myGrid, University of Manchester
BOSC 2013, ISMB, Berlin, 2013-07-19
This work is licensed under a
Creative Commons Attribution 3.0 Unported License
Motivation: Scientific workflows
Coordinated execution of
services and linked resources
Dataflow between services
Web services (SOAP, REST)
Command line tools
Scripts
User interactions
Components (nested workflows)
Method becomes:
Documented visually
Shareable as single definition
Reusable with new inputs
Repurposable other services
Reproducible?
http://www.myexperiment.org/workflows/3355
http://www.taverna.org.uk/
http://www.biovel.eu/
But workflows are complex machines
Outputs
Inputs
Configuration
Components
http://www.myexperiment.org/workflows/3355
• Will it still work after a year? 10 years?
• Expanding components, we see a workflow involves a
series of specific tools and services which
• Depend on datasets, software libraries, other tools
• Are often poorly described or understood
• Over time evolve, change, break or are replaced
• User interactions are not reproducible
• But can be tracked and replayed
Electronic Paper Not Enough
Hypothesis Experiment
Result
Analysis
Conclusions
Investigation
Data Data
Electronic
paper
Publish
http://www.force11.org/beyondthepdf2http://figshare.com/
Open Research movement: Openly share the data of your experiments
http://datadryad.org/
http://www.researchobject.org/
RESEARCH OBJECT (RO)
http://www.researchobject.org/
Research objects goal: Openly share everything about your
experiments, including how those things are related
What is in a research object?
A Research Object bundles and relates digital
resources of a scientific experiment or
investigation:
Data used and results produced in
experimental study
Methods employed to produce and analyse
that data
Provenance and settings for the experiments
People involved in the investigation
Annotations about these resources, that are
essential to the understanding and
interpretation of the scientific outcomes
captured by a research object
http://www.researchobject.org/
Gathering everything
Research Objects (RO) aggregate related resources, their
provenance and annotations
Conveys “everything you need to know” about a
study/experiment/analysis/dataset/workflow
Shareable, evolvable, contributable, citable
ROs have their own provenance and lifecycles
Why Research Objects?
i. To share your research materials
(RO as a social object)
ii. To facilitate reproducibility and reuse of methods
iii. To be recognized and cited
(even for constituent resources)
iv. To preserve results and prevent decay
(curation of workflow definition;
using provenance for partial rerun)
A Research object http://alpha.myexperiment.org/packs/387
QualityAssessment of a research object
QualityMonitoring
Annotations in research objects
Types: “This document contains an hypothesis”
Relations: “These datasets are consumed by that tool”
Provenance: “These results came from this workflow run”
Descriptions: “Purpose of this step is to filter out invalid data”
Comments: “This method looks useful, but how do I install it?”
Examples: “This is how you could use it”
What is provenance?
By Dr Stephen Dann
licensed under Creative Commons Attribution-ShareAlike 2.0 Generic
http://www.flickr.com/photos/stephendann/3375055368/
Attribution
who did it?
Derivation
how did it change?
Activity
what happens to it?
Licensing
can I use it?
Attributes
what is it?
Origin
where is it from?
Annotations
what do others say about it?
Abstraction levels
shallots, sign, photo or flickr page?
Aggregation
what is it part of?
Date and tool
when was it made?
using what?
Attribution
Who collected this sample? Who helped?
Which lab performed the sequencing?
Who did the data analysis?
Who wrote the analysis workflow?
Who made the data set used by analysis?
Who curated the results?
Why do I need this?
i. To be recognized for my work
ii. Who should I give credits to?
iii. Who should I complain to?
iv. Can I trust them?
v. Who should I make friends with?
Alice
The
lab
Data
wasAttributedTo
actedOnBehalfOf
Derivation
Which sample was this metagenome sequenced from?
Which meta-genomes was this sequence extracted from?
Which sequence was the basis for the results?
What is the previous revision of the new results?
Why do I need this?
i. To verify consistency (did I use
the correct sequence?)
ii. To find the latest revision
iii. To backtrack where a diversion
appeared after a change
iv. To credit work I depend on
v. Auditing and defence for
peer review
wasDerivedFrom
wasQuotedFrom
Sequence
New
results
wasDerivedFrom
Sample
Meta -
genome
Old
results
wasRevisionOf
wasInfluencedBy
Activities
What happened? When? Who?
What was used and generated?
Why was this workflow started?
Which workflow ran? Where?
Why do I need this?
i. To see which analysis was performed
ii. To find out who did what
iii. What was the metagenome
used for?
iv. To understand the whole process
“make me a Methods section”
v. To track down inconsistencies
used
wasGeneratedBy
wasStartedAt
"2012-06-21"
Metagenome
Sample
wasAssociatedWith
Workflow
server
wasInformedBy
wasStartedBy
Workflow
run
wasGeneratedBy
Results
Sequencing
wasAssociatedWith
Alice
hadPlan
Workflow
definition
hadRole
Lab
technician
Results
Core PROV model
http://www.w3.org/TR/prov-primer/
Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved.
Provenance Working Group
Provenance of what?
Who made the (content of) research object? Who
maintains it?
Who wrote this document? Who uploaded it?
Which CSV was this Excel file imported from?
Who wrote this description? When? How did we get
it?
What is the state of this RO? (Live or Published?)
What did the research object look like before?
(Revisions) – are there newer versions?
Which research objects are derived from this RO?
Research object model at a glance
Research
Object
Resource
Resource
Resource
Annotation
Annotation
Annotation
oa:hasTarget
Resource
Resource
Annotation graph
oa:hasBody
ore:aggregates
Manifest
Wf4Ever architecture
Semantic REST API
RDF triple store
(RO structure,
Annotations)
RO index
Uploaded files
PortalChecklist
service
Command
line
Workflow
runner
...
Saving a research object:
RO bundle
Single, transferrable research object
Self-contained snapshot
Which files in ZIP, which are URIs? (Up to user/application)
Regular ZIP file, explored and unpacked with standard tools
JSON manifest is programmatically accessible without RDF
understanding
Works offline and in desktop applications – no REST API
access required
Basis for RO-enabled file formats, e.g. Taverna run bundle
Exchanged with myExperiment and RO tools
Example RO bundle:
Workflow Results
workflowrun.prov.ttl
(RDF)
outputA.txt
outputC.jpg
outputB/
https://w3id.org/bundl
intermediates/
1.txt
2.txt
3.txt
de/def2e58b-50e2-4949-9980-fd310166621a.txt
inputA.txt
workflow
URI
references
attribution
execution
environment
Aggregating in Research Object
ZIP folder structure (RO Bundle)
mimetype
application/vnd.wf4ever.robundle+zip
.ro/manifest.json
W3C community group for RO
http://www.w3.org/community/rosc/
Summary
Provide provenance records: Use (and extend) PROV.
Share your methods (tools, workflows), not just your data
Make research objects: Collect resources and relate them
Pick your RO integration level to adapt:
1. Conceptual model
2. RO Core ontology (aggregation and annotation)
3. Workflow description and provenance
4. Wf4Ever architecture for APIs
5. myExperiment as user interface

More Related Content

Viewers also liked

2012 CNI Fall Membership Meeting
2012 CNI Fall Membership Meeting2012 CNI Fall Membership Meeting
2012 CNI Fall Membership MeetingPaolo Ciccarese
 
PRO Use Cases for Scientific Communities
PRO Use Cases for Scientific CommunitiesPRO Use Cases for Scientific Communities
PRO Use Cases for Scientific CommunitiesPaolo Ciccarese
 
Paolo ciccarese DILS 2013 keynote
Paolo ciccarese DILS 2013 keynotePaolo ciccarese DILS 2013 keynote
Paolo ciccarese DILS 2013 keynotePaolo Ciccarese
 
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...Paolo Ciccarese
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Stian Soiland-Reyes
 
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY Paolo Ciccarese
 
Annotopia: Open Annotation Server
Annotopia: Open Annotation ServerAnnotopia: Open Annotation Server
Annotopia: Open Annotation ServerPaolo Ciccarese
 
2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?Stian Soiland-Reyes
 
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)Paolo Ciccarese
 
Los leones art festival
Los  leones  art  festivalLos  leones  art  festival
Los leones art festivalNorthside ISD
 
Marc Garufi: National Chlamydia Coalition Presentation
Marc Garufi: National Chlamydia Coalition PresentationMarc Garufi: National Chlamydia Coalition Presentation
Marc Garufi: National Chlamydia Coalition PresentationNational Chlamydia Coalition
 
How to build succesful relationships
How to build succesful relationshipsHow to build succesful relationships
How to build succesful relationshipstonybrammer
 

Viewers also liked (16)

Stills from The Cellist
Stills from The CellistStills from The Cellist
Stills from The Cellist
 
2012 CNI Fall Membership Meeting
2012 CNI Fall Membership Meeting2012 CNI Fall Membership Meeting
2012 CNI Fall Membership Meeting
 
PRO Use Cases for Scientific Communities
PRO Use Cases for Scientific CommunitiesPRO Use Cases for Scientific Communities
PRO Use Cases for Scientific Communities
 
Rules at #4SQHACKPH
Rules at #4SQHACKPHRules at #4SQHACKPH
Rules at #4SQHACKPH
 
Foursquare API + GoRated.PH
Foursquare API + GoRated.PHFoursquare API + GoRated.PH
Foursquare API + GoRated.PH
 
WebGeek DevCup Theme
WebGeek DevCup ThemeWebGeek DevCup Theme
WebGeek DevCup Theme
 
Paolo ciccarese DILS 2013 keynote
Paolo ciccarese DILS 2013 keynotePaolo ciccarese DILS 2013 keynote
Paolo ciccarese DILS 2013 keynote
 
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...
SemTechBiz 2012: Domeo: a web-based tool for semantic annotation of online do...
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)Taverna workflow management system (2010 11-30 Bath Workflow Tools)
Taverna workflow management system (2010 11-30 Bath Workflow Tools)
 
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY
Integrating OPEN ANNOTATION with any DOMAIN ONTOLOGY
 
Annotopia: Open Annotation Server
Annotopia: Open Annotation ServerAnnotopia: Open Annotation Server
Annotopia: Open Annotation Server
 
2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?2013-03-21 What can provenance do for me?
2013-03-21 What can provenance do for me?
 
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)
(Live) Annotopia Overview by Paolo Ciccarese (Architect and principal developer)
 
Los leones art festival
Los  leones  art  festivalLos  leones  art  festival
Los leones art festival
 
Marc Garufi: National Chlamydia Coalition Presentation
Marc Garufi: National Chlamydia Coalition PresentationMarc Garufi: National Chlamydia Coalition Presentation
Marc Garufi: National Chlamydia Coalition Presentation
 
How to build succesful relationships
How to build succesful relationshipsHow to build succesful relationships
How to build succesful relationships
 

Similar to 2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)

2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research ObjectStian Soiland-Reyes
 
Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsAndrea Wiggins
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasMerce Crosas
 
Dave de Roure - The myExperiment approach towards Open Science
Dave de Roure - The myExperiment approach towards Open ScienceDave de Roure - The myExperiment approach towards Open Science
Dave de Roure - The myExperiment approach towards Open Scienceshwu
 
Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11monicaduke
 
Biocatalogue Talk Slides
Biocatalogue Talk SlidesBiocatalogue Talk Slides
Biocatalogue Talk SlidesBioCatalogue
 
myExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentmyExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentDavid De Roure
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...Carole Goble
 
2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurghJun Zhao
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Carole Goble
 
Replicating FLOSS Research as eResearch
Replicating FLOSS Research as eResearchReplicating FLOSS Research as eResearch
Replicating FLOSS Research as eResearchAndrea Wiggins
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
Bhagat Myexperiment Bosc2008
Bhagat Myexperiment Bosc2008Bhagat Myexperiment Bosc2008
Bhagat Myexperiment Bosc2008bosc_2008
 
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated Workflow
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated WorkflowCleaning Up the Mess: Modernizing Your Dev Team’s Outdated Workflow
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated WorkflowBohyun Kim
 

Similar to 2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX) (20)

2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object2017-11-03 Provenance and Research Object
2017-11-03 Provenance and Research Object
 
20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-Hong20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-Hong
 
Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna Workflows
 
UCIAD overview
UCIAD overviewUCIAD overview
UCIAD overview
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
Dave de Roure - The myExperiment approach towards Open Science
Dave de Roure - The myExperiment approach towards Open ScienceDave de Roure - The myExperiment approach towards Open Science
Dave de Roure - The myExperiment approach towards Open Science
 
My Experiment
My ExperimentMy Experiment
My Experiment
 
Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11Mduke sagecite-jisc-march11
Mduke sagecite-jisc-march11
 
Dev ops
Dev opsDev ops
Dev ops
 
Biocatalogue Talk Slides
Biocatalogue Talk SlidesBiocatalogue Talk Slides
Biocatalogue Talk Slides
 
myExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research EnvironmentmyExperiment - Defining the Social Virtual Research Environment
myExperiment - Defining the Social Virtual Research Environment
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014
 
DataHub
DataHubDataHub
DataHub
 
Replicating FLOSS Research as eResearch
Replicating FLOSS Research as eResearchReplicating FLOSS Research as eResearch
Replicating FLOSS Research as eResearch
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Bhagat Myexperiment Bosc2008
Bhagat Myexperiment Bosc2008Bhagat Myexperiment Bosc2008
Bhagat Myexperiment Bosc2008
 
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated Workflow
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated WorkflowCleaning Up the Mess: Modernizing Your Dev Team’s Outdated Workflow
Cleaning Up the Mess: Modernizing Your Dev Team’s Outdated Workflow
 

More from Stian Soiland-Reyes

2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systemsStian Soiland-Reyes
 
2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language ViewerStian Soiland-Reyes
 
2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.orgStian Soiland-Reyes
 
2015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 20152015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 2015Stian Soiland-Reyes
 
2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architectureStian Soiland-Reyes
 
2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator projectStian Soiland-Reyes
 
2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wildStian Soiland-Reyes
 
2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objectsStian Soiland-Reyes
 
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...Stian Soiland-Reyes
 
2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow systemStian Soiland-Reyes
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXTaverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXStian Soiland-Reyes
 
Bringing caBIG services together using Taverna
Bringing caBIG services together using TavernaBringing caBIG services together using Taverna
Bringing caBIG services together using TavernaStian Soiland-Reyes
 

More from Stian Soiland-Reyes (16)

2017-09-27-scholarly-html-ro
2017-09-27-scholarly-html-ro2017-09-27-scholarly-html-ro
2017-09-27-scholarly-html-ro
 
2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems2017-11-03 Scientific Workflow systems
2017-11-03 Scientific Workflow systems
 
2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer2017-07-22 Common Workflow Language Viewer
2017-07-22 Common Workflow Language Viewer
 
2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org2016-05-18-Make research reproducible again - researchobject.org
2016-05-18-Make research reproducible again - researchobject.org
 
2015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 20152015-07-11 Apache Taverna - BOSC 2015
2015-07-11 Apache Taverna - BOSC 2015
 
2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture2014-10-31 Taverna 3 architecture
2014-10-31 Taverna 3 architecture
 
2014-10-30 Taverna 3 status
2014-10-30 Taverna 3 status2014-10-30 Taverna 3 status
2014-10-30 Taverna 3 status
 
2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project2014-10-30 Taverna as an Apache Incubator project
2014-10-30 Taverna as an Apache Incubator project
 
2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild2014-06-13 Research objects in the wild
2014-06-13 Research objects in the wild
 
2013-05-29 Taverna Provenance
2013-05-29 Taverna Provenance2013-05-29 Taverna Provenance
2013-05-29 Taverna Provenance
 
2013-01-17 Research Object
2013-01-17 Research Object2013-01-17 Research Object
2013-01-17 Research Object
 
2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects
 
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO...
 
2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system2011-06-08 Taverna workflow system
2011-06-08 Taverna workflow system
 
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTXTaverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX
 
Bringing caBIG services together using Taverna
Bringing caBIG services together using TavernaBringing caBIG services together using Taverna
Bringing caBIG services together using Taverna
 

Recently uploaded

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

Recently uploaded (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX)

  • 1. myExperiment Research Objects: Beyond Workflows and Packs Stian Soiland-Reyes myGrid, University of Manchester BOSC 2013, ISMB, Berlin, 2013-07-19 This work is licensed under a Creative Commons Attribution 3.0 Unported License
  • 2. Motivation: Scientific workflows Coordinated execution of services and linked resources Dataflow between services Web services (SOAP, REST) Command line tools Scripts User interactions Components (nested workflows) Method becomes: Documented visually Shareable as single definition Reusable with new inputs Repurposable other services Reproducible? http://www.myexperiment.org/workflows/3355 http://www.taverna.org.uk/ http://www.biovel.eu/
  • 3. But workflows are complex machines Outputs Inputs Configuration Components http://www.myexperiment.org/workflows/3355 • Will it still work after a year? 10 years? • Expanding components, we see a workflow involves a series of specific tools and services which • Depend on datasets, software libraries, other tools • Are often poorly described or understood • Over time evolve, change, break or are replaced • User interactions are not reproducible • But can be tracked and replayed
  • 4. Electronic Paper Not Enough Hypothesis Experiment Result Analysis Conclusions Investigation Data Data Electronic paper Publish http://www.force11.org/beyondthepdf2http://figshare.com/ Open Research movement: Openly share the data of your experiments http://datadryad.org/
  • 5. http://www.researchobject.org/ RESEARCH OBJECT (RO) http://www.researchobject.org/ Research objects goal: Openly share everything about your experiments, including how those things are related
  • 6. What is in a research object? A Research Object bundles and relates digital resources of a scientific experiment or investigation: Data used and results produced in experimental study Methods employed to produce and analyse that data Provenance and settings for the experiments People involved in the investigation Annotations about these resources, that are essential to the understanding and interpretation of the scientific outcomes captured by a research object http://www.researchobject.org/
  • 7. Gathering everything Research Objects (RO) aggregate related resources, their provenance and annotations Conveys “everything you need to know” about a study/experiment/analysis/dataset/workflow Shareable, evolvable, contributable, citable ROs have their own provenance and lifecycles
  • 8. Why Research Objects? i. To share your research materials (RO as a social object) ii. To facilitate reproducibility and reuse of methods iii. To be recognized and cited (even for constituent resources) iv. To preserve results and prevent decay (curation of workflow definition; using provenance for partial rerun)
  • 9. A Research object http://alpha.myexperiment.org/packs/387
  • 10.
  • 11. QualityAssessment of a research object
  • 13. Annotations in research objects Types: “This document contains an hypothesis” Relations: “These datasets are consumed by that tool” Provenance: “These results came from this workflow run” Descriptions: “Purpose of this step is to filter out invalid data” Comments: “This method looks useful, but how do I install it?” Examples: “This is how you could use it”
  • 14. What is provenance? By Dr Stephen Dann licensed under Creative Commons Attribution-ShareAlike 2.0 Generic http://www.flickr.com/photos/stephendann/3375055368/ Attribution who did it? Derivation how did it change? Activity what happens to it? Licensing can I use it? Attributes what is it? Origin where is it from? Annotations what do others say about it? Abstraction levels shallots, sign, photo or flickr page? Aggregation what is it part of? Date and tool when was it made? using what?
  • 15. Attribution Who collected this sample? Who helped? Which lab performed the sequencing? Who did the data analysis? Who wrote the analysis workflow? Who made the data set used by analysis? Who curated the results? Why do I need this? i. To be recognized for my work ii. Who should I give credits to? iii. Who should I complain to? iv. Can I trust them? v. Who should I make friends with? Alice The lab Data wasAttributedTo actedOnBehalfOf
  • 16. Derivation Which sample was this metagenome sequenced from? Which meta-genomes was this sequence extracted from? Which sequence was the basis for the results? What is the previous revision of the new results? Why do I need this? i. To verify consistency (did I use the correct sequence?) ii. To find the latest revision iii. To backtrack where a diversion appeared after a change iv. To credit work I depend on v. Auditing and defence for peer review wasDerivedFrom wasQuotedFrom Sequence New results wasDerivedFrom Sample Meta - genome Old results wasRevisionOf wasInfluencedBy
  • 17. Activities What happened? When? Who? What was used and generated? Why was this workflow started? Which workflow ran? Where? Why do I need this? i. To see which analysis was performed ii. To find out who did what iii. What was the metagenome used for? iv. To understand the whole process “make me a Methods section” v. To track down inconsistencies used wasGeneratedBy wasStartedAt "2012-06-21" Metagenome Sample wasAssociatedWith Workflow server wasInformedBy wasStartedBy Workflow run wasGeneratedBy Results Sequencing wasAssociatedWith Alice hadPlan Workflow definition hadRole Lab technician Results
  • 18. Core PROV model http://www.w3.org/TR/prov-primer/ Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. Provenance Working Group
  • 19. Provenance of what? Who made the (content of) research object? Who maintains it? Who wrote this document? Who uploaded it? Which CSV was this Excel file imported from? Who wrote this description? When? How did we get it? What is the state of this RO? (Live or Published?) What did the research object look like before? (Revisions) – are there newer versions? Which research objects are derived from this RO?
  • 20. Research object model at a glance Research Object Resource Resource Resource Annotation Annotation Annotation oa:hasTarget Resource Resource Annotation graph oa:hasBody ore:aggregates Manifest
  • 21. Wf4Ever architecture Semantic REST API RDF triple store (RO structure, Annotations) RO index Uploaded files PortalChecklist service Command line Workflow runner ...
  • 22. Saving a research object: RO bundle Single, transferrable research object Self-contained snapshot Which files in ZIP, which are URIs? (Up to user/application) Regular ZIP file, explored and unpacked with standard tools JSON manifest is programmatically accessible without RDF understanding Works offline and in desktop applications – no REST API access required Basis for RO-enabled file formats, e.g. Taverna run bundle Exchanged with myExperiment and RO tools
  • 23. Example RO bundle: Workflow Results workflowrun.prov.ttl (RDF) outputA.txt outputC.jpg outputB/ https://w3id.org/bundl intermediates/ 1.txt 2.txt 3.txt de/def2e58b-50e2-4949-9980-fd310166621a.txt inputA.txt workflow URI references attribution execution environment Aggregating in Research Object ZIP folder structure (RO Bundle) mimetype application/vnd.wf4ever.robundle+zip .ro/manifest.json
  • 24. W3C community group for RO http://www.w3.org/community/rosc/
  • 25. Summary Provide provenance records: Use (and extend) PROV. Share your methods (tools, workflows), not just your data Make research objects: Collect resources and relate them Pick your RO integration level to adapt: 1. Conceptual model 2. RO Core ontology (aggregation and annotation) 3. Workflow description and provenance 4. Wf4Ever architecture for APIs 5. myExperiment as user interface

Editor's Notes

  1. http://purl.org/wf4ever/model
  2. http://purl.org/wf4ever/model
  3. Typing of resources and relating them to each-other are individual annotations
  4. Quality metrics are monitored over time, so as the research object evolves (say gaining additional annotations), the health of the annotations can be indicated graphically.
  5. Sequencing machines:illumina
  6. W3C standardisation body HTML
  7. So for Wf4Ever, what kind of provenance are we talking about? It is a bit more down to earth as we are generally just talking about files and documents, but still, as you form a research object combining many different resources, the different dimensions of provenance become increasingly important to track.
  8. So let’s have a look at what a Research Object looks like. The core is the concept of the Research Object itself, which you may also known as an ORE aggregation. This is described by the manifest, which is simply an RDF file. The RO aggregates a series of resources – in Linked Data these could be anywhere in the world. Additionally it aggregates a set of annotations, which we know is the link between a target resource (here aggregated in the RO), and an body resource. In Wf4Ever we typically provide the body as a separate RDF Graph, so that we can use existing vocabularies to describe and relate the resources.
  9. So not everyone have access to set up a RESTful semantic web servers, in particular we’ve run into this with desktop applications – users just want to save files and then they decide where they are stored. So we decided to write a serialization format for Research Object, which we call the RO Bundle.We wanted this to be accessible for applicaton developers, so we’ve adopted ZIP and JSON, and in a way this would let you create research objects and make annotations without ever seing any RDF.
  10. This shows how the JSON manifest focuses on the most common aspect of a research object – who made it? When? What is aggregated – files in the ZIP but also external URIs – up to the application or person making the bundle to decide what is to be included in the ZIP. Annotations are included at the bottom here, we see that there’s an annotation “about” (target) the analysis JPEG, and the content (the body) is within the annotations/ folder. Similarly, the next annotations relates the external resource (a blog post) with our aggregation of a resource.This is processable as JSON-LD – so it is not just JSON, it is also RDF, and out comes normal ORE aggregations and OA annotations.
  11. This is how we represent a workflow run as a Workflow Results RO Bundle. We aggregate the workflowoutputs, , workflow definition, the inputs used for execution, a description of the execution environment, external URI references (such as the project homepage) and attribution to scientists who contributed to the bundle. This effectively forms a Research Object, all tied together by the RO Bundle Manifest, which is in JSON-LD format. (normal JSON that is also valid RDF).
  12. So we have recently formed a W3C Community Group for Research Object, which has gathered significant interest, 75 participant. As you see, I am one of the chairs, and so is Rob which you already know from OA group. We are just starting up, and our focus in the RO community group is rather how to practically use Research Objects as a concept than to specify a new model – we’ll refer to existing models where it’s appropriate, but also explore other models which could be described as research objects.