Research Shared
BOSC
July 11th 2015, Dublin
Norman Morrison, The University of Manchester
researchobject.org

Research Shared: researchobject.org

Project Website: http://www.researchobject.org/

researchobject.org is a community project that has developed an approach to describe and package up all resources used as part of an investigation as Research Objects (ROs).
ROs provide two main features: a manifest, a consistent way to provide a well-typed, structured description of the resources used in an investigation; and a ‘bundle’, a mechanism for packaging up manifests with resources as a single, publishable unit.
ROs therefore carry the research context of an experiment (data, software, standard operating procedures (SOPs), models, etc.) and gather together the components of an experiment so that they are findable, accessible, interoperable and reproducible (FAIR). ROs combine software and data into an aggregative data structure consisting of well-described, reconstructable parts.
ROs have the potential to address a number of challenges pertinent to open research, including: (a) supporting interoperability between infrastructures by using ROs as a primary mechanism for exchange and publication; (b) supporting the evolution of research objects as a living collection, enabling provenance tracking; and (c) providing the ability to pivot on research object components (data, software, models) that are not restricted to the traditional publication.
Here we present work towards the development and adoption of ROs:
(i) a series of specifications and conventions, using community standards, for the RO manifest and RO bundles;
(ii) implementations of Java, Python and Ruby APIs and tooling against those specifications;
(iii) examples of representations of the RO models in various languages (e.g. JSON-LD, RDF, HTML).
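
To make the manifest idea concrete, here is a minimal Python sketch that assembles a manifest-like structure and serialises it as JSON. The key names (id, createdOn, createdBy, aggregates, annotations) and the sample values are taken from the manifest slide in the transcript; the flat layout and the helper functions `build_manifest` and `unaggregated_annotation_targets` are hypothetical illustrations, not the normative RO manifest schema or the API of any of the RO libraries.

```python
import json


def build_manifest(ro_id, created_on, created_by, aggregates, annotations):
    """Assemble a manifest-like dict: identification, aggregates, annotations.

    Hypothetical helper; the key names follow the manifest slide, not a schema.
    """
    return {
        "id": ro_id,
        "createdOn": created_on,
        "createdBy": created_by,
        "aggregates": list(aggregates),
        "annotations": list(annotations),
    }


def unaggregated_annotation_targets(manifest):
    """Return annotation targets that are not listed among the aggregates."""
    aggregated_ids = {entry["id"] for entry in manifest["aggregates"]}
    return [
        note["about"]
        for note in manifest["annotations"]
        if note["about"] not in aggregated_ids
    ]


manifest = build_manifest(
    ro_id="doi:10.000/zenodo.123",
    created_on="2015-07-10T16:46:00Z",
    created_by="http://orcid.org/0000-0001-9842-9718",
    aggregates=[
        {"id": "/sequence/specimen5.bam"},
        {
            "id": "http://www.myexperiment.org/workflows/3355",
            "history": "provenance/workflow-evolution.ttl",
        },
    ],
    annotations=[
        {
            "about": "/sequence/specimen5.bam",
            "content": "annotations/specimen5-properties.jsonld",
        },
    ],
)

# Serialise the manifest; a real RO manifest would also carry a JSON-LD context.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The consistency check is only one example of what a structured manifest allows: because every resource is named by an identifier, a tool can verify that each annotation points at something the Research Object actually aggregates.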


  1. Research Shared. BOSC, July 11th 2015, Dublin. Norman Morrison, The University of Manchester. researchobject.org
  2. Framework: A framework to bundle, exchange and link (scattered) resources about experiments.
  3. Framework desiderata: Technology independent. The least possible. The simplest feasible. Graceful degradation. Standard tooling.
  4. How? The Container. Packaging: Zip files, Docker images, BagIt, Web, … Catalogues & Commons: FAIRDOM SEEK, Farr Commons CKAN, myExperiment, Zenodo, Figshare, … Manifest: Describes the aggregated resources, their annotations and provenance.
  5. Manifest. Manifest Construction: • Identification: id, title, creator, status… • Aggregates: list of ids/links to resources • Annotations: list of annotations about resources. Manifest Description: • Checklists: what should be there • Provenance: where it came from • Versioning: its evolution • Dependencies: what else is needed
  6. Manifest
     id: doi:10.000/zenodo.123
     createdOn: 2015-07-10T16:46:00Z
     createdBy: http://orcid.org/0000-0001-9842-9718
     aggregates:
       - id: /sequence/specimen5.bam
         conformsTo: http://gemrb.org/iesdp/file_formats/ie_formats/bam_v1.htm
       - id: http://example.com/blog/about-specimen5
         authoredBy: http://orcid.org/0000-0001-7066-3350
       - id: http://www.myexperiment.org/workflows/3355
         history: provenance/workflow-evolution.ttl
     annotations:
       - about: /sequence/specimen5.bam
         content: annotations/specimen5-properties.jsonld
         createdBy: http://orcid.org/0000-0001-7066-3350
       - about: /sequence/specimen5.bam
         content: http://example.com/blog/about-specimen5
         oa:motivatedBy oa:questioning
  7. RO Principles: Use unique identifiers as names for things. Use some mechanism of aggregation to group things together. Provide metadata about those things & how they relate to each other.
  8. Get tooled up: https://github.com/ResearchObject
  9. Real world examples: • Reviewed to Reproduced • Workflow run (CWL) • Farr Commons • Capturing and describing Docker images for CERN ATLAS analyses • FAIR-DOM http://fair-dom.org/ – SEEK http://seek4science.org/ • FAIR Publishing - RO to Figshare
  10. Reviewed to Reproduced
  11. Reviewed to Reproduced. From González-Beltrán et al., doi:10.1371/journal.pone.0127612. ✔ Reproducibility (same data, same code) ✔ Systematic and extensible meta-data collection
  12. Workflow Run. Files: workflowrun.prov.ttl (RDF), outputA.txt, outputC.jpg, outputB/, intermediates/, 1.txt, 2.txt, 3.txt, de/def2e58b-50e2-4949-9980-fd310166621a.txt, inputA.txt. Described with: workflow, attribution, execution environment. Aggregating in Research Object. ZIP folder structure (RO Bundle): mimetype application/vnd.wf4ever.robundle+zip; .ro/manifest.json; URI references. ✔ Exchange ✔ Reproducibility (same data, same code) ✔ Systematic and extensible meta-data collection ✔ Uses RO Model WF Extension - basis of CWL
  13. ROs and Sensitive data
  14. Farr Commons: ✔ Exchange ✔ Systematic and extensible meta-data collection
  15. Use case: ATLAS Collider Data Analytics. Portable, lightweight application runtime and packaging tool. Image. ATLAS and CMS detector data. Charles Vardeman, Da Huo. All data and files of the execution + Instructions. Convert, bundle, manifest. Relate files and layers. Add provenance and annotations. Link in other content. Run. ✔ Exchange ✔ Reproducibility (same data, same code, same run time environment) ✔ Systematic and extensible meta-data collection
  16. FAIRDOM SEEK
  17. FAIRDOM
  18. Export as RO: Model, Data, SOP, Parameters
  19. RO Unzip: ✔ Reproducibility ✔ Versioning ✔ Systematic and extensible meta-data collection
  20. FAIR Publishing
  21. Research Objects: • Reproducibility – Same data, same code, same run time environment • Versioning • Exchange • Systematic and extensible meta-data collection
  22. Research Objects: Publish a digital record of your entire scientific enterprise. You can give it to someone else. You can get credit for it. People think you are a good person. You get a promotion. • Why does this matter to Biologists?
  23. Okay, but what does it cost?
  24. Conclusion: • Simple solution, addressing needs towards transparent FAIR principles – Findable, Accessible, Interoperable, Reproducible • Adoption – Training • Online tutorials • Face to face – Need more tools that take advantage of the RO Framework and lower the cost (technological debt) of reproducibility • Work together
  25. Acknowledgements: Carole Goble, Stian Soiland-Reyes, Matt Gamble, Rob Haines, Sean Bechhofer, Phil Crouch, Finn Bacall, Stuart Owen, Khalid Belhajjame, Graham Klyne, Jun Zhao, Daniel Garijo, Oscar Corcho, Esteban García Cuesta, Raul Palma. University of Manchester, University of Oxford, Lancaster University, UPM, iSOCO, PSNC, Paris 6. http://researchobject.org http://fair-dom.org http://www.seek4science.org http://www.farrinstitute.org http://www.wf4ever-project.org http://myexperiment.org
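
The ZIP folder structure on the Workflow Run slide (a `mimetype` entry, a `.ro/manifest.json` file, and the aggregated files) can be sketched with Python's standard `zipfile` module. This is an illustration only: the media type string and the two paths come from the slide, while the function name `write_bundle`, the sample manifest, and the choice to write `mimetype` first and uncompressed are assumptions made for this example rather than details stated in the talk.

```python
import io
import json
import zipfile

# Media type string as shown on the Workflow Run slide.
MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def write_bundle(manifest, resources):
    """Return the bytes of a ZIP laid out like the RO Bundle on the slide.

    `manifest` is a JSON-serialisable dict; `resources` maps archive paths
    to file contents. Hypothetical helper, not part of any RO library.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        # Assumption: mimetype goes first and is stored without compression,
        # so that tools can identify the bundle type cheaply.
        bundle.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        bundle.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))
        for path, data in resources.items():
            bundle.writestr(path, data)
    return buffer.getvalue()


bundle = write_bundle(
    manifest={
        "id": "doi:10.000/zenodo.123",
        "aggregates": [{"id": "/outputA.txt"}],
    },
    resources={"outputA.txt": "example output\n"},
)

# Reopen the archive to list what was packaged.
with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
    names = archive.namelist()
print(names)
```

Because the result is an ordinary ZIP file, a recipient without RO tooling can still unzip it and read the files, which matches the "graceful degradation" and "standard tooling" desiderata from the framework slide.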
