Towards Computational Research Objects


"Towards a Science of Reproducible Science?" DPRMA Workshop talk at JCDL 2013, Indianapolis, 25th July 2013. Workshop website is
Paper is
David De Roure. 2013. Towards computational research objects. In Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts (DPRMA '13). ACM, New York, NY, USA, 16-19. DOI=10.1145/2499583.2499590

Published in: Technology

Towards Computational Research Objects

  1. Towards Computational Research Objects. David De Roure. Indianapolis Edition.
  2. 1. A Brief History of Research Objects
     2. The motivation for Computational Research Objects
     3. (A small illustration)
  3. Packs
  4. EMOs
     Sharing Digital Science. David De Roure, University of Southampton; Carole Goble, University of Manchester.
     In contrast to photo-sharing on Flickr or videos on YouTube, the basic unit of sharing in myExperiment is not a single file but rather a package of components that make up an experiment - what we call an Encapsulated myExperiment Object (EMO), and others have called Reproducible Research Objects. Notionally an EMO is a folder containing the various assets associated with an experiment. In the scientific context there are stringent requirements with respect to versioning, ownership, intellectual property and the maintenance of provenance information.
     We have looked at emerging practice in sharing “pieces of science” in the scientific and scholarly lifecycle, from social sites to digital repositories. myExperiment provides simple and extensible support to better understand requirements as new collaborative practice emerges. In this presentation, we will describe the characteristics of EMOs and present our initial design solution which supports the requirements of encapsulation and preserves our principles of simplicity and interoperability.
  5. Research Objects (Iain Buchan)
  6. Paul’s Pack / Paul’s Research Object
     [Diagram. Items: Workflow 16, Workflow 13, Common pathways, QTL, Results, Logs, Metadata, Paper, Slides. Relationship labels: feeds into, produces, included in, published in.]
  7. OAI-ORE
  8. Pack analysis. Workflow Centric ROs. Analysis by Sean Bechhofer.
     • Workflow - pack contains a number of workflows
     • Presentation - encapsulation of a single presentation
     • Collection - a number of things (workflows/presentations/papers)
     • Heterogeneous - where the workflows do not appear to have a clear common purpose
     • Homogeneous - workflows appear to be designed to work together
     • Paper - source for a paper
     • Tutorial - tutorial material
     • Data - collection of data files
     • Derived data - results of workflow
     • Benchmark - benchmarking data
     • Supplementary - stuff associated with a paper
     • Noise - tests, tryouts, rubbish
     • Oddity - none of the above
  9. Research Object Bundle (Soiland-Reyes)
     [Provenance diagram. Items: Metagenome Sample, Sequencing, Alice, Lab technician, Workflow server, Workflow run, Workflow definition, Results. Relationship labels: used, wasGeneratedBy, wasStartedAt "2012-06-21", wasAssociatedWith, wasInformedBy, wasStartedBy, hadPlan, hadRole.]
  10. Join the W3C Community Group
  11. Notifications and automatic re-runs
      Machines are users too
      Autonomic Curation
      Self-repair
      New research?
  12. The Executable Thesis
      [Diagram labels: new data, new results, executable thesis, PhD Student.]
  13. The Executable Journal
      A new role for the scientific publisher? Digital library?
      A thought experiment…
  14. Knowledge Infrastructure
      [Diagram labels: Knowledge Objects, Descriptive layer, Observatories, Annotation.]
  15. [Diagram labels: Research Objects, Computational Research Objects, Workflows, Packs, OAI-ORE, W3C PROV.]
  16. Computational Research Objects
      • Social Objects, designed to facilitate human interpretation (e.g. containing narratives) and shared as part of a (hybrid) sensemaking network
      • Machine Objects, semantically described and programmatically accessible, designed for automation, scale and heterogeneity
      • Composable with a distributed computational model, such that a Computational Research Object can itself assemble systems of objects, and these systems may consume and produce Computational Research Objects.
      We can reason about them.
  17. Simplest Scenario
      1. I take a digital audio recording and perform a series of analysis tasks leading to a result dataset.
      2. The environment captures the history of my analysis in a CRO, with descriptions of the input data, the analysis history (workflow) including software, the output data, and a narrative.
      3. Another researcher finds the CRO (cited in social media), tests it, and runs it with different audio data (capturing this as a CRO).
      4. A data scientist registers the CRO to be run automatically when new data arrives, and configures a post-process so that they are notified if new results meet criteria.
      5. This common pattern of installing multiple CROs with a post-processor is captured for reuse.
  18. Towards a Science of Reproducibility?
      • The simple example takes us quickly to the stage of writing programs which act on CROs
      • Isn’t this all a bit Computer Sciencey?
      • Yes! But it’s not CS for the sake of CS
      • It’s CS for “rigour and openness”
      • The idea is to establish Computer Science techniques to be able to help design and validate our future research systems
  19. (For Lisp hackers)
      Several Scheme concepts map directly into the CRO model:
      1. Closures (as mutable objects and first class functions)
      2. Environments
      3. Continuations
      A prototype RO interpreter has been implemented. Here is a simple example based on memoization (or should I say roification…)
  20. > (define (f x) (analyse x))
      > (f 10)
      ;Value: 100
      > (define ro1 (roify f))
      > ((ro1 'x) 2)
      ;Value: 4
      > ((ro1 'x) 3)
      ;Value: 9
      > ((ro1 'x) 2)
      ; precomputed
      ;Value: 4
      > (define foo (ro1 'v))
      > (foo)
      ; confirmed(3) = 9
      ; confirmed(2) = 4
      ;Value: #t
      > (define (analyse x) (+ x x))
      > (foo)
      ; changed(3) = 6 <> 9
      ;Value: #f
      > (define a (delay ((ro1 'x) 5)))
      > (a)
      ;Value: 10
  21. Closing thoughts
      1. Next steps? Develop more scenarios, including scale, validation, design
      2. Higher order functions, e.g. capturing common patterns, seem to be expressive compared to normal workflow mechanics
      3. The RO interpreter in Scheme is proof of concept… but actually it could be made operational
      4. If nothing else this is a simulation of the/a future and may provide insights
      5. Social machines and human computation research involves computational-style descriptions of processes involving humans; this is being explored in the SOCIAM and Smart Society projects
  22. @dder
      Thanks to Iain Buchan, Sean Bechhofer, Carole Goble and all my colleagues in myExperiment, Wf4Ever, myGrid and FORCE11.
      Research supported in part by Wf4Ever (FP7-ICT ICT-2009.4 project 270192).
      Some of these ideas were first presented at Microsoft e-Science Workshop, Stockholm, December 2011.
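
The roify session on slide 20 can be approximated in a few lines of portable Scheme. What follows is a minimal sketch, not the prototype RO interpreter from the talk, whose source is not shown. The internal names `history`, `execute` and `validate` are hypothetical, and the initial squaring definition of `analyse` is an assumption inferred from the values on the slide (10 gives 100, 2 gives 4, 3 gives 9). The sketch covers only the two messages used on the slide: `'x` returns a memoizing executor and `'v` returns a validator that re-runs every recorded input and reports whether the results still match. It does not print the confirmed/changed trace and does not cover the `delay` step.

```scheme
;; Hypothetical sketch of roify: wrap a procedure f in a closure that
;; records every (input . result) pair it computes.
(define (roify f)
  (let ((history '()))                  ; association list, newest entry first
    ;; 'x message: run f, reusing a recorded result when one exists.
    (define (execute x)
      (let ((hit (assoc x history)))
        (if hit
            (cdr hit)                   ; precomputed, f is not called again
            (let ((result (f x)))
              (set! history (cons (cons x result) history))
              result))))
    ;; 'v message: re-run f on every recorded input and compare.
    ;; Returns #t if all results are unchanged, #f at the first mismatch.
    (define (validate)
      (let loop ((entries history))
        (cond ((null? entries) #t)
              ((equal? (f (car (car entries))) (cdr (car entries)))
               (loop (cdr entries)))
              (else #f))))
    ;; Message dispatch, mirroring (ro1 'x) and (ro1 'v) on the slide.
    (lambda (message)
      (cond ((eq? message 'x) execute)
            ((eq? message 'v) validate)
            (else (error "roify: unknown message" message))))))

;; Assumed initial analysis step: squaring, consistent with the slide.
(define (analyse x) (* x x))
(define (f x) (analyse x))

(define ro1 (roify f))
((ro1 'x) 2)                            ; => 4
((ro1 'x) 3)                            ; => 9
((ro1 'x) 2)                            ; => 4, taken from history

(define foo (ro1 'v))
(foo)                                   ; => #t, recorded results reproduce

;; Change the analysis underneath the research object.
(define (analyse x) (+ x x))
(foo)                                   ; => #f, since f(3) is now 6, not 9
((ro1 'x) 5)                            ; => 10, computed with the new analyse
```

The design point is that the closure holds both the procedure and its run history, so validation is just replaying the history against whatever `f` currently resolves to. Note that after `analyse` changes, `((ro1 'x) 2)` still returns the stale recorded value 4; detecting that is exactly what the validator is for.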