Towards Computational Research Objects

"Towards a Science of Reproducible Science?" DPRMA Workshop talk at JCDL 2013, Indianapolis, 25th July 2013. Workshop website is http://dprma.oerc.ox.ac.uk/
Paper: David De Roure. 2013. Towards computational research objects. In Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts (DPRMA '13). ACM, New York, NY, USA, 16-19. DOI=10.1145/2499583.2499590 http://doi.acm.org/10.1145/2499583.2499590

  • This is reflected in a third distinctive feature: the pack. This is Paul Fisher’s pack from the Tryps example. Some packs contain example input and output data so that workflows can be checked for “decay” (they don’t actually rot, but the world changes around them). While others are looking at semantically enhanced publication, we are asking “what is the shared artefact of future research?” We come at the same problem from the other side. We have it surrounded! Our approach relieves us of the paper mindset, so, for example, a Research Object could contain information for many audiences and purposes, with a commonly interpreted core (social scientists will recognise the idea of a “boundary object”).
  • Towards Computational Research Objects

    1. Towards Computational Research Objects. David De Roure. Indianapolis Edition.
    2. (1) A Brief History of Research Objects; (2) The motivation for Computational Research Objects; (3) (A small illustration)
    3. Packs: http://www.myexperiment.org/
    4. EMOs. “Sharing Digital Science”, David De Roure, University of Southampton; Carole Goble, University of Manchester: In contrast to photo-sharing on Flickr or videos on YouTube, the basic unit of sharing in myExperiment is not a single file but rather a package of components that make up an experiment - what we call an Encapsulated myExperiment Object (EMO), and others have called Reproducible Research Objects. Notionally an EMO is a folder containing the various assets associated with an experiment. In the scientific context there are stringent requirements with respect to versioning, ownership, intellectual property and the maintenance of provenance information. We have looked at emerging practice in sharing “pieces of science” in the scientific and scholarly lifecycle, from social sites to digital repositories. myExperiment provides simple and extensible support to better understand requirements as new collaborative practice emerges. In this presentation, we will describe the characteristics of EMOs and present our initial design solution which supports the requirements of encapsulation and preserves our principles of simplicity and interoperability.
    5. Iain Buchan. Research Objects
    6. [Diagram: Paul’s Pack / Paul’s Research Object. Nodes: Workflow 16, Workflow 13, QTL, Common pathways, Results, Logs, Metadata, Paper, Slides. Edge labels: feeds into, produces, included in, published in.]
    7. OAI-ORE: http://www.openarchives.org/ore/terms/aggregates http://eprints.ecs.soton.ac.uk/id/eprint/20817
    8. Pack analysis (analysis by Sean Bechhofer). Workflow Centric ROs
       • Workflow - pack contains a number of workflows
       • Presentation - encapsulation of a single presentation
       • Collection - a number of things (workflows/presentations/papers)
       • Heterogeneous - where the workflows do not appear to have a clear common purpose
       • Homogeneous - workflows appear to be designed to work together
       • Paper - source for a paper
       • Tutorial - tutorial material
       • Data - collection of data files
       • Derived data - results of workflow
       • Benchmark - benchmarking data
       • Supplementary - stuff associated with a paper
       • Noise - tests, tryouts, rubbish
       • Oddity - none of the above
    9. Research Object Bundle (Stian Soiland-Reyes), https://w3id.org/bundle [Diagram: provenance graph. Nodes: Metagenome Sample, Sequencing, Results, Workflow run, Workflow server, Workflow definition, Alice, Lab technician. Relations: used, wasGeneratedBy, wasStartedAt "2012-06-21", wasAssociatedWith, wasInformedBy, wasStartedBy, hadPlan, hadRole.]
    10. Join the W3C Community Group www.w3.org/community/rosc www.researchobject.org
    11. Notifications and automatic re-runs. Machines are users too. Autonomic Curation. Self-repair. New research?
    12. The Executable Thesis [Diagram: PhD Student, new data, new results, executable thesis]
    13. A new role for the scientific publisher? Digital library? The Executable Journal. A thought experiment…
    14. Knowledge Infrastructure. Knowledge Objects. Descriptive layer. Observatories. Annotation.
    15. Research Objects; Computational Research Objects; Workflows; Packs; OAI ORE; W3C PROV
    16. Computational Research Objects
       • Social Objects, designed to facilitate human interpretation (e.g. containing narratives) and shared as part of a (hybrid) sensemaking network
       • Machine Objects, semantically described and programmatically accessible, designed for automation, scale and heterogeneity
       • Composable with a distributed computational model, such that a Computational Research Object can itself assemble systems of objects, and these systems may consume and produce Computational Research Objects. We can reason about them.
    17. Simplest Scenario
        1. I take a digital audio recording and perform a series of analysis tasks leading to a result dataset
        2. The environment captures the history of my analysis in a CRO, with descriptions of input data, analysis history (workflow) including software, output data, and narrative
        3. Another researcher finds the CRO (cited in social media), tests it, and runs it with different audio data (capturing this as a CRO)
        4. A data scientist registers the CRO to be run automatically when new data arrives, and configures a post-process so that they are notified if new results meet criteria
        5. This common pattern of installing multiple CROs with a post-processor is captured for reuse
    18. Towards a Science of Reproducibility?
       • The simple example takes us quickly to the stage of writing programs which act on CROs
       • Isn’t this all a bit Computer Sciencey?
       • Yes! But it’s not CS for the sake of CS :-)
       • It’s CS for “rigour and openness”
       • The idea is to establish Computer Science techniques to be able to help design and validate our future research systems
    19. (For Lisp hackers) Several Scheme concepts map directly into the CRO model:
        1. Closures (as mutable objects and first class functions)
        2. Environments
        3. Continuations
       A prototype RO interpreter has been implemented – here is a simple example based on memoization (or should I say roification…)
    20. > (define (f x) (analyse x))
        > (f 10)
        ;Value: 100
        > (define ro1 (roify f))
        > ((ro1 'x) 2)
        ;Value: 4
        > ((ro1 'x) 3)
        ;Value: 9
        > ((ro1 'x) 2)
        ; precomputed
        ;Value: 4
        > (define foo (ro1 'v))
        > (foo)
        ; confirmed(3) = 9
        ; confirmed(2) = 4
        ;Value: #t
        > (define (analyse x) (+ x x))
        > (foo)
        ; changed(3) = 6 <> 9
        ;Value: #f
        > (define a (delay ((ro1 'x) 5)))
        > (a)
        ;Value: 10
    21. Closing thoughts
        1. Next steps? Develop more scenarios – including scale, validation, design
        2. Higher order functions, e.g. capturing common patterns, seem to be expressive compared to normal workflow mechanics
        3. The RO interpreter in Scheme is proof of concept… but actually it could be made operational
        4. If nothing else this is a simulation of the/a future and may provide insights
        5. Social machines and human computation research involves computational-style descriptions of processes involving humans – exploring in SOCIAM and Smart Society projects
    22. david.deroure@oerc.ox.ac.uk, www.oerc.ox.ac.uk/people/dder, www.scilogs.com/eresearch, @dder. Thanks to Iain Buchan, Sean Bechhofer, Carole Goble and all my colleagues in myExperiment, Wf4Ever, myGrid and FORCE11. Research supported in part by Wf4Ever (FP7-ICT ICT-2009.4 project 270192). Some of these ideas were first presented at the Microsoft e-Science Workshop, Stockholm, December 2011.
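
The "Simplest Scenario" slide (slide 17) describes capturing an analysis as a CRO, re-running it on other data, and registering it to run automatically with a post-process that notifies on matching results. The following is a minimal sketch of that pattern, not the prototype from the talk. The names `CRO`, `install` and `peak_level` are hypothetical and introduced here only for illustration, and Python is used in place of the Scheme of the talk's own example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class CRO:
    """Hypothetical computational research object: a workflow plus its run history."""
    workflow: Callable[[Any], Any]
    narrative: str = ""
    runs: List[dict] = field(default_factory=list)

    def run(self, data):
        # Steps 1-2: perform the analysis and capture input and output in the object.
        result = self.workflow(data)
        self.runs.append({"input": data, "output": result})
        return result


def install(cros, criterion, notify):
    """Steps 4-5: the reusable pattern, as a higher-order function.

    Returns a handler that runs every installed CRO on newly arrived data and
    calls notify whenever a result meets the criterion.
    """
    def on_new_data(data):
        for cro in cros:
            result = cro.run(data)
            if criterion(result):
                notify(cro, data, result)
    return on_new_data


def peak_level(samples):
    # Stand-in for an audio analysis task (assumption: samples are floats).
    return max(abs(s) for s in samples)


cro = CRO(workflow=peak_level, narrative="Peak level of a recording")
cro.run([0.1, -0.4, 0.2])            # original analysis, captured in the CRO

alerts = []
handler = install(
    [cro],
    criterion=lambda result: result > 0.9,
    notify=lambda c, data, result: alerts.append(result),
)
handler([0.2, 0.95, -0.3])           # new data arrives: criterion met, notified
handler([0.1, 0.2])                  # new data arrives: criterion not met
```

Writing `install` as a function that returns a handler is one way to read step 5: the "multiple CROs plus post-processor" pattern becomes a value that can itself be shared and reused.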

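
The Scheme session on slide 20 uses a prototype `roify` whose implementation is not shown in the slides. A rough Python analogue, assuming only the behaviour visible in that session (message `'x` executes with memoization, message `'v` re-runs the recorded inputs and checks that the results still match), might look like the sketch below. It is a guess at the idea, not the author's interpreter, and it does not reproduce the printed `confirmed`/`changed` messages.

```python
def roify(func):
    """Wrap func in a closure that records every (input, output) pair it computes."""
    record = {}  # mutable state held in the closure: input -> recorded output

    def execute(x):
        # Memoized execution: reuse the recorded result when one exists.
        if x not in record:
            record[x] = func(x)
        return record[x]

    def verify():
        # Re-run every recorded input and compare against the recorded output.
        return all(func(x) == recorded for x, recorded in record.items())

    def dispatch(message):
        # Message passing in the style of the Scheme session: 'x and 'v.
        return {"x": execute, "v": verify}[message]

    return dispatch


def analyse(x):
    return x * x


def f(x):
    # f looks up analyse at call time, so a later redefinition changes its result.
    return analyse(x)


ro1 = roify(f)
first = [ro1("x")(2), ro1("x")(3), ro1("x")(2)]  # third call is served from the record
check = ro1("v")
before = check()   # True: recorded results are reproduced


def analyse(x):    # the world changes around the research object
    return x + x


after = check()    # False: 3 now gives 6, but 9 was recorded
```

The failed check is the "decay" the speaker note mentions: nothing in the research object changed, but a dependency did, and the recorded inputs and outputs are what make that detectable.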