Presented on 2012-03-28 at EGI Community Forum 2012, Munich.
http://www.wf4ever-project.org/
http://purl.org/wf4ever/model
http://cf2012.egi.eu/
https://www.egi.eu/indico/sessionDisplay.py?sessionId=66&confId=679#20120328
Presentation given at CERN Workshop on Innovations in Scholarly Communication (OAI7) on 22nd June 2011
http://indico.cern.ch/conferenceDisplay.py?confId=103325
Knowledge Infrastructure for Global Systems Science, David De Roure
Presentation at the First Open Global Systems Science Conference, Brussels, 8-10 November 2012
http://www.gsdp.eu/nc/news/news/date/2012/10/31/first-open-global-systems-science-conference/
Making your data work for you: Scratchpads, publishing & the biodiversity dat..., Vince Smith
This is a derivative of a talk I gave at the Linnean society on 20th Sept. 2012. This version was given at the i4Life Environmental Genomics workshop on 25th Sept. and refocused to look at the dark taxa problem and developing published descriptions of molecular sequence clusters.
The digital universe is booming, especially metadata and user-generated data. This raises strong challenges: identifying the portions of data relevant to a particular problem, and dealing with the lifecycle of data. Finer-grained problems include data evolution and the potential impact of change on the applications relying on the data, causing decay. The management of scientific data is especially sensitive to this. We present the Research Objects concept as a means to identify and structure relevant data in scientific domains, treating data as first-class citizens. We also identify and formally represent the main causes of decay in this domain and propose methods and tools for their diagnosis and repair, based on provenance information. Finally, we discuss the application of these concepts to the broader domain of the Web of Data: Data with a Purpose.
Open Annotation Rollout, Manchester, 2013-06-25
See also PDF version: http://www.slideshare.net/soilandreyes/2013-0624annotatingr-osopenannotationmeeting-23289491
Open Annotation Rollout, Manchester, 2013-06-25
See also PPTX version with Notes: http://www.slideshare.net/soilandreyes/2013-0624annotatingr-osopenannotationmeeting
Presentation of Taverna from UKOLN DevSci "Workflow Tools" event in Bath, 2010-11-30
PDF version: http://www.slideshare.net/soilandreyes/taverna-workflow-management-system-2010-1130-bath-workflow-tools
http://taverna.org.uk/
http://www.ukoln.ac.uk/events/devcsi/workflow_tools/programme/index.html
http://devcsi.ukoln.ac.uk/
Sustaining research software at the Apache Software Foundation
Presented at BOSC 2015, Dublin on 2015-07-11. http://www.open-bio.org/wiki/BOSC_2015
Source: http://slides.com/soilandreyes/20150611-bosc2015-apache
Keynote presentation delivered at ELAG 2013 in Gent, Belgium, on May 29 2013. Discusses Research Objects and the relationship to work my team has been involved in during the past couple of years: OAI-ORE, Open Annotation, Memento.
myExperiment and the Rise of Social Machines, David De Roure
Talk at hubbub 2012, Indianapolis, 25 September 2012. The talk introduces myExperiment and Wf4Ever, discusses the future of research communication including FORCE11, and introduces the SOCIAM project (Theory and Practice of Social Machines) which launches in October 2012.
Keynote: SemSci 2017: Enabling Open Semantic Science
1st International Workshop co-located with ISWC 2017, October 2017, Vienna, Austria,
https://semsci.github.io/semSci2017/
Abstract
We have all grown up with the research article and article collections (let’s call them libraries) as the prime means of scientific discourse. But research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
We can think of “Research Objects” as typed packages of all the components of an investigation. If we stop thinking of publishing papers and start thinking of releasing Research Objects the way we release software, then scholarly exchange is a new game: ROs and their content evolve; they are multi-authored and their authorship evolves; they are a mix of virtual and embedded, and so on.
But first, some baby steps before we get carried away with a new vision of scholarly communication. Many journals (e.g. eLife, F1000, Elsevier) are just figuring out how to package together the supplementary materials of a paper. Data catalogues are figuring out how to virtually package multiple datasets scattered across many repositories to keep the integrated experimental context.
Research Objects [1] (http://researchobject.org/) provide a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described. The brave new world of containerisation provides the containers, and Linked Data provides the metadata framework for constructing the container manifests and profiles. It’s not just theory but also practice, with examples in Systems Biology modelling, Bioinformatics computational workflows, and Health Informatics data exchange. I’ll talk about why and how we got here, the framework and examples, and what we need to do.
[1] Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, Carole Goble, Why linked data is not enough for scientists, In Future Generation Computer Systems, Volume 29, Issue 2, 2013, Pages 599-611, ISSN 0167-739X, https://doi.org/10.1016/j.future.2011.08.004
Research Objects for improved sharing and reproducibility, Oscar Corcho
Presentation about the usage of Research Objects to improve scientific experiment sharing and reproducibility, given at the Dagstuhl Perspective Workshop on the intersection between Computer Sciences and Psychology (July 2015)
Project Website: http://www.researchobject.org/
researchobject.org is a community project that has developed an approach to describe and package up all resources used as part of an investigation as Research Objects (ROs).
ROs provide two main features: a manifest, a consistent way to provide a well-typed, structured description of the resources used in an investigation; and a ‘bundle’, a mechanism for packaging up manifests with resources as a single, publishable unit.
ROs therefore carry the research context of an experiment (data, software, standard operating procedures (SOPs), models etc.) and gather together the components of an experiment so that they are findable, accessible, interoperable and reusable (FAIR). ROs combine software and data into an aggregative data structure consisting of well-described, reconstructable parts.
ROs have the potential to address a number of challenges pertinent to open research, including: a) supporting interoperability between infrastructures by using ROs as a primary mechanism for exchange and publication; b) supporting the evolution of research objects as a living collection, enabling provenance tracking; c) providing the ability to pivot around research object components (data, software, models) that are not restricted to the traditional publication.
Here we present work towards the development and adoption of ROs:
(i) a series of specifications and conventions, using community standards, for the RO manifest and RO bundles;
(ii) implementations of Java, Python and Ruby APIs and tooling against those specifications;
(iii) examples of representations of the RO models in various formats (e.g. JSON-LD, RDF, HTML).
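The manifest-plus-bundle split described above can be sketched in a few lines. The following is a minimal, illustrative sketch (not the official RO APIs): it packs files and a JSON-LD manifest into a ZIP, following the RO Bundle convention of a `.ro/manifest.json` entry. The file names, media types and context URL here are assumptions for the example.

```python
import json
import zipfile

def make_ro_bundle(path, resources):
    """Package resources into a minimal Research Object bundle:
    a ZIP archive whose .ro/manifest.json describes what it aggregates."""
    manifest = {
        "@context": "https://w3id.org/bundle/context",  # assumed context URL
        "id": "/",
        "aggregates": [
            {"uri": name, "mediatype": mediatype}
            for name, (mediatype, _) in resources.items()
        ],
    }
    with zipfile.ZipFile(path, "w") as bundle:
        # The manifest is itself a resource inside the bundle.
        bundle.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))
        for name, (_, content) in resources.items():
            bundle.writestr(name, content)

# Hypothetical resources for a small experiment:
make_ro_bundle("experiment.bundle.zip", {
    "workflow/analysis.t2flow": ("application/vnd.taverna.t2flow+xml", "<workflow/>"),
    "data/input.csv": ("text/csv", "a,b\n1,2\n"),
})
```

The same idea scales up: real RO bundles additionally record provenance, annotations and authorship in the manifest.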
"Towards a Science of Reproducible Science?" DPRMA Workshop talk at JCDL 2013, Indianapolis, 25th July 2013. Workshop website is http://dprma.oerc.ox.ac.uk/
Paper is
David De Roure. 2013. Towards computational research objects. In Proceedings of the 1st International Workshop on Digital Preservation of Research Methods and Artefacts (DPRMA '13). ACM, New York, NY, USA, 16-19. DOI=10.1145/2499583.2499590 http://doi.acm.org/10.1145/2499583.2499590
FAIR Workflows and Research Objects get a Workout, Carole Goble
So, you want to build a pan-national digital space for bioscience data and methods? That works with a bunch of pre-existing data repositories and processing platforms? So you can share FAIR workflows and move them between services? Package them up with data and other stuff (or just package up data for that matter)? How? WorkflowHub (https://workflowhub.eu) and RO-Crate Research Objects (https://www.researchobject.org/ro-crate) that’s how! A step towards FAIR Digital Objects gets a workout.
Presented at DataVerse Community Meeting 2021
Presented 2014-10-30 at Taverna Open Development Workshop in Manchester http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop
Also available at http://slides.com/soilandreyes/2014-10-31-taverna-3-architecture#/
2014-10-30 Taverna 3 status
Presented at Taverna Open Development Workshop 2014 in Manchester.
http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop#TavernaOpenDevelopmentWorkshop-Day1-Thursday2014-10-30
Taverna is becoming an Apache Incubator project. What are the effects on Taverna as an open source project and its future development?
HTML version: http://slides.com/soilandreyes/2014-10-30-taverna-incubator/
Wiki version: http://dev.mygrid.org.uk/wiki/display/developer/Taverna+as+an+Apache+Incubator+project
Presented 2014-10-30 at Taverna Open Development Workshop http://dev.mygrid.org.uk/wiki/display/developer/Taverna+Open+Development+Workshop
OMEX Combine Archives as an example of Research Objects in the wild, showing how to convert them to RO Bundles using http://dx.doi.org/10.5281/zenodo.10439
Source pptx:
https://onedrive.live.com/view.aspx?cid=37935FEEE4DF1087&resid=37935FEEE4DF1087!788&app=PowerPoint%20f
2013-07-19 myExperiment research objects, beyond workflows and packs (PPTX), Stian Soiland-Reyes
Presentation at BOSC 2013 / ISMB 2013. (PowerPoint 2013 source)
PDF: https://www.slideshare.net/soilandreyes/2013-0719bosc-2013myexperimentresearchobjectsslides
See also poster at http://www.slideshare.net/soilandreyes/2013-0718bosc-2013myexperimentresearchobjectsposter-24242509 or
submitted abstract: https://docs.google.com/document/d/1jaAuPV-EnbsyI14L56HKHBQP7eDVfeXGLlK-LwohnWw/edit?usp=sharing
We have evolved Research Objects as a mechanism to preserve digital resources related to research, providing mechanisms, formats and architecture for describing aggregated resources (hypothesis, workflow, datasets, scripts, services), their relations (is input for, explains, used by), provenance (graph was derived from dataset A, B and C) and attribution (who contributed what, and when?).
The website myExperiment is already popular for collaborating on, publishing and sharing scientific workflows. However, we have found that for understanding and preserving a workflow over time its definition is not enough, especially when faced with workflow decay: services and tools change over time. We have therefore adapted the research object model as a foundation for myExperiment packs, allowing upload of workflow runs, inputs, outputs and other files relevant to the workflow, and relating them with annotations. We have also integrated the Wf4Ever architecture for performing decay analysis and for tracking a research object’s evolution as it and its constituent resources change over time.
Slide deck presenting the Provenance support of Taverna workflow system, detailing architecture, ontologies and how results are exported as Research Object bundles, including the PROV-O provenance of the workflow run.
This upload is the PDF version, for PPTX source, see https://www.slideshare.net/soilandreyes/20130529-taverna-provenance-pptx-source/
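To make the provenance idea concrete, here is a toy sketch (plain Python, not Taverna's actual export code) of the kind of PROV-O statements a workflow run produces: each processor invocation is an activity that used input entities and generated output entities, and chaining those relations answers "what was this result derived from?". All file and step names are invented for the example.

```python
# Toy provenance trace as (subject, predicate, object) triples,
# using two PROV-O relations: prov:used and prov:wasGeneratedBy.
PROV = "http://www.w3.org/ns/prov#"

def run_step(graph, activity, inputs, output):
    """Record that `activity` used each input entity and generated `output`."""
    for entity in inputs:
        graph.append((activity, PROV + "used", entity))
    graph.append((output, PROV + "wasGeneratedBy", activity))

provenance = []
run_step(provenance, "run/fetch_sequences", ["data/accessions.txt"], "data/sequences.fasta")
run_step(provenance, "run/align", ["data/sequences.fasta"], "data/alignment.fasta")

def derived_from(graph, entity):
    """Trace a result back to its inputs by chaining the two relations."""
    for s, p, o in graph:
        if s == entity and p.endswith("wasGeneratedBy"):
            for s2, p2, o2 in graph:
                if s2 == o and p2.endswith("used"):
                    yield o2
                    yield from derived_from(graph, o2)

print(sorted(set(derived_from(provenance, "data/alignment.fasta"))))
# ['data/accessions.txt', 'data/sequences.fasta']
```

A real exporter serialises the same structure as RDF inside an RO bundle, so that the lineage query above can be asked of any archived run.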
At the "Metagenomics, metagenetics and Phylogenetic workflows for Ocean Sampling Day" workshop, Max Planck Institute for Marine Microbiology, Bremen, Germany, 2013-03-21
For PPTX source - download http://www.wf4ever-project.org/wiki/download/attachments/2064544/2013-03-21-OSD-Bremen-Stian-What+can+provenance+do+for+me.pptx
2011 07-06 SCUFL2 Poster - because a workflow is more than its definition (BO..., Stian Soiland-Reyes
See presentation at http://www.slideshare.net/soilandreyes/2011-0716-scufl2-because-a-workflow-is-more-than-its-definition-bosc-2011
From BOSC 2011 - http://www.open-bio.org/wiki/BOSC_2011_Schedule
2012 03-28 Wf4ever, preserving workflows as digital research objects
1. Wf4Ever:
Preserving workflows as
digital Research Objects
Stian Soiland-Reyes
myGrid, University of Manchester
EGI Community Forum 2012, Workflow Systems workshop
Leibniz Supercomputing Centre, Munich, 2012-03-28
2. My background
Taverna - Scientific Workflow Management System
http://www.taverna.org.uk/
~85000 downloads
EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…
myExperiment - Web 3.0 virtual environment, library and social network for workflows
http://www.myexperiment.org/
~5000 registered users
~2200 workflows
~21 different systems
3. “A biologist would rather share their toothbrush than their gene name”
Mike Ashburner (and others), Professor in Dept of Genetics, University of Cambridge, UK
4. http://www.myexperiment.org/
“Facebook for Scientists” ...but different to Facebook!
A repository of research methods
A social network of people and things
A Social Virtual Research Environment
A probe into researcher behaviour
Open source (BSD) Ruby on Rails app
REST and SPARQL, Linked Data
Influenced BioCatalogue, MethodBox and SysMO-SEEK
myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
6. Workflow Preservation
Research Objects
Provenance
Recommendation
Astronomy and Genomics
http://www.wf4ever-project.org/
7. Wf4Ever Challenges
Preservation of scientific workflows in data-intensive science
» Scientific workflows enable automation of scientific methods and encourage best practices to be shared
» Workflows need to be preserved for:
› Reuse, fundamental for incremental scientific development
› Method reproducibility, key for credit and publication
» Workflow preservation is complex!
» Heterogeneous types of information need to be aggregated, including workflows and related resources forming research objects
» Research objects need to be trusted and understandable n years from now
» Social aspects need to be addressed in order to support reuse in scientific communities
8. The R.* dimensions
Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
9. Wf4Ever: Forms of decay
Workflow Decay
• Service decay: flux/decay/unavailability
• Data decay: formats/ids/standards
• Infrastructure decay: platform/resources
Experiment Decay
• Methodological changes
• New technologies
• New resources/components
• New data
10. Preservation, Conservation, Recreating
Preserving (Archived Record, Fixed Snapshots): Review, Rerun & Replay
Conserving (Active Instrument, Live): Rerun & Reuse, Repair & Restore
Recreating (Archived Record + Active Instrument, Live): Rebuild, Recycle, Repurpose
11. Workflow Decay
[Figure: decay at different abstraction levels of a workflow, with each layer in flux and requiring redo; diagram from http://www.gridworkflow.org/kwfgrid/gwes/docs/]
14. Research Object model core (simplified)
http://purl.org/wf4ever/ro#
[Diagram: a ro:ResearchObject ore:aggregates ro:Resource instances (such as a wfdesc:Workflow) and ro:AggregatedAnnotation instances, which ro:annotatesAggregatedResource the resources they describe; the research object ore:isDescribedBy a ro:Manifest.]
Note: This figure shows a simplified view of the RO core.
RO specification: http://wf4ever.github.com/ro/
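The simplified core model on this slide can be restated as triples. This sketch uses plain Python tuples with prefixed names standing in for full URIs (an RDF library would normally be used); the `ex:` identifiers are invented for the example.

```python
# The simplified RO core as subject/predicate/object triples.
# Prefixed names (ro:, ore:, wfdesc:) stand for URIs in the RO vocabularies.
triples = [
    ("ex:ro", "rdf:type", "ro:ResearchObject"),
    ("ex:ro", "ore:isDescribedBy", "ex:manifest"),
    ("ex:manifest", "rdf:type", "ro:Manifest"),
    # The research object aggregates resources, e.g. a workflow definition:
    ("ex:ro", "ore:aggregates", "ex:workflow"),
    ("ex:workflow", "rdf:type", "ro:Resource"),
    ("ex:workflow", "rdf:type", "wfdesc:Workflow"),
    # Annotations are aggregated too, and point at the resources they describe:
    ("ex:ro", "ore:aggregates", "ex:annotation"),
    ("ex:annotation", "rdf:type", "ro:AggregatedAnnotation"),
    ("ex:annotation", "ro:annotatesAggregatedResource", "ex:workflow"),
]

def aggregated(graph, ro):
    """All resources a research object ore:aggregates."""
    return {o for s, p, o in graph if s == ro and p == "ore:aggregates"}

print(sorted(aggregated(triples, "ex:ro")))  # ['ex:annotation', 'ex:workflow']
```

In practice such a graph is serialised (e.g. as RDF or JSON-LD) into the ro:Manifest, so the aggregation and annotations travel with the research object.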
21. Roadmap
Year 1 (Dec 2010 to Dec 2011): Exploration
» Problem specification and requirements identification
» Better understanding of workflow preservation needs from the domains (what does it mean to preserve a scientific workflow?)
» Proofs of concept
» Preliminary models, components, and integrated reference implementation
» Result identification
22. Roadmap
Year 2 (Dec 2011 to Dec 2012): Realization/validation
› Validate the models, architectures and software in practice
› Distributed components with different access/security arrangements, forming REST APIs and specifications
› RO Content Campaign: generate 1000s of ROs
› First productization phase: stable releases of models and reference implementation
› Decay monitoring and notification (why is my workflow no longer stable?), reacting to decay, attribution and credit support beyond recommendation; detailed use of provenance
› Execution and interoperability support (SHIWA integration)
23. Roadmap
Year 3 (Dec 2012 to Dec 2013): Exploitation
› Final productization phase
› Deployment in user environments and systems, enhanced with workflow preservation capabilities: RO-enabled myExperiment, RO-enabled Galaxy, RO-enabled dataVerse, and more!
› Deployment with publishers, e.g. Elsevier, Digital Science, GigaScience
25. Thank you! Any questions?
http://www.wf4ever-project.org/
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.