The document discusses scientific workflows and their preservation in astronomy. It provides an overview of the Wf4Ever project, which aims to develop infrastructure for long-term preservation of scientific workflows across disciplines. The document discusses tools for workflow development, digital repositories, and initiatives related to workflow preservation in astronomy. It also outlines the astronomy work package in Wf4Ever, which involves developing exemplar workflows, integrating existing astronomy software, and engaging the astronomy community.
1. Grant agreement no.: 27092
Workflows Preservation
José Enrique Ruiz, Lourdes Verdes-Montenegro, Susana Sánchez,
Juan de Dios Santander-Vela and the Wf4Ever Team
IAA-CSIC
January 18th 2012
7th Workflow Working Group Meeting - AS OV France
2. Who am I?
Instituto de Astrofísica de Andalucía - CSIC
3. AMIGA Group
Analysis of the interstellar Medium of Isolated Galaxies
Statistical baseline of isolated galaxies to compare with the behaviour of galaxies in denser environments
Multi-wavelength study of ~1000 galaxies
+
Need for intensive and complex analysis of 3D data
2D spatial + 1 Velocity
IAA-CSIC
Univ. Granada, Obs. Marseille, Obs. Paris, NAOJ,
FCRAO, UNAM, Univ. Edinburgh, IRAM, ESO,
Kapteyn Astronomical Institute.
P.I. Lourdes Verdes-Montenegro
http://amiga.iaa.es
4. What is Wf4Ever?
EU funded FP7 STREP Project
December 2010 – December 2013
1. Intelligent Software Components (ISOCO, Spain)
2. University of Manchester (UNIMAN, UK)
3. Universidad Politécnica de Madrid (UPM, Spain)
4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
5. University of Oxford (OXF, UK)
6. Instituto de Astrofísica de Andalucía (IAA, Spain)
7. Leiden University Medical Centre (LUMC, NL)
5. What is Wf4Ever?
Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines
Partners
• One SME
• Six public organizations
Technological Core Competencies
• Digital Libraries
• Workflow Management
• Semantic Web
• Integrity & Authenticity
• Provenance
• Information Quality
Case Studies
• Astronomy (IAA)
• Genome-wide Analysis and Biobanking (LUMC)
Goals
• Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
• Creation of scientific communities to collaboratively share, reuse and evolve workflows and their parts, stimulating the development of new scientific knowledge
6. What are our Scientific workflows?
Combination of data and processes into a configurable and structured set of steps that implement semi-automated computational solutions in problem solving
Types of workflows in Astronomy
• Personal script-based recipes
Python, IDL, Software..
• Multi-archive VO recipes
• Internal group developments
GRID, Clusters..
• Processing pipelines
Provide Data, Computing Infrastructure, Tools..
Scientifically exploitable results vs. scientific insight
Easily accessible and reproducible (Shared)
Wfs on steroids!
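The definition on this slide (data and processes combined into a configurable, structured set of steps) can be illustrated with a minimal Python sketch. The `Step` and `Workflow` classes, the two step functions and the `min_flux` parameter are hypothetical names chosen for illustration; they are not taken from Taverna, Kepler or any Wf4Ever tool.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One named process in the workflow (hypothetical structure)."""
    name: str
    func: Callable[[Any, Dict[str, Any]], Any]


@dataclass
class Workflow:
    """An ordered set of steps sharing one configuration."""
    steps: List[Step]
    config: Dict[str, Any] = field(default_factory=dict)

    def run(self, data: Any):
        executed = []  # very small execution trace: names of the steps run
        for step in self.steps:
            data = step.func(data, self.config)
            executed.append(step.name)
        return data, executed


def select_bright(fluxes, config):
    # Keep only the sources at or above the configured threshold.
    return [f for f in fluxes if f >= config["min_flux"]]


def mean_flux(fluxes, config):
    return sum(fluxes) / len(fluxes)


workflow = Workflow(
    steps=[Step("select_bright", select_bright), Step("mean_flux", mean_flux)],
    config={"min_flux": 2.0},
)
result, executed = workflow.run([1.0, 2.0, 4.0])
print(result, executed)
```

Changing `min_flux` reconfigures the experiment without touching the steps, which is the sense in which such a recipe is "configurable"; the list of executed step names stands in for the much richer provenance a real workflow engine would record.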
7. Why is workflow preservation important?
• Astronomy research is entirely digital
• Time has come to go “Beyond the PDF”
Preserved experiments
• Methodology “in action”
• All data are exposed
• Reproducible
• Repeatable
• Re-usable
• Re-purposeable
• Participatory
• Collaborative
• Formative
Callouts: Discoverable! · Trust assessment · Social aspect
8. Related Initiatives
Cyber-SKA
Provide infrastructure that will be required to address the needs of future radio telescopes such as the Square Kilometre Array
Web based workflow builder
• Image segmentation
• Image mosaicking (Montage)
• Spatial reprojection
• Plane extraction from data cubes
IceCore
Web portal for executing workflows – University of Helsinki
Common interface for Wfs distributed in different engine servers
9. Related Initiatives
Montage
• FITS Image Mosaicking
• Toolkit for Desktops, Clusters and Grids
Astro-WISE
• Distributed data storage and computing infrastructure
• Track process provenance of final data products
• Calibration and analysis of images
Helio-VO
• Solar physics Virtual Observatory
• Enable workflow execution via Taverna Server
Workflows VO France
• Provide use cases, mainly VO-oriented
• AÏDA Workflow System implements FITS validation with CharDM
13. Tools
ESO Reflex
Finland’s in-kind contribution to ESO
• Prototype/feasibility study
• Initially based on Taverna 1
Current implementation based on Kepler
AstroTaverna
AstroGrid Development
Prototype, marrying of VO Desktop & Taverna 1
Library of Taverna functions to access VO Desktop’s API
Status
Wrapper libraries only for Taverna 1
14. Digital Repositories
The recipes store
Oxford e-Research Centre
• Find workflows
• Share workflows and files
• Find people
• Build communities
• Publish packages
• Tag workflows
• Score and rate workflows
• Comment on workflows
• Write reviews
15. Digital Repositories
Astronomy in MyExperiment
• 10 interested users
• No VO-services-based Wfs
• Some Helio Project Wfs
• VOTables parsing
• Internal services
• Astro-Shims
• BioCatalogue vs. VORegistry
Astro-Wf4Ever specific Wfs
• Catalogue Queries
16. The upcoming context
Processes should benefit from the same privileges acquired by Data
Digital Libraries of Workflows may boost the use of the existing infrastructure of data (VO)
Users need templates!
Wf4Ever is also a project about
• How to publish
• How to do peer review
• Improving visibility by reference and attribution
Publishers should play an important role
17. The upcoming context
The next generation of archives
Much wider FoV and spectral coverage
• Huge datasets (~ tens of TB)
• Big Data science highly dependent on I/O data rates
• Subproducts as virtual data generated on-the-fly
Automated surveys
• Huge amount of tabular data
• Services for Knowledge Discovery in Databases
18. The upcoming context
We are moving into a world where
• computing and storage are cheap
• data movement is death
Archives should evolve from data providers into virtual data and services providers, where web services may help to solve bandwidth issues.
Archives speaking self-descriptive web services
• Smaller virtual data subproducts
• Distributed, multi-archive, multi-wavelength astronomy
19. Considerations
(Data) Workflow preservation
• Interpreted through their execution
• Complex models are required to describe them
• Severely vulnerable to obsolescence
  • Applications
  • Libraries
  • Operating environment
• Provenance is a complex issue in a cloud of services
• Resources are often beyond control of scientists
• Alleviate decay of external resources via alternates
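The last point, alleviating the decay of external resources via alternates, can be sketched as a simple fallback loop. This is only an illustration under stated assumptions: `fetch_with_alternates`, `fake_fetch` and the `example.org` locations are hypothetical, and the network call is replaced by an in-memory stand-in so the sketch runs on its own.

```python
from typing import Callable, Iterable, Tuple


def fetch_with_alternates(
    locations: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Return (location, content) from the first location that answers."""
    failures = []
    for location in locations:
        try:
            return location, fetch(location)
        except Exception as exc:  # any failure means: try the next alternate
            failures.append(f"{location}: {exc}")
    raise LookupError("all alternates failed: " + "; ".join(failures))


# Stand-in for a network call (assumption): only the mirror is still alive.
AVAILABLE = {"mirror.example.org/catalogue": b"ra,dec\n10.68,41.27\n"}


def fake_fetch(location: str) -> bytes:
    if location not in AVAILABLE:
        raise ConnectionError("service not reachable")
    return AVAILABLE[location]


used, content = fetch_with_alternates(
    ["archive.example.org/catalogue", "mirror.example.org/catalogue"],
    fake_fetch,
)
print(used)
```

Recording which alternate was actually used matters for provenance: a replayed workflow that silently switched provider is not quite the same experiment.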
20. Considerations
(Data) Workflow preservation
• Versioning of the whole or its components
• Restricted access on data and processes
  • Permissions, licenses, platform, costs, etc.
• Semantic discovery of Wfs, processes, web services
• Metrics for quality: usage stats, logs, uptime, etc.
• Integrity evaluation
• Completeness checking
• Ensure trustworthiness and authenticity
• Workflows for workflow curation
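Integrity evaluation and completeness checking can be illustrated with checksums. The sketch below is an assumption about one possible approach, not the Wf4Ever method: it records a SHA-256 digest per resource at preservation time and later reports which resources are missing (completeness) or altered (integrity). The resource names and the helper functions are hypothetical.

```python
import hashlib
from typing import Dict, List


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def build_manifest(resources: Dict[str, bytes]) -> Dict[str, str]:
    """Record one digest per resource at preservation time."""
    return {name: sha256_hex(content) for name, content in resources.items()}


def check(manifest: Dict[str, str], resources: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Compare the current resources against the preserved manifest."""
    missing = sorted(name for name in manifest if name not in resources)
    altered = sorted(
        name
        for name in manifest
        if name in resources and sha256_hex(resources[name]) != manifest[name]
    )
    return {"missing": missing, "altered": altered}


# Hypothetical content of a preserved experiment.
preserved = {
    "workflow.t2flow": b"<workflow version='1'/>",
    "catalogue.vot": b"<VOTABLE/>",
}
manifest = build_manifest(preserved)

# Later: the catalogue has disappeared and the workflow file has changed.
current = {"workflow.t2flow": b"<workflow version='2'/>"}
report = check(manifest, current)
print(report)
```

A digest detects change but says nothing about whether the change is legitimate, so a check like this complements versioning rather than replacing it.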
21. A first approach in Workflow Preservation
Preserve, Retrieve, Reconstruct, Replay
• Retrieve (Characterization)
  • Functionality of the Wf or its modules
  • What are the inputs and outputs
  • Metadata, authority, keywords
• Reconstruct (Semantics and Modeling)
  • Understand dependencies and components
  • Technical specificities
• Replay (Execution Tools)
  • Check the success of the preservation method
• Referenced and acknowledged (Long-term IDs)
22. Wf4Ever Update
RO. The Research Object
All components related to the research lifecycle of an experiment should be available.
Preserved and easily retrievable
• Proposals
• Data
• Processes
• Publications
23. Astronomy WP in Wf4Ever
Development and Implementation of Golden Exemplars
• Local catalogue curation based on VO Archives
• Source extraction and crossmatching from 2D images
• Modeling and analysis of 3D velocity cubes of galaxies
Create a community of users
• Development of Prototypes and Tools
• Dissemination
Integrate existing astronomy software with Wf4Ever Tools
• SAMP and WebSAMP
Provide interoperable models, ontologies and vocabularies for the characterization of workflows, processes and RO components
24. Astronomy WP in Wf4Ever
• Characterization of the Astronomy domain in Wf
• Detailed study of standards and web services in IVOA
• Exploration of similar initiatives for the curation of digital objects
• Sociological study and working methodology of astronomers
• Extraction of user and technical requirements
• Extraction of Taverna user requirements for Astronomy
• Implementation of first Golden Exemplar
• Early contacts in IVOA for the creation of a community of users
25. Wf4Ever Update
Users’ Requirements
• Functional requirements for Wf4Ever “working” platform
• Focused on improving collaboration and reuse
• Interoperability in exchanging scientific methodology
• Expose the experiment in a structured way so that others can understand it
RO Modeling
• Model for interlinked components in a Research Object
• Strategies for assessing integrity and authenticity
• First attempts at metrics for Information Quality
We need to build what we would like to preserve
29. Wf4Ever Update
ROBox
Seamless contribution to a working collaborative platform
A shared folder in Dropbox becomes a Working RO
Automatic generation of metadata
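The idea of a folder turning into a Working RO with automatically generated metadata can be sketched as a directory walk that emits a manifest. This is not the ROBox implementation and does not use the Wf4Ever RO model vocabulary; `describe_folder` and the manifest keys (`aggregates`, `path`, `size_bytes`, `media_type`) are hypothetical, and a temporary directory stands in for the shared folder.

```python
import mimetypes
import tempfile
from pathlib import Path


def describe_folder(root) -> dict:
    """Build a minimal manifest for every file below `root`."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories are implied by the relative paths
        media_type, _ = mimetypes.guess_type(path.name)
        entries.append(
            {
                "path": path.relative_to(root).as_posix(),
                "size_bytes": path.stat().st_size,
                # Fall back to a generic type when the extension is unknown.
                "media_type": media_type or "application/octet-stream",
            }
        )
    return {"aggregates": entries}


# Stand-in for a shared folder (assumption): two files, one in a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes.txt").write_text("abc")
    (Path(tmp) / "data").mkdir()
    (Path(tmp) / "data" / "cube.fits").write_bytes(b"SIMPLE")
    manifest = describe_folder(tmp)

for entry in manifest["aggregates"]:
    print(entry["path"], entry["size_bytes"], entry["media_type"])
```

Only metadata that can be derived mechanically (paths, sizes, guessed media types) is produced here; authorship, purpose and links between resources would still need annotation by the scientist.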
36. Wf4Ever Update
Notification Service for Authors
What should be notified?
• Failures
• Downloads
• Annotations
• Linked/Similarity
• Modifications on Working RO
• Acknowledgements
Notification Management Tool
Avoid spam
37. Astronomy WP in Wf4Ever
Astronomy WP
• Development and Implementation of “Extraction of Sources”
• Development and Implementation of “Modelling of 3D Data”
• Explore experiments that could be migrated to the Wf/RO methodology
• Contribute to IVOA in Semantics for Processes
Other WPs
Continue Providing Feedback
• RO Model, Architecture, Integrity & Authenticity, Information Quality, etc.
• Software integration and improved functionalities (SAMP, Taverna, etc.)
• Prototypes for management and visualization of RO
Community engagement
• Approach Astro-Informaticians
• Continue pushing in the IVOA Community
• Tackle collaboration with Publishers
38. Workflows & IVOA
Distributed data analysis in the VO
• Panchromatic, multi-archive, multi-facility
• Executes in the VO Infrastructure
• Orchestration of simple services
Present processing pipelines
• Produce exploitable data
• Provenance modeling
• VO compliant data
Data processing from the VO
• Provide custom re-processing to VO users
• Virtual data generation through UWS in VOSpace
Workflows VO Characterization
• Inputs
• Outputs
• Processes
• Descriptions
• Metadata
• Etc..
39. Related activities in the VO
IVOA Working Groups
• Data Modeling
Characterization, Provenance..
• Semantics
Ontologies, Vocabularies for Processes
• Data Access Layer
TAP, self-descriptive Protocols..
• Grid and Web Services
UWS, VOSpace, SSO..
• Applications
SAMP
• IG. KDD
Knowledge Discovery and Data Mining
• IG. Data Curation and Preservation
Persistent Identifiers, Curation of VO Resources..
Wf4Ever Project, US VAO semantic linking of proposals, publications, data
IVOA Note
Scientific Workflows in the VO
André Schaaff & Jose Enrique Ruiz
workflow@ivoa.net