Wf4Ever: Workflows for Methodology and Science Preservation
 

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • Discoverability is a problem, as shown by Peter Teuben.
  • The obsolescence problem has been illustrated by Harry Teplitz and by the previous BoF on data preservation. The call is clear: methods should also be preserved.
  • Long-term ID preservation becomes a problem in itself.
  • If IDs are not persisted, we lose our research objects. This is part of the discoverability problem.

Presentation Transcript

  • Grant agreement no.: 27092
    Workflows for Methodology and Science Preservation
    Juan de Dios Santander Vela, on behalf of L. Verdes-Montenegro, J.E. Ruiz, S. Sánchez, and the Wf4Ever collaboration
    European Southern Observatory, ALMA Archive Subsystem
    Instituto de Astrofísica de Andalucía-CSIC, AMIGA Group (January 2012)
  • Who am I?
    █ Ph.D. within the AMIGA group on making radio astronomical archives and tools work with the Virtual Observatory
    █ Applied Scientist at the ESO VLT Archive, specialised in metadata management
    █ Currently working on the ALMA Science Archive, from the backend to the web GUI
    █ From January 2012, working for the Wf4Ever project on bringing radio astronomical workflows to life
  • AMIGA
    █ AMIGA: Analysis of the Interstellar Medium of isolated GAlaxies
      ‣ Multi-wavelength, multi-object study of isolated galaxies with strict isolation criteria
      ‣ Careful curation of data
      ‣ Very careful processing of new parameters from
        • Group’s own observation programs and data reduction
        • Literature table scanning
        • Virtual Observatory table harvesting and parsing
      ‣ Emphasis on marrying astronomy and computer science, and buy-in of the VO
    e-Science believers!
  • What is Wf4Ever?
    EU-funded FP7 STREP project, December 2010 – December 2013
    1. Intelligent Software Components (ISOCO, Spain)
    2. University of Manchester (UNIMAN, UK)
    3. Universidad Politécnica de Madrid (UPM, Spain)
    4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
    5. University of Oxford (OXF, UK)
    6. Instituto de Astrofísica de Andalucía (IAA, Spain)
    7. Leiden University Medical Centre (LUMC, NL)
  • What is Wf4Ever?
    Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines
    █ Partners
      ‣ One SME
      ‣ Six public organisations
    █ Core Competencies (Tech)
      ‣ Digital Libraries
      ‣ Workflow Management
      ‣ Semantic Web
      ‣ Integrity & Authenticity
      ‣ Provenance
      ‣ Information Quality
    █ Case Studies
      ‣ Astronomy (IAA)
      ‣ Genome-wide Analysis and Biobanking
    █ Goals
      ‣ Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
      ‣ Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
  • What are workflows?
    Combination of data and processes into a configurable and structured set of steps that implement semi-automated, problem-solving, computational solutions
    █ Types of workflows in Astronomy
      ‣ Personal script-based recipes
      ‣ Internal group developments ✱
      ‣ Multi-archive VO experiments
      ‣ The classical processing pipeline ✱
      ‣ Driving pipelines from VO services (TBD)
    ✱ Scientifically exploitable results vs. scientific insight
    Easily accessible and reproducible
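The definition above (data plus processes, arranged as a configurable and structured set of steps) can be illustrated with a minimal sketch. Everything here is hypothetical and only for illustration: the `Step` and `Workflow` classes, the toy magnitude data, and the two example steps are assumptions, not part of Wf4Ever or of any workflow system named in the talk.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One named processing step; `func` maps the shared context to new values."""
    name: str
    func: Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass
class Workflow:
    """An ordered, configurable set of steps applied to input data."""
    steps: List[Step] = field(default_factory=list)

    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        context = dict(inputs)  # copy so the caller's inputs stay untouched
        for step in self.steps:
            context.update(step.func(context))
        return context


# Hypothetical two-step recipe: select bright sources, then count them.
def select_bright(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"bright": [m for m in ctx["magnitudes"] if m < ctx["limit"]]}


def count_bright(ctx: Dict[str, Any]) -> Dict[str, Any]:
    return {"n_bright": len(ctx["bright"])}


workflow = Workflow([Step("select", select_bright), Step("count", count_bright)])
result = workflow.run({"magnitudes": [12.1, 15.3, 9.8], "limit": 13.0})
print(result["n_bright"])
```

Keeping the steps as data (a list of named callables) rather than as one opaque script is what makes the recipe configurable: a step can be swapped or reordered without touching the others.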
  • What tools are available?
  • The importance of workflow preservation
    Astronomy research is entirely digital: time to go “beyond the PDF”
    █ Preserved experiments
      ‣ Methodology “in action”
      ‣ All data are exposed
      ‣ Reproducible
      ‣ Repeatable
      ‣ Re-usable
      ‣ Re-purposeable
      ‣ Participatory
      ‣ Collaborative
      ‣ Formative
    Slide callouts: Trust assessment · Social aspect of science · New kind of publication? · Discoverable!
  • Workflow preservation considerations
    Workflow, not data preservation
    █ Workflows are interpreted through their execution
      ‣ Complex models are required to describe them
    █ Severely vulnerable to obsolescence
      ‣ Applications
      ‣ Libraries
      ‣ Operating environment
    █ Provenance is a complex issue in a cloud of services
    █ Resources are often beyond the control of scientists
    █ Alleviate decay of external resources via alternates
    █ Ensure trustworthiness and authenticity
  • Workflow preservation considerations
    Workflow, not data preservation
    █ Versioning of the whole workflow, or its components
    █ Access control policies on data and processes
    █ Permissions, licenses, platform, costs, etc.
    █ Semantic discovery (WFs, processes, web services)
    █ QA: usage, logs, uptime…
    Workflows and processes should benefit from the same privileges acquired by data
  • First Approach to Workflow Preservation
    Preserve, Retrieve, Reconstruct, Replay
    █ Retrieve
      ‣ Functionality of the WF and/or its modules
      ‣ What are the inputs and outputs
      ‣ Metadata: Authority, Complexity, Keywords…
    █ Reconstruct
      ‣ Understand dependencies and components
      ‣ Technical specificities
    █ Replay
      ‣ Check the success of the preservation method
    █ Referenced and acknowledged
    Slide callouts: Characterisation · Tools · Semantics & Modelling · Long term IDs
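The Replay step ("check the success of the preservation method") can be sketched as re-running a preserved workflow on its preserved input and comparing a fingerprint of the output with one recorded at preservation time. This is only one possible check, under the assumption that outputs are expected to be byte-identical; `toy_workflow`, `replay_matches`, and the sample input are hypothetical names introduced for illustration.

```python
import hashlib
from typing import Callable


def digest(data: bytes) -> str:
    """SHA-256 hex digest used as a fingerprint of a workflow output."""
    return hashlib.sha256(data).hexdigest()


def replay_matches(run: Callable[[bytes], bytes], preserved_input: bytes,
                   preserved_digest: str) -> bool:
    """Re-run the workflow on the preserved input and compare fingerprints."""
    return digest(run(preserved_input)) == preserved_digest


# Hypothetical workflow: upper-case a small text table.
def toy_workflow(data: bytes) -> bytes:
    return data.upper()


original_input = b"name,ra,dec\n"
stored_digest = digest(toy_workflow(original_input))  # recorded at preservation time

# Replaying the same workflow reproduces the stored fingerprint.
ok = replay_matches(toy_workflow, original_input, stored_digest)

# A decayed or altered workflow (here: lower-casing instead) does not.
broken = replay_matches(lambda d: d.lower(), original_input, stored_digest)
```

A byte-level comparison is deliberately strict; a real check might need a tolerance-based comparison where outputs legitimately vary between runs (for example, timestamps in headers or floating-point differences across platforms).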
  • More than a WF: The Research Object (RO)
    █ All components related to the research lifecycle of an experiment should be available
    █ Preserved and easily retrievable
      ‣ Proposals
      ‣ Data
      ‣ Processes
      ‣ Workflows
      ‣ Publications
    All linked by persistent IDs
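As a rough illustration of components "all linked by persistent IDs", the sketch below models a Research Object as a container that aggregates typed resources keyed by identifier. It is not the Wf4Ever RO model: the class names, the `example:` identifier scheme, and the sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Resource:
    pid: str    # persistent identifier (the "example:" scheme is an assumption)
    kind: str   # "proposal", "data", "process", "workflow", or "publication"
    title: str


@dataclass
class ResearchObject:
    pid: str
    resources: Dict[str, Resource] = field(default_factory=dict)

    def add(self, resource: Resource) -> None:
        self.resources[resource.pid] = resource

    def by_kind(self, kind: str) -> List[Resource]:
        return [r for r in self.resources.values() if r.kind == kind]

    def resolve(self, pid: str) -> Resource:
        # A KeyError here is the "lost research object" case:
        # the identifier no longer resolves to anything.
        return self.resources[pid]


ro = ResearchObject(pid="example:ro/1")
ro.add(Resource("example:data/1", "data", "Input catalogue"))
ro.add(Resource("example:wf/1", "workflow", "Cone search workflow"))
ro.add(Resource("example:pub/1", "publication", "Resulting paper"))

workflows = ro.by_kind("workflow")
```

Because every lookup goes through the identifier, the sketch also shows why identifier persistence matters: once an ID stops resolving, the component is effectively lost even if its bytes still exist somewhere.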
  • Wf4Ever Update
    █ User Requirements
      ‣ Functional requirements for Wf4Ever “working” platform
      ‣ Focused on improving collaboration and reuse
      ‣ Interoperability in exchanging scientific methodology
      ‣ Expose experiment in a structured way to be understood by others
    █ RO Modeling
      ‣ Model for interlinked components in a Research Object
      ‣ Strategies for assessing integrity and authenticity
      ‣ Attempts at metrics for Information Quality
    We need to build what we want to preserve!
  • Wf4Ever Update
      ‣ Architecture
        • Search & Retrieval Service
        • Recommender Service
        • I & A Evaluation Service
        • Notification Service
      ‣ User-Tools Prototypes
        • RO Command Line Tool
        • RO Annotator
        • RO Box
  • New Workflows in myExperiment
    [Screenshot: myExperiment search results for "virtual observatory", showing 5 results (3 workflows, 1 group, 1 user)]
      ‣ AMIGA ConeSearch (v3), Taverna 2, BSD License, created 11/07/11: this workflow provides a VOTable response from the AMIGA ConeSearch service and extracts values from VOTable columns. Tags: astronomy | virtual observatory | votable
      ‣ AMIGA ConeSearch from a file of targets/positions (v1), Taverna 2, BSD License, created 12/07/11
      ‣ Group "AstroGrid and the VO" (unique name: astrogrid.org, created 05 February 2008): this group will enable astronomers and astrophysicists who use the AstroGrid-Taverna workflow system to share their workflows. For more information see the AstroGrid website http://www.astrogrid.org. In addition, emerging International Virtual Observatory Alliance (IVOA, see http://www.ivoa.net) efforts in the workflow arena will be referenced.
  • Wf4Ever Update
    █ ROBox
      ‣ Seamless contribution to a working collaborative platform
      ‣ A shared folder in Dropbox becomes a Working RO
      ‣ Automatic metadata generation
    Could be based on VOSpace!
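To give a feel for what "automatic metadata generation" over a shared folder might involve, the sketch below walks a directory and records a path, size, and checksum for each file. This is not ROBox's actual behaviour or metadata format, which the slides do not describe; the `build_manifest` function, the manifest layout, and the sample file names are assumptions for illustration only.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(folder: Path) -> dict:
    """Describe every file under `folder` by relative path, size, and SHA-256."""
    entries = []
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            entries.append({
                "path": path.relative_to(folder).as_posix(),
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return {"resources": entries}


# Stand-in for a shared folder: a temporary directory with two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "workflow.t2flow").write_bytes(b"<workflow/>")
    (root / "data").mkdir()
    (root / "data" / "targets.csv").write_bytes(b"name,ra,dec\n")
    manifest = build_manifest(root)

print(json.dumps(manifest, indent=2))
```

Checksums in the manifest make later modifications to the working folder detectable, which ties in with the integrity and authenticity concerns raised earlier in the talk.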
  • Wf4Ever Update
    [Screenshot with callouts: Structure in Dropbox · Metadata for selected item · Unstructured, rich-text metadata editor]
  • Wf4Ever Update
    Notification Service for Authors
    █ What should be notified?
      ‣ Failures
      ‣ Downloads
      ‣ Annotations
      ‣ Linked/Similarity
      ‣ Modifications on Working RO
      ‣ Acknowledgements
    █ Notification Management Tool
      ‣ Avoid spam
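One way to read "avoid spam" is to let authors subscribe to the event types they care about and to batch matching events into a per-RO summary instead of sending one message per event. The sketch below shows that idea only; it is not the Wf4Ever Notification Service, and the event names, `Event` class, and `build_digest` function are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set

# Event types taken from the slide's list; the exact spellings are assumptions.
EVENT_TYPES = {"fail", "download", "annotation", "similarity",
               "modification", "acknowledgement"}


@dataclass(frozen=True)
class Event:
    ro_id: str
    kind: str


def build_digest(events: Iterable[Event], subscribed: Set[str]) -> Dict[str, Dict[str, int]]:
    """Count subscribed events per Research Object, dropping everything else."""
    unknown = subscribed - EVENT_TYPES
    if unknown:
        raise ValueError(f"unknown event types: {sorted(unknown)}")
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for event in events:
        if event.kind in subscribed:
            counts[event.ro_id][event.kind] += 1
    return {ro_id: dict(kinds) for ro_id, kinds in counts.items()}


events = [
    Event("ro-1", "download"),
    Event("ro-1", "download"),
    Event("ro-1", "annotation"),
    Event("ro-2", "fail"),
]

# An author who only wants to hear about failures and downloads.
summary = build_digest(events, subscribed={"fail", "download"})
```

Here four raw events collapse into two summary entries, and the annotation the author did not subscribe to is never sent.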
  • Conclusions
    █ Workflows are a powerful, semantically rich way of describing astronomical knowledge discovery methods
      ‣ Provide both glue and structure to the method
      ‣ Also allow for metadata encapsulation
    █ Preserving workflows allows for method reuse, experiment replay, dissemination, attribution, and trust building
    █ Wf4Ever is providing a framework for allowing astronomers to start using workflows without leaving their tools
      ‣ But with the idea of nudging them toward more structured workflow descriptions