Workflows to Access and Massage VO Data
Presentation Transcript
1 Workflows: Access and Massage VO Data
José Enrique Ruiz, on behalf of the Wf4Ever Team
IVOA INTEROP SPRING MEETING 2013
HEIDELBERG, MAY 16th 2013
2 Wf4Ever
Advanced Workflow Preservation Technologies for Enhanced Science, 2011 - 2013
1. Intelligent Software Components (ISOCO, Spain)
2. University of Manchester (UNIMAN, UK)
3. Universidad Politécnica de Madrid (UPM, Spain)
4. Poznan Supercomputing and Networking Centre (Poland)
5. University of Oxford and OeRC (OXF, UK)
6. Instituto Astrofísica Andalucía (IAA-CSIC, Spain)
7. Leiden University Medical Centre (LUMC, NL)
[Figure: partners labelled 1 to 7]
3 What is a Scientific Workflow?
» A mechanism for coordinating the execution of services and codes, and linking together resources.
» The combination of data and processes into a configurable, modular, structured set of steps that implement semi-automated computational solutions in scientific problem-solving.
» The implementation of a scientific method.
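A minimal sketch of the "configurable, modular, structured set of steps" idea, not taken from any workflow system named in these slides. The step names, the `run_workflow` helper and the magnitude values are all hypothetical and only illustrate how data and processes can be chained.

```python
from typing import Callable, Dict, List

# Hypothetical step type: a step reads the shared context and returns new entries.
Step = Callable[[Dict[str, object]], Dict[str, object]]


def run_workflow(steps: List[Step], inputs: Dict[str, object]) -> Dict[str, object]:
    """Run the steps in order, feeding each one the accumulated results."""
    context = dict(inputs)
    for step in steps:
        context.update(step(context))
    return context


def select_bright(ctx):
    # Keep sources brighter than the configurable magnitude limit.
    return {"bright": [m for m in ctx["mags"] if m < ctx["mag_limit"]]}


def count_sources(ctx):
    # Derive a summary value from the output of the previous step.
    return {"n_bright": len(ctx["bright"])}


# Illustrative inputs only; changing "mag_limit" reconfigures the workflow.
result = run_workflow(
    [select_bright, count_sources],
    {"mags": [14.2, 17.9, 15.1], "mag_limit": 16.0},
)
print(result["n_bright"])
```

Each step only depends on named entries in the context, so steps can be swapped or reordered without touching the others.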
4 State of the art in Astronomy
» IVOA Note Definition
» Wf Software
› Taverna
› Kepler
› Pegasus
› Triana
› ESO Reflex
Related Initiatives
› ER-Flow
› VAMDC
› Helio-VO
› Cyber-SKA
› IceCore
› Montage
› Astro-WISE
› AstroGrid
In the VO
› GWS WG
› VO France WF WG
› VAMDC
› AstroGrid
6 Digital Astronomy
Going beyond Automation!
Improving Documentation and Readability!
7 AstroTaverna
AstroTaverna Workflows: Retrieving and Manipulating VO Data
• ConeSearch
• SIA
• SSA
• TAP coming soon…
• Tabular Data (VOTables)
• Images, but not yet Spectra
• Crossmatching, Filtering, Name Resolving, Coordinates and reference system transformation, Data massage (STILTS)
• Overplotting source catalogs on Images and filtering, overplotting circles, ellipses, etc. as a function of physical magnitude. Resampling, crops, blinks, mosaics, movies, RGBs, fusion, diff (ALADIN)
• SAMP for final inspection
+ Catalogs on HTML Pages
+ Advanced Analysis using Scripts
+ SOAP/REST Web Services
+ SQL access to JDBC databases
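As an illustration of the ConeSearch access listed above, here is a minimal sketch that builds a query URL from the three positional parameters RA, DEC and SR (decimal degrees) defined by the IVOA Simple Cone Search recommendation. The service address `http://example.org/scs` and the helper name `cone_search_url` are placeholders, and this is not how AstroTaverna itself issues the request.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra_deg: float, dec_deg: float, radius_deg: float) -> str:
    """Build a Simple Cone Search query URL (RA, DEC, SR in decimal degrees)."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("SR must not be negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    # Choose the separator depending on what the base URL already contains.
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return f"{base_url}{separator}{query}"


# Placeholder service address, for illustration only.
url = cone_search_url("http://example.org/scs", 180.5, -1.25, 0.1)
print(url)
```

The response of a real service would be a VOTable, which is where the tabular data manipulation listed above takes over.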
8 AstroTaverna
VOData Access: VO Services Discovery
http://amiga.iaa.es/p/290-astrotaverna.htm
9 AstroTaverna
VOData Massage: VOTables, STILTS, Aladin, TerminalSim
http://amiga.iaa.es/p/290-astrotaverna.htm
10 VOData Consumers
11 VOData Manipulation
Massage of Tabular Data
X-Matching, Calculation, Additions, Filtering, Access
12 VOData Manipulation
X-Matching, Calculation, Additions, Filtering, Access
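The manipulation stages named on these slides (filtering, column additions, cross-matching) can be sketched in plain Python as follows. AstroTaverna delegates this work to STILTS; the code below is not STILTS, and the table rows, column names and the 2 arcsec radius are invented for illustration.

```python
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def crossmatch(left, right, radius_deg):
    """Pair each left row with its nearest right row within radius_deg (brute force)."""
    pairs = []
    for lrow in left:
        best_id, best_sep = None, None
        for rrow in right:
            sep = angular_sep_deg(lrow["ra"], lrow["dec"], rrow["ra"], rrow["dec"])
            if best_sep is None or sep < best_sep:
                best_id, best_sep = rrow["id"], sep
        if best_sep is not None and best_sep <= radius_deg:
            pairs.append((lrow["id"], best_id))
    return pairs


# Invented toy tables standing in for two VOTables.
optical = [
    {"id": "o1", "ra": 10.0000, "dec": 20.0000, "g": 15.2, "r": 14.7},
    {"id": "o2", "ra": 10.5000, "dec": 20.2000, "g": 19.5, "r": 19.1},
    {"id": "o3", "ra": 11.0000, "dec": 19.8000, "g": 16.0, "r": 15.2},
]
radio = [
    {"id": "r1", "ra": 10.0002, "dec": 20.0001},
    {"id": "r2", "ra": 12.0000, "dec": 21.0000},
]

# Filtering: keep rows brighter than r = 18.
bright = [row for row in optical if row["r"] < 18.0]

# Column addition: compute a g - r colour for each remaining row.
for row in bright:
    row["g_r"] = round(row["g"] - row["r"], 3)

# Cross-matching within 2 arcsec.
matches = crossmatch(bright, radio, radius_deg=2.0 / 3600.0)
print(matches)
```

The brute-force loop is quadratic in the table sizes, which is acceptable for a toy example but not for survey-sized catalogs.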
13 VOData Curation
14 VOData Curation
15 VOTable Format Interoperability
Calculation of Luminosity Profiles for a Sample of Galaxies extracted from SDSS DR8
90 galaxies observed in 3 bands
16 Method Inspection
Aladin scripts and macros executed in GUI/noGUI mode
17 VOData Inspection
Aladin scripts and macros executed in GUI/noGUI mode
SAMP
18 Learning by example
19 The Virtual Observatory
VO compliant data from pipelines
Traditional data processing pipelines, e.g., instrumental or survey data processing pipelines, which produce higher-level data products. At present there are many variants of these and they have little or no direct connection to VO, aside from possibly producing VO-compliant data or being optionally driven from VO.
It is not clear to what extent VO mechanisms are needed at this level (VO compliant data and metadata, modelling provenance, etc.).
20 The Virtual Observatory
Driving Data Processing Pipelines from the VO
In this case we have a traditional data processing pipeline and the remote user or client software invokes a job to do some pipeline reprocessing, e.g., to custom reprocess an instrumental dataset to produce a new image, cube, etc.
The "workflow" in this case runs at a single site, and VO is used to drive the job remotely (SSO, UWS) and manage the results (VOSpace, VO data services).
We could think of integrating the traditional data processing pipelines we already have with VO, to allow VO users to do on-the-fly reprocessing to generate data products which can be analysed with VO (custom reprocessing of observatory data, for example).
Some attempts to integrate general processing applications have been made with CEA and UWS.
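Driving a remote job through UWS largely comes down to creating a job, then polling its phase until it reaches a terminal state. The sketch below shows only the polling logic, using the UWS phase names COMPLETED, ERROR and ABORTED as terminal states. The `wait_for_job` helper is hypothetical, and `get_phase` is a stand-in callable: against a real service it would read the job's `phase` resource over HTTP, which is left out here.

```python
import time
from typing import Callable

# Phases after which a UWS job no longer changes state.
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def wait_for_job(get_phase: Callable[[], str],
                 poll_seconds: float = 5.0,
                 max_polls: int = 100) -> str:
    """Poll get_phase() until the job reaches a terminal phase, then return it."""
    for _ in range(max_polls):
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase
        time.sleep(poll_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")


# Stand-in for a remote service: a canned sequence of phases (illustrative only).
fake_phases = iter(["PENDING", "QUEUED", "EXECUTING", "EXECUTING", "COMPLETED"])
final_phase = wait_for_job(lambda: next(fake_phases), poll_seconds=0.0)
print(final_phase)
```

Passing the phase reader as a callable keeps the polling loop independent of the HTTP client, so the same loop can be tested offline as done here.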
21 The Virtual Observatory
Distributed Data Analysis Workflows
In this case a user or a client defines and executes a distributed workflow, which invokes services on multiple remote sites via the VO infrastructure. The workflow would be entirely in VO-space, driving simpler services at the individual sites.
The AstroTaverna developments provide a graphical tool for the composition and design of workflows based on VO services and data from different archives and facilities.
Self Descriptive Web Services: S3, SimDAL, PDL, DataLink