Status update OEG - Nov 2012

Summary of the work done in my summer internship plus the current status of my thesis (2012)

Published in: Technology

  1. Work at ISI, Current Status, Next Steps
     Daniel Garijo Verdejo, Oscar Corcho, Yolanda Gil
     Ontology Engineering Group, Laboratorio de Inteligencia Artificial, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
     Date: 29/11/2012
  2. What am I working on?
     • Creation of abstractions in scientific workflows
       • Workflow traces and template representation
       • Provenance representation
       • Plan representation
       • Abstraction catalog
       • Find ways to link the definitions to the provenance traces automatically
     • Understandability and reuse of scientific workflows
  3. Outline
     1. Motivation
     2. Overview
     3. Workflow systems used
     4. Summary of work done in my previous visit to ISI
        • OPMW and provenance publishing
     5. Summary of work done before second visit to ISI
        • Workflow motif catalog
     6. Summary of work done in my second visit to ISI
        • OPMW-PROV and P-PLAN
        • Automatic macro abstraction detection
     7. Next Steps
     8. Future work
  4. Motivation
     • As a designer: Discovery
       • Workflows with similar functionality fragments/methods
       • Design based on previous templates
     • As a user/reuser: Understandability
       • Search workflows by functionality
       • Commonalities between execution runs
       • Component categorization
  5. Overview (layer diagram; supporting technology in parentheses)
     • Abstraction definitions and categorization (descriptions, PSMs/ontologies)
     • Algorithms for finding the different abstractions automatically (data mining tools, graph analysis, etc.)
     • Experiment publication (RDF stores)
     • Provenance representation and plan representation (vocabularies)
  6. Taverna and Wings
     • http://www.taverna.org.uk/
     • http://www.wings-workflows.org/
     IEEE eScience 2012. Chicago, USA
  7. Summary: Previous work at ISI (layer diagram)
     • Abstraction definitions and categorization
     • Algorithms for finding the different abstractions automatically
     • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
     • Provenance representation and plan representation: OPMW
  8. High-level architecture (diagram)
     • Workflow sources: WINGS on local laptop, WINGS on shared host, WINGS on web server, and other workflow environments
     • Each WINGS box shows: Core, Portal, Workflow Template, Workflow Instance, OPM export
     • Linked Data Publication of the exported workflows
     • Access: programmatic access (external apps) and interactive browsing (Pubby frontend) for users
     • Stages along the bottom: Wings workflow generation, OPM conversion, Publication, Share, Reuse
  9. OPMW: Process view (figure)
  10. OPMW: Attribution view (figure)
  11. Work previous to second visit to ISI (layer diagram)
      • Abstraction definitions and categorization: Motif Detection
      • Algorithms for automatic matching
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW
  12. Overview
      • Empirical analysis of 177 workflow templates from Taverna and Wings
      • Catalog of recurring patterns: scientific workflow motifs
        • Data-oriented motifs
        • Workflow-oriented motifs
      • Understandability and reuse
      Image: http://sensefinancial.com/wp-content/uploads/2012/02/contribution.jpg
  13. Approach
      • Reverse-engineer the set of current practices in workflow development through an analysis of empirical evidence
      • Identify workflow abstractions that would facilitate understandability and therefore effective re-use
  14. Workflow Motifs
      • Workflow motif: domain-independent conceptual abstraction on the workflow steps
      1. Data-oriented motifs (WHAT?): What kind of manipulations does the workflow have?
         • E.g.: data retrieval, data preparation, etc.
      2. Workflow-oriented motifs (HOW?): How does the workflow perform its operations?
         • E.g.: stateful steps, stateless steps, human interactions, etc.
  15. Motif Catalog
      • Data-Oriented Motifs
        • Data Retrieval
        • Data Preparation
          • Format Transformation
          • Input Augmentation and Output Splitting
          • Data Organisation
        • Data Analysis
        • Data Curation/Cleaning
        • Data Moving
        • Data Visualisation
      • Workflow-Oriented Motifs
        • Intra-Workflow Motifs
          • Stateful (Asynchronous) Invocations
          • Stateless (Synchronous) Invocations
          • Internal Macros
          • Human Interactions
        • Inter-Workflow Motifs
          • Atomic Workflows
          • Composite Workflows
          • Workflow Overloading
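
As a rough illustration of how the motif catalog could be used for annotation, the sketch below encodes it as a nested dictionary and checks step annotations against it. The motif names come from the catalog slide; the nesting of the Data Preparation sub-motifs, the helper names (`MOTIF_CATALOG`, `leaf_motifs`, `annotate`) and the example step names are assumptions made for this example only.

```python
# Motif names are taken from the catalog slide. The nesting and all helper
# names below are illustrative assumptions, not part of the original work.
MOTIF_CATALOG = {
    "Data-Oriented Motifs": {
        "Data Retrieval": {},
        "Data Preparation": {
            "Format Transformation": {},
            "Input Augmentation and Output Splitting": {},
            "Data Organisation": {},
        },
        "Data Analysis": {},
        "Data Curation/Cleaning": {},
        "Data Moving": {},
        "Data Visualisation": {},
    },
    "Workflow-Oriented Motifs": {
        "Intra-Workflow Motifs": {
            "Stateful (Asynchronous) Invocations": {},
            "Stateless (Synchronous) Invocations": {},
            "Internal Macros": {},
            "Human Interactions": {},
        },
        "Inter-Workflow Motifs": {
            "Atomic Workflows": {},
            "Composite Workflows": {},
            "Workflow Overloading": {},
        },
    },
}


def leaf_motifs(tree, path=()):
    """Yield (path, name) for every motif that has no sub-motifs."""
    for name, children in tree.items():
        if children:
            yield from leaf_motifs(children, path + (name,))
        else:
            yield path, name


def annotate(step_to_motif):
    """Return the annotations unchanged if every motif is a catalog leaf."""
    known = {name for _, name in leaf_motifs(MOTIF_CATALOG)}
    unknown = sorted(m for m in step_to_motif.values() if m not in known)
    if unknown:
        raise ValueError(f"unknown motifs: {unknown}")
    return step_to_motif


# Hypothetical workflow steps annotated with one motif each.
annotations = annotate({
    "fetch_sequences": "Data Retrieval",
    "convert_to_fasta": "Format Transformation",
})
```

Keeping the catalog as data rather than as classes makes it easy to extend when new motifs are identified.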
  16. Summary: Work done at ISI (layer diagram)
      • Abstraction definitions and categorization: Motif Detection
      • Algorithms for automatic matching: SUBDUE exploration and integration in RDF; macro abstraction detection
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW + PROV + P-PLAN
  17. PROV Compatibility
      • OPMW fits naturally into PROV
        • Same usage-generation structure
        • Extension for the scientific workflow with PROV
      • Binary relationships (no n-ary patterns used)
        • Simplicity
      • Publication of PROV as well as OPMW
        • Queries can be answered in both languages
        • Flexibility
      • http://www.opmw.org/node/8
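
The idea of publishing the same binary usage-generation relations in both languages can be sketched with plain tuples. This is a minimal sketch, not the actual publication pipeline: the `ex:` resources are invented, and it assumes the OPM side uses the OPMV properties `opmv:used` and `opmv:wasGeneratedBy` alongside PROV's `prov:used` and `prov:wasGeneratedBy`.

```python
# Hypothetical trace: one process used one artifact and generated another.
# The ex: names are invented; the vocabulary prefixes are an assumption.
TRACE = [
    ("ex:sortStep", "used", "ex:rawData"),
    ("ex:sortedData", "wasGeneratedBy", "ex:sortStep"),
]


def publish(trace, prefixes=("prov", "opmv")):
    """Emit every binary relation once per vocabulary prefix."""
    return {(s, f"{prefix}:{p}", o) for s, p, o in trace for prefix in prefixes}


def inputs_of(triples, process, prefix):
    """Answer 'what did this process use?' in the chosen vocabulary."""
    return sorted(o for s, p, o in triples
                  if s == process and p == f"{prefix}:used")


triples = publish(TRACE)
prov_answer = inputs_of(triples, "ex:sortStep", "prov")
opmv_answer = inputs_of(triples, "ex:sortStep", "opmv")
```

Because only binary relations are involved, the same query shape works for either prefix and returns the same answer.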
  18. P-PLAN
      • Plans are not provenance
      • P-PLAN: simple plan model for binding traces to template representations
      • Aligned with OPMW and PROV
      • Documentation in progress
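
The separation between a plan and its provenance can be illustrated with two small record types, one for template steps and one for executed activities that point back to the step they carried out. The class and field names here (`PlanStep`, `ExecutedActivity`, `corresponds_to`) are hypothetical and chosen for this sketch; they are not the terms of the P-PLAN vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    """A step of a workflow template (part of the plan, not provenance)."""
    name: str


@dataclass(frozen=True)
class ExecutedActivity:
    """One recorded execution of a step (part of the provenance trace)."""
    run_id: str
    corresponds_to: PlanStep  # binding from the trace back to the plan


def runs_per_step(trace):
    """Count how many recorded activities are bound to each plan step."""
    counts = {}
    for activity in trace:
        step_name = activity.corresponds_to.name
        counts[step_name] = counts.get(step_name, 0) + 1
    return counts


# Hypothetical template with two steps, executed in two runs.
align = PlanStep("align")
plot = PlanStep("plot")
trace = [
    ExecutedActivity("run1", align),
    ExecutedActivity("run1", plot),
    ExecutedActivity("run2", align),
]
step_counts = runs_per_step(trace)
```

With the binding in place, questions about the template (which steps exist) and about the executions (which runs happened) can be answered separately or joined.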
  19. Summary: Work done at ISI (layer diagram)
      • Abstraction definitions and categorization: Motif Detection
      • Algorithms for automatic matching: SUBDUE exploration and integration in RDF; macro abstraction detection
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW + PROV + P-PLAN
  20. Macro abstraction detection
      Problem statement: Given a repository of workflow templates (either abstract or specific) or workflow execution traces, what are the workflow fragments I can deduce from it?
      Useful for:
      • Systems like Taverna and Wings (many templates, little annotation to relate them)
        • Finding relationships between workflows and sub-workflows
        • Most used fragments, most executed, etc.
      • Systems like GenePattern and Galaxy (many runs, nearly no templates published)
        • Proposing new templates with the popular fragments
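
A deliberately simplified sketch of the problem statement follows: each workflow is a list of edges between step types, and a "fragment" is reduced to a single edge. It only counts in how many workflows each two-step fragment occurs; it is not the sub-graph mining approach explored in this work, and the repository contents, step names and function names are invented for illustration.

```python
from collections import Counter


def edge_fragments(workflow):
    """Return the distinct two-step fragments (producer, consumer) of a workflow."""
    return set(workflow)


def popular_fragments(repository, min_support=2):
    """Map each fragment to the number of workflows containing it,
    keeping only fragments found in at least min_support workflows."""
    support = Counter()
    for workflow in repository.values():
        # A set is used so a fragment counts once per workflow.
        support.update(edge_fragments(workflow))
    return {frag: n for frag, n in support.items() if n >= min_support}


# Hypothetical repository of three small workflows.
REPOSITORY = {
    "wf1": [("Retrieve", "Clean"), ("Clean", "Analyse")],
    "wf2": [("Retrieve", "Clean"), ("Clean", "Plot")],
    "wf3": [("Retrieve", "Clean"), ("Clean", "Analyse"), ("Analyse", "Plot")],
}

common = popular_fragments(REPOSITORY)
```

Fragments with high support are candidates for new templates; real fragments span more than one edge, which is where sub-graph matching becomes necessary.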
  21. Macro abstraction detection
      • Work in progress (implementation and evaluation)
        • WINGS traces
      • Similar to sub-graph isomorphism
      • Kind of “graph clustering”
      • Early results
        • Tool for finding common sub-graphs
        • Sequential graphs
        • Efficient
        • Scalable
        • Integration with RDF (by me)
      • TO DO:
        • Finish implementation: inference
        • Evaluation!
  22. Next Steps
  23. Next Steps
      • Thesis:
        • Finish up implementation
        • How to evaluate results?
      • Publications:
        • Workshop:
          • Provenance Corpus (with Taverna team), to have something citable
        • Conference:
          • KCAP: macro detection implementation and evaluation
        • Journal:
          • Decay analysis publication in journal (January)
          • OPMW - PROV - P-PLAN publication in journal (December)
          • Motif extension publication in journal (invited by special issue) (now)
  24. Future work
      • Thesis:
        • Other methods for detecting workflow abstractions automatically
          • Metadata and file analysis (diff, etc.): filter, merge, etc.
        • Provenance reconstruction
      • Project:
        • RO model specifications
        • Test cases
        • Workflow abstraction with Isoco
  25. Thanks! (acknowledgement graph)
      • :oscarCorcho :supervises :danielGarijo
      • :yolandGil :supervises :danielGarijo
      • :danielGarijo :collaboratesWith :khalidBelhajjame, :varunRatnakar, :caroleGoble, :pinarAlper
  26. Work at ISI, relation with wf4Ever, future steps
      Daniel Garijo Verdejo
      Ontology Engineering Group, Laboratorio de Inteligencia Artificial, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
      Date: 03/10/2011
