Status update OEG - Nov 2012
Summary of the work done in my summer internship plus the current status of my thesis (2012)


Published in: Technology
Transcript

  • 1. Work at ISI, Current Status, Next Steps
    Daniel Garijo Verdejo, Oscar Corcho, Yolanda Gil
    Ontology Engineering Group, Laboratorio de Inteligencia Artificial, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
    Date: 29/11/2012
  • 2. What am I working on?
      • Creation of abstractions in scientific workflows
          • Workflow traces and template representation
          • Provenance representation
          • Plan representation
          • Abstraction catalog
          • Find ways to link the definitions to the provenance traces automatically
      • Understandability and reuse of scientific workflows
  • 3. Outline
      1. Motivation
      2. Overview
      3. Workflow systems used
      4. Summary of work done in my previous visit to ISI
          • OPMW and provenance publishing
      5. Summary of work done before second visit to ISI
          • Workflow motif catalog
      6. Summary of work done in my second visit to ISI
          • OPMW-PROV and P-PLAN
          • Automatic macro abstraction detection
      7. Next Steps
      8. Future work
  • 4. Motivation
      • As a designer: Discovery
          • Workflows with similar functionality fragments/methods
          • Design based on previous templates
      • As user/reuser: Understandability
          • Search workflows by functionality
          • Commonalities between execution runs
          • Component categorization
  • 5. Overview
    [Diagram] Layers of the work, each with the kind of resource shown beside it:
      • Abstraction definitions and categorization: Descriptions/PSMs/Ontologies
      • Algorithms for finding the different abstractions automatically: Data mining tools, graph analysis, etc.
      • Experiment publication: RDF Stores
      • Provenance representation and plan representation: Vocabularies
  • 6. Taverna and Wings
      • http://www.taverna.org.uk/
      • http://www.wings-workflows.org/
    IEEE eScience 2012. Chicago, USA
  • 7. Summary: Previous Work at ISI
    [Diagram] The same layers as the overview, with the work done so far filled in:
      • Abstractions definitions and categorization
      • Algorithms for finding the different abstractions automatically
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW
  • 8. High level architecture
    [Diagram] Labels recovered from the figure:
      • Three WINGS deployments (on local laptop, on shared host, on web server), each showing Core, Portal, Workflow Template, Workflow Instance and OPM export
      • Other workflow environments
      • Linked Data Publication
      • Programmatic access (external apps)
      • Interactive browsing (Pubby frontend)
      • Users
    Stages along the bottom: Wings workflow generation, OPM conversion, Publication, Share, Reuse
  • 9. OPMW: Process view
  • 10. OPMW: Attribution view
  • 11. Work previous to second visit to ISI
    [Diagram] The same layers, with the new work highlighted:
      • Abstractions definitions and categorization
      • Algorithms for automatic matching: Motif Detection
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW
  • 12. Overview
      • Empirical analysis on 177 workflow templates from Taverna and Wings
      • Catalog of recurring patterns: scientific workflow motifs
          • Data Oriented Motifs
          • Workflow Oriented Motifs
      • Understandability and reuse
    Image: http://sensefinancial.com/wp-content/uploads/2012/02/contribution.jpg
  • 13. Approach
      • Reverse-engineer the set of current practices in workflow development through an analysis of empirical evidence
      • Identify workflow abstractions that would facilitate understandability and therefore effective re-use
  • 14. Workflow Motifs
      • Workflow motif: Domain independent conceptual abstraction on the workflow steps.
      1. Data-oriented motifs (WHAT?): What kind of manipulations does the workflow have?
          • E.g.: Data retrieval, Data preparation, etc.
      2. Workflow-oriented motifs (HOW?): How does the workflow perform its operations?
          • E.g.: Stateful steps, Stateless steps, Human interactions, etc.
  • 15. Motif Catalog
      • Data-Oriented Motifs
          • Data Retrieval
          • Data Preparation
              • Format Transformation
              • Input Augmentation and Output Splitting
              • Data Organisation
          • Data Analysis
          • Data Curation/Cleaning
          • Data Moving
          • Data Visualisation
      • Workflow-Oriented Motifs
          • Intra-Workflow Motifs
              • Stateful (Asynchronous) Invocations
              • Stateless (Synchronous) Invocations
              • Internal Macros
              • Human Interactions
          • Inter-Workflow Motifs
              • Atomic Workflows
              • Composite Workflows
              • Workflow Overloading
  • 16. Summary: Work done at ISI
    [Diagram] The same layers, with the work from the second visit filled in:
      • Abstractions definitions and categorization
      • Algorithms for automatic matching: Motif Detection; SUBDUE exploration and integration in RDF; Macro abstraction detection
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW + PROV + P-PLAN
  • 17. PROV Compatibility
      • OPMW fits naturally into PROV
          • Same usage-generation structure
          • Extension for the scientific workflow with PROV
      • Binary relationships (no n-ary patterns used)
          • Simplicity
      • Publication of PROV as well as OPMW
          • Queries can be answered in both languages
          • Flexibility
      • http://www.opmw.org/node/8
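As a rough illustration of the usage-generation structure and of publishing in both vocabularies, the sketch below stores one workflow step as binary triples under two prefixes and answers the same question with either. It is a toy in-memory model, not the OPMW export itself: the `ex:` resources and the helpers `publish` and `inputs_behind` are hypothetical. `prov:used` and `prov:wasGeneratedBy` are PROV-O terms, and the `opmv:` predicates are assumed here to be their OPM-side counterparts.

```python
# Hypothetical trace: one process reads a dataset and writes a result.
# Every statement is a plain binary (subject, predicate, object) triple.
TRACE = [
    ("ex:sortStep", "used", "ex:rawData"),
    ("ex:sortedData", "wasGeneratedBy", "ex:sortStep"),
]

def publish(trace, prefix):
    """Return the trace with each predicate qualified by a vocabulary prefix."""
    return {(s, f"{prefix}:{p}", o) for s, p, o in trace}

def inputs_behind(triples, prefix, artifact):
    """Inputs used by whichever process generated `artifact`."""
    producers = {o for s, p, o in triples
                 if s == artifact and p == f"{prefix}:wasGeneratedBy"}
    return {o for s, p, o in triples
            if s in producers and p == f"{prefix}:used"}

# Publish the same trace under both vocabularies.
store = publish(TRACE, "prov") | publish(TRACE, "opmv")

# The same question can be asked in either vocabulary.
print(inputs_behind(store, "prov", "ex:sortedData"))
print(inputs_behind(store, "opmv", "ex:sortedData"))
```

Because every relationship is binary, the query is a two-hop lookup with no intermediate qualification nodes, which is the simplicity the slide points to.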
  • 18. P-PLAN
      • Plans are not provenance
      • P-PLAN: Simple plan model for binding traces to template representations
      • Aligned with OPMW and PROV
      • Documentation in progress
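The idea of binding a trace to its template can be sketched as follows. This is only an illustrative data model, not the P-PLAN vocabulary: the class names, the `corresponds_to_step` field and the example plan are all hypothetical. The plan describes what should happen, the trace records what did happen, and each executed activity points back at the plan step it carried out.

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    """A template: step names mapped to the steps they depend on."""
    name: str
    steps: dict = field(default_factory=dict)

@dataclass
class Activity:
    """One executed process in a trace, bound to the plan step it carried out."""
    run_id: str
    corresponds_to_step: str

def unexecuted_steps(plan, trace):
    """Plan steps that have no corresponding activity in the trace."""
    done = {activity.corresponds_to_step for activity in trace}
    return sorted(set(plan.steps) - done)

# Hypothetical three-step plan and a trace that stopped after two steps.
plan = Plan("ex:wordCount",
            {"tokenize": [], "count": ["tokenize"], "plot": ["count"]})
trace = [Activity("run1/a1", "tokenize"), Activity("run1/a2", "count")]

print(unexecuted_steps(plan, trace))  # ['plot']
```

Keeping the plan and the trace as separate objects joined by one link reflects the slide's point that plans are not provenance.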
  • 19. Summary: Work done at ISI
    [Diagram] The same layers, with the work from the second visit filled in:
      • Abstractions definitions and categorization
      • Algorithms for automatic matching: Motif Detection; SUBDUE exploration and integration in RDF; Macro abstraction detection
      • Experiment publication: Virtuoso, Pubby, Wings (+Plugin)
      • Provenance representation and plan representation: OPMW + PROV + P-PLAN
  • 20. Macro abstraction detection
    Problem statement: Given a repository of workflow templates (either abstract or specific) or workflow execution traces, what are the workflow fragments I can deduce from it?
    Useful for:
      • Systems like Taverna and Wings (many templates, little annotation to relate them)
          • Finding relationships between workflows and sub-workflows
          • Most used fragments, most executed, etc.
      • Systems like GenePattern and Galaxy (many runs, nearly no templates published)
          • Proposing new templates with the popular fragments
  • 21. Macro abstraction detection
      • Work in progress (implementation and evaluation)
          • WINGS traces
      • Similar to sub-graph isomorphism
      • Kind of “graph clustering”
      • Early results
          • Tool for finding common sub-graphs
          • Sequential graphs
          • Efficient
          • Scalable
          • Integration with RDF (by me)
      • TO DO:
          • Finish implementation: inference
          • Evaluation!!
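To make the notion of a common fragment concrete, here is a minimal sketch that counts how many templates share each two-step chain of component types. It is a deliberately naive stand-in, not the SUBDUE-based tool mentioned in the slides: it only handles sequential chains of length two, matches nodes by label, and all template names and component types are hypothetical.

```python
from collections import Counter

# Hypothetical templates: each is a set of directed edges between component types.
TEMPLATES = {
    "t1": {("Retrieve", "Clean"), ("Clean", "Classify"), ("Classify", "Plot")},
    "t2": {("Retrieve", "Clean"), ("Clean", "Classify"), ("Classify", "Export")},
    "t3": {("Load", "Clean"), ("Clean", "Classify")},
}

def two_step_chains(edges):
    """All chains a -> b -> c formed by two consecutive edges."""
    return {(a, b, c) for a, b in edges for b2, c in edges if b == b2}

def frequent_fragments(templates, min_support):
    """Chains that occur in at least `min_support` templates."""
    support = Counter()
    for edges in templates.values():
        # A set is passed in, so each chain counts once per template.
        support.update(two_step_chains(edges))
    return {frag: n for frag, n in support.items() if n >= min_support}

print(frequent_fragments(TEMPLATES, min_support=2))
```

A real sub-graph miner would also find branching fragments and grow candidates beyond a fixed size; the per-template support count is the part that corresponds to "most used fragments".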
  • 22. Next Steps
  • 23. Next Steps
      • Thesis:
          • Finish up implementation
          • How to evaluate results?
      • Publications:
          • Workshop:
              • Provenance Corpus (with Taverna Team). To have something citable
          • Conference:
              • KCAP: Macro detection implementation and evaluation
          • Journal:
              • Decay analysis publication in journal (January)
              • OPMW - PROV - P-PLAN publication in journal (December)
              • Motif extension publication in journal (invited by special issue) (now)
  • 24. Future work
      • Thesis:
          • Other methods for detecting workflow abstraction automatically
          • Metadata and file analysis (diff, etc.): Filter, merge, etc.
          • Provenance reconstruction
      • Project:
          • RO model specifications
          • Testcases
          • Workflow abstraction with Isoco
  • 25. Thanks!
    [Diagram] Acknowledgement graph:
      • :oscarCorcho :supervises :danielGarijo
      • :yolandGil :supervises :danielGarijo
      • :danielGarijo :collaboratesWith :khalidBelhajjame, :varunRatnakar, :caroleGoble, :pinarAlper
  • 26. Work at ISI, relation with wf4Ever, future steps
    Daniel Garijo Verdejo
    Ontology Engineering Group, Laboratorio de Inteligencia Artificial, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
    Date: 03/10/2011
