Work at ISI, relationwith wf4Ever,futuresteps<br />Daniel GarijoVerdejo,Yolanda Gil<br />Ontology Engineering Group. Labor...
The TB Drugome<br />1<br />
Project goals<br />Typical Published Article<br />Reproducible Article: <br />Weaver, GenePattern GRRD, etc. <br />Text:<b...
Problem with existing approaches<br />Only executable workflow is published:<br />Must have the same codes to re-execute t...
Eg: eHits was proprietary and replaced by AutodockVina
Different labs prefer different codes
Eg: R vsMatlab
Eg: viz in CitoscapevsyEd</li></ul>Must have the same workflow framework to re-execute the workflow<br /><ul><li>Must have...
Key Features of our approach<br /><ul><li>Publish an abstract workflow in addition to executable w.
Description of workflow that is independent of the codes executed
Maps to the codes executed (the “executable workflow”)
Publish both abstract and executable workflow using the OPM standard
OPM (Open Provenance Model) is independent of workflow framework and is widely implemented
Other groups can import to their own workflow framework
Publish data and workflows as Linked Data on the Web
All workflows and related files are web-accessible
Upcoming SlideShare
Loading in …5
×

ISI work

1,025 views
940 views

Published on

My work at ISI during the summer, its relation with wf4Ever and next steps in my research

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,025
On SlideShare
0
From Embeds
0
Number of Embeds
6
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

ISI work

  1. 1. Work at ISI, relationwith wf4Ever,futuresteps<br />Daniel GarijoVerdejo,Yolanda Gil<br />Ontology Engineering Group. Laboratorio de Inteligencia Artificial<br />Departamento de Inteligencia Artificial<br />Facultad de Informática<br />Universidad Politécnica de Madrid<br />Date: 13/10/2011<br />
  2. 2. The TB Drugome<br />1<br />
  3. 3. Project goals<br />Typical Published Article<br />Reproducible Article: <br />Weaver, GenePattern GRRD, etc. <br />Text:<br />Narrative of method,<br />software packages used<br />Text:<br />Narrative of method,<br />software packages used<br />Data:<br />Key datasets and figures/plots<br />Data:<br />Key datasets and figures/plots<br />Workflow: <br />Workflow/scripts describing <br />dataflow, codes, and parameters<br />NOT published, <br />loosely recorded:<br />Software:<br />scripted codes + manual steps + <br />notes/emails <br />3<br />
  4. 4. Problem with existing approaches<br />Only executable workflow is published:<br />Must have the same codes to re-execute the workflow, but:<br /><ul><li>Codes become unavailable
  5. 5. Eg: eHits was proprietary and replaced by AutodockVina
  6. 6. Different labs prefer different codes
  7. 7. Eg: R vsMatlab
  8. 8. Eg: viz in CitoscapevsyEd</li></ul>Must have the same workflow framework to re-execute the workflow<br /><ul><li>Must have R for Weaver</li></ul>Must import filesto local file system and workflow framework<br /><ul><li>Must import bundle of workflow/data/code files to reproduce</li></ul>Reproducible Article: <br />Weaver, GenePattern GRRD, etc. <br />Text:<br />Narrative of method,<br />software packages used<br />Data:<br />Key datasets and figures/plots<br />Workflow: <br />Workflow/scripts describing <br />dataflow, codes, and parameters<br />4<br />
  9. 9. Key Features of our approach<br /><ul><li>Publish an abstract workflow in addition to executable w.
  10. 10. Description of workflow that is independent of the codes executed
  11. 11. Maps to the codes executed (the “executable workflow”)
  12. 12. Publish both abstract and executable workflow using the OPM standard
  13. 13. OPM (Open Provenance Model) is independent of workflow framework and is widely implemented
  14. 14. Other groups can import to their own workflow framework
  15. 15. Publish data and workflows as Linked Data on the Web
  16. 16. All workflows and related files are web-accessible
  17. 17. Simple mechanism to share across local file systems</li></ul>5<br />
  18. 18. High level architecture<br />Other <br />workflow <br />environments<br />WINGS on local laptop<br />Workflow Template<br />Core<br />OPMexport<br />Programatic access(external apps)<br />Portal<br />WorkflowInstance<br />LinkedDataPublication<br />WINGS on shared host<br />Workflow Template<br />Core<br />OPMexport<br />Portal<br />WorkflowInstance<br />Interactive Browsing(Pubby frontend)<br />WINGS on web server<br />Workflow Template<br />Core<br />OPMexport<br />Users<br />Portal<br />WorkflowInstance<br />Wings workflow generation<br />OPMconversion<br />Publication<br />Share<br />Reuse<br />6<br />
  19. 19. High level architecture (2)<br />Linked Data publication<br />Abstract Workflow<br />(OPM)<br />RDF Upload Interface<br />Wings<br />OPMexport<br />ExecutableWorkflow<br />(OPM)<br />Other workflow frameworks <br />RDF<br />Triple store<br />OPMimport<br />Permanentweb-accessible<br />filestore<br />SPARQL Endpoint<br />Web accessible<br />Workflow<br />Data, Components, etc.<br />ISI web servers<br />(http://wings.isi.edu/…)<br />Amazon EC2 cloud<br />(http://ec2-184-72-160-64.…)<br />Web browser<br />Needed if workflow was developed in local host instead of a public server<br />7<br />
  20. 20. Executable and abstract workflow<br />8<br />
  21. 21. hasArtifactTemplate<br />Artifact<br />account<br />Artifact<br />Artifact<br />Artifact<br />Input artifact1<br />Input artifact2<br />Execution Input1<br />Execution Input2<br />hasArtifactTemplate<br />user<br />account<br />used<br />used<br />used<br />hasArtifact<br />account<br />wasControlledBy<br />Process<br />used<br />Agent<br />Execution account<br />Workflowtemplate<br />Abstract template Node<br />Execution Node<br />account<br />hasProcessTemplate<br />hasProcess<br />hasAbstractComponent<br />Process<br />hasSpecificComponent<br />Account<br />OPM Graph<br />wasGeneratedBy<br />subClassOf<br />hasArtifact<br />Abstractcomponent<br />Specificcomponent<br />account<br />wasGeneratedBy<br />Outputartifact1<br />Execution result<br />hasArtifactTemplate<br />Artifact<br />Artifact<br />hasWorkflowTemplate<br />Workflow Template<br />Execution Results<br />OPMV extended model<br />Red: OPM model<br />Black: OPM profile (extension)<br />9<br />
  22. 22. Reproducibility<br /><ul><li>3 perspectives:
  23. 23. Reproducibility by an expert
  24. 24. Basic reproducibility by non-experts
  25. 25. Reproducibility by students from text only
  26. 26. Or, not reproducible at all</li></ul>10<br />
  27. 27. Reproducibility Maps<br />Comparison of ligand binding sites using SMAP<br />Comparison of dissimilar protein structures using FATCAT<br />Docking using eHits/AutodockVina<br />11<br />
  28. 28. Reproducibility maps: accessing the scripts and intermediate data<br />12<br />
  29. 29. How can we use this in Wf4Ever ?<br /><ul><li>The abstract workflow notion can be reused and imported to the workflows used in RO’s.
  30. 30. Complement to the workflow, to understand it better.
  31. 31. Allows tackling incomplete provenance.
  32. 32. Additional workflow repository for recommendation
  33. 33. OPM (Open Provenance Model) is independent of workflow framework and is widely implemented (Taverna has a OPM export too)
  34. 34. Other groups can import to their own workflow framework
  35. 35. Workflow integration with WINGS.
  36. 36. Semantic annotation of workflows.
  37. 37. Distributed workflow execution engine</li></ul>13<br />
  38. 38. Next steps<br /><ul><li>Keep working on workflow abstraction.
  39. 39. Research on compatibility with problem solving methods (PSMs).
  40. 40. Create an OPMV/W3C PROV-O profile for common workflow representation.
  41. 41. Interoperability between workflow systems (Taverna).
  42. 42. Work in workflows in different domains.
  43. 43. Biology, Astronomy.
  44. 44. Workflow reuse between different domains?</li></ul>14<br />
  45. 45. References<br /><ul><li>The TB Drugome paper: http://funsite.sdsc.edu/drugome/TB/
  46. 46. OPMO + OPMV mapped version: http://openprovenance.org/model/opmo
  47. 47. WINGS workflow system: http://seagull.isi.edu/marbles/
  48. 48. TB Drugome Wiki (Evolution of the work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page
  49. 49. Thanks to Yolanda Gil for letting me borrowing some of the Slides based on USCD slides for this presentation.</li></ul>15<br />
  50. 50. Work at ISI, relationwith wf4Ever,futuresteps<br />Daniel GarijoVerdejo<br />Ontology Engineering Group. Laboratorio de Inteligencia Artificial<br />Departamento de Inteligencia Artificial<br />Facultad de Informática<br />Universidad Politécnica de Madrid<br />Date: 03/10/2011<br />

×