ISI work

My work at ISI during the summer, its relation to Wf4Ever, and next steps in my research

Published in: Technology
Transcript

  • 1. Work at ISI, relation with Wf4Ever, future steps
    Daniel Garijo Verdejo, Yolanda Gil
    Ontology Engineering Group. Laboratorio de Inteligencia Artificial
    Departamento de Inteligencia Artificial
    Facultad de Informática
    Universidad Politécnica de Madrid
    Date: 13/10/2011
  • 2. The TB Drugome
  • 3. Project goals
    Typical Published Article:
    • Text: narrative of method, software packages used
    • Data: key datasets and figures/plots
    • NOT published, loosely recorded: Software (scripted codes + manual steps + notes/emails)
    Reproducible Article (Weaver, GenePattern GRRD, etc.):
    • Text: narrative of method, software packages used
    • Data: key datasets and figures/plots
    • Workflow: workflow/scripts describing dataflow, codes, and parameters
  • 4. Problem with existing approaches
    Only the executable workflow is published:
    Must have the same codes to re-execute the workflow, but:
    • Codes become unavailable
      • E.g.: eHits was proprietary and replaced by AutoDock Vina
    • Different labs prefer different codes
      • E.g.: R vs. Matlab
      • E.g.: visualization in Cytoscape vs. yEd
    Must have the same workflow framework to re-execute the workflow
    • Must have R for Weaver
    Must import files to local file system and workflow framework
    • Must import bundle of workflow/data/code files to reproduce
    Reproducible Article (Weaver, GenePattern GRRD, etc.):
    • Text: narrative of method, software packages used
    • Data: key datasets and figures/plots
    • Workflow: workflow/scripts describing dataflow, codes, and parameters
  • 9. Key features of our approach
    • Publish an abstract workflow in addition to the executable workflow
      • Description of workflow that is independent of the codes executed
      • Maps to the codes executed (the “executable workflow”)
    • Publish both abstract and executable workflow using the OPM standard
      • OPM (Open Provenance Model) is independent of workflow framework and is widely implemented
      • Other groups can import to their own workflow framework
    • Publish data and workflows as Linked Data on the Web
      • All workflows and related files are web-accessible
      • Simple mechanism to share across local file systems
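
The split between an abstract workflow and the executable workflow that maps to it can be illustrated with a minimal Python sketch. The class names `AbstractStep` and `ExecutableStep` and the helper `rebind` are hypothetical, introduced only for this illustration; they are not part of WINGS or OPM. The step and code names are borrowed from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractStep:
    """Code-independent description of a workflow step (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class ExecutableStep:
    """A concrete code bound to the abstract step it implements (hypothetical type)."""
    code: str
    implements: AbstractStep


def rebind(workflow, replacements):
    """Return a new executable workflow with some codes swapped out.

    The abstract steps are carried over unchanged, so the published
    description of the method stays valid when a code is replaced.
    """
    return [
        ExecutableStep(replacements.get(step.code, step.code), step.implements)
        for step in workflow
    ]


# Step and code names are illustrative, borrowed from the slides.
comparison = AbstractStep("Comparison of ligand binding sites")
docking = AbstractStep("Docking")

original = [ExecutableStep("SMAP", comparison), ExecutableStep("eHits", docking)]
updated = rebind(original, {"eHits": "AutoDock Vina"})

print([step.code for step in updated])
```

Because only the binding changes, a lab that lacks one code can substitute another while still following the same abstract method.
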
  • 18. High-level architecture
    [Architecture diagram; labels recovered from the figure:]
    • WINGS on local laptop, WINGS on shared host, WINGS on web server; each with Portal, Workflow Template, Workflow Instance, Core, and OPM export
    • Linked Data Publication
    • Programmatic access (external apps); interactive browsing (Pubby frontend)
    • Other workflow environments; Users
    • Stages: Wings workflow generation, OPM conversion, Publication, Share, Reuse
  • 19. High-level architecture (2)
    [Architecture diagram: Linked Data publication; labels recovered from the figure:]
    • Wings: OPM export of Abstract Workflow (OPM) and Executable Workflow (OPM)
    • Other workflow frameworks: OPM import
    • RDF Upload Interface; RDF; Triple store; SPARQL Endpoint
    • Permanent web-accessible file store; web-accessible workflow data, components, etc.
    • ISI web servers (http://wings.isi.edu/…); Amazon EC2 cloud (http://ec2-184-72-160-64.…)
    • Web browser
    • Note in figure: needed if workflow was developed in local host instead of a public server
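
As a rough illustration of what the SPARQL endpoint makes possible, the sketch below builds a query for processes and the artifacts they used, and encodes it as a GET request URL following the SPARQL protocol's `query` parameter. The endpoint address is a hypothetical placeholder (the slides elide the real one), and the use of the OPMV namespace `http://purl.org/net/opmv/ns#` for the published data is an assumption. Nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the published RDF uses the OPMV vocabulary under this namespace.
OPMV = "http://purl.org/net/opmv/ns#"

# Hypothetical placeholder; the real endpoint address is elided in the slides.
ENDPOINT = "http://example.org/sparql"


def used_artifacts_query(limit=10):
    """Build a SPARQL query listing processes and the artifacts they used."""
    return (
        f"PREFIX opmv: <{OPMV}>\n"
        "SELECT ?process ?artifact WHERE {\n"
        "  ?process opmv:used ?artifact .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def request_url(endpoint, query):
    """Encode the query as a SPARQL-protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


query = used_artifacts_query(limit=5)
url = request_url(ENDPOINT, query)

# Round-trip check: decoding the URL yields the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```
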
  • 20. Executable and abstract workflow
  • 21. OPMV extended model
    [Graph diagram; labels recovered from the figure:]
    • Legend: red = OPM model; black = OPM profile (extension)
    • Classes: Artifact, Process, Agent, Account, OPM Graph
    • Template side: Workflow Template, Abstract template Node, Abstract component, Specific component, Input artifact 1, Input artifact 2, Output artifact 1
    • Execution side: Execution account, Execution Node, Execution Input 1, Execution Input 2, Execution result, Execution Results, user
    • Properties: used, wasGeneratedBy, wasControlledBy, account, subClassOf, hasArtifact, hasProcess, hasArtifactTemplate, hasProcessTemplate, hasWorkflowTemplate, hasAbstractComponent, hasSpecificComponent
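
A small in-memory sketch may help when reading the graph. It stores a few statements as (subject, property, object) triples and traces an execution result back to its inputs through the OPM properties `wasGeneratedBy` and `used`. The node names are simplified from the figure labels, and the direction assumed for `hasProcessTemplate` (execution node pointing to template node) is a guess, since the flattened figure does not show it.

```python
# Triples are (subject, property, object); node names simplified from the figure.
triples = {
    ("ExecutionNode", "used", "ExecutionInput1"),
    ("ExecutionNode", "used", "ExecutionInput2"),
    ("ExecutionResult", "wasGeneratedBy", "ExecutionNode"),
    ("ExecutionNode", "wasControlledBy", "user"),
    # Assumed direction: the execution node points to its template node.
    ("ExecutionNode", "hasProcessTemplate", "AbstractTemplateNode"),
}


def objects(subject, prop):
    """Return the sorted objects of all triples matching subject and property."""
    return sorted(o for s, p, o in triples if s == subject and p == prop)


def inputs_of(artifact):
    """Trace an artifact to the inputs used by the process that generated it."""
    found = []
    for process in objects(artifact, "wasGeneratedBy"):
        found.extend(objects(process, "used"))
    return found


print(inputs_of("ExecutionResult"))
```
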
  • 22. Reproducibility
    • 3 perspectives:
      • Reproducibility by an expert
      • Basic reproducibility by non-experts
      • Reproducibility by students from text only
    • Or, not reproducible at all
  • 27. Reproducibility Maps
    Comparison of ligand binding sites using SMAP
    Comparison of dissimilar protein structures using FATCAT
    Docking using eHits/AutoDock Vina
  • 28. Reproducibility maps: accessing the scripts and intermediate data
  • 29. How can we use this in Wf4Ever?
    • The abstract workflow notion can be reused and imported into the workflows used in Research Objects (ROs).
      • Complement to the workflow, to understand it better.
      • Allows tackling incomplete provenance.
      • Additional workflow repository for recommendation.
    • OPM (Open Provenance Model) is independent of workflow framework and is widely implemented (Taverna has an OPM export too).
      • Other groups can import to their own workflow framework.
    • Workflow integration with WINGS.
      • Semantic annotation of workflows.
      • Distributed workflow execution engine.
  • 38. Next steps
    • Keep working on workflow abstraction.
      • Research on compatibility with problem solving methods (PSMs).
      • Create an OPMV/W3C PROV-O profile for common workflow representation.
      • Interoperability between workflow systems (Taverna).
    • Work on workflows in different domains.
      • Biology, Astronomy.
      • Workflow reuse between different domains?
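
One way to picture the OPMV/PROV-O step is as a term-by-term translation table. The sketch below is an assumption, not something taken from the slides: it uses the OPMV namespace `http://purl.org/net/opmv/ns#` and PROV-O terms as they appear in the later W3C Recommendation, and pairs them according to the commonly cited correspondence (for example, `wasControlledBy` with `prov:wasAssociatedWith`). The `translate` helper is hypothetical.

```python
OPMV = "http://purl.org/net/opmv/ns#"
PROV = "http://www.w3.org/ns/prov#"

# Assumed correspondence between OPMV and PROV-O terms; not from the slides.
OPMV_TO_PROV = {
    OPMV + "Artifact": PROV + "Entity",
    OPMV + "Process": PROV + "Activity",
    OPMV + "Agent": PROV + "Agent",
    OPMV + "used": PROV + "used",
    OPMV + "wasGeneratedBy": PROV + "wasGeneratedBy",
    OPMV + "wasControlledBy": PROV + "wasAssociatedWith",
}


def translate(triples):
    """Rewrite every OPMV term in the triples; leave other terms untouched."""
    return [
        tuple(OPMV_TO_PROV.get(term, term) for term in triple)
        for triple in triples
    ]


# "ex:" identifiers are placeholders for illustration only.
opmv_triples = [
    ("ex:docking", OPMV + "used", "ex:ligand"),
    ("ex:docking", OPMV + "wasControlledBy", "ex:user"),
]
prov_triples = translate(opmv_triples)

for triple in prov_triples:
    print(triple)
```
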
  • 45. References
    • The TB Drugome paper: http://funsite.sdsc.edu/drugome/TB/
    • OPMO + OPMV mapped version: http://openprovenance.org/model/opmo
    • WINGS workflow system: http://seagull.isi.edu/marbles/
    • TB Drugome Wiki (evolution of the work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page
    • Thanks to Yolanda Gil for letting me borrow some of the slides, based on USCD slides, for this presentation.
  • 50. Work at ISI, relation with Wf4Ever, future steps
    Daniel Garijo Verdejo
    Ontology Engineering Group. Laboratorio de Inteligencia Artificial
    Departamento de Inteligencia Artificial
    Facultad de Informática
    Universidad Politécnica de Madrid
    Date: 03/10/2011