WORKS 11 Presentation

Presentation for the paper: A new Approach for Publishing Workflows: Abstractions, Standards and Linked Data

Published in: Technology

Transcript

  • 1. A new Approach for Publishing Workflows: Abstractions, Standards and Linked Data
      Daniel Garijo, Ontology Engineering Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid
      Yolanda Gil, Information Sciences Institute, University of Southern California, Marina del Rey
      Date: 14/11/2011
  • 2. Index of contents
      1. Background
      2. Limitations of existing approaches to workflow publication
      3. Features of our approach
          • Publishing abstract workflows and specific workflows
          • OPMW Ontology
          • Linked Data Publication
      4. Workflow querying and Linked Data consumption
      5. Conclusions
  • 3. Background
      Typical Published Article:
          – Text: Narrative of method, software packages used
          – Data: Key datasets and figures/plots
          – Workflow: NOT published, loosely recorded:
              • Software: scripted codes + manual steps + notes/emails
      Reproducible Article (Weaver, GenePattern GRRD, etc.):
          – Text: Narrative of method, software packages used
          – Data: Key datasets and figures/plots
          – Workflow: Workflow/scripts describing dataflow, codes, and parameters
  • 4. Current issues with existing publication approaches
      Reproducible Article (Weaver, GenePattern GRRD, etc.):
          – Text: Narrative of method, software packages used
          – Data: Key datasets and figures/plots
          – Workflow: Workflow/scripts describing dataflow, codes, and parameters
      Only the executable workflow is published:
      1. Must have the same codes to re-execute the workflow, but:
          – Codes become unavailable
              • E.g.: eHits was proprietary and replaced by AutoDock Vina
          – Different labs prefer different codes
              • E.g.: R vs Matlab
              • E.g.: visualization in Cytoscape vs yEd
      2. Must have the same workflow framework to re-execute the workflow
          – Must have R for Weaver
      3. Must import files to local file system and workflow framework
          – Must import bundle of workflow/data/code files to reproduce
  • 5. Key Features of our approach
      • Publish an abstract workflow in addition to the executable workflow
          – Description of the workflow that is independent of the codes executed
          – Maps to the codes executed (the “executable workflow”)
      • Publish both abstract and executable workflow using the OPM standard
          – OPM (Open Provenance Model) is independent of workflow framework and is widely implemented
          – Other groups can import to their own workflow framework
      • Publish data and workflows as Linked Data on the Web
          – All workflows and related files are web-accessible
          – Simple mechanism to share across local file systems
  • 6. What is Linked Data
      1. Use URIs as names for things.
      2. Use HTTP URIs so that people can look up those names.
      3. When someone looks up a URI, provide useful information.
      4. Include links to other URIs.
      Figure: “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
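As an illustration of principles 2 and 3, the following is a minimal sketch of how a client might prepare a lookup of a resource name. The function name `lookup_request` and the example URI are hypothetical, and asking for RDF through the `Accept` header is a common Linked Data practice rather than something the slides specify. The request is only constructed here, not sent.

```python
from urllib.request import Request


def lookup_request(uri: str) -> Request:
    """Build (but do not send) an HTTP request that looks up a resource name."""
    # Principle 2: names should be HTTP URIs so that they can be looked up.
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Linked Data names should be HTTP URIs")
    # Principle 3: ask the server for useful, machine-readable information.
    # The media types below are an assumption about what a server may offer.
    return Request(
        uri,
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
    )


# Hypothetical resource URI, used only for illustration.
req = lookup_request("http://example.org/resource/template1")
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; the links in the returned description (principle 4) can then be followed the same way.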
  • 7. High level architecture
      [Diagram] WINGS deployments: on local laptop, on shared host, on web server. Each shows Workflow Template, Workflow Instance, Core, Portal and OPM export.
      [Diagram] Linked Data Publication, reached by: other workflow environments; programmatic access (external apps); interactive browsing (Pubby frontend) by users.
      [Diagram] Stages: Wings workflow generation, OPM conversion, Publication, Share, Reuse.
  • 8. Publishing the abstract workflow
      Figure: Comparison of Dissimilar protein structures workflow
  • 9. OPMW Ontology
      [Diagram] Abstract Workflow side:
          – template1: opmw:WorkflowTemplate (opmo:OPMGraph)
          – templateNode1: opmw:ProcessTemplate (opmv:Process)
          – artifact1, outputArtifact1: opmw:ArtifactTemplate (opmv:Artifact)
          – absComp1: ac:AbstractComponent
      [Diagram] Executable Workflow side:
          – account1: opmw:ExecutionAccount (opmo:Account)
          – executionNode1: opmw:ProcessInstance (opmv:Process)
          – execInput1, executionOutput1: opmw:ArtifactInstance (opmv:Artifact)
          – specComp1: ac:SpecificComponent
          – user1: opmv:Agent
      [Diagram] Properties shown: opmv:used, opmv:wasGeneratedBy, opmv:wasControlledBy, opmo:account, opmo:hasArtifact, opmo:hasProcess, opmw:hasTemplateComponent, opmw:hasSpecificComponent, opmw:hasArtifactTemplate, opmw:hasProcessTemplate, opmw:hasWorkflowTemplate
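The node and property names on this slide can be written down as plain triples to show how an execution points back to its template. This is a minimal sketch, not the authors' data: the edge directions for the `opmw:` properties are an assumption based on a reading of the flattened diagram, prefixed names are kept as strings so that no namespace URI has to be assumed, and the helper names `objects` and `execution_to_template` are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Prefixed names are plain strings; edge directions for opmw: properties
# are assumptions drawn from the slide's diagram.
TRIPLES: List[Triple] = [
    # Abstract workflow (template) side
    ("template1", "rdf:type", "opmw:WorkflowTemplate"),
    ("templateNode1", "rdf:type", "opmw:ProcessTemplate"),
    ("artifact1", "rdf:type", "opmw:ArtifactTemplate"),
    ("outputArtifact1", "rdf:type", "opmw:ArtifactTemplate"),
    ("templateNode1", "opmv:used", "artifact1"),
    ("outputArtifact1", "opmv:wasGeneratedBy", "templateNode1"),
    ("templateNode1", "opmw:hasTemplateComponent", "absComp1"),
    # Executable workflow side
    ("account1", "rdf:type", "opmw:ExecutionAccount"),
    ("executionNode1", "rdf:type", "opmw:ProcessInstance"),
    ("execInput1", "rdf:type", "opmw:ArtifactInstance"),
    ("executionOutput1", "rdf:type", "opmw:ArtifactInstance"),
    ("executionNode1", "opmv:used", "execInput1"),
    ("executionOutput1", "opmv:wasGeneratedBy", "executionNode1"),
    ("executionNode1", "opmv:wasControlledBy", "user1"),
    ("executionNode1", "opmw:hasSpecificComponent", "specComp1"),
    # Links from the execution back to the template
    ("account1", "opmw:hasWorkflowTemplate", "template1"),
    ("executionNode1", "opmw:hasProcessTemplate", "templateNode1"),
    ("execInput1", "opmw:hasArtifactTemplate", "artifact1"),
    ("executionOutput1", "opmw:hasArtifactTemplate", "outputArtifact1"),
]

TEMPLATE_LINKS = {
    "opmw:hasWorkflowTemplate",
    "opmw:hasProcessTemplate",
    "opmw:hasArtifactTemplate",
}


def objects(subject: str, predicate: str) -> List[str]:
    """Return every object of the given subject/predicate pair."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def execution_to_template() -> Dict[str, str]:
    """Map each execution-side node to its template-side counterpart."""
    return {s: o for s, p, o in TRIPLES if p in TEMPLATE_LINKS}
```

With this layout, `execution_to_template()` gives the abstract counterpart of each executed node, which is the mapping between the executable and the abstract workflow that the approach relies on.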
  • 10. Publication of Workflows as Linked Data
      [Diagram] Sources: Wings with OPM conversion; other workflow frameworks with OPM conversion and OPM import.
      [Diagram] Documents produced: Abstract Workflow RDF (OPM); Executable Workflow RDF (OPM).
      [Diagram] Linked Data publication: Upload Interface; Triple store; SPARQL Endpoint; permanent web-accessible file store (workflow data, components, etc.).
      [Diagram] Access: web accessible, through a web browser.
  • 11. Searching/Browsing Workflows as Linked Data
      [Screenshot annotations]
          – Types of search
          – Autocomplete search bar
          – Resource URI (Process instance)
          – Specific component for this process instance
          – Properties
  • 12. Searching/Browsing Workflows as Linked Data
      [Screenshot annotations]
          – Component Name
          – Component Inputs
          – Component Outputs
          – Code Implementations
          – Template additional metadata
          – Record of the different executions of this workflow
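Besides interactive browsing, the published workflows sit behind a SPARQL endpoint, so the "record of the different executions of this workflow" could also be retrieved programmatically. The following is a minimal sketch under stated assumptions: the endpoint URL and the `opmw:` namespace URI are hypothetical placeholders (the slides give neither), the direction of `opmw:hasWorkflowTemplate` (execution account to template) is a reading of the ontology diagram, and the function names are invented for illustration. The code only builds the query and the request URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: neither value comes from the slides.
ENDPOINT = "http://example.org/sparql"
OPMW_NS = "http://example.org/opmw#"


def executions_query(template_uri: str, opmw_ns: str = OPMW_NS) -> str:
    """Build a SPARQL query listing the execution accounts of a template."""
    return (
        f"PREFIX opmw: <{opmw_ns}>\n"
        "SELECT ?account WHERE {\n"
        f"  ?account opmw:hasWorkflowTemplate <{template_uri}> .\n"
        "}"
    )


def request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query for an HTTP GET, using the 'query' parameter
    defined by the SPARQL protocol."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical template URI, used only for illustration.
query = executions_query("http://example.org/resource/template1")
url = request_url(query)
```

Fetching `url` with an `Accept` header such as `application/sparql-results+json` would return the matching execution accounts, provided the endpoint supports that result format.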
  • 13. Conclusions
      1. Publication of an abstract workflow that represents the computational method in an execution-independent manner.
      2. Publication of the abstract workflow and the executed workflow using the OPM standard, which is independent of the execution environment used.
      3. Publication of the workflows, components, codes and datasets as Linked Data on the web.
  • 14. Future work
      • Extensions to abstract workflow publication
          – Be able to provide abstractions on several steps.
          – Incomplete provenance.
      • Create an OPMV/W3C PROV-O profile for common workflow representation.
          – Increase interoperability with other workflow representation systems.
      • Workflow reuse in different workflow systems.
          – Import and execute workflows in other workflow frameworks.
  • 15. References
      • WINGS workflow system: http://seagull.isi.edu/marbles/
      • The Open Provenance Model Specification: http://openprovenance.org/
      • OPMO: http://openprovenance.org/model/opmo
      • OPMV: http://open-biomed.sourceforge.net/opmv/ns.html
      • TB Drugome Wiki (Evolution of this work): http://seagull.isi.edu/wings-drugome/index.php/Main_Page
      • W3C PROV-O current ontology (draft): http://www.w3.org/2011/prov/wiki/PIL_OWL_Ontology
      • Principles of Linked Data: http://www.w3.org/DesignIssues/LinkedData.html
  • 16. Acknowledgements
      • UCSD people:
          – Li Xie
          – Lei Xie
          – Sarah Kinnings
          – Phil Bourne
      • ISI people:
          – Varun Ratnakar
      • OEG people:
          – Oscar Corcho
