Taverna workflow management system (2010-11-30 Bath Workflow Tools)

Presentation of Taverna from the UKOLN DevCSI "Workflow Tools" event in Bath, 2010-11-30

PPTX version: http://www.slideshare.net/soilandreyes/taverna-workflow-management-system-2010-1130-bath-workflow-tools-pptx

http://taverna.org.uk/
http://www.ukoln.ac.uk/events/devcsi/workflow_tools/programme/index.html
http://devcsi.ukoln.ac.uk/

Transcript

  • 1. http://taverna.org.uk/
      Stian Soiland-Reyes
      myGrid, School of Computer Science, University of Manchester, UK
      UKOLN DevCSI: Workflow Tools, Bath, 2010-11-30
  • 2. What is myGrid?
      • An e-Science collaboration since 2001
      • Not a grid!
      • Numerous partners involved:
          • University of Manchester
          • University of Southampton
          • University of Oxford
          • EMBL-EBI
      • Provides sustainable, production-quality software
      • Supported by OMII-UK, EPSRC and BBSRC
      • Mixture of developers, bioinformaticians and researchers
      Software | Services | Content | Skills | Community
      myGrid: http://mygrid.org.uk/ http://taverna.org.uk/
  • 3. Motivation
      • Challenge: Bioinformatics
          • Large amounts of data
          • Many open questions
          • Numerous freely available public datasets and analysis tools
  • 4. Huge amounts of data
      • Microarray: 1000+ genes
      • QTL regions: 100+ genes
      • Next-gen sequencing: 10,000+ genes
      • How do I look at all the genes systematically?
  • 5. Manual approach
      • Search using public web sites and databases
          • PubMed
          • UniProt
          • EBI BioMart
      • Copy and paste to web tools for analysis
          • NCBI BLAST
          • EBI InterPro
      • Further processing locally
          • R
          • Perl
          • Python
  • 6. Manual: disadvantages
      • Scale of the analysis task overwhelms researchers (lots of data)
      • User bias and premature filtering of datasets (cherry picking)
      • Hypothesis-driven approach to data analysis
      • Constant changes in data cause problems when re-analysing data
      • Implicit methodologies (hyper-linking through web pages)
      • Error proliferation from any of the listed issues, notably human error
  • 7. Web services and workflows
      • Web services
          • Technology and standards for exposing code and data resources that can be programmatically consumed by a remote third party
          • Description of how to interact with the service: parameters, documentation
      • Workflows
          • General technique for describing and executing a process
          • Describe what you want to do, and which services to run to do it
  • 8. Taverna workflows
      • A set of (local and remote) services to analyze or manage data
      • Nested workflows are also services
      • Data links connect services
          • i.e. output from service A is input to service B and C
      • Describes the desired dataflow instead of process coordination
      • Automatic iterations
      • Can customize list handling and control links
      [Figure: example workflow diagram. Workflow inputs: start_position, chromosome_name, end_position. Workflow outputs: gene_descriptions, genes_pathways, merged_pathways, pathway_descriptions, pathway_ids, kegg_external_gene_reference, report, ensembl_database_release, kegg_pathway_release.]
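The dataflow idea on slide 8 can be illustrated with a minimal Python sketch. This is not Taverna code and does not use any Taverna API: the services are stand-in local functions, and every name here (`implicit_iteration`, `service_a`, `service_b`, `service_c`, `run_workflow`, `dot_product`, `cross_product`) is hypothetical. It only shows the two ideas from the slide: output from service A feeding both B and C, and a service written for a single item being applied automatically to each element of a list. The last two helpers show two possible ways of combining two list inputs (pairwise, or all combinations), as an assumed illustration of what "customize list handling" could mean.

```python
def implicit_iteration(service):
    """Wrap a single-item service so that a list input is processed item by item."""
    def wrapper(value):
        if isinstance(value, list):
            # Recurse so nested lists are handled too.
            return [wrapper(item) for item in value]
        return service(value)
    return wrapper


@implicit_iteration
def service_a(gene_id):
    # Stand-in for a service that normalises an identifier.
    return gene_id.strip().upper()


@implicit_iteration
def service_b(gene_id):
    # Stand-in for a lookup service; the returned string is made up.
    return f"pathways-for:{gene_id}"


@implicit_iteration
def service_c(gene_id):
    # Stand-in for a second consumer of A's output.
    return len(gene_id)


def run_workflow(gene_ids):
    """Workflow input -> A, then A's output is linked to both B and C."""
    a_out = service_a(gene_ids)
    return {
        "pathways": service_b(a_out),  # data link A -> B
        "lengths": service_c(a_out),   # data link A -> C
    }


def dot_product(xs, ys):
    """Combine two lists pairwise: first with first, second with second."""
    return list(zip(xs, ys))


def cross_product(xs, ys):
    """Combine every element of xs with every element of ys."""
    return [[(x, y) for y in ys] for x in xs]


if __name__ == "__main__":
    print(run_workflow([" abc1", "xy2 "]))
```

Note that `run_workflow` never loops explicitly: the list is passed along the data links and each wrapped service iterates on its own, which is the sense in which the workflow describes the dataflow rather than the process coordination.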
  • 9. What types of services?
      • Public/private/secured WSDL/SOAP web services
      • RESTful web services
      • Spreadsheet import
      • Command line tools (local/ssh)
      • Inline scripts (Beanshell, R)
      • Java APIs
      • Customizations:
          • BioMart, BioMoby / SADI
          • Soaplab
          • Grid services (Globus, EGEE gLite, caGrid)
      • … your tool (Plugin tutorial on wiki)
  • 10. Which services?
      • Taverna is general and can connect to standard web services for any domain
      • Bioinformatics:
          • From professional third-party organisations providing robust & open data/analysis services
          • ..to under-the-desk web services for one particular purpose, run by PhD students
      • http://biocatalogue.org/ - 1730 services from 130 providers, crowd-sourced and quality-monitored
  • 12. Taverna workbench
      • Graphical desktop tool
      • No server installation required
      • Drag-and-drop services into diagram
      • Connect services, run, reconnect, rerun
      • Integrates diverse set of tools
  • 16. Sharing workflows
      • myExperiment.org allows users to share, find, download and rate workflows
      • “Facebook for the scientist”
      • 3000 members, 1100 workflows
  • 17. Extensible UI and engine
      • Plugins can provide new “perspectives”
          • e.g. BioCatalogue, myExperiment
      • Provide service-specific customization
          • BioMart interface replicates web site
      • Adding new functionality
          • Looping, branching, dynamic service resolution
          • New service types
          • Design helpers, “Find matching service”
  • 18. Taverna 3 “Next-gen”
      • Under development for 2011
      • Interactive, component-centric and data-centric workflow design
      • Pre-packaged workflow components
      • Searching for workflow components from BioCatalogue and myExperiment
      • New myGrid workflow components library
  • 19. Taverna command line
      • Executes from a Windows/Linux/OS X shell
      • Takes a predefined workflow with files as inputs and outputs
      • Quick way to “productionize” a workflow
  • 20. Taverna Server
      • REST/SOAP interface to execute workflows
      • Client libraries for Ruby and Java
      • Two demonstration web interfaces
          • Ruby
          • Java Portlets
      • Future
          • Detailed execution support and control
          • Security delegation
  • 21. Taverna portlet
      • Example portlet implementation
      • Executes workflows using Taverna Server
  • 23. Ruby web interface
      • Example customized web interface
      • Uses Ruby gem t2-server
  • 24. Taverna on the cloud
      • Use-case:
          • SNP analysis and annotation of genomes sequenced from breeds of cows in Africa: why are some of them resistant to X?
          • Amazon EC2 with Taverna Server and local services
          • Custom (built-in-a-week) Ruby on Rails web interface
          • Runs through 31 chromosomes in 6.5 hours using 10 instances, for $26
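The figures on slide 24 imply a rough hourly rate, worked through below as a short Python sketch. The three inputs (10 instances, 6.5 hours, $26, 31 chromosomes) are from the slide; everything else is an assumption made only for this illustration: that all 10 instances ran for the whole 6.5 hours, that billing was proportional to time used, and that the $26 covers compute only. The slide does not state the instance type or the actual billing model, so the derived numbers are indicative at best.

```python
def implied_costs(instances, wall_clock_hours, total_cost_usd, chromosomes):
    """Derive rough per-unit costs from the totals quoted on the slide.

    Assumes every instance ran for the full wall-clock time and that the
    total cost is purely proportional to instance-hours (an assumption,
    not something the slide states).
    """
    instance_hours = instances * wall_clock_hours
    return {
        "instance_hours": instance_hours,
        "usd_per_instance_hour": total_cost_usd / instance_hours,
        "usd_per_chromosome": total_cost_usd / chromosomes,
    }


if __name__ == "__main__":
    costs = implied_costs(instances=10, wall_clock_hours=6.5,
                          total_cost_usd=26.0, chromosomes=31)
    for name, value in costs.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the run used 65 instance-hours, which works out to about $0.40 per instance-hour and roughly $0.84 per chromosome.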
  • 26. Open source, open development
      • The Taverna suite of tools is open source and free to use
      • Large user community, active mailing lists
      • Lead developers: myGrid in Manchester
      • Contributors from across the world
      • PAL programme
      • myGrid provides training, tutorials and documentation
  • 27. Acknowledgements
  • 29. More information
      • http://www.mygrid.org.uk/
      • http://www.taverna.org.uk/
      • http://www.myexperiment.org/
      • http://www.biocatalogue.org/