Taverna workflow management system (2010 11-30 Bath Workflow Tools)


Presentation of Taverna from UKOLN DevSci "Workflow Tools" event in Bath, 2010-11-30

PPTX version: http://www.slideshare.net/soilandreyes/taverna-workflow-management-system-2010-1130-bath-workflow-tools-pptx


Published in: Technology


  1. Stian Soiland-Reyes, myGrid, School of Computer Science, University of Manchester, UK
     UKOLN DevSci: Workflow Tools, Bath, 2010-11-30
     http://taverna.org.uk/
  2. What is myGrid? (http://mygrid.org.uk/)
     • An e-Science collaboration since 2001
     • Not a grid!
     • Numerous partners involved:
         • University of Manchester
         • University of Southampton
         • University of Oxford
         • EMBL-EBI
     • Provides sustainable, production-quality software
     • Supported by OMII-UK, EPSRC and BBSRC
     • Mixture of developers, bioinformaticians and researchers
     Software | Services | Content | Skills | Community
  3. Motivation
     • Challenge: bioinformatics
     • Large amounts of data
     • Many open questions
     • Numerous freely available public datasets and analysis tools
  4. Huge amounts of data
     • QTL regions: 100+ genes
     • Microarray: 1000+ genes
     • Next-gen sequencing: 10,000+ genes
     How do I look at all the genes systematically?
  5. Manual approach
     • Search using public web sites and databases
         • Pubmed
         • Uniprot
         • EBI BioMart
     • Copy and paste to web tools for analysis
         • NCBI Blast
         • EBI InterPro
     • Further processing locally
         • R
         • Perl
         • Python
  6. Manual: disadvantages
     • Scale of the analysis task overwhelms researchers (lots of data)
     • User bias and premature filtering of datasets (cherry picking)
     • Hypothesis-driven approach to data analysis
     • Constant changes in data cause problems with re-analysis
     • Implicit methodologies (hyper-linking through web pages)
     • Error proliferation from any of the listed issues, notably human error
  7. Web services and workflows
     • Web services
         • Technology and standards for exposing code and data resources that can be programmatically consumed by a remote third party
         • Description of how to interact with the service: parameters, documentation
     • Workflows
         • General technique for describing and executing a process
         • Describe what you want to do and which services to run
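To make "programmatically consumed by a remote third party" concrete, here is a minimal sketch using only the Python standard library. The reverse-complement service, its `/revcomp` path and the `call_service` helper are hypothetical, invented for illustration; they are not services from the talk or part of Taverna. The service runs on the local machine purely so the example is self-contained.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import ProxyHandler, build_opener


class ReverseComplementHandler(BaseHTTPRequestHandler):
    """Hypothetical toy service: GET /revcomp?seq=ACGT returns JSON."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        seq = query.get("seq", [""])[0].upper()
        complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
        # Unknown bases are reported as "N" rather than raising an error.
        result = "".join(complement.get(base, "N") for base in reversed(seq))
        body = json.dumps({"input": seq, "reverse_complement": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_service(seq):
    """Start the toy service on a free local port, call it once, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ReverseComplementHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        opener = build_opener(ProxyHandler({}))  # bypass any configured proxy
        url = f"http://127.0.0.1:{port}/revcomp?seq={seq}"
        with opener.open(url) as response:
            return json.load(response)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_service("AACG"))
```

The client side is the part a workflow system automates: build a request from inputs, call the remote service, and hand the parsed result to the next step.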
  8. Taverna workflows
     • A set of (local and remote) services to analyze or manage data
         • Nested workflows are also services
     • Data links connect services
         • e.g. output from service A is input to services B and C
     • Describes the desired dataflow instead of process coordination
     • Automatic iterations
     • Can customize list handling and control links
     [Workflow diagram: Get_pathways, from workflow inputs (chromosome_name, start_position, end_position) through processors such as genes_in_qtl, mmusculus_gene_ensembl, get_pathways_by_genes1 and create_report to workflow outputs]
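The dataflow idea on this slide (output from service A feeding services B and C, with automatic iteration over lists) can be sketched in a few lines of Python. This is a toy model written for illustration, not Taverna's engine; `lift`, `service_a`, `service_b`, `service_c` and `run_workflow` are hypothetical names.

```python
def lift(service):
    """Wrap a single-item service so a list input triggers one call per element.

    Nested lists are handled recursively, so the output keeps the input's shape.
    """
    def wrapped(value):
        if isinstance(value, list):
            return [wrapped(item) for item in value]
        return service(value)
    return wrapped


def service_a(region):
    """Hypothetical service: one region in, a list of gene ids out."""
    return [f"{region}:gene{i}" for i in (1, 2)]


def service_b(gene_id):
    """Hypothetical service: one gene id in, one description out."""
    return gene_id.upper()


def service_c(gene_id):
    """Hypothetical service: one gene id in, one number out."""
    return len(gene_id)


def run_workflow(region):
    # Data links: A's output is the input of both B and C.
    genes = lift(service_a)(region)
    return {
        "descriptions": lift(service_b)(genes),  # B is called once per gene
        "lengths": lift(service_c)(genes),       # C is called once per gene
    }


if __name__ == "__main__":
    print(run_workflow("chr1"))
```

Services B and C are written for a single gene id; the wrapper, not the service author, deals with the fact that A produced a list. Passing a list of regions to `run_workflow` yields nested lists in the same way.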
  9. What types of services?
     • Public/private/secured WSDL/SOAP web services
     • RESTful web services
     • Spreadsheet import
     • Command line tools (local/ssh)
     • Inline scripts (Beanshell, R)
     • Java APIs
     • Customizations:
         • BioMart, BioMoby / SADI
         • Soaplab
         • Grid services (Globus, EGEE gLite, caGrid)
     • … your tool (Plugin tutorial on wiki)
  10. Which services?
     • Taverna is general and can connect to standard web services for any domain
     • Bioinformatics:
         • From professional third-party organisations providing robust and open data/analysis services
         • ...to under-the-desk web services for one particular purpose, run by PhD students
     • http://biocatalogue.org/: 1730 services from 130 providers, crowd-sourced and quality-monitored
  12. Taverna workbench
     • Graphical desktop tool
     • No server installation required
     • Drag-and-drop services into diagram
     • Connect services, run, reconnect, rerun
     • Integrates a diverse set of tools
  16. Sharing workflows
     • myExperiment.org allows users to share, find, download and rate workflows
     • “Facebook for the scientist”
     • 3000 members, 1100 workflows
  17. Extensible UI and engine
     • Plugins can provide new “perspectives”
         • e.g. BioCatalogue, myExperiment
     • Provide service-specific customization
         • BioMart interface replicates web site
     • Adding new functionality
         • Looping, branching, dynamic service resolution
         • New service types
         • Design helpers, “Find matching service”
  18. Taverna 3 “Next-gen”
     • Under development for 2011
     • Interactive, component-centric and data-centric workflow design
     • Pre-packaged workflow components
     • Searching for workflow components from BioCatalogue and myExperiment
     • New myGrid workflow components library
  19. Taverna command line
     • Runs from a Windows, Linux or OS X shell
     • Takes a predefined workflow, with files as inputs and outputs
     • Quick way to “productionize” a workflow
  20. Taverna Server
     • REST/SOAP interface to execute workflows
     • Client libraries for Ruby and Java
     • Two demonstration web interfaces
         • Ruby
         • Java Portlets
     • Future
         • Detailed execution support and control
         • Security delegation
  21. Taverna portlet
     • Example portlet implementation
     • Executes workflows using Taverna Server
  23. Ruby web interface
     • Example customized web interface
     • Uses Ruby gem t2-server
  24. Taverna on the cloud
     • Use-case:
         • SNP analysis and annotation of genomes sequenced from breeds of cows in Africa: why are some of them resistant to X?
     • Amazon EC2 with Taverna Server and local services
     • Custom (built-in-a-week) Ruby on Rails web interface
     • Runs through 31 chromosomes in 6.5 hours using 10 instances, for $26
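The figures on this slide imply some simple averages, worked through below. This is a back-of-the-envelope sketch that assumes all 10 instances ran for the full 6.5 hours and that the $26 covers compute only; the slide states neither, and `cloud_run_summary` is a name made up for this example.

```python
def cloud_run_summary(chromosomes, wall_clock_hours, instances, total_cost_usd):
    """Derive average cost figures from the totals quoted on the slide.

    Assumes every instance ran for the whole wall-clock time (an assumption,
    not something stated in the talk).
    """
    instance_hours = instances * wall_clock_hours
    return {
        "instance_hours": instance_hours,
        "usd_per_instance_hour": total_cost_usd / instance_hours,
        "usd_per_chromosome": total_cost_usd / chromosomes,
    }


if __name__ == "__main__":
    summary = cloud_run_summary(
        chromosomes=31, wall_clock_hours=6.5, instances=10, total_cost_usd=26.0
    )
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions the run used 65 instance-hours, which works out to about $0.40 per instance-hour and roughly $0.84 per chromosome.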
  26. Open source, open development
     • The Taverna suite of tools is all open source and free to use
     • Large user community, active mailing lists
     • Lead developers: myGrid in Manchester
     • Contributors from across the world
     • PAL programme
     • myGrid provides training, tutorials and documentation
  27. Acknowledgements
  29. More information
     • http://www.mygrid.org.uk/
     • http://www.taverna.org.uk/
     • http://www.myexperiment.org/
     • http://www.biocatalogue.org/