Taverna workflow management system (2010 11-30 Bath Workflow Tools) PPTX

Presentation of Taverna from the UKOLN DevCSI "Workflow Tools" event in Bath, 2010-11-30

PDF version: http://www.slideshare.net/soilandreyes/taverna-workflow-management-system-2010-1130-bath-workflow-tools

http://taverna.org.uk/
http://www.ukoln.ac.uk/events/devcsi/workflow_tools/programme/index.html
http://devcsi.ukoln.ac.uk/

Published in: Technology

  1. Stian Soiland-Reyes
     myGrid, School of Computer Science, University of Manchester, UK
     UKOLN DevCSI: Workflow Tools, Bath, 2010-11-30
     myGrid http://taverna.org.uk/
  2. myGrid http://taverna.org.uk/ http://mygrid.org.uk/
     What is myGrid?
     • An e-Science collaboration since 2001
     • Not a grid!
     • Numerous partners involved:
        • University of Manchester
        • University of Southampton
        • University of Oxford
        • EMBL-EBI
     • Provides sustainable, production-quality software
     • Supported by OMII-UK, EPSRC and BBSRC
     • Mixture of developers, bioinformaticians and researchers
     Software | Services | Content | Skills | Community
  3. Motivation
     • Challenge: bioinformatics
        • Large amounts of data
        • Many open questions
        • Numerous freely available public datasets and analysis tools
  4. Huge amounts of data
     • QTL regions: 100+ genes
     • Microarray: 1000+ genes
     • Next-gen sequencing: 10,000+ genes
     How do I look at all the genes systematically?
  5. Manual approach
     • Search using public web sites and databases
        • PubMed
        • UniProt
        • EBI BioMart
     • Copy and paste to web tools for analysis
        • NCBI BLAST
        • EBI InterPro
     • Further processing locally
        • R
        • Perl
        • Python
  6. Manual: disadvantages
     • Scale of analysis task overwhelms researchers – lots of data
     • User bias and premature filtering of datasets – cherry picking
     • Hypothesis-driven approach to data analysis
     • Constant changes in data – problems with re-analysis of data
     • Implicit methodologies (hyper-linking through web pages)
     • Error proliferation from any of the listed issues – notably human error
  7. Web services and workflows
     • Web services
        • Technology and standards for exposing code and data resources that can be programmatically consumed by a remote third party
        • Description of how to interact with the service: parameters, documentation
     • Workflows
        • General technique for describing and executing a process
        • Describe what you want to do and which services to run
  8. Taverna workflows
     • A set of (local and remote) services to analyze or manage data
        • Nested workflows are also services
     • Data links connect services
        • e.g. output from service A is input to services B and C
        • Describes the desired dataflow instead of process coordination
     • Automatic iterations
     • Can customize list handling and control links
     [Figure: example workflow diagram "Get_pathways", showing workflow inputs, workflow outputs and nested workflows, with processors such as genes_in_qtl, mmusculus_gene_ensembl, get_pathways_by_genes1, remove_uniprot_duplicates, merge_genes_and_pathways and create_report]
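
The dataflow idea on this slide can be illustrated with a minimal sketch. It is not Taverna code: the three services and the `iterate` helper are hypothetical stand-ins, assumed only for illustration. It shows one output feeding two downstream services, and a single-item service being applied automatically to every element of a list.

```python
from typing import Any, Callable, Dict, List


def iterate(service: Callable[[Any], Any], value: Any) -> Any:
    """Apply a single-item service; if handed a list, map over it recursively."""
    if isinstance(value, list):
        return [iterate(service, item) for item in value]
    return service(value)


# Hypothetical services, each written to handle ONE item at a time.
def service_a(gene_id: str) -> str:
    return gene_id.strip().upper()


def service_b(gene_id: str) -> str:
    return f"description of {gene_id}"


def service_c(gene_id: str) -> int:
    return len(gene_id)


def run_workflow(gene_ids: List[str]) -> Dict[str, Any]:
    # Data links: the output of A is the input of both B and C.
    a_out = iterate(service_a, gene_ids)
    return {
        "descriptions": iterate(service_b, a_out),
        "lengths": iterate(service_c, a_out),
    }


result = run_workflow([" abc1 ", "xyz22"])
print(result)
```

Only the links between services are stated here; the order in which B and C run is left open, which is the difference between describing a dataflow and coordinating a process.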
  9. What types of services?
     • Public/private/secured WSDL/SOAP web services
     • RESTful web services
     • Spreadsheet import
     • Command line tools (local/ssh)
     • Inline scripts (Beanshell, R)
     • Java APIs
     • Customizations:
        • BioMart, BioMoby / SADI
        • Soaplab
        • Grid services (Globus, EGEE gLite, caGrid)
     • … your tool (Plugin tutorial on wiki)
  10. Which services?
     • Taverna is general and can connect to standard web services for any domain
     • Bioinformatics:
        • From professional third-party organisations providing robust and open data/analysis services
        • ...to under-the-desk web services for one particular purpose, run by PhD students
     • http://biocatalogue.org/ - 1730 services from 130 providers – crowd sourced and quality monitored
  12. Taverna workbench
     • Graphical desktop tool
     • No server installation required
     • Drag-and-drop services into diagram
     • Connect services, run, reconnect, rerun
     • Integrates diverse set of tools
  16. Sharing workflows
     • myExperiment.org allows users to share, find, download and rate workflows
     • “Facebook for the scientist”
     • 3000 members, 1100 workflows
  17. Extensible UI and engine
     • Plugins can provide new “perspectives”
        • e.g. BioCatalogue, myExperiment
     • Provide service-specific customization
        • BioMart interface replicates web site
     • Adding new functionality
        • Looping, branching, dynamic service resolution
        • New service types
        • Design helpers, “Find matching service”
  18. Taverna 3 “Next-gen”
     • Under development for 2011
     • Interactive, component-centric and data-centric workflow design
     • Pre-packaged workflow components
     • Searching for workflow components from BioCatalogue and myExperiment
     • New myGrid workflow components library
  19. Taverna command line
     • Runs from a Windows, Linux or OS X shell
     • Takes a predefined workflow with files as inputs and outputs
     • Quick way to “productionize” a workflow
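
As a rough sketch of what “files as inputs and outputs” could look like when scripted, the snippet below only assembles a command line and does not run it. The launcher name `executeworkflow.sh` and the `-inputfile` and `-outputdir` flags are assumptions based on the Taverna 2 command line tool and should be checked against the help output of the installed version. The workflow, port and file names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def build_command(
    workflow: Path,
    input_files: Dict[str, Path],
    output_dir: Path,
    launcher: str = "executeworkflow.sh",  # assumed launcher name
) -> List[str]:
    """Assemble an argument list: one input file per workflow input port."""
    command = [launcher]
    for port, path in sorted(input_files.items()):
        # Assumed flag: binds the contents of a file to a named input port.
        command += ["-inputfile", port, str(path)]
    # Assumed flag: directory that receives one file per output port.
    command += ["-outputdir", str(output_dir), str(workflow)]
    return command


cmd = build_command(
    workflow=Path("get_pathways.t2flow"),   # hypothetical workflow file
    input_files={"gene_ids": Path("genes.txt")},  # hypothetical port and file
    output_dir=Path("results"),
)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` from a cron job or batch script once the flags have been verified.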
  20. Taverna Server
     • REST/SOAP interface to execute workflows
     • Client libraries for Ruby and Java
     • Two demonstration web interfaces
        • Ruby
        • Java Portlets
     • Future
        • Detailed execution support and control
        • Security delegation
  21. Taverna portlet
     • Example portlet implementation
     • Executes workflows using Taverna Server
  23. Ruby web interface
     • Example customized web interface
     • Uses Ruby gem t2-server
  24. Taverna on the cloud
     • Use case:
        • SNP analysis and annotation of genomes sequenced from breeds of cows in Africa: why are some of them resistant to X?
     • Amazon EC2 with Taverna Server and local services
     • Custom (built-in-a-week) Ruby on Rails web interface
     • Runs through 31 chromosomes in 6.5 hours using 10 instances - $26
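
The figures on this slide imply a per-instance hourly rate, worked out below as a back-of-the-envelope check. The calculation assumes that all 10 instances ran for the full 6.5 hours and that the $26 covers compute time only. Neither is stated on the slide, and hourly rounding, storage and data transfer are ignored.

```python
# Figures quoted on the slide.
instances = 10
wall_clock_hours = 6.5
total_cost_usd = 26.0
chromosomes = 31

# Assumption: every instance was busy for the whole run.
instance_hours = instances * wall_clock_hours      # 65 instance-hours
implied_rate = total_cost_usd / instance_hours     # about $0.40 per instance-hour
cost_per_chromosome = total_cost_usd / chromosomes

print(f"{instance_hours:.1f} instance-hours")
print(f"${implied_rate:.2f} per instance-hour")
print(f"${cost_per_chromosome:.2f} per chromosome")
```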
  26. Open source, open development
     • The Taverna suite of tools is all open source and free to use
     • Large user community, active mailing lists
     • Lead developers: myGrid in Manchester
     • Contributors from across the world
     • PAL programme
     • myGrid provides training, tutorials and documentation
  27. Acknowledgements
  29. More information
     • http://www.mygrid.org.uk/
     • http://www.taverna.org.uk/
     • http://www.myexperiment.org/
     • http://www.biocatalogue.org/