Non-technical introduction to Web Services & Workflows. Taverna, Biocatalogue and myExperiment
1. Sam Kerrien & Rafael Jimenez
Non technical introduction to
Web Services
& Workflows
Taverna, Biocatalogue and myExperiment
Hands-on training at EBI
APO-SYS Data Management Meeting
June 1, 2011
3. Introduction to Web Services at EBI
Table of contents
• Web Services
• Workflows
• myGrid solutions
– Biocatalogue
– Taverna
– myExperiment
4. Brief introduction to Web Services
This introduction is intended for a non-technical audience;
we have purposely simplified the technical aspects.
5. What is a Web Service?
• It is a piece of software that runs remotely,
• It is accessible over a network (e.g. Internet),
• It is meant for machine-to-machine communication,
• It is independent of programming languages,
• It can be operated following specific rules (i.e. a protocol),
• There are 2 main protocols in use…
9. SOAP vs. REST
SOAP
• Based on standards,
• Only accessed by software,
• Allows description of complex data structures in request and response.
REST
• Geared to simplicity,
• A browser can be a client,
• A request can only be as complex as a URL allows.
11. Workflow
• Workflow
– Sequence of tasks that produces a result of observable value
• Workflow management system
– Computer system to compose and execute workflows.
• Workflow components
– Input
– Service
– Output
– Shims
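The four components listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifier, the record and all three function names are hypothetical, the "services" are local stand-ins rather than real remote Web Services, and the shim is read here in its usual sense of a small adapter that converts one service's output into the format the next service expects.

```python
def fetch_sequence(identifier):
    """Stand-in for a remote service: returns a FASTA-formatted record."""
    # Hypothetical data; a real service would be called over the network.
    records = {"P12345": ">P12345 hypothetical protein\nMKTAYIAK\nQRQISFVK"}
    return records[identifier]


def fasta_to_raw(fasta_text):
    """Shim: drop the FASTA header so the next service gets a bare sequence."""
    lines = fasta_text.splitlines()
    return "".join(line for line in lines if not line.startswith(">"))


def sequence_length(raw_sequence):
    """Stand-in for a second service that analyses a bare sequence."""
    return len(raw_sequence)


def run_workflow(identifier):
    """Input -> service -> shim -> service -> output."""
    fasta = fetch_sequence(identifier)   # service 1
    raw = fasta_to_raw(fasta)            # shim between the two services
    return sequence_length(raw)          # service 2 produces the output


result = run_workflow("P12345")
print(result)
```

A workflow management system plays the role of `run_workflow` here: it wires the steps together and passes data from one to the next.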
15. myGrid solutions
• Create and run workflows
• Share, discover and reuse workflows
• Discover and reuse services
16. Biocatalogue
• A public, centralised and curated registry of Life Science Web Services
• ‘Web 2.0’-style website and API
• Allows anyone to register, discover and curate Web Services
• Community oriented with expert guidance
• Open content, open source, open platform
http://www.biocatalogue.org
Paul Fisher, myGrid, University of Manchester
19. Taverna
[Screenshot of the Taverna workbench, with panels labelled: workflow diagram, tree view of workflow structure, available services]
• Workflow management system
• Java desktop application
• Open source and extensible
• Includes access to Biocatalogue and myExperiment
• http://www.taverna.org.uk/
20. What do Scientists use Taverna for?
– Data gathering, annotation and model building
– Data analysis from distributed tools
– Data mining and knowledge management
– Parameter sweeps and simulation
Users from Systems Biology, Proteomics, Sequence analysis,
Protein structure prediction, Gene/protein annotation, Microarray
data analysis, QTL studies, Cheminformatics, Medical image
analysis, Public Health care epidemiology, Heart model simulation,
Phenotype studies, Phylogeny, Statistical analysis,
Pharmacogenomics, Text mining, Astronomy, Music, Meteorology
Katy Wolstencroft, myGrid, University of Manchester
21. Sharing Experiments
• Taverna supports the in silico experimental process for
individual scientists
• You can share results/experiments/experiences with your
– Research group
– Collaborators
– Scientific community
A registry of workflows
Paul Fisher, myGrid, University of Manchester
23. Recycling, Reuse, Repurposing
• Paul writes workflows for identifying biological pathways
implicated in resistance to trypanosomiasis.
• Paul meets Jo. Jo is investigating mouse whipworm
infection.
• Jo reuses one of Paul’s workflows.
• Jo identifies the biological pathways involved in sex
dependence in the mouse model, believed to be involved in
the ability of mice to expel the parasite.
• Previously, a manual two-year study by Jo had failed to do
this.
Workflows are protocols
Paul Fisher, myGrid, University of Manchester
For a REST service, the user has to work out from the provided documentation how to build a URL to access the service.
The URL will typically contain parameters that constitute the user's query.
Error codes are returned as HTTP status codes.
Some REST services expose a WADL file: http://en.wikipedia.org/wiki/Web_Application_Description_Language
WADL describes how a URL is formatted (a computer could use it to create a URL).
Sandra’s comment: Most often, software will build that URL for you.
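As a rough illustration of the notes above, the sketch below builds a query URL from parameters and turns a numeric status code into a readable message. The base URL and the parameter names `query` and `format` are hypothetical and would come from the documentation of whichever service is used; reading "Status" as the HTTP status code is also an assumption.

```python
from http import HTTPStatus
from urllib.parse import urlencode

# Hypothetical base URL; a real one comes from the service documentation.
BASE_URL = "https://example.org/rest/search"


def build_query_url(base_url, **params):
    """Append the user's query parameters to the service URL."""
    return f"{base_url}?{urlencode(params)}"


def describe_status(code):
    """Turn a numeric HTTP status code into a readable message."""
    return f"{code} {HTTPStatus(code).phrase}"


url = build_query_url(BASE_URL, query="BRCA2", format="xml")
print(url)
print(describe_status(404))
```

No request is actually sent here; the point is only that the whole query lives in the URL, which is why a browser can act as a client.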
For a SOAP service, the user has to find in the provided documentation where the Web Service’s WSDL file is.
The WSDL file contains the list of operations that the Web Service can perform, as well as the definition of the structure of the returned data.
WSDL = Web Services Description Language
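To show what "list of operations" means in practice, here is a minimal sketch that reads the operation names out of a WSDL document using only the Python standard library. The embedded WSDL fragment, the service name and the operation names are made up for illustration and are far smaller than a real WSDL file; the namespace is the WSDL 1.1 one.

```python
import xml.etree.ElementTree as ET

# Namespace used by WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed WSDL; real files also describe
# messages, data types, bindings and the service address.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="HypotheticalSequenceService">
  <portType name="SequencePort">
    <operation name="getSequence"/>
    <operation name="listDatabases"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Return the names of the operations declared in the WSDL."""
    root = ET.fromstring(wsdl_text)
    operations = root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    return [op.get("name") for op in operations]


operations = list_operations(SAMPLE_WSDL)
print(operations)
```

In practice a SOAP client library or a tool such as Taverna reads the WSDL for you; the sketch only shows that the file is machine-readable.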
Life Cycle
• Create and run workflows: Taverna workflow enactment engine and GUI workflow workbench for composing workflows
• Create and manage services as components: service deployment, workflow and service monitoring
• Discover and reuse services: BioCatalogue curated catalogue and Feta plugin
• Share, discover and reuse workflows: myExperiment Web 2.0 social environment
• Manage the metadata needed and generated: Semantic Web based technologies for the ontologies needed for service and workflow finding, and for provenance collection and processing
When tGRAP collapsed, it was picked up by Nijmegen and rebuilt using workflows over two days.