Invited talk at the GeoClouds Workshop, Indianapolis, 2009

Taverna, Biocatalogue, and myExperiment:
a three-legged foundation for effective collaboration in E-science

Transcript

  • 1. Scientific Workflow Management System. Taverna, Biocatalogue, and myExperiment: a three-legged foundation for effective collaboration in E-science. A collaborative talk by Paolo Missier, Information Management Group, School of Computer Science, University of Manchester, UK, with additional material kindly shared by Prof. Dave DeRoure and David Newman, University of Southampton, and Prof. Carole Goble and the e-Labs design group, University of Manchester. GeoClouds workshop, Indianapolis, IN, Sept. 17, 2009 - P. Missier. Sunday, 13 March 2011
  • 2. What is the myGrid Project?
      • UK e-Science pilot project since 2001.
      • Centred at Manchester, Southampton and the EMBL-EBI.
      • Part of Open Middleware Infrastructure Institute UK, http://www.omii.ac.uk.
      • Mixture of developers, bioinformaticians and researchers.
      • An alliance of contributing projects and partners.
      • Open source development and content: LGPL or BSD.
      • Infrastructure: we don't own any resources (apart from catalogues), or a Grid.
    ESIP meeting, Santa Barbara, CA, July 2009 - P. Missier
  • 3. Taverna
      • Graphical workbench for professionals
      • Plug-in architecture
      • Nested workflows
      • Drag and drop
      • Wiring together
      • Rapidly incorporate new services without coding
      • Not restricted to predetermined services
      • Access to local and remote resources and analysis tools
      • 3500+ service operations available at start-up
  • 4. What do Scientists use Taverna for?
    Application areas: systems biology model building; proteomics; sequence analysis; protein structure prediction; gene/protein annotation; microarray data analysis; QTL studies; QSAR studies; medical image analysis; public health care epidemiology; heart model simulations; high throughput screening; phenotypical studies; phylogeny; statistical analysis; text mining; astronomy, music, meteorology.
    Users: Netherlands Bioinformatics Centre; Genome Canada Bioinformatics Platform; BioMOBY; US FLOSS social science program; RENCI; SysMO Consortium; French SIGENAE farm animals project; ThaiGrid; CARMEN Neuroscience project; SPINE consortium; EU Enfin, EMBRACE, BioSapian, Casimir; EU SysMO Consortium; NERC Centre for Ecology and Hydrology; Bergen Centre for Computational Biology; Max-Planck institute for Plant Breeding Research; Genoa Cancer Research Centre; AstroGrid; 30 USA academic and research institutions.
  • 5. Who else is in this space? Trident, Triana, Kepler, Ptolemy II, Taverna, BioExtract, BPEL
  • 6. www.myexperiment.org: Socially share, discover and reuse workflows and other methods. Cooperative bazaar. Sunday 10th May: 1748 registered users, 143 groups, 669 workflows, 197 files, 52 packs. 56 different countries. Top 4: UK, US, The Netherlands, Germany.
  • 9. Why data provenance matters, if done right
      • To establish quality, relevance, trust
      • To track information attribution through complex transformations
      • To describe one's experiment to others, for understanding / reuse
      • To provide evidence in support of scientific claims
      • To enable post hoc process analysis for improvement, re-design
    The W3C Incubator on Provenance has been collecting numerous use cases: http://www.w3.org/2005/Incubator/prov/wiki/Use_Cases#
    Linköping, Sweden -- January 2010
  • 10. Goals, expected contributions
      • Established technology provider, open-source
          - traditionally active in the bioinformatics space
          - but also involved in the e-Lico EU project (data mining portal)
          - large community base, established production environment
      • Main goal:
          - to offer our workflow and workflow repository technology, put it to the test on the challenges of data preservation pipelines
      • Challenges:
          - expect new requirements on our current technology
              • robust, high-volume data pipelines
              • workflow provenance -- process evolution
              • data provenance
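
Slide 9 lists tracking "information attribution through complex transformations" as one reason provenance matters. As a rough illustration only, the sketch below records which step produced each data item, and from which inputs, so that an item's lineage can be traced afterwards. It is not Taverna's provenance model: the names `ProvenanceLog`, `run_step`, and the item ids are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ProvenanceLog:
    """Hypothetical record of derivations: output id -> (step name, input ids)."""
    derived_from: Dict[str, Tuple[str, List[str]]] = field(default_factory=dict)

    def record(self, output_id: str, step: str, input_ids) -> None:
        self.derived_from[output_id] = (step, list(input_ids))

    def lineage(self, item_id: str) -> List[str]:
        """Return every id the item depends on, directly or transitively."""
        seen: List[str] = []
        stack = [item_id]
        while stack:
            current = stack.pop()
            entry = self.derived_from.get(current)
            if entry is None:
                continue  # a raw input: nothing recorded upstream of it
            for parent in entry[1]:
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen


def run_step(log: ProvenanceLog, step: str, func: Callable[..., Any],
             inputs: Dict[str, Any], output_id: str) -> Any:
    """Apply func to the input values and record how the output was derived."""
    result = func(*inputs.values())
    log.record(output_id, step, inputs.keys())
    return result


# A two-step pipeline: merge two raw lists, then sort the merged list.
log = ProvenanceLog()
raw = {"raw_a": [3, 1, 2], "raw_b": [6, 5, 4]}
merged = run_step(log, "merge", lambda a, b: a + b, raw, "merged")
ordered = run_step(log, "sort", sorted, {"merged": merged}, "ordered")

print(log.lineage("ordered"))  # ['merged', 'raw_a', 'raw_b']
```

Recording the derivation at the point where each step runs keeps the log independent of what the steps compute. A production system would also need to capture parameters, timestamps, and versions of the services invoked.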