Eudat user forum-london-11march2013-biovel-v3

  1. Biodiversity Virtual e-Laboratory
     An e-Infrastructure and e-Science environment supporting research on biodiversity
     WEB SERVICES INFRASTRUCTURES FOR BIODIVERSITY SCIENCE
     Alex Hardisty, Coordinator, Cardiff University
     EUDAT User Forum, 11-12th March 2013, London
  2. Products are “services” and “workflows”
     • Workflows make it possible to process vast amounts of data, repeatedly
       – Build your own workflow: select and apply successive “services” (data analysis and processing steps)
       – Import data from your own research and/or from existing libraries (e.g. GBIF, Catalogue of Life)
     • Access a library of workflows and re-use existing workflows
       – Improves efficiency by reducing research time and overhead expenses
     [Figure: Part of a workflow to study the ecological niche of the horseshoe crab]
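The idea of a workflow as a sequence of successive services can be sketched in a few lines of Python. This is only an illustration, not how Taverna or BioVeL implement workflows: the two services (`drop_missing_coordinates`, `deduplicate`), the record layout, and the coordinates are all made up for the example.

```python
from typing import Callable, Dict, List

# Hypothetical record type: one species occurrence with coordinates.
Record = Dict[str, object]
# A "service" here is any function that takes records and returns records.
Service = Callable[[List[Record]], List[Record]]


def run_workflow(records: List[Record], services: List[Service]) -> List[Record]:
    """Apply each service in turn, feeding its output to the next one."""
    for service in services:
        records = service(records)
    return records


def drop_missing_coordinates(records: List[Record]) -> List[Record]:
    """Hypothetical data refinement step: keep only georeferenced records."""
    return [r for r in records if r.get("lat") is not None and r.get("lon") is not None]


def deduplicate(records: List[Record]) -> List[Record]:
    """Hypothetical data refinement step: drop exact repeats, keeping first occurrences."""
    seen = set()
    unique = []
    for r in records:
        key = (r["species"], r["lat"], r["lon"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Made-up occurrence records for the horseshoe crab (coordinates are illustrative only).
sample = [
    {"species": "Limulus polyphemus", "lat": 38.9, "lon": -75.1},
    {"species": "Limulus polyphemus", "lat": 38.9, "lon": -75.1},
    {"species": "Limulus polyphemus", "lat": None, "lon": -70.0},
    {"species": "Limulus polyphemus", "lat": 41.5, "lon": -70.7},
]

cleaned = run_workflow(sample, [drop_missing_coordinates, deduplicate])
print(len(cleaned))  # 2
```

Because every step has the same shape (records in, records out), the same list of services can be re-run on new data or shared with other users, which is the re-use benefit the slide describes.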
  3. Creates powerful data processing tools for biodiversity research
     Tools: Ecological Niche Modelling, Biogeochemical modelling, Metagenomics, Phylogenetics, Population Modelling, Taxonomy, Geospatial Visualization
     • Carbon Sequestration
     • Ecosystem Functioning and Valuation
     • Invasive Species Management
     An international virtual network of experts connecting 2 scientific communities: biodiversity and ICT
     • Aims to foster cooperation in the community by:
       – Discussing scientific use cases
       – Identifying and deploying important Web Services
       – Designing and offering workflows
       – Training scientists
  4. Supported by many friends
     Fits into a portfolio of initiatives
     • NoE: ALTER-Net, EDIT/PESI, LTER-Europe, EuroMarine, etc.
     • Projects: 4D4Life, agINFRA, Aquamaps, ArtDataBanken, BioFresh, Envri, EU BON, EUBrazilOpenBio, Fauna Iberica, i4Life, iMarine, Micro B3, OpenPlantBio, ViBRANT
     • Global: CAMERA, Catalogue of Life, COOPEUS, CReATIVE-B, EoL, GBIF, GSC Biodiversity WG, TreeBase, and many more
     Important contribution to infrastructure
  5. BioVeL Tool Spectrum
     [Diagram: Workflow design, compute. Tools shown: Taverna Workbench, Component Builder, Taverna Lite / Server, Domain-Specific Website (Taverna Player). Axes: Concept Knowledge (Domain science to Technical) and Workflow Visibility (High to Low). User labels: Science PAL, Domain PAL, Scientist.]
  6. [Architecture diagram. Layers: Biodiversity Catalogues & Repositories; Interfaces; Design & Launch tools; Run time Execution; Deployment Infrastructure (Cloud hosting, compute, storage). Components named include BioCatalogue, Taverna Lite, Taverna Workbench, Taverna Server, BioVeL Services and Workspace.]
  7. We’re at the halfway point
     • Several workflows maturing nicely
       – Public Shared: Data refinement, Population modelling, Ecol. niche modelling
       – Beta: Phylogenetic inferencing
       – In the pipe: Biogeochemical process modelling, metagenomics, …
     • Using Web services from GBIF, CoL, CRIA, Fraunhofer, INFN, …
       – Developing new services: viz and data selection, phylo, metagenomics, Biome-BGC modelling, pop modelling
     • A curated public catalogue of Web services
       – www.biodiversitycatalogue.org
     • AWS cloud infrastructure, new user interfaces (tavlite1.biovel.eu)
     • Growing profile in community
       – Steady enquiries from potential users and public training workshops
  8. 4 questions to address
     1. How to use distributed centres to efficiently run distributed processing chains?
     2. Is there a problem of data exchange? (And how to solve this)
     3. Deploying codes close to data
     4. Access and security issues around managing protected services
  9. How to use distributed centres to efficiently run distributed processing chains?
     • Users’ workflows and applications
     • Service and Data Providers (INFN, BioVeL, GBIF, CoL, EBI, BGBM, etc.)
     • Resource Providers (EUDAT, EGI.eu, PRACE, commercial cloud, etc.)
  10. Is there a problem of data exchange? (And how to solve this)
     • At the simplest level, the user needs:
       – A "starting place", where a workflow can find the data it needs
       – An "ending place", where a workflow can put its results
       – A "transient place", where temporary data / intermediate results can be put and retrieved
     • For services we need:
       – Temporary spaces associated with specific services, supporting data movements between services
       – Separation of users and separation of workflow runs
     • In summary:
       – A replicated, distributed storage space, accessible to BioVeL services (and hence to workflows) for both reading and writing, which presents itself to the user as a filespace native to the user’s local environment
       – In effect, a Dropbox for services, with fast replication between known service locations. Today, volumes are typically GB, not TB
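One way to picture the three "places" and the separation of users and runs is a per-user, per-run directory layout. The sketch below is a local-filesystem illustration only: the `RunWorkspace` class, the directory names (`inputs`, `outputs`, `scratch`) and the user and run identifiers are assumptions made for the example, and replication between service locations is not modelled.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunWorkspace:
    """Hypothetical workspace for one workflow run of one user."""
    root: Path
    user: str
    run_id: str

    @property
    def base(self) -> Path:
        # Separate directory per user and per run keeps runs isolated.
        return self.root / self.user / self.run_id

    @property
    def inputs(self) -> Path:
        return self.base / "inputs"    # the "starting place"

    @property
    def outputs(self) -> Path:
        return self.base / "outputs"   # the "ending place"

    @property
    def scratch(self) -> Path:
        return self.base / "scratch"   # the "transient place"

    def create(self) -> "RunWorkspace":
        for directory in (self.inputs, self.outputs, self.scratch):
            directory.mkdir(parents=True, exist_ok=True)
        return self


def demo() -> bool:
    """Return True if an intermediate file of one user is invisible to another user's run."""
    with tempfile.TemporaryDirectory() as tmp:
        alice = RunWorkspace(Path(tmp), "alice", "run-001").create()
        bob = RunWorkspace(Path(tmp), "bob", "run-001").create()
        (alice.scratch / "step1.csv").write_text("intermediate result")
        return not (bob.scratch / "step1.csv").exists()
```

A real shared workspace would also need access control and replication between service locations; the layout only shows how the three places and the two kinds of separation relate to each other.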
  11. Deploying codes close to data
     • BioVeL Appliance
       – A service packaged for DCI, deployed on-demand
       – Working with EGI Fedcloud on this
       – Could be deployed close to data, but this only makes sense if this would be quicker than moving the data
         • So where is the break-even point?
     • Taverna Server deployments
       – In connection with Web Services hosting Taverna Server
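A first-order way to reason about the break-even point is to compare two transfer times. The sketch below rests on simplifying assumptions that are not in the slides: the appliance image and the data travel over the same link at the same bandwidth, the appliance has a fixed start-up time, and the cost of moving results back is ignored. Under those assumptions, deploying near the data wins when data_size / bandwidth > image_size / bandwidth + startup, so the break-even data size is image_size + startup × bandwidth. All numbers in the usage lines are made up.

```python
def break_even_bytes(image_bytes: int, startup_seconds: int, bandwidth_bytes_per_s: int) -> int:
    """Data size above which shipping the appliance is quicker than shipping the data.

    Assumes (hypothetically) one link with the same bandwidth in both cases
    and a fixed appliance start-up time; result transfer is ignored.
    """
    return image_bytes + startup_seconds * bandwidth_bytes_per_s


def deploy_near_data(data_bytes: int, image_bytes: int, startup_seconds: int,
                     bandwidth_bytes_per_s: int) -> bool:
    """True if moving the code to the data is estimated to be quicker."""
    return data_bytes > break_even_bytes(image_bytes, startup_seconds, bandwidth_bytes_per_s)


# Illustrative numbers only: 2 GB image, 120 s start-up, 100 MB/s link.
threshold = break_even_bytes(2_000_000_000, 120, 100_000_000)
print(threshold)  # 14000000000, i.e. 14 GB
```

With these illustrative figures the break-even sits at 14 GB, which fits the earlier observation that data volumes are typically GB rather than TB: for small datasets, moving the data is likely to remain the quicker option.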
  12. Access and security issues around managing protected services
     • We need a lightweight and standard solution for
       – User management & single sign-on to our Service Network
       – Permissions system for authorizing access to services
         • Same for Workspace Access Service (user workspace)
     [Diagram labels: User, Contract, SP, Contract, RP]
  13. Access and security issues around managing protected services
     • We need a lightweight and standard solution for
       – User management & single sign-on to our Service Network
       – Permissions system for authorizing access to services
         • Same for Workspace Access Service (user workspace)
     • 3-legged OAuth, extended
       – resource / service is independent of BioVeL OAuth provider
     • Adopt from megx.net – marine ecological genomics
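The three legs of such a flow, with the protected service kept separate from the authorization provider, can be illustrated with a toy in-memory simulation. This is not a real OAuth implementation and not the megx.net or BioVeL design: there are no signatures, redirects or HTTP calls, and the class and method names (`AuthProvider`, `ProtectedService`, `request_token`, `authorize`, `exchange`, `whoami`) are invented for the example.

```python
import secrets
from typing import Dict, Optional, Set


class AuthProvider:
    """Toy stand-in for an authorization provider (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self._pending: Dict[str, Optional[str]] = {}  # request token -> approving user
        self._access: Dict[str, str] = {}             # access token -> user

    def request_token(self) -> str:
        """Leg 1: the client application obtains a temporary request token."""
        token = secrets.token_hex(8)
        self._pending[token] = None
        return token

    def authorize(self, request_token: str, user: str) -> None:
        """Leg 2: the user approves the request at the provider."""
        if request_token not in self._pending:
            raise KeyError("unknown request token")
        self._pending[request_token] = user

    def exchange(self, request_token: str) -> str:
        """Leg 3: the client swaps the approved request token for an access token."""
        user = self._pending.get(request_token)
        if user is None:
            raise PermissionError("request token was not authorized by a user")
        del self._pending[request_token]
        access_token = secrets.token_hex(16)
        self._access[access_token] = user
        return access_token

    def whoami(self, access_token: str) -> Optional[str]:
        """Let a service check which user, if any, an access token belongs to."""
        return self._access.get(access_token)


class ProtectedService:
    """Service that is independent of the provider and only asks it to validate tokens."""

    def __init__(self, provider: AuthProvider, allowed_users: Set[str]) -> None:
        self._provider = provider
        self._allowed = allowed_users

    def call(self, access_token: str) -> str:
        user = self._provider.whoami(access_token)
        if user is None or user not in self._allowed:
            raise PermissionError("access denied")
        return f"result for {user}"
```

The service holds its own permission list and never sees the user's credentials, which is one reading of the slide's point that the resource or service is independent of the OAuth provider.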
  14. Questions?
     BioVeL is funded by the European Commission 7th Framework Programme (FP7). It is part of its e-Infrastructures activity. BioVeL contributes to LifeWatch and GEO BON. BioVeL products are free to access.
     Under FP7, the e-Infrastructures activity is part of the Research Infrastructures programme, funded under the FP7 Capacities Specific Programme. It focuses on the further development and evolution of the high-capacity and high-performance communication network (GÉANT), distributed computing infrastructures (grids and clouds), supercomputer infrastructures, simulation software, scientific data infrastructures, e-Science services as well as on the adoption of e-Infrastructures by user communities.
