European Life Sciences Infrastructure for Biological Information
www.elixir-europe.org
Meeting with Google Cloud Platform UK
Rafael C Jimenez
ELIXIR CTO
ELIXIR
TOC
• ELIXIR
• Data deluge and the cloud in life sciences
• Cloud use cases
ELIXIR
• European life sciences research
infrastructure for biological
information to facilitate research
• Safeguard data and build
sustainable data services
• Involves more than 100 major
bioinformatics service providers and is
supported by 17 EU member states
• Creating a robust infrastructure for
biological information is a bigger
task than any individual
organisation or nation can take on
alone
Infrastructure for Life Sciences
Services & connectors
to drive access and
exploitation
Integration and interoperability
of data and services
Sustain core data
resources
Access, Exchange & Compute
on sensitive data
Compute
Data
Standards
Tools
Training
Professional skills for
managing and exploiting data
Access, Search, Analysis …
Integration, Optimization, Privacy, …
Storage, Network & Computing
Formats, Ontologies, Guidelines, …
Scientific & technical
How does it affect data sharing
in life sciences?
Large-scale data sharing in the life sciences
http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
How does big data affect data sharing?
http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
[Figure: Storage, Compute and Transfer, shown under What, How and Where]
Growing data
Guy Cochrane, EMBL-EBI
Cost of DNA sequencing
Data generation vs. data transfer
Data sources: DNA sequencing, Mass spectrometry, Microscopy
                          Network File Transfer
24 hours (generated)      1 Gb/s       100 Mb/s     10 Mb/s
~100 GB                   ~30 min      ~5 hours     ~2 days
~4 TB                     ~9 hours     ~4 days      ~5 weeks
~4 TB                     ~9 hours     ~4 days      ~5 weeks
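The transfer times on this slide can be roughly checked with simple arithmetic. The sketch below is a minimal illustration, not the method used to produce the slide: it assumes decimal units (1 TB = 10^12 bytes), reads the link speeds as bits per second, and uses hypothetical helper names (`transfer_time_seconds`, `humanize`). The optional `efficiency` factor is an assumption added here to model protocol overhead.

```python
def transfer_time_seconds(size_bytes: float,
                          link_bits_per_second: float,
                          efficiency: float = 1.0) -> float:
    """Ideal time to move size_bytes over a link of the given speed.

    efficiency is the fraction of the nominal link speed actually achieved
    (1.0 means a perfect link with no overhead).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


def humanize(seconds: float) -> str:
    """Express a duration in the largest unit that fits."""
    for unit, length in (("weeks", 604800), ("days", 86400),
                         ("hours", 3600), ("minutes", 60)):
        if seconds >= length:
            return f"{seconds / length:.1f} {unit}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    sizes = {"~100 GB": 100e9, "~4 TB": 4e12}
    links = {"1 Gb/s": 1e9, "100 Mb/s": 100e6, "10 Mb/s": 10e6}
    for size_label, size in sizes.items():
        for link_label, speed in links.items():
            seconds = transfer_time_seconds(size, speed)
            print(f"{size_label:8s} over {link_label:9s}: {humanize(seconds)}")
```

Under these assumptions the ~4 TB figures come out close to the slide (about 9 hours, 4 days and 5 weeks). The ideal ~100 GB figures are lower than the slide's, which would be consistent with real-world overhead, although the slides do not say how their numbers were derived.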
Potential Bottlenecks
in Life Sciences
• Data production grows faster than storage capacity
• The cost of data production technologies declines faster than
the cost of storage
• It takes longer to transfer the data than to produce it
Data growth
how to reduce the IT budget shortfall?
http://www.eweek.com/
Optimization
Using technology more effectively
Selecting relevant data
Potential solutions
• Storage
  • Data compression
  • Select what we store
  • Evaluate data reproducibility & value of data
• Network
  • Faster protocols
  • Partitioning
  • Network upgrade
• Computation
  • Clouds
  • Data close to computation
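As a rough illustration of the data compression option, the sketch below measures how much a general-purpose compressor shrinks plain nucleotide text. It is only a sketch under stated assumptions: the input is synthetic random A/C/G/T text rather than real sequencing output, the helper names (`synthetic_sequence`, `compression_ratio`) are hypothetical, and Python's standard `zlib` module stands in for whatever compression a repository would actually use.

```python
import random
import zlib


def synthetic_sequence(length: int, seed: int = 0) -> bytes:
    """Generate random A/C/G/T text (an assumption, not real sequencing data)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length)).encode("ascii")


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by zlib-compressed size."""
    compressed = zlib.compress(data, level)
    # Lossless check: decompressing must give back the original bytes.
    assert zlib.decompress(compressed) == data
    return len(data) / len(compressed)


if __name__ == "__main__":
    sequence = synthetic_sequence(100_000)
    print(f"compression ratio: {compression_ratio(sequence):.2f}x")
```

Each base carries at most two bits of information but occupies a full byte as text, so even a general-purpose compressor reduces the size severalfold. Specialised formats and real data may behave quite differently.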
What data is relevant?
Proteomics data in PRIDE
~85% raw data
How can the data deluge affect data
production?
http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
Centralization & specialization
Data production Data centralization Data
• Data is submitted to specialized centralized repositories.
• Current situation.
Federation
• If data gets bigger, the data might have to stay where
it is produced.
• We might have to provision data producers with storage
and computation.
• Data might be pulled instead of pushed into centralized
repositories.
Replication
• Replication centers might offer additional services on top
of the data.
Integration
• Some replication centers might integrate data from
different resources.
How can the data deluge affect data analysis?
http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
Separation of data, tools and computation
[Diagram: three "Data" and "Analysis tools" pairs, and one "Computation" block marked "?"]
Cross-site VM Operation - pilot
• Perform analysis via cloud infrastructures and VMs
• Transfer VMs between computing centers to allow researchers to
perform analyses that they could not otherwise do locally
• Supported by 5 NRENs and in collaboration with
Cross-site VM Operation
[Diagram: VMs, data, analysis tools and computation at CSC, EMBL-EBI and University of Groningen, connected by 1GB lightpaths over Funet, Janet and SURFnet. Labelled items: Chipster (200 GB), NBIC Galaxy (50 GB), GoNL (60 TB), ENA (3.2 PB).]
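The sizes labelled in this diagram suggest why moving a VM to the data can be more practical than moving the data. The sketch below estimates ideal transfer times for each labelled item over a single lightpath. It rests on assumptions not stated in the slides: the "1GB lightpath" is read as 1 gigabit per second, units are decimal, the link is perfectly efficient, and the function name `ideal_transfer_days` is hypothetical.

```python
# Sizes as labelled in the diagram, converted to bytes (decimal units assumed).
SIZES_BYTES = {
    "NBIC Galaxy": 50e9,
    "Chipster": 200e9,
    "GoNL": 60e12,
    "ENA": 3.2e15,
}

# Assumption: "1GB lightpath" is read as 1 gigabit per second.
LINK_BITS_PER_SECOND = 1e9

SECONDS_PER_DAY = 86400


def ideal_transfer_days(size_bytes: float,
                        link_bits_per_second: float = LINK_BITS_PER_SECOND) -> float:
    """Days needed to move size_bytes over a perfectly efficient link."""
    return size_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for name, size in sorted(SIZES_BYTES.items(), key=lambda item: item[1]):
        print(f"{name:12s} {ideal_transfer_days(size):10.2f} days")
```

Under these assumptions the two smaller items move in well under an hour, GoNL takes several days and ENA most of a year, which is in line with the pilot's idea of transferring VMs between computing centers.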
ELIXIR cloud providers
• Similar solutions
• Different implementations
Use cases
Infrastructure as a Service (IaaS)
Provides on-demand access to compute and storage resources.
Platform as a Service (PaaS)
Provides a higher-level environment (beyond the infrastructure itself) that is needed
to support data analysis.
Virtual Machine Repository or Marketplace Portal
Provides a means to distribute or consume software environments targeted at
particular audiences.
Use cases
Virtual Clusters
Expand local cluster resources by connecting to cloud-based virtual clusters
and storage resources.
Running Data Analysis Pipelines
Bring an analysis pipeline to a specified data set. The data set may be on a
shared network file system or database instance visible to the pipeline.
Data Extraction
Allow authorised researchers to deploy a VMI that can return a subset of the
stored data (e.g. data mining) or to undertake local analysis.
Use cases
Scalable Web Service Hosting
Run a single or multi-tier web service (e.g. front-end service and back-end
cluster) on a platform that can scale horizontally while managing the network
configuration (e.g. IP and firewall) and access control.
Shared Environment
Provide an environment for shared use and joint administration that can be
accessed and managed by all in a collaborative manner through a common
software environment.
Virtual Desktops for Immediate Use
Provide a working software environment for teaching, training or research
purposes. This could be a basic operating system or a full analysis
environment (e.g. Biolinux).
Use cases
Software Development and Testing
For developing and testing software in different operating system
environments.
Appliance
Encapsulates one or more software products or an analysis environment (e.g. part of a
pipeline) in a VMI that is verified to work and ready to run.
Potential collaboration
• Host processed data like AWS
• Provide a joint solution to large data producers
• App engine, containers, compute, big data for bioinformatics
data analysis
• Facilitate deployment of ELIXIR VMs and containers
• Extension of existing ELIXIR cloud resources
• Replication of large data sets
• Discovery of data sets and tools
• Delegation of IT solutions
• Alliance in life science research
Thank you

 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 

ELIXIR

  • 1. European Life Sciences Infrastructure for Biological Information • www.elixir-europe.org • Meeting with Google Cloud Platform UK • Rafael C Jimenez, ELIXIR CTO
  • 2. TOC • ELIXIR • Data deluge and the cloud in life sciences • Cloud use cases
  • 3. ELIXIR • European life sciences research infrastructure for biological information to facilitate research • Safeguard data and build sustainable data services • Involves more than 100 major bioinformatics service providers and is supported by 17 EU member states • Creating a robust infrastructure for biological information is a bigger task than any individual organisation or nation can take on alone
  • 4. Infrastructure for Life Sciences • Services & connectors to drive access and exploitation • Integration and interoperability of data and services • Sustain core data resources • Access, Exchange & Compute on sensitive data • Professional skills for managing and exploiting data • Diagram labels: Compute, Data, Standards, Tools, Training • Access, Search, Analysis … • Integration, Optimization, Privacy, … • Storage, Network & Computing • Formats, Ontologies, Guidelines, … • Scientific & technical
  • 5. How does it affect data sharing in life sciences?
  • 6. Large-scale data sharing in the life sciences http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
  • 7. How does big data affect data sharing? http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552 • Diagram labels: Compute, Storage, Transfer • What, How, Where
  • 8.
  • 10. Cost of DNA sequencing
  • 11. Data generation vs. data transfer • Data generated in 24 hours, and network file transfer time over 1 Gb, 100 Mb and 10 Mb links • DNA sequencing: ~100 GB; ~30 min, ~5 hours, ~2 days • Mass spectrometry: ~4 TB; ~9 hours, ~4 days, ~5 weeks • Microscopy: ~4 TB; ~9 hours, ~4 days, ~5 weeks
  • 12. Potential Bottlenecks in Life Sciences • Data production grows faster than storage capacity • The cost of data production technologies declines faster than the cost of storage • It takes longer to transfer data than to produce it.
  • 13. Data growth: how to reduce the IT budget shortfall? http://www.eweek.com/
  • 14. Data growth: how to reduce the IT budget shortfall? http://www.eweek.com/ • Optimization • Using technology more effectively • Selecting relevant data
  • 15. Potential solutions • Storage: data compression; select what we store; evaluate data reproducibility & value of data • Network: faster protocols; partitioning; network upgrade • Computation: clouds; data close to computation
  • 16. What data is relevant?
  • 17. Proteomics data in PRIDE • ~85% raw data
  • 18. How can the data deluge affect data production? http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
  • 19. Centralization & specialization • Data is submitted to specialized centralized repositories. • Current situation. • Diagram labels: Data production, Data centralization, Data
  • 20. Federation • If data gets bigger, the data might have to stay where it is produced. • We might have to provision data producers with storage and computation. • Data might be pulled instead of pushed into centralized repositories. • Diagram labels: Data production, Data centralization, Data
  • 21. Replication • Replication centers might offer additional services on top of the data. • Diagram labels: Data production, Data centralization, Data
  • 22. Integration • Some replication centers might integrate data from different resources. • Diagram labels: Data production, Data centralization, Data
  • 23. How can the data deluge affect data analysis? http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC002552
  • 24. Separation of data, tools and computation • Diagram labels: Data, Analysis tools, Computation ?
  • 25. Cross-site VM Operation - pilot • Perform analysis via cloud infrastructures and VMs • Transfer VMs between computing centers to allow researchers to perform analyses that they could not otherwise do locally • Supported by 5 NRENs and in collaboration with
  • 26. Cross-site VM Operation • Sites: CSC, EMBL-EBI, University of Groningen • Diagram labels: Data, Analysis tools, Computation, VM • Chipster 200GB • NBIC Galaxy 50GB • GoNL 60TB • ENA 3.2PB • 1GB lightpath (×3) • Funet, Janet, SURFnet
  • 27. ELIXIR cloud providers • Similar solutions • Different implementations
  • 28. Use cases • Infrastructure as a Service (IaaS): Provides on-demand access to compute and storage resources. • Platform as a Service (PaaS): Provides a higher-level environment (other than infrastructure) that is needed to support data analysis. • Virtual Machine Repository or Marketplace Portal: A means to distribute or consume software environments targeted at particular audiences.
  • 29. Use cases • Virtual Clusters: Expand local cluster resources by connecting to cloud-based virtual clusters and storage resources. • Running Data Analysis Pipelines: Bring an analysis pipeline to a specified data set. The data set may be on a shared network file system or database instance visible to the pipeline. • Data Extraction: Allow authorised researchers to deploy a VMI that can return a subset of the stored data (e.g. data mining) or undertake local analysis.
  • 30. Use cases • Scalable Web Service Hosting: Run a single or multi-tier web service (e.g. front-end service and back-end cluster) on a platform that can scale horizontally while managing the network configuration (e.g. IP and firewall) and access control. • Shared Environment: Provide an environment for shared use and joint administration that can be accessed and managed collaboratively through a common software environment. • Virtual Desktops for Immediate Use: Provide a working software environment for teaching, training or research purposes. This could be a basic operating system or a full analysis environment (e.g. Biolinux).
  • 31. Use cases • Software Development and Testing: For developing and testing software in different operating system environments. • Appliance: Encapsulates one or more software products or an analysis environment (e.g. part of a pipeline) in a VMI that is verified to work and ready to run.
  • 32. Potential collaboration • Host processed data like AWS • Provide a joint solution to large data producers • App Engine, containers, compute, big data for bioinformatics data analysis • Facilitate deployment of ELIXIR VMs and containers • Extension of existing ELIXIR cloud resources • Replication of large data sets • Discovery of data sets and tools • Delegation of IT solutions • Alliance in life science research
  • 33. European Life Sciences Infrastructure for Biological Information www.elixir-europe.org Thank you
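
The transfer times quoted on slide 11 can be sanity-checked with simple arithmetic. The sketch below is not from the slides: it assumes decimal units (1 GB = 10^9 bytes), reads the link labels as bits per second, and ignores protocol overhead and disk speed, so it gives a lower bound rather than the quoted figures. The function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


GB = 10**9   # assumption: decimal gigabyte
TB = 10**12  # assumption: decimal terabyte

# Daily data volumes as listed on slide 11.
datasets = {
    "DNA sequencing": 100 * GB,
    "Mass spectrometry": 4 * TB,
    "Microscopy": 4 * TB,
}

# Assumption: the slide's "1 Gb", "100 Mb", "10 Mb" mean bits per second.
links = {"1 Gb": 10**9, "100 Mb": 10**8, "10 Mb": 10**7}

for name, size in datasets.items():
    for label, rate in links.items():
        hours = transfer_seconds(size, rate) / 3600
        print(f"{name:18s} over {label:6s}: {hours:8.1f} hours")
```

Under these assumptions, 4 TB over a 1 Gb link takes roughly 8.9 hours, in line with the ~9 hours on the slide. The smaller data sets come out faster than the quoted values, which presumably allow for real-world overhead.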

Editor's Notes

  1. Data resource: Sustainability, availability and integration
  2. 'Compute power' doubles every two years. Production of data doubles faster.
  3. Sequencing prices are falling faster than Moore's law. Moore's law predicts an exponential decline in computing cost: a doubling of 'compute power' every two years. Storing data is more expensive than producing it.
  4. Technology gets cheaper and faster. ~15,000 hospitals, ~4,000 universities, ~2,000 life sciences research institutes. How much data will we produce? How will we store it?
  5. Decline of computing cost
  6. Data resource: Sustainability, availability and integration
  7. Data resource: Sustainability, availability and integration
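
Note 2 contrasts two doubling rates. The following is a minimal sketch of how quickly such a gap widens, assuming plain exponential growth. The two-year period comes from the note; the one-year period for data is purely illustrative and is not a figure from the slides. The function name `growth_factor` is hypothetical.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` for a quantity that doubles every period."""
    return 2 ** (years / doubling_period_years)


compute = growth_factor(10, 2)  # doubling every two years (from note 2)
data = growth_factor(10, 1)     # hypothetical: doubling every year
print(f"compute x{compute:.0f}, data x{data:.0f}, gap x{data / compute:.0f}")
```

With these illustrative periods, compute grows 32-fold over ten years while data grows 1024-fold, so the shortfall itself grows 32-fold.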