European Life Sciences Infrastructure for Biological Information
www.elixir-europe.org
Rafael C Jimenez
ELIXIR CTO
RDA Fourth Plenary Meeting, ELIXIR Bridging Force IG
Tuesday 23 September 2014
ELIXIR
European Research Infrastructures
LS: life sciences
ICT: e-infrastructures
Research infrastructures
Facilitate research
Data
ICT
e-infrastructures
LS
life sciences
Physical facilities
Scientific information
Transfer
Computation
Storage
ELIXIR
• European life sciences research infrastructure for
biological information to support all life science
research
• Safeguard data and build sustainable data
services
ELIXIR Members
• Involves major bioinformatics service providers (~130) and is
supported by 17 countries and EMBL
• 11 countries have signed the agreement
• Czech Republic, Estonia, Denmark,
Finland, Israel, Netherlands,
Norway, Portugal, Switzerland,
Sweden, UK
• 6 countries have signed an MoU
• Belgium, Greece, France, Italy,
Slovenia, Spain
Description Web address
The Israel Node
Contact
ELIXIR: The Norway Node
ELIXIR's Norway Node will provide competence and infrastructure
building on key areas for Norway, in particular marine resources and
medical research. Challenges related to processing and analysis of data
from next-generation sequencing and other high-throughput methods are
important both to basic research in these areas and to the development
of new enterprises. The node will also provide training and support for
researchers.
Professor Inge Jonassen
ELIXIR Norway Node
Computational Biology Unit
Department of Informatics
University of Bergen, Norway
Inge.Jonassen@ii.uib.no
www.elixir-norway.org
Collaborating organisations
University of Bergen
The coordinating partner of the ELIXIR Norway Node. The main focus is on
marine genomics and e-infrastructure. An early deliverable is LiceBase,
developed in close collaboration with the Sea Lice Research Centre.
University of Oslo
Emphasizes biomedical resource provision and analysis, leveraging public
resources in integrative statistical genomics, with secure management of
person-sensitive data.
Norwegian University of Life Sciences
Comparative fish genomics. Provision of web-based solutions for services,
toolboxes, and computational access to these data.
Norwegian University of Science and Technology
Tools and resources for analysing genome data, with a focus on gene
regulation, non-protein-coding RNAs and epigenetics, but also bacterial
genomics. Handling and analysis of data from human biobanks.
University of Tromsø
Tools and pipelines for analysing metagenomic (and genomic) data, with a
particular focus on taxonomic classification and bioprospecting (functional
and metabolic potential).
Marine research
The Norwegian ELIXIR Node will provide services and resources for the
marine genomics community, including researchers, government, and industry.
The Norwegian Node will offer several integrated packages geared towards
large-scale analysis of marine genomic and metagenomic data (e.g. fish
genomics and marine bioprospecting). This also includes provision of
web-based solutions for services, toolboxes, and computational access to
reference data provided by the ELIXIR infrastructure.
Health and biobanks
The Norway Node supports infrastructure for handling and analysis of data
for medical research, including human biobanks. Such data may be sensitive
and must be stored with secure access. The node is developing
infrastructure for sensitive data. Tools for data analysis are integrated
into NeLS, the Norwegian e-infrastructure for life sciences. This provides
user-friendly solutions, for example for human re-sequencing data and other
genome-scale analyses.
© Trym Ivar Bergsmo / Samfoto / NTB Scanpix
ELIXIR node proposals
• ELIXIR delivers services through national ELIXIR Nodes,
building on national strengths and priorities
ELIXIR strategic drivers
1. Establish a distributed infrastructure to scale with the challenge of data
growth
2. Secure and deliver the core data resources underpinning life science research
3. Provide discoverable tools, services and connectors to drive data access and
exploitation
4. Provide robust technical platforms and clouds for secure data access, data
exchange and compute
5. Develop and maintain standards for data management, reuse and integration
6. Drive partnerships with user communities and other organisations to ensure
high impact
7. Close the computational biology skills gap through a comprehensive training
programme for professionals
8. Support innovation in big data biology
Areas of interest and programme of work
Domain specific services
Data interoperability, vocabulary and ontology services
Tools interoperability and Service Registry
Data resources & services
Technical Services
Management and operations
Training
Compute
Data
Standards
Tools
Training
Task forces
• Working groups to coordinate the ELIXIR technical strategy
• Driven by national technical leads
• Plan, agree and implement technical strategies
• Represent ELIXIR on one specific technical topic
Scientific
and funding
strategies
ELIXIR Programme of work
Task forces
Planning Agreement Implementation
Technical strategies
Pilot projects
HoNs
HoNs
HoNs
TCG
Task forces
Data interoperability, vocabulary and ontology services
Tools interoperability and Service Registry
Data resources & services
Technical Services
Management and operations
Training
• Interoperability
• Service registry
• Metrics, monitoring & quality control
• Cloud
• Storage
• AAI
• Training portal
• E-learning
• Communications
• Website
ELIXIR Pilot Projects
1. ELIXIR Facing Cloud Support and Virtual Machines - with SIB
2. ELIXIR Data IO to pilot the continuous transfer of major archive
resources to a remote European location - with CSC, Finland
3. Establishing EGA Distributed authentication - with CSC, Finland
4. Establishing EGA as joint venture – with CRG, Spain
5. Improving links between Human Protein Atlas (HPA) and EMBL-EBI
resources
6. BILS-ProteomeXchange integration using EUDAT resources
7. Interoperable controlled-access big data transfer technology for
ELIXIR - application to EGA EBI / CRG ELIXIR collaboration and
beyond
8. Harmonising Marine Metagenomics pipelines
ELIXIR technical activities
• ELIXIR Node activities, Task forces, Pilots
• Technical activities among different interest groups …
ELIXIR
Technical
activities
BMB ESFRIs
e-infrastructures
ELIXIR
nodes
Communities
Affinity with RDA groups
Life science
• The BioSharing Registry: connecting data policies,
standards & databases in life sciences
• Wheat Data Interoperability WG
• Agriculture Data Interest Group (IGAD)
• Marine Data Harmonization IG
• Metabolomics
• Structural Biology IG
• Toxicogenomics Interoperability IG
Affinity with RDA groups
• Data Description Registry Interoperability
• Big Data Analytics IG
• Domain Repositories Interest Group
• Education and Training on handling of research data
• Federated Identity Management
• Metadata IG
• Preservation e-Infrastructure IG
• Service Management IG
• Active Data Management Plans
• Sustainability of eResearch / Cyberinfrastructure
Affinity with RDA groups
• Data Citation WG
• Data Foundation and Terminology WG
• Metadata Standards Directory Working Group
• PID Information Types WG
• RDA/WDS Publishing Data Bibliometrics WG
• RDA/WDS Publishing Data Services WG
• RDA/WDS Publishing Data Workflows WG
• Repository Audit and Certification DSA–WDS Partnership WG
• Long tail of research data IG
• PID Interest Group
• RDA/WDS Publishing Data IG
• Research Data Provenance
Thank you
ELIXIR 2015 Objectives
Data resources in life science
• Many
• Diverse
• Dispersed
Nucleic Acids Research annual Database Issue
and the NAR online Molecular Biology Database Collection in 2012.
MY Galperin, GR Cochrane – Nucleic Acids Research, 2011
~1800 molecular biology
data resources
Utility of databases
Scientific impact
Too little
information
Too many databases
Too diverse interfaces
Heterogeneous integration
Homogeneous integration
Data interoperability
A B C
1
2
Improving Links Between distributed European
resources
ELIXIR pilot: Interoperability of protein expression resources
The Human Protein Atlas portal is a publicly available database
with millions of high-resolution images showing the spatial
distribution of proteins in 46 different normal human tissues and
20 different cancer types, as well as 47 different human cell lines.
Growing data
Proteomics data in PRIDE
~85% raw data
Data types examples
Raw data: …
Processed data: LPISASHSSK…, TTGTTATCCG…, …
Metadata: DNA, Human, Liver, Mitochondria, W. Smith, …
Metadata: Peptide, Mouse, Heart, Nucleus, J. Heinz, …
Data submission
raw data
processed data
metadata
Centralized
database
Data submission - pilot
raw data
processed data
metadata
PID
Data analysis
Data
Analysis
tools
Data
Analysis
tools
Data
Analysis
tools
Computation
?
Cross-site VM Operation - pilot
• Perform analysis via cloud infrastructures and VMs
• Transfer VMs between computing centers to allow researchers
to perform analyses that they could not otherwise do locally
• Supported by 5 NRENs and in collaboration with
Cross-site VM Operation
CSC
EMBL-EBI
University of Groningen
Data Analysis
tools
Computation
Data
Analysis
tools
VM
VM
VM
Chipster: 200GB
NBIC Galaxy: 50GB
GoNL: 60TB
ENA: 3.2PB
1GB lightpath
1GB lightpath
1GB lightpath
Funet
Janet
SURFnet
European ELIXIR Data - "LightPath" (EBI / CSC)
• Aim
• To explore the replication of large scale
(Petabyte scale) archives to remote sites
• To create a separate source of data files for
challenging DataIO projects
• Update:
• Selection of pilot data transfer technology
between EBI and CSC
• Established a dedicated light path between
datacenters in London and Kajaani
• Development of a model for future IO needs in
the life sciences in Europe
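A rough calculation helps show why petabyte-scale replication is treated as a pilot in its own right. The sketch below estimates the ideal transfer time for the 3.2PB ENA figure quoted on the cross-site VM slide. It rests on assumptions that the slides do not state: that "1GB lightpath" means a 1 gigabit per second link, that units are decimal, and that the link is fully used with no protocol overhead. The function name is illustrative only.

```python
# Back-of-the-envelope transfer time for a large archive over a dedicated link.
# Assumptions (not stated on the slides): decimal units (1 PB = 10**15 bytes),
# "1GB lightpath" read as 1 gigabit per second, and a fully utilised link
# with no protocol overhead.

SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the ideal transfer time in days for size_bytes over the link."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


ena_bytes = 3.2e15   # ENA archive size quoted on the slide (3.2PB)
link_bps = 1e9       # assumed 1 Gbit/s lightpath

days = transfer_days(ena_bytes, link_bps)
print(f"{days:.0f} days at 1 Gbit/s")
print(f"{transfer_days(ena_bytes, 10 * link_bps):.0f} days at 10 Gbit/s")
```

Under these assumptions a full copy takes most of a year on a single 1 Gbit/s path, which is consistent with the pilot's focus on continuous replication rather than one-off bulk copies.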
REMS - Resource Entitlement Management System
Principal
investigator
Applicant
Research group
Members of the
application
Metadata on
dataset 1&2
Dataset 1
Dataset 2
DAC 1
Approver
DAC 2
Approver
REMS
Workflow
Reports
Entitlements
IdP
IdP
IdP
SP
1. Apply for access
2. Commit to licence terms
3. Circulate to approver
4. Approve
5. Access
• Access to sensitive data (genomics) granted by a Data Access Committee
• In collaboration with eduGAIN
• Agreements to be applied to other domains: FI-CLARIN & FI-CESSDA
https://tnc2013.terena.org/getfile/421 Starting point: https://remsdemo.csc.fi/
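The five numbered steps in the REMS diagram (apply for access, commit to licence terms, circulate to approver, approve, access) can be read as a strictly ordered workflow. The sketch below models only that ordering. It is not the REMS data model or API; the class, field, and step names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Step(Enum):
    """Workflow steps, numbered as in the slide diagram (names are hypothetical)."""
    APPLIED = 1            # 1. Apply for access
    LICENCE_ACCEPTED = 2   # 2. Commit to licence terms
    CIRCULATED = 3         # 3. Circulate to approver
    APPROVED = 4           # 4. Approve
    ACCESS_GRANTED = 5     # 5. Access


@dataclass
class Application:
    """One access application for one dataset (illustrative only)."""
    applicant: str
    dataset: str
    history: List[Step] = field(default_factory=list)

    def advance(self, step: Step) -> None:
        """Record the next step, rejecting any step taken out of order."""
        expected = len(self.history) + 1
        if step.value != expected:
            raise ValueError(
                f"expected step {expected}, got {step.value} ({step.name})"
            )
        self.history.append(step)

    @property
    def has_access(self) -> bool:
        """True only once the final step has been recorded."""
        return bool(self.history) and self.history[-1] is Step.ACCESS_GRANTED


# Walk one application through all five steps in order.
app = Application(applicant="pi@example.org", dataset="Dataset 1")
for step in Step:
    app.advance(step)
print(app.has_access)
```

Keeping the full history, rather than only the current state, matches the diagram's "Reports" output: every decision taken by a Data Access Committee remains traceable.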
Training
For Big Data to become huge, however, there are still
hurdles to leap. For one thing, the tools to analyse data
are not yet good enough. And people with the
skills to analyse data are scarce and will
become scarcer. By 2018 there will be a “talent
gap” of between 140,000 and 190,000 people, …
Challenges
• Sustain data and services
• Make data and services interoperable
• Necessary to integrate data
• Especially medical, clinical and research
• Data too big to store, exchange & compute? Forthcoming
challenges …
• Data production grows faster than storage
• Cost of data production technologies declines faster than storage
• It takes longer to transfer data than produce the data.
• Privacy, security & access (AAI)
• Training
European Genome-phenome Archive
• EGA is to be a distributed
effort, with archive,
submission, and data
distribution capacity at both
the EBI and CRG
• From the user's point of view, EGA remains one integrated
archive of secure human biomedical research data.
• Search of datasets at either website is "global" across the EGA.
CASE: process for applying for access to
the Nordic Control Database
ELIXIR pilots to address key challenges in
biomedical research:
1. Cloud computing
“Embassy cloud”: Access reference
data in a virtual environment – work
as though you are at EMBL-EBI or
SIB, Switzerland
2. Authentication & Authorisation
Improved methods and processes for
access to clinical data
3. High-Performance Computing
“Lightpath”: Connections for on-demand reference
data to remote HPC centres at EMBL-EBI and CSC
Finland
Minimum metadata
life sciences
Courses Materials
Trainers…
Services
Jobs
• Title
• Description
• Creator
• Publication Date
• Topics
• Audience
• …
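The minimum metadata fields listed above could be captured in a small record type with a completeness check. The sketch below is one possible shape, not a schema defined by ELIXIR: the class name, field types, and the example values are all assumptions, and the elided fields on the slide are deliberately left out.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import List


@dataclass
class TrainingResource:
    """Minimum metadata for a course, material, or similar item (hypothetical)."""
    title: str
    description: str
    creator: str
    publication_date: date
    topics: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]


# Example record with invented values; description and audience are left empty.
record = TrainingResource(
    title="Example course",
    description="",
    creator="Example Trainer",
    publication_date=date(2014, 9, 23),
    topics=["proteomics"],
)
print(record.missing_fields())
```

A check like this lets a portal reject or flag incomplete entries at submission time instead of discovering the gaps when users search.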
Data deposition
Data submission
raw data
processed data
metadata
Centralized
database
Data sharing
The casual approach
'data on my disk and available to anyone who requests it'
Submission to data repositories
Data submissions
Data
repository
Journal
submission
Data
repository
Journal
submission reads
Journal request Curator
Data
repository
Data Management Plan
submission
Data management
+
Data sharing
The casual approach
'data on my disk and available to anyone who requests it'
Submission to data repositories
Will big data affect data deposition?
Data submissions
How much data?
How much available data?

"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 

ELIXIR

  • 1. ELIXIR: European Life Sciences Infrastructure for Biological Information. www.elixir-europe.org. Rafael C Jimenez, ELIXIR CTO. RDA Fourth Plenary Meeting, ELIXIR Bridging Force IG, Tuesday 23 September 2014.
  • 4. Research infrastructures facilitate research. [Diagram labels: Data; ICT (e-infrastructures); LS (life sciences); physical facilities; scientific information; transfer; computation; storage]
  • 5. ELIXIR • European life sciences research infrastructure for biological information to support all life science research • Safeguard data and build sustainable data services
  • 6. ELIXIR Members • Major bioinformatics service providers (~130) participate, supported by 17 EU member states and EMBL • 11 EU states signed the agreement • Czech Republic, Estonia, Denmark, Finland, Israel, Netherlands, Norway, Portugal, Switzerland, Sweden, UK • 6 EU states signed an MoU • Belgium, Greece, France, Italy, Slovenia, Spain
  • 7. ELIXIR node proposals • ELIXIR delivers services through national ELIXIR Nodes building on national strengths and priorities • Example: ELIXIR: The Norway Node • Description: ELIXIR's Norway Node will provide competence and infrastructure building on key areas for Norway, in particular marine resources and medical research. Challenges related to processing and analysis of data from next-generation sequencing and other high-throughput methods are important both to basic research in these areas and to the development of new enterprises. The node will also provide training and support to researchers. • Contact: Professor Inge Jonassen, ELIXIR Norway Node, Computational Biology Unit, Department of Informatics, University of Bergen, Norway, Inge.Jonassen@ii.uib.no • Web address: www.elixir-norway.org • Collaborating organisations • University of Bergen: The coordinating partner of the ELIXIR Norway Node. The main focus is on marine genomics and e-infrastructure. An early deliverable is LiceBase, developed in tight collaboration with the Sea Lice Research Centre. • University of Oslo: Emphasizes biomedical resource provision and analysis, leveraging public resources in integrative statistical genomics, with secure management of person-sensitive data. • Norwegian University of Life Sciences: Comparative fish genomics. Provision of web-based solutions for services, toolboxes, and computational access to these data. • Norwegian University of Science and Technology: Tools and resources for analysing genome data, with focus on gene regulation, non-protein-coding RNAs and epigenetics, but also bacterial genomics. Handling and analysis of data from human biobanks. • University of Tromsø: Tools and pipelines for analyzing metagenomic (and genomic) data, with a particular focus on taxonomic classification and bioprospecting (functional and metabolic potential). • Marine research: The Norwegian ELIXIR Node will provide services and resources toward marine genomics including researchers, government, and industry. The Norwegian Node will offer several integrated packages geared towards large-scale analysis of marine genomic and metagenomic data (e.g. fish genomics and marine bioprospecting). This also includes provision of web-based solutions for services, toolboxes, and computational access to reference data provided by the ELIXIR infrastructure. • Health and biobanks: The Norway Node supports infrastructure for handling and analysis of data for medical research, including human biobanks. Such data may be sensitive, and must be stored with secure access. The node is developing infrastructure for sensitive data. Tools for data analysis are integrated into NeLS, the Norwegian e-infrastructure for life sciences. This provides user-friendly solutions, for example for human re-sequencing data and other genome-scale analyses. • Photo: © Trym Ivar Bergsmo / Samfoto / NTB Scanpix
  • 8. ELIXIR strategic drivers 1. Establish a distributed infrastructure to scale with the challenge of data growth 2. Secure and deliver the core data resources underpinning life science research 3. Provide discoverable tools, services and connectors to drive data access and exploitation 4. Provide robust technical platforms and clouds for secure data access, data exchange and compute 5. Develop and maintain standards for data management, reuse and integration 6. Drive partnerships with user communities and other organisations to ensure high impact 7. Close the computational biology skills gap through a comprehensive training programme for professionals 8. Support innovation in big data biology
  • 9. Areas of interest and programme of work • Compute • Data • Standards • Tools • Training • Domain-specific services • Data interoperability, vocabulary and ontology services • Tools interoperability and Service Registry • Data resources & services • Technical Services • Management and operations • Training
  • 10. Task forces • Working groups to coordinate the ELIXIR technical strategy • Driven by national technical leads • Plan, agree and implement technical strategies • Represent ELIXIR on one specific technical topic • [Diagram labels: Scientific and funding strategies; ELIXIR Programme of work; Task forces; Planning, Agreement, Implementation; Technical strategies; Pilot projects; HoNs; TCG]
  • 11. Task forces, shown against the programme areas (Data interoperability, vocabulary and ontology services; Tools interoperability and Service Registry; Data resources & services; Technical Services; Management and operations; Training) • Interoperability • Service registry • Metrics, monitoring & quality control • Cloud • Storage • AAI • Training portal • E-learning • Communications • Website
  • 12. ELIXIR Pilot Projects 1. ELIXIR Facing Cloud Support and Virtual Machines - with SIB 2. ELIXIR Data IO to pilot the continuous transfer of major archive resources to a remote European location - with CSC, Finland 3. Establishing EGA Distributed authentication - with CSC, Finland 4. Establishing EGA as joint venture - with CRG, Spain 5. Improving links between Human Protein Atlas (HPA) and EMBL-EBI resources 6. BILS-ProteomeXchange integration using EUDAT resources 7. Interoperable controlled-access big data transfer technology for ELIXIR - application to EGA EBI / CRG ELIXIR collaboration and beyond 8. Harmonising Marine Metagenomics pipelines
  • 13. ELIXIR technical activities • ELIXIR Node activities, Task forces, Pilots • Technical activities among different interest groups … • [Diagram labels: ELIXIR Technical activities; BMBESFRIs; e-infrastructures; ELIXIR nodes; Communities]
  • 14. Affinity with RDA groups: Life science • The BioSharing Registry: connecting data policies, standards & databases in life sciences • Wheat Data Interoperability WG • Agriculture Data Interest Group (IGAD) • Marine Data Harmonization IG • Metabolomics • Structural Biology IG • Toxicogenomics Interoperability IG
  • 15. Affinity with RDA groups • Data Description Registry Interoperability • Big Data Analytics IG • Domain Repositories Interest Group • Education and Training on handling of research data • Federated Identity Management • Metadata IG • Preservation e-Infrastructure IG • Service Management IG • Active Data Management Plans • Sustainability of eResearch / Cyberinfrastructure
  • 16. Affinity with RDA groups • Data Citation WG • Data Foundation and Terminology WG • Metadata Standards Directory Working Group • PID Information Types WG • RDA/WDS Publishing Data Bibliometrics WG • RDA/WDS Publishing Data Services WG • RDA/WDS Publishing Data Workflows WG • Repository Audit and Certification DSA–WDS Partnership WG • Long tail of research data IG • PID Interest Group • RDA/WDS Publishing Data IG • Research Data Provenance
  • 19. Data resources in life science • Many • Diverse • Dispersed • ~1800 molecular biology data resources • Source: Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2012. MY Galperin, GR Cochrane, Nucleic Acids Research, 2011
  • 20. Utility of databases [Figure labels: scientific impact; too little information; too many databases; too diverse interfaces]
  • 22. Improving links between distributed European resources • ELIXIR pilot: Interoperability of protein expression resources • The Human Protein Atlas portal is a publicly available database with millions of high-resolution images showing the spatial distribution of proteins in 46 different normal human tissues and 20 different cancer types, as well as 47 different human cell lines.
  • 24. Proteomics data in PRIDE • ~85% raw data
  • 25. Data types examples: raw data, processed data, metadata • Example metadata records: DNA, Human, Liver, Mitochondria, W. Smith, … • Peptide, Mouse, Heart, Nucleus, J. Heinz, … • Example sequence data: LPISASHSSK…, TTGTTATCCG…
  • 26. Data submission [Diagram: raw data, processed data and metadata submitted to a centralized database]
  • 27. Data submission - pilot [Diagram labels: raw data; processed data; metadata; PID]
  • 29. Cross-site VM Operation - pilot • Perform analysis via cloud infrastructures and VMs • Transfer VMs between computing centers to allow researchers to perform analyses that they could not otherwise do locally • Supported by 5 NRENs and in collaboration with
  • 30. Cross-site VM Operation [Diagram: VMs exchanged between CSC, EMBL-EBI and University of Groningen over 1GB lightpaths (Funet, Janet, SURFnet); site labels: data, analysis tools, computation; resources shown: Chipster 200GB, NBIC Galaxy 50GB, GoNL 60TB, ENA 3.2PB]
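
To give a feel for the scale in the cross-site diagram, the following back-of-the-envelope sketch estimates how long the labelled datasets would take to move over a single link. It is not part of the talk: it assumes decimal units, a fully utilised link and no protocol overhead. Because the slide does not say whether "1GB lightpath" means 1 gigabit or 1 gigabyte per second, both readings are computed. The function name `transfer_days` and the `DATASETS` table are illustrative.

```python
# Rough transfer-time estimate for the dataset sizes named on the slide.
# Assumptions (not from the slide): decimal units, 100% sustained link
# utilisation, no protocol overhead.

SECONDS_PER_DAY = 86_400

# Dataset sizes in bytes, as labelled in the diagram.
DATASETS = {
    "NBIC Galaxy": 50e9,   # 50 GB
    "Chipster": 200e9,     # 200 GB
    "GoNL": 60e12,         # 60 TB
    "ENA": 3.2e15,         # 3.2 PB
}


def transfer_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the ideal transfer time in days for a given link speed."""
    return size_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for name, size in DATASETS.items():
        slow = transfer_days(size, 1e9)  # "1GB" read as 1 gigabit per second
        fast = transfer_days(size, 8e9)  # "1GB" read as 1 gigabyte per second
        print(f"{name:12s} {slow:9.3f} days at 1 Gbit/s, {fast:9.3f} days at 1 GB/s")
```

Under these assumptions the 3.2 PB ENA archive needs roughly 296 days at 1 Gbit/s, or about 37 days at 1 GB/s, which is one way to see why moving analysis VMs to the data can be more practical than moving the data.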
  • 31. European ELIXIR Data - "LightPath" (EBI / CSC) • Aim • To explore the replication of large-scale (petabyte-scale) archives to remote sites • To create a separate source of data files for challenging DataIO projects • Update: • Selection of pilot data transfer technology between EBI and CSC • Established a dedicated light path between datacenters in London and Kajaani • Development of a model for future IO needs in the life sciences in Europe
  • 32. REMS - Resource Entitlement Management System • Workflow: 1. Apply for access 2. Commit to licence terms 3. Circulate to approver 4. Approve 5. Access • [Diagram labels: principal investigator (applicant); research group (members of the application); metadata on datasets 1 & 2; Dataset 1 with DAC 1 approver; Dataset 2 with DAC 2 approver; REMS (workflow, reports, entitlements); IdP; SP] • Access to sensitive data (genomics) granted by a Data Access Committee • In collaboration with eduGAIN • Agreements to be applied to other domains: FI-CLARIN & FI-CESSDA • https://tnc2013.terena.org/getfile/421 • Starting point: https://remsdemo.csc.fi/
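
The five numbered steps on the REMS slide can be read as a strictly ordered workflow. The sketch below models that ordering as a small state machine. It is an illustration only and does not reflect the actual REMS software or its API; the class names `Step` and `Application` and the example applicant are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Step(Enum):
    """The five steps from the slide, in order."""
    APPLIED = auto()            # 1. Apply for access
    LICENCE_ACCEPTED = auto()   # 2. Commit to licence terms
    IN_REVIEW = auto()          # 3. Circulate to approver
    APPROVED = auto()           # 4. Approve (Data Access Committee)
    ACCESS_GRANTED = auto()     # 5. Access


# Enum members iterate in definition order, which is the workflow order.
ORDER = list(Step)


@dataclass
class Application:
    """Hypothetical access application that tracks its step history."""
    applicant: str
    dataset: str
    history: list = field(default_factory=lambda: [Step.APPLIED])

    @property
    def state(self) -> Step:
        return self.history[-1]

    def advance(self, to: Step) -> None:
        """Move to the next step; skipping or repeating a step is rejected."""
        next_index = ORDER.index(self.state) + 1
        if next_index >= len(ORDER) or ORDER[next_index] is not to:
            raise ValueError(f"cannot go from {self.state.name} to {to.name}")
        self.history.append(to)


if __name__ == "__main__":
    app = Application("applicant@example.org", "Dataset 1")
    for step in ORDER[1:]:
        app.advance(step)
    print(app.state.name)
```

Keeping the full history rather than only the current state makes it straightforward to produce the kind of reports the diagram attributes to REMS, although how the real system stores this is not stated in the slides.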
  • 33. Training • For Big Data to become huge, however, there are still hurdles to leap. For one thing, the tools to analyse data are not yet good enough. And people with the skills to analyse data are scarce and will become scarcer. By 2018 there will be a “talent gap” of between 140,000 and 190,000 people, …
  • 34. Challenges • Sustain data and services • Make data and services interoperable • Necessary to integrate data • Especially medical, clinical and research • Data too big to store, exchange & compute? Forthcoming challenges … • Data production grows faster than storage • Cost of data production technologies declines faster than storage • It takes longer to transfer data than to produce the data • Privacy, security & access (AAI) • Training
  • 35. European Genome-phenome Archive • EGA is to be a distributed effort with archive, submission, and data distribution capacity at both the EBI and CRG • From the user's point of view, EGA remains one integrated archive of secure human biomedical research data. • Search of datasets at either website is "global" across the EGA.
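
The idea of one archive presented from two locations, with a search that is "global" across both, can be sketched as a thin federation layer. This is a toy model under stated assumptions, not a description of how EGA is implemented: the classes `ArchiveSite` and `FederatedArchive`, the accession strings and the dataset titles are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    accession: str  # hypothetical identifier, not a real EGA accession
    title: str


class ArchiveSite:
    """One archive location holding a subset of the datasets."""

    def __init__(self, name: str, datasets: list[Dataset]):
        self.name = name
        self._datasets = {d.accession: d for d in datasets}

    def search(self, term: str) -> list[Dataset]:
        term = term.lower()
        return [d for d in self._datasets.values() if term in d.title.lower()]

    def holds(self, accession: str) -> bool:
        return accession in self._datasets


class FederatedArchive:
    """Presents several sites as one archive."""

    def __init__(self, sites: list[ArchiveSite]):
        self.sites = sites

    def search(self, term: str) -> list[Dataset]:
        # Query every site and de-duplicate mirrored datasets by accession.
        found: dict[str, Dataset] = {}
        for site in self.sites:
            for dataset in site.search(term):
                found.setdefault(dataset.accession, dataset)
        return sorted(found.values(), key=lambda d: d.accession)

    def locate(self, accession: str) -> str:
        # Return the name of the first site that holds the dataset.
        for site in self.sites:
            if site.holds(accession):
                return site.name
        raise KeyError(accession)


if __name__ == "__main__":
    mirrored = Dataset("DS-2", "Example cancer cohort B")
    ebi = ArchiveSite("EBI", [Dataset("DS-1", "Example cancer cohort A"), mirrored])
    crg = ArchiveSite("CRG", [mirrored, Dataset("DS-3", "Example diabetes cohort")])
    archive = FederatedArchive([ebi, crg])
    print([d.accession for d in archive.search("cancer")])
    print(archive.locate("DS-3"))
```

De-duplicating by accession is one simple way to keep a mirrored dataset from appearing twice in a global search; the real service may handle this differently.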
  • 36. CASE: process for applying access to the Nordic Control Database
  • 37. ELIXIR pilots to address key challenges in biomedical research: 1. Cloud computing, “Embassy cloud”: Access reference data in a virtual environment - work as though you are at EMBL-EBI or SIB, Switzerland 2. Authentication & Authorisation: Improved methods and processes for access to clinical data 3. High-Performance Computing, “Lightpath”: Connections for on-demand reference data to remote HPC centres at EMBL-EBI and CSC Finland
  • 38. Minimum metadata for life sciences resources (Courses, Materials, Trainers, …, Services, Jobs) • Title • Description • Creator • Publication Date • Topics • Audience • …
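
The minimum metadata fields listed on this slide map naturally onto a small record type. The sketch below is one possible encoding, offered as an assumption rather than an ELIXIR schema: the class name `TrainingResource`, the field types and the example values are illustrative, and the slide's trailing "…" (further fields) is deliberately not filled in.

```python
from dataclasses import dataclass, field, fields
from datetime import date


@dataclass
class TrainingResource:
    """Hypothetical record carrying the six fields named on the slide."""
    title: str
    description: str
    creator: str
    publication_date: date
    topics: list = field(default_factory=list)
    audience: list = field(default_factory=list)

    def missing_fields(self) -> list:
        """Names of fields left empty (empty string or empty list)."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


if __name__ == "__main__":
    course = TrainingResource(
        title="Example course",
        description="",
        creator="Example trainer",
        publication_date=date(2014, 9, 23),
        topics=["metagenomics"],
    )
    print(course.missing_fields())  # ['description', 'audience']
```

A completeness check like `missing_fields` is a cheap way for a registry or training portal to flag records that do not yet meet the agreed minimum.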
  • 40. Data submission [Diagram: raw data, processed data and metadata submitted to a centralized database]
  • 41. Data sharing • The casual approach: 'data on my disk and available to anyone who requests it' • Submission to data repositories
  • 42. Data submissions [Diagram labels: data repository; journal submission; reads; journal request; curator; Data Management Plan submission; data management]
  • 43. Data sharing • The casual approach: 'data on my disk and available to anyone who requests it' • Submission to data repositories • Will big data affect data deposition?
  • 44. Data submissions How much data? How much available data?

Editor's Notes

  1. Launched in 2014
  2. Creating a robust infrastructure for biological information is a bigger task than any individual organisation or nation can take on alone
  3. The programme of work defines the ELIXIR Strategic plan for the next 4 years
  4. Programme of work: 5-year plan. Work streams: strategy focused on generic topics. Task forces: define technical implementation of specific tasks.
  5. The programme of work defines the ELIXIR Strategic plan for the next 4 years
  6. Test beds for integration of ELIXIR services. Pilot Projects are being staffed by seconded staff from EMBL-EBI.
  7. Cost of storage more than cost of production
  8. Access to sensitive data granted by a Data Access Committee (DAC) Sensitive data
  9. Be knowledgeable about technology, data and services in life sciences, and knowledgeable in ICT
  10. - EGA is to be a distributed effort with archive, submission, and data distribution capacity at both the EBI and CRG. - From the user's point of view, EGA remains one integrated archive of secure human biomedical research data. Search of datasets at either website is "global" across the EGA. - EBI and CRG teams are working together to improve search functions based on disease and technology platform type and - Archived datasets may exist at one or both locations, providing load balancing and backup in cases where data is mirrored. - Data can be discovered and requested from either location, and will be streamed from the appropriate archive in a way transparent to users.
  11. necessary to understand, develop or reproduce published research
  12. Not all the data make it to the public repositories
  13. necessary to understand, develop or reproduce published research