Variation and Assembly Resources at EMBL-EBI
Laura Clarke
Variant Discovery and Genome Assembly
Wednesday November 1st
Genome Informatics
EVA
Variation and Assembly Resources at EMBL-EBI
PDX Finder
Oxford Nanopore
MARC ReadUntil
BlobToolKit
GWAS Catalog
Archives
Looking back
• 1982 EMBL and GenBank established
• 1982 Data sharing and standardization collaboration put in place
• 1983 first full phage genome published
• First made public in November 1982
• Enterobacteria phage T7
https://www.ebi.ac.uk/ena/data/view/V01146
Credit: Ana Toribio
Assembly archiving today
• UI and API submission interfaces
• Reads and Assemblies accepted
• Chlamydia trachomatis A2497 serovar A
Hadfield J et al. Comprehensive global genome dynamics of Chlamydia trachomatis show ancient diversification followed by contemporary mixing and recent lineage expansion. Genome Res. 2017 Jul;27(7):1220-1229. doi: 10.1101/gr.212647.116.
563 full genomes (455 novel)
https://www.ebi.ac.uk/ena/data/view/FM872306
Credit: Ana Toribio
Accessing managed human data
EGA By the numbers
● 1,698 studies
● 3,591 datasets
● 777 data providers
● >10,000 requestors
● EMBL-EBI and CRG
By volume
● 4.7 Petabytes
https://ega-archive.org/
Credit: Thomas Keane
Accessing managed human data
EGA ~2015
Credit: Thomas Keane
Accessing managed human data
Looking Forward
Credit: Thomas Keane
HTS Get
What is it?
• An efficient non-file based API interface for accessing read data
• Separate backend storage implementation from interface
• A bridge from existing file formats to API client/server model
Progress
• Launch of v1.0 at GA4GH plenary October 2017!
• Demonstrations of integration with AAI+secure transfer
http://samtools.github.io/hts-specs/
Credit: Thomas Keane
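The client side of the protocol described on this slide can be sketched roughly as follows. This is a minimal illustration, not the EGA client: the base URL and read ID in the usage note are hypothetical, and the ticket layout (an `htsget` object holding a `urls` list whose entries carry a `url` and optional `headers`) follows the published specification as I understand it. The client first asks for a "ticket" covering a region, then concatenates the blocks the ticket points at.

```python
import base64
import json
from urllib.parse import urlencode, unquote_to_bytes
from urllib.request import Request, urlopen


def build_ticket_url(base_url, read_id, reference_name, start, end, fmt="BAM"):
    """Build the URL that asks the server for a ticket covering one region."""
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    return f"{base_url}/reads/{read_id}?{query}"


def fetch_block(entry):
    """Return the bytes for one ticket entry (inline data: URI or remote URL)."""
    url = entry["url"]
    if url.startswith("data:"):
        # Inline block: payload follows the first comma, optionally base64-encoded.
        header, _, payload = url.partition(",")
        if header.endswith(";base64"):
            return base64.b64decode(payload)
        return unquote_to_bytes(payload)
    # Remote block: pass along any headers the ticket supplies
    # (for example an Authorization or Range header).
    request = Request(url, headers=entry.get("headers", {}))
    with urlopen(request) as response:
        return response.read()


def assemble(ticket_json):
    """Concatenate every block listed in the ticket, in order."""
    ticket = json.loads(ticket_json)["htsget"]
    return b"".join(fetch_block(entry) for entry in ticket["urls"])
```

Calling `build_ticket_url("https://htsget.example.org", "sample1", "chr22", 100, 200)` would produce the request for a slice of chr22; the server's JSON reply is then passed to `assemble` to obtain a valid file fragment. Because the storage layout is hidden behind the ticket, the backend can change without affecting clients.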
Beacon project
• Allele based genotype queries
• Each beacon determines its own access policy
• Data returned can be determined depending on tier
• Allele frequency
• Data set
• Population
• Sample
• Phenotype
• Anything else?
• EGA Beacon has 3 tiers of access
• Public
• Registered
• Controlled
https://beacon-network.org/#/
Credit: Thomas Keane
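The idea of tiered responses can be illustrated with a small in-memory sketch. This is not the Beacon API or the EGA implementation: the tier names come from the slide, while the field names, the mapping of fields to tiers, and the `AlleleQuery` and `answer` names are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Tier names follow the slide; their ordering here is an assumption.
TIER_RANK = {"public": 0, "registered": 1, "controlled": 2}

# Minimum tier required to see each field (hypothetical policy).
FIELD_TIER = {
    "exists": "public",
    "allele_frequency": "registered",
    "dataset": "registered",
    "sample": "controlled",
    "phenotype": "controlled",
}


@dataclass(frozen=True)
class AlleleQuery:
    """An allele-level question: is this change at this position known?"""
    reference_name: str
    start: int
    reference_bases: str
    alternate_bases: str


def answer(query, records, tier):
    """Return only the fields that the caller's tier is allowed to see."""
    record = records.get(query)
    full = {"exists": record is not None, **(record or {})}
    rank = TIER_RANK[tier]
    return {k: v for k, v in full.items() if TIER_RANK[FIELD_TIER[k]] <= rank}
```

With this policy a public caller learns only whether the allele exists, a registered caller also sees the frequency and dataset, and a controlled caller sees every field. A real beacon would set its own policy, as the slide notes.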
European Variation Archive
• European Variation Archive
• Established in 2014
• Accepts VCF submissions (no archive specific format)
• Can link to ENA read submissions
• Taking over non-human RS assignment from dbSNP
Credit: Cristina Yenyxe Gonzalez
https://www.ebi.ac.uk/eva/
Non-human RS number assignment and releases
• EVA to assign rs (locus) and ss (submission) numbers for non-human variants
• Existing accessions will remain in use
• Continues rolling release of variants as submitted
• Bi-annual merging of submitted variants into loci
• Always connected to existing rs numbers on search
• Per species VCFs released
• API and streaming access available
• EVA continues to broker human variants to dbSNP
Credit: Cristina Yenyxe Gonzalez
https://www.ebi.ac.uk/eva/
VCF specification and validation
• Maintained by GA4GH file formats group
• EVA validates against official specification
• www.github.com/ebivariation/vcf-validator
• Proposal in place to improve SV structure in VCFs
• Maintainers of variation archives
• Structural variation caller methods developers
• Pull request on https://github.com/samtools/hts-specs/pull/231
• Please give feedback
Credit: Cristina Yenyxe Gonzalez
https://www.github.com/samtools/hts-specs
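To give a feel for what validating against the specification involves, here is a deliberately small structural check. It is not the vcf-validator tool and covers only a handful of rules that I am confident the VCF specification states: the leading `##fileformat` line, the eight fixed header columns, an integer POS, and REF bases drawn from A, C, G, T, N. The function name `check_vcf` is hypothetical.

```python
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf(lines):
    """Return a list of (line_number, message) problems; empty means none found."""
    problems = []
    header_seen = False
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if number == 1 and not line.startswith("##fileformat=VCFv"):
            problems.append((number, "first line must declare ##fileformat"))
        if line.startswith("##"):
            continue  # meta-information lines are not inspected further here
        fields = line.split("\t")
        if line.startswith("#"):
            header_seen = True
            if fields[:8] != FIXED_COLUMNS:
                problems.append((number, "header must start with the 8 fixed columns"))
            continue
        if not header_seen:
            problems.append((number, "data line before #CHROM header"))
        if len(fields) < 8:
            problems.append((number, "fewer than 8 tab-separated fields"))
            continue
        if not fields[1].isdigit():
            problems.append((number, "POS is not a non-negative integer"))
        if not fields[3] or set(fields[3].upper()) - set("ACGTN"):
            problems.append((number, "REF contains characters other than A, C, G, T, N"))
    return problems
```

A submission pipeline could run a check like this before the full validator, so that obviously malformed files are rejected early with a line number attached.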
Other Resources
[Figure labels: gene diagram from ATG to poly-A tail (AAAAAAA), annotated with Regulatory, 5’ UTR, Coding (missense), Coding (synonymous), Splice site, Intronic, 3’ UTR, 3’ Downstream]
http://www.ensembl.org/info/genome/variation/sources_documentation.html
Credit: Emily Perry
Phenotype and disease data can be
searched by ontology term to retrieve
aggregated results.
Improved allele frequency
views with more data
available
Credit: Sarah Hunt
The GWAS Catalog
• Public catalog of Genome Wide Association Studies
• Curated from the literature
• Now with summary statistics
• > 3000 publications
• > 44,000 variant-trait associations
https://www.ebi.ac.uk/gwas/downloads/summary-statistics
Credit: Fiona Cunningham
Turning pathogen data collection into actionable information
• Risk-assessment models and risk-based sampling
• From samples and metadata to comparable data
• From comparable data to actionable information
• Pathogen identification and characterization
• Outbreak detection
• Outbreak investigation
• Outbreak prediction
• Building a common data platform and analysis framework
• Risk communication
PDX Finder
[Figure labels: 2010, 2011, 2016]
Credit: Terry Meehan
PDX Finder
Build a comprehensive global catalogue of PDX models and their data available
for researchers
www.pdxfinder.org
JAX and EMBL-EBI co-developed resource
Carol Bult – Helen Parkinson/Terry Meehan
NCI funding
EC EuroPDX
Credit: Terry Meehan
Questions
We are almost always hiring
https://www.ebi.ac.uk/about/jobs


Editor's Notes

  1. Hello, I am Laura Clarke and I would like to thank the organisers for inviting me and Jared to lead this session and for giving me the opportunity to speak.
  2. Before we get to the other talks about methods and approaches for assembly, variant discovery and annotation, I want to give you a whistle-stop tour of some of the EBI’s variation and assembly resources. We have been running genomic archives for about 35 years, and over that time we have added resources that present this collected data together and support large-scale data generation projects, ensuring their efforts are useful to the whole community.
  3. To start with, EBI has three genomic archives: the European Nucleotide Archive (ENA), the European Genome-phenome Archive (EGA) and the European Variation Archive (EVA).
  4. EMBL-Bank was founded in 1982, a month before GenBank. Data sharing and standardization were rapidly established to ensure that data submitted to either archive was available to users of both. In those days this was all distributed in printed books, before moving to CD and finally to the internet. One of the first genomes sequenced and submitted to EMBL-Bank was the T7 phage, with a pre-publication submission in late 1982 followed by publication in the Journal of Molecular Biology in 1983. You can still access this genome today from the ENA; it was last updated in 2004.
  5. These days many more genomes from across all clades of life are being sequenced. Here is a Chlamydia genome, part of a publication assessing the global diversity of one of the more prevalent sexually transmitted bacteria, which remains poorly understood because it is difficult to culture. The study was able to demonstrate that the diversity in the genome arose relatively recently, over a few thousand years rather than the millions previously thought.
  6. In a world where we sequence more and more humans to support biomedical research, we can no longer openly distribute all this data. The European Genome-phenome Archive was established in 2008 alongside dbGaP to provide a managed access solution, enabling scientists to get data that was previously impossible to access. EBI was joined by CRG in 2012 to maintain and extend the resource, and today there are more than 3,000 datasets and more than 10,000 users.
  7. While the EGA has made it possible to access this data, in the age of tens of thousands of high-depth genomes the process is increasingly cumbersome, both to get permission to access the data and to retrieve it once you have permission.
  8. Over the course of the last year and moving forward, the EGA is deploying new solutions to make it much easier both to put data into and to get data out of the EGA (once you have permission), improving the tools available for granting users permission, and making it possible for local deployments of the EGA to be set up, so that nations which don’t allow genetic material to leave their borders can still take advantage of this data sharing technology. The driving reason for this is that human genetic data is moving towards federation, especially as research interfaces with national healthcare (where data is often subject to jurisdictional restrictions or has higher local data security requirements).
  9. One new tool, released at ASHG a couple of weeks ago, is HTS Get. This allows streaming of data from secure storage using authentication, enabling users both to access this data in a suitable cloud environment and to get specific genomic regions of these files, rather than needing to download and decrypt the whole thing when you are only interested in a piece of chr22. Currently HTS Get supports BAM and CRAM, and VCF will be added in the near future.
  10. For allele-level variant querying, the EGA already hosts beacons, where you can query by allele to find out whether the beacon contains it. Depending on the beacon, different types of data can be returned and different authentication is required before access. EGA has three tiers of access: public (anything which is openly consented), registered (where the user needs to provide an email address) and controlled (which needs DAC permission). And this leads us nicely on to the European Variation Archive.
  11. The most recently established of the EBI genomic archives is the European Variation Archive, set up in 2014. Originally it archived dataset and publication VCF files (getting the associated reads into ENA where possible) and then brokered variants to dbSNP for RS numbers, but as variation discovery continues to increase, the EVA is taking over locus-level accessioning from dbSNP for all non-human variants.
  12. The existing accessions will remain in use and will be imported into the EVA systems over the course of the next year. No new namespace will be invented. Study-level VCFs will continue to be released continuously and will be merged into locus-level rs accessions on a biannual basis.
  13. VCF isn’t many people’s favourite format. It is maintained by the GA4GH file formats group, and the EVA validates against the official specification. There are already proposals in place for SV representation, and the group is very keen to get feedback. Who are the maintainers? Cristina Yenyxe Gonzalez, David Roazen and Petr Danecek.
  14. Now we move on to our other resources supporting genome and variation data at EBI. These resources pull together data from the archives, other projects and the publication record.
  15. I can’t talk about EBI’s resources for variation without mentioning Ensembl Variation, which holds data for many different species and puts this data into the context of all the other annotation Ensembl holds, allowing you to see genotypes, LD and allele frequencies and to discover variants by phenotype associations.
  16. Recently the phenotype and disease searches have been improved, and there are better views with more allele frequencies on the variant pages. Do go and see Will McLaren’s poster about their improved REST query service.
  17. We also have the GWAS Catalog, which curates genome-wide association studies from the literature. Over the last year they have started adding summary statistics for these studies; about 1.5% of studies currently have them.
  18. We run data coordination efforts for many different projects. We support the COMPARE project, an EC-funded effort to improve biosurveillance. The ENA archives the assemblies and provides an analysis platform to support rapid analysis of the collected data.
  19. Finally, I want to introduce a very new project which only started in the last year. Patient-derived xenograft (PDX) mice have been increasing in usage over the past five years, but it is currently challenging to find all the different models and the data associated with them. These PDX mice can help assess cancer treatments in a trial setting and in future may be useful in a patient context, for planning treatment.
  20. Both the NCI and the European Commission are funding the EBI together with the JAX labs to build a global catalog of these models and their associated data to ensure the whole community can benefit from the models which are being created.
  21. Thank you very much for listening to me. I have spoken about the work of many people at the institute today. Do remember the EBI is a great place to work and we are always hiring.