Globus in European Life Science

  1. Globus in European Life-Science
     Steven Newhouse, Head of Technical Services, EMBL-EBI
     GlobusWorld 2019
  2. The European Molecular Biology Laboratory
     • 6 sites in Europe
       • Heidelberg, Germany: Main Laboratory
       • Barcelona, Spain: Tissue Biology, Disease Modeling
       • Hinxton, Cambridge, UK: Bioinformatics
       • Monterotondo, Rome, Italy: Mouse Biology
       • Grenoble, France: Structural Biology
       • Hamburg, Germany: Structural Biology
     • >1600 personnel
     • 80+ nationalities
  3. What is EMBL-EBI?
     • Europe’s home for biological data services, research and training
     • A trusted data provider for the life sciences
     • International: 600 members of staff from 60 nations
     OUR MISSION (1/5): To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
  4. Data resources at EMBL-EBI
     • Literature services: BioStudies, Europe PMC
     • Chemistry services: ChEBI, ChEMBL, MetaboLights, SureChEMBL
     • Macromolecular & cellular structure: Protein Data Bank in Europe (PDBe), PDBe-KB, Electron Microscopy Data Bank, EMPIAR
     • Molecular atlas: ArrayExpress, Expression Atlas, PRIDE
     • Proteins & protein families: MGnify, InterPro, Pfam, Rfam, RNAcentral, UniProt
     • Genes, genomes & variation: Ensembl, Ensembl Genomes, GWAS Catalog
     • Molecular systems: BioModels, IntAct, OmicsDI, Reactome
     • Molecular archives: European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive, Experimental Factor Ontology, BioSamples, Mouse Resources
     • Cross-domain resources
  5. What we do: Data In, Validate, Correlate, Data Out
     • Volume: ~2PB/month
       • FTP: 56%
       • Aspera: 42%
       • Globus: 2%
     • Analysis Capacity:
       • HTC: 28,500 job slots
       • HPC: 6,600 job slots
       • Cloud: 6,000 vCPUs
       • VMware: 1,500 cores
     • Raw Storage (241PB):
       • Object Store: 103PB
       • NAS: 81PB
       • HPC Storage: 27PB
       • Tape: 30PB
     • ~38 million requests to EMBL-EBI websites every day
     • EMBL-EBI delivered 140 million jobs to its users in 2017
     • Requests from 3.3 million unique hosts to the EMBL-EBI websites each month
     • ~1PB/month
  6. ELIXIR – Research Infrastructure for Life Science
     • Tools: Services & connectors to drive access and exploitation
     • Standards: Integration and interoperability of data and services
     • Training: Professional skills for managing and exploiting data
     • Compute: Access, Exchange & Compute on sensitive data
     • Data: Sustain core data resources
  7. Current Integration
     • ELIXIR AAI & EMBL-EBI IdP
       • Consistent ID provision across Europe and ELIXIR services
       • Integrated into Globus Transfer
     • Data Transfers
       • From Data Resources (e.g. EMBL-EBI) to a researcher’s desktop
       • From Data Resources (e.g. EMBL-EBI) to a cloud provider
       • From a researcher’s institute to a cloud provider
  8. Planned Overhaul of Transfer Infrastructure at EMBL-EBI
     • Downloads
       • Would like to move away from Aspera
       • Performance w.r.t. Globus Transfer?
       • Would like to increase use of Globus Transfer
       • Understanding the barriers to adoption? Technical? Political?
     • Uploads
       • Moving towards an integrated upload infrastructure: common AAI & file space
       • Explore the use of Globus Transfer: ease of use, installation, AAI & performance
       • Current prototype uses
  9. Future: Accessing Life-Science Data from Object Store
     • FIRE: FIle REplication Service
       • In existence for over 10 years
       • Grown to over 20PB
     • Evolution of technologies
       • Previous: Distinct NFS systems
       • Now: Distributed internal Object Store & tape
       • Future: Distributed internal Object Store & cloud
     • Challenge: Very long tail of data access patterns
       • Need ‘shopping cart’ model to retrieve data from cold storage and deliver to endpoint
  10. Future: Moving Data within a Hybrid Ecosystem
     • European Open Science Cloud (EOSC)
       • Federation of cloud resources (a.k.a. grid)
       • Integration alongside commercial cloud resources
       • More broadly the services needed for the research life-cycle
     • ELIXIR Cloud Resources
       • National & domain cloud resources will probably appear within EOSC
     • EMBL-EBI Cloud Resources
       • For our own purposes… need to move data from internal to cloud resources
       • And for the community!
  11. Summary
     • Some use within EMBL-EBI for edge downloads
     • Scope for more use and to integrate into uploads
     • Need reliable transfer to underpin movement of data sets
       • To users, service providers and public clouds
     • Contact today:
       • Steven Newhouse (
       • Andrea Cristofori (
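
The protocol shares on slide 5 are easier to compare as absolute volumes. The sketch below converts the quoted percentages of the ~2PB/month figure into approximate terabytes per month and checks that the storage tiers add up to the stated 241PB raw total. It assumes decimal units (1PB = 1000TB), which the slide does not specify, and the function and variable names are illustrative only.

```python
# Approximate figures as quoted on slide 5.
MONTHLY_VOLUME_PB = 2.0
PROTOCOL_SHARE_PERCENT = {"FTP": 56, "Aspera": 42, "Globus": 2}
RAW_STORAGE_PB = {"Object Store": 103, "NAS": 81, "HPC Storage": 27, "Tape": 30}

TB_PER_PB = 1000  # assumption: decimal units, not stated on the slide


def monthly_volume_tb(total_pb, share_percent):
    """Split a monthly total (in PB) into per-protocol volumes (in TB)."""
    return {
        name: total_pb * TB_PER_PB * pct / 100
        for name, pct in share_percent.items()
    }


volumes = monthly_volume_tb(MONTHLY_VOLUME_PB, PROTOCOL_SHARE_PERCENT)
for name, tb in volumes.items():
    print(f"{name}: ~{tb:.0f} TB/month")

raw_total_pb = sum(RAW_STORAGE_PB.values())
print(f"Raw storage total: {raw_total_pb} PB")
```

Under these assumptions the 2% Globus share corresponds to roughly 40TB per month, against roughly 840TB for Aspera.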
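
The 'shopping cart' model mentioned on slide 9 could look something like the following minimal sketch: a user collects object identifiers, and a single checkout recalls them from cold storage and hands each one to a delivery function for the chosen endpoint. This is not the FIRE design. The class, the in-memory dictionary standing in for tape, the `deliver` callback and the object IDs are all hypothetical, chosen only to show the batching idea.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalCart:
    """Hypothetical cart: gathers object IDs, then recalls and delivers them as one batch."""
    endpoint: str
    items: list = field(default_factory=list)

    def add(self, object_id: str) -> None:
        # Ignore duplicates so each object is recalled from cold storage once.
        if object_id not in self.items:
            self.items.append(object_id)

    def checkout(self, cold_store: dict, deliver) -> list:
        """Recall every item, pass it to `deliver`, and return the IDs that were not found."""
        missing = []
        for object_id in self.items:
            data = cold_store.get(object_id)
            if data is None:
                missing.append(object_id)
                continue
            deliver(self.endpoint, object_id, data)
        self.items.clear()
        return missing


# Usage with stand-ins: a dict plays the role of cold storage, a list records deliveries.
cold_store = {"run-001": b"ACGT", "run-002": b"TTGA"}
delivered = []

cart = RetrievalCart(endpoint="example-endpoint")
cart.add("run-001")
cart.add("run-404")  # not present in cold storage
cart.add("run-001")  # duplicate, ignored
missing = cart.checkout(
    cold_store,
    lambda endpoint, object_id, data: delivered.append((endpoint, object_id, len(data))),
)
```

Batching the recall is the point of the cart: with a very long tail of access patterns, grouping requests lets cold storage be read once per checkout rather than once per file.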