SlideShare a Scribd company logo
1 of 18
Research Computing Storage at
The Francis Crick Institute
Steve Hindmarsh
Head of Scientific Computing
steve.hindmarsh@crick.ac.uk
Michael Holliday
Senior HPC & Research Data Systems Engineer
michael.holliday@crick.ac.uk
DDN User Group SC18
The Francis Crick Institute
• A biomedical discovery institute dedicated to
understanding the fundamental biology underlying
health and disease.
• Founded in 2015 with the merger of two London
research institutes from the Medical Research Council
and Cancer Research UK into the Crick
• Supported by our founding partners:
3
“Discovery without boundaries”
• Core research areas:
• Growth and development
• Health and ageing
• Human biology
• Cancer
• Immune system
• Infectious disease
• Neuroscience
Multi-disciplinary approach: biology, physics,
chemistry, bioinformatics, maths,
engineering…
4
The Crick in numbers
• 1300 Scientists across 130 Labs, Research Groups &
Science Technology Platforms
• 300 Operations Staff
• Over 30,000 pieces of Scientific Equipment
• 13 Floors, 1500 Rooms
• 2 National / International facilities
• 1 Nobel Prize winner!
5
Scientific instruments
• Cryo-Electron microscopes
• Electron microscopes
• Light microscopes
• DNA/RNA sequencers
• Mass spectrometers
• Nuclear Magnetic Resonance (NMR) spectroscopes
• Etc.
All produce large volumes of data which needs to be captured, stored
and analysed.
6
DDN at the Crick
DDN underpins our research computing platform providing:
• Scalability – currently 10 PB and filling up fast!
• Performance
• Multiple and varied workloads
• Resilience
• Manageability
So far it has stood up well to all the workloads our scientists can throw at it. New
challenges include:
• Unprecedented data growth
• Machine/Deep Learning/AI – Nvidia V100 GPU cluster coming soon
7
CAMP – Crick Data Analysis and Management Platform –
Where did we come from:
• 4 Isilon Clusters with around 4PB of data going back 40 years
• Numerous NAS Boxes, Hard drives, USBs etc (we’re still finding out
about them now…)
• Storage from legacy institutes were managed in very different ways,
and had their own Domains and Permissions
• Data replicated in multiple locations but it wasn’t always clear
CAMP – Crick Data Analysis and Management Platform–
Where we started at the Crick:
CAMP
(3PB)
INGEST
(1PB)
GENERAL
(500TB)
Mill Hill LIF
GARLIC
HPC Cluster
(200 Nodes)
AFM
(IW)
AFM
(IW)
AFM
(RO)
AFM
(RO)
Remote Mount Remote Mount
Remote Mount
CAMP – Crick Data Analysis and Management Platform –
• FDR IB Fabric
• 2 Level Fabric
• CAMP
• 2 DDN GS12K Systems
• 20 Enclosures
• INGEST, General, LIF, Millhill:
• Each has 1 DDN GS7K
• 1 or 2 Enclosures
Problems we encountered…
• AFM
• Migration of legacy data
• Data, More Data & Even More Data….
• Instruments
• Labs adapting to the new systems
13
AFM
• INGEST and General were set up to be caches of CAMP storage
• At one point INGEST had 130 AFM links to CAMP
• Delays to syncing were noticeable to scientists, particularly those using the HPC
cluster
• Permissions were not syncing correctly between the two
• Cache was out of sync with home and unable to recover
• Used Independent Writer Mode
14
Data, More Data & Even More Data….
• Our Labs and STPs are producing more
data at faster and faster rates
• 1 Titan Cryo-EM Microscope can produce
up to 1PB data a year (we have 3!).
• Currently ½ Billion Files most only Kb
• Covers pretty much every file type
• CAMP has been Expanded from 3PB to
10PB
• Data needs to be kept for 10 years
• Archive facility is in progress!
16
Where we are working towards:
CAMP
(9PB)
INGEST
(1-2PB)
Melchett
(500TB)
BlackAdder
(testing) GARLIC
HPC Cluster
(400 Nodes)
Spectrum Protect
Backup (14PB)
Archive (TBD)
AFM
(single writer)
Remote Mount
HPC
Workstations
AFM
(single writer)
Questions?
20
Research Computing Storage at the Francis Crick Institute

More Related Content

Similar to Research Computing Storage at the Francis Crick Institute

The Cambridge Research Computing Service
The Cambridge Research Computing ServiceThe Cambridge Research Computing Service
The Cambridge Research Computing Serviceinside-BigData.com
 
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...thomasrconnor
 
Electron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBICElectron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBICJisc
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 
Advancing Research at London's Global University
Advancing Research at London's Global UniversityAdvancing Research at London's Global University
Advancing Research at London's Global Universityinside-BigData.com
 
Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)Globus
 
Green Shoots: Research Data Management Pilot at Imperial College London
Green Shoots:Research Data Management Pilot at Imperial College LondonGreen Shoots:Research Data Management Pilot at Imperial College London
Green Shoots: Research Data Management Pilot at Imperial College LondonTorsten Reimer
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big DiscoveriesGovnet Events
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light SourcesIan Foster
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Spark Summit
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...BigData_Europe
 
ELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesRafael C. Jimenez
 
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak świat
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak światPLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak świat
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak światPROIDEA
 
The Pacific Research Platform:a Science-Driven Big-Data Freeway System
The Pacific Research Platform:a Science-Driven Big-Data Freeway SystemThe Pacific Research Platform:a Science-Driven Big-Data Freeway System
The Pacific Research Platform:a Science-Driven Big-Data Freeway SystemLarry Smarr
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintrothomasrconnor
 
Shared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchShared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchMartin Hamilton
 
2015 04 bio it world
2015 04 bio it world2015 04 bio it world
2015 04 bio it worldChris Dwan
 

Similar to Research Computing Storage at the Francis Crick Institute (20)

Hybrid Cloud for CERN
Hybrid Cloud for CERNHybrid Cloud for CERN
Hybrid Cloud for CERN
 
The Cambridge Research Computing Service
The Cambridge Research Computing ServiceThe Cambridge Research Computing Service
The Cambridge Research Computing Service
 
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...
CLIMB talk in the Virtual Laboratories session at the RCUK Cloud Working Grou...
 
Electron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBICElectron Microscopy Between OPIC, Oxford and eBIC
Electron Microscopy Between OPIC, Oxford and eBIC
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of Science
 
Advancing Research at London's Global University
Advancing Research at London's Global UniversityAdvancing Research at London's Global University
Advancing Research at London's Global University
 
De gaynsrdec overview_11jan16
De gaynsrdec overview_11jan16De gaynsrdec overview_11jan16
De gaynsrdec overview_11jan16
 
Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)
 
Hybrid Cloud for CERN
Hybrid Cloud for CERN Hybrid Cloud for CERN
Hybrid Cloud for CERN
 
Green Shoots: Research Data Management Pilot at Imperial College London
Green Shoots:Research Data Management Pilot at Imperial College LondonGreen Shoots:Research Data Management Pilot at Imperial College London
Green Shoots: Research Data Management Pilot at Imperial College London
 
Big Data for Big Discoveries
Big Data for Big DiscoveriesBig Data for Big Discoveries
Big Data for Big Discoveries
 
Data Automation at Light Sources
Data Automation at Light SourcesData Automation at Light Sources
Data Automation at Light Sources
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
 
ELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciencesELIXIR and data grand challenges in life sciences
ELIXIR and data grand challenges in life sciences
 
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak świat
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak światPLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak świat
PLNOG 18 - Dr Marek Michalewicz - InfiniCortex: Superkomputer wielki jak świat
 
The Pacific Research Platform:a Science-Driven Big-Data Freeway System
The Pacific Research Platform:a Science-Driven Big-Data Freeway SystemThe Pacific Research Platform:a Science-Driven Big-Data Freeway System
The Pacific Research Platform:a Science-Driven Big-Data Freeway System
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintro
 
Shared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK researchShared services - the future of HPC and big data facilities for UK research
Shared services - the future of HPC and big data facilities for UK research
 
2015 04 bio it world
2015 04 bio it world2015 04 bio it world
2015 04 bio it world
 

More from inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 

More from inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Recently uploaded

WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentationyogeshlabana357357
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxFIDO Alliance
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...ScyllaDB
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTopCSSGallery
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfdanishmna97
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingScyllaDB
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptxFIDO Alliance
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdfMuhammad Subhan
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Skynet Technologies
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsLeah Henrickson
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Paige Cruz
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 

Recently uploaded (20)

Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 

Research Computing Storage at the Francis Crick Institute

  • 1.
  • 2. Research Computing Storage at The Francis Crick Institute Steve Hindmarsh Head of Scientific Computing steve.hindmarsh@crick.ac.uk Michael Holliday Senior HPC & Research Data Systems Engineer michael.holliday@crick.ac.uk DDN User Group SC18
  • 3. The Francis Crick Institute • A biomedical discovery institute dedicated to understanding the fundamental biology underlying health and disease. • Founded in 2015 with the merger of two London research institutes from the Medical Research Council and Cancer Research UK into the Crick • Supported by our founding partners: 3
  • 4. “Discovery without boundaries” • Core research areas: • Growth and development • Health and ageing • Human biology • Cancer • Immune system • Infectious disease • Neuroscience Multi-disciplinary approach: biology, physics, chemistry, bioinformatics, maths, engineering… 4
  • 5. The Crick in numbers • 1300 Scientists across 130 Labs, Research Groups & Science Technology Platforms • 300 Operations Staff • Over 30,000 pieces of Scientific Equipment • 13 Floors, 1500 Rooms • 2 National / International facilities • 1 Nobel Prize winner! 5
  • 6. Scientific instruments • Cryo-Electron microscopes • Electron microscopes • Light microscopes • DNA/RNA sequencers • Mass spectrometers • Nuclear Magnetic Resonance (NMR) spectroscopes • Etc. All produce large volumes of data which needs to be captured, stored and analysed. 6
  • 7. DDN at the Crick DDN underpins our research computing platform providing: • Scalability – currently 10 PB and filling up fast! • Performance • Multiple and varied workloads • Resilience • Manageability So far it has stood up well to all the workloads our scientists can throw at it. New challenges include: • Unprecedented data growth • Machine/Deep Learning/AI – Nvidia V100 GPU cluster coming soon 7
  • 8.
  • 9. CAMP – Crick Data Analysis and Management Platform – Where did we come from: • 4 Isilon Clusters with around 4PB of data going back 40 years • Numerous NAS Boxes, Hard drives, USBs etc (we’re still finding out about them now…) • Storage from legacy institutes were managed in very different ways, and had their own Domains and Permissions • Data replicated in multiple locations but it wasn’t always clear
  • 10. CAMP – Crick Data Analysis and Management Platform–
  • 11. Where we started at the Crick: CAMP (3PB) INGEST (1PB) GENERAL (500TB) Mill Hill LIF GARLIC HPC Cluster (200 Nodes) AFM (IW) AFM (IW) AFM (RO) AFM (RO) Remote Mount Remote Mount Remote Mount
  • 12. CAMP – Crick Data Analysis and Management Platform – • FDR IB Fabric • 2 Level Fabric • CAMP • 2 DDN GS12K Systems • 20 Enclosures • INGEST, General, LIF, Millhill: • Each has 1 DDN GS7K • 1 or 2 Enclosures
  • 13. Problems we encountered… • AFM • Migration of legacy data • Data, More Data & Even More Data…. • Instruments • Labs adapting to the new systems 13
  • 14. AFM • INGEST and General were set up to be caches of CAMP storage • At one point INGEST had 130 AFM links to CAMP • Delays to syncing were noticeable to scientists, particularly those using the HPC cluster • Permissions were not syncing correctly between the two • Cache was out of sync with home and unable to recover • Used Independent Writer Mode 14
  • 15. Data, More Data & Even More Data…. • Our Labs and STPs are producing more data at faster and faster rates • 1 Titan Cryo-EM Microscope can produce up to 1PB data a year (we have 3!). • Currently ½ Billion Files most only Kb • Covers pretty much every file type • CAMP has been Expanded from 3PB to 10PB • Data needs to be kept for 10 years • Archive facility is in progress! 16
  • 16. Where we are working towards: CAMP (9PB) INGEST (1-2PB) Melchett (500TB) BlackAdder (testing) GARLIC HPC Cluster (400 Nodes) Spectrum Protect Backup (14PB) Archive (TBD) AFM (single writer) Remote Mount HPC Workstations AFM (single writer)