November 18th, 2019 | Commissariat à l’énergie atomique et aux énergies alternatives | Gaël DELBARY
FROM RESEARCH TO INDUSTRY
Commissariat à l’énergie atomique et aux énergies alternatives - www.cea.fr
CEA & SFA18KE STORY
November 18th, 2019
Gaël DELBARY
TOPICS
Site update
T1KF project
Building methodology
One story
Numbers
18KE key features
Thanks
Compute centers at CEA/DAM
2 production compute centers:
- TERA: Defense application
- TGCC: European research
  - Hosting France Génomique (storage of DNA sequencing data)
  - Hosting CCRT (for industrial companies)
  - Hosting Human Brain project (soon)
1 lab compute center:
- R&D compute nodes
- R&D storage clusters
Compute power:
- TERA1K: 28 Pflops
- TGCC: 25 Pflops
- LAB: 1 Pflops (heterogeneous nodes)
2 production compute centers with a similar design:
- Nearly the same architecture, technologies, tools and system software
Global storage: data centric approach
User interface: Lustre/HSM (Lustre 2.10)
- Lustre 2.12.3 upgrade planned for Q1 2020
Seamless integration of HPSS as a Lustre backend
CEA HPC storage numbers
Production systems | TGCC (usable capacity / write throughput) | TERA (usable capacity / write throughput)
Scratch FS (Cray)  | 15 PB (390 GB/s)                          | 10 PB (530 GB/s)
Store FS (DDN)     | 16 PB (200 GB/s)                          | 23 PB (200 GB/s)
HPSS HDD Tier      | 5 PB (30 GB/s)                            | 4 PB (30 GB/s)
Total              | 36 PB (620 GB/s)                          | 37 PB (760 GB/s)
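The totals row can be cross-checked from the three tiers. A minimal Python sketch, using only the figures from the table above (the dictionary layout and the names are illustrative, not part of the original deck):

```python
# Usable capacity (PB) and write throughput (GB/s) per tier, copied from the table.
TIERS = {
    "TGCC": {"scratch": (15, 390), "store": (16, 200), "hpss_hdd": (5, 30)},
    "TERA": {"scratch": (10, 530), "store": (23, 200), "hpss_hdd": (4, 30)},
}


def totals(site):
    """Return (total capacity in PB, total write throughput in GB/s) for a site."""
    capacity = sum(cap for cap, _ in TIERS[site].values())
    throughput = sum(bw for _, bw in TIERS[site].values())
    return capacity, throughput


for site in TIERS:
    cap, bw = totals(site)
    print(f"{site}: {cap} PB, {bw} GB/s")
```

Note that the summed throughput is a nominal aggregate across tiers, not a measured figure.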
CEA HPC DDN numbers
Production systems | TGCC                                    | TERA
Store FS (DDN)     | 6x SFA14KXE (with 10x 8462 enclosures)  | 5x SFA14KXE (with 10x 8462 enclosures)
HPSS HDD Tier      | 1x SFA14KXE (with 10x 8462 enclosures)  | 1x SFA14KXE (with 10x 8462 enclosures)
Total              | 7                                       | 6
T1KF Project – Key numbers
T1KF: Flash parallel filesystem (Q4 2019) for the next exascale supercomputers
Target:
- First storage level of the store (same Lustre FS as STORE), using a Lustre pool
- 1 TB/s sequential write (5% of the time)
- Mounted by all supercomputers
- Data migration triggered by RobinHood (via lfs migrate)
- Hide complexity from end users
Usage:
- Checkpoint restart files (mostly sequential writes)
- Final files (mostly sequential writes)
- Data post-processing (sequential writes + random read)
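As an illustration of the "via lfs migrate" step, here is a minimal Python sketch that only builds the command a policy engine could issue to move a file from the flash pool to another pool of the same filesystem. The pool names and the file path are hypothetical, and it assumes `lfs migrate` accepts the `-p` (pool) layout option known from `lfs setstripe`; check `lfs migrate` help on your Lustre version before relying on it.

```python
import shlex

FLASH_POOL = "flash"  # hypothetical pool name for the flash tier
DISK_POOL = "disk"    # hypothetical pool name for the disk tier


def migrate_command(path, target_pool):
    """Build (but do not run) an `lfs migrate` call that re-lays a file out on a pool."""
    # Assumption: -p selects the destination OST pool, as with `lfs setstripe`.
    return ["lfs", "migrate", "-p", target_pool, path]


# Hypothetical checkpoint file that has aged out of the flash tier.
cmd = migrate_command("/store/project/checkpoint.0001", DISK_POOL)
print(shlex.join(cmd))
```

Because the file stays in the same namespace, users keep the same path before and after migration, which is how the complexity stays hidden from them.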
T1KF Project – Flash required?
Reasons:
- Low footprint (limited to 5 racks)
- Random read
- High write throughput (focus here)
External things to monitor:
- Weight (density is not light)
- Power (SSDs consume energy)
- Cooling (linked to power)
Goal: Find a storage solution that meets these requirements
Building methodology
Layer-by-layer approach (not new):
SSD choice
Drive protocol validation
Network interface choice
PCIe validation
Raw devices IO
Network performance
Distributed filesystem IO
SFA18KE story: history
A long time ago in a small country, CEA decided to build a Flash parallel filesystem: July 2016
- Targeted embedded platform: 14KE (X?)
- SSDs in head enclosure (Sandisk Lightning Ascend Gen II 800 GB)
- Legacy RAID Mode
- Defect on SS14K (slots issue)
- Defect on SAS load balancing algorithm
[Chart: Parallel write - perf per pool (GB/s) for pool93, pool94, pool95; series: 3.0.1.5, 3.1.0.1, 3.1.2.1a]
SFA18KE story
July 2017:
- Switch to 14KXE
- DCR Beta
- New drives: 20x Hitachi SS200 (3840 GB)
- Push for unmap feature
Summary:
- DCR stable
- 14KXE boosts performance (~45 GB/s in IOR write with more drives!)
- PCIe ICL limitation (4.5 GB/s)
- First words about 18K
- Dual port drives feature
SFA18KE story
January 2018:
- Decision to choose 18KE for production configuration
- Start of drive deep dive
- 20x Hitachi SS300 (1.6 TB) (costly)
- Start of raw drive benchmarks
July 2018:
- Drive winner list:
  - Hitachi SS530 3.2 TB (DWPD 3)
  - Samsung PM1643 3.2 TB (DWPD 3)
- Multiple VD feature supported
- Unmap design progress
SFA18KE story
November 2018:
- 18KE sample
- First runs with SSD drives
- Issues with loopback cables (copper)
- Not too stable (pretty stable for a sample :) )
- Push hardware remarks for GA version:
  - Power button hole size
  - Plastic CPU cover
  - Metal canister cover
January 2019:
- SSD drives fight (be patient, next slides are coming…)
- Troubleshooting instability
  - Failed heatsinks (fixed for GA canisters)
- SAS bandwidth analysis
SFA18KE story
April 2019:
- SAS bandwidth analysis finished
  - +10% SAS bandwidth (SS530 or SS300 drives)
  - Landed in SFAOS 11.5
- SSD chosen: SS530
- Lustre sequential write benchmarks start in the lab
July 2019:
- Unmap feature in SFAOS 11.5
- 4x 18KE (56 drives each) received
- Tuning locked for IOR sequential write workload
- IO500 discovery (some stuff…)
- LNet selftest benchmark warm-up
Pause in story: How to choose an SSD?
XSR methodology
- Goal: monitor the SSD's capability to regain "nominal" performance
- Drives compared: 3 DWPD Hitachi SS530, 1 DWPD Samsung PM1643, 3 DWPD Samsung PM1643 Low power
- Test sequence: 2 hours 128 KB sequential write, 2 hours 128 KB random write, 2 hours 128 KB sequential write
- Recovery time (shorter is better): 1h40, 47 minutes, 1h10
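The three-phase sequence (sequential, random, sequential) can be sketched as three fio invocations. This Python snippet only builds the command lines. The device path is a placeholder, and the I/O engine, queue depth and direct-I/O flag are assumptions, since the deck gives only the block size, the access patterns and the two-hour durations.

```python
DEVICE = "/dev/sdX"  # placeholder raw device; writing to it destroys its data

# (job name, fio rw mode) for each 2-hour phase of the sequence.
PHASES = [
    ("seq_write_1", "write"),
    ("rand_write", "randwrite"),
    ("seq_write_2", "write"),
]


def fio_command(name, rw, device=DEVICE, runtime_s=2 * 3600):
    """Build one time-based fio run with 128 KB blocks."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={device}",
        f"--rw={rw}",
        "--bs=128k",
        "--direct=1",          # assumption: bypass the page cache
        "--ioengine=libaio",   # assumption
        "--iodepth=32",        # assumption
        "--time_based",
        f"--runtime={runtime_s}",
    ]


commands = [fio_command(name, rw) for name, rw in PHASES]
for cmd in commands:
    print(" ".join(cmd))
```

Recovery time would then be read from the bandwidth log of the third phase: how long the drive needs to return to the throughput seen in the first phase.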
SFA18KE story
September 2019:
- 5 more 18KE received
- Time slot available (1.5 days) to run benchmarks at scale
- Stuck on InfiniBand network issues (bad surprise)
- Full configuration received (mid-September)
October 2019:
- Full systems integration
- InfiniBand troubleshooting
- 4x SS530 failures
- 4x power interposer failures
- 1x riser card failure
SFA18KE story
November 2019:
- Upgrade to SFAOS 11.6.1
- New time slot available
- InfiniBand optimization
- LNet selftest validation (1.5 TB/s)
- One canister failure
- 1x SS530 failure
- 1x SS530 slow
- First full benchmark run
IO500 and the acceptance benchmark could be run :)
SFA18KE: numbers
IO500 (valid result) Lustre 2.12.3 LTS:
[RESULT] BW phase 1 ior_easy_write 836.649 GB/s : time 367.18 seconds
[RESULT] IOPS phase 1 mdtest_easy_write 1317.850 kiops : time 372.54 seconds
[RESULT] BW phase 2 ior_hard_write 4.494 GB/s : time 797.99 seconds
[RESULT] IOPS phase 2 mdtest_hard_write 465.515 kiops : time 312.24 seconds
[RESULT] IOPS phase 3 find 795.810 kiops : time 801.51 seconds
[RESULT] BW phase 3 ior_easy_read 789.340 GB/s : time 389.19 seconds
[RESULT] IOPS phase 4 mdtest_easy_stat 384.570 kiops : time 1276.63 seconds
[RESULT] BW phase 4 ior_hard_read 14.509 GB/s : time 247.18 seconds
[RESULT] IOPS phase 5 mdtest_hard_stat 730.733 kiops : time 198.92 seconds
[RESULT] IOPS phase 6 mdtest_easy_delete 676.117 kiops : time 726.13 seconds
[RESULT] IOPS phase 7 mdtest_hard_read 348.595 kiops : time 416.97 seconds
[RESULT] IOPS phase 8 mdtest_hard_delete 243.326 kiops : time 607.60 seconds
[SCORE] Bandwidth 81.0068 GB/s : IOPS 545.738 kiops : TOTAL 210.258
During IO500 benchmark:
- 3 slow drives (write latency > 64 ms)
- SFAOS failed one drive at the end of the benchmark
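The [SCORE] line follows from the per-phase results: the bandwidth score is the geometric mean of the four IOR phases, the IOPS score is the geometric mean of the eight metadata phases (find included), and the total is the square root of their product. A short Python sketch that recomputes it from the figures above (variable names are illustrative):

```python
import math

# Per-phase results copied from the IO500 output above.
BW_GBPS = {
    "ior_easy_write": 836.649,
    "ior_hard_write": 4.494,
    "ior_easy_read": 789.340,
    "ior_hard_read": 14.509,
}
IOPS_K = {
    "mdtest_easy_write": 1317.850,
    "mdtest_hard_write": 465.515,
    "find": 795.810,
    "mdtest_easy_stat": 384.570,
    "mdtest_hard_stat": 730.733,
    "mdtest_easy_delete": 676.117,
    "mdtest_hard_read": 348.595,
    "mdtest_hard_delete": 243.326,
}


def geometric_mean(values):
    """Geometric mean computed in log space for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


bw_score = geometric_mean(BW_GBPS.values())
iops_score = geometric_mean(IOPS_K.values())
total = math.sqrt(bw_score * iops_score)
print(f"Bandwidth {bw_score:.4f} GB/s : IOPS {iops_score:.3f} kiops : TOTAL {total:.3f}")
```

The geometric mean explains why the low ior_hard_write figure weighs so heavily on the bandwidth score despite more than 800 GB/s in ior_easy_write.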
SFA18KE: numbers
Acceptance passed: 1.09 TB/s
- 1 drive missing
- IOR parameters weren’t optimal:
  - One shot (no time slot left for another run)
  - Buffered IO (not ideal for this workload)
- Throughput sustained for 10 minutes
- Bandwidth space on drives (15%)
- Reachable target: 1.2 TB/s
SFA18KE: Key features for performance
Loopback cables:
- +40% on raw devices
- GA with fibers only (hard stuff to stabilize)
SAS optimization:
- +10% on raw devices
- No effect on hard drives (to check)
FIO parameters:
- Write sequential
- Block size=3M (strange or not?)
- Numjobs=8
- Latency hunt (iosched=none)
[Chart: FIO write seq (3M), GB/s, with loopback vs. without loopback; series: SFAOS 11.4, SFAOS 11.5]
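The FIO parameters listed on this slide map onto a small job file. The Python sketch below generates one and shows where the I/O scheduler would be set. The device name, I/O engine and direct-I/O flag are assumptions; only the sequential write pattern, the 3M block size, numjobs=8 and the "none" scheduler come from the slide.

```python
def fio_job(device, bs="3M", numjobs=8):
    """Return the text of a fio job file for a sequential-write test."""
    lines = [
        "[global]",
        "rw=write",            # sequential write
        f"bs={bs}",
        f"numjobs={numjobs}",
        "direct=1",            # assumption: bypass the page cache
        "ioengine=libaio",     # assumption
        "group_reporting",
        "",
        "[seq-write]",
        f"filename={device}",  # offsets are left at fio defaults
    ]
    return "\n".join(lines) + "\n"


def scheduler_path(block_device):
    """sysfs file that selects the I/O scheduler for a block device."""
    return f"/sys/block/{block_device}/queue/scheduler"


job = fio_job("/dev/sdX")  # placeholder device; writing to it destroys its data
print(job)
print(f"echo none > {scheduler_path('sdX')}")
```

Writing `none` to the scheduler file removes a queueing layer in front of the device, which is one way to read the "latency hunt" remark.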
SFA18KE loves Lustre
Lustre 2.12.3 LTS:
- One 18KE (56 drives)
- 24 clients
- EDR InfiniBand network
- Lustre checksums disabled
- max_pages_per_rpc=2M
- rpc_in_flight=2
Lustre overhead is pretty light
Lustre loves many clients
- If you don't have many, play with “large rpc”
[Chart: SFAOS 11.5 (Lustre overhead), GB/s, for raw, ext4 and ior; annotations: -0.7%, -1%]
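On a client, the tunings listed on this slide would typically be applied with `lctl set_param`. A minimal Python sketch that builds the commands without running them follows. Two assumptions: the parameters are taken to be the `osc.*` client tunables, and "rpc_in_flight" is taken to mean `max_rpcs_in_flight`; the values are those written on the slide.

```python
# Client-side tunables; the keys are assumptions about what the slide refers to.
CLIENT_TUNING = {
    "osc.*.checksums": "0",            # "Lustre checksums disabled"
    "osc.*.max_pages_per_rpc": "2M",   # value as written on the slide
    "osc.*.max_rpcs_in_flight": "2",   # assumed meaning of "rpc_in_flight=2"
}


def lctl_commands(tuning):
    """Build one `lctl set_param` command per tunable (not executed here)."""
    return [["lctl", "set_param", f"{key}={value}"] for key, value in tuning.items()]


commands = lctl_commands(CLIENT_TUNING)
for cmd in commands:
    print(" ".join(cmd))
```

With few clients, raising `max_pages_per_rpc` is the "large rpc" knob the slide alludes to; the right value depends on the server-side bulk I/O size.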
Lustre improvements
Lustre 2.12.3 LTS doesn’t like unaligned I/O
- ior_hard run kills Lustre performance
Lustre master brings major improvements
- Big boost in ior_hard_write (from 4.49 GB/s to 19.72 GB/s)
  - More improvements could be done
- mdtest_easy_write (from 1317.850 kiops to 3184.910 kiops)
  - Reason: bump to 128 MDTs (stable with IO500 benchmarks)
  - Scalability is nearly linear
- Results not submitted
IO500 pushes Lustre to be more and more powerful
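To put the two improvements on the same footing, the gains can be expressed as ratios. A small Python sketch using the figures quoted on this slide and in the IO500 result listing (the function name is illustrative):

```python
def speedup(before, after):
    """Ratio of the new result to the old one."""
    return after / before


# ior_hard_write in GB/s: Lustre 2.12.3 LTS vs. Lustre master.
ior_hard_write_gain = speedup(4.494, 19.72)
# mdtest_easy_write in kiops: Lustre 2.12.3 LTS vs. Lustre master with 128 MDTs.
mdtest_easy_write_gain = speedup(1317.850, 3184.910)

print(f"ior_hard_write: x{ior_hard_write_gain:.2f}")
print(f"mdtest_easy_write: x{mdtest_easy_write_gain:.2f}")
```

That is roughly a 4.4x gain on unaligned writes and a 2.4x gain on file creation.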
Thanks
IO500 and acceptance:
- L3 support guy: Sebastien PIECHURSKI (ATOS)
- Network guy: Damien GROS (CEA)
- Best qualified guy: Dominique MARTINET (CEA)
- Gael DELBARY (CEA)
Contribution to the flash project:
- DDN: Paul, Bret, Ryan, Laurent, Lee, Cedric, Rich, Kevin, Vasu, Scott, Thomas, William, Martin, Stefan, Richard, James, Mark, Reggie, Neil, Bill
- CEA: Jean-Marc Ducos (Cooling), Stephane Caillat (Infrastructure), Matthieu Hautreux (brainstorming on IB topology), Damien GROS (Network), Gael Delbary (DDN challenging :) )
Many people involved
Summary
SFA18KE story: exciting
Many things learned
Powerful platform
Strong relationship between CEA and DDN engineering
Todo:
- Unmap improvements
- Insert 16x18KE in Store FS
- CentOS 8 benchmarks
- Performance with 9012 enclosures
- Next Lustre improvements
- Next IO500?
Commissariat à l’énergie atomique et aux énergies alternatives - www.cea.fr
Questions?

Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
 

Recently uploaded

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 

Recently uploaded (20)

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 

Optimizing Flash at Scale

  • 1. CEA & SFA18KE STORY
    FROM RESEARCH TO INDUSTRY
    Gaël DELBARY, November 18th, 2019
    Commissariat à l’énergie atomique et aux énergies alternatives - www.cea.fr
  • 2. TOPICS
    - Site update
    - T1KF project
    - Building methodology
    - One story
    - Numbers
    - 18KE key features
    - Thanks
  • 3. Compute centers at CEA/DAM
    2 production compute centers:
    - TERA: Defense application
    - TGCC: European research
      - Hosting France Génomique (storage of DNA sequencing data)
      - Hosting CCRT (for industrial companies)
      - Hosting Human Brain project (soon)
    1 lab compute center:
    - R&D compute nodes
    - R&D storage clusters
    Compute power:
    - TERA1K: 28 Pflops
    - TGCC: 25 Pflops
    - LAB: 1 Pflops (heterogeneous nodes)
    2 production compute centers with a similar design:
    - Nearly the same architecture, technologies, tools and system software
  • 4. Global storage: data centric approach
    User interface: Lustre/HSM (Lustre 2.10)
    - Lustre 2.12.3 upgrade planned for Q1 2020
    Seamless integration of HPSS as a Lustre backend
  • 5. CEA HPC storage numbers
    (each cell: usable capacity, with write throughput in parentheses)
    Production systems    TGCC                 TERA
    Scratch FS (Cray)     15 PB (390 GB/s)     10 PB (530 GB/s)
    Store FS (DDN)        16 PB (200 GB/s)     23 PB (200 GB/s)
    HPSS HDD Tier         5 PB (30 GB/s)       4 PB (30 GB/s)
    Total                 36 PB (620 GB/s)     37 PB (760 GB/s)
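The totals row on the storage slide can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the dictionary layout and the `site_totals` helper are illustrative names, and the figures are copied from the slide.

```python
# (usable capacity in PB, write throughput in GB/s) per tier, copied from the slide.
tiers = {
    "TGCC": {
        "Scratch FS (Cray)": (15, 390),
        "Store FS (DDN)": (16, 200),
        "HPSS HDD Tier": (5, 30),
    },
    "TERA": {
        "Scratch FS (Cray)": (10, 530),
        "Store FS (DDN)": (23, 200),
        "HPSS HDD Tier": (4, 30),
    },
}


def site_totals(site):
    """Return (total PB, total GB/s) for one compute center."""
    capacity = sum(pb for pb, _ in tiers[site].values())
    throughput = sum(gbs for _, gbs in tiers[site].values())
    return capacity, throughput


for site in tiers:
    pb, gbs = site_totals(site)
    print(f"{site}: {pb} PB, {gbs} GB/s")
```

Both sums agree with the Total row reported on the slide.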
  • 6. CEA HPC DDN numbers
    Production systems    TGCC                                       TERA
    Store FS (DDN)        6x SFA14KXE (with 10x 8462 enclosures)     5x SFA14KXE (with 10x 8462 enclosures)
    HPSS HDD Tier         1x SFA14KXE (with 10x 8462 enclosures)     1x SFA14KXE (with 10x 8462 enclosures)
    Total                 7                                          6
  • 7. T1KF Project – Key numbers
    T1KF: Flash Parallel Filesystem (Q4 2019) for the next exascale supercomputers
    Target:
    - First storage level of the store (same Lustre FS as STORE), built with a Lustre pool
    - 1 TB/s sequential write (5% of the time)
    - Mounted by all supercomputers
    - Data migration triggered by RobinHood (via lfs migrate)
    - Hide complexity from end users
    Usage:
    - Checkpoint restart files (mostly sequential writes)
    - Final files (mostly sequential writes)
    - Data post-processing (sequential writes + random reads)
  • 8. T1KF Project – Flash required?
    Reasons:
    - Low footprint (limited to 5 racks)
    - Random read
    - High write throughput (focus here)
    External things to monitor:
    - Weight (density is not light)
    - Power (SSDs consume energy)
    - Cooling (linked to power)
    Goal: find a storage solution that meets these requirements
  • 9. Building methodology
    Layer-by-layer approach (not new):
    - SSD choice
    - Drive protocol validation
    - Network interface choice
    - PCIe validation
    - Raw devices IO
    - Network performance
    - Distributed filesystem IO
  • 10. SFA18KE story: history
    A long time ago in a small country, CEA decided to build a Flash parallel filesystem:
    July 2016
    - Embedded targeted platform 14KE (X?)
    - SSDs in head enclosure (Sandisk Lightning Ascend Gen II 800 GB)
    - Legacy RAID mode
    - Defect on SS14K (slots issue)
    - Defect on SAS load balancing algorithm
    [Chart: Parallel write - perf per pool, in GB/s, for pool93, pool94 and pool95; series 3.0.1.5, 3.1.0.1, 3.1.2.1a]
  • 11. SFA18KE story
    July 2017:
    - Switch to 14KXE
    - DCR Beta
    - New drives: 20x Hitachi SS200 (3840 GB)
    - Push unmap feature
    Summary:
    - DCR stable
    - 14KXE boosts performance (~45 GB/s in IOR write with more drives!)
    - PCIe ICL limitation (4.5 GB/s)
    - First words about 18K
    - Dual port drives feature
  • 12. SFA18KE story
    January 2018:
    - Decision to choose 18KE for production configuration
    - Start of drives deep dive
    - 20x Hitachi SS300 (1.6 TB) (costly)
    - Start of raw drive benchmarks
    July 2018:
    - Drives winner list:
      - Hitachi SS530 3.2 TB (DWPD 3)
      - Samsung PM1643 3.2 TB (DWPD 3)
    - Multiple VD feature supported
    - Unmap design progress
  • 13. SFA18KE story
    November 2018:
    - 18KE sample
    - First runs with SSD drives
    - Issues with loopback cables (copper)
    - Not too stable (pretty stable for a sample)
    - Push hardware remarks for GA version
      - Power button hole size
      - Plastic CPU cover
      - Metal canister cover
    January 2019:
    - SSD drives fight (be patient, next slides are coming…)
    - Troubleshooting instability
      - Failed heatsinks (fixed for GA canisters)
    - SAS bandwidth analysis
  • 14. SFA18KE story
    April 2019:
    - SAS bandwidth analysis finished
      - +10% SAS bandwidth (SS530 or SS300 drives)
      - Landed in 11.5
    - SSD chosen: SS530
    - Lustre sequential write benchmarks start in the lab
    July 2019:
    - Unmap feature in SFAOS 11.5
    - 4x 18KE (56x drives inside) received
    - Tuning locked for IOR sequential write workload
    - IO500 discovery (some stuff…)
    - LNet selftest benchmarks warm-up
  • 15. Pause in story: How to choose an SSD?
    XSR methodology
    - Goal: monitor SSD capability to regain "nominal" performance
    Test sequence:
    - 2 hours 128KB sequential write
    - 2 hours 128KB random write
    - 2 hours 128KB sequential write
    Recovery time, shorter is better:
    Drive                                Recovery time
    3DWPD Hitachi SS530                  1h40
    1DWPD Samsung PM1643                 47 minutes
    3DWPD Samsung PM1643 Low power       1h10
  • 16. SFA18KE story
    September 2019:
    - +5x 18KE received
    - Time slot available (1.5 days) to run benchmarks at scale
    - Stuck on InfiniBand networks (bad surprise)
    - Full configuration received (mid-September)
    October 2019:
    - Full systems integration
    - InfiniBand troubleshooting
    - 4x SS530 failures
    - 4x power interposer failures
    - 1x riser card failure
  • 17. SFA18KE story
    November 2019:
    - Upgrade to SFAOS 11.6.1
    - New time slot available
    - InfiniBand optimization
    - LNet selftest validation (1.5 TB/s)
    - One canister failure
    - 1x SS530 failure
    - 1x SS530 slow
    - First full benchmark run
    Could run IO500 and acceptance benchmark
  • 18. SFA18KE: numbers
    IO500 (valid result), Lustre 2.12.3 LTS:
    [RESULT] BW   phase 1 ior_easy_write      836.649 GB/s  : time 367.18 seconds
    [RESULT] IOPS phase 1 mdtest_easy_write  1317.850 kiops : time 372.54 seconds
    [RESULT] BW   phase 2 ior_hard_write        4.494 GB/s  : time 797.99 seconds
    [RESULT] IOPS phase 2 mdtest_hard_write   465.515 kiops : time 312.24 seconds
    [RESULT] IOPS phase 3 find                795.810 kiops : time 801.51 seconds
    [RESULT] BW   phase 3 ior_easy_read       789.340 GB/s  : time 389.19 seconds
    [RESULT] IOPS phase 4 mdtest_easy_stat    384.570 kiops : time 1276.63 seconds
    [RESULT] BW   phase 4 ior_hard_read        14.509 GB/s  : time 247.18 seconds
    [RESULT] IOPS phase 5 mdtest_hard_stat    730.733 kiops : time 198.92 seconds
    [RESULT] IOPS phase 6 mdtest_easy_delete  676.117 kiops : time 726.13 seconds
    [RESULT] IOPS phase 7 mdtest_hard_read    348.595 kiops : time 416.97 seconds
    [RESULT] IOPS phase 8 mdtest_hard_delete  243.326 kiops : time 607.60 seconds
    [SCORE] Bandwidth 81.0068 GB/s : IOPS 545.738 kiops : TOTAL 210.258
    During IO500 benchmark:
    - 3x slow drives (latency > 64 ms in write)
    - SFAOS failed one at the end of the bench
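The [SCORE] line can be reproduced from the twelve [RESULT] lines. The sketch below assumes the usual IO500 scoring rule (geometric mean of the four bandwidth phases, geometric mean of the eight IOPS phases, then the square root of their product); that rule is not stated on the slide, but it matches the reported numbers. The variable and function names are illustrative.

```python
import math

# Values copied from the [RESULT] lines on the slide.
bw_gbs = {
    "ior_easy_write": 836.649,
    "ior_hard_write": 4.494,
    "ior_easy_read": 789.340,
    "ior_hard_read": 14.509,
}
iops_k = {
    "mdtest_easy_write": 1317.850,
    "mdtest_hard_write": 465.515,
    "find": 795.810,
    "mdtest_easy_stat": 384.570,
    "mdtest_hard_stat": 730.733,
    "mdtest_easy_delete": 676.117,
    "mdtest_hard_read": 348.595,
    "mdtest_hard_delete": 243.326,
}


def geometric_mean(values):
    """Geometric mean computed in log space to avoid large products."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Assumed scoring rule: geometric means per category, then sqrt of the product.
bw_score = geometric_mean(bw_gbs.values())
iops_score = geometric_mean(iops_k.values())
total_score = math.sqrt(bw_score * iops_score)

print(f"Bandwidth {bw_score:.4f} GB/s : IOPS {iops_score:.3f} kiops : TOTAL {total_score:.3f}")
```

Because the score is a geometric mean, the low ior_hard_write result (4.494 GB/s) pulls the bandwidth score far below the ~800 GB/s seen in the easy phases.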
  • 19. SFA18KE: numbers
    Acceptance passed: 1.09 TB/s
    - 1 drive missing
    - IOR parameters weren’t optimal:
      - One shot (no more time slot)
      - Buffered IO (not perfect for this workload)
    - Throughput sustained during 10 minutes
    - Bandwidth space on drives (15%)
    - Reachable target: 1.2 TB/s
  • 20. SFA18KE: Key features for performance
    Loopback cables:
    - +40% on raw devices
    - GA with fibers only (hard stuff to stabilize)
    SAS optimization:
    - +10% on raw devices
    - No effect on hard drives (to check)
    FIO parameters:
    - Write sequential
    - Block size=3M (strange or not?)
    - Numjobs=8
    - Latency hunt (iosched=none)
    [Chart: FIO write seq (3M), in GB/s, with loopback vs. without loopback; series SFAOS 11.4 and SFAOS 11.5]
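As a rough illustration of the FIO parameters listed on the slide, the sketch below builds a fio command line. Only sequential write, the 3M block size and numjobs=8 come from the slide; the device path, I/O engine, queue depth, direct I/O and runtime are assumptions added to make the example complete. The script only prints the command and does not run fio, since writing to a raw device destroys its contents.

```python
import shlex


def fio_seq_write_cmd(device, runtime_s=60):
    """Build a fio argument list for a sequential-write test on one raw device.

    `device` and `runtime_s` are hypothetical inputs, not values from the slide.
    """
    return [
        "fio",
        "--name=seqwrite",
        f"--filename={device}",
        "--rw=write",          # sequential write (from the slide)
        "--bs=3M",             # block size 3M (from the slide)
        "--numjobs=8",         # from the slide
        "--direct=1",          # assumption: bypass the page cache
        "--ioengine=libaio",   # assumption: Linux asynchronous I/O engine
        "--iodepth=16",        # assumption: queue depth per job
        "--time_based",
        f"--runtime={runtime_s}",
        "--group_reporting",
    ]


# "iosched=none" on the slide presumably refers to the block-layer scheduler,
# which on Linux is usually set through /sys/block/<dev>/queue/scheduler.
cmd = fio_seq_write_cmd("/dev/sdX")  # placeholder device name
print(shlex.join(cmd))
```

Building the argument list as a Python list (rather than a single string) keeps each option separate, so it can be passed to `subprocess.run` without shell quoting issues if someone chooses to execute it.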
  • 21. SFA18KE loves Lustre
    Lustre 2.12.3 LTS:
    - One 18KE (56 drives)
    - 24 clients
    - EDR InfiniBand network
    - Lustre checksums disabled
    - max_pages_per_rpc=2M
    - rpc_in_flight=2
    Lustre overhead is pretty light
    Lustre loves #clients
    - If not, play with “large rpc”
    [Chart: SFAOS 11.5 (Lustre overhead), in GB/s, for raw, ext4 and ior; labels: -0.7%, -1%]
  • 22. Lustre improvements
    Lustre 2.12.3 LTS doesn’t like non-aligned I/O
    - ior_hard run kills Lustre performance
    Lustre master brings major improvements
    - Big boost in ior_hard_write (from 4.49 GB/s to 19.72 GB/s)
      - More improvements could be done
    - mdtest_easy_write (from 1317.850 kiops to 3184.910 kiops)
      - Reason: bump to 128 MDTs (stable with IO500 benchmarks)
      - Scalability is nearly linear
    - Results not submitted
    IO500 pushes Lustre to be more and more powerful
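The gains quoted on this slide are easier to compare as ratios. A minimal sketch, using only the before/after figures from the slide (the `speedup` helper and the dictionary are illustrative):

```python
# (Lustre 2.12.3 LTS result, Lustre master result), copied from the slide.
results = {
    "ior_hard_write (GB/s)": (4.49, 19.72),
    "mdtest_easy_write (kiops)": (1317.850, 3184.910),
}


def speedup(before, after):
    """Ratio of the new result to the old one."""
    return after / before


for name, (before, after) in results.items():
    print(f"{name}: {before} -> {after} (x{speedup(before, after):.2f})")
```

This gives roughly a 4.4x gain for ior_hard_write and a 2.4x gain for mdtest_easy_write.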
  • 23. Thanks
    IO500 and acceptance:
    - L3 support guy: Sebastien PIECHURSKI (ATOS)
    - Network guy: Damien GROS (CEA)
    - Best qualified guy: Dominique MARTINET (CEA)
    - Gael DELBARY (CEA)
    Contribution to the flash project:
    - DDN: Paul, Bret, Ryan, Laurent, Lee, Cedric, Rich, Kevin, Vasu, Scott, Thomas, William, Martin, Stefan, Richard, James, Mark, Reggie, Neil, Bill
    - CEA: Jean-Marc Ducos (Cooling), Stephane Caillat (Infrastructure), Matthieu Hautreux (brainstorming on IB topology), Damien GROS (Network), Gael Delbary (DDN challenging)
    Many people involved
  • 24. Summary
    SFA18KE story: exciting
    - Many things learned
    - Powerful platform
    - Strong relationship between CEA and DDN engineering
    Todo:
    - Unmap improvements
    - Insert 16x 18KE in Store FS
    - CentOS 8 benchmarks
    - Performance with 9012 enclosures
    - Next Lustre improvements
    - Next IO500?
  • 25. Questions?
    Commissariat à l’énergie atomique et aux énergies alternatives - www.cea.fr