Realizing the Next Generation of Exabyte-scale Persistent
Memory-Centric Architectures and Memory Fabrics
Zvonimir Z. Bandic, Sr. Director, Next Generation Platform Technologies
Western Digital Corporation
Safe Harbor | Disclaimers
Forward-looking statements
This presentation contains forward-looking statements that involve risks and uncertainties, including, but not limited to, statements regarding our product and technology positioning, the anticipated benefits of the integration of HGST and SanDisk into our company, market trends, business strategy and growth opportunities. Forward-looking statements should not be read as a guarantee of future performance or results, and will not necessarily be accurate indications of the times at, or by, which such performance or results will be achieved, if at all. Forward-looking statements are subject to risks and uncertainties that could cause actual performance or results to differ materially from those expressed in or suggested by the forward-looking statements.
Additional key risks and uncertainties include the impact of continued uncertainty and volatility in global economic conditions; actions by competitors; difficulties associated with the integration of SanDisk and HGST into our company; business conditions; growth in our markets; and pricing trends and fluctuations in average selling prices. More information about the other risks and uncertainties that could affect our business are listed in our filings with the Securities and Exchange Commission (the “SEC”) and available on the SEC’s website at www.sec.gov, including our most recently filed periodic report, to which your attention is directed. We do not undertake any obligation to publicly update or revise any forward-looking statement, whether as a result of new information, future developments or otherwise, except as otherwise required by law.
Agenda
Rapid growth in the data ecosystem
Importance of purpose-built technologies and architectures
Current distributed systems architecture
Where should we attach persistent memory?
RDMA networking: Lowest latency remote memory access
Compute state of the art: Compute acceleration
Persistent memory scale out
Memory pool size requirements
CPU-attached memory access via RDMA networking vs. memory fabrics
Memory fabrics latency requirements
Conclusions
Diverse and connected data types
Tight coupling between Big Data and Fast Data
[Diagram: Big Data vs. Fast Data on scale and performance axes. The Big Data side pairs batch analytics, data aggregation, and modeling to deliver INSIGHT; the Fast Data side pairs streaming analytics, artificial intelligence, and machine learning to deliver PREDICTION and PRESCRIPTION for smart machines, mobility, and real-time results.]
From general purpose to purpose built
[Diagram: the shift from a general-purpose, compute-centric architecture to architectures designed for Big Data and Fast Data applications, as expanding applications and workloads move purpose-built design across devices, platforms, systems, and solutions.]
Workload diversity
Demanding diverse technologies and architectures
[Diagram: purpose-built, data-centric architectures. Big Data drives capacity-centric scale and a storage-centric architecture with a storage-semantic data flow (HDD and SSD storage); Fast Data drives performance-centric scale and a memory-centric architecture with a memory-semantic data flow (NVM, DRAM, SRAM). Compute spans the general-purpose CPU plus storage SoCs, GPUs, FPGAs, and ASICs, all tied together by the interconnect.]
Where should we attach persistent memory?
CPU bus: parallel
  Physical interface: DIMM
  Logical interface: non-standard DDR4, NVDIMM-P
  Pros: low latency; high bandwidth; power proportional; coherent through the memory controller
  Cons: the CPU memory controller has to implement the specific logical interface; not suited for stochastic latency behavior; not hot pluggable; BIOS changes needed

CPU bus: serial
  Physical interface: DIMM/other
  Logical interface: DMI for POWER8, CCIX, OpenCAPI 3.1, RapidIO, Gen-Z
  Pros: high bandwidth; significant pin reduction; higher memory bandwidth to the CPU; coherent through the memory controller, or in some cases can even present the lowest point of coherence
  Cons: the CPU memory controller has to support it; may have higher power consumption

Serial peripheral bus
  Physical interface: PCIe
  Logical interface: NVMe, DC Express*
  Pros: standardized; CPU/platform independent; latency low enough for storage; easy RDMA integration; hot pluggable
  Cons: higher latency (~1 µs)
The emergence of memory fabric
Memory fabric may mean different things to different people:
1. A page fault trap leading to an RDMA request (incurs a context switch and SW overhead)
2. Global address translation management in SW, leading to LD/ST across a global memory fabric
3. A coherence protocol scaled out, with global page management and no context switching
RDMA networking: Latency
As a direct memory transfer protocol, RDMA is already well suited to accessing remote byte-addressable persistent memory devices.
Typical latency for RDMA reads: 1.6–1.8 µs, for PCIe-attached RNICs on the host and target nodes.
Simultaneous PCIe tracing reveals that as little as 600 ns is spent actually transferring data over the network; roughly 70% of the latency is spent in memory or PCIe-protocol overhead.
How much can be saved by attaching memory directly to the network fabric (i.e., no PCIe overhead)? The maximal benefit (~1.0 µs) would require integrated RNIC interfaces on both the initiator and the target.
Breakdown of a 16-byte RDMA_READ request (PCIe interposers on the “local” CPU/HCA and the “remote” HCA/root complex; path: local CPU → local HCA port → remote HCA port → remote buffer → user memory buffer):
T = 0 ns: CPU writes a 64 B WR to the Mellanox “Blue Flame” register (no doorbell!)
T = 393 ns: data read request TLP received on the “remote” side
T = 812 ns: root complex replies with a 16 B completion
T = 1064 ns: RDMA data written to the user buffer
T = 1118 ns: 64 B write TLP (presumably to the completion queue), followed immediately by a mysterious 4 B read
T = 1577 ns: completion of the mysterious 4 B read
T = 1815 ns: CPU issues the next RDMA read request
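To make the split concrete, here is a small Python sketch of the arithmetic behind the ~70% figure. The timestamps are taken from the trace above; attributing 600 ns to time on the wire (and everything else to memory and PCIe-protocol overhead) follows the slide's own numbers and is otherwise an assumption.

```python
# Timestamps (ns) from the 16-byte RDMA_READ trace above.
trace_ns = {
    "WR written to Blue Flame register": 0,
    "read request TLP seen on remote side": 393,
    "root complex returns 16 B completion": 812,
    "RDMA data written to user buffer": 1064,
    "completion-queue write + 4 B read": 1118,
    "completion of 4 B read": 1577,
    "CPU issues next RDMA read": 1815,
}

total_ns = trace_ns["CPU issues next RDMA read"]   # end-to-end request latency
wire_ns = 600                                      # time actually on the network (from the slide)
overhead_ns = total_ns - wire_ns                   # memory + PCIe-protocol overhead

print(f"total: {total_ns} ns, on the wire: {wire_ns} ns")
print(f"overhead: {overhead_ns} ns ({overhead_ns / total_ns:.0%})")
# overhead: 1215 ns (67%), consistent with the ~70% quoted above
```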
RDMA to persistent memory latency
Raw RDMA access to remote PCM via PCIe peer-to-peer is 26% slower than to remote DRAM.
[Histogram: latency distribution (0–2.5 µs) of raw RDMA reads to remote DRAM vs. remote PCM]
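A rough consequence of that 26% penalty, using the 1.6–1.8 µs remote-DRAM read latency quoted earlier (a back-of-the-envelope sketch, not a measurement):

```python
# If remote-DRAM RDMA reads take 1.6-1.8 us and remote PCM is 26% slower:
dram_us = (1.6, 1.8)
pcm_us = tuple(round(x * 1.26, 2) for x in dram_us)
print(pcm_us)  # (2.02, 2.27) -> remote PCM reads land at roughly 2.0-2.3 us
```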
Compute accelerators
Traditional distributed system: memory is accessible only via a general-purpose CPU or via RDMA.
Distributed system with compute accelerators that have their own memory (FPGA, GP-GPU, ASIC): compute accelerators need large memory bandwidth and may drive the need for memory appliances (e.g., NVLink).
Data (memory) centric architecture: a large number of compute devices (cores, FPGAs, GP-GPUs, ASICs) accessing a large pool of memory.
Latency requirements
Impact of latency on execution time
(Intel Xeon, DRAM frequency settings changed in BIOS)
It is not trivial to determine the threshold latency below which a CPU load operation still makes sense, as opposed to using block IO, which allows more efficient CPU utilization.
Based on slowing down DRAM and executing a large number of benchmarks, we determine a soft threshold of approximately 500 ns – 1 µs.
[Chart: normalized execution time vs. DRAM latency across the benchmark suite]
Persistent memory scale-out
Exabyte-scale memory system requirements:
Assume exabyte scale, or at least many tens of petabytes, distributed across a large number of nodes
Assuming 256 Gb/die, the persistent memory per node can be estimated at 8–16 TB, limited by the total power budget of the SoC and media
The number of nodes required is quite large:
For 1 PB: 64 nodes
For 64 PB: 4096 nodes
Assuming a 3D torus, the number of hops (one direction) is:
For 1 PB: 64 nodes; 8 hops
For 64 PB: 4096 nodes; 48 hops
Paths for improvement (a worked sketch follows below):
Increasing the dimensionality of the network topology – expensive
Improving the power-density envelope of the memory technology in order to pack more capacity per node
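A minimal Python sketch of the node-count arithmetic above. The 8–16 TB per-node estimate, the 1 PB and 64 PB pool sizes, and the 3D-torus assumption come from the slide; using binary units and taking the cube root to get the torus side length are added illustration.

```python
TB = 2**40                      # binary units, so the counts match the slide
PB = 2**50
node_capacity = 16 * TB         # upper end of the 8-16 TB-per-node estimate

for pool in (1 * PB, 64 * PB):
    nodes = pool // node_capacity           # 64 nodes for 1 PB, 4096 for 64 PB
    side = round(nodes ** (1 / 3))          # side of a cubic 3D torus: 4 and 16
    print(f"{pool // PB:>3} PB -> {nodes} nodes ({side} x {side} x {side} torus)")

# The slide quotes ~8 one-way hops across the 64-node cluster and ~48 for the
# 4096-node cluster; the exact hop count depends on routing and wrap-around
# assumptions, so it is not recomputed here.
```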
End-to-end latency requirements
Assuming that at least 8 hops across the cluster are required for remote memory access:
8 × switch_latency + raw_memory_latency < 500–1000 ns
Assuming the raw memory latency is 100 ns:
switch_latency < 50–100 ns
Larger clusters and larger numbers of nodes will lead to significantly tighter latency requirements; the budget is worked out in the sketch below.
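The same budget written out as a Python sketch; the 500–1000 ns end-to-end target, the 8-hop path, and the 100 ns raw memory latency are the slide's assumptions.

```python
def switch_latency_budget(end_to_end_ns, hops=8, raw_memory_ns=100):
    """Per-switch budget from: hops * switch_latency + raw_memory_latency < end_to_end."""
    return (end_to_end_ns - raw_memory_ns) / hops

for target_ns in (500, 1000):
    budget = switch_latency_budget(target_ns)
    print(f"{target_ns} ns end-to-end -> switch latency < {budget:.0f} ns")
# 500 ns end-to-end  -> switch latency < 50 ns
# 1000 ns end-to-end -> switch latency < 112 ns (the slide rounds this to ~100 ns)
```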
Memory fabric protocol innovation platform I
Routable and switchable fabric:
Requires a uniform fabric from the card to the node to the cluster level, with uniform switching and routing solutions
25 Gb/s IP is available today, with a clear roadmap to at least 56 Gb/s using PAM4
The same fabric can be used at the node level to enable heterogeneous compute
FPGA or ASIC switch:
Replace L2 Ethernet with a custom protocol on top of 802.3 L1 (keep L1 from 802.3)
The rest is programmed in P4, allowing implementation of multiple protocols
Supports the innovation required for RAS
P4 example: Gen-Z switch prototyping
Barefoot Tofino ASIC (or FPGA):
256-lane 25 GT/s Ethernet switch, 6 Tbit/s aggregate throughput
Supports the P4 language, successor to OpenFlow, enabling protocol innovation
Describe the Gen-Z packet format in P4
A match-action pipeline (a.k.a. “flow tables”) enables line-rate performance; a toy illustration follows the diagram below
[Diagram: P4 match-action pipeline — parser → match-action units → deparser, with the packet header vector (PHV) carried between stages; each match-action unit looks up a key in a match table and applies the selected action with its parameters]
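As a toy illustration only (written in Python rather than P4, and using made-up field names, widths, and table entries rather than the real Gen-Z wire format), the sketch below shows the two ideas on this slide: describing a packet format, then forwarding packets through an exact-match match-action table keyed on the destination component ID.

```python
import struct

# Hypothetical, simplified "Gen-Z-like" header: 2-byte destination component ID,
# 2-byte source component ID, 1-byte opcode, 1-byte length. Illustration only --
# this is not the actual Gen-Z packet layout.
HEADER = struct.Struct(">HHBB")

def parse(packet: bytes) -> dict:
    dcid, scid, opcode, length = HEADER.unpack_from(packet)
    return {"dcid": dcid, "scid": scid, "opcode": opcode, "len": length}

# Exact-match "flow table": destination component ID -> egress port (made-up entries)
dcid_table = {0x0001: 3, 0x0002: 7}

def ingress(packet: bytes):
    hdr = parse(packet)                     # parser stage
    port = dcid_table.get(hdr["dcid"])      # match stage
    if port is None:
        return "drop", hdr                  # default action
    return f"forward to port {port}", hdr   # action with its parameter

pkt = HEADER.pack(0x0002, 0x0001, 0x12, 16) + b"payload"
print(ingress(pkt))  # ('forward to port 7', {...})
```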
Memory fabric protocol innovation platform II
Conclusions
RDMA networking today can be used to build memory fabric solutions, with typical measured latency in the 1.6–1.8 µs range
From a performance perspective, achieving 500 ns fabric latency across an exabyte-scale (~tens of PB) cluster puts significant pressure on persistent memory power and density, and on memory fabric switch latency
Protocol innovations are required to fulfill latency, throughput and RAS requirements
[Closing graphic: Insights · Machine Learning · Precise Predictions · Real-Time Results · Protection · Automation · Anywhere, Anytime · Analytics · Context Aware]
Creating environments for data to thrive
©2018 Western Digital Corporation or its affiliates. All rights reserved. Confidential.
Research authored by Zvonimir Z. Bandic, Yang Liu, Chao Sun, Qingbo Wang, Martin Lueker-Boden, Damien LeMoal,
Dejan Vucinic for Next Generation Platform Technologies at Western Digital Corporation