2.
Why NVMe-oF?
SSDs Move the Bottleneck from the Disk to the Network
(Diagram: it takes 24 HDDs to fill a 10GbE link, only 2 SSDs, and just one SAS SSD.)
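The arithmetic behind the diagram can be sketched as follows; the per-device throughputs are assumed typical values, not figures from the slide:

```python
# Rough check of the slide's claim: how many devices does it take to
# saturate a 10GbE link? Per-device MB/s figures below are assumptions.
LINK_GBPS = 10
link_mb_s = LINK_GBPS * 1000 / 8   # 10GbE ~= 1250 MB/s of payload

devices = {                        # assumed sustained throughput (MB/s)
    "HDD": 52,
    "SATA SSD": 600,
    "SAS SSD": 1200,
}

for name, mb_s in devices.items():
    print(f"{name}: ~{link_mb_s / mb_s:.1f} device(s) to fill 10GbE")
```

With these assumptions the script reproduces the slide's ratios: roughly 24 HDDs, about 2 SSDs, and about 1 SAS SSD per 10GbE link.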
3.
NVMe Technology
§Optimized for flash and persistent memory (PM)
§ Traditional SCSI interfaces were designed for spinning disk
§ NVMe bypasses unneeded layers
§NVMe flash outperforms SAS/SATA flash
§ ~2.5x more bandwidth, ~50% lower latency, ~3x more IOPS
4.
“NVME OVER FABRICS” WAS THE LOGICAL AND HISTORICAL NEXT STEP
§Sharing NVMe-based storage across multiple servers/CPUs was the next step
§ Better utilization: capacity, rack space, power
§ Scalability, management, fault isolation
§NVMe over Fabrics standard
§ 50+ contributors
§ Version 1.0 released in June 2016
§Pre-standard demos in 2014
5.
End-to-End 200Gb/s
Faster Network Products from Nvidia Solve Half the Network Bottleneck Problem…
6.
Faster Protocols from Nvidia Solve the Other Half
(Diagram: a faster protocol (RDMA, NVMe-oF) combined with a faster network; SSD vs. HDD.)
8. NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
RDMA NOW COMMON ACROSS ALL STORAGE TYPES
§ RDMA for optimal performance
• InfiniBand & RoCE
o NVMe-oF
o Nvidia GPUDirect Storage
§ Persistent Memory (3D-XPoint)
(Diagram: RDMA variants now exist for every storage type: block (iSER, NVMe-oF/RDMA), file (SMB Direct, NFSoRDMA), object (Ceph over RDMA), and persistent memory; non-RDMA counterparts include iSCSI, FC, NVMe-oF/TCP, SMB (CIFS), NFS, Swift, and S3.)
9.
NVME OVER FABRICS (NVME-OF) TRANSPORTS
§ The NVMe-oF standard is not fabric specific
§ Instead, there is a separate transport binding specification for each transport layer
§ RDMA was first (InfiniBand & RoCE)
§ Fibre Channel came later
§ TCP is the most recent
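The transport in use is visible in the NVMe-oF discovery log: each discovery log page entry begins with a transport type (TRTYPE) code. A minimal sketch, using the TRTYPE values from the NVMe over Fabrics specification; the helper function and sample entry are illustrative:

```python
# TRTYPE codes from the NVMe-oF discovery log page entry. The parsing
# helper and the fabricated 1024-byte entry below are for illustration.
TRTYPE = {
    0x01: "RDMA (InfiniBand, RoCE, iWARP)",
    0x02: "Fibre Channel",
    0x03: "TCP",
}

def transport_of(entry: bytes) -> str:
    """Byte 0 of a discovery log page entry is the transport type."""
    return TRTYPE.get(entry[0], f"unknown (0x{entry[0]:02x})")

# A fabricated 1024-byte entry whose first byte says "TCP":
entry = bytes([0x03]) + bytes(1023)
print(transport_of(entry))   # TCP
```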
10.
How Does NVMe-oF Maintain NVMe-Like Performance?
§ By extending NVMe efficiency over a fabric
§ NVMe commands and data structures are transferred end to end
§ Bypassing legacy stacks for performance
§ First products all used RDMA
§ Performance is impressive
(Diagram: protocol stacks for a local SAS/SATA device vs. NVMe over Fabrics, with NVMe/RDMA (RoCE or IB) and NVMe/TCP transports.)
12.
NVMe-oF Applications - Composable Infrastructure
§Also called Compute/Storage Disaggregation and Rack Scale
§NVMe over Fabrics enables Composable Infrastructure
§ Low latency
§ High bandwidth
§ Nearly local-disk performance
16.
Nvidia GPUDirect Storage
(Diagram: without GPUDirect Storage, received data crosses the network into system memory (1) and is then copied into GPU memory (2), with transmits taking the reverse two-step path; with GPUDirect Storage, RDMA moves data directly between the network and GPU memory in a single step (1), bypassing the CPU and system memory.)
https://info.nvidia.com/gpudirect-storage-webinar-reg-page.html?ondemandrgt=yes
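The benefit of the direct path can be modeled with a toy calculation; the payload size and the one-hop vs. two-hop assumption are illustrative, not measurements:

```python
# Toy model of the diagram's data movement: without GPUDirect Storage
# every byte is staged in system memory (two DMA hops, network->sysmem
# then sysmem->GPU); with it, RDMA lands data in GPU memory in one hop.
PAYLOAD_GB = 100   # assumed dataset size

def gb_moved(gds_enabled: bool) -> int:
    hops = 1 if gds_enabled else 2
    return PAYLOAD_GB * hops

print("without GDS:", gb_moved(False), "GB crosses the buses")  # 200 GB
print("with GDS:   ", gb_moved(True), "GB crosses the buses")   # 100 GB
```

Halving the bytes moved also frees system-memory bandwidth and CPU cycles for other work, which is the point of the slide.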
17.
NVMe/TCP
§NVMe-oF commands are sent over standard TCP/IP sockets
§Each NVMe queue pair is mapped to a TCP connection
§Easy to support NVMe over TCP with no changes
§Good for distance, stranded servers, and out-of-band management connectivity
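The mapping above means every queue pair's traffic travels as NVMe/TCP PDUs on its own TCP connection. A minimal sketch of the 8-byte common header that prefixes every PDU, per the NVMe/TCP transport binding; the field values here are for the connection-initialization (ICReq) PDU and should be checked against the specification before real use:

```python
import struct

def pdu_common_header(pdu_type: int, hlen: int, plen: int,
                      flags: int = 0, pdo: int = 0) -> bytes:
    # <BBBBI = PDU type, flags, header length, PDU data offset,
    # and total PDU length (little-endian 32-bit), 8 bytes in all.
    return struct.pack("<BBBBI", pdu_type, flags, hlen, pdo, plen)

ICREQ = 0x00            # Initialize Connection Request PDU type
hdr = pdu_common_header(ICREQ, hlen=128, plen=128)
print(len(hdr), hdr.hex())   # 8-byte common header
```

ICReq is the first PDU a host sends on each new TCP connection when setting up a queue pair, which is why it makes a convenient example.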
19.
IMPORTANCE OF TAIL LATENCY IN TODAY’S DATACENTERS
§ Most datacenters today need to support interactive real-time requests
§ Online searches generate little network traffic between the requestor and the datacenter, but the response generates a massive amount of traffic within the datacenter
§ Much of this traffic is related to the core advertising business model of the datacenter’s owner
§ The distributed software architecture that drives these businesses is very susceptible to tail latency
§ https://www.nextplatform.com/2018/03/27/in-modern-datacenters-the-latency-tail-wags-the-network-dog/
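The sensitivity to tail latency can be shown with a short calculation; the 1% tail probability and the fan-out counts are illustrative numbers, not measurements:

```python
# Why fan-out amplifies tail latency: if a user request fans out to n
# parallel backend calls and each call independently has a 1% chance of
# landing in the slow tail, the chance the whole request is slow grows
# quickly with n.
def p_slow_request(n_backends: int, p_tail: float = 0.01) -> float:
    """Probability that at least one of n parallel calls hits the tail."""
    return 1 - (1 - p_tail) ** n_backends

for n in (1, 10, 100):
    print(f"fan-out {n:3d}: {p_slow_request(n):.1%} of requests hit the tail")
```

At a fan-out of 100, well over half of all user requests are gated by a 99th-percentile backend response, which is why these architectures care about the tail rather than the average.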
20.
CONCLUSIONS
§NVMe-oF brings the value of networked storage to NVMe-based solutions
§NVMe-oF is supported across many network technologies
§The performance advantages of NVMe are not lost with NVMe-oF
§ Especially with RDMA
§There are many suppliers of NVMe-oF solutions across a variety of important data center use cases