
Redis on NVMe SSD - Zvika Guz, Samsung


In this talk we report on our experience with Redis-on-Flash (RoF)—a recently introduced product that uses SSDs as a RAM extension to dramatically increase the effective dataset capacity that can be stored on a single server. This talk provides the first in-depth RoF system performance characterization: we consider different use cases (varying both RAM-to-disk access ratio and object size), and compare SATA-based RoF, NVMe-based RoF, and all-RAM Redis deployments. We show that the superior performance of NVMe drives in terms of both latency and peak bandwidth makes them a particularly good fit for RoF use cases. Specifically, we show that backing RoF with NVMe drives can deliver more than 2 million operations per second with sub-millisecond latency on a single server.



  1. Redis on NVMe SSD
     Zvika Guz and Vijay Balakrishnan, Memory Solutions Lab, Samsung Semiconductor Inc.
  2. Redis-on-Flash
     • Closed-source (RLEC Flash), 100% compatible with open-source Redis
     • Uses Flash as a RAM extension to increase effective node capacity
     • Tiers memory into "fast" and "slow": RAM holds keys and hot values; Flash holds cold values
     • RAM/Flash usage is dynamically configurable
     • Uses RocksDB as the storage engine to optimize access to block storage
     • Multi-threaded, asynchronous Redis is used to access Flash
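To make the tiering model concrete, here is a minimal sketch that emulates the RAM/Flash split with plain Python dictionaries: keys and hot values stay in an in-memory structure, and cold values are demoted to a second store standing in for the RocksDB-backed flash tier. The class, method names, and LRU eviction policy are illustrative assumptions, not RoF's actual implementation.

```python
# Illustrative sketch of RAM/Flash value tiering (not RoF's actual code).
# RAM keeps hot values; cold values are demoted to a "flash" store, which
# here is just a dict standing in for the RocksDB-backed tier.
from collections import OrderedDict


class TieredStore:
    def __init__(self, ram_value_budget):
        self.ram_values = OrderedDict()   # hot values, kept in LRU order
        self.flash_values = {}            # stand-in for the RocksDB/flash tier
        self.ram_value_budget = ram_value_budget

    def set(self, key, value):
        self.ram_values[key] = value
        self.ram_values.move_to_end(key)
        self._evict_cold()

    def get(self, key):
        if key in self.ram_values:        # RAM hit: fast path
            self.ram_values.move_to_end(key)
            return self.ram_values[key]
        if key in self.flash_values:      # RAM miss: promote value from flash
            value = self.flash_values.pop(key)
            self.set(key, value)
            return value
        return None

    def _evict_cold(self):
        # Demote least-recently-used values to flash once RAM is over budget.
        while len(self.ram_values) > self.ram_value_budget:
            cold_key, cold_value = self.ram_values.popitem(last=False)
            self.flash_values[cold_key] = cold_value
```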
  3. Why Redis-on-Flash?
     • Optimizes price-to-performance for a given workload
     • DRAM is more performant than flash, but its $/GB is higher, and DRAM capacity per server is limited
     • Tiering dramatically reduces $/GB while preserving good performance ($/ops)
     • Enables orders-of-magnitude more capacity per server
     • RoF is particularly suitable for large datasets with skewed access distributions
  4. Workload
     • Models real-world Redis Labs customers
     • Benchmark: memtier_benchmark (open source), https://github.com/RedisLabs/memtier_benchmark
     • GET/SET requests, varying: (1) object size, (2) write-to-read ratio, (3) Redis RAM hit ratio
     • Performance target: maximize operations per second on a single server while maintaining sub-millisecond latency
     • Compared three system configurations:
       1. All-RAM: in-memory RLEC
       2. Redis-on-NVMe: 4x Samsung PM1725 NVMe SSDs
       3. Redis-on-SATA: 16x Samsung 850 Pro SATA SSDs
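The sketch below generates a memtier-style GET/SET mix against a Redis endpoint with the redis-py client. The object size, write-to-read ratio, and key count are hypothetical knobs chosen only to mirror the three workload dimensions listed above; they are not the memtier_benchmark settings used in the talk, and a single-threaded loop like this is only meant to illustrate the request mix, not to reproduce the load.

```python
# Minimal memtier-style load sketch using redis-py (pip install redis).
# All parameters here are illustrative, not the study's actual settings.
import random
import redis

OBJECT_SIZE = 100          # bytes per value (use case #1 used 100B objects)
WRITE_TO_READ = (1, 1)     # 1:1 write-to-read ratio
NUM_KEYS = 100_000
NUM_REQUESTS = 1_000_000

client = redis.Redis(host="localhost", port=6379)
payload = b"x" * OBJECT_SIZE
write_weight, read_weight = WRITE_TO_READ

for _ in range(NUM_REQUESTS):
    key = f"memtier-{random.randrange(NUM_KEYS)}"
    # Choose SET vs GET according to the configured write-to-read ratio.
    if random.random() < write_weight / (write_weight + read_weight):
        client.set(key, payload)
    else:
        client.get(key)
```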
  5. Redis-on-NVMe
     • Consistent sub-millisecond latencies favor NVMe
     • NVMe SSDs are designed for consistently high performance at ultra-low latency
     • Modest incremental cost over SATA, with much better performance
     • Samsung PM1725 is the fastest NVMe drive on the market
     Samsung PM1725 specification*:
     • Form factor: 2.5"
     • Host interface: PCIe Gen3 x4
     • Capacities: 800GB, 1.6TB, 3.2TB
     • Sequential read: 3,300 MB/s
     • Sequential write: 1,900 MB/s
     • Random read: 840 KIOPS
     • Random write: 130 KIOPS
     • Read latency: 95 usec
     • Write latency: 60 usec
     • (>6X and >8.5X over SATA)
     * The PM1725 HHHL version (PCIe Gen3 x8) provides roughly double the performance and capacity, but we did not use it here.
  6. System Configuration
     • Single client, single server
     • Industry-standard components, all available today
     • Server: Dell PowerEdge R730xd, dual-socket
     • Processor: 2x Xeon E5-2690 v3 @ 2.6GHz (12 cores / 24 logical processors per CPU; 24 cores / 48 logical processors total)
     • Memory: 256GB ECC DDR4
     • Network: 10GbE
     • Storage: 4x Samsung PM1725 NVMe; 16x Samsung 850 Pro SATA SSD
     • memtier_benchmark: 1.2.6
     • RLEC version: 4.3.0
     • Operating system: Ubuntu 14.04, Linux kernel 3.19.8
  7. Use case #1: Small Objects
     • 100B objects, write-to-read ratio: 1:1
     • 100% of requests served with <1 msec latency
     • 50% RAM-to-Flash hit ratio: 750 KOPS, 0.75 msec latency, 1.7 GB/s disk bandwidth
     • 85% RAM-to-Flash hit ratio: 1.8 MOPS, 0.9 msec latency, 602 MB/s disk bandwidth
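A rough back-of-the-envelope from the 85% case above: only the ~15% of requests that miss RAM touch flash, so the reported 602 MB/s of disk bandwidth corresponds to roughly 2 KB moved per flash-bound 100B operation, hinting at the read/write amplification introduced by the RocksDB storage engine. The snippet below simply redoes this arithmetic; the amplification figure is an estimate derived here, not a number reported in the talk.

```python
# Back-of-the-envelope: bytes moved on flash per flash-bound request for
# use case #1 at an 85% RAM hit ratio (estimate, not a reported number).
ops_per_sec = 1.8e6          # total operations per second
ram_hit_ratio = 0.85         # fraction of requests served from RAM
disk_bw_bytes = 602e6        # observed disk bandwidth, bytes per second
object_size = 100            # bytes per object

flash_ops = ops_per_sec * (1 - ram_hit_ratio)       # requests that reach flash
bytes_per_flash_op = disk_bw_bytes / flash_ops      # ~2.2 KB per flash-bound op
amplification = bytes_per_flash_op / object_size    # ~22x vs. the 100B object

print(f"{bytes_per_flash_op:.0f} B per flash-bound op, ~{amplification:.0f}x amplification")
```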
  8. Disk Bandwidth Spike
     • Spikes in disk bandwidth align with RocksDB compaction phases
     • Spikes can reach 2-3x the average bandwidth
     • Drives must be able to sustain these spikes, otherwise tail latency suffers
     • Object size = 100B, write-to-read ratio = 1:1, RAM-to-Flash hit ratio = 85%, disk BW = 602 MB/s
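One simple way to observe such spikes on a running system is to sample block-device throughput over time. The sketch below does this with the psutil library; the device name and sampling interval are placeholders, and this is a generic monitoring snippet rather than the instrumentation used in the talk.

```python
# Sample per-device throughput to spot compaction-induced bandwidth spikes.
# Generic monitoring sketch (pip install psutil); the device name and
# interval are placeholders, not the setup used in the study.
import time
import psutil

DEVICE = "nvme0n1"       # placeholder block device name
INTERVAL_S = 1.0

prev = psutil.disk_io_counters(perdisk=True)[DEVICE]
while True:
    time.sleep(INTERVAL_S)
    cur = psutil.disk_io_counters(perdisk=True)[DEVICE]
    read_mb_s = (cur.read_bytes - prev.read_bytes) / INTERVAL_S / 1e6
    write_mb_s = (cur.write_bytes - prev.write_bytes) / INTERVAL_S / 1e6
    print(f"read {read_mb_s:7.1f} MB/s  write {write_mb_s:7.1f} MB/s")
    prev = cur
```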
  9. Use case #2: Large Objects
     • 1KB objects, write-to-read ratio: 1:4
     • 100% of requests served with <1 msec latency
     • 50% RAM-to-Flash hit ratio: 270 KOPS, 0.75 msec latency, 4.3 GB/s disk bandwidth
     • 85% RAM-to-Flash hit ratio: 816 KOPS, 0.78 msec latency, 3.9 GB/s disk bandwidth
  10. Redis-on-Flash Performance
     • 80/20 read-to-write ratio
     • With sufficient locality, RoF performance gets close to All-RAM
     • NVMe speedup over SATA is 2x-2.5x (while using 1/4 of the drives)
     • [Charts: operations per second vs. RAM-to-Flash hit ratio (20%-100%) for 100B and 1KB objects, comparing the SSD configurations]
  11. The Problem with SATA
     • Needs 4x the drives to reach roughly half the performance of NVMe
     • Performance is much noisier: the 99th latency percentile exceeds 1 msec
     • These latency spikes are very difficult to eliminate and appear in almost all of our SATA runs
     • Object size = 1000B, write-to-read ratio = 1:4, RAM-to-Flash hit ratio = 50%: 132 KOPS, 0.65 msec latency
  12. DRAM or Flash?
     • Optimize performance/$ for each use case
     • Affected by dataset size, access pattern, and access locality
     • Configurations compared: Redis in Memory, Redis-on-NVMe, Redis-on-SATA
     • $/GB ratio, DRAM:NVMe:SATA = 15:2.5:1
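The snippet below turns the 15:2.5:1 $/GB ratio into a rough capacity-cost comparison for an arbitrary dataset size. Only the ratio comes from the slide; the absolute SATA $/GB baseline and the dataset size are made-up placeholders.

```python
# Rough capacity-cost comparison using the DRAM:NVMe:SATA = 15:2.5:1 $/GB
# ratio from the slide. The absolute SATA $/GB figure is a placeholder.
SATA_DOLLARS_PER_GB = 0.30        # hypothetical baseline, not from the talk
RATIO = {"DRAM": 15.0, "NVMe": 2.5, "SATA": 1.0}

dataset_gb = 4096                 # example dataset size: 4 TB of values

for tier, rel_cost in RATIO.items():
    cost = dataset_gb * rel_cost * SATA_DOLLARS_PER_GB
    print(f"{tier:>4}: ${cost:>10,.0f} for {dataset_gb} GB")
```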
  13. Summary
     • Redis-on-Flash enables an order of magnitude more capacity per node and high performance at significantly lower cost
     • Samsung PM1725 NVMe enables breakthrough performance at sub-millisecond latency
     • Consistent performance reduces tail latency
     • Industry-standard components, available today
     Thank You!  zvika.guz@samsung.com
  14. Backup
