Redis on NVMe SSD - Zvika Guz, Samsung

427 views

Published on

In this talk we report on our experience with Redis-on-Flash (RoF)—a recently introduced product that uses SSDs as a RAM extension to dramatically increase the effective dataset capacity that can be stored on a single server. This talk provides the first in-depth RoF system performance characterization: we consider different use cases (varying both RAM-to-disk access ratio and object size), and compare SATA-based RoF, NVMe-based RoF, and all-RAM Redis deployments. We show that the superior performance of NVMe drives in terms of both latency and peak bandwidth makes them a particularly good fit for RoF use cases. Specifically, we show that backing RoF with NVMe drives can deliver more than 2 million operations per second with sub-millisecond latency on a single server.

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
427
On SlideShare
0
From Embeds
0
Number of Embeds
5
Actions
Shares
0
Downloads
17
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Redis on NVMe SSD - Zvika Guz, Samsung

  1. 1. Zvika Guz and Vijay Balakrishnan Memory Solutions Lab, Samsung Semiconductor Inc Redis on NVMe SSD
  2. 2. 2 Redis-on-Flash  Closed-source (RLEC Flesh), 100% compatible with the open-source Redis  Uses Flash as RAM extension to increase effective node capacity  Tiering memory into “fast” and “slow”:  RAM saves keys and hot values  Flash saves cold values  Dynamic configuration of RAM/Flash usage  Uses RockDB as the storage engine to optimize access to block storage  Multi-threaded and asynchronous Redis used to access Flash
  3. 3. 3 Why Redis-on-Flash?  Optimize price-to-performance for a given workload  DRAM is more performant than flash, but $/GB is higher  Limited DRAM capacity per server  Tiering dramatically reduces $/GB, while preserving good performance ($/ops)  Enables orders-of-magnitude more capacity per server  RoF is particularly suitable for large datasets with skewed access distribution
  4. 4. 4 Workload  Models real-world Redis Labs customers  Benchmark: memtier_benchmark (open source)  GET/SET requests, varying: 1. Object size 2. Write-to-read ratio 3. Redis RAM hit ratio  Performance target:  Maximize operation-per-second on a single server, while maintaining sub- millisecond latency  Compared 3 system configuration 1. All-RAM: In-memory RLEC 2. Redis-on-NVMe: 4xSamsung PM1725 NVMe SSDs 3. Redis-on-SATA: 16xSamsung 850 Pro SATA SSDs https://github.com/RedisLabs/memtier_benchmark
  5. 5. 5  Consistent sub-millisecond latencies favor NVMe  NVMe SSD are designed for consistent high performance @ ultra-low latency  Modest incremental cost over SATA, with much better performance  Samsung PM1725 is the fastest NVMe in the market Redis-on-NVMe Samsung PM1725 Specification* Form Factor 2.5” Host Interface PCIe Gen3 x4 Capacities 800GB, 1.6TB, 3.2TB Sequential Read 3300 MB/s Sequential Write 1900 MB/s Random Read 840KIOPS Random Write 130KIOPS Read Latency 95 usec Write Latency 60 usec >6X over SATA >8.5X over SATA *PM1725 HHHL version (PCIe Gen3 x8) provides ~double the performance and capacity, but we did not use it here
  6. 6. 7 System Configuration  Single client, single server  Industry-standard components, all available today Server Dell PowerEdge R730xd, dual-socket Processor 2 x Xeon E5-2690 v3 @ 2.6GHz 12 cores, 24 logical processor per CPU 24 cores, 48 logical processor total Memory 256GB ECC DDR4 Network 10GbE Storage 4 x Samsung PM1725 NVMe 16 x Samsung 850PRO SATA SSD Memtier_benchmark 1.2.6 RLEC version 4.3.0 Operating System Ubuntu 14.04 Linux Kernel 3.19.8
  7. 7. 8 Use case #1: Small Objects  100B objects, write-to-read ratio: 1:1 Perf= 750 KOPS Latency = 0.75 msec Disk BW=1.7 GB/s Perf= 1.8 MOPS Latency=0.9 msec Disk BW=602 MB/s 50% RAM-to-Flash ratio 85% RAM-to-Flash ratio  100% of requests served with <1msec latency
  8. 8. 9 Disk Bandwidth Spike  Spikes in disk bandwidth align with RocksDB compaction phase  Can reach 2-3x the average BW  Drives must be able to sustain these spikes, otherwise tail latency suffers Object Size=100B, write-to-read ratio=1:1, RAM-to-Flash hit ratio=85% Disk BW=602 MB/s
  9. 9. 10 Use case #2: Large Objects  1KB objects, write-to-read ratio: 1:4  100% of requests served with <1msec latency Perf= 270 KOPS Latency = 0.75 msec Disk BW=4.3 GB/s Perf= 816 KOPS Disk BW=3.9 GB/s 50% RAM-to-Flash ratio 85% RAM-to-Flash ratio latency= 0.78 msec
  10. 10. 11 Redis-on-Flash Performance  80/20 read-to-write ratio  With sufficient locality, RoF performance gets close to All-RAM  NVMe speedup over SATA is 2x-2.5x (using ¼ of the drives) 7% 14% 18% 23% 26% 33% 12% 23% 36% 47% 60% 83% 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00% 0 250,000 500,000 750,000 1,000,000 1,250,000 1,500,000 1,750,000 2,000,000 2,250,000 2,500,000 20% 30% 40% 50% 60% 70% 80% 90% 100% OperationsPerSecond RAM-to-Flash Hit Ratio 100B Objects Series1 Series2 3% 8% 13% 19% 26% 35% 8% 15% 25% 35% 47% 74% 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00 0 200,000 400,000 600,000 800,000 1,000,000 1,200,000 1,400,000 20% 30% 40% 50% 60% 70% 80% 90% 100% OperationsPerSecond RAM-to-Flash Hit Ratio 1KB Objects Series1 Series2 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00% 0 250,000 500,000 750,000 1,000,000 1,250,000 1,500,000 1,750,000 2,000,000 2,250,000 2,500,000 20% 30% 40% 50% 60% 70% 80% 90% 100% OperationsPerSecond RAM-to-Flash Hit Ratio 100B Objects Series1 Series2 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00 0 200,000 400,000 600,000 800,000 1,000,000 1,200,000 1,400,000 20% 30% 40% 50% 60% 70% 80% 90% 100% OperationsPerSecond RAM-to-Flash Hit Ratio 1KB Objects Series1 Series2
  11. 11. 12 The Problem with SATA  Need 4X the drives to get to ~half the performance of NVMe  Performance is much more noisy:  99 latency percentile > 1msec  Very difficult to get rid of these latency spikes, exists in almost all our SATA runs Perf= 132 KOPS Latency = 0.65 msec Object Size=1000B, write-to-read ratio=1:4, RAM-to-Flash hit ratio =50%
  12. 12. 13 DRAM or Flash?  Optimize performance/$ for each use-case  Affected by the dataset size, access pattern, and access locality Redis in Memory Redis-on-NVMe Redis-on-SATA $/GB DRAM:NVMe:SATA = 15:2.5:1
  13. 13. 14 Summary  Redis-on-Flash enables:  Order-of-magnitude more capacity per node  High performance at significant lower cost  Samsung PM1725 NVME:  Enables breakthrough performance @ sub-millisecond latency  Consistent performance reduces tail latency  Industry standard components, available today Thank You! zvika.guz@samsung.com
  14. 14. 15 Backup

×