IMCSummit 2015 - Day 2 IT Business Track - 4 Myths about In-Memory Databases Busted

Have you heard that all in-memory databases are equally fast but unreliable, inconsistent and expensive? This session highlights in-memory technology that busts all those myths.
Redis, the fastest database on the planet, is not simply an in-memory key-value data store, but rather a rich in-memory data-structure engine that serves the world’s most popular apps. Redis Labs’ unique clustering technology enables Redis to be highly reliable, keeping every data byte intact despite hundreds of cloud instance failures and dozens of complete data-center outages. It delivers full CP system characteristics at high performance. And with the latest Redis on Flash technology, Redis Labs achieves close to in-memory performance at 70% lower operational costs. Learn about the best uses of in-memory computing to accelerate everyday applications such as high-volume transactions, real-time analytics, IoT data ingestion and more.

Published in: Technology

  1. 1. 4 Myths about in-memory databases busted Yiftach Shoolman – Co-Founder & CTO @ Redis Labs @yiftachsh, @redislabsinc
  2. 2. Background - Redis. Created by Salvatore Sanfilippo (@antirez). OSS, in-memory NoSQL k/v database/data-structure engine. Rankings shown: #1, #2, #3, #12. Fastest growing DB 2013-01 – to date: DB-Engines; Most popular database on containers: @DevOps.com & ClusterHQ; Top NoSQL databases: DB-Engines; Top tools developers love: StackShare
  3. 3. Background – Redis Labs. Founded in 2011. HQ in Mountain View, CA; R&D in Tel Aviv, IL. The largest commercial company behind OSS Redis: •  5000+ paying customers •  30,000+ free users •  100,000+ databases under management •  ±200 new databases/day. Provides enterprise-class Redis deployment: •  As a service – Redis Cloud •  On-premises – Redis Labs Enterprise Cluster (RLEC). $28MM VC funding
  4. 4. Why an in-memory operational DBMS? [Diagram: App Servers, Database] 100 msec = the new standard for E2E application response time, under any load. 50 msec = average internet latency (round-trip). 50 msec = required app response time (includes processing & multi-DB access). 1 msec = required DB response time
  5. 5. Why an in-memory analytics DBMS? Disk-based: query time of days/hours. RAM-based: query time of mins/secs
  6. 6. Myth #1: All in-memory databases are equally fast
  7. 7. How many ops/sec can a single EC2 instance do? [Bar chart, 1 x c3.8xlarge EC2 instance: 1,000,000 ops/sec] (1) Deployed in a VPC (2) Enhanced Network Interface (ENI) enabled (3) Placement group enabled (4) IRQ Distribution enabled (5) Interrupt coalescing enabled (6) 4 x NICs configured (7) Receive Packet Steering (RPS) enabled
  8. 8. How many ops/sec can a single EC2 instance do? [Bar chart, 1 x c3.8xlarge EC2 instance: 1,228,432 and 1,000,000 ops/sec] Fully optimized: (1) Deployed in a VPC (2) ENI enabled (3) Placement group enabled (4) IRQ Distribution enabled (5) Interrupt coalescing enabled (6) 4 x NICs configured (7) RPS enabled. Not optimized: (1) Deployed on Classic EC2 (2) No tuning (3) Reached packets/sec limit
  9. 9. How many ops/sec can a single EC2 instance do? [Bar chart, 1 x c3.8xlarge EC2 instance: 1,750,000, 1,228,432 and 1,000,000 ops/sec] Not optimized: (1) Deployed on Classic EC2 (2) No tuning (3) Reached packets/sec limit. Half optimized: (1) Deployed in a VPC (2) ENI enabled (3) RPS enabled (4) Only 1 x NIC. Fully optimized: (1) Deployed in a VPC (2) ENI enabled (3) Placement group enabled (4) IRQ Distribution enabled (5) Interrupt coalescing enabled (6) 4 x NICs configured (7) RPS enabled
  10. 10. How many ops/sec can a single EC2 instance do? [Bar chart, 1 x c3.8xlarge EC2 instance: 2,000,000, 1,750,000, 1,228,432 and 1,000,000 ops/sec] Not optimized: (1) Deployed on Classic EC2 (2) No tuning (3) Reached packets/sec limit. Half optimized: (1) Deployed in a VPC (2) ENI enabled (3) RPS enabled (4) Only 1 x NIC. Fully optimized: (1) Deployed in a VPC (2) ENI enabled (3) Placement group enabled (4) IRQ Distribution enabled (5) Interrupt coalescing enabled (6) 4 x NICs configured (7) RPS enabled. Fully optimized
  11. 11. How many servers to get 1M writes/sec on GCE? [Bar chart: 300 servers]
  12. 12. How many servers to get 1M writes/sec on GCE? [Bar chart: 300, 50 and 50 servers]
  13. 13. How many servers to get 1M writes/sec on GCE? [Bar chart: 300, 50, 50 and 2 servers; label: In-memory databases]
  14. 14. Real-world write-intensive app
  15. 15. HBase + internal cache vs. HBase + Redis
  16. 16. Same Redis core, same HW, different performance
  17. 17. So why aren’t in-memory DBs equally fast? Most are written in C/C++, but programming language isn’t the only thing to consider
  18. 18. What affects in-memory DB performance? (1) Complexity of processing commands → How many lines of code per command? What is the computational complexity (e.g. in Redis most commands are O(1))? (2) Query efficiency → Is it limited to blob queries? Can you query a discrete value? (3) Pipelining → Can you send multiple requests at once to get lower latency and fewer context switches?
  19. 19. What affects in-memory DB performance? (4) Protocol efficiency → How long does it take to parse a request or to serialize a response? (5) TCP overhead → Long-lived (connection pool) vs. short-lived connections
  20. 20. What affects in-memory DB performance? (6) Single-threaded or multi-threaded architecture → Lock-free vs. parallel computing (7) Shared-nothing (the best) vs. shared-something vs. shared-everything (8) Built-in acceleration components
  21. 21. Myth #2: A single node is not a cluster
  22. 22. The truth: A single node can be a cluster, but not an HA cluster
  23. 23. In the new containers/VMs world a cluster is: a bunch of processes that together look like one big process
  24. 24. A real-world example: a binary options platform; 400MB dataset; 1,000,000 ops/sec. A single-node, 9-shard cluster = a 3-node, 9-shard cluster
  25. 25. Myth #3: In-memory databases are inconsistent and unreliable
  26. 26. A few facts/questions about consistency: Almost all NoSQL databases (not just in-memory) ack the client before committing to disk. Almost all in-memory databases can commit to disk before they ack the client. However, even if you ack after everything is committed: •  Is your driver memory buffer persistent and consistent? •  Is your storage system cache persistent and consistent? Most databases are NOT bulletproof
  27. 27. Most in-memory databases are async most of the time. [Diagram: Application, Master, Slave and their Storage, with steps numbered 1 to 6] Async replication, async disk I/O
  28. 28. Some of them can partially sync. [Diagram: Application, Master, Slave and their Storage, with steps numbered 1 to 6] Sync replication, async disk I/O
  29. 29. A few of them fully sync. [Diagram: Application, Master, Slave and their Storage, with steps numbered 1 to 6] Sync replication, sync disk I/O
  30. 30. CAP and in-memory databases. [Diagram: CA, CP and AP regions; labels: Memcached, in-memory]
  31. 31. Behavior during network splits. [Diagram] AP: Y, X, X; Write; Inconsistency; Read. CP: X, X, X; Full-consistency; Read
  32. 32. Are in-memory databases reliable? Redis Labs facts: •  Provisioned 100s of TBs of RAM •  500+ node failure events → 1 failure every 2 days •  ~30 complete data-center outages → 1 outage every month •  Users with high availability (HA) features enabled haven’t lost a single byte of data
  33. 33. Myth #4: In-memory databases are expensive
  34. 34. Real-world use case: •  500+GB •  400K writes/sec •  1500 reads/sec •  37.5KB average object size. Which one costs more? [Comparison diagram; labels in slide order: No extra work at app level; 1.5Gbps; 120Gbps; Tons of work at app level; Others; 6-node cluster; 30+ node cluster]
  35. 35. Which one costs more (2)? [Comparison chart: Redis Labs, Aerospike/Couchbase, Cassandra; rated on Read, Write, Cost]
  36. 36. Sometimes in-memory can be very expensive. HA deployment of a 10TB in-memory dataset on EC2: 50 x r3.8xlarge instances (#1 to #50, 200GB each); 1st replica for HA (#51 to #100, 200GB each); 2nd replica for quorum (#101 to #150, 200GB each). Total cost (reserved instances) = $2,132,250/yr
  37. 37. Do we really need 2 replicas? Efficient HA deployment of a 10TB in-memory dataset on EC2: 50 x r3.8xlarge instances (#1 to #50, 200GB each); 1 replica for HA (#51 to #100, 200GB each); 1 quorum server (#101, 15GB). Total cost (reserved instances) = $1,421,500/yr. Savings = $710,750/yr. Redis Labs patent-pending technology
  38. 38. Can we save more?
  39. 39. Price/performance of memory technologies. [Chart: 1GB memory, cost vs. read/write performance, for DRAM, SCM and NAND (flash). 1 GB cost: $10, $3, $1. Read latency and write latency in microseconds, log scale from 0.10 to 1000.00]
  40. 40. Redis on Flash. RAM: all keys + ‘hot’ values. Flash (slow RAM): ‘cold’ values. ●  Asynchronous ●  Multi-threaded. Flash is used as a RAM extender and NOT as persistent storage
  41. 41. How to achieve optimal price/performance: by dynamically setting the RAM/Flash ratio. [Chart labels: RAM; 4410 GB, 590 GB; 88.2%, 11.8%]
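  42. 42. Single server performance - 10% in RAM / 90% in Flash (*IBM P8 with IBM FlashSystem)

      | Scenario        | RAM hit ratio | Ops/sec | Latency    |
      |-----------------|---------------|---------|------------|
      | Low latency     | 100%          | 1.35M   | 1.00 msec  |
      | Low latency     | 80%           | 340K    | 1.07 msec  |
      | Low latency     | 50%           | 200K    | 0.96 msec  |
      | Low latency     | 20%           | 160K    | 1.00 msec  |
      | High throughput | 100%          | 2.00M   | 2.40 msec  |
      | High throughput | 80%           | 671K    | 6.20 msec  |
      | High throughput | 50%           | 483K    | 10.00 msec |
      | High throughput | 20%           | 366K    | 14.50 msec |
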
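  43. 43. 10TB Redis deployment on EC2

      |                                | Redis (on RAM), 2 replicas | Redis (on RAM), 1 replica | Redis on Flash, 1 replica |
      |--------------------------------|----------------------------|---------------------------|---------------------------|
      | Instance type                  | r3.8xlarge                 | r3.8xlarge                | i2.8xlarge                |
      | # of instances                 | 150                        | 100                       | 10                        |
      | RAM                            | 30TB                       | 20TB                      | 2TB                       |
      | Flash                          | -                          | -                         | 64TB                      |
      | Persistent storage (EBS)       | 150TB                      | 100TB                     | 80TB                      |
      | 1yr costs (reserved instances) | $2,132,250                 | $1,421,500                | $318,090                  |
      | Yearly savings                 | -                          | $710,750                  | $1,814,160                |
      | Savings %                      | -                          | 33.33%                    | 85.08%                    |
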
  44. 44. Summary
  45. 45. 4 myths about in-memory databases busted: •  All in-memory databases are NOT equally fast •  You can create a single-node in-memory cluster •  In-memory databases can be consistent and reliable •  With the right technology, in-memory databases are not expensive
  46. 46. Thank you Click to get more info about Redis, Redis Labs, Redis Cloud and RLEC
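
Slide 18 asks whether a database supports pipelining, i.e. sending multiple requests at once. As a rough illustration of why that matters, the sketch below counts network round trips only. The function names, the 0.5 msec round-trip time and the batch size of 50 are illustrative assumptions, not figures from the talk, and server-side processing time is ignored.

```python
import math


def total_round_trips(num_commands: int, batch_size: int) -> int:
    """Round trips needed when up to batch_size commands share one request."""
    if num_commands < 0 or batch_size < 1:
        raise ValueError("num_commands must be >= 0 and batch_size >= 1")
    return math.ceil(num_commands / batch_size)


def network_wait_ms(num_commands: int, batch_size: int, rtt_ms: float) -> float:
    """Time spent waiting on the network only (hypothetical model)."""
    return total_round_trips(num_commands, batch_size) * rtt_ms


# Assumed values for illustration: 1,000 commands, 0.5 msec round trip.
unpipelined = network_wait_ms(1000, batch_size=1, rtt_ms=0.5)
pipelined = network_wait_ms(1000, batch_size=50, rtt_ms=0.5)
print(f"one command per round trip: {unpipelined} msec")
print(f"50 commands per round trip: {pipelined} msec")
```

Under this simplified model the network wait shrinks in proportion to the batch size; real gains depend on the protocol and on server processing, which are separate items on slides 18 to 20.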
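
Slides 27 to 29 distinguish three modes by whether replication and disk I/O are synchronous. The sketch below is one possible reading of those labels: it lists what would have to finish before the client is acknowledged in each mode. The exact step order in the original diagrams is not recoverable from this transcript, so the ordering, the `AckMode` enum and the step names are assumptions made for illustration.

```python
from enum import Enum


class AckMode(Enum):
    ASYNC = "async replication, async disk I/O"        # slide 27
    PARTIAL_SYNC = "sync replication, async disk I/O"  # slide 28
    FULL_SYNC = "sync replication, sync disk I/O"      # slide 29


def steps_before_ack(mode: AckMode) -> list[str]:
    """Steps assumed to complete before the client receives its ack."""
    steps = ["apply write in master memory"]
    if mode in (AckMode.PARTIAL_SYNC, AckMode.FULL_SYNC):
        # Sync replication: the slave must have the write before the ack.
        steps.append("replicate to slave")
    if mode is AckMode.FULL_SYNC:
        # Sync disk I/O: both copies reach storage before the ack.
        steps.append("write to master storage")
        steps.append("write to slave storage")
    return steps


for mode in AckMode:
    print(f"{mode.name}: {' -> '.join(steps_before_ack(mode))} -> ack")
```

Anything not listed for a mode is assumed to happen in the background after the ack, which is where the consistency questions on slide 26 come from.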
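
The cost figures on slides 36, 37 and 43 can be cross-checked with a few lines of arithmetic. The sketch below uses only the yearly totals and instance counts quoted on those slides; the helper name `savings` and the per-instance breakdown are illustrative additions.

```python
# Yearly costs (reserved instances) as quoted on slides 36, 37 and 43.
COST_RAM_2_REPLICAS = 2_132_250  # 150 x r3.8xlarge
COST_RAM_1_REPLICA = 1_421_500   # 100 x r3.8xlarge
COST_FLASH_1_REPLICA = 318_090   # 10 x i2.8xlarge


def savings(baseline: int, option: int) -> tuple[int, float]:
    """Return (dollars saved per year, percent saved) against the baseline."""
    saved = baseline - option
    return saved, round(100 * saved / baseline, 2)


# Implied cost per r3.8xlarge instance per year in both RAM-only setups.
print(COST_RAM_2_REPLICAS / 150, COST_RAM_1_REPLICA / 100)

print(savings(COST_RAM_2_REPLICAS, COST_RAM_1_REPLICA))
print(savings(COST_RAM_2_REPLICAS, COST_FLASH_1_REPLICA))
```

Both RAM-only totals work out to the same per-instance figure, which suggests the 15GB quorum server on slide 37 is not included in the $1,421,500 total.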
