
Managing 50K+ Redis Databases Over 4 Public Clouds ... with a Tiny Devops Team

26,700 views


A presentation by Redis Labs' CTO, Yiftach Shoolman, given at the July 2nd meet up, hosted by I am OnDemand and IGT Cloud at the Microsoft ILDC Auditorium.

See the video at: https://www.youtube.com/watch?v=eymqHZaUOH4

In this session Yiftach shares tips on how the company manages 50,000+ scalable and highly available Redis databases over the 4 largest public clouds, 8 leading Platforms-as-a-Service, and across 10 geographical regions. He explains the service's back-end architecture, the open-source projects it uses, and which tools the company builds in-house. Shoolman also shares what Redis Labs' small DevOps team does automatically, and what it still does manually. Finally, he offers advice on how to build a strong R&D team that lives and breathes DevOps.

Since the company launched its Redis Cloud service, it has dealt with 150+ node failure events and a half-dozen complete data-center outages. In addition, its team has experienced many interesting scenarios, such as hard-to-believe scaling patterns: from 0 to a few hundred gigabytes of in-memory data in just a few minutes, and from 0 to 300K+ ops/sec in just a few seconds.

Published in: Technology


  1. Powering lightning-fast apps
  2. Redis: the newest NoSQL. The fastest data store available today (served entirely from RAM). Among the top 3 databases chosen by developers. Much more than a simple key/value store: Strings, Hashes, Lists, Sets, Sorted Sets, Lua, transactions, bit operations. Strong use cases, dynamic community, large ecosystem.
  3. Redis Labs: leading the commercial Redis market. Founded in 2011; GA in 02/2013. 2,400+ paying customers; 52,000+ DBs; 100+ new DBs/day. 2nd largest contributor to open source Redis. Raised $13M (Bain/Carmel/Strategic/Angels). Offices in Santa Clara and Tel-Aviv.
  4. Our offering: Redis Cloud, Memcached Cloud. Fully-managed cloud services. On-prem server license: soon.
  5. Fast apps requirements: 100msec = max E2E response time, under any load. 50msec = average Internet latency. 50msec = required app response time (includes processing & multi DB accesses). 1msec = required DB response time. The only database to meet requirement=
  6. DB performance comparison (database names not captured in the transcript): @<1msec, @<1msec, @<1msec, @<20msec, @<10-50msec, @<10-50msec, @<100msec, @<100msec, @>100msec
  7. Why is Redis efficient? Many data structures. Many cool commands (atomicity maintained). Complexity aware.
  8. Real world use case: • 500+GB • 400K writes/sec • 1500 reads/sec • 37.5KB average object size. Efficiency: No extra work at app level 1.5Gbps 120Gbps Tons of work at app level NoSQL 6 Nodes cluster 150+ Nodes cluster
  9. Verticals & main use cases: Timeline, Followers, Caching, Messaging, Geo search, Leaderboards, Job management, RT analytics. Online advertising Social Gaming Financial Services
  10. Twitter: every Timeline (800 tweets per user) is on Redis. • Multi-TB in memory • ~300,000 reads/sec • ~5,000*N writes/sec (N = # of followers)
  11. Weibo (Chinese Twitter): • 20TB+ in memory • ~6,000,000 reads/sec • ~600,000 writes/sec • Counting • Reverse cache • Top 10 lists • Last Index • Relational list/Message Queue • Fast transactions w/ Lua
  12. Pinterest. Object graph: • Per user (Sorted Set w/ timestamp as score) → store the users followed (explicit + implicit) → store the user’s followers (explicit + implicit) • Per board → Redis Hash for storing explicit followers → Redis Set for storing explicit unfollowers
  13. Stack Overflow. Three levels of cache: • Local cache (no persistence) → sessions, and pending view count updates • Site cache → hot question id lists, users’ acceptance rates, … • Global cache → Inboxes, API usage quotas, …
  14. GitHub: • Redis is used for routing info • Matching user repositories to server names
  15. HipChat: • Which users are in which room • Who is online • XMPP server balancing
  16. YouPorn: most data is found in Hashes, with ordered Sets used to know what data to show. (1) ZINTERSTORE on: {videos:filters:release} {videos:filters:orientation:straight} {videos:filters:categories(id)} {videos:ordering:rating} (2) Perform a ZRANGE to get the pages we want and get the list of video_ids back (3) Start pipelining to get all the videos from Hashes
  17. Snapchat: • 500+ instances • 15-50TB • Running on GCE. 400M messages/day
  18. Why Redis Labs?
  19. Users choose us because: infinite seamless scalability, true high-availability, stable top performance, zero management.
  20. Dynamic Clustering Technology: Zero-latency proxy, Cluster manager, In-Memory Node, Cross-shard processor, In-Memory Cluster
  21. Challenge #1: How to serve users from the same data-center?
  22. 4 clouds / 10 regions
  23. 18 data-centers / 30 clusters
  24. AWS zones mapping dilemma: Redis Labs User us-east-1a us-east-1c us-east-1b us-east-1c us-east-1e us-east-1d us-east-1a us-east-1e us-east-1b
  25. How did we solve it: Eric Hammond’s post on Matching EC2 Availability Zones Across AWS Accounts
  26. How did we solve it: Redis Labs User
  27. Challenge #2: Which instance type shall we use for our cluster?
  28. Cluster’s node requirements: various instance types in the same cluster • High load scenarios • High memory usage scenarios • New generation of instances. Dedicated instances. As cheap as possible.
  29. The tip: Adrian Cockcroft's Blog - Understanding and using Amazon EBS - Elastic Block Store • use large instances and get dedicated instances for free
  30. What we use today: C3 & R3; A4/5/6/7; n1-standard, n1-highmem, n1-highcpu; BM+VM
  31. Challenge #3: How to manage data-persistence with high volumes of ‘writes’ and slow cloud storage?
  32. Ephemeral vs. Persistent storage. Ephemeral: direct attached, ephemeral, “fast”. EBS/Cloud Drive/Persistent Disk/SAN: network attached, persistent, slow.
  33. The tips: Adrian’s blog → use the larger EBSes if you want speed. Google (GCP) → “Larger volumes can achieve higher I/O levels than smaller volumes”
  34. What we do: We use large volumes (1TB+). We use both ephemeral and persistent storage. We improved/tuned/optimized the Redis persistent storage interface. If replication is enabled, slave writes to disk. We don’t use PIOPS.
  35. Why not PIOPS
  36. Challenge #4: How to monitor 50K+ databases, 30+ clusters and hundreds of nodes?
  37. Monitoring: Zabbix (not Nagios) - per node metrics. Limbic (home made) - databases’ metrics • 50K (databases) x 100+ (metrics) x 10K+ (time resolutions) • Based on Python, RRD, Redis. Redis adminUI - cluster configuration.
  38. Team/Method/Spirit
  39. Team/Method/Spirit: Tiny devops team. Core dev. team knows ops (very well). Baby steps, especially in production. The practical approach always wins. Review your plans every 3 months.
  40. We are hiring!
  41. Thank You
  42. Why is Redis efficient? Many data structures. Many cool commands (atomicity maintained). Complexity aware.
  43. Think data-structure: • Strings • Hashes • Lists • Sets • Sorted Sets • HyperLogLogs
  44. Cool commands: • SET if it doesn’t exist – O(1) • Blocking POP (with timeout) – O(1) • (blocking) POP from one list, PUSH to another – O(1) • Get/Set string ranges (and bit operation) – O(N) • Union/Intersect/Ranges of SETs – O(N)+O(Mxlog(M)) • Pub/Sub – O(1)/O(M)/O(M+N) • Lua / Transactions / Pipelining
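
The three-step query flow on slide 16 (ZINTERSTORE over filter sets plus an ordering set, ZRANGE for one page, then pipelined hash reads) can be modelled in plain Python. This is only a sketch of what those commands compute, not Redis client code. The key names are adapted from the slide; the sample data, the scratch key and the helper names (`zinterstore`, `zrange`, `fetch_page`) are invented for illustration, and the filter sets are assumed to carry score 0 so that the summed score equals the rating.

```python
# Plain-Python model of the slide-16 flow; no Redis server or client is involved.
# Sorted sets are modelled as dicts of member -> score, hashes as dicts of dicts.

def zinterstore(store, dest, keys):
    """Model ZINTERSTORE with its default SUM aggregate: keep the members
    present in every input set and add up their scores."""
    common = set(store[keys[0]])
    for key in keys[1:]:
        common &= set(store[key])
    store[dest] = {m: sum(store[k][m] for k in keys) for m in common}
    return len(store[dest])

def zrange(store, key, start, stop):
    """Model ZRANGE for non-negative indices: ascending score, ties broken
    by member, with an inclusive stop index."""
    ordered = sorted(store[key].items(), key=lambda kv: (kv[1], kv[0]))
    return [member for member, _ in ordered[start:stop + 1]]

def fetch_page(store, hashes, filter_keys, ordering_key, page, page_size):
    dest = "tmp:result"  # hypothetical scratch key, not from the slide
    zinterstore(store, dest, filter_keys + [ordering_key])    # step (1)
    start = page * page_size
    ids = zrange(store, dest, start, start + page_size - 1)   # step (2)
    # Step (3): in Redis these hash reads would be sent as one pipeline.
    return [hashes[f"video:{vid}"] for vid in ids]

# Hypothetical sample data: filter sets with score 0, ordering set with ratings.
store = {
    "videos:filters:release": {"1": 0, "2": 0, "3": 0, "4": 0},
    "videos:filters:categories:7": {"2": 0, "3": 0, "4": 0},
    "videos:ordering:rating": {"1": 4.5, "2": 3.0, "3": 4.8, "4": 1.2},
}
hashes = {f"video:{i}": {"id": str(i), "title": f"clip {i}"} for i in range(1, 5)}

filters = ["videos:filters:release", "videos:filters:categories:7"]
first_page = fetch_page(store, hashes, filters, "videos:ordering:rating", 0, 2)
second_page = fetch_page(store, hashes, filters, "videos:ordering:rating", 1, 2)
print([v["id"] for v in first_page], [v["id"] for v in second_page])
```

Note that ZRANGE walks the set in ascending score order, so this model returns the lowest-rated videos first; Redis also offers ZREVRANGE for the descending order.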

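
Slide 10 quotes Twitter's write rate as roughly 5,000*N, with N the number of followers, and says every timeline holds 800 tweets. One way to read that is fan-out on write: each new tweet is pushed onto the capped timeline of every follower. The sketch below models that reading in plain Python. The `followers` graph, the `post_tweet` helper and the use of `deque(maxlen=...)` in place of a trimmed Redis list are assumptions for illustration, not a description of Twitter's implementation.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 800  # tweets kept per timeline, as stated on the slide

# Hypothetical follower graph: author -> users who follow that author.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}

# One capped timeline per user; a deque with maxlen drops the oldest entry
# automatically, standing in for a Redis list that is trimmed after each push.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))

def post_tweet(author, tweet_id):
    """Push the tweet onto every follower's timeline.
    Returns the number of writes performed, i.e. N for this author."""
    targets = followers.get(author, [])
    for user in targets:
        timelines[user].appendleft(tweet_id)  # newest first
    return len(targets)

writes = post_tweet("alice", 1)   # 2 followers -> 2 writes
writes += post_tweet("bob", 2)    # 1 follower  -> 1 write
for tweet_id in range(3, 1003):   # 1,000 more tweets from alice
    post_tweet("alice", tweet_id)

print(writes, len(timelines["carol"]), timelines["carol"][0])
```

The point of the model is the cost shape: one tweet costs N writes, while reading a timeline is a single lookup of an already assembled list.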