Big data meetup 2012 01-18 - stripped

Presentation from Big Data London Jan 2012 on counting at scale, and the ways various things can't count.

Notes
  • Redis: what if you need 30K/s?
  • Reveal: “shard your counters”
  • 1+1+1 = 2: eventual consistency, cache consistency
  • Write only DB: Cassandra bug where get_range() wasn’t returning all the data in the DB
  • Single box: failures?
  • Cassandra counters don’t scale :(

    1. PUTTING THE X FACTOR INTO CASSANDRA: ADVENTURES IN COUNTING
        • MALCOLM BOX, CTO, LIVE TALKBACK
        • BIG DATA LONDON MEETUP, 18TH JANUARY 2012
    2. INTRO
        • Malcolm Box, CTO & Co-Founder
        • @malcolmbox
        • malcolm@tellybug.com
        • http://tellybug.com
    3. WHAT WE DID FOR X FACTOR
    7. X-FACTOR: THE RESULTS
        • Over 1 Million app downloads
        • Over 260 Million boos/claps
        • Massive peak loads on CTA
    8. BUT SURELY COUNTING IS EASY?
        • Need real time results
            • How many boos?
            • How many claps?
            • Rate of boos
            • Rate of claps
        • Design for scale
            • Goal of handling 10K per second coming into our servers
    9. DISTRIBUTED COUNTING
        • “Hey, my CPU can do 22305 MIPS!”
        • “Stick it in Memcache!”
        • “How about Redis?”
        • “OK, how about sharding?”
        • “Well, I hear Cassandra 0.8 has counters”
    19. MEMCACHE CAN’T COUNT
            cache.set(key, 1)
            cache.decr(key, 1)
            >>> 0L
            cache.decr(key, 1)
            >>> 0L
            cache.incr(key, -1)
            >>> 4294967295L
            cache.incr(key, 1)
            >>> 4294967296L
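The session above can be reproduced with a small in-memory model. This is only a sketch of the behaviour shown on the slide, not Memcache itself: `ModelCounterCache` is a hypothetical class, and it assumes the client coerces the delta to an unsigned 32-bit value while the server clamps `decr` at zero and wraps `incr` at 64 bits.

```python
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


class ModelCounterCache:
    """Toy model of the counter behaviour shown on the slide (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def decr(self, key, delta):
        # Assumption: the delta is sent as an unsigned 32-bit integer.
        delta &= UINT32_MASK
        # Decrementing never goes below zero; the value sticks at 0.
        self._data[key] = max(0, self._data[key] - delta)
        return self._data[key]

    def incr(self, key, delta):
        # Assumption: a negative delta is reinterpreted as unsigned 32-bit,
        # so -1 becomes 4294967295 instead of subtracting one.
        delta &= UINT32_MASK
        # The stored value is treated as an unsigned 64-bit integer.
        self._data[key] = (self._data[key] + delta) & UINT64_MASK
        return self._data[key]


cache = ModelCounterCache()
cache.set("boos", 1)
print(cache.decr("boos", 1))
print(cache.decr("boos", 1))
print(cache.incr("boos", -1))
print(cache.incr("boos", 1))
```

The point of the model is that `incr` and `decr` are not inverses of each other, so a counter that can go down as well as up needs extra care on the client side.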
    22. MEMCACHE CAN’T COUNT PART 3
        • EC2 limits
            • Single Memcache server runs out of network I/O
            • What then?
        • Redis?
            • Benchmarked on EC2 m1.large -> m1.large, 28K INCR/s
            • Network I/O limited
            • Can’t horizontally scale
    23. SHARDED COUNTERS
        • Implemented 2 level cache on web tier (https://gist.github.com/953524)
        • But a counter is more complicated
        • Sharded counter
        • Store (count, delta, timestamp) locally
        • Store count in L2 cache
        • Increment changes local delta
        • Push deltas to central every N seconds & refresh count
        • Eventually consistent
        • Maybe....unless something crashes
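A minimal sketch of the (count, delta, timestamp) scheme described on this slide. It is not the code from the linked gist: `CentralStore` and `LocalCounter` are hypothetical names, the central store is a plain dictionary standing in for the L2 cache, and the sketch is single-threaded.

```python
import time


class CentralStore:
    """Stand-in for the shared L2 cache holding the global count (hypothetical)."""

    def __init__(self):
        self._counts = {}

    def add(self, key, delta):
        # Apply a pushed delta and return the new global total.
        self._counts[key] = self._counts.get(key, 0) + delta
        return self._counts[key]


class LocalCounter:
    """Per-web-node counter state: (count, delta, timestamp)."""

    def __init__(self, key, central, flush_interval=5.0, clock=time.monotonic):
        self.key = key
        self.central = central
        self.flush_interval = flush_interval  # the "N seconds" from the slide
        self.clock = clock
        self.count = 0              # last global total seen from central
        self.delta = 0              # local increments not yet pushed
        self.timestamp = clock()    # time of the last push

    def increment(self, n=1):
        # An increment only touches local state.
        self.delta += n
        if self.clock() - self.timestamp >= self.flush_interval:
            self.flush()

    def flush(self):
        # Push the pending delta to central and refresh the local count.
        pending, self.delta = self.delta, 0
        self.count = self.central.add(self.key, pending)
        self.timestamp = self.clock()

    def value(self):
        # Best local estimate: last known global total plus unpushed work.
        return self.count + self.delta
```

Between flushes each node sees only its own increments on top of a stale global count, which is where the eventual consistency comes from. If a node crashes before `flush()` runs, its pending delta is lost, which matches the slide's "unless something crashes".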
    24. CASSANDRA HAS COUNTERS
        • New feature in Cassandra 0.8
        • Special column type: CounterColumnType as the validator
        • Distributed 64 bit counter, with eventual consistency
        • CL.ONE writes recommended to avoid implicit reads impacting performance
        • Reads tot up values from replicas to give value
        • Simple functionality: incr()/decr(), get()
    29. CAN CASSANDRA COUNT?
        • Yes, But....
        • Performance can suck
            • Switch off replicate_on_write, tune RF & cluster size
        • Not scalable
            • Scales as function of RF up to 4 nodes
            • Above that ... you’re out of luck
            • Best we achieved is ~10K/s increments to single counter value
        • What do you do if an operation fails?
    31. CASSANDRA - MAKE IT COUNT *FASTER*
        • Recommendation (from Cassandra committers...): SHARD YOUR COUNTERS
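One common reading of "shard your counters" is to spread increments for a logical counter over several sub-counters and sum them on read. The sketch below assumes that reading. It uses a `collections.Counter` as a stand-in for a counter column family, and the `ShardedCounter` class, the key layout and the shard count are all hypothetical rather than taken from the talk.

```python
import random
from collections import Counter


class ShardedCounter:
    """Spread writes for one logical counter across num_shards sub-counters."""

    def __init__(self, store, name, num_shards=16, rng=None):
        self.store = store          # stand-in for the backing counter store
        self.name = name
        self.num_shards = num_shards
        self.rng = rng or random.Random()

    def _shard_key(self, index):
        # Hypothetical key layout: "<name>:<shard index>".
        return f"{self.name}:{index}"

    def incr(self, n=1):
        # Each write lands on one randomly chosen shard, so no single
        # counter value has to absorb the full write rate.
        index = self.rng.randrange(self.num_shards)
        self.store[self._shard_key(index)] += n

    def get(self):
        # A read has to visit every shard and add the values up.
        return sum(self.store[self._shard_key(i)] for i in range(self.num_shards))


store = Counter()
claps = ShardedCounter(store, "claps", num_shards=8, rng=random.Random(0))
for _ in range(1000):
    claps.incr()
print(claps.get())
```

The trade-off is that writes get cheaper per counter value while reads become more expensive, since every read touches all shards.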
    34. YOU’RE NOT COUNTING IT RIGHT
        • When 1+1+1 is 2
        • Write Only Databases
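The slides do not spell out exactly how 1+1+1 came to 2. One way it can happen, shown here purely as an illustration, is a read-modify-write against a stale value: two clients read the same count before either writes back, so one increment is lost. The store and the function name below are hypothetical.

```python
def run_three_increments():
    """Three +1 operations where two of them start from the same stale read."""
    store = {"claps": 0}

    # Clients A and B both read before either has written.
    a_seen = store["claps"]
    b_seen = store["claps"]

    # Each writes back "what I saw, plus one".
    store["claps"] = a_seen + 1
    store["claps"] = b_seen + 1   # overwrites A's update with the same value

    # Client C reads the current value and increments it normally.
    c_seen = store["claps"]
    store["claps"] = c_seen + 1

    return store["claps"]


print(run_three_increments())
```

A server-side atomic increment avoids this particular interleaving, because the client never writes back a value it read earlier.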
    35. CONCLUSION
        • Counting is easy.....
        • Unless you want to do it really, really fast
        • If you’re inside the I/O limits for a single box, all is peachy
        • Above that, there are no good off-the-shelf answers
    36. ANY QUESTIONS?
        • We’re hiring - if you’re interested in helping us count, get in touch!
        • malcolm@tellybug.com
        • @malcolmbox
