Cassandra EU 2012 - Putting the X Factor into Cassandra

Malcolm Box's talk at Cassandra Europe 2012

  • Who I am. Background in mobile. Not a Big Data expert.
  • Apps that make TV more entertaining. Big shows, big audiences. Simple interaction, so we get lots of it. A small number of “results”.
  • X Factor: over 1M installs, 260 million boos/claps.
  • No way to scale MySQL for single-counter writes. Hybrid memcache/MySQL for values. Where to write the audit trail/log of what had happened? Step forward Acunu/Cassandra.
  • Random partitioner, UUID key type. Analytics by MySQL. A write-only database.
  • E.g. too many connections from the web tier.
  • Counters: moving production counts from MySQL to Cassandra. Social network: a challenge if you don’t own the graph.
  • Splaying writes is the normal solution: push everyone’s updates to all their friends. But what about friends who aren’t there yet?
  • Cassandra as the source of truth and destination for writes. Memcache as the place to read from; it holds social graphs, activity etc., updated in parallel with Cassandra writes. A lot of logic to deal with cache misses and horizontal scaling of the cache.
  • BGT used a memcache-based counter with write-behind to MySQL.
  • Bug in older versions of memcached and pylibmc; now fixed.
  • Redis: the same sort of issues. Fundamental limitation of a single value living on a single box.
  • Looked ideal for our needs: move counts out of memcache and MySQL.
  • Now multiple levels of inconsistency: Cassandra, the central memcache value, and the sharded counter values on each webserver box. What is “the truth”?
  • We saw crashes on too many connections, truncate behaviour, etc.
  • We have millions of records in the DB, and then counts etc. Are the two consistent? If not, why not? We’ve seen various issues, including missing reads and inconsistent counter values.
  • Netflix, Rackspace... everyone writes a tool. It took us a couple of weeks to be able to back up and restore our cluster successfully, and another week to figure out whether the data was the same.
  • Bursty loads: we need to scale both ways. Monitoring: we struggle generally with monitoring/alerting/graphing. Backup & restore to smaller clusters: see Priam from Netflix. Analytics: we’ve hit the wall on the get_range() approach.

    1. PUTTING THE X FACTOR INTO CASSANDRA: ADVENTURES IN COUNTING
       Malcolm Box, CTO, Tellybug
    2. INTRO
       Malcolm Box, CTO & Co-Founder
       @malcolmbox
       malcolm@tellybug.com
       http://tellybug.com
    4. WHAT I’M TALKING ABOUT
       How we started using Cassandra
       How we use it to power the X Factor and Britain’s Got Talent apps
       Counting - harder than you might think
       What we learnt along the way
    5. THE CHALLENGE
       10-12 million people watching these shows
       TV tells them to buzz/clap/score...
       ...servers melt
       Design goal: handle 10K interactions/s
    6. ROLL BACK 1 YEAR
       We’d won BGT 2011 - our first big talent show
       Existing MySQL/Django/Python stack
       Back-of-envelope calculations... oh dear
       Needed something quickly that could cope with the anticipated load
    7. OUR FIRST CASSANDRA SCHEMA
       create column family vote_log
         with comment = 'Log of votes'
         and comparator = UTF8Type
         and key_validation_class = UUIDType
         and default_validation_class = UTF8Type
         and column_metadata = [
           {column_name: ipaddr, validation_class: AsciiType},
           {column_name: poll, validation_class: LongType},
           {column_name: choice, validation_class: LongType},
           {column_name: idtoken, validation_class: UTF8Type},
           {column_name: count, validation_class: LongType}];
    8. WHAT WE LEARNT
       Cassandra scales beautifully for writes
       Cassandra has no single point of failure
       ...but it’s not hard to make it fail
       Ad-hoc questions and reporting were going to be much slower
    9. OPERATIONS
       BGT 2011 was a write-only DB
       Ignored failures
       One cluster, one AZ
       Backup to MySQL
    10. X FACTOR 2011
        Over 1 million app downloads
        Over 260 million boos/claps
    11. IMPLEMENTING X FACTOR WITH CASSANDRA
        Counting
        Social network
        No longer write-only
    12. WHAT ARE MY FRIENDS DOING?
        Scale makes this hard
        10K changes/s
        Which ones are relevant to which users?
        When new users (and their social graph) can arrive at any time
    13. SOLUTION
        New column family: user activity
        Maps a user to their interactions
        Write problem nicely randomised and thus ideal for Cassandra
        Read problem!
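The fan-out-on-write ("splaying") idea behind this slide can be sketched in a few lines. This is a toy model, not Tellybug's implementation: the names (`friends`, `activity`, `record_event`) are invented, and plain dicts stand in for the social graph and the user-activity column family.

```python
from collections import defaultdict

# Toy social graph: who follows whom (names invented for the sketch).
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice"],
}

# Stands in for the user-activity column family: one row per user.
activity = defaultdict(list)

def record_event(user, event):
    # Fan out on write: push the event onto every friend's activity row.
    # The writes land on many randomly distributed rows, which suits Cassandra.
    for friend in friends[user]:
        activity[friend].append((user, event))

def feed(user):
    # "What are my friends doing?" is now a single-row read.
    return activity[user]

record_event("alice", "booed act 3")
print(feed("bob"))  # bob's feed now contains alice's boo
```

The trade-off the notes mention is visible here: writes multiply by the friend count, and friends who join *after* an event was splayed never receive it without extra backfill logic.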
    14. COUNTING - HARDER THAN IT LOOKS
        Everyone can count
        But we need to count really fast
        And distribute the results to all the clients
    15. DISTRIBUTED COUNTING
        “Memcache does counters”
        “OK, how about sharding?”
        “Well, I hear Cassandra 0.8 has counters”
    16. ASIDE - THINGS THAT CAN’T COUNT #3
        cache.set(key, 1)
        cache.decr(key, 1)
        >>> 0L
        cache.decr(key, 1)
        >>> 0L
        cache.incr(key, -1)
        >>> 4294967295L
        cache.incr(key, 1)
        >>> 4294967296L
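Slide 16 is a client-side counting bug (the memcached/pylibmc issue the speaker notes say was later fixed). A minimal model of the unsigned-wraparound behaviour, assuming a 32-bit counter to reproduce the 4294967295 on the slide (memcached's server-side counters are actually 64-bit, and the mixed 32/64-bit values shown came from the client library):

```python
UINT32 = 2 ** 32  # assumption: model the counter as 32-bit unsigned

def incr(store, key, delta):
    """Increment, treating delta as unsigned: incr(key, -1) behaves
    like incr(key, 2**32 - 1), so the value wraps instead of dropping."""
    delta %= UINT32                       # negative delta reinterpreted
    store[key] = (store[key] + delta) % UINT32
    return store[key]

def decr(store, key, delta):
    """memcached's decr floors at zero rather than going negative."""
    store[key] = max(store[key] - delta, 0)
    return store[key]

store = {"claps": 1}
decr(store, "claps", 1)    # 1 -> 0
decr(store, "claps", 1)    # stays 0: decr cannot underflow
incr(store, "claps", -1)   # wraps to 4294967295
```

The two asymmetric rules (decr floors, incr wraps) are exactly why naive "just use memcache counters" schemes miscount under concurrent boos and claps.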
    17. SINGLE BOX LIMITS
        We have a single value
        Everything needs to read and write that value - from multiple servers
        EC2 limits
        A single memcache server runs out of network I/O
        What then?
    21. CASSANDRA HAS COUNTERS
        New (at the time) feature in Cassandra 0.8
        Special column type: CounterColumnType as the validator
        Distributed 64-bit counter, with eventual consistency
        CL.ONE writes recommended, to avoid implicit reads impacting performance
        Reads tot up values from replicas to give the value
        Simple functionality: incr()/decr(), get()
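The incr()/decr() calls on the slide are the Thrift-era client API. For reference, the same feature expressed in today's CQL looks like the sketch below (table and column names invented); the constraints are the same ones the slide describes: counters live in counter-only tables and support only relative updates.

```sql
-- Counter columns must live in a table containing only counters + the key.
CREATE TABLE boo_counts (
    poll_id bigint PRIMARY KEY,
    boos counter
);

-- Counters cannot be set directly, only incremented or decremented:
UPDATE boo_counts SET boos = boos + 1 WHERE poll_id = 42;

-- A read sums the replica shards to produce the value:
SELECT boos FROM boo_counts WHERE poll_id = 42;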
    22. CAN CASSANDRA COUNT?
        Yes, but...
        Performance can be an issue
          Switch off replicate_on_write; tune RF & cluster size
        Not scalable for a single counter
          Scales as a function of RF up to 4 nodes
          Above that... you’re out of luck
          Best we achieved was ~10K/s increments to a single counter value on EC2 m1.large instances
        What do you do if an operation fails?
    27. COUNTING AT SCALE WITH CASSANDRA
        Write throughput to a single counter is limited
        We were inside the performance limit, so writes could go to Cassandra
        No way to scale within Cassandra (yet)
        Reads have a serious performance overhead
        We used sharded counters in memcached, with the source of truth in Cassandra
        Few reads from Cassandra = much more predictable performance
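The sharded-counter pattern mentioned here can be sketched as follows. This is an in-memory illustration, not Tellybug's code: `ShardedCounter` and the shard count are invented, and a dict stands in for memcached.

```python
import random

NUM_SHARDS = 8  # assumption: chosen to spread the hot key's write load

class ShardedCounter:
    """Spread increments over N keys so no single key (or memcached
    node) becomes hot; sum the shards on read."""

    def __init__(self, name, shards=NUM_SHARDS):
        self.keys = ["%s:%d" % (name, i) for i in range(shards)]
        self.store = {k: 0 for k in self.keys}  # stands in for memcached

    def incr(self, delta=1):
        # Pick a random shard so concurrent writers rarely contend.
        self.store[random.choice(self.keys)] += delta

    def value(self):
        # Reads are rarer than writes, so summing N keys is acceptable.
        return sum(self.store[k] for k in self.keys)

boos = ShardedCounter("boos")
for _ in range(1000):
    boos.incr()
print(boos.value())  # -> 1000
```

This matches the trade-off on the slide: writes scale linearly with the shard count, while reads pay an N-key fan-in, which is fine when reads are infrequent and the authoritative total lives in Cassandra anyway.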
    28. OPERATIONS
        Cassandra GUIs & management consoles still in their infancy
        Hard to figure out what was going wrong when performance suffered
        Analytics (and backup) still via dump to MySQL
          Flexible, well understood
        Single cluster, single AZ
    29. WHERE WE WERE AFTER X FACTOR
        Cassandra as a source of truth in production
        Mainly write load
        Memcached layer on top
        Simple operations
        No backups :(
    30. BEYOND X FACTOR
        Dancing on Ice - harder counting
        Britain’s Got Talent 2012 - more social
        Backups
        Data integrity
    31. DATA CONSISTENCY
        There’s no referential integrity
        So is the data in the database self-consistent?
        Or do you have a bug somewhere?
        How do you validate the data?
        Truth + 1
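One way to read "Truth + 1": with no referential integrity, you validate by recounting the source of truth and diffing it against the derived counters. A hypothetical sketch (the names `vote_log` and `counters` are invented; real data would be paged out of Cassandra):

```python
from collections import Counter

# The vote log is the source of truth; counters are derived from it.
vote_log = [("poll1", "boo"), ("poll1", "boo"), ("poll1", "clap")]
counters = {("poll1", "boo"): 2, ("poll1", "clap"): 1}

# Recount the log and diff against the live counter values.
recounted = Counter(vote_log)
mismatches = {key: (counters.get(key, 0), true_count)
              for key, true_count in recounted.items()
              if counters.get(key, 0) != true_count}

print(mismatches)  # an empty dict means the counters match the log
```

Any entry in `mismatches` means a counter has drifted, which is exactly the class of issue (missing reads, inconsistent counter values) the speaker notes describe.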
    32. BACKUPS
        Backing up a cluster isn’t easy
        Restoring can be harder...
    33. CONCLUSION
        Cassandra saved our bacon :)
        Scales to insane write loads
        Reads are easier to scale in memcached
        Beware of limitations on “hot” values
        Migrating functionality gradually let us learn the operational aspects
        There are lots of interesting failure scenarios at scale
    34. TODO
        Scale-up/scale-down of a cluster
        Better monitoring and operations
        Analytics using Hadoop
    35. ANY QUESTIONS?
        We’re hiring - if you want to work on wicked scaling problems and reach millions of users, get in touch!
        malcolm@tellybug.com
        @malcolmbox
