Scaling Crashlytics: Building Analytics on Redis 2.6

20,470 views

Published in: Technology
  • Shameless plug: just yesterday, I published a Ruby gem for bit-wise operations over sparse bitmaps in Redis using natural syntax that lets you store millions of users without wasting precious memory:

    week1 = redis.sparse_bitmap('accounts::active::2012-W41')
    week2 = redis.sparse_bitmap('accounts::active::2012-W42')
    week3 = redis.sparse_bitmap('accounts::active::2012-W43')

    result = week1 & week2 & week3

    (It results in one BITOP AND command.)

    Setting bit for a millionth user:

    week3[1_000_000] = true

    (Uses around 20kb.)

    http://github.com/bilus/redis-bitops

    Enjoy!
  • heavy stuff my man!

Scaling Crashlytics: Building Analytics on Redis 2.6

  1. Redis Analytics
      @JeffSeibert, CEO, Crashlytics
      CRASHLYTICS CONFIDENTIAL © 2012. All rights reserved
  2. (No text on this slide.)
  3. (No text on this slide.)
  4. Crashlytics for Mac
  5. Strings, Lists, Hashes, Sets, Sorted Sets
  6. Strings: Activity Tracking
      Lists
      Hashes: Event Tracking
      Sets
      Sorted Sets: Leader boards
  7. Active User Tracking
  8. Active User Tracking
      CREATE TABLE accounts (
        id int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name varchar(255),
        email varchar(255),
        ...
        last_active_at datetime
      );
  9. Active User Tracking
      CREATE TABLE events (
        id int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
        type varchar(32),
        account_id int(11),
        happened_at datetime
      );
  10. Active User Tracking
      accounts::active   0 0 0 0 1 0 0 1
      SETBIT key offset value (>= 2.2) O(1)
      > SETBIT "accounts::active" 4 1
      > SETBIT "accounts::active" 7 1
  11. Active User Tracking
      accounts::active::2012-10         1 1 1 0 1 0 1 1
      accounts::active::2012-10-22      0 0 1 0 1 0 0 1
      accounts::active::2012-10-22-00   0 0 0 0 1 0 0 1
  12. Active User Tracking
      def record_active(obj, t=Time.now.utc)
        # pluralize comes from ActiveSupport
        key = "#{obj.class.name.downcase.pluralize}::active::"
        key << t.year.to_s
        key << "-" << "%02d" % t.month
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10
        key << "-" << "%02d" % t.day
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10-22
        key << "-" << "%02d" % t.hour
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10-22-00
      end
  13. Active User Tracking
      ‣ We want to know…
        • How many users were active today? This month?
          BITCOUNT key (>= 2.6) O(N)
          > BITCOUNT "accounts::active::2012-10-22"
          (integer) 3
          > BITCOUNT "accounts::active::2012-10"
          (integer) 5
        • Was user X active today? This month?
          GETBIT key index (>= 2.2) O(1)
          > GETBIT "accounts::active::2012-10-22" 6
          (integer) 0
          > GETBIT "accounts::active::2012-10" 6
          (integer) 1
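The key scheme and the three bitmap commands from slides 10 to 13 can be tried without a Redis server. The following is a minimal sketch, assuming an in-memory stand-in: `FakeBitmaps`, `activity_keys` and the sample ids are hypothetical names and data, not part of the talk, and Ruby Integers are used as bitsets instead of Redis strings. Counts and membership checks behave the same way, although Redis orders bits within each byte differently.

```ruby
# In-memory stand-in for SETBIT / GETBIT / BITCOUNT (hypothetical helper,
# not a Redis client). Each key maps to an Integer used as a bitset;
# bit N stands for the object with id N.
class FakeBitmaps
  def initialize
    @bitmaps = Hash.new(0)
  end

  # Mirrors: SETBIT key offset 1
  def setbit(key, offset)
    @bitmaps[key] |= (1 << offset)
  end

  # Mirrors: GETBIT key offset
  def getbit(key, offset)
    (@bitmaps[key] >> offset) & 1
  end

  # Mirrors: BITCOUNT key
  def bitcount(key)
    @bitmaps[key].to_s(2).count("1")
  end
end

# Builds the month, day and hour keys shown on slide 11.
def activity_keys(prefix, t)
  month = format("%s::active::%04d-%02d", prefix, t.year, t.month)
  day   = format("%s-%02d", month, t.day)
  hour  = format("%s-%02d", day, t.hour)
  [month, day, hour]
end

# Same idea as record_active on slide 12, with the store passed in.
def record_active(store, prefix, id, t = Time.now.utc)
  activity_keys(prefix, t).each { |key| store.setbit(key, id) }
end

store = FakeBitmaps.new
record_active(store, "accounts", 4, Time.utc(2012, 10, 22, 0))
record_active(store, "accounts", 7, Time.utc(2012, 10, 22, 0))
record_active(store, "accounts", 6, Time.utc(2012, 10, 3, 15))

puts store.bitcount("accounts::active::2012-10-22")   # active that day
puts store.bitcount("accounts::active::2012-10")      # active that month
puts store.getbit("accounts::active::2012-10-22", 6)  # was id 6 active that day?
```

Writing three bits per event trades a little write volume for reads that need only one command per time bucket.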
  14. Active User Tracking
      ‣ Graphs and Heatmaps
        • Monthly actives over time?
          > BITCOUNT "accounts::active::2012-07"
          > BITCOUNT "accounts::active::2012-08"
          > BITCOUNT "accounts::active::2012-09"
          > BITCOUNT "accounts::active::2012-10"
          ...
        • Over time, when was user X active?
          > GETBIT "accounts::active::2012-10-22" 6
          > GETBIT "accounts::active::2012-10-21" 6
          > GETBIT "accounts::active::2012-10-20" 6
          > GETBIT "accounts::active::2012-10-19" 6
          ...
  15. Active User Tracking
      ‣ Advanced Data-Mining: WAU
        • Computing weekly active users:
          BITOP op destkey srckey [srckeys...] (>= 2.6) O(N)
          > BITOP OR "accounts::active::2012-W42"
              "accounts::active::2012-10-21" "accounts::active::2012-10-20"
              "accounts::active::2012-10-19" "accounts::active::2012-10-18"
              "accounts::active::2012-10-17" "accounts::active::2012-10-16"
              "accounts::active::2012-10-15"
          > BITCOUNT "accounts::active::2012-W42"
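The seven source keys in the slide 15 example cover Monday 2012-10-15 through Sunday 2012-10-21, which matches ISO week 42 of 2012. Assuming the `W42` suffix is meant as an ISO week number (the slides do not say so), the argument list for `BITOP OR` could be generated along these lines. `daily_keys_for_iso_week` and `weekly_key` are hypothetical helper names.

```ruby
require "date"

# Daily keys for Monday..Sunday of the given ISO week.
def daily_keys_for_iso_week(prefix, year, week)
  (1..7).map do |weekday|
    day = Date.commercial(year, week, weekday)
    "#{prefix}::active::#{day.strftime('%Y-%m-%d')}"
  end
end

# Destination key in the style used on the slide, e.g. 2012-W42.
def weekly_key(prefix, year, week)
  format("%s::active::%04d-W%02d", prefix, year, week)
end

sources = daily_keys_for_iso_week("accounts", 2012, 42)
dest    = weekly_key("accounts", 2012, 42)

# Argument list equivalent to the redis-cli call on the slide.
command = ["BITOP", "OR", dest, *sources]
puts command.join(" ")
```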
  16. Active User Tracking
      ‣ Advanced Data-Mining: Retention
        • What % of users active last week are active this week?
          BITOP op destkey srckey [srckeys...] (>= 2.6) O(N)
          > BITOP AND "accounts::active::2012-W41+W42"
              "accounts::active::2012-W41" "accounts::active::2012-W42"
          > BITCOUNT "accounts::active::2012-W41"
          > BITCOUNT "accounts::active::2012-W41+W42"
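The retention figure from slide 16 is the second BITCOUNT divided by the first. A minimal sketch of that arithmetic, assuming bitmaps modelled as Ruby Integers rather than Redis strings; the helper names and the sample account ids are made up for illustration.

```ruby
# Bit N set means account id N was active in that week.
def bitmap_of(ids)
  ids.reduce(0) { |bits, id| bits | (1 << id) }
end

def popcount(bits)
  bits.to_s(2).count("1")
end

# Share of last week's actives that are also active this week, in percent.
def retention_percent(last_week, this_week)
  base = popcount(last_week)              # BITCOUNT ...::2012-W41
  return 0.0 if base.zero?                # avoid dividing by zero

  retained = popcount(last_week & this_week) # BITOP AND, then BITCOUNT
  100.0 * retained / base
end

week41 = bitmap_of([1, 2, 3, 4])
week42 = bitmap_of([2, 4, 5])
retention = retention_percent(week41, week42)
puts retention
```

Accounts that first appear in the later week (id 5 here) do not affect the result, because the denominator is last week's count.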
  17. Active User Tracking
      ‣ Advanced Data-Mining: Churn
        • Locate accounts that have been inactive for 3 months
          BITOP op destkey srckey [srckeys...] (>= 2.6) O(N)
          > BITOP OR "accounts::active::2012-Q3"
              "accounts::active::2012-09" "accounts::active::2012-08"
              "accounts::active::2012-07"
          > BITOP NOT "accounts::churned::2012-Q3" "accounts::active::2012-Q3"
          > BITCOUNT "accounts::churned::2012-Q3"
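One thing to watch with the churn query on slide 17: `BITOP NOT` inverts every bit of the source string, including trailing bits in the last byte that belong to no account, so the raw BITCOUNT can overstate churn. The sketch below models bitmaps as byte arrays in plain Ruby to show the effect. The "all accounts" mask is an assumption added here for illustration and does not appear in the talk.

```ruby
# Bitmaps as arrays of bytes (0..255), laid out like a Redis string value:
# offset 0 is the most significant bit of the first byte.

def setbit(bytes, offset)
  index = offset / 8
  bytes << 0 while bytes.length <= index # zero-pad, as SETBIT does
  bytes[index] |= (0x80 >> (offset % 8))
  bytes
end

def bitop_not(bytes)
  bytes.map { |byte| ~byte & 0xFF }
end

# The shorter operand is treated as zero-padded.
def bitop_and(a, b)
  length = [a.length, b.length].max
  (0...length).map { |i| (a[i] || 0) & (b[i] || 0) }
end

def bitcount(bytes)
  bytes.sum { |byte| byte.to_s(2).count("1") }
end

# Hypothetical data: ten accounts exist (ids 0..9), three were active in Q3.
all_accounts = (0..9).reduce([]) { |bytes, id| setbit(bytes, id) }
active_q3    = [1, 4, 9].reduce([]) { |bytes, id| setbit(bytes, id) }

naive_churned  = bitop_not(active_q3)                   # also flips ids 10..15
masked_churned = bitop_and(naive_churned, all_accounts) # keep real accounts only

naive_count  = bitcount(naive_churned)
masked_count = bitcount(masked_churned)
puts naive_count
puts masked_count
```

With these inputs the unmasked count includes six ids that were never assigned, while the masked count reflects only the seven existing accounts that were inactive.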
  18. Active User Tracking
      def record_boolean(obj, topic=:active, t=Time.now.utc)
        key = "#{obj.class.name.downcase.pluralize}::#{topic}::"
        key << t.year.to_s
        key << "-" << "%02d" % t.month
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10
        key << "-" << "%02d" % t.day
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10-22
        key << "-" << "%02d" % t.hour
        REDIS.setbit key, obj.id, 1   # accounts::active::2012-10-22-00
      end
  19. Event Tracking
  20. Event Tracking
      apps::crashes   0 0 0 0 ? 0 0 0
  21. Event Tracking
      apps::crashes { 0 => 34, 1 => 546457, 2 => 1 }
      HINCRBY key field increment (>= 2.0) O(1)
      > HINCRBY "apps::crashes" "0" 1
      > HINCRBY "apps::crashes" "2" 1
  22. Event Tracking
      app::0::crash::by_day { 2012-10-22 => 34, 2012-10-21 => 46, 2012-10-20 => 29, ... }
      > HINCRBY "app::0::crash::by_day" "2012-10-22" 1
  23. Event Tracking
      def record_event(obj, topic=:crash, specificity=:day, t=Time.now.utc)
        key = "#{obj.class.name.downcase}::#{obj.id}::#{topic}::by_#{specificity}"
        # e.g. app::0::crash::by_day
        field = t.year.to_s
        field << "-" << "%02d" % t.month   # 2012-10
        REDIS.hincrby key, field, 1 if specificity == :month
        field << "-" << "%02d" % t.day     # 2012-10-22
        REDIS.hincrby key, field, 1 if specificity == :day
        field << "-" << "%02d" % t.hour    # 2012-10-22-00
        REDIS.hincrby key, field, 1 if specificity == :hour
      end
  24. Event Tracking
      ‣ We want to…
        • Power a graph of crashes over the last week
          HMGET key field1 [...] (>= 2.0) O(N)
          > HMGET "app::0::crash::by_day" "2012-10-22" "2012-10-21" "2012-10-20"
              "2012-10-19" "2012-10-18" "2012-10-17" "2012-10-16"
          1) ...
        • "Zoom" the graph to see more detail
          > HMGET "app::0::crash::by_hour" "2012-10-22-00" "2012-10-22-01"
              "2012-10-22-02" "2012-10-22-03" "2012-10-22-04" "2012-10-22-05"
              "2012-10-22-06" ...
          1) ...
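The field lists passed to HMGET on slide 24 follow directly from the date. A rough sketch of generating them and turning the reply into graph points, assuming a plain Ruby Hash in place of the Redis hash. HMGET returns nil for fields that do not exist, which `Hash#values_at` imitates here, and counts arrive as strings. The helper names and sample counts are hypothetical.

```ruby
require "date"

# Field names for the `count` days ending at end_date, newest first.
def last_days_fields(end_date, count = 7)
  (0...count).map { |back| (end_date - back).strftime("%Y-%m-%d") }
end

# Field names for the 24 hourly buckets of one day ("zoomed" view).
def hours_of_day_fields(date)
  (0..23).map { |hour| format("%s-%02d", date.strftime("%Y-%m-%d"), hour) }
end

# Pairs each field with its count; missing buckets become 0.
def series(hash, fields)
  fields.zip(hash.values_at(*fields)).map { |field, count| [field, count.to_i] }
end

# Stand-in for the contents of app::0::crash::by_day.
by_day = { "2012-10-22" => "34", "2012-10-21" => "46", "2012-10-20" => "29" }

fields = last_days_fields(Date.new(2012, 10, 22))
points = series(by_day, fields)
points.each { |field, count| puts "#{field} #{count}" }
```

Zero-filling on the client keeps the graph continuous for days on which no HINCRBY was ever issued.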
  25. Grouped Event Tracking
      "How often has app X crashed on each type of iPad?"
  26. Grouped Event Tracking
      app::0::crash::iPad1,1 { 2012-10-22 => 34, 2012-10-21 => 46, 2012-10-20 => 29, ... }
      app::0::crash::iPad2,1 { 2012-10-22 => 12, 2012-10-21 => 17, 2012-10-20 => 11, ... }
      device_models [ "iPad1,1", "iPad2,1", ... ]
  27. Grouped Event Tracking
      app::0::crash::2012-10-22 { ALL => 46, iPad1,1 => 34, iPad2,1 => 12, ... }
      HGETALL key (>= 2.0) O(N)
      > HGETALL "app::0::crash::2012-10-22"
      (multi-bulk)
  28. Grouped Event Tracking
      def record_grouped_event(obj, group, topic=:crash, t=Time.now.utc)
        key = "#{obj.class.name.downcase}::#{obj.id}::#{topic}::"
        key << t.year.to_s
        key << "-" << "%02d" % t.month   # app::0::crash::2012-10
        REDIS.hincrby key, group, 1
        REDIS.hincrby key, "ALL", 1
        key << "-" << "%02d" % t.day     # app::0::crash::2012-10-22
        REDIS.hincrby key, group, 1
        REDIS.hincrby key, "ALL", 1
        key << "-" << "%02d" % t.hour    # app::0::crash::2012-10-22-00
        REDIS.hincrby key, group, 1
        REDIS.hincrby key, "ALL", 1
      end
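To see how the layout from slide 27 fills up, here is a minimal in-memory sketch of the grouped counter. `FakeHashes` is a hypothetical stand-in for HINCRBY and HGETALL, and the prefix and id are passed explicitly instead of being derived from a model object. Note that a group literally named "ALL" would collide with the total field.

```ruby
# In-memory stand-in for Redis hashes (hypothetical, not a Redis client).
class FakeHashes
  def initialize
    @hashes = Hash.new { |hashes, key| hashes[key] = Hash.new(0) }
  end

  # Mirrors: HINCRBY key field increment
  def hincrby(key, field, by)
    @hashes[key][field] += by
  end

  # Mirrors: HGETALL key (empty hash for a missing key)
  def hgetall(key)
    @hashes.fetch(key, {}).dup
  end
end

# Month, day and hour keys, e.g. app::0::crash::2012-10-22.
def grouped_keys(prefix, id, topic, t)
  month = format("%s::%d::%s::%04d-%02d", prefix, id, topic, t.year, t.month)
  day   = format("%s-%02d", month, t.day)
  hour  = format("%s-%02d", day, t.hour)
  [month, day, hour]
end

# Bumps the per-group counter and the running total in each time bucket.
def record_grouped_event(store, prefix, id, group, topic, t)
  grouped_keys(prefix, id, topic, t).each do |key|
    store.hincrby(key, group, 1)
    store.hincrby(key, "ALL", 1)
  end
end

store = FakeHashes.new
t = Time.utc(2012, 10, 22, 0)
record_grouped_event(store, "app", 0, "iPad1,1", :crash, t)
record_grouped_event(store, "app", 0, "iPad1,1", :crash, t)
record_grouped_event(store, "app", 0, "iPad2,1", :crash, t)

day_counts = store.hgetall("app::0::crash::2012-10-22")
puts day_counts.inspect
```

Keeping the total as its own field means one HGETALL answers both the overall count and the per-device breakdown.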
  29. MongoDB
      > Account.first.id
      => BSON::ObjectId('507db04798a3340ada000002')
  30. Sequential ID Generation
      sequential_ids::accounts {
        10  5084bfbb98a33406f0000002,
        9   5084bfa798a33406f0000001,
        8   507db04798a3340ada000002,
        ...
      }
      ZADD key score member (>= 1.2) O(log(N))
      > ZADD "sequential_ids::accounts" 10 507db04798a3340ada000002
      (integer) 1
  31. Sequential ID Generation
      sequential_ids::accounts {
        10  5084bfbb98a33406f0000002,
        9   5084bfa798a33406f0000001,
        8   507db04798a3340ada000002,
        ...
      }
      ZCARD key (>= 1.2) O(1)
      > ZCARD "sequential_ids::accounts"
      (integer) 9
      ZADD key score member (>= 1.2) O(log(N))
      > ZADD "sequential_ids::accounts" 10 5084bfbb98a33406f0000002
      (integer) 1
  32. Sequential ID Generation
      sequential_ids::accounts {
        10  5084bfbb98a33406f0000002,
        9   5084bfa798a33406f0000001,
        8   507db04798a3340ada000002,
        ...
      }
      ZSCORE key member (>= 1.2) O(1)
      > ZSCORE "sequential_ids::accounts" 5084bfbb98a33406f0000002
      (integer) 10
  33. Sequential ID Generation
      def sequential_id(obj)
        # Key name matches the sequential_ids::accounts examples on slides 30-32
        key = "sequential_ids::#{obj.class.name.downcase.pluralize}"
        id = obj.id.to_s
        # Lua script to atomically determine the score of an id.
        # If needed, adds it to the set with the next available score.
        # In the general case, O(1). On add, O(log(N)). Requires Redis >= 2.6
        monotonic_zadd = <<-LUA
          local sequential_id = redis.call('zscore', KEYS[1], ARGV[1])
          if not sequential_id then
            sequential_id = redis.call('zcard', KEYS[1])
            redis.call('zadd', KEYS[1], sequential_id, ARGV[1])
          end
          return sequential_id
        LUA
        REDIS.eval(monotonic_zadd, [key], [id]).to_i
      end
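The logic of the Lua script on slide 33 can be followed in plain Ruby. This is only a model, assuming a Hash in place of the sorted set: `FakeSequentialIds` is a hypothetical name, and the atomicity that the script gets from running inside Redis is not reproduced here.

```ruby
# Models the sorted set as member => score.
class FakeSequentialIds
  def initialize
    @scores = {}
  end

  # Returns the existing score (ZSCORE), or assigns the current
  # cardinality (ZCARD) as the score of a new member (ZADD).
  def sequential_id(member)
    @scores[member] ||= @scores.size
  end
end

ids = FakeSequentialIds.new
first  = ids.sequential_id("507db04798a3340ada000002")
second = ids.sequential_id("5084bfa798a33406f0000001")
again  = ids.sequential_id("507db04798a3340ada000002")
puts [first, second, again].inspect
```

Because the next id is derived from the set's cardinality, numbering starts at 0 and stays unique only as long as members are never removed from the set.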
  34. Redis Analytics Wish List
  35. Redis Analytics Wish List
      ‣ MSETBIT, MGETBIT, MBITCOUNT, HMINCRBY
        • Can already be addressed with scripting
      ‣ Native support for (insertion-)ordered sets
      ‣ Per-hash-key expiration policies
  36. Q&A
      @JeffSeibert, CEO, Crashlytics
