Real Time Analytics with Cassandra

A recipe for Acunu-style analytics with Cassandra

Published in: Technology

  1. Real Time Analytics with Cassandra. Vagmi Mudumbai, @vagmi / @reducedata
  2. What is Cassandra?
  3. Based on Dynamo
  4. Built by Facebook
  5. It is both a key-value store
  6. and a column store
  7. The CAP Theorem
  8. Column Families
  9. HashMap<RowKey, SortedMap<ColumnName, Value>>
  10. id  name     email            country
      1   Vagmi    me@vagmim.in     IN
      2   Karthik  yeskarthik@blah  IN
      3   MarkZ    mark@fb          US

      Rowkey  name     email            country
      1       Vagmi    me@vagmim.in     IN
      2       Karthik  yeskarthik@blah  IN
      3       MarkZ    mark@fb          US
  11. The Problem
  12. As a user, I want to view real-time metrics and filter by dimensions like time, city, category, etc.
  13. The wrong way: select sum(measure) from events where time between A and B and country='US' and device_platform='Android'
  14. HashMap<RowKey, SortedMap<ColumnName, Value>>
  15. Counters
  16. create column family view_counts_hourly with comparator=UTF8Type and default_validation_class=CounterColumnType and key_validation_class=UTF8Type;
  17. http://reducedata.com/, Chrome, 2014-03-14 15:30:00Z, IP, Cookie-Info
  18. RowKey             20140101  20140102  20140103  20140104  ...  ...  20140628  ...  20150308
      sid1#us            2553      2341      2342      3242      ...  ...  32342     ...  33423
      sid1#us#chrome     1556      1532      1892      ...       ...  ...  ...       ...  ...
      sid1#us#chrome#25  833       899       1200
  19. But what about uniques?
  20. Bitmaps to the rescue
  21. 1   0   1   0   1   1   0   0   0   1    1    0    1    0    0    1
      u1  u2  u3  u4  u5  u6  u7  u8  u9  u10  u11  u12  u13  ...  ...  ...
  22. UID: 1328abc2838fd283e282 -> Fast hash function (Murmur32) ->
      1   0   1   0   1   1   0   0   0   1    1    0    1    0    0    1
      u1  u2  u3  u4  u5  u6  u7  u8  u9  u10  u11  u12  u13  ...  ...  ...
  23. RowKey             20140101  20140102  20140103  20140104  ...  ...  20140628  ...  20150308
      sid1#us            10101     10111     11100     11101     ...  ...  ...       ...  11101
      sid1#us#chrome     ...       ...       ...       ...       ...  ...  ...       ...  ...
      sid1#us#chrome#25  10101     11101     11100     ...       ...  ...  ...       ...  ...
  24. But I do not have Big Data
  25. Oh, and we're hiring (vagmi@reducedata.co)
  26. Thanks. @vagmi on GitHub / Twitter / Facebook
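
The counter layout on slides 16 to 18 can be sketched in plain Python. This is a minimal in-memory illustration, not the presenter's code: a dictionary of counters stands in for the `view_counts_hourly` column family, and the helper names `rollup_keys`, `record_view`, and `total_views` are hypothetical. It also assumes the site id, country, and browser version have already been derived from the raw event on slide 17.

```python
from collections import Counter, defaultdict
from datetime import datetime

# In-memory stand-in for a counter column family:
# row key -> {column name (YYYYMMDD) -> view count}
view_counts = defaultdict(Counter)


def rollup_keys(site_id, country, browser, version):
    """Return one row key per drill-down level, starting at site#country."""
    parts = [site_id, country.lower(), browser.lower(), str(version)]
    return ["#".join(parts[:n]) for n in range(2, len(parts) + 1)]


def record_view(site_id, country, browser, version, timestamp):
    """Increment the day's counter in every rollup row the event belongs to."""
    day = timestamp.strftime("%Y%m%d")
    for key in rollup_keys(site_id, country, browser, version):
        view_counts[key][day] += 1


def total_views(row_key, first_day, last_day):
    """Sum one row's counters over an inclusive YYYYMMDD range."""
    row = view_counts.get(row_key, {})
    return sum(n for day, n in row.items() if first_day <= day <= last_day)


# Hypothetical event: one Chrome 25 view from the US on 2014-03-14.
record_view("sid1", "US", "Chrome", 25, datetime(2014, 3, 14, 15, 30))
print(total_views("sid1#us#chrome", "20140301", "20140331"))  # 1
```

Each event is written once per rollup level, so a filtered query becomes a read of one row over a range of columns rather than a scan over raw events. Against a real cluster, the dictionary updates would be replaced by counter increments on the column family.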

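The bitmap approach to uniques on slides 20 to 23 can be sketched the same way. This is a rough illustration under stated assumptions: `zlib.crc32` stands in for Murmur32 (which is not in the Python standard library), the bitmap size `BITMAP_BITS` is an arbitrary choice, and the function names are hypothetical. A Python integer plays the role of the bitmap that would be stored in each column.

```python
import zlib

BITMAP_BITS = 1 << 16  # assumed bitmap size; not from the slides


def bit_position(uid):
    """Map a user id to a bit index. CRC32 is a stand-in for Murmur32."""
    return zlib.crc32(uid.encode("utf-8")) % BITMAP_BITS


def mark_seen(bitmap, uid):
    """Return the bitmap with the user's bit set."""
    return bitmap | (1 << bit_position(uid))


def unique_count(bitmap):
    """Count set bits, which estimates the number of distinct users."""
    return bin(bitmap).count("1")


def union(*bitmaps):
    """OR bitmaps together, e.g. to get uniques across several days."""
    result = 0
    for bitmap in bitmaps:
        result |= bitmap
    return result


# The UID from slide 22, seen twice on the same day, sets only one bit.
day1 = mark_seen(0, "1328abc2838fd283e282")
day1 = mark_seen(day1, "1328abc2838fd283e282")
print(unique_count(day1))  # 1
```

Because different users can hash to the same bit, the count of set bits can undercount; a larger bitmap reduces collisions at the cost of more storage per column. Unlike counters, bitmaps for several days can be combined with OR without double-counting a returning user.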