Real Time Analytics with Cassandra

A recipe of Acunu style analytics with Cassandra

Real Time Analytics with Cassandra Presentation Transcript

  • Real Time Analytics with Cassandra. Vagmi Mudumbai, @vagmi / @reducedata
  • What is Cassandra?
  • Based on Dynamo
  • Built by Facebook
  • Is both a Key Value Store
  • and a Column Store
  • The CAP Theorem
  • Column Families
  • HashMap<RowKey,SortedMap<ColumnName, Value>>
  • id  name     email            country
    1   Vagmi    me@vagmim.in     IN
    2   Karthik  yeskarthik@blah  IN
    3   MarkZ    mark@fb          US

    Rowkey   1             2                3
    name     Vagmi         Karthik          MarkZ
    email    me@vagmim.in  yeskarthik@blah  mark@fb
    country  IN            IN               US
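The HashMap<RowKey,SortedMap<ColumnName, Value>> shape from the slides can be sketched in a few lines of Python. This is only a toy in-memory model for illustration, not how Cassandra is implemented or accessed; the `ColumnFamily` class and its method names are hypothetical.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy in-memory model: row key -> {column name -> value}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def insert(self, row_key, column, value):
        self.rows[row_key][column] = value

    def get_row(self, row_key):
        # Return columns sorted by name, mirroring the SortedMap in the slide.
        return sorted(self.rows.get(row_key, {}).items())

    def slice(self, row_key, start, end):
        """Columns whose names fall in [start, end], in sorted order."""
        return [(c, v) for c, v in self.get_row(row_key) if start <= c <= end]


# Load the three users from the table above.
users = ColumnFamily()
for row_key, name, email, country in [
    ("1", "Vagmi", "me@vagmim.in", "IN"),
    ("2", "Karthik", "yeskarthik@blah", "IN"),
    ("3", "MarkZ", "mark@fb", "US"),
]:
    users.insert(row_key, "name", name)
    users.insert(row_key, "email", email)
    users.insert(row_key, "country", country)

print(users.get_row("1"))
print(users.slice("3", "a", "f"))
```

Because columns are kept sorted by name within a row, a range of column names can be read as one contiguous slice, which is the property the later slides rely on when dates are used as column names.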
  • The Problem
  • As a user, I want to view real time metrics and filter by dimensions like time, city, category, etc.
  • The wrong way: select sum(measure) from events where time between A and B and country='US' and device_platform='Android'
  • HashMap<RowKey,SortedMap<ColumnName, Value>>
  • Counters
  • create column family view_counts_hourly with comparator=UTF8Type and default_validation_class=CounterColumnType and key_validation_class=UTF8Type;
  • http://reducedata.com/, Chrome, 2014-03-14 15:30:00Z, IP, Cookie-Info
  • RowKey             20140101  20140102  20140103  20140104  ...  ...  20140628  ...  20150308
    sid1#us            2553      2341      2342      3242      ...  ...  32342     ...  33423
    sid1#us#chrome     1556      1532      1892      ...       ...  ...  ...       ...  ...
    sid1#us#chrome#25  833       899       1200
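A minimal sketch of the counter layout above, assuming each incoming event increments one counter per dimension prefix, with the day as the column name. Plain dictionaries stand in for the counter column family, and the function names are hypothetical. Reading the last key segment ("25") as a browser version is also an assumption; the slides do not say what it is.

```python
from collections import defaultdict
from datetime import datetime, timezone

# row key -> day column (YYYYMMDD) -> count; stands in for a counter column family.
counters = defaultdict(lambda: defaultdict(int))


def row_keys(site_id, country, browser, browser_version):
    """Build one row key per dimension prefix, e.g. sid1#us, sid1#us#chrome, ..."""
    dims = [site_id, country, browser, browser_version]
    return ["#".join(dims[:n]) for n in range(2, len(dims) + 1)]


def record_view(site_id, country, browser, browser_version, ts):
    """Increment the day's counter in every rollup row the event belongs to."""
    day = ts.strftime("%Y%m%d")
    for key in row_keys(site_id, country, browser, browser_version):
        counters[key][day] += 1


def views_between(key, start_day, end_day):
    """Sum the counters in one row whose day column lies in [start_day, end_day]."""
    row = counters.get(key, {})
    return sum(n for day, n in row.items() if start_day <= day <= end_day)


ts = datetime(2014, 3, 14, 15, 30, tzinfo=timezone.utc)
record_view("sid1", "us", "chrome", "25", ts)
record_view("sid1", "us", "chrome", "25", ts)
record_view("sid1", "us", "firefox", "28", ts)

print(views_between("sid1#us", "20140301", "20140331"))
print(views_between("sid1#us#chrome", "20140301", "20140331"))
```

The trade-off in this sketch is write amplification: every event touches several rows, but a filtered query over a time range becomes a single-row read over a contiguous run of columns rather than a scan over raw events.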
  • But what about uniques?
  • Bitmaps to the rescue
  • 1  0  1  0  1  1  0  0  0  1   1   0   1   0   0   1
    u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 ... ... ...
  • UID - 1328abc2838fd283e282
    Fast Hash Function - Murmur32
    1  0  1  0  1  1  0  0  0  1   1   0   1   0   0   1
    u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 ... ... ...
  • RowKey             20140101  20140102  20140103  20140104  ...  ...  20140628  ...  20150308
    sid1#us            10101     10111     11100     11101     ...  ...  ...       ...  11101
    sid1#us#chrome     ...       ...       ...       ...       ...  ...  ...       ...  ...
    sid1#us#chrome#25  10101     11101     11100     ...       ...  ...  ...       ...  ...
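The bitmap idea above can be sketched as follows: hash each user ID to a bit position, set that bit in the bitmap for the row and day, and count set bits to estimate uniques. Two things here are assumptions rather than details from the slides: `zlib.crc32` is used as a stand-in for the Murmur32 hash (Murmur is not in the Python standard library), and the bitmap size `BITMAP_BITS` is a made-up value. Python integers serve as the bitmaps.

```python
import zlib
from collections import defaultdict

BITMAP_BITS = 1 << 16  # hypothetical bitmap size; not taken from the slides

# row key -> day column (YYYYMMDD) -> bitmap stored as a Python int
bitmaps = defaultdict(lambda: defaultdict(int))


def bit_position(uid):
    # Stand-in hash: crc32 instead of the Murmur32 named in the slides.
    return zlib.crc32(uid.encode("utf-8")) % BITMAP_BITS


def record_visitor(key, day, uid):
    """Set the user's bit; repeat visits by the same user set the same bit."""
    bitmaps[key][day] |= 1 << bit_position(uid)


def uniques(key, days):
    """OR the daily bitmaps together, then count the set bits."""
    combined = 0
    for day in days:
        combined |= bitmaps[key].get(day, 0)
    return bin(combined).count("1")


record_visitor("sid1#us", "20140101", "1328abc2838fd283e282")
record_visitor("sid1#us", "20140101", "1328abc2838fd283e282")  # repeat visit
record_visitor("sid1#us", "20140102", "1328abc2838fd283e282")  # same user, next day
record_visitor("sid1#us", "20140102", "another-user-id")

print(uniques("sid1#us", ["20140101"]))
print(uniques("sid1#us", ["20140101", "20140102"]))
```

OR-ing bitmaps is what lets uniques be combined across days without double counting, which plain counters cannot do. Note that with a fixed-size bitmap two different users can hash to the same bit, so the count is an estimate that can run slightly low.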
  • But I do not have Big Data
  • Oh and we’re hiring (vagmi@reducedata.co)
  • Thanks @vagmi on Github / Twitter / Facebook