Use cases
‣ realtime traffic/engagement analytics
‣ systems monitoring

Time Series Data
‣ write heavy
‣ stored temporally
‣ viewed temporally
‣ hierarchical aggregation

Data Model
‣ Distributed Counters (CASSANDRA-1072)
‣ each time series is a row (or rows) of counters
‣ slice over rows to get recent data

Data Model
‣ An example (not exactly the way we do it):

                         2011-01-12T10:00  2011-01-12T10:01  ...
host:web1:load1          5                 4                 ...
host:web2:load1          4                 3                 ...
cluster:web:load1:sum    576               505               ...
cluster:web:load1:count  100               95                ...

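The row-per-series layout above can be sketched in memory. This is only an illustration under stated assumptions, not the authors' schema and not a Cassandra client: the nested dictionary stands in for a counter column family, and `increment` and `slice_row` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a counter column family:
# row key -> {column name (minute timestamp) -> counter value}
counters = defaultdict(lambda: defaultdict(int))

def increment(row_key, minute, amount=1):
    """Add `amount` to the counter stored at (row_key, minute)."""
    counters[row_key][minute] += amount

def slice_row(row_key, start, end):
    """Return (minute, value) pairs with start <= minute <= end, in time order.

    ISO-8601 timestamps of equal length sort lexicographically in time order,
    so plain string comparison is enough here.
    """
    row = counters[row_key]
    return [(m, row[m]) for m in sorted(row) if start <= m <= end]

# Values taken from the example table above.
increment("host:web1:load1", "2011-01-12T10:00", 5)
increment("host:web1:load1", "2011-01-12T10:01", 4)
increment("host:web2:load1", "2011-01-12T10:00", 4)
increment("host:web2:load1", "2011-01-12T10:01", 3)

recent = slice_row("host:web1:load1", "2011-01-12T10:00", "2011-01-12T10:01")
```

Because columns are named by timestamp, reading recent data is a contiguous range over one row, which is the access pattern the slide describes.
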
Aggregation
‣ Measured every minute (or continuously)
‣ Roll up to coarser granularities
‣ More Counters! (aka, let’s do it live)

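One way to read "More Counters!" is that each write increments a counter at every granularity, so the rollups are already current when they are read. The sketch below assumes that reading. The granularity names, the `record` helper, and the `site:pageviews` series are hypothetical, and a plain dictionary stands in for the counter store.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical granularities: name -> strftime pattern that truncates a timestamp
GRANULARITIES = {
    "minute": "%Y-%m-%dT%H:%M",
    "hour": "%Y-%m-%dT%H",
    "day": "%Y-%m-%d",
}

# (series, granularity, bucket) -> counter value
rollups = defaultdict(int)

def record(series, when, amount=1):
    """Increment the counter for `series` in every granularity bucket containing `when`."""
    for name, pattern in GRANULARITIES.items():
        bucket = when.strftime(pattern)
        rollups[(series, name, bucket)] += amount

record("site:pageviews", datetime(2011, 1, 12, 10, 0), 3)
record("site:pageviews", datetime(2011, 1, 12, 10, 1), 2)
record("site:pageviews", datetime(2011, 1, 12, 11, 30), 7)

hour_total = rollups[("site:pageviews", "hour", "2011-01-12T10")]
day_total = rollups[("site:pageviews", "day", "2011-01-12")]
```

The trade-off is write amplification: every measurement costs one increment per granularity, which fits a write-heavy store better than a batch job that re-reads fine-grained data.
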
Aggregation
‣ other dimensions besides time:
‣ clusters
‣ racks / dcs, etc
‣ And combinations of the above

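Aggregating over non-time dimensions can follow the same pattern as the `cluster:web:load1:sum` and `cluster:web:load1:count` rows in the earlier example: each measurement also increments a sum and a count for every group the host belongs to, and an average is derived at read time. This is a minimal sketch under assumptions. The `HOSTS` topology, the rack names, and the helper functions are hypothetical.

```python
from collections import defaultdict

# (row key, minute) -> counter value; in-memory stand-in for the counter store
counters = defaultdict(int)

# Hypothetical topology: which cluster and rack each host belongs to
HOSTS = {
    "web1": {"cluster": "web", "rack": "r1"},
    "web2": {"cluster": "web", "rack": "r2"},
}

def record_measurement(host, metric, minute, value):
    """Store the per-host value and bump sum/count counters for each dimension."""
    counters[(f"host:{host}:{metric}", minute)] += value
    for dim, name in HOSTS[host].items():
        counters[(f"{dim}:{name}:{metric}:sum", minute)] += value
        counters[(f"{dim}:{name}:{metric}:count", minute)] += 1

def average(dim, name, metric, minute):
    """Derive the mean from the sum and count counters; None if nothing was recorded."""
    total = counters[(f"{dim}:{name}:{metric}:sum", minute)]
    count = counters[(f"{dim}:{name}:{metric}:count", minute)]
    return total / count if count else None

record_measurement("web1", "load1", "2011-01-12T10:00", 5)
record_measurement("web2", "load1", "2011-01-12T10:00", 4)

cluster_avg = average("cluster", "web", "load1", "2011-01-12T10:00")
```

Keeping sum and count as separate counters means the average never has to be updated in place, which matters because counters only support increments.
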
Pros / Cons
‣ Pros
  ‣ real-time data (average 30s between measurement and visibility)
  ‣ real-time aggregation
  ‣ flexible data retention (once counters and TTLs work together)
‣ Cons
  ‣ Storage-intensive
  ‣ Slow reads