Realtime Analytics with Cassandra

My talk at NoSQL Now 2012


Transcript

  • 1. Realtime Analytics with Cassandra Acunu Analytics Tom Wilkie, Acunu 21st August 2012
  • 2. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 3. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 4. Why bother? “Companies that can harness big data will trample data incompetents” (The Economist, May 26th 2011)
  • 5. [Figure: many overlapping click-stream log tables] Columns: time | page | session id | duration. Sample rows:
      14:58:03.234 | /index.html | 248.180.3.40 | 175
      14:58:03.409 | /csi/csi/council/freedom.html | 248.180.3.40 | 1234
      14:58:03.877 | /docs/access/chapter8.txt | 99.1.10.178 | 52
      ...
  • 6. Combining “big” and “real-time” is hard: live & historical aggregates... drill downs and roll ups... trends...
  • 7. Solution | Con: Scalability $$$; Not realtime; Spartan query semantics => complex, DIY solutions
  • 8. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 9. [Diagram labels: Click stream events, Sensor data, etc; Acunu Analytics; counter updates] • Aggregate incrementally, on the fly • Store live + historical aggregates
  • 10. { time : TIME(HOUR; MIN; SEC), page : PATH(/), category : STRING, loadTime : LONG } { select : ["COUNT", "AVG(loadTime)"], where : "time, ?path", group : "time, ?category" }
  • 11. Dashboard UI
  • 12. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 13. count grouped by ... day; count distinct(session) count ... geography; avg(duration) ... browser
  • 14. Data Definition: time : TIME(HOUR; MIN; SEC), cust_id : LONG, session_id : LONG, geography : STRING, browser : STRING, load_time : LONG
      Query Patterns: { select: "COUNT", patterns: [ { where : "?time", group : "?time" }, { where : "", group : "geography" }, { where : "", group : "browser" } ] }, { select: ["COUNT_DISTINCT(session_id)", "AVG(load_time)"], where: "time", group: "" }
  • 15. Incoming event: { cust_id: user01, ... session_id: 102, geography: UK, browser: IE, time: 22:02, ... } Counters before the update:
      21:00 | all→1345 :00→45 :01→62 :02→87 ...
      22:00 | all→3221 :00→22 :01→19 :02→104 ...
      ...
      UK | all→228 user01→1 user14→12 user99→7 ...
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1904 ...
      ∅ | all→87314 UK→238 US→354 ...
  • 16. Incoming event: { cust_id: user01, ... session_id: 102, geography: UK, browser: IE, time: 22:02, ... } Counters after the update:
      21:00 | all→1345 :00→45 :01→62 :02→87 ...
      22:00 | all→3222 :00→22 :01→19 :02→105 ...
      ...
      UK | all→229 user01→2 user14→12 user99→7 ...
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ...
  • 17. Counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ...
      22:00 | all→3221 :00→22 :01→19 :02→104 ...
      ...
      UK | all→228 user01→1 user14→12 user99→7 ...
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1904 ...
      ∅ | all→87314 UK→238 US→354 ...
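The update shown on slides 15 and 16 can be sketched as follows. This is a minimal in-memory illustration, not Acunu's implementation: the `counters` dictionary, the `ingest` function and the row-key strings are assumptions made for the example, and only the counter values visible on the slides are seeded.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the counter rows on the slides
# (row key -> column -> count). The slides do not show the real storage layout.
counters = defaultdict(lambda: defaultdict(int))

# Seed only the values visible on slide 15 that this event will touch.
seed = {
    "22:00": {"all": 3221, ":02": 104},
    "UK": {"all": 228, "user01": 1},
    "UK, 22:00": {"all": 1904},
    "∅": {"all": 87314, "UK": 238},
}
for row, columns in seed.items():
    counters[row].update(columns)


def ingest(event):
    """Increment every pre-aggregated counter that one event touches."""
    hour, minute = event["time"].split(":")
    hour_row = f"{hour}:00"
    geo = event["geography"]
    user = event["cust_id"]

    # Hour row: total plus per-minute breakdown.
    counters[hour_row]["all"] += 1
    counters[hour_row][f":{minute}"] += 1
    # Geography row: total plus per-user breakdown.
    counters[geo]["all"] += 1
    counters[geo][user] += 1
    # Combined (geography, hour) row.
    counters[f"{geo}, {hour_row}"]["all"] += 1
    # Global row: total plus per-geography breakdown.
    counters["∅"]["all"] += 1
    counters["∅"][geo] += 1


event = {"cust_id": "user01", "session_id": 102, "geography": "UK",
         "browser": "IE", "time": "22:02"}
ingest(event)
```

After `ingest(event)` the seeded rows hold the values shown on slide 16 (for example `22:00 all` goes from 3221 to 3222). One event fans out into a handful of increments at write time, which is what lets queries later read a single row instead of scanning raw events.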
  • 18. Queries answered from the counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ... ← where time 21:00-22:00 count(*)
      22:00 | all→3222 :00→22 :01→19 :02→105 ...
      ...
      UK | all→229 user01→2 user14→12 user99→7 ...
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ...
  • 19. Queries answered from the counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ... ← where time 21:00-22:00 count(*)
      22:00 | all→3222 :00→22 :01→19 :02→105 ... ← where time 22:00-23:00, group by minute
      ...
      UK | all→229 user01→2 user14→12 user99→7 ...
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ...
  • 20. Queries answered from the counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ... ← where time 21:00-22:00 count(*)
      22:00 | all→3222 :00→22 :01→19 :02→105 ... ← where time 22:00-23:00, group by minute
      ...
      UK | all→229 user01→2 user14→12 user99→7 ... ← where geography=UK group all by user
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ...
  • 21. Queries answered from the counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ... ← where time 21:00-22:00 count(*)
      22:00 | all→3222 :00→22 :01→19 :02→105 ... ← where time 22:00-23:00, group by minute
      ...
      UK | all→229 user01→2 user14→12 user99→7 ... ← where geography=UK group all by user
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ... ← count all
  • 22. Queries answered from the counter rows:
      21:00 | all→1345 :00→45 :01→62 :02→87 ... ← where time 21:00-22:00 count(*)
      22:00 | all→3222 :00→22 :01→19 :02→105 ... ← where time 22:00-23:00, group by minute
      ...
      UK | all→229 user01→2 user14→12 user99→7 ... ← where geography=UK group all by user
      US | all→354 user01→4 user04→8 user56→17 ...
      ...
      UK, 22:00 | all→1905 ...
      ∅ | all→87315 UK→239 US→354 ... ← count all; group all by geo
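The queries on slides 18 to 22 each read one pre-aggregated row. A rough sketch of that lookup, assuming a plain dictionary in place of the real store: the names `counters`, `count` and `group` are hypothetical, and the rows contain only the values visible on the slides (the elided "..." entries are left out).

```python
# Counter rows as shown on slide 22; elided entries ("...") are omitted.
counters = {
    "21:00": {"all": 1345, ":00": 45, ":01": 62, ":02": 87},
    "22:00": {"all": 3222, ":00": 22, ":01": 19, ":02": 105},
    "UK": {"all": 229, "user01": 2, "user14": 12, "user99": 7},
    "US": {"all": 354, "user01": 4, "user04": 8, "user56": 17},
    "UK, 22:00": {"all": 1905},
    "∅": {"all": 87315, "UK": 239, "US": 354},
}


def count(row):
    """COUNT for a filter that maps to one row: read its 'all' column."""
    return counters[row]["all"]


def group(row):
    """GROUP BY within a row: return every column except the total."""
    return {key: value for key, value in counters[row].items() if key != "all"}


hour_total = count("21:00")    # where time 21:00-22:00 count(*)
per_minute = group("22:00")    # where time 22:00-23:00, group by minute
per_user_uk = group("UK")      # where geography=UK group all by user
grand_total = count("∅")       # count all
per_geo = group("∅")           # group all by geo
```

Each query is a single row read whose cost does not depend on how many raw events were ingested, which is the point of aggregating at write time.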
  • 23. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 24. Approximate Analytics: Exact • Real-time • Large Scale
  • 25. Count Distinct. Plan A: keep a list of all the things you’ve seen; count them at query time. Quick to update ... but at scale ... takes lots of space and takes a long time to query.
  • 26. Approximate Distinct: track the max # of leading zeroes seen so far.
      item | hash | leading zeroes | max so far
      x | 00101001110... | 2 | 2
      y | 11010100111... | 0 | 2
      z | 00011101011... | 3 | 3
      ...
      To see a max of M takes about 2^M items.
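The single-counter estimator on slide 26 can be sketched like this. The slide does not say which hash function is used; SHA-1 truncated to 32 bits is an assumption made for the example, and `leading_zeroes`, `hash_bits` and `estimate_distinct` are hypothetical names.

```python
import hashlib

HASH_BITS = 32  # assumed hash width; the slide only shows an 11-bit prefix


def hash_bits(item):
    """Hash an item to a HASH_BITS-wide integer (SHA-1 is an arbitrary choice)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:HASH_BITS // 8], "big")


def leading_zeroes(value, width=HASH_BITS):
    """Number of zero bits before the first one bit in a width-bit value."""
    return width - value.bit_length()


def estimate_distinct(items):
    """Keep only the max leading-zero count; estimate 2**max distinct items."""
    max_zeroes = 0
    for item in items:
        max_zeroes = max(max_zeroes, leading_zeroes(hash_bits(item)))
    return 2 ** max_zeroes


# The three hashes from the slide, read as 11-bit values.
slide_zeroes = [leading_zeroes(int(bits, 2), width=11)
                for bits in ("00101001110", "11010100111", "00011101011")]
```

`slide_zeroes` reproduces the "leading zeroes" column on the slide. Because duplicates hash to the same value they never raise the maximum, so the state is a single small integer however many events arrive. The price is a very coarse estimate that is always a power of two.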
  • 27. Approximate Distinct: to reduce variance, average over m = 2^k sub-streams.
      item | hash | index, zeroes | max so far
      x | 00101001110... | 0, 0 | 0,0,0,0
      y | 11010100111... | 3, 1 | 0,0,1,0
      z | 00011101011... | 0, 1 | 1,0,1,0
      ...
      Take the harmonic mean.
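Splitting the stream into m = 2^k sub-streams and combining them with a harmonic mean is the idea behind the published HyperLogLog estimator, so the sketch below uses that estimator. The slide does not name the algorithm or say whether Acunu's version matches it. The hash choice (SHA-1 truncated to 64 bits), the constants and the names `split` and `estimate_distinct` are assumptions for the example.

```python
import hashlib
import math


def split(hash_value, width, k):
    """Split a width-bit hash into (sub-stream index, leading zeroes of the rest)."""
    rest_width = width - k
    index = hash_value >> rest_width                 # first k bits pick the sub-stream
    rest = hash_value & ((1 << rest_width) - 1)      # remaining bits
    return index, rest_width - rest.bit_length()


def estimate_distinct(items, k=10, width=64):
    """HyperLogLog-style estimate using m = 2**k registers (k >= 7 assumed)."""
    m = 1 << k
    registers = [0] * m
    for item in items:
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        hashed = int.from_bytes(digest[:8], "big")
        index, zeroes = split(hashed, width, k)
        # Store zeroes + 1 so an untouched register (0) stays distinguishable.
        registers[index] = max(registers[index], zeroes + 1)

    alpha = 0.7213 / (1 + 1.079 / m)   # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)  # harmonic mean form
    empty = registers.count(0)
    if raw <= 2.5 * m and empty:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / empty)
    return raw


# The slide's 11-bit example hashes with k = 2 (four sub-streams).
slide_pairs = [split(int(bits, 2), width=11, k=2)
               for bits in ("00101001110", "11010100111", "00011101011")]
estimate = estimate_distinct(range(10000))
```

`slide_pairs` reproduces the "index, zeroes" column on the slide. With k = 10 the state is 1024 small registers, and the expected relative error is a few percent, traded against the exact but unbounded list of Plan A.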
  • 28. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 29. Was it worth it?
  • 30. What’s Coming? • Ad Hoc: same queries, but without the need to pre-define them • Geolocation: support for location-based events and queries • Drill down: see the events that make up any given aggregate
  • 31. • Motivation / alternatives • What is it? • How does it work? • Approximate Analytics • What's it good for?
  • 32. Manufacturing • Social Media • Ad Analytics • Systems Monitoring • Financial Services • Oil + Gas
  • 33. “Up and running in about 4 hours” • “We found out a competitor was scraping our data” • “We keep discovering use cases we hadn’t thought of”
  • 34. Analytics
  • 35. www.acunu.com @acunu. Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.
