Acunu Analytics
Published in: Technology, Business
Transcript

  • 1. Acunu Analytics: simple, powerful, real-time. Andrew Byde, Principal Scientist. Tuesday, 27 March 2012
  • 2. Making big data useful. How do we turn this ...
      time          page                           session id    duration
      ...           ...                            ...           ...
      14:58:03.234  /index.html                    248.180.3.40  898
      14:58:03.234  /csi/csi/council/freedom.html  248.180.3.40  1234
      14:58:03.234  /docs/access/chapter8.txt      99.1.10.178   52
      ...           ...                            ...           ...
      x billions
  • 3. MY Introduction into this...
  • 4. or this...
  • 5. or this...
  • 6. SQL + materialised views
  • 7. SQL + materialised views ... would be nice if it scaled
  • 8. Hadoop/Map-Reduce can do anything
  • 9. Hadoop/Map-Reduce can do anything. Not real-time. Inefficient re-computation
  • 10. Hadoop/Map-Reduce can do anything. Not real-time. Inefficient re-computation (100 TB on a 100-node cluster takes > 3 hours)
  • 11. Cassandra counters are pretty cool
  • 12. Cassandra counters are pretty cool, but the query semantics are spartan => DIY solutions
  • 13. Acunu Analytics
      • Simple, real-time, incremental analytics
      • push processing into ingest phase
      (diagram: event → AA → Cassandra counter updates)
  • 14. Acunu Analytics
      • Event template, e.g.,
          select : ["COUNT", "AVG(loadTime)"],
          type : {
            time : [TIME(HOUR; MIN; SEC), ?, 0],
            page : PATH(/),
            loadTime : [LONG, 0, 0]
          }
      • specifies “blow-up” strategy according to supported queries
  • 15. Acunu Analytics
      type : { time : TIME(HOUR; MIN), category : STRING, user : STRING }
      21:00         all→1345   :00→45     :01→62      :02→87      ...
      22:00         all→3221   :00→22     :01→19      :02→104     ...
      ...           ...
      click         all→228    user01→1   user14→12   user99→7    ...
      open          all→354    user01→4   user04→8    user56→17   ...
      ...
      click, 22:00  all→1904   ...
      ∅             all→87314  click→238  open→354    ...
  • 16. Acunu Analytics
      type : { time : TIME(HOUR; MIN), category : STRING, user : STRING }
      (22:02, “click”, user01)
      21:00         all→1345   :00→45     :01→62      :02→87      ...
      22:00         all→3221   :00→22     :01→19      :02→104     ...
      ...           ...
      click         all→228    user01→1   user14→12   user99→7    ...
      open          all→354    user01→4   user04→8    user56→17   ...
      ...
      click, 22:00  all→1904   ...
      ∅             all→87314  click→238  open→354    ...
  • 17. Acunu Analytics
      type : { time : TIME(HOUR; MIN), category : STRING, user : STRING }
      (22:02, “click”, user01)
      21:00         all→1345   :00→45     :01→62      :02→87      ...
      22:00         all→3222   :00→22     :01→19      :02→105     ...
      ...           ...
      click         all→229    user01→2   user14→12   user99→7    ...
      open          all→354    user01→4   user04→8    user56→17   ...
      ...
      click, 22:00  all→1905   ...
      ∅             all→87315  click→239  open→355    ...
  • 18. Acunu Analytics
      Pre-assembled queries, e.g.
      • ... for 22:00-23:00, group by minute
      • ... group all by user, where category=click
      • ... count all
      • ... group all by category
      21:00         all→1345   :00→45     :01→62      :02→87      ...
      22:00         all→3222   :00→22     :01→19      :02→105     ...
      ...           ...
      click         all→229    user01→2   user14→12   user99→7    ...
      open          all→354    user01→4   user04→8    user56→17   ...
      ...
      click, 22:00  all→1905   ...
      ∅             all→87315  click→239  open→355    ...
  • 19. Summary
      • Simple, real-time, incremental analytics
      • work done on ingest
      • sum, count, distinct, avg, stddev, min-max etc
      • time + hierarchy bucketing
      • efficient ‘group’ semantics
      • works with Apache Cassandra
  • 20. Early Access Program: analytics@acunu.com
  • 21.
  • 22. count
  • 23. count distinct (session) count
  • 24. count distinct (session) count avg(duration)
  • 25. count grouped by ... day count distinct (session) count avg(duration)
  • 26. count grouped by ... day count distinct (session) count ... geography avg(duration)
  • 27. count grouped by ... day count distinct (session) count ... geography avg(duration) ... browser
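
The counter tables on slides 15 to 18 can be read as a "blow-up" of each event into several counter increments at ingest time, so that grouped queries become single-row reads. The following is a minimal Python sketch of that idea, not Acunu's implementation: an in-memory dictionary stands in for Cassandra counter columns, and the function names (`new_counters`, `ingest`, `group_by_minute`, `group_by_user`, `group_by_category`) and the exact row-key strings are hypothetical, chosen to mirror the rows shown on the slides.

```python
from collections import defaultdict


def new_counters():
    """Return counters[row][column] -> count (a stand-in for counter columns)."""
    return defaultdict(lambda: defaultdict(int))


def ingest(counters, time, category, user):
    """Blow one (time, category, user) event up into every counter it feeds.

    `time` is an "HH:MM" string; the row keys below are illustrative only.
    """
    hour, minute = time.split(":")
    hour_row = f"{hour}:00"
    updates = [
        (hour_row, "all"),                    # events in this hour
        (hour_row, f":{minute}"),             # events in this minute of the hour
        (category, "all"),                    # events in this category
        (category, user),                     # events in this category by this user
        (f"{category}, {hour_row}", "all"),   # category within this hour
        ("∅", "all"),                         # every event
        ("∅", category),                      # every event, split by category
    ]
    for row, column in updates:
        counters[row][column] += 1


def _without_total(row):
    """Copy a counter row, dropping its 'all' total column."""
    return {column: n for column, n in row.items() if column != "all"}


def group_by_minute(counters, hour_row):
    """Count events in one hour, grouped by minute: a single row read."""
    return _without_total(counters[hour_row])


def group_by_user(counters, category):
    """Count events where category matches, grouped by user."""
    return _without_total(counters[category])


def group_by_category(counters):
    """Count all events, grouped by category."""
    return _without_total(counters["∅"])


# Example usage with made-up events.
demo = new_counters()
ingest(demo, "22:02", "click", "user01")
ingest(demo, "22:02", "open", "user04")
ingest(demo, "22:05", "click", "user01")
print(group_by_minute(demo, "22:00"))
print(group_by_user(demo, "click"))
print(group_by_category(demo))
```

In this sketch every query is answered from counters written at ingest, which is the trade the slides describe: more writes per event in exchange for reads that need no re-computation.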
