Realtime Analytics with MongoDB Counters (mongonyc 2012)

4,462 views

Real-time analytics using pre-aggregation with counters.

Published in: Technology, Business

  1. Pre-aggregation with counters © Copyright 2010 10gen Inc.
  2. Goals
     • Dashboard-style reports
     • (Known) reports
     • Real-time numbers
  3. Framework
     • Know your metrics/counters
     • Prepared reports
     • Calculate during write
     • Fast queries
     • Always up-to-date data
     • Record time-series collections
  4. Rationale
     • Documents are updated in-place*
     • $inc update operator
     • Working set is small
     • Aggregations are much smaller*
  5. Dashboard (figure: panels for Projects, Lines, and Events; chart labels JavaScript, Java, Ruby, Python and Monday, Tuesday, Thursday, Friday)
  6. Demo Dashboard
  7. Roads not traveled
     • Map/Reduce
       • Reprocess raw data
       • Now possible to do partial reduce
     • Aggregation Framework (aggregate in 2.2)
       • Also reprocess data on operation (initial release)
       • Optimizations to come
     • More costly during reads
  8. Not Appropriate For
     • Ad-hoc aggregations (unknown metrics)
     • One-off reports
     • Possibly complex calculations
  9. Processing
     • Event received
     • Split into many updates w/ $inc
     • Aggregate
       • Input field(s)
       • Time periods (hourly, daily, monthly)
       • Defined metrics
  10. Example Data: github
      > db.events.findOne()
      {
        "repository" : {
          "url" : "https://github.com/vidageek/games",
          ...
          "open_issues" : 25,
          "watchers" : 6,
          "pushed_at" : "2012/03/10 08:34:00 -0800",
          "language" : "Java"
        },
        "actor_attributes" : {...},
        "created_at" : "2012/03/11 15:20:24 -0700",
        "public" : true,
        "actor" : "juliano",
        "payload" : {...},
        "url" : "https://github.com/...",
        "type" : "CommitCommentEvent"
      }
  11. Define Metrics
      • "actor"
      • "repository.name"
      • "repository.language"
      • "type": PushEvent, IssuesEvent, WatchEvent, GistEvent
      • "payload.ref": refs/heads/improved_history, refs/heads/master, refs/heads/signs
  12. Aggregations
      • TimePeriod, type #
      • TimePeriod, author #
      • TimePeriod, project #
  13. Stats Collections
      • stats_[hourly/daily/monthly].actors
      • stats_[hourly/daily/monthly].projects
      • stats_[hourly/daily/monthly].langs
      • stats_[hourly/daily/monthly].types
  14. Stats
      > db.stats_hourly.types.find({"_id.type":"GistEvent"})
      {
        "_id" : {
          "p" : ISODate("2012-05-21T00:00:00Z"),
          "type" : "GistEvent"
        },
        "hour" : {
          "2" : { "count" : 65 },
          "3" : { "count" : 2 },
          "7" : { "count" : 130 },
          "8" : { "count" : 5 }
        },
        "total" : { "count" : 202 }
      }
  15. Updates: Increment
      Query: { "p" : Date(…), "actor" : "neoplastic" }
      Update: { "$inc" : { "h.21.c" : 1, "t.c" : 1 } }
      Upsert: true
  16. Query/Graphing
      • Select by grouping (by date, by type/value)
      • Documents hold many data points
  17. The Whys
      • Writing more data up front helps with reads
      • Multiple data points per document
      • Documents hold many timed points
      • Good for graphs by time or by type
      • Nested for improved performance
  18. Thanks for coming… any questions?
  19. drivers at mongodb.org
      mongodb.org Supported / Community Supported: C REST node.js C# ActionScript3 Objective C C++ C# and .NET PHP Erlang Clojure PowerShell Haskell ColdFusion Blog post Java Delphi Python Javascript Erlang Ruby Perl F# Scala PHP Go: gomongo Scheme (PLT) Python Groovy Smalltalk: Dolphin Ruby Haskell Smalltalk Javascript Lua
  20. download at mongodb.org
      • conferences, appearances, and meetups: http://www.10gen.com/events
      • Facebook: http://bit.ly/mongofb | Twitter: @mongodb | LinkedIn: http://linkd.in/joinmongo
      • support, training, and this talk brought to you by
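
The write and read paths from the Processing, Updates, and Query/Graphing slides can be sketched roughly as follows. This is a minimal illustration, not the speaker's code: the helper names (`METRICS`, `getPath`, `hourlyUpserts`, `hourlySeries`) and the `_id` key names for projects and languages are assumptions, and the counter fields follow the long names from the Stats slide (`hour.<n>.count`, `total.count`) rather than the abbreviated `h.21.c` / `t.c` form. The functions only build plain objects, so they run without a database.

```javascript
// Hypothetical metric table: stats collection suffix, path into the raw
// event, and key used inside _id. Only the "type" and "actor" keys appear
// on the slides; "project" and "lang" are assumed names.
const METRICS = [
  { suffix: "actors", path: "actor", key: "actor" },
  { suffix: "projects", path: "repository.name", key: "project" },
  { suffix: "langs", path: "repository.language", key: "lang" },
  { suffix: "types", path: "type", key: "type" },
];

// Resolve a dotted path such as "repository.language" on a plain object.
function getPath(doc, path) {
  return path.split(".").reduce((v, k) => (v == null ? undefined : v[k]), doc);
}

// Split one event into one upsert per defined metric. Each upsert bumps the
// hour bucket and the running total in that day's document (UTC day, UTC hour).
function hourlyUpserts(event, when) {
  const day = new Date(
    Date.UTC(when.getUTCFullYear(), when.getUTCMonth(), when.getUTCDate())
  );
  const hour = when.getUTCHours();
  const ops = [];
  for (const { suffix, path, key } of METRICS) {
    const value = getPath(event, path);
    if (value === undefined) continue; // event does not carry this metric
    ops.push({
      collection: `stats_hourly.${suffix}`,
      // Match the whole embedded _id so an upsert can create the document.
      query: { _id: { p: day, [key]: value } },
      update: { $inc: { [`hour.${hour}.count`]: 1, "total.count": 1 } },
      upsert: true,
    });
  }
  return ops;
}

// Read side: turn one stats document into 24 points for graphing,
// filling hours that were never incremented with 0.
function hourlySeries(statsDoc) {
  const hours = statsDoc.hour || {};
  return Array.from({ length: 24 }, (_, h) => (hours[h] ? hours[h].count : 0));
}
```

In the mongo shell, each returned operation could then be applied with something like `db.getCollection(op.collection).update(op.query, op.update, { upsert: true })`. The daily and monthly collections would follow the same pattern with a coarser period key.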
