ANALYTICS WITH MONGODB


      ROGER BODAMER
YOU WANT TO ANALYZE THIS
LIKE THIS
BUT HOW ?



• These graphs are the end result of a process

• In order to get here, there are a few things you need to do and explore
A WORD ON NON-NATIVE APPROACHES
•   Yes, you can

    •   map your document schema to a relational schema

    •   then export your data from MongoDB to a relational db

        •   and set up a cron job to do this every day

    •   then use your BI tool to map relational to “objects”

    •   and then Report and do Analytics
BUT THAT WOULD BE NO FUN


• Analytics using Native Queries

• A simple process
PROCESS: NAIVE

• Take a sample document

• Develop query

• Put on chart

• Done!

  • and a gold star from your boss!
PROCESS: REALITY
• Understand your schema
  • multiple schemas in a single collection
  • multiple collections / multiple data sources
• Iterate:
  • define metric
  • develop query and report on metrics
    • understand and drill down or discard
    • repeat
• Operationalize metrics: dashboard
  • Dimensions
  • Plotting
WHY ITERATE ?
UNDERSTAND YOUR SCHEMA

{
    "name" : "Mario",
    "games" : [{"game" : "WoW",
                "duration" : 130},
               {"game" : "Tetris",
                "duration" : 130}]
}
BUT ALSO:
• Schemas can be polymorphic

{
    "name" : "Bob",
    "location" : "us",
    "games" : [{"game" : "WoW",
                "duration" : 2910},
               {"game" : "Tetris",
                "duration" : 593}]
}
SO NOW WHAT ?
•   Only report on common attributes

    •   probably missing the most recent / interesting data
SO NOW WHAT ?
•   Write 2 programs, one for each schema

    •   2 graphs / reports

    •   2 programs writing to 1 graph (basically merging instance data in 2
        places)
SO NOW WHAT ?

•   Unify Schema

    •   deal with absent, null values

    •   translate(NULL, “EU”);
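
One way to read "unify schema" is to give every document a location before aggregating, substituting a default when the field is absent or null. Inside a pipeline, the `$ifNull` operator can play that role, for example `location : { $ifNull : ["$location", "EU"] }` in a `$project` stage. The sketch below does the same thing in plain JavaScript so it runs without a database; the helper name `unifyGamer` and the choice of "EU" as the default (taken from the translate example above) are assumptions for illustration.

```javascript
// Hypothetical helper: returns a copy of a gamer document with a guaranteed location.
// Absent or null locations are replaced by the default, like translate(NULL, "EU").
function unifyGamer(doc, defaultLocation = "EU") {
  const location = doc.location == null ? defaultLocation : doc.location;
  return { ...doc, location };
}

// The two document shapes shown on the earlier slides.
const gamers = [
  { name: "Mario", games: [{ game: "WoW", duration: 130 }, { game: "Tetris", duration: 130 }] },
  { name: "Bob", location: "us", games: [{ game: "WoW", duration: 2910 }, { game: "Tetris", duration: 593 }] },
];

const unified = gamers.map((g) => unifyGamer(g));
console.log(unified.map((g) => `${g.name}: ${g.location}`).join(", "));
```

Whether a missing location really means "EU" is a data question, not a coding one; the default is a parameter so that assumption stays visible.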
ITERATE



• Total time and number of games people play in the US vs. the EU?
QUERY
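db.runCommand(
{ aggregate : "gamers", pipeline : [
    // prepare: keep only the fields the aggregation needs
    { $project : {
        location : 1,
        games : 1
    }},
    // one document per element of the games array
    { $unwind : "$games" },
    // key on location; count games and sum their durations
    { $group : {
        _id : { location : "$location" },
        number_games : { $sum : 1 },
        total_duration : { $sum : "$games.duration" }
    }},
    // final shape: hide _id, expose location as a plain field
    { $project : {
        _id : 0,
        location : "$_id.location",
        number_games : 1,
        total_duration : 1
    }}
]})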
SIDEBAR: WRITING AGGREGATION QUERIES
•   Prepare Data
    •   Extract relevant properties from collection documents
    •   Unwind a sub-collection (array) if its documents contribute to the aggregation
•   Aggregate data
    •   determine the key (_id) on which the aggregates should be done
    •   name aggregates
•   Project Data
    •   For final results
EXAMPLE
{
    "name" : "Alice",
    "location" : "us",
    "games" : [{
        "game" : "WoW",
        "duration" : 200
      }, {
        "game" : "Tetris",
        "duration" : 100
      }]
}
PREPARE
• Only use location and games:

{ $project : {
    location : 1,
    games : 1
}}


• Unwind games, as properties of its documents are aggregated over:

{ $unwind : "$games" }
AGGREGATE DATA
• Aggregate on number of games (add 1 per game)
  and total duration (add duration per game)
  using location as key


{ $group : {
    _id : { location : "$location" },
    number_games : { $sum : 1 },
    total_duration : { $sum : "$games.duration" }
}}
PROJECT
• Only show location and aggregates, do not show _id


{ $project : {
    _id : 0,
    location : "$_id.location",
    number_games : 1,
    total_duration : 1
}}
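
To check your understanding of the PREPARE / AGGREGATE / PROJECT steps, it can help to mimic them in plain JavaScript on a couple of documents. This is only a sketch of what the stages compute, not how the server executes them. Alice comes from the EXAMPLE slide; Carol and her numbers are made up for illustration.

```javascript
// Sample documents: Alice is from the EXAMPLE slide, Carol is invented.
const gamers = [
  { name: "Alice", location: "us", games: [{ game: "WoW", duration: 200 }, { game: "Tetris", duration: 100 }] },
  { name: "Carol", location: "eu", games: [{ game: "WoW", duration: 50 }] },
];

// PREPARE: keep only location and games ($project), then emit one row per game ($unwind).
const unwound = gamers.flatMap(({ location, games }) =>
  games.map((g) => ({ location, games: g }))
);

// AGGREGATE: group on location, adding 1 per game and summing durations ($group).
const groups = new Map();
for (const row of unwound) {
  const acc = groups.get(row.location) ?? { number_games: 0, total_duration: 0 };
  acc.number_games += 1;
  acc.total_duration += row.games.duration;
  groups.set(row.location, acc);
}

// PROJECT: flatten the group key into a plain location field for the final result.
const result = [...groups].map(([location, acc]) => ({ location, ...acc }));
console.log(result);
```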
RESULT 1




• People spend a little more time playing in the US
• More games played in the EU
RING....
CHALLENGE 2


• Since we found that the EU and US play a similar amount and the same
 number of games, the new challenge is:


• Let’s see what the distribution of different
 games is in the 2 locations
QUERY 2
db.runCommand(
{ aggregate : "gamers", pipeline : [
    // location, games
    { $project : {
        location : 1,
        games : 1
    }},
    { $unwind : "$games" },
    // location, game, duration
    { $project : {
        location : 1,
        game : "$games.game",
        duration : "$games.duration"
    }},
    // key: aggregate on location and game
    { $group : {
        _id : { location : "$location", game : "$game" },
        number_games : { $sum : 1 },
        total_duration : { $sum : "$duration" }
    }},
    // project: location, game, total(#games), sum(duration)
    { $project : {
        _id : 0,
        location : "$_id.location",
        game : "$_id.game",
        number_games : 1,
        total_duration : 1
    }}
]})
RESULT 2




• Count: EU - WoW, US - Tetris
• The EU spends more time on WoW; in the US it’s more evenly spread
RING....
CHALLENGE 3:



• How do I compare Bob to everyone else in the EU?
QUERY

• 2 aggregations happening at the same time:

  • 1 by user

  • 1 by location

• This query needs to be broken up into several queries

• Fairly complex

• Currently easiest to process in Ruby/Java/Python/...
// Aggregation 1: total duration per location, user and game
db.runCommand(
{ aggregate : "gamers", pipeline : [
    { $project : {
        name : 1,
        location : 1,
        games : 1
    }},
    { $unwind : "$games" },
    { $project : {
        name : 1,
        location : 1,
        game : "$games.game",
        duration : "$games.duration"
    }},
    { $group : {
        _id : { location : "$location", name : "$name", game : "$game" },
        total_duration : { $sum : "$duration" }
    }},
    { $project : {
        name : "$_id.name",
        _id : 0,
        location : "$_id.location",
        game : "$_id.game",
        total_duration : 1
    }}
]})

// Aggregation 2: total duration per location
db.runCommand(
{ aggregate : "gamers", pipeline : [
    { $project : {
        location : 1,
        games : 1
    }},
    { $unwind : "$games" },
    { $project : {
        location : 1,
        duration : "$games.duration"
    }},
    { $group : {
        _id : { location : "$location" },
        total_duration : { $sum : "$duration" }
    }},
    // the location is exposed under the field "name"
    { $project : {
        name : "$_id.location",
        _id : 0,
        total_duration : 1
    }}
]})
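
The client-side step that combines the two result sets could look roughly like the sketch below. It assumes rows shaped like the output of the two aggregations (the per-location rows carry the location under `name`); the helper `shareOfLocation` and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```javascript
// Rows shaped like the output of the per-user aggregation (numbers are illustrative).
const perUser = [
  { name: "Bob", location: "eu", game: "WoW", total_duration: 300 },
  { name: "Bob", location: "eu", game: "Tetris", total_duration: 100 },
  { name: "Eve", location: "eu", game: "WoW", total_duration: 600 },
];

// Rows shaped like the output of the per-location aggregation,
// where the location is reported under the "name" field.
const perLocation = [{ name: "eu", total_duration: 1000 }];

// Hypothetical helper: a user's playing time as a fraction of their location's total.
// Returns null when the user or the location total is unknown.
function shareOfLocation(userName, userRows, locationRows) {
  const rows = userRows.filter((r) => r.name === userName);
  if (rows.length === 0) return null;
  const location = rows[0].location;
  const total = locationRows.find((r) => r.name === location);
  if (!total || total.total_duration === 0) return null;
  const userTotal = rows.reduce((sum, r) => sum + r.total_duration, 0);
  return userTotal / total.total_duration;
}

console.log(shareOfLocation("Bob", perUser, perLocation));
```

A per-game comparison would need the per-location aggregation to key on game as well; this sketch stays with the totals the two queries actually return.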
RESULT 3




• Bob plays >20% WoW in comparison to the Europeans, but
 plays 200% more Tetris
A NOTE ON QUERIES


• There’s no notion of a declared schema

• The augmented schema is coded in queries

• Reuse is very hard; it happens at the query-language level
DIMENSIONS
• Most questions / graphs have a dimension

 • Time, Geo

 • Categories

 • Relative: what’s X’s contribution to total revenue

• You will need to be able to pass in dimensions as a
 predicate for your queries

 • or cache result and post process client-side
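
Passing a dimension in as a predicate can be as simple as putting a `$match` stage in front of the pipeline. The sketch below only builds the pipeline array, so it runs without a database; the helper name `buildPipeline` and the idea of passing dimensions as a plain filter object are assumptions for illustration.

```javascript
// Hypothetical helper: builds the per-location pipeline, optionally restricted
// by dimension values passed in as a predicate (a $match stage at the front).
function buildPipeline(dimensions = {}) {
  const pipeline = [];
  if (Object.keys(dimensions).length > 0) {
    pipeline.push({ $match: dimensions });
  }
  pipeline.push(
    { $unwind: "$games" },
    {
      $group: {
        _id: { location: "$location" },
        number_games: { $sum: 1 },
        total_duration: { $sum: "$games.duration" },
      },
    }
  );
  return pipeline;
}

const usOnly = buildPipeline({ location: "us" });
console.log(JSON.stringify(usOnly, null, 2));
```

The returned array could then be used as the `pipeline` value of the `aggregate` command. Filtering early keeps the documents that reach `$unwind` and `$group` to a minimum.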
A WORD ON RENDERING GRAPHS / REPORTS
• Several libraries available for Ruby / Python / Java

  • Gruff, Scruffy, StockCharts, D3, JRafael, JQuery Vizualize,
   MooCharts, etc.

• Also some services: John Nunemaker’s work (http://get.gaug.es/)

• But basically:

  • you know how to program, right!
REVIEW
• Understand your schema
  • multiple schemas in a single collection
  • multiple collections / multiple data sources
• Iterate:
  • define metric
  • develop query and report on metrics
    • understand and drill down or discard
    • repeat
• Operationalize metrics: dashboard
  • Dimensions
  • Plotting
PUNCHLINES

• We have described a software engineering process

  • but requirements will be very fluid

• When you know how to write Ruby / Java / Python etc., life is
  good

• If you’re a business analyst you have a problem

  • better be BFF with some engineer :)
PLUG

• We’ve been working on a declarative analytics product

• (initially) uses Excel as its presentation layer

• Reach out to me if you’re interested

  @rogerb
  roger@norellan.com
THANK YOU / QUESTIONS

How to convert PDF to text with Nanonetsnaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Recently uploaded (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Thoughts on MongoDB Analytics

  • 1. ANALYTICS WITH MONGODB ROGER BODAMER
  • 2. YOU WANT TO ANALYZE THIS
  • 4. BUT HOW ? • These graphs are the end result of a process • In order to get here there are a few things you need to do and explore
  • 5. A WORD ON NON-NATIVE APPROACHES • Yes, you can • map your document schema to a relational schema • then export your data from MongoDB to a relational db • and set up a cron job to do this every day • then use your BI tool to map relational to “objects” • and then Report and do Analytics
  • 6. BUT THAT WOULD BE NO FUN • Analytics using Native Queries • A simple process
  • 7. PROCESS: NAIVE • Take a sample document • Develop query • Put on chart • Done ! • and a gold star from your boss !
  • 8. PROCESS: REALITY • Understand your schema • multiple schemas in a single collection • multiple collections / multiple data sources • Iterate: • define metric • develop query and report on metrics • understand and drill down or discard • repeat • Operationalize metrics: dashboard • Dimensions • Plotting
  • 10. UNDERSTAND YOUR SCHEMA
        {
            "name" : "Mario",
            "games" : [{"game" : "WoW",
                        "duration" : 130},
                       {"game" : "Tetris",
                        "duration" : 130}]
        }
  • 11. BUT ALSO: • Schemas can be polymorphic
        {
            "name" : "Bob",
            "location" : "us",
            "games" : [{"game" : "WoW",
                        "duration" : 2910},
                       {"game" : "Tetris",
                        "duration" : 593}]
        }
  • 12. SO NOW WHAT ? • Only report on common attributes • probably missing the most recent / interesting data
  • 13. SO NOW WHAT ? • Write 2 programs, one for each schema • 2 graphs / reports • 2 programs writing to 1 graph (basically merging instance data in 2 places)
  • 14. SO NOW WHAT ? • Unify Schema • deal with absent, null values • translate(NULL, “EU”);
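
A minimal sketch of the "unify schema" option, written as plain JavaScript so it runs without a server. The sample documents are modeled on the Mario and Bob slides; the helper name `unifyLocation` and the choice of "EU" as the default are assumptions taken from the `translate(NULL, "EU")` hint, not something the deck specifies further. The last constant shows the same default expressed with the `$ifNull` aggregation operator.

```javascript
// Sample documents modeled on the slides: Mario has no "location" field.
const gamers = [
  { name: "Mario", games: [{ game: "WoW", duration: 130 }, { game: "Tetris", duration: 130 }] },
  { name: "Bob", location: "us", games: [{ game: "WoW", duration: 2910 }, { game: "Tetris", duration: 593 }] },
];

// Assumption: documents without a location are counted as "EU".
// unifyLocation is a hypothetical helper, not part of any MongoDB API.
function unifyLocation(doc, fallback = "EU") {
  // `== null` is true for both null and undefined (absent) values.
  const location = doc.location == null ? fallback : doc.location;
  // Return a copy so the input document is left untouched.
  return { ...doc, location };
}

const unified = gamers.map((doc) => unifyLocation(doc));
console.log(unified.map((d) => `${d.name}: ${d.location}`).join(", "));

// The same default as a pipeline stage: $ifNull substitutes "EU"
// when location is null or missing.
const unifyStage = {
  $project: { name: 1, games: 1, location: { $ifNull: ["$location", "EU"] } },
};
```

Doing the substitution once, up front, keeps every later query free of per-schema special cases.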
  • 15. ITERATE • total time and how many games people play in the us vs eu ?
  • 16. QUERY
        db.runCommand(
        { aggregate : "gamers", pipeline : [
           { $project : { location : 1, games : 1 }},
           { $unwind : "$games" },
           { $group : {
               _id : { location : "$location" },
               number_games : { $sum : 1 },
               total_duration : { $sum : "$games.duration" }
           }},
           { $project : {
               _id : 0,
               location : "$_id.location",
               number_games : 1,
               total_duration : 1
           }}
        ]})
  • 17. SIDEBAR: WRITING AGGREGATION QUERIES
        • Prepare Data
          • Extract relevant properties from collection documents
          • Unwind the sub-collection if its documents contribute to the aggregation
        • Aggregate data
          • determine the key (_id) on which the aggregates should be done
          • name aggregates
        • Project Data
          • For final results
  • 18. EXAMPLE
        {
            "name" : "Alice",
            "location" : "us",
            "games" : [{ "game" : "WoW", "duration" : 200 },
                       { "game" : "Tetris", "duration" : 100 }]
        }
  • 19. PREPARE
        • Only use location and games:
            { $project : { location : 1, games : 1 }}
        • Unwind games, because properties of its documents are aggregated over:
            { $unwind : "$games" }
  • 20. AGGREGATE DATA
        • Aggregate on number of games (add 1 per game) and total duration (add duration per game) using location as key
            { $group : {
                _id : { location : "$location" },
                number_games : { $sum : 1 },
                total_duration : { $sum : "$games.duration" }
            }}
  • 21. PROJECT
        • Only show location and aggregates, do not show _id
            { $project : {
                _id : 0,
                location : "$_id.location",
                number_games : 1,
                total_duration : 1
            }}
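
The prepare / aggregate / project steps can be traced by hand with a small in-memory sketch. This is not how the server evaluates a pipeline; it only mirrors what `$unwind` and `$group` do to the data. Alice is the example document from the slides, while Carol and the helper names `unwindGames` and `groupByLocation` are hypothetical.

```javascript
// Alice comes from the slides; Carol is hypothetical data added for contrast.
const gamers = [
  { name: "Alice", location: "us", games: [{ game: "WoW", duration: 200 }, { game: "Tetris", duration: 100 }] },
  { name: "Carol", location: "eu", games: [{ game: "WoW", duration: 50 }] },
];

// Prepare: keep location and emit one row per element of games
// (the combined effect of the $project and $unwind stages).
function unwindGames(docs) {
  return docs.flatMap((doc) =>
    doc.games.map((g) => ({ location: doc.location, game: g.game, duration: g.duration }))
  );
}

// Aggregate: group rows by location, adding 1 per game and the duration per game
// (the effect of the $group stage). The returned objects already have the
// shape of the final $project: location plus the two named aggregates.
function groupByLocation(rows) {
  const groups = new Map();
  for (const row of rows) {
    const acc = groups.get(row.location) ?? { location: row.location, number_games: 0, total_duration: 0 };
    acc.number_games += 1;
    acc.total_duration += row.duration;
    groups.set(row.location, acc);
  }
  return [...groups.values()];
}

const result = groupByLocation(unwindGames(gamers));
console.log(result);
// us: 2 games, 300 total duration; eu: 1 game, 50 total duration
```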
  • 22. RESULT 1 • People spend a little more time playing in the US • More games played in the EU
  • 24. CHALLENGE 2 • Since we found that the EU and US play a similar amount of time and the same number of games, the new challenge is: • Let's see what the distribution of different games is in the 2 locations
  • 25. QUERY 2
        db.runCommand(
        { aggregate : "gamers", pipeline : [
           { $project : { location : 1, games : 1 }},
           { $unwind : "$games" },
           { $project : {
               location : 1,
               game : "$games.game",
               duration : "$games.duration"
           }},
           { $group : {
               _id : { location : "$location", game : "$game" },
               number_games : { $sum : 1 },
               total_duration : { $sum : "$duration" }
           }},
           { $project : {
               _id : 0,
               location : "$_id.location",
               game : "$_id.game",
               number_games : 1,
               total_duration : 1
           }}
        ]})
  • 26 to 30. QUERY 2 (the same query as slide 25, annotated step by step)
        • first $project: location, games
        • second $project: location, game, duration
        • $group key: aggregate on location and game
        • final $project: location, game, total(#games), sum(duration)
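
What changes between the first query and Query 2 is the grouping key: it becomes a compound of location and game. A rough in-memory sketch of grouping on such a key follows; the rows are hypothetical already-unwound data, and `groupBy` is an illustrative helper rather than a MongoDB function.

```javascript
// Hypothetical rows, shaped like the output of the second $project stage.
const rows = [
  { location: "us", game: "WoW", duration: 200 },
  { location: "us", game: "Tetris", duration: 100 },
  { location: "eu", game: "WoW", duration: 50 },
  { location: "eu", game: "WoW", duration: 70 },
];

// Group rows on any list of key fields, mirroring
// _id : { location : "$location", game : "$game" } in the $group stage.
function groupBy(rows, keyFields) {
  const groups = new Map();
  for (const row of rows) {
    const key = Object.fromEntries(keyFields.map((f) => [f, row[f]]));
    // Serialize the compound key so it can be used as a Map key.
    const id = JSON.stringify(key);
    const acc = groups.get(id) ?? { ...key, number_games: 0, total_duration: 0 };
    acc.number_games += 1;
    acc.total_duration += row.duration;
    groups.set(id, acc);
  }
  return [...groups.values()];
}

const perLocationAndGame = groupBy(rows, ["location", "game"]);
console.log(perLocationAndGame);
```

Calling `groupBy(rows, ["location"])` on the same rows gives the coarser per-location totals of the first query, which is the drill-down relationship between the two challenges.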
  • 31. RESULT 2 • Count: EU - WoW, US - Tetris • EU spends more time on WoW; in the US it's more evenly spread
  • 33. CHALLENGE 3: • How do I compare Bob to everyone else in the EU ?
  • 34. QUERY • 2 aggregations happening at the same time: • 1 by user • 1 by location • This query needs to be broken up into several queries • Fairly complex • Currently easiest to process in Ruby/Java/Python/...
  • 35. Aggregation by location:
        db.runCommand(
        { aggregate : "gamers", pipeline : [
           { $project : { location : 1, games : 1 }},
           { $unwind : "$games" },
           { $project : {
               location : 1,
               duration : "$games.duration"
           }},
           { $group : {
               _id : { location : "$location" },
               total_duration : { $sum : "$duration" }
           }},
           { $project : {
               name : "$_id.location",   // the location is output in the "name" field
               _id : 0,
               total_duration : 1
           }}
        ]})
        Aggregation by location, user and game:
        db.runCommand(
        { aggregate : "gamers", pipeline : [
           { $project : { name : 1, location : 1, games : 1 }},
           { $unwind : "$games" },
           { $project : {
               name : 1,
               location : 1,
               game : "$games.game",
               duration : "$games.duration"
           }},
           { $group : {
               _id : { location : "$location", name : "$name", game : "$game" },
               total_duration : { $sum : "$duration" }
           }},
           { $project : {
               name : "$_id.name",
               _id : 0,
               location : "$_id.location",
               game : "$_id.game",
               total_duration : 1
           }}
        ]})
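
One possible way to do the client-side combining is sketched below. It compares one player's total per game with the average total of the other players in a given location. The rows are invented sample data, the helper names `sumBy` and `compareToLocation` are hypothetical, and the per-player average is an assumed metric; the deck does not say exactly how its comparison was computed.

```javascript
// Invented, already-unwound rows: one entry per (player, game).
const rows = [
  { name: "Bob", location: "us", game: "WoW", duration: 120 },
  { name: "Bob", location: "us", game: "Tetris", duration: 300 },
  { name: "Ann", location: "eu", game: "WoW", duration: 100 },
  { name: "Ann", location: "eu", game: "Tetris", duration: 100 },
  { name: "Eve", location: "eu", game: "WoW", duration: 100 },
  { name: "Al", location: "us", game: "WoW", duration: 999 },
];

// Sum durations under an arbitrary key.
function sumBy(rows, keyOf) {
  const totals = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    totals.set(key, (totals.get(key) ?? 0) + row.duration);
  }
  return totals;
}

// Compare one player's per-game totals with the per-player average
// of everyone else in the given location.
function compareToLocation(rows, name, location) {
  const mine = sumBy(rows.filter((r) => r.name === name), (r) => r.game);
  const others = rows.filter((r) => r.location === location && r.name !== name);
  const othersTotal = sumBy(others, (r) => r.game);
  // Count distinct other players per game, to turn totals into averages.
  const players = new Map();
  for (const r of others) {
    if (!players.has(r.game)) players.set(r.game, new Set());
    players.get(r.game).add(r.name);
  }
  const report = {};
  for (const [game, total] of mine) {
    const n = players.get(game)?.size ?? 0;
    const average = n === 0 ? null : othersTotal.get(game) / n;
    report[game] = { mine: total, othersAverage: average, ratio: average ? total / average : null };
  }
  return report;
}

const report = compareToLocation(rows, "Bob", "eu");
console.log(report);
```

With this sample data Bob's ratio is 1.2 for WoW and 3 for Tetris, that is, 20% and 200% more than the average of the other players in the "eu" group.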
  • 36. RESULT 3 • Bob plays >20% WoW in comparison to the Europeans, but plays 200% more Tetris
  • 37. A NOTE ON QUERIES • There's no notion of a declared schema • The augmented schema is coded in queries • Reuse is very hard, as it happens at the query-language level
  • 38. DIMENSIONS • Most questions / graphs have a dimension • Time, Geo • Categories • Relative: what's X's contribution of revenue to total • You will need to be able to pass in dimensions as a predicate for your queries • or cache result and post process client-side
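
Passing a dimension in as a predicate could look roughly like the sketch below: a small builder that puts a `$match` stage in front of the grouping stages. The function name `buildPipeline` is hypothetical, and location stands in for the dimension because the sample documents carry no time field.

```javascript
// Build the per-location, per-game pipeline, optionally restricted by a predicate.
// buildPipeline is an illustrative helper, not part of the MongoDB API.
function buildPipeline(predicate = {}) {
  const stages = [
    { $unwind: "$games" },
    {
      $group: {
        _id: { location: "$location", game: "$games.game" },
        number_games: { $sum: 1 },
        total_duration: { $sum: "$games.duration" },
      },
    },
  ];
  // Filter as early as possible so later stages see fewer documents.
  const hasPredicate = Object.keys(predicate).length > 0;
  return hasPredicate ? [{ $match: predicate }, ...stages] : stages;
}

const euOnly = buildPipeline({ location: "eu" });
console.log(JSON.stringify(euOnly[0]));
```

In the mongo shell the resulting array could then be run with `db.gamers.aggregate(euOnly)`, assuming the collection is named `gamers` as in the slides.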
  • 39. A WORD ON RENDERING GRAPHS / REPORTS • Several libraries available for ruby / python / java • Gruff, Scruffy, StockCharts, D3, JRafael, jQuery Visualize, MooCharts, etc. • Also some services: John Nunemaker's work (http://get.gaug.es/) • But basically: • you know how to program, right !
  • 40. REVIEW • Understand your schema • multiple schemas in a single collection • multiple collections / multiple data sources • Iterate: • define metric • develop query and report on metrics • understand and drill down or discard • repeat • Operationalize metrics: dashboard • Dimensions • Plotting
  • 41. PUNCHLINES • We have described a software engineering process • but requirements will be very fluid • When you know how to write ruby / java / python etc. - life is good • If you’re a business analyst you have a problem • better be BFF with some engineer :)
  • 42. PLUG • We’ve been working on a declarative analytics product • (initially) uses Excel as its presentation layer • Reach out to me if you’re interested @rogerb roger@norellan.com
  • 43. THANK YOU / QUESTIONS
