mongoDB




                    advanced analytics and
                    statistics with mongodb
                         John A. De Goes @jdegoes




 http://precog.io                                   04/30/2012




          what do you want
           from your data?




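          I want to get and put data  :  MongoDB Query Language
          I want aggregates           :  MongoDB Aggregation Framework
          I want deep insight         :  ???

          SQL (drawn beneath the query language and the aggregation framework)

          data storage  <------------------------------>  data intelligence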




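          I want to get and put data  :  MongoDB Query Language
          I want aggregates           :  MongoDB Aggregation Framework
          I want deep insight         :  Map Reduce

          SQL (drawn beneath the query language and the aggregation framework)

          data storage  <------------------------------>  data intelligence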

          // Emit one partial summary per document.
          function map() {
              emit(1, // or put a GROUP BY key here
                   {sum: this.value, // the field you want stats for
                    min: this.value,
                    max: this.value,
                    count: 1,
                    diff: 0 // M2,n: sum((val-mean)^2), zero for a single value
                   });
          }

          // Merge partial summaries; the result has the same shape as an emitted value.
          function reduce(key, values) {
              var a = values[0]; // will reduce into here
              for (var i = 1; i < values.length; i++) { // start at 1: values[0] is already 'a'
                  var b = values[i]; // will merge 'b' into 'a'

                  // temp helpers
                  var delta = a.sum/a.count - b.sum/b.count; // a.mean - b.mean
                  var weight = (a.count * b.count)/(a.count + b.count);

                  // do the reducing
                  a.diff += b.diff + delta*delta*weight;
                  a.sum += b.sum;
                  a.count += b.count;
                  a.min = Math.min(a.min, b.min);
                  a.max = Math.max(a.max, b.max);
              }

              return a;
          }

          // Derive mean, population variance and standard deviation from the merged summary.
          function finalize(key, value) {
              value.avg = value.sum / value.count;
              value.variance = value.diff / value.count;
              value.stddev = Math.sqrt(value.variance);
              return value;
          }
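
To check that the reducer above merges partial statistics correctly, the sketch below runs the same emit / merge / finalize logic in plain JavaScript over an in-memory array, with no MongoDB server involved. The helper names (`mapValue`, `merge`, `finalizeStats`, `runStats`) and the sample values are assumptions made for illustration; they are not part of the talk.

```javascript
// Plain-JavaScript simulation of the map/reduce/finalize trio (no MongoDB needed).
// All function names and data here are hypothetical.

// "map": turn one raw value into a single-element summary.
function mapValue(v) {
  return { sum: v, min: v, max: v, count: 1, diff: 0 };
}

// "reduce" for two summaries: merge count, sum, min, max and the
// running sum of squared deviations from the mean (diff).
function merge(a, b) {
  var delta = a.sum / a.count - b.sum / b.count; // a.mean - b.mean
  var weight = (a.count * b.count) / (a.count + b.count);
  return {
    sum: a.sum + b.sum,
    min: Math.min(a.min, b.min),
    max: Math.max(a.max, b.max),
    count: a.count + b.count,
    diff: a.diff + b.diff + delta * delta * weight
  };
}

// "finalize": derive mean, population variance and standard deviation.
function finalizeStats(s) {
  var variance = s.diff / s.count;
  return {
    avg: s.sum / s.count,
    variance: variance,
    stddev: Math.sqrt(variance),
    min: s.min,
    max: s.max,
    count: s.count
  };
}

// Reduce two halves separately, then merge them, the way partial
// results from separate chunks would be combined.
// Assumes at least two input values so that both halves are non-empty.
function runStats(values) {
  var mid = Math.floor(values.length / 2);
  var left = values.slice(0, mid).map(mapValue).reduce(merge);
  var right = values.slice(mid).map(mapValue).reduce(merge);
  return finalizeStats(merge(left, right));
}

var stats = runStats([2, 4, 4, 4, 5, 5, 7, 9]);
// For this input the mean is 5, the variance about 4 and the stddev about 2
// (up to floating-point rounding).
console.log(stats.avg, stats.variance, stats.stddev);
```

Because `merge` only needs two summaries, the order and grouping of the partial reductions should not change the result beyond rounding, which is the property a re-reducible map/reduce job relies on.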




          what if there were
           another way?




                 introducing quirrel


          • Statistical query language for JSON data
          • Purely declarative
          • Implicitly parallel
          • Inherently composable




          a taste of quirrel
          pageViews := //pageViews

          bound := 1.5 * stdDev(pageViews.duration)

          avg := mean(pageViews.duration)

          lengthyPageViews := 
            pageViews where pageViews.duration > (avg + bound)

          lengthyPageViews.userId
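
For readers who think in JavaScript, the query above can be approximated over an in-memory array as follows. This is only a sketch: the sample records are invented, and the use of the population standard deviation is an assumption about what `stdDev` computes.

```javascript
// Hypothetical sample data standing in for the //pageViews set.
const pageViews = [
  { userId: 1, duration: 10 },
  { userId: 2, duration: 12 },
  { userId: 3, duration: 11 },
  { userId: 4, duration: 9 },
  { userId: 5, duration: 60 }
];

function mean(xs) {
  return xs.reduce((s, x) => s + x, 0) / xs.length;
}

// Population standard deviation (assumed variant).
function stdDev(xs) {
  const m = mean(xs);
  return Math.sqrt(mean(xs.map(x => (x - m) * (x - m))));
}

const durations = pageViews.map(p => p.duration);
const bound = 1.5 * stdDev(durations);
const avg = mean(durations);

// pageViews where pageViews.duration > (avg + bound)
const lengthyPageViews = pageViews.filter(p => p.duration > avg + bound);

// lengthyPageViews.userId
const lengthyUserIds = lengthyPageViews.map(p => p.userId);
console.log(lengthyUserIds); // [ 5 ]
```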




          a taste of quirrel
          pageViews := //pageViews

          bound := 1.5 * stdDev(pageViews.duration)

          avg := mean(pageViews.duration)

          lengthyPageViews := 
            pageViews where pageViews.duration > (avg + bound)

          lengthyPageViews.userId

          Users who spend an unusually long time looking at a page!




          quirrel in 10 minutes




          set-oriented
          in Quirrel everything is
          a set of events




          event
          an event is a JSON value
          paired with an identity




          (really) basic queries
          quirrel> 1
          [1]

          quirrel> true
          [true]

          quirrel> {userId: 1239823, name: "John Doe"}
          [{userId: 1239823, name: "John Doe"}]

          quirrel> 1 + 2
          [3]

          quirrel> (sqrt(16) * 4 - 1) / 3
          [5]




          loading data
          quirrel> //payments

          [{"amount":5,"date":1329741127233,"recipients":
          ["research","marketing"]}, ...]


          quirrel> load("/payments")

          [{"amount":5,"date":1329741127233,"recipients":
          ["research","marketing"]}, ...]




          variables
          quirrel> payments := //payments
                 | payments

          [{"amount":5,"date":1329741127233,"recipients":
          ["research","marketing"]}, ...]


          quirrel> five := 5
                 | five * 2
          [10]




          filtered descent
          quirrel> //users.userId

          [9823461231, 916727123, 23987183, ...]


          quirrel> //payments.recipients[0]

          ["engineering","operations","research", ...]
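
As a rough mental model, descent behaves like plucking a field (or array index) out of every value in a set. The JavaScript sketch below illustrates that idea; the `descend` helper and the sample data are hypothetical, and dropping values that lack the field is an assumption about Quirrel's behaviour rather than something stated on the slide.

```javascript
// Hypothetical stand-in for the //payments set.
const payments = [
  { amount: 5, recipients: ["research", "marketing"] },
  { amount: 12, recipients: ["engineering"] },
  { amount: 7 } // no recipients field
];

// Descend into a field name or array index.
// Values that do not have the key simply drop out of the result (assumed).
function descend(set, key) {
  return set
    .filter(v => v !== null && typeof v === "object" && key in v)
    .map(v => v[key]);
}

// Roughly: //payments.recipients[0]
const firstRecipients = descend(descend(payments, "recipients"), 0);
console.log(firstRecipients); // [ 'research', 'engineering' ]
```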




          reductions
          quirrel> count(//users)
          [24185132]

          quirrel> mean(//payments.amount)
          [87.39]

          quirrel> sum(//payments.amount)
          [921541.29]

          quirrel> stdDev(//payments.amount)
          [31.84]




          identity matching
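          [Diagram: a * b, with set a = {e1, ..., e7} lined up against
           set b = {e8, ..., e12}; "?" marks the events of a (e1 and e7)
           that have no counterpart in b]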




          identity matching
          quirrel> orders := //orders
                 | orders.subTotal +
                 | orders.subTotal *
                 | orders.taxRate +
                 | orders.shipping + orders.handling 
          [153.54805, 152.7618, 80.38365, ...]
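
One way to picture identity matching is to model every event as a value tagged with an identity, and to let binary operators combine only the events whose identities agree. The JavaScript sketch below does exactly that. The `{ id, value }` representation, the helper names and the data are assumptions for illustration; the slides do not describe how Quirrel implements this.

```javascript
// An event is modelled as { id, value }; the ids are hypothetical.
const orders = [
  { id: "o1", value: { subTotal: 100, shipping: 5 } },
  { id: "o2", value: { subTotal: 40 } }, // no shipping field
  { id: "o3", value: { subTotal: 60, shipping: 10 } }
];

// Descent keeps the identity of the event it came from.
function descend(set, field) {
  return set
    .filter(e => field in e.value)
    .map(e => ({ id: e.id, value: e.value[field] }));
}

// Combine two sets event by event, keeping only identities present in both.
function matchWith(a, b, op) {
  const byId = new Map(b.map(e => [e.id, e.value]));
  return a
    .filter(e => byId.has(e.id))
    .map(e => ({ id: e.id, value: op(e.value, byId.get(e.id)) }));
}

// Roughly: orders.subTotal + orders.shipping
const totals = matchWith(
  descend(orders, "subTotal"),
  descend(orders, "shipping"),
  (x, y) => x + y
);
console.log(totals.map(e => e.value)); // [ 105, 70 ]
```

Order `o2` has no `shipping` value, so it has no partner in the second set and does not appear in the result.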




          values
          quirrel> payments.amount * 0.10
          [6.1, 27.842, 29.084, 50, 0.5, 16.955, ...]




          filtering
          quirrel> users := //users
                 | segment := users.age > 19 & 
                 | users.age < 53 & users.income > 60000
                 | count(users where segment)
          [15]
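
The filter above first builds a set of booleans (`segment`) and then uses it to select from `users`. A minimal JavaScript sketch of the same two steps, using invented sample data and array position as a stand-in for event identity:

```javascript
// Hypothetical stand-in for the //users set.
const users = [
  { age: 25, income: 72000 },
  { age: 61, income: 90000 },
  { age: 34, income: 41000 },
  { age: 45, income: 65000 }
];

// segment: one boolean per user, aligned with users by position.
const segment = users.map(u => u.age > 19 && u.age < 53 && u.income > 60000);

// users where segment: keep the users whose matching boolean is true.
const selected = users.filter((_, i) => segment[i]);

console.log(selected.length); // 2
```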




          chaining
          pageViews := //pageViews

          bound := 1.5 * stdDev(pageViews.duration)

          avg := mean(pageViews.duration)

          lengthyPageViews := 
            pageViews where pageViews.duration > (avg + bound)

          lengthyPageViews.userId




          user functions
          quirrel> pageViews := //pageViews
                 |
                 | statsForUser('userId) :=
                 |   {userId:      'userId, 
                 |    meanPageView: mean(pageViews.duration 
                 |                       where pageViews.userId =  'userId)}
                 |
                 | statsForUser

          [{"userId":12353,"meanPageView":100.66666666666667},{"userId":
          12359,"meanPageView":83}, ...]
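
Calling `statsForUser` without an argument evaluates it for every `userId` that occurs in the data, which is close to a group-by. A hedged JavaScript equivalent, with invented sample records chosen to reproduce the two results shown above:

```javascript
// Hypothetical stand-in for the //pageViews set.
const pageViews = [
  { userId: 12353, duration: 100 },
  { userId: 12353, duration: 101 },
  { userId: 12359, duration: 83 },
  { userId: 12353, duration: 101 }
];

// Group page views by userId and compute the mean duration per user.
function statsForUser(views) {
  const totals = new Map(); // userId -> { sum, count }
  for (const v of views) {
    const t = totals.get(v.userId) || { sum: 0, count: 0 };
    t.sum += v.duration;
    t.count += 1;
    totals.set(v.userId, t);
  }
  return Array.from(totals, ([userId, t]) => ({
    userId: userId,
    meanPageView: t.sum / t.count
  }));
}

const perUser = statsForUser(pageViews);
console.log(perUser);
```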




          lots more!
          • Cross-joins
          • Self-joins
          • Augmentation
          • Power-packed standard library




          quirrel -> mongodb
          • Quirrel is extremely expressive
          • Aggregation framework insufficient
          • Working with 10gen on new primitives
          • Backup plan: AF + MapReduce




          quirrel -> mongodb
          pageViews := //pageViews

          [one-pass map/reduce]
          bound := 1.5 * stdDev(pageViews.duration)
          avg := mean(pageViews.duration)

          [one-pass mongo filter]
          lengthyPageViews := 
            pageViews where pageViews.duration > (avg + bound)
          lengthyPageViews.userId
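
The second pass of this plan can be sketched as a small function that turns the statistics from the first pass into a MongoDB query document. The shape of `stats` follows the `finalize` function from the earlier map/reduce slide; the numbers, the `lengthyFilter` name and the `pageViews` collection name are assumptions for illustration.

```javascript
// Result of pass 1 (the finalized map/reduce value); hypothetical numbers.
const stats = { avg: 20, stddev: 4 };

// Pass 2: translate "duration > (avg + bound)" into a query document,
// where bound = factor * stddev.
function lengthyFilter(s, factor) {
  return { duration: { $gt: s.avg + factor * s.stddev } };
}

const filter = lengthyFilter(stats, 1.5);
console.log(JSON.stringify(filter)); // {"duration":{"$gt":26}}

// In the mongo shell the filter pass could then look like this
// (collection name assumed):
//   db.pageViews.find(filter, { userId: 1, _id: 0 })
```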




                            q&a
                    John A. De Goes @jdegoes




 http://precog.io                              04/30/2012

More Related Content

What's hot

Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichNorberto Leite
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014MongoDB
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation Amit Ghosh
 
Geospatial Indexing and Querying with MongoDB
Geospatial Indexing and Querying with MongoDBGeospatial Indexing and Querying with MongoDB
Geospatial Indexing and Querying with MongoDBGrant Goodale
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation FrameworkCaserta
 
3D + MongoDB = 3D Repo
3D + MongoDB = 3D Repo3D + MongoDB = 3D Repo
3D + MongoDB = 3D RepoMongoDB
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2MongoDB
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolatorMichael Limansky
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation FrameworkMongoDB
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipelinezahid-mian
 
Data Governance with JSON Schema
Data Governance with JSON SchemaData Governance with JSON Schema
Data Governance with JSON SchemaMongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB
 
"Powerful Analysis with the Aggregation Pipeline (Tutorial)"
"Powerful Analysis with the Aggregation Pipeline (Tutorial)""Powerful Analysis with the Aggregation Pipeline (Tutorial)"
"Powerful Analysis with the Aggregation Pipeline (Tutorial)"MongoDB
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineJason Terpko
 
Getting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDBGetting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDBMongoDB
 
When to Use MongoDB
When to Use MongoDB When to Use MongoDB
When to Use MongoDB MongoDB
 
Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDBKishor Parkhe
 

What's hot (19)

Aggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days MunichAggregation Framework MongoDB Days Munich
Aggregation Framework MongoDB Days Munich
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014
 
Querying mongo db
Querying mongo dbQuerying mongo db
Querying mongo db
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation
 
Geospatial Indexing and Querying with MongoDB
Geospatial Indexing and Querying with MongoDBGeospatial Indexing and Querying with MongoDB
Geospatial Indexing and Querying with MongoDB
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
3D + MongoDB = 3D Repo
3D + MongoDB = 3D Repo3D + MongoDB = 3D Repo
3D + MongoDB = 3D Repo
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolator
 
Web Development
Web DevelopmentWeb Development
Web Development
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipeline
 
Data Governance with JSON Schema
Data Governance with JSON SchemaData Governance with JSON Schema
Data Governance with JSON Schema
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
 
"Powerful Analysis with the Aggregation Pipeline (Tutorial)"
"Powerful Analysis with the Aggregation Pipeline (Tutorial)""Powerful Analysis with the Aggregation Pipeline (Tutorial)"
"Powerful Analysis with the Aggregation Pipeline (Tutorial)"
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
 
Getting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDBGetting Started with Geospatial Data in MongoDB
Getting Started with Geospatial Data in MongoDB
 
When to Use MongoDB
When to Use MongoDB When to Use MongoDB
When to Use MongoDB
 
Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDB
 

Viewers also liked

Using MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseUsing MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseMongoDB
 
Rise of the scientific database
Rise of the scientific databaseRise of the scientific database
Rise of the scientific databaseJohn De Goes
 
In-Database Predictive Analytics
In-Database Predictive AnalyticsIn-Database Predictive Analytics
In-Database Predictive AnalyticsJohn De Goes
 
Post-Free: Life After Free Monads
Post-Free: Life After Free MonadsPost-Free: Life After Free Monads
Post-Free: Life After Free MonadsJohn De Goes
 
Analytics Maturity Model
Analytics Maturity ModelAnalytics Maturity Model
Analytics Maturity ModelJohn De Goes
 
Фотоматериалы
Фотоматериалы Фотоматериалы
Фотоматериалы Yerdos
 
Universidad nacional de chimbor
Universidad nacional de chimborUniversidad nacional de chimbor
Universidad nacional de chimborDoris Aguagallo
 
Product Management and Systems Thinking
Product Management and Systems ThinkingProduct Management and Systems Thinking
Product Management and Systems ThinkingDr. Arne Roock
 
How emotional abuse is wrecking your mental health
How emotional abuse is wrecking your mental healthHow emotional abuse is wrecking your mental health
How emotional abuse is wrecking your mental healthRivka Levy
 
Tulevaisuutemme verkossa
Tulevaisuutemme verkossaTulevaisuutemme verkossa
Tulevaisuutemme verkossaKaroliina Luoto
 
Grafico diario del dax perfomance index para el 10 05-2012
Grafico diario del dax perfomance index para el 10 05-2012Grafico diario del dax perfomance index para el 10 05-2012
Grafico diario del dax perfomance index para el 10 05-2012Experiencia Trading
 
7 câu mẹ nào cũng muốn hỏi khi mang bầu
7 câu mẹ nào cũng muốn hỏi khi mang bầu7 câu mẹ nào cũng muốn hỏi khi mang bầu
7 câu mẹ nào cũng muốn hỏi khi mang bầucuongdienbaby05
 
Got centerpiece? (#hewebar 2013 edition)
Got centerpiece? (#hewebar 2013 edition)Got centerpiece? (#hewebar 2013 edition)
Got centerpiece? (#hewebar 2013 edition)Michael Fienen
 
Mobile is your friend, not enemy.
Mobile is your friend, not enemy. Mobile is your friend, not enemy.
Mobile is your friend, not enemy. Edith Yeung
 
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015School of Fish: The MSC End of Term Report on sustainable fish in schools 2015
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015Marine Stewardship Council
 
Ponencia experiencia e learning y web 2.0
Ponencia experiencia e learning y web 2.0Ponencia experiencia e learning y web 2.0
Ponencia experiencia e learning y web 2.0Elizabeth Huisa Veria
 

Viewers also liked (20)

Using MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseUsing MongoDB As a Tick Database
Using MongoDB As a Tick Database
 
Rise of the scientific database
Rise of the scientific databaseRise of the scientific database
Rise of the scientific database
 
In-Database Predictive Analytics
In-Database Predictive AnalyticsIn-Database Predictive Analytics
In-Database Predictive Analytics
 
Post-Free: Life After Free Monads
Post-Free: Life After Free MonadsPost-Free: Life After Free Monads
Post-Free: Life After Free Monads
 
Analytics Maturity Model
Analytics Maturity ModelAnalytics Maturity Model
Analytics Maturity Model
 
Фотоматериалы
Фотоматериалы Фотоматериалы
Фотоматериалы
 
Universidad nacional de chimbor
Universidad nacional de chimborUniversidad nacional de chimbor
Universidad nacional de chimbor
 
Product Management and Systems Thinking
Product Management and Systems ThinkingProduct Management and Systems Thinking
Product Management and Systems Thinking
 
Barometrul mediului de afaceri romanesc 2016
Barometrul mediului de afaceri romanesc 2016Barometrul mediului de afaceri romanesc 2016
Barometrul mediului de afaceri romanesc 2016
 
How emotional abuse is wrecking your mental health
How emotional abuse is wrecking your mental healthHow emotional abuse is wrecking your mental health
How emotional abuse is wrecking your mental health
 
Tulevaisuutemme verkossa
Tulevaisuutemme verkossaTulevaisuutemme verkossa
Tulevaisuutemme verkossa
 
servo press P2113 BA for press fit
servo press P2113 BA for press fitservo press P2113 BA for press fit
servo press P2113 BA for press fit
 
Teoría de las relaciones humanas
Teoría de las relaciones humanasTeoría de las relaciones humanas
Teoría de las relaciones humanas
 
Grafico diario del dax perfomance index para el 10 05-2012
Grafico diario del dax perfomance index para el 10 05-2012Grafico diario del dax perfomance index para el 10 05-2012
Grafico diario del dax perfomance index para el 10 05-2012
 
7 câu mẹ nào cũng muốn hỏi khi mang bầu
7 câu mẹ nào cũng muốn hỏi khi mang bầu7 câu mẹ nào cũng muốn hỏi khi mang bầu
7 câu mẹ nào cũng muốn hỏi khi mang bầu
 
Got centerpiece? (#hewebar 2013 edition)
Got centerpiece? (#hewebar 2013 edition)Got centerpiece? (#hewebar 2013 edition)
Got centerpiece? (#hewebar 2013 edition)
 
Mobile is your friend, not enemy.
Mobile is your friend, not enemy. Mobile is your friend, not enemy.
Mobile is your friend, not enemy.
 
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015School of Fish: The MSC End of Term Report on sustainable fish in schools 2015
School of Fish: The MSC End of Term Report on sustainable fish in schools 2015
 
Ponencia experiencia e learning y web 2.0
Ponencia experiencia e learning y web 2.0Ponencia experiencia e learning y web 2.0
Ponencia experiencia e learning y web 2.0
 
Vanvasa
VanvasaVanvasa
Vanvasa
 

Similar to Advanced Analytics & Statistics with MongoDB

Shankar's mongo db presentation
Shankar's mongo db presentationShankar's mongo db presentation
Shankar's mongo db presentationShankar Kamble
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDBNorberto Leite
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB and Ruby on Rails
MongoDB and Ruby on RailsMongoDB and Ruby on Rails
MongoDB and Ruby on Railsrfischer20
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWAnkur Raina
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorHenrik Ingo
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & AggregationMongoDB
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBMongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb introchristkv
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
 
mongodb-introduction
mongodb-introductionmongodb-introduction
mongodb-introductionTse-Ching Ho
 
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON SchemaMongoDB
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineMongoDB
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB
 

Similar to Advanced Analytics & Statistics with MongoDB (20)

Shankar's mongo db presentation
Shankar's mongo db presentationShankar's mongo db presentation
Shankar's mongo db presentation
 
MongoDB and Python
MongoDB and PythonMongoDB and Python
MongoDB and Python
 
Building your first app with MongoDB
Building your first app with MongoDBBuilding your first app with MongoDB
Building your first app with MongoDB
 
Mongo db dla administratora
Mongo db dla administratoraMongo db dla administratora
Mongo db dla administratora
 
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB and Ruby on Rails
MongoDB and Ruby on RailsMongoDB and Ruby on Rails
MongoDB and Ruby on Rails
 
Python and MongoDB
Python and MongoDB Python and MongoDB
Python and MongoDB
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUW
 
Analytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorAnalytics with MongoDB Aggregation Framework and Hadoop Connector
Analytics with MongoDB Aggregation Framework and Hadoop Connector
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
MongoDB.pdf
MongoDB.pdfMongoDB.pdf
MongoDB.pdf
 
mongodb-introduction
mongodb-introductionmongodb-introduction
mongodb-introduction
 
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema
[MongoDB.local Bengaluru 2018] Just in Time Validation with JSON Schema
 
MongoDB
MongoDBMongoDB
MongoDB
 
Data Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane FineData Analytics with MongoDB - Jane Fine
Data Analytics with MongoDB - Jane Fine
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
MongoDb and NoSQL
MongoDb and NoSQLMongoDb and NoSQL
MongoDb and NoSQL
 

More from John De Goes

Refactoring Functional Type Classes
Refactoring Functional Type ClassesRefactoring Functional Type Classes
Refactoring Functional Type ClassesJohn De Goes
 
One Monad to Rule Them All
One Monad to Rule Them AllOne Monad to Rule Them All
One Monad to Rule Them AllJohn De Goes
 
Error Management: Future vs ZIO
Error Management: Future vs ZIOError Management: Future vs ZIO
Error Management: Future vs ZIOJohn De Goes
 
Atomically { Delete Your Actors }
Atomically { Delete Your Actors }Atomically { Delete Your Actors }
Atomically { Delete Your Actors }John De Goes
 
The Death of Final Tagless
The Death of Final TaglessThe Death of Final Tagless
The Death of Final TaglessJohn De Goes
 
Scalaz Stream: Rebirth
Scalaz Stream: RebirthScalaz Stream: Rebirth
Scalaz Stream: RebirthJohn De Goes
 
Scalaz Stream: Rebirth
Scalaz Stream: RebirthScalaz Stream: Rebirth
Scalaz Stream: RebirthJohn De Goes
 
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional Programming
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional ProgrammingZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional Programming
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional ProgrammingJohn De Goes
 
Blazing Fast, Pure Effects without Monads — LambdaConf 2018
Blazing Fast, Pure Effects without Monads — LambdaConf 2018Blazing Fast, Pure Effects without Monads — LambdaConf 2018
Blazing Fast, Pure Effects without Monads — LambdaConf 2018John De Goes
 
Scalaz 8: A Whole New Game
Scalaz 8: A Whole New GameScalaz 8: A Whole New Game
Scalaz 8: A Whole New GameJohn De Goes
 
Scalaz 8 vs Akka Actors
Scalaz 8 vs Akka ActorsScalaz 8 vs Akka Actors
Scalaz 8 vs Akka ActorsJohn De Goes
 
Orthogonal Functional Architecture
Orthogonal Functional ArchitectureOrthogonal Functional Architecture
Orthogonal Functional ArchitectureJohn De Goes
 
The Design of the Scalaz 8 Effect System
The Design of the Scalaz 8 Effect SystemThe Design of the Scalaz 8 Effect System
The Design of the Scalaz 8 Effect SystemJohn De Goes
 
Quark: A Purely-Functional Scala DSL for Data Processing & Analytics
Quark: A Purely-Functional Scala DSL for Data Processing & AnalyticsQuark: A Purely-Functional Scala DSL for Data Processing & Analytics
Quark: A Purely-Functional Scala DSL for Data Processing & AnalyticsJohn De Goes
 
Streams for (Co)Free!
Streams for (Co)Free!Streams for (Co)Free!
Streams for (Co)Free!John De Goes
 
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...John De Goes
 
Halogen: Past, Present, and Future
Halogen: Past, Present, and FutureHalogen: Past, Present, and Future
Halogen: Past, Present, and FutureJohn De Goes
 
All Aboard The Scala-to-PureScript Express!
All Aboard The Scala-to-PureScript Express!All Aboard The Scala-to-PureScript Express!
All Aboard The Scala-to-PureScript Express!John De Goes
 

More from John De Goes (20)

Refactoring Functional Type Classes
Refactoring Functional Type ClassesRefactoring Functional Type Classes
Refactoring Functional Type Classes
 
One Monad to Rule Them All
One Monad to Rule Them AllOne Monad to Rule Them All
One Monad to Rule Them All
 
Error Management: Future vs ZIO
Error Management: Future vs ZIOError Management: Future vs ZIO
Error Management: Future vs ZIO
 
Atomically { Delete Your Actors }
Atomically { Delete Your Actors }Atomically { Delete Your Actors }
Atomically { Delete Your Actors }
 
The Death of Final Tagless
The Death of Final TaglessThe Death of Final Tagless
The Death of Final Tagless
 
Scalaz Stream: Rebirth
Scalaz Stream: RebirthScalaz Stream: Rebirth
Scalaz Stream: Rebirth
 
Scalaz Stream: Rebirth
Scalaz Stream: RebirthScalaz Stream: Rebirth
Scalaz Stream: Rebirth
 
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional Programming
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional ProgrammingZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional Programming
ZIO Schedule: Conquering Flakiness & Recurrence with Pure Functional Programming
 
ZIO Queue
ZIO QueueZIO Queue
ZIO Queue
 
Blazing Fast, Pure Effects without Monads — LambdaConf 2018
Blazing Fast, Pure Effects without Monads — LambdaConf 2018Blazing Fast, Pure Effects without Monads — LambdaConf 2018
Blazing Fast, Pure Effects without Monads — LambdaConf 2018
 
Scalaz 8: A Whole New Game
Scalaz 8: A Whole New GameScalaz 8: A Whole New Game
Scalaz 8: A Whole New Game
 
Scalaz 8 vs Akka Actors
Scalaz 8 vs Akka ActorsScalaz 8 vs Akka Actors
Scalaz 8 vs Akka Actors
 
Orthogonal Functional Architecture
Orthogonal Functional ArchitectureOrthogonal Functional Architecture
Orthogonal Functional Architecture
 
The Design of the Scalaz 8 Effect System
The Design of the Scalaz 8 Effect SystemThe Design of the Scalaz 8 Effect System
The Design of the Scalaz 8 Effect System
 
Quark: A Purely-Functional Scala DSL for Data Processing & Analytics
Quark: A Purely-Functional Scala DSL for Data Processing & AnalyticsQuark: A Purely-Functional Scala DSL for Data Processing & Analytics
Quark: A Purely-Functional Scala DSL for Data Processing & Analytics
 
Streams for (Co)Free!
Streams for (Co)Free!Streams for (Co)Free!
Streams for (Co)Free!
 
MTL Versus Free
MTL Versus FreeMTL Versus Free
MTL Versus Free
 
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...
The Easy-Peasy-Lemon-Squeezy, Statically-Typed, Purely Functional Programming...
 
Halogen: Past, Present, and Future
Halogen: Past, Present, and FutureHalogen: Past, Present, and Future
Halogen: Past, Present, and Future
 
All Aboard The Scala-to-PureScript Express!
All Aboard The Scala-to-PureScript Express!All Aboard The Scala-to-PureScript Express!
All Aboard The Scala-to-PureScript Express!
 


Advanced Analytics & Statistics with MongoDB

  • 1. mongoDB advanced analytics and statistics with mongodb John A. De Goes @jdegoes http://precog.io 04/30/2012
  • 2. what do you want from your data?
  • 3. I want to get and put data: MongoDB Query Language. I want aggregates: MongoDB Aggregation Framework. I want deep insight: ??? (SQL is shown beneath the first two; the axis runs from data storage to data intelligence.)
  • 4. I want to get and put data: MongoDB Query Language. I want aggregates: MongoDB Aggregation Framework. I want deep insight: Map Reduce. (SQL is shown beneath the first two; the axis runs from data storage to data intelligence.)
  • 5.
        function map() {
            emit(1, // Or put a GROUP BY key here
                 {sum: this.value, // the field you want stats for
                  min: this.value,
                  max: this.value,
                  count:1,
                  diff: 0, // M2,n: sum((val-mean)^2)
            });
        }

        function reduce(key, values) {
            var a = values[0]; // will reduce into here
            for (var i=1/*!*/; i < values.length; i++){
                var b = values[i]; // will merge 'b' into 'a'

                // temp helpers
                var delta = a.sum/a.count - b.sum/b.count; // a.mean - b.mean
                var weight = (a.count * b.count)/(a.count + b.count);

                // do the reducing
                a.diff += b.diff + delta*delta*weight;
                a.sum += b.sum;
                a.count += b.count;
                a.min = Math.min(a.min, b.min);
                a.max = Math.max(a.max, b.max);
            }
            return a;
        }

        function finalize(key, value){
            value.avg = value.sum / value.count;
            value.variance = value.diff / value.count;
            value.stddev = Math.sqrt(value.variance);
            return value;
        }
  • 6. what if there were another way?
  • 7. introducing Quirrel
        • Statistical query language for JSON data
        • Purely declarative
        • Implicitly parallel
        • Inherently composable
  • 8. a taste of quirrel
        pageViews := //pageViews
        bound := 1.5 * stdDev(pageViews.duration)
        avg := mean(pageViews.duration)
        lengthyPageViews :=
          pageViews where pageViews.duration > (avg + bound)
        lengthyPageViews.userId
  • 9. a taste of quirrel
        pageViews := //pageViews
        bound := 1.5 * stdDev(pageViews.duration)
        avg := mean(pageViews.duration)
        lengthyPageViews :=
          pageViews where pageViews.duration > (avg + bound)
        lengthyPageViews.userId
        (callout: Users who spend an unusually long time looking at a page!)
  • 10. quirrel in 10 minutes
  • 11. set-oriented: in Quirrel everything is a set of events
  • 12. event: an event is a JSON value paired with an identity
  • 13. (really) basic queries
        quirrel> 1
        [1]
        quirrel> true
        [true]
        quirrel> {userId: 1239823, name: "John Doe"}
        [{userId: 1239823, name: "John Doe"}]
        quirrel> 1 + 2
        [3]
        quirrel> (sqrt(16) * 4 - 1) / 3
        [5]
  • 14. loading data
        quirrel> //payments
        [{"amount":5,"date":1329741127233,"recipients":["research","marketing"]}, ...]
        quirrel> load("/payments")
        [{"amount":5,"date":1329741127233,"recipients":["research","marketing"]}, ...]
  • 15. variables
        quirrel> payments := //payments
               | payments
        [{"amount":5,"date":1329741127233,"recipients":["research","marketing"]}, ...]
        quirrel> five := 5
               | five * 2
        [10]
  • 16. filtered descent
        quirrel> //users.userId
        [9823461231, 916727123, 23987183, ...]
        quirrel> //payments.recipients[0]
        ["engineering","operations","research", ...]
  • 17. reductions
        quirrel> count(//users)
        24185132
        quirrel> mean(//payments.amount)
        87.39
        quirrel> sum(//payments.amount)
        921541.29
        quirrel> stdDev(//payments.amount)
        31.84
  • 18. identity matching [diagram: sets a and b, with events labeled e1 through e12, combined as a * b]
  • 19. identity matching
        quirrel> orders := //orders
               | orders.subTotal +
               | orders.subTotal *
               | orders.taxRate +
               | orders.shipping + orders.handling
        [153.54805, 152.7618, 80.38365, ...]
  • 20. values
        quirrel> payments.amount * 0.10
        [6.1, 27.842, 29.084, 50, 0.5, 16.955, ...]
  • 21. filtering
        quirrel> users := //users
               | segment := users.age > 19 &
               |   users.age < 53 & users.income > 60000
               | count(users where segment)
        [15]
  • 22. chaining
        pageViews := //pageViews
        bound := 1.5 * stdDev(pageViews.duration)
        avg := mean(pageViews.duration)
        lengthyPageViews :=
          pageViews where pageViews.duration > (avg + bound)
        lengthyPageViews.userId
  • 23. user functions
        quirrel> pageViews := //pageViews
               |
               | statsForUser('userId) :=
               |   {userId: 'userId,
               |    meanPageView: mean(pageViews.duration
               |                       where pageViews.userId = 'userId)}
               |
               | statsForUser
        [{"userId":12353,"meanPageView":100.66666666666667},{"userId":12359,"meanPageView":83}, ...]
  • 24. lots more!
        • Cross-joins
        • Self-joins
        • Augmentation
        • Power-packed standard library
  • 25. quirrel -> mongodb
        • Quirrel is extremely expressive
        • Aggregation framework insufficient
        • Working with 10gen on new primitives
        • Backup plan: AF + MapReduce
  • 26. quirrel -> mongodb
        pageViews := //pageViews
        bound := 1.5 * stdDev(pageViews.duration)
        avg := mean(pageViews.duration)
        lengthyPageViews :=
          pageViews where pageViews.duration > (avg + bound)
        lengthyPageViews.userId
        (callouts: bound and avg are a one-pass map/reduce; lengthyPageViews and lengthyPageViews.userId are a one-pass mongo filter)
  • 27. Q&A John A. De Goes @jdegoes http://precog.io 04/30/2012
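
The two-pass plan on slide 26 can be sketched in plain JavaScript, reusing the map/reduce/finalize logic from slide 5. This is only an illustration under stated assumptions: no MongoDB server is involved, the pageViews array is made-up sample data, and the helper names mapDoc, reduceStats and finalizeStats are hypothetical stand-ins for the slide's map, reduce and finalize (the unused key argument is dropped). The partial aggregates are reduced in two chunks and then merged, roughly the way a map/reduce engine may call reduce again on earlier reduce output.

```javascript
// Made-up sample data; field names follow the Quirrel example on slide 8.
const pageViews = [
  { userId: 1, duration: 10 },
  { userId: 2, duration: 12 },
  { userId: 3, duration: 11 },
  { userId: 4, duration: 9 },
  { userId: 5, duration: 13 },
  { userId: 6, duration: 60 },
];

// Map step (after slide 5): one partial aggregate per document.
function mapDoc(doc) {
  const v = doc.duration;
  return { sum: v, min: v, max: v, count: 1, diff: 0 };
}

// Reduce step (after slide 5): merge partial aggregates pairwise.
// diff accumulates the sum of squared deviations from the mean.
function reduceStats(values) {
  const a = { ...values[0] }; // copy so the inputs are not mutated
  for (let i = 1; i < values.length; i++) {
    const b = values[i];
    const delta = a.sum / a.count - b.sum / b.count; // a.mean - b.mean
    const weight = (a.count * b.count) / (a.count + b.count);
    a.diff += b.diff + delta * delta * weight;
    a.sum += b.sum;
    a.count += b.count;
    a.min = Math.min(a.min, b.min);
    a.max = Math.max(a.max, b.max);
  }
  return a;
}

// Finalize step (after slide 5): derive mean, variance and stddev.
function finalizeStats(value) {
  const avg = value.sum / value.count;
  const variance = value.diff / value.count; // divides by n, not n - 1
  return { ...value, avg, variance, stddev: Math.sqrt(variance) };
}

// Pass 1 ("one-pass map/reduce"): reduce two chunks, then merge the results.
const mapped = pageViews.map(mapDoc);
const partials = [reduceStats(mapped.slice(0, 3)), reduceStats(mapped.slice(3))];
const stats = finalizeStats(reduceStats(partials));

// Pass 2 ("one-pass mongo filter"): duration > avg + 1.5 * stdDev.
const threshold = stats.avg + 1.5 * stats.stddev;
const lengthyUserIds = pageViews
  .filter((pv) => pv.duration > threshold)
  .map((pv) => pv.userId);

console.log(lengthyUserIds); // [ 6 ] for this sample
```

In this sample only userId 6 exceeds the threshold. Reducing in chunks gives the same variance as a single pass over all six values, which is the property the delta and weight terms in the reduce step are there to provide.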