Managing Social Content with MongoDB
Chris Harris - charris@10gen.com
twitter: @cj_harris5
Traditional Architecture

                  HTML

          Web Server

       Application Server

          Controllers

           Services


                   SQL




           Database
Challenge - Write Volumes
Increase in Write Ratio
 Users don’t just want to read content!

They want to share and contribute to the
               content!


             Volume Writes!
Need to Scale Datasource


       JSON       JSON      JSON

          Web Server


       Application Server

          Service #1


                  SQL


                                   Bottleneck!
Application Cache?


    JSON       JSON      JSON

       Web Server


    Application Server
       Service #1

       App Cache

               SQL
Issues


+ Read-only data comes from a cache

- Writes slow down because both the cache and the
database must be updated

- Cached data must be kept in sync across
application servers
IT needs are evolving...
Agile Development
• Iterative
• Continuous

Data Volume, Type & Use
• Trillions of records
• 100s of millions of queries per second
• Real-Time Analytics
• Unstructured / semi-structured

New Hardware Architectures
• Commodity servers
• Cloud Computing
• Horizontal Scaling
Tradeoff: Scale vs. Functionality


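[Chart: scalability & performance (vertical axis) vs. depth of functionality (horizontal axis)]
• memcached: highest scalability & performance, least functionality
• key/value: in between
• RDBMS: greatest depth of functionality, lowest scalability & performance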
Terminology

RDBMS       MongoDB

Table       Collection

Row(s)      JSON Document

Index       Index

Join        Embedding & Linking
Publishing Content with MongoDB
A simple start

article = {author: "Chris",
      date: new Date(),
      title: "Managing Social Content"}

> db.articles.save(article)



Map the documents to your application.
Find the document

> db.articles.find()
  { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
    author: "Chris",
    date: ISODate("2012-01-23T14:01:00.117Z"),
    title: "Managing Social Content"
  }


Note:
• _id must be unique, but can be anything you'd like
• Default BSON ObjectId if one is not supplied
Add an index, find via index
> db.articles.ensureIndex({author: 1})
> db.articles.find({author: 'Chris'})

   { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
     author: "Chris",
     date: ISODate("2012-01-23T14:01:00.117Z"),
      ...
    }



Secondary index on "author"
Social Tagging
Extending the schema




    http://nysi.org.uk/kids_stuff/rocket/rocket.htm
Adding Tags

  > db.articles.update(
       {title: "Managing Social Content" },
       // $each appends each tag; a bare array would be pushed as one nested element
       {$push: {tags: {$each: ["MongoDB", "NoSQL"]}}}
    )




Push social "tags" into the existing article
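
A detail worth checking when pushing several tags at once: `$push` appends its argument as a single value, so an array passed directly ends up as one nested element, while `$each` appends the values one by one. The sketch below mimics both behaviours on a plain JavaScript object. It does not talk to MongoDB, and `pushValue` and `pushEach` are hypothetical helper names used only for illustration.

```javascript
// In-memory illustration only; pushValue and pushEach are hypothetical helpers,
// not MongoDB APIs.
function pushValue(doc, field, value) {
  // Mimics {$push: {field: value}}: the value is appended as ONE element.
  const copy = { ...doc, [field]: [...(doc[field] || [])] };
  copy[field].push(value);
  return copy;
}

function pushEach(doc, field, values) {
  // Mimics {$push: {field: {$each: values}}}: each value is appended separately.
  const copy = { ...doc, [field]: [...(doc[field] || [])] };
  copy[field].push(...values);
  return copy;
}

const article = { title: "Managing Social Content" };
const nested = pushValue(article, "tags", ["MongoDB", "NoSQL"]);
const flat = pushEach(article, "tags", ["MongoDB", "NoSQL"]);

console.log(JSON.stringify(nested.tags)); // [["MongoDB","NoSQL"]]
console.log(JSON.stringify(flat.tags));   // ["MongoDB","NoSQL"]
```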
Find the document

> db.articles.find()
  { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
    author: "Chris",
    date: ISODate("2012-01-23T14:01:00.117Z"),
    title: "Managing Social Content",
    tags: [ "MongoDB", "NoSQL" ]
  }
Social Comments
Query operators
• Conditional operators:
  ‣ $ne, $in, $nin, $mod, $all, $size, $exists,
    $type, ..
  ‣ $lt, $lte, $gt, $gte


• Update operators:
  ‣ $set, $inc, $push
Extending the Schema
new_comment = {author: "Marc",
               date: new Date(),
               text: "great article",
               stars: 5}

> db.articles.update(
     {title: "Managing Social Content" },

      {"$push": {comments: new_comment},
        "$inc": {comments_count: 1}
      }
  )
Extending the Schema
    { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "Chris",
      date: ISODate("2012-01-23T14:01:00.117Z"),
      title : "Managing Social Content",
      tags : [ "MongoDB", "NoSQL" ],
      comments : [{
          author : "Marc",
          date : ISODate("2012-01-23T14:31:53.848Z"),
          text : "great article",
          stars : 5
      }],
      comments_count: 1
    }
Trees
 // Embedded tree

  { ...
    comments : [{
        author : "Marc", text : "...",
        replies : [{
            author : "Fred", text : "...",
            replies : []
        }]
    }]
  }
+ PROs: Single Document, Performance, Intuitive

- CONs: Hard to search, Partial Results, 16MB limit
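
The "hard to search" drawback can be made concrete: finding every comment by a given author in an embedded tree means walking the nesting in application code. A minimal sketch on a plain JavaScript object, assuming the `comments` / `replies` shape shown above. `findByAuthor` is a hypothetical helper, and the sample texts are invented.

```javascript
// Hypothetical helper: depth-first walk over an embedded comment tree.
// Returns the index path and text of every comment written by `author`.
function findByAuthor(comments, author, path = []) {
  const hits = [];
  comments.forEach((comment, i) => {
    const here = [...path, i];
    if (comment.author === author) {
      hits.push({ path: here, text: comment.text });
    }
    // Recurse into nested replies, if any.
    hits.push(...findByAuthor(comment.replies || [], author, here));
  });
  return hits;
}

// Invented sample document in the embedded-tree shape.
const articleDoc = {
  comments: [{
    author: "Marc", text: "great article",
    replies: [{ author: "Fred", text: "agreed", replies: [] }]
  }]
};

console.log(findByAuthor(articleDoc.comments, "Fred"));
// [ { path: [ 0, 0 ], text: 'agreed' } ]
```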
One to Many - Normalized
 // Articles collection
 { _id : 1000,
    author : "Chris",
    date: ISODate("2012-01-23T14:01:00.117Z"),
    title : "Managing Social Content"
  }
  // Comments collection
  { _id : 1,
     article : 1000,
     author : "Marc",
     date : ISODate("2012-01-23T14:31:53.848Z"),
     ...
  }
> article = db.articles.findOne({title: "Managing Social Content"});
> db.comments.find({article: article._id});
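
With the normalized layout the "join" happens in the application: one lookup for the article, then a second for its comments. A sketch using plain arrays as stand-ins for the two collections (no MongoDB connection). `commentsFor` is a hypothetical helper, and the second article and comment are invented for the example.

```javascript
// Arrays standing in for db.articles and db.comments.
const articles = [
  { _id: 1000, author: "Chris", title: "Managing Social Content" },
  { _id: 1001, author: "Chris", title: "Another Article" } // invented sample
];
const comments = [
  { _id: 1, article: 1000, author: "Marc" },
  { _id: 2, article: 1001, author: "Fred" } // invented sample
];

// Hypothetical helper: resolve the article first, then filter comments by its _id.
function commentsFor(title) {
  const found = articles.find(a => a.title === title);  // plays the role of findOne()
  if (!found) return [];
  return comments.filter(c => c.article === found._id); // plays the role of find({article: ...})
}

console.log(commentsFor("Managing Social Content"));
// [ { _id: 1, article: 1000, author: 'Marc' } ]
```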
Array of Ancestors
// Tree: a is the root; b and e reply to a; c and d reply to b; f replies to e
// Store all ancestors of a node
{ _id: "a" }
{ _id: "b", thread: [ "a" ], replyTo: "a" }
{ _id: "c", thread: [ "a", "b" ], replyTo: "b" }
{ _id: "d", thread: [ "a", "b" ], replyTo: "b" }
{ _id: "e", thread: [ "a" ], replyTo: "a" }
{ _id: "f", thread: [ "a", "e" ], replyTo: "e" }
// find all messages in the thread below "b"
> db.msg_tree.find({"thread": "b"})
// find all direct replies to "b"
> db.msg_tree.find({"replyTo": "b"})
// find all ancestors of "f"
> threads = db.msg_tree.findOne({"_id": "f"}).thread
> db.msg_tree.find({"_id": {$in: threads}})
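
The three lookups can be checked without a database. The sketch below replays them over a plain array holding the same six documents. It is an in-memory approximation, not the MongoDB query engine, and `msgTree` and `ids` are names chosen for the example.

```javascript
// Array standing in for db.msg_tree.
const msgTree = [
  { _id: "a" },
  { _id: "b", thread: ["a"], replyTo: "a" },
  { _id: "c", thread: ["a", "b"], replyTo: "b" },
  { _id: "d", thread: ["a", "b"], replyTo: "b" },
  { _id: "e", thread: ["a"], replyTo: "a" },
  { _id: "f", thread: ["a", "e"], replyTo: "e" }
];

const ids = docs => docs.map(d => d._id);

// Everything below b: nodes whose ancestor array contains "b".
const descendants = msgTree.filter(d => (d.thread || []).includes("b"));

// Direct replies to b.
const replies = msgTree.filter(d => d.replyTo === "b");

// Ancestors of f: read its thread array, then fetch those nodes.
const threads = msgTree.find(d => d._id === "f").thread;
const ancestors = msgTree.filter(d => threads.includes(d._id));

console.log(ids(descendants)); // [ 'c', 'd' ]
console.log(ids(replies));     // [ 'c', 'd' ]
console.log(ids(ancestors));   // [ 'a', 'e' ]
```

Each lookup is a single pass over one field, which is the point of storing the ancestor array on every node.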
Location, Location, Location!
Geospatial
• Geo hash stored in B-Tree
• First two values indexed
db.articles.save({
  loc: { long: 40.739037, lat: 40.739037 }
});

db.articles.save({
  loc: [40.739037, 40.739037]
});

db.articles.ensureIndex({"loc": "2d"})
Geospatial Query
• Multi-location indexes for a single document
• $near may return the document for each index match
• $within will return a document once and once only

Find 100 nearby locations:

> db.locations.find({loc: {$near: [37.75, -122.42]}});


Find all locations within a box:
> box = [[40, 40], [60, 60]]
> db.locations.find({loc: {$within: {$box: box}}});
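
What the box query selects can be sketched as a simple bounds check. This is a plain JavaScript approximation, not the geo index itself. `withinBox` is a hypothetical helper, the sample documents are invented, and treating the box edges as inside is an assumption of the sketch.

```javascript
// Hypothetical helper: true when [x, y] lies inside the box given as
// [[minX, minY], [maxX, maxY]]. Edges count as inside (an assumption here).
function withinBox(point, box) {
  const [x, y] = point;
  const [[minX, minY], [maxX, maxY]] = box;
  return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

// Invented sample documents using the [x, y] array form of loc.
const locations = [
  { name: "inside", loc: [40.739037, 40.739037] },
  { name: "outside", loc: [37.75, -122.42] }
];
const box = [[40, 40], [60, 60]];

const matches = locations.filter(d => withinBox(d.loc, box));
console.log(matches.map(d => d.name)); // [ 'inside' ]
```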
Social Aggregation
Aggregation framework

• New aggregation framework
  • Declarative framework (no JavaScript)
  • Describe a chain of operations to apply
  • Expression evaluation
    • Return computed values
  • Framework: new operations added easily
  • C++ implementation
Aggregation - Pipelines

• Aggregation requests specify a pipeline
• A pipeline is a series of operations
• Members of a collection are passed through
  a pipeline to produce a result
  • Similar to a Unix pipeline: ps -ef | grep -i mongod
Example - twitter
 {
     "_id" : ObjectId("4f47b268fb1c80e141e9888c"),
     "user" : {
         "friends_count" : 73,
         "location" : "Brazil",
         "screen_name" : "Bia_cunha1",
         "name" : "Beatriz Helena Cunha",
         "followers_count" : 102
     }
 }


• Find the # of followers and # friends by location
Example - twitter
db.tweets.aggregate([
  // Predicate
  {$match:
    {"user.friends_count": { $gt: 0 },
     "user.followers_count": { $gt: 0 }
    }
  },
  // Parts of the document you want to project
  {$project:
    { location: "$user.location",
      friends: "$user.friends_count",
      followers: "$user.followers_count"
    }
  },
  // Function to apply to the result set
  {$group:
    {_id:     "$location",
     friends: {$sum: "$friends"},
     followers: {$sum: "$followers"}
    }
  }
]);
Example - twitter
{
    "result" : [
        {
            "_id" : "Far Far Away",
            "friends" : 344,
            "followers" : 789
        },
        ...
    ],
    "ok" : 1
}
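
To see what each stage contributes, the match / project / group chain can be mimicked with ordinary array operations. This is a sketch in plain JavaScript, not the aggregation framework. The four sample tweets are invented, so the totals differ from the result shown above.

```javascript
// Invented sample tweets in the shape of the example document.
const tweets = [
  { user: { location: "Brazil", friends_count: 73, followers_count: 102 } },
  { user: { location: "Brazil", friends_count: 10, followers_count: 5 } },
  { user: { location: "UK", friends_count: 0, followers_count: 40 } }, // dropped by the match step
  { user: { location: "UK", friends_count: 7, followers_count: 3 } }
];

const result = tweets
  // $match: keep users with at least one friend and one follower
  .filter(t => t.user.friends_count > 0 && t.user.followers_count > 0)
  // $project: pull the three fields up to the top level
  .map(t => ({
    location: t.user.location,
    friends: t.user.friends_count,
    followers: t.user.followers_count
  }))
  // $group: sum friends and followers per location
  .reduce((groups, t) => {
    const g = groups.get(t.location) || { _id: t.location, friends: 0, followers: 0 };
    g.friends += t.friends;
    g.followers += t.followers;
    return groups.set(t.location, g);
  }, new Map());

console.log([...result.values()]);
// [ { _id: 'Brazil', friends: 83, followers: 107 },
//   { _id: 'UK', friends: 7, followers: 3 } ]
```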
Demo
Demo files are at https://gist.github.com/2036709
Use Cases
• Content Management
• Operational Intelligence
• Product Data Management
• User Data Management
• High Volume Data Feeds
Some Customers..
Questions

• 10gen Services
 – Development Support
 – Consultancy
 – TAM
 – Production Support


• Free online MongoDB training
 – Develop
 – Deploy
 – Classes start Oct. 2012

 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Aggregating Social Data by Location

  • 1. Managing Social Content with MongoDB Chris Harris - charris@10gen.com twitter: @cj_harris5
  • 3. Traditional Architecture HTML Web Server Application Server Controllers Services SQL Database
  • 5. Increase in Write Ratio Users don’t just want to read content! They want to share and contribute to the content! Volume Writes!
  • 6. Need to Scale Datasource JSON JSON JSON Web Server Application Server Service #1 SQL Bottleneck!
  • 8. Application Cache? JSON JSON JSON Web Server Application Server Service #1 App Cache SQL
  • 9. Issues + Read Only data comes from a Cache - Writes slow down as need to update the Cache and the Database - Need to keep cache data in sync between Application Servers
  • 10. IT needs are evolving... Agile Development: Iterative; Continuous. Data Volume, Type & Use: Trillions of records; 100’s of millions of queries per second; Real-Time Analytics; Unstructured / semi-structured. New Hardware Architectures: Commodity servers; Cloud Computing; Horizontal Scaling.
  • 11. Tradeoff: Scale vs. Functionality [Chart: scalability & performance plotted against depth of functionality; points shown: memcached, key/value, RDBMS]
  • 12. Terminology (RDBMS → MongoDB): Table → Collection; Row(s) → JSON Document; Index → Index; Join → Embedding & Linking
  • 14. A simple start article = {author: "Chris", date: new Date(), title: "Managing Social Content"} > db.articles.save(article) Map the documents to your application.
  • 15. Find the document > db.articles.find() { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Chris", date: ISODate("2012-01-23T14:01:00.117Z"), title: "Managing Social Content" } Note: • _id must be unique, but can be anything you'd like • Default BSON ObjectId if one is not supplied
  • 16. Add an index, find via index > db.articles.ensureIndex({author: 1}) > db.articles.find({author: 'Chris'}) { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Chris", date: ISODate("2012-01-23T14:01:00.117Z"), ... } Secondary index on "author"
  • 18. Extending the schema http://nysi.org.uk/kids_stuff/rocket/rocket.htm
  • 19. Adding Tags > db.articles.update( {title: "Managing Social Content"}, {$push: {tags: {$each: ["MongoDB", "NoSQL"]}}} ) Push social "tags" into the existing article. $each appends each tag as its own array element; a bare array passed to $push would be stored as a single nested element.
  • 20. Find the document > db.articles.find() { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"), author: "Chris", date: ISODate("2012-01-23T14:01:00.117Z"), title: "Managing Social Content", tags: [ "MongoDB", "NoSQL" ] } Note: • _id must be unique, but can be anything you'd like • Default BSON ObjectId if one is not supplied
  • 22. Query operators • Conditional operators: ‣ $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. ‣ $lt, $lte, $gt, $gte • Update operators: ‣ $set, $inc, $push
  • 23. Extending the Schema new_comment = {author: "Marc", date: new Date(), text: "great article", stars: 5} > db.articles.update( {title: "Managing Social Content" }, {"$push": {comments: new_comment}, "$inc": {comments_count: 1} } )
  • 24. Extending the Schema { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Chris", date: ISODate("2012-01-23T14:01:00.117Z"), title : "Managing Social Content", tags : [ "MongoDB", "NoSQL" ], comments : [{ author : "Marc", date : ISODate("2012-01-23T14:31:53.848Z"), text : "great article", stars : 5 }], comments_count: 1 }
  • 26. Trees //Embedded Tree { ... comments : [{ author : "Marc", text : "...", replies : [{ author : "Fred", text : "...", replies : [] }] }] } + PROs: Single Document, Performance, Intuitive - CONs: Hard to search, Partial Results, 16MB limit
  • 27. One to Many - Normalized // Articles collection { _id : 1000, author : "Chris", date: ISODate("2012-01-23T14:01:00.117Z"), title : "Managing Social Content" } // Comments collection { _id : 1, article : 1000, author : "Marc", date : ISODate("2012-01-23T14:31:53.848Z"), ... } > article = db.articles.findOne({title: "Managing Social Content"}); > db.comments.find({article: article._id});
  • 28-31. Array of Ancestors [Tree diagram: A is the root; B and E reply to A; C and D reply to B; F replies to E] // Store all ancestors of a node { _id: "a" } { _id: "b", thread: [ "a" ], replyTo: "a" } { _id: "c", thread: [ "a", "b" ], replyTo: "b" } { _id: "d", thread: [ "a", "b" ], replyTo: "b" } { _id: "e", thread: [ "a" ], replyTo: "a" } { _id: "f", thread: [ "a", "e" ], replyTo: "e" } // find all messages whose thread includes "b" > db.msg_tree.find({"thread": "b"}) // find all direct replies to "b" > db.msg_tree.find({"replyTo": "b"}) // find all ancestors of "f" > threads = db.msg_tree.findOne({"_id": "f"}).thread > db.msg_tree.find({"_id": {$in: threads}})
  • 33. Geospatial • Geo hash stored in B-Tree • First two values indexed db.articles.save({ loc: { long: 40.739037, lat: 40.739037 } }); db.articles.save({ loc: [40.739037, 40.739037] }); db.articles.ensureIndex({"loc": "2d"})
  • 34. Geospatial Query • Multi-location indexes for a single document • $near may return the document for each index match • $within will return a document once and once only Find 100 nearby locations: > db.locations.find({loc: {$near: [37.75, -122.42]}}); Find all locations within a box >box = [[40, 40], [60, 60]] >db.locations.find({loc: {$within: {$box: box}}});
  • 36. Aggregation framework • New aggregation framework • Declarative framework (no JavaScript) • Describe a chain of operations to apply • Expression evaluation • Return computed values • Framework: new operations added easily • C++ implementation
  • 37. Aggregation - Pipelines • Aggregation requests specify a pipeline • A pipeline is a series of operations • Members of a collection are passed through a pipeline to produce a result • Compare a Unix shell pipeline: ps -ef | grep -i mongod
  • 38. Example - twitter { "_id" : ObjectId("4f47b268fb1c80e141e9888c"), "user" : { "friends_count" : 73, "location" : "Brazil", "screen_name" : "Bia_cunha1", "name" : "Beatriz Helena Cunha", "followers_count" : 102 } } • Find the # of followers and # friends by location
  • 39-42. Example - twitter db.tweets.aggregate([ /* predicate */ {$match: {"user.friends_count": {$gt: 0}, "user.followers_count": {$gt: 0}}}, /* parts of the document you want to project */ {$project: {location: "$user.location", friends: "$user.friends_count", followers: "$user.followers_count"}}, /* function to apply to the result set */ {$group: {_id: "$location", friends: {$sum: "$friends"}, followers: {$sum: "$followers"}}} ]);
  • 43. Example - twitter { "result" : [ { "_id" : "Far Far Away", "friends" : 344, "followers" : 789 }, ... ], "ok" : 1 }
  • 44. Demo Demo files are at https://gist.github.com/2036709
  • 45. Use Cases Content Management Operational Intelligence Product Data Management User Data Management High Volume Data Feeds
  • 47. Questions • 10gen Services – Development Support – Consultancy – TAM – Production Support • Free online MongoDB training – Develop – Deploy – Classes start Oct. 2012
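
Slide 26 lists "hard to search" as a drawback of the embedded tree. A minimal sketch of why, in plain JavaScript with no database involved: finding every comment by one author means walking the nested replies arrays in application code. The helper commentsBy, the commenter "Ann", and all comment texts are assumptions made up for illustration; only the document shape comes from the slide.

```javascript
// Hypothetical article using the embedded-tree shape from slide 26.
// "Ann" and every comment text are invented for this sketch.
const article = {
  title: "Managing Social Content",
  comments: [
    {
      author: "Marc",
      text: "great article",
      replies: [{ author: "Fred", text: "agreed", replies: [] }],
    },
    {
      author: "Ann",
      text: "thanks",
      replies: [{ author: "Marc", text: "welcome", replies: [] }],
    },
  ],
};

// Depth-first walk: collect the text of every comment written by `author`,
// however deeply it is nested under other comments.
function commentsBy(author, comments) {
  const found = [];
  for (const comment of comments) {
    if (comment.author === author) found.push(comment.text);
    found.push(...commentsBy(author, comment.replies || []));
  }
  return found;
}

console.log(commentsBy("Marc", article.comments));
```

With the normalized layout from slide 27, the same question would be a single query on the comments collection, which is the trade-off the PROs and CONs on the slide point at.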
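
The array-of-ancestors slides (28 to 31) can be checked without a server. The sketch below keeps the six documents from the slides in a plain JavaScript array and mimics the three queries with array filters; the function names descendantsOf, directReplies and ancestorsOf are hypothetical, and the filters only approximate what the corresponding find() calls match.

```javascript
// In-memory stand-in for the msg_tree collection shown on the slides.
const msgTree = [
  { _id: "a" },
  { _id: "b", thread: ["a"], replyTo: "a" },
  { _id: "c", thread: ["a", "b"], replyTo: "b" },
  { _id: "d", thread: ["a", "b"], replyTo: "b" },
  { _id: "e", thread: ["a"], replyTo: "a" },
  { _id: "f", thread: ["a", "e"], replyTo: "e" },
];

// Mimics find({thread: id}): every message that has `id` among its ancestors.
function descendantsOf(id) {
  return msgTree
    .filter((m) => (m.thread || []).includes(id))
    .map((m) => m._id);
}

// Mimics find({replyTo: id}): direct replies only.
function directReplies(id) {
  return msgTree.filter((m) => m.replyTo === id).map((m) => m._id);
}

// Mimics findOne({_id: id}).thread followed by find({_id: {$in: thread}}).
function ancestorsOf(id) {
  const node = msgTree.find((m) => m._id === id);
  const thread = (node && node.thread) || [];
  return msgTree.filter((m) => thread.includes(m._id)).map((m) => m._id);
}

console.log(descendantsOf("b"), directReplies("a"), ancestorsOf("f"));
```

The design choice the slides illustrate is that each node pays a little extra storage for its thread array so that subtree and ancestor lookups need no recursion.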
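
The $within / $box query on slide 34 selects documents whose loc falls inside a rectangle. As a rough illustration only, the same selection can be written as a filter over an in-memory array. The sample points and the helper withinBox are hypothetical, and treating points on the edge as inside is a choice made for this sketch, not a statement about how the server handles boundaries.

```javascript
// Hypothetical sample points; `loc` is an [x, y] pair as on slide 33.
const locations = [
  { name: "inside-1", loc: [45, 50] },
  { name: "west-of-box", loc: [39, 50] },
  { name: "inside-2", loc: [59, 41] },
  { name: "north-of-box", loc: [50, 61] },
];

// box = [[minX, minY], [maxX, maxY]]: lower-left and upper-right corners.
// Edge points count as inside here (an assumption of this sketch).
function withinBox(docs, box) {
  const [[minX, minY], [maxX, maxY]] = box;
  return docs.filter(
    ({ loc: [x, y] }) => x >= minX && x <= maxX && y >= minY && y <= maxY
  );
}

const box = [[40, 40], [60, 60]];
console.log(withinBox(locations, box).map((d) => d.name));
```

A linear scan like this is what the "2d" index from slide 33 is meant to avoid on a real collection.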
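
To make the three stages of the twitter example (slides 38 to 43) concrete, here is a hedged plain-JavaScript equivalent that runs on an in-memory array: filter plays the role of $match, map the role of $project, and a Map of running totals the role of $group with $sum. Only the first sample document's counts and location come from slide 38; the other documents and the function name friendsAndFollowersByLocation are assumptions for illustration.

```javascript
// Sample input: the first entry reuses the counts from slide 38,
// the remaining entries are invented for this sketch.
const tweets = [
  { user: { location: "Brazil", friends_count: 73, followers_count: 102 } },
  { user: { location: "Brazil", friends_count: 10, followers_count: 5 } },
  { user: { location: "UK", friends_count: 0, followers_count: 40 } },
  { user: { location: "UK", friends_count: 20, followers_count: 30 } },
];

function friendsAndFollowersByLocation(docs) {
  const totals = new Map();
  docs
    // $match: keep users with at least one friend and one follower
    .filter((d) => d.user.friends_count > 0 && d.user.followers_count > 0)
    // $project: pull the three fields of interest up to the top level
    .map((d) => ({
      location: d.user.location,
      friends: d.user.friends_count,
      followers: d.user.followers_count,
    }))
    // $group: sum friends and followers per location
    .forEach((d) => {
      const t = totals.get(d.location) || { _id: d.location, friends: 0, followers: 0 };
      t.friends += d.friends;
      t.followers += d.followers;
      totals.set(d.location, t);
    });
  return Array.from(totals.values());
}

console.log(friendsAndFollowersByLocation(tweets));
```

The returned array has the same shape as the "result" field on slide 43: one object per location, keyed by _id.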

Editor's Notes

  10. Evolutions in computing are significantly impacting the traditional RDBMS. Data volumes are orders of magnitude higher than before, with tens of millions of queries a second and a mix of structured and unstructured data. Cloud computing and storage favor scaling horizontally on commodity servers rather than vertically, where you eventually reach capacity and have to buy a bigger box or an expensive SAN. Developers are also no longer doing waterfall development; they want to be more agile and more flexible in their data models.
  11. Where does MongoDB sit when you compare functionality against performance? We aim to have most of the features of a relational database, but not complex joins, which do not scale.
  12. * No joins, for scalability - joins across shards in SQL are highly inefficient and difficult to perform. * MongoDB is geared for easy scaling - going from a single node to a distributed cluster is easy. * Little or no application code changes are needed to scale from a single node to a sharded cluster.
  16. * You can always add and remove indexes at runtime (but reindexing will take some time).
  19. * upserts - $push, $inc * atomicity
  22. * Rich query language * Powerful - can do range queries with $lt and $gt * Update - can update parts of documents
  23. * upserts - $push, $inc * atomicity
  24. * Later, when extending: what's wrong with that schema? Comments: with a lot of comments the document keeps growing, and a single document can be at most 16 MB in size; padding factors also come into play.
  25. * Also a one-to-many pattern
  52. Change some customers to European ones. Craigslist had the problem that they couldn't introduce new features as fast as they wanted, because each one required a schema change, which has a massive impact on the database. possible
  53. And counting... The National Archives are digitising their whole dataset and storing it in MongoDB. The Guardian: main database for every new project. Navteq: discovered MongoDB because of its location-based features and now loves it for the flexibility of the schema. CERN: uses it for their data aggregation system; all the systems feeding that database result in 1M updates a day. A customer in France: 250 million products stored (product data only, not images, which are stored in our own CDN); 300 million reads per day (peak at 1600 reads per second); 150 million writes per day.