MongoDB
My name is
Steve Francia

     @spf13
• 15+ years building the
  internet

• BYU Alumnus
• Father, husband,
  skateboarder

• Chief Solutions Architect @
  10gen
Introduction to
   MongoDB
Why MongoDB?
Agility
     Easily model complex data

  Database speaks your languages
       (Java, .NET, PHP, etc.)

Schemaless data model enables faster
        development cycle
Scale
Easy and automatic scale out
Cost
Cost effectively manage abundant data
       (clickstreams, logs, etc.)
• Company behind MongoDB
 • (A)GPL license, own copyrights,
    engineering team
  • support, consulting, commercial license
    revenue
• Management
 • Google/DoubleClick, Oracle,   Apple,
    NetApp
  • Funding: Sequoia, Union Square, Flybridge
  • Offices in NYC and Redwood Shores, CA
  • 50+ employees
MongoDB Goals
• OpenSource
• Designed for today
 • Today’s hardware / environments
 • Today’s challenges
• Easy development
• Reliable
• Scalable
A bit of history
1974
The relational database is created
1979   1982-1996   1995
Computers in 1995

• Pentium 100 MHz
• 10BASE-T
• 16 MB RAM
• 200 MB HD
Cell Phones in 2011

• Dual core 1.5 GHz
• WiFi 802.11n (300+ Mbps)
• 1 GB RAM
• 64 GB solid state
How about a DB
designed for today?
It started with
 DoubleClick
Signs something
      needed
• doubleclick - 400,000 ads/second
• people writing their own stores
• caching is de rigueur
• complex ORM frameworks
• computer architecture trends
• cloud computing
Requirements
• need a good degree of functionality
  to handle a large set of use cases

 • sometimes need strong
   consistency / atomicity

 • secondary indexes
 • ad hoc queries
Trim unneeded
      features
• leave out a few things so we can
  scale

 • no choice but to leave out
   relational

 • distributed transactions are hard
   to scale
Needed a scalable
   data model
• some options:
 • key/value
 • columnar / tabular
 • document oriented (JSON inspired)
• opportunity to innovate -> agility
MongoDB philosophy
•   No longer one-size-fits-all, but not 12 tools either.

•   Non-relational (no joins) makes scaling horizontally
    practical

•   Document data models are good

•   Keep functionality when we can (key/value stores are
    great, but we need more)

•   Database technology should run anywhere, being
    available both for running on your own servers or VMs,
    and also as a cloud pay-for-what-you-use service.
•   Ideally open source...
MongoDB

• JSON Documents
• Querying/Indexing/Updating similar
  to relational databases

• Traditional Consistency
• Auto-Sharding
Under the hood

• Written in C++
• Available on most platforms
• Data serialized to BSON
• Extensive use of memory-mapped
  files
Database
Landscape
MongoDB is:
[Diagram labels: Application, Document Oriented, High Performance, Horizontally Scalable]

Document Oriented
    { author: "steve",
      date: new Date(),
      text: "About MongoDB...",
      tags: ["tech", "database"] }
This has led
    some to say

"MongoDB has the best
features of key/value
stores, document
databases and relational
databases in one."
              John Nunemaker
Use Cases
Photo Meta-
Problem:
• Business needed more flexibility than Oracle could deliver

Solution:
• Used MongoDB instead of Oracle

Results:
• Developed application in one sprint cycle
• 500% cost reduction compared to Oracle
• 900% performance improvement compared to Oracle
Customer Analytics
Problem:
• Deal with massive data volume across all customer sites

Solution:
• Used MongoDB to replace Google Analytics / Omniture
  options
Results:
• Less than one week to build prototype and prove business
  case
• Rapid deployment of new features
Online
Problem:
• MySQL could not scale to handle their 5B+ documents

Solution:
• Switched from MySQL to MongoDB

Results:
• Massive simplification of code base
• Eliminated need for external caching system
• 20x performance improvement over MySQL
E-commerce
Problem:
• Multi-vertical E-commerce impossible to model (efficiently)
  in RDBMS

Solution:
• Switched from MySQL to MongoDB

Results:
•   Massive simplification of code base
•   Rapidly build, halving time to market (and cost)
•   Eliminated need for external caching system
•   50x+ improvement over MySQL
Tons more
Pretty much if you can use an RDBMS or key/value
        store, MongoDB is a great fit
In Good Company
Schema Design
Relational made normalized
     data look like this
Document databases make
normalized data look like this
Terminology
   RDBMS                   Mongo
Table, View     ➜   Collection
Row             ➜   JSON Document
Index           ➜   Index
Join            ➜   Embedded Document
Partition       ➜   Shard
Partition Key   ➜   Shard Key
Tables to
Documents
      {
          title: "MongoDB",
          contributors: [
             { name: "Eliot Horowitz",
               email: "eh@10gen.com" },
             { name: "Dwight Merriman",
               email: "dm@10gen.com" }
          ],
          model: {
              relational: false,
              awesome: true
          }
      }
DEMO TIME
Documents
Blog Post Document


> p = {author:   "roger",
         date:   new Date(),
         text:   "About MongoDB...",
         tags:   ["tech", "databases"]}

> db.posts.save(p)
Querying
> db.posts.find()

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11",
  text : "About MongoDB...",
  tags : [ "tech", "databases" ] }

Note: _id is unique, but can be
anything you'd like
Secondary Indexes
Create index on any Field in Document


    //   1 means ascending, -1 means descending
    > db.posts.ensureIndex({author: 1})
    > db.posts.find({author: 'roger'})

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  ... }
Conditional Query
   Operators
$all, $exists, $mod, $ne, $in, $nin, $nor,
$or, $size, $type, $lt, $lte, $gt, $gte

// find posts with any tags
> db.posts.find( {tags: {$exists: true }} )

// find posts matching a regular expression
// (author starts with "rog", case-insensitive)
> db.posts.find( {author: /^rog/i } )

// count posts by author
> db.posts.find( {author: "roger"} ).count()
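
To make the operator semantics above concrete, here is a minimal in-memory sketch of how a query document can be matched against a post. It is an illustration only, not the server's implementation: `matches`, `matchesCondition`, `asArray` and the `votes` field are hypothetical names introduced for this example, and only a handful of the listed operators are covered.

```javascript
// Hypothetical helper: treat scalars and arrays uniformly.
function asArray(value) {
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: test one field value against one condition.
function matchesCondition(value, cond) {
  if (cond instanceof RegExp) {
    return typeof value === "string" && cond.test(value);
  }
  if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
    // Operator document such as { $gte: 5 }: every operator must hold.
    return Object.entries(cond).every(([op, arg]) => {
      switch (op) {
        case "$exists": return (value !== undefined) === arg;
        case "$gt":     return value > arg;
        case "$gte":    return value >= arg;
        case "$lt":     return value < arg;
        case "$lte":    return value <= arg;
        case "$in":     return arg.some((a) => asArray(value).includes(a));
        case "$size":   return Array.isArray(value) && value.length === arg;
        default: throw new Error("operator not covered by this sketch: " + op);
      }
    });
  }
  // Plain value: equality, or membership when the field holds an array.
  return asArray(value).includes(cond);
}

// A document matches when every field condition in the query holds.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) =>
    matchesCondition(doc[field], cond));
}

// Sample data; "votes" is an assumed field, not part of the slides.
const posts = [
  { author: "roger", tags: ["tech", "databases"], votes: 3 },
  { author: "fred", votes: 7 },
];

const withTags = posts.filter((p) => matches(p, { tags: { $exists: true } }));
const byRegex = posts.filter((p) => matches(p, { author: /^rog/i }));
const popular = posts.filter((p) => matches(p, { votes: { $gte: 5 } }));
console.log(withTags.length, byRegex.length, popular[0].author);
```

Note that a plain value such as `{tags: "tech"}` matches when the array field contains that value, which is why the sketch checks membership rather than strict equality.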
Update Operations
 $set, $unset, $inc, $push, $pushAll,
 $pull, $pullAll, $bit


> comment = { author: "fred",
              date: new Date(),
              text: "Best Movie Ever"}

> db.posts.update( { _id: "..." },
                   { $push: {comments: comment} } );
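
As a rough illustration of what these update operators do to a document, the following sketch applies a small subset of them to a plain object. `applyUpdate` and the `views` field are hypothetical names for this example; the real server applies updates atomically on the stored document and supports many more cases.

```javascript
// Hypothetical helper: apply a subset of update operators in place.
// This models the effect on the document only, not server behaviour.
function applyUpdate(doc, update) {
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      switch (op) {
        case "$set":
          doc[field] = arg;
          break;
        case "$unset":
          delete doc[field];
          break;
        case "$inc":
          doc[field] = (doc[field] || 0) + arg;
          break;
        case "$push":
          // Create the array on first use, then append.
          if (doc[field] === undefined) doc[field] = [];
          doc[field].push(arg);
          break;
        case "$pull":
          // Remove every element equal to the argument.
          doc[field] = (doc[field] || []).filter((v) => v !== arg);
          break;
        default:
          throw new Error("operator not covered by this sketch: " + op);
      }
    }
  }
  return doc;
}

const post = { author: "roger", tags: ["tech", "databases"] };
const comment = { author: "fred", text: "Best Movie Ever" };

applyUpdate(post, { $push: { comments: comment } });
applyUpdate(post, { $inc: { views: 1 }, $pull: { tags: "tech" } }); // "views" is an assumed field
console.log(JSON.stringify(post));
```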
Nested Documents
    {   _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
        author : "roger",
        date : "Sat Apr 24 2011 19:47:11",
        text : "About MongoDB...",
        tags : [ "tech", "databases" ],
        comments : [
            {
                author : "Fred",
                date : "Sat Apr 25 2010 20:51:03 GMT-0700",
                text : "Best Post Ever!"
            }
        ]
    }
Secondary Indexes
// Index nested documents
> db.posts.ensureIndex( { "comments.author": 1 } )
> db.posts.find( { "comments.author": "Fred" } )

// Index on tags (multi-key index)
> db.posts.ensureIndex( { tags: 1 } )
> db.posts.find( { tags: "tech" } )

// geospatial index
> db.posts.ensureIndex( { "author.location": "2d" } )
> db.posts.find( { "author.location": { $near : [22,42] } } )
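
The idea behind a multi-key index is that each element of an array field gets its own index entry pointing back at the document. A minimal in-memory model, assuming a simple map from value to document ids (`buildMultiKeyIndex` is a hypothetical name, and the real index is a B-tree rather than a map):

```javascript
// Hypothetical in-memory model of a multi-key index on one field.
function buildMultiKeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // this sketch skips documents without the field
    // One entry per distinct array element (or one entry for a scalar).
    const keys = new Set(Array.isArray(value) ? value : [value]);
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, tags: ["tech", "databases"] },
  { _id: 2, tags: ["tech"] },
  { _id: 3, tags: ["travel"] },
  { _id: 4 },
];

const tagIndex = buildMultiKeyIndex(posts, "tags");
console.log(tagIndex.get("tech")); // ids of posts tagged "tech"
```

A lookup such as `find({tags: "tech"})` then becomes a single index probe instead of a scan over every document's array.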
Rich Documents
{   _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),

    line_items : [ { sku: "tt-123",
                     name: "Coltrane: Impressions" },
                   { sku: "tt-457",
                     name: "Davis: Kind of Blue" } ],

    address : { name: "Banker",
                street: "111 Main",
                zip: 10010 },

    payment: { cc: 4567,
               exp: new Date(2011, 7, 7) },

    subtotal: 2355
}
High Availability
MongoDB
        Replication
•MongoDB replication like MySQL
 replication (kinda)
•Asynchronous master/slave
•Variations
   •Master / slave
   •Replica Sets
Replica Set features
•   A cluster of N servers
•   Any (one) node can be primary

•   Consensus election of primary

•   Automatic failover
•   Automatic recovery

•   All writes to primary

•    Reads can be to primary (default) or a
    secondary
How MongoDB
Replication works
[Diagram sequence: a replica set with Member 1, Member 2 and Member 3]

• Set is made up of 2 or more nodes
• Election establishes the PRIMARY (Member 2)
  • Data replication from PRIMARY to SECONDARY
• PRIMARY may fail (Member 2 DOWN; Members 1 and 3 negotiate a new master)
  • Automatic election of new PRIMARY if majority exists
• New PRIMARY elected (Member 3)
  • Replica set re-established
• Automatic recovery (Member 2 RECOVERING)
• Replica set re-established (Member 3 remains PRIMARY)
Creating a Replica
        Set
> cfg = {
    _id : "acme_a",
    members : [
      { _id : 0, host : "sf1.acme.com" },
      { _id : 1, host : "sf2.acme.com" },
      { _id : 2, host : "sf3.acme.com" } ] }
> use admin
> db.runCommand( { replSetInitiate : cfg } )
Replica Set Options
•   {arbiterOnly: true}

    •   Can vote in an election
    •   Does not hold any data

•   {hidden: true}
    •   Not reported in isMaster()

    •   Will not be sent slaveOk() reads
•   {priority: n}

•   {tags: }
Using Replicas for
       Reads
• slaveOk()
  • driver will send read requests to
    Secondaries
  • driver will always send writes to Primary
• Java examples
  • DB.slaveOk()
  • Collection.slaveOk()
  • find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
Safe Writes
•   db.runCommand({getLastError: 1, w : 1})
    •   - ensure write is synchronous

    •   - command returns after primary has written to memory

•   w=n or w='majority'
    •   n is the number of nodes data must be replicated to

    •   driver will always send writes to Primary
•   w='myTag' [MongoDB 2.0]

    •   Each member is "tagged" e.g. "US_EAST", "EMEA",
        "US_WEST"

    •   Ensure that the write is executed in each tagged "region"
Safe Writes
• fsync:true
 • Ensures changed disk blocks are
   flushed to disk


• j:true
 • Ensures changes are flushed to the
   journal
When are elections
   triggered?
• When a given member sees that the
  Primary is not reachable

• The member is not an Arbiter
• The member has a priority greater than other
  eligible members
Typical Deployments

Use?  Set size  Data Protection  High Availability  Notes
 X    One       No               No                 Must use --journal to protect against crashes
      Two       Yes              No                 On loss of one member, surviving member is read only
      Three     Yes              Yes - 1 failure    On loss of one member, surviving two members can elect a new primary
 X    Four      Yes              Yes - 1 failure*   * On loss of two members, surviving two members are read only
      Five      Yes              Yes - 2 failures   On loss of two members, surviving three members can elect a new primary
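
The availability column follows from simple majority arithmetic: a primary can only be elected while more than half of the set's members are reachable. A short sketch of that calculation (`majority` and `toleratedFailures` are names chosen for this example, and every member is assumed to hold one vote):

```javascript
// Smallest number of members that forms a strict majority of the set.
function majority(setSize) {
  return Math.floor(setSize / 2) + 1;
}

// Members that can be lost while a majority can still elect a primary.
function toleratedFailures(setSize) {
  return setSize - majority(setSize);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(n + " members: majority " + majority(n) +
              ", tolerates " + toleratedFailures(n) + " failure(s)");
}
```

This is why a four-member set tolerates no more failures than a three-member set, and why odd set sizes are the usual choice.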
Replication features
•    Reads from Primary are always
    consistent

•    Reads from Secondaries are eventually
    consistent

•   Automatic failover if a Primary fails

•    Automatic recovery when a node joins
    the set

•   Control of where writes occur
Scaling
Sharding MongoDB
What is Sharding
• Ad-hoc partitioning
• Consistent hashing
 • Amazon Dynamo
• Range based partitioning
 • Google BigTable
 • Yahoo! PNUTS
 • MongoDB
MongoDB Sharding
• Automatic partitioning and
  management
• Range based
• Convert to sharded system with no
  downtime
• Fully consistent
How MongoDB
Sharding Works
How MongoDB Sharding works
 > db.runCommand( { addshard : "shard1" } );
 > db.runCommand( { shardCollection : "mydb.blogs",
                    key : { age : 1 } } )

        -∞   +∞  




•Range keys from -∞ to +∞  
•Ranges are stored as “chunks”
How MongoDB Sharding works

 > db.posts.save( {age:40} )



        -∞   +∞  

  -∞   40      41 +∞  


•Data is inserted
•Ranges are split into more “chunks”
How MongoDB Sharding works

 > db.posts.save( {age:40} )
 > db.posts.save( {age:50} )


        -∞   +∞  

  -∞   40      41 +∞  
          41 50       51 +∞  
•More data is inserted
•Ranges are split into more "chunks"
How MongoDB Sharding works

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )
> db.posts.save( {age:60} )

      -∞   +∞  

 -∞   40      41 +∞  
        41 50        51 +∞  
               51 60          61 +∞  
How MongoDB Sharding works




shard1
 -∞   40
  41 50
  51 60
  61 +∞  
How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );



shard1
 -∞   40
  41 50
  51 60
  61 +∞  
How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );



shard1         shard2
 -∞   40
                 41 50
  51 60
                61 +∞  
How MongoDB Sharding works

> db.runCommand( { addshard : "shard2" } );
> db.runCommand( { addshard : "shard3" } );


shard1         shard2           shard3
 -∞   40
                 41 50
                                 51 60
                61 +∞  
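
With chunks spread over three shards, routing by shard key comes down to a lookup in the chunk table. The sketch below copies the chunk layout from the slide above; `shardFor` and `shardsForRange` are hypothetical helpers, the keys are assumed to be integers, and the bounds are treated as inclusive because that is how the slides draw them.

```javascript
// Chunk table taken from the three-shard slide (inclusive integer bounds
// are an assumption of this sketch).
const chunks = [
  { min: -Infinity, max: 40, shard: "shard1" },
  { min: 41, max: 50, shard: "shard2" },
  { min: 51, max: 60, shard: "shard3" },
  { min: 61, max: Infinity, shard: "shard2" },
];

// Hypothetical helper: which shard owns a single shard-key value?
function shardFor(age) {
  const chunk = chunks.find((c) => age >= c.min && age <= c.max);
  if (!chunk) throw new Error("no chunk covers key " + age);
  return chunk.shard;
}

// Hypothetical helper: which shards hold chunks overlapping [lo, hi]?
function shardsForRange(lo, hi) {
  const owners = chunks
    .filter((c) => c.min <= hi && c.max >= lo)
    .map((c) => c.shard);
  return [...new Set(owners)];
}

console.log(shardFor(40), shardFor(45), shardFor(75));
console.log(shardsForRange(38, 52));
```

A point lookup touches exactly one shard, while a range on the shard key touches only the shards whose chunks overlap that range.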
How MongoDB
Sharding Works
Sharding Features
•   Shard data with no downtime
•   Automatic balancing as data is written
•   Commands routed (switched) to correct node
    •   Inserts - must have the Shard Key
    •   Updates - must have the Shard Key
    •   Queries
        •   With Shard Key - routed to nodes
        •   Without Shard Key - scatter gather
    •   Indexed Queries
        •   With Shard Key - routed in order
        •   Without Shard Key - distributed sort merge
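
For a sorted query without the shard key, each shard can return its own results already sorted, and the router only has to merge them. A minimal sketch of that merge step, using made-up per-shard results (`shardResults` and `mergeSorted` are hypothetical names, not part of mongos):

```javascript
// Hypothetical per-shard results, each already sorted by age on its shard.
const shardResults = {
  shard1: [{ age: 12 }, { age: 40 }],
  shard2: [{ age: 41 }, { age: 47 }, { age: 75 }],
  shard3: [{ age: 55 }],
};

// Merge several sorted lists into one sorted list (k-way merge).
function mergeSorted(lists, key) {
  const cursors = lists.map(() => 0); // next unread position in each list
  const out = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (cursors[i] >= lists[i].length) continue; // this list is exhausted
      if (best === -1 ||
          lists[i][cursors[i]][key] < lists[best][cursors[best]][key]) {
        best = i;
      }
    }
    if (best === -1) return out; // every list is exhausted
    out.push(lists[best][cursors[best]++]);
  }
}

const merged = mergeSorted(Object.values(shardResults), "age");
console.log(merged.map((d) => d.age).join(","));
```

The router never needs to re-sort the full result set; it only compares the head of each shard's stream.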
Sharding
Architecture
Architecture
Cong Servers
• 3 of them
• changes are made with 2 phase
  commit

• if any are down, meta data
  goes read only

• system is online as long as 1/3
  is up
Cong Servers
• 3 of them
• changes are made with 2 phase
  commit

• if any are down, meta data
  goes read only

• system is online as long as 1/3
  is up
Shards


• Can be master, master/slave or
  replica sets

• Replica sets give sharding + full
  auto-failover

• Regular mongod processes
Mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on appserver so no extra
  network traffic
Advanced
Replication
Priorities
•   Prior to 2.0.0

    •   {priority:0} // Never can be elected Primary

    •   {priority:1} // Can be elected Primary

•   New in 2.0.0

        •   Priority, floating point number between 0 and 1000

        •   During an election

            •   Most up to date

            •   Highest priority

        •   Allows weighting of members during failover
Priorities - example
[Diagram: members A (p:2), B (p:2), C (p:1), D (p:1), E (p:0)]

•   Assuming all members are up to date
•   Members A or B will be chosen first
    •   Highest priority
•   Members C or D will be chosen next if
    •   A and B are unavailable
    •   A and B are not up to date
•   Member E is never chosen
    •   priority:0 means it cannot be elected
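
The selection rule in this example can be sketched as a filter followed by a maximum. This is a simplification for illustration: `pickPrimary` and the `up` / `upToDate` flags are hypothetical, ties go to the first member listed, and the majority requirement of a real election is ignored.

```javascript
// Hypothetical helper: choose the member this example would elect.
function pickPrimary(members) {
  // priority 0 members can never be elected.
  const eligible = members.filter((m) => m.up && m.upToDate && m.priority > 0);
  if (eligible.length === 0) return null;
  // Highest priority wins; on a tie the earlier member is kept.
  return eligible.reduce((best, m) => (m.priority > best.priority ? m : best));
}

// Members and priorities from the slide; up/upToDate are assumed flags.
function makeSet(down) {
  return [
    { name: "A", priority: 2 },
    { name: "B", priority: 2 },
    { name: "C", priority: 1 },
    { name: "D", priority: 1 },
    { name: "E", priority: 0 },
  ].map((m) => ({ ...m, up: !down.includes(m.name), upToDate: true }));
}

const allUp = pickPrimary(makeSet([]));
const abDown = pickPrimary(makeSet(["A", "B"]));
const onlyE = pickPrimary(makeSet(["A", "B", "C", "D"]));
console.log(allUp.name, abDown.name, onlyE);
```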
Tagging
•   New in 2.0.0

•   Control over where data is written to

•   Each member can have one or more tags e.g.

    •   tags: {dc: "ny"}

    •   tags: {dc: "ny",
            ip: "192.168",
            rack: "row3rk7"}

•   Replica set defines rules for where data resides

•   Rules can change without changing application code
Tagging - example
{
    _id : "mySet",
    members : [
        {_id : 0, host : "A",   tags    :   {"dc":   "ny"}},
        {_id : 1, host : "B",   tags    :   {"dc":   "ny"}},
        {_id : 2, host : "C",   tags    :   {"dc":   "sf"}},
        {_id : 3, host : "D",   tags    :   {"dc":   "sf"}},
        {_id : 4, host : "E",   tags    :   {"dc":   "cloud"}}],
    settings : {
        getLastErrorModes : {
            allDCs : {"dc" :    3},
            someDCs : {"dc" :   2}} }
}

> db.blogs.insert({...})
> db.runCommand({getLastError : 1, w : "allDCs"})
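
One way to read the rules above, assuming a rule such as `{"dc" : 3}` means "acknowledged by members carrying three distinct values of the `dc` tag", is sketched below. `modeSatisfied` and the list of acknowledging hosts are hypothetical; the server performs this check itself when `w` names a mode.

```javascript
// Members and modes copied from the configuration above.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const getLastErrorModes = {
  allDCs: { dc: 3 },
  someDCs: { dc: 2 },
};

// Hypothetical helper: has the write reached enough distinct tag values?
function modeSatisfied(modeName, ackedHosts) {
  const rule = getLastErrorModes[modeName];
  return Object.entries(rule).every(([tag, needed]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(modeSatisfied("allDCs", ["A", "B", "C"]));  // ny and sf only
console.log(modeSatisfied("someDCs", ["A", "C"]));      // ny and sf
console.log(modeSatisfied("allDCs", ["A", "C", "E"]));  // ny, sf and cloud
```

Under this reading, three acknowledgements from two data centers do not satisfy `allDCs`, while one acknowledgement from each of the three does.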
Use Cases - Multi
       Data Center
   •   write to three data centers

       •   allDCs : {"dc" : 3}

       •   > db.runCommand({getLastError : 1, w : "allDCs"})

   •   write to two data centers and three availability zones

       •   allDCsPlus : {"dc" : 2, "az": 3}

       •   > db.runCommand({getLastError : 1, w : "allDCsPlus"})

US-EAST-1                     US-WEST-1                     LONDON-1
tag : {dc: "JFK", az: "r1"}   tag : {dc: "SFO", az: "r3"}   tag : {dc: "LHR", az: "r5"}

US-EAST-2                     US-WEST-2
tag : {dc: "JFK", az: "r2"}   tag : {dc: "SFO", az: "r4"}
Use Cases - Data Protection
    & High Availability
•    A and B will take priority during a failover
•    C or D will become primary if A and B become unavailable
•    E cannot be primary

•    D and E cannot be read from with a slaveOk()
•    D can be used for backups, feeding a Solr index, etc.

•    E provides a safeguard against operational or application error

Member   priority   hidden   slaveDelay
A        2
B        2
C        1
D        1          true
E        0          true     3600
Optimizing app
 performance
RAM




Disk
Goal

Minimize memory
    turnover
What is your data
 access pattern?
10 days of data
       RAM




Disk
http://spf13.com
                         http://github.com/spf13
                         @spf13




   Questions?
download at mongodb.org
PS: We’re hiring!! Contact us at
      jobs@10gen.com
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 

MongoDB

  • 2. My name is Steve Francia @spf13
  • 3. • 15+ years building the internet • BYU Alumnus • Father, husband, skateboarder • Chief Solutions Architect @ 10gen
  • 4. Introduction to MongoDB
  • 6. Agility • Easily model complex data • Database speaks your languages (Java, .NET, PHP, etc.) • Schemaless data model enables faster development cycle
  • 8. Cost Cost effectively manage abundant data (clickstreams, logs, etc.)
  • 9. • Company behind MongoDB • (A)GPL license, own copyrights, engineering team • support, consulting, commercial license revenue • Management • Google/DoubleClick, Oracle, Apple, NetApp • Funding: Sequoia, Union Square, Flybridge • Offices in NYC and Redwood Shores, CA • 50+ employees
  • 10. MongoDB Goals • OpenSource • Designed for today • Today’s hardware / environments • Today’s challenges • Easy development • Reliable • Scalable
  • 11. A bit of history
  • 13.
  • 14.
  • 15. 1979
  • 16. 1979 1982‐1996
  • 17. 1979 1982‐1996 1995
  • 18. Computers in 1995 • Pentium 100 MHz • 10BASE-T • 16 MB RAM • 200 MB HD
  • 19. Cell Phones in 2011 • Dual core 1.5 GHz • WiFi 802.11n (300+ Mbps) • 1 GB RAM • 64 GB solid state
  • 20. How about a DB designed for today?
  • 21. It started with DoubleClick
  • 22. Signs something needed • doubleclick - 400,000 ads/second • people writing their own stores • caching is de rigueur • complex ORM frameworks • computer architecture trends • cloud computing
  • 23. Requirements • need a good degree of functionality to handle a large set of use cases • sometimes need strong consistency / atomicity • secondary indexes • ad hoc queries
  • 24. Trim unneeded features • leave out a few things so we can scale • no choice but to leave out relational • distributed transactions are hard to scale
  • 25. Needed a scalable data model • some options: • key/value • columnar / tabular • document oriented (JSON inspired) • opportunity to innovate -> agility
  • 26. MongoDB philosophy • No longer one-size-fits-all, but not 12 tools either. • Non-relational (no joins) makes scaling horizontally practical • Document data models are good • Keep functionality when we can (key/value stores are great, but we need more) • Database technology should run anywhere: on your own servers or VMs, and also as a pay-for-what-you-use cloud service. • Ideally open source...
  • 27. MongoDB • JSON Documents • Querying/Indexing/Updating similar to relational databases • Traditional Consistency • Auto-Sharding
  • 28. Under the hood • Written in C++ • Available on most platforms • Data serialized to BSON • Extensive use of memory-mapped files
  • 30. MongoDB is: • Document Oriented • High Performance • Horizontally Scalable • Application: { author: "steve", date: new Date(), text: "About MongoDB...", tags: ["tech", "database"] }
  • 31. This has led some to say: “MongoDB has the best features of key/value stores, document databases and relational databases in one.” (John Nunemaker)
  • 33. Photo Meta- Problem: • Business needed more flexibility than Oracle could deliver Solution: • Used MongoDB instead of Oracle Results: • Developed application in one sprint cycle • 500% cost reduction compared to Oracle • 900% performance improvement compared to Oracle
  • 34. Customer Analytics Problem: • Deal with massive data volume across all customer sites Solution: • Used MongoDB to replace Google Analytics / Omniture options Results: • Less than one week to build prototype and prove business case • Rapid deployment of new features
  • 35. Online Problem: • MySQL could not scale to handle their 5B+ documents Solution: • Switched from MySQL to MongoDB Results: • Massive simplification of code base • Eliminated need for external caching system • 20x performance improvement over MySQL
  • 36. E-commerce Problem: • Multi-vertical E-commerce impossible to model (efficiently) in RDBMS Solution: • Switched from MySQL to MongoDB Results: • Massive simplification of code base • Rapidly build, halving time to market (and cost) • Eliminated need for external caching system • 50x+ improvement over MySQL
  • 37. Tons more • Pretty much, if you can use an RDBMS or a key/value store, MongoDB is a great fit
  • 40. Relational made normalized data look like this
  • 41. Document databases make normalized data look like this
  • 42. Terminology (RDBMS ➜ Mongo) • Table, View ➜ Collection • Row ➜ JSON Document • Index ➜ Index • Join ➜ Embedded Document • Partition ➜ Shard • Partition Key ➜ Shard Key
  • 44. Tables to Documents { title: 'MongoDB', contributors: [ { name: 'Eliot Horowitz', email: 'eh@10gen.com' }, { name: 'Dwight Merriman', email: 'dm@10gen.com' } ], model: { relational: false, awesome: true } }
  • 46. Documents Blog Post Document > p = {author: "roger", date: new Date(), text: "about mongoDB...", tags: ["tech", "databases"]} > db.posts.save(p)
  • 47. Querying > db.posts.find() > { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Jul 24 2010 19:47:11", text : "About MongoDB...", tags : [ "tech", "databases" ] } Note: _id is unique, but can be anything you’d like
  • 48. Secondary Indexes Create index on any Field in Document
  • 49. Secondary Indexes Create index on any Field in Document // 1 means ascending, -1 means descending > db.posts.ensureIndex({author: 1}) > db.posts.find({author: 'roger'}) > { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", ... }
  • 50. Conditional Query Operators $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type, $lt, $lte, $gt, $gte
  • 51. Conditional Query Operators $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type, $lt, $lte, $gt, $gte // find posts with any tags > db.posts.find( {tags: {$exists: true }} ) // find posts matching a regular expression > db.posts.find( {author: /^rog*/i } ) // count posts by author > db.posts.find( {author: 'roger'} ).count()
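The three queries on this slide can be mimicked without a server. The following is only a plain-JavaScript sketch of what each filter selects; the sample documents are hypothetical and `Array.prototype.filter` stands in for `find()`, so it says nothing about how MongoDB evaluates or indexes the queries.

```javascript
// In-memory stand-in for the posts collection (hypothetical sample data).
const posts = [
  { author: "roger", text: "About MongoDB...", tags: ["tech", "databases"] },
  { author: "Roger", text: "Untagged note" },
  { author: "fred", text: "Hello", tags: [] },
];

// {tags: {$exists: true}} selects documents that have a tags field,
// including the one whose tags array is empty.
const withTags = posts.filter((p) => "tags" in p);

// A case-insensitive prefix match on author, similar in spirit to the regex query.
const rogLike = posts.filter((p) => /^rog/i.test(p.author));

// find({author: 'roger'}).count() is an exact, case-sensitive match.
const rogerCount = posts.filter((p) => p.author === "roger").length;

console.log(withTags.length, rogLike.length, rogerCount); // 2 2 1
```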
  • 52. Update Operations $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
  • 53. Update Operations $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit > comment = { author: "fred", date: new Date(), text: "Best Movie Ever"} > db.posts.update( { _id: "..." }, { $push: {comments: comment} } );
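To make the effect of `$push` concrete, here is a small plain-JavaScript sketch of what the operator does to a single document. `applyPush` is a hypothetical helper written for illustration, not a MongoDB or driver API, and it covers only the `$push` case.

```javascript
// Hypothetical helper (not a MongoDB API): applies {$push: {field: value}} to a plain object.
function applyPush(doc, update) {
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (doc[field] === undefined) {
      doc[field] = []; // $push creates the array when the field is absent
    } else if (!Array.isArray(doc[field])) {
      throw new Error(`cannot $push to non-array field "${field}"`);
    }
    doc[field].push(value);
  }
  return doc;
}

const post = { author: "roger", text: "About MongoDB...", tags: ["tech", "databases"] };
const comment = { author: "fred", date: new Date(), text: "Best Movie Ever" };

applyPush(post, { $push: { comments: comment } }); // creates post.comments
applyPush(post, { $push: { tags: "nosql" } });     // appends to the existing array

console.log(post.comments.length, post.tags.join(",")); // 1 tech,databases,nosql
```

The update modifies the document in place on the server, so the application does not need to read the whole post, append the comment and write it back.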
  • 54. Nested Documents { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Apr 24 2011 19:47:11", text : "About MongoDB...", tags : [ "tech", "databases" ], comments : [ { author : "Fred", date : "Sat Apr 25 2010 20:51:03 GMT-0700", text : "Best Post Ever!" } ] }
  • 55. Secondary Indexes // Index nested documents > db.posts.ensureIndex( { "comments.author": 1 } ) > db.posts.find( { 'comments.author': 'Fred' } ) // Index on tags (multi-key index) > db.posts.ensureIndex( { tags: 1 } ) > db.posts.find( { tags: 'tech' } ) // geospatial index > db.posts.ensureIndex( { "author.location": "2d" } ) > db.posts.find( { "author.location": { $near : [22,42] } } )
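The idea behind a multi-key index is that a document with an array field gets one index entry per array element. A rough plain-JavaScript sketch follows; `buildMultikeyIndex` is a hypothetical helper and the `Map` is an assumption chosen for brevity, not the structure MongoDB actually uses.

```javascript
// Hypothetical helper: maps each value of `field` to the _ids of the documents containing it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // documents without the field are skipped in this sketch
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id); // one entry per array element
    }
  }
  return index;
}

const taggedPosts = [
  { _id: 1, tags: ["tech", "databases"] },
  { _id: 2, tags: ["tech"] },
  { _id: 3, tags: ["music"] },
];

const tagIndex = buildMultikeyIndex(taggedPosts, "tags");
console.log(tagIndex.get("tech").join(",")); // 1,2
```

A lookup such as `find({tags: 'tech'})` then becomes a single key lookup instead of a scan over every document's array.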
  • 56. Rich Documents { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), line_items : [ { sku: 'tt-123', name: 'Coltrane: Impressions' }, { sku: 'tt-457', name: 'Davis: Kind of Blue' } ], address : { name: 'Banker', street: '111 Main', zip: 10010 }, payment: { cc: 4567, exp: new Date(2011, 7, 7) }, subtotal: 2355 }
  • 58. MongoDB Replication • MongoDB replication is like MySQL replication (kinda) • Asynchronous master/slave • Variations • Master / slave • Replica Sets
  • 59. Replica Set features • A cluster of N servers • Any (one) node can be primary • Consensus election of primary • Automatic failover • Automatic recovery • All writes to primary • Reads can be to primary (default) or a secondary
  • 60. How MongoDB Replication works [diagram: Member 1, Member 2, Member 3] • Set is made up of 2 or more nodes
  • 61. How MongoDB Replication works [diagram: Member 1, Member 2 (PRIMARY), Member 3] • Election establishes the PRIMARY • Data replication from PRIMARY to SECONDARY
  • 62. How MongoDB Replication works [diagram: Member 1 and Member 3 negotiate new master, Member 2 (DOWN)] • PRIMARY may fail • Automatic election of new PRIMARY if majority exists
  • 63. How MongoDB Replication works [diagram: Member 1 (PRIMARY), Member 2 (DOWN), Member 3] • New PRIMARY elected • Replication Set re-established
  • 64. How MongoDB Replication works [diagram: Member 1 (PRIMARY), Member 2 (RECOVERING), Member 3] • Automatic recovery
  • 65. How MongoDB Replication works [diagram: Member 1 (PRIMARY), Member 2, Member 3] • Replication Set re-established
  • 66. Creating a Replica Set > cfg = { _id : "acme_a", members : [ { _id : 0, host : "sf1.acme.com" }, { _id : 1, host : "sf2.acme.com" }, { _id : 2, host : "sf3.acme.com" } ] } > use admin > db.runCommand( { replSetInitiate : cfg } )
  • 67. Replica Set Options • {arbiterOnly: true} • Can vote in an election • Does not hold any data • {hidden: true} • Not reported in isMaster() • Will not be sent slaveOk() reads • {priority: n} • {tags: }
  • 68. Using Replicas for Reads • slaveOk() • - driver will send read requests to Secondaries • - driver will always send writes to Primary • Java examples • - DB.slaveOk() • - Collection.slaveOk() • find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
  • 69. Safe Writes • db.runCommand({getLastError: 1, w : 1}) • - ensure write is synchronous • - command returns after primary has written to memory • w=n or w='majority' • n is the number of nodes data must be replicated to • driver will always send writes to Primary • w='myTag' [MongoDB 2.0] • Each member is "tagged" e.g. "US_EAST", "EMEA", "US_WEST" • Ensure that the write is executed in each tagged "region"
  • 70. Safe Writes • fsync:true • Ensures changed disk blocks are flushed to disk • j:true • Ensures changes are flushed to the journal
  • 71. When are elections triggered? • When a given member sees that the Primary is not reachable • The member is not an Arbiter • Has a priority greater than other eligible members
  • 72. Typical Deployments
      Use? | Set size | Data Protection | High Availability | Notes
      X | One | No | No | Must use --journal to protect against crashes
        | Two | Yes | No | On loss of one member, surviving member is read only
        | Three | Yes | Yes - 1 failure | On loss of one member, surviving two members can elect a new primary
      X | Four | Yes | Yes - 1 failure* | * On loss of two members, surviving two members are read only
        | Five | Yes | Yes - 2 failures | On loss of two members, surviving three members can elect a new primary
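The High Availability column follows from the majority rule: a primary can only be elected while more than half of the configured members are reachable. A minimal sketch of that arithmetic, assuming every member has one vote (`majority` and `faultTolerance` are hypothetical helper names):

```javascript
// Votes needed to elect a primary: a strict majority of all configured members.
function majority(setSize) {
  return Math.floor(setSize / 2) + 1;
}

// Number of members that can be lost while the survivors can still elect a primary.
function faultTolerance(setSize) {
  return setSize - majority(setSize);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`);
}
```

This is why a four-member set tolerates no more failures than a three-member set: the majority grows from 2 to 3, so only one member can still be lost.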
  • 73. Replication features • Reads from Primary are always consistent • Reads from Secondaries are eventually consistent • Automatic failover if a Primary fails • Automatic recovery when a node joins the set • Control of where writes occur
  • 75. What is Sharding • Ad-hoc partitioning • Consistent hashing (Amazon Dynamo) • Range based partitioning (Google BigTable, Yahoo! PNUTS, MongoDB)
  • 76. MongoDB Sharding • Automatic partitioning and management • Range based • Convert to sharded system with no downtime • Fully consistent
  • 78. How MongoDB Sharding works > db.runCommand( { addshard : "shard1" } ); > db.runCommand( { shardCollection : "mydb.blogs", key : { age : 1 } } ) Chunks: [-∞, +∞] • Range keys from -∞ to +∞ • Ranges are stored as “chunks”
  • 79. How MongoDB Sharding works > db.posts.save( {age:40} ) Chunks: [-∞, +∞] ➜ [-∞, 40] [41, +∞] • Data is inserted • Ranges are split into more “chunks”
  • 80. How MongoDB Sharding works > db.posts.save( {age:40} ) > db.posts.save( {age:50} ) Chunks: [-∞, +∞] ➜ [-∞, 40] [41, +∞] ➜ [41, 50] [51, +∞] • More data is inserted • Ranges are split into more “chunks”
  • 81. How MongoDB Sharding works > db.posts.save( {age:40} ) > db.posts.save( {age:50} ) > db.posts.save( {age:60} ) Chunks: [-∞, +∞] ➜ [-∞, 40] [41, +∞] ➜ [41, 50] [51, +∞] ➜ [51, 60] [61, +∞]
  • 82. How MongoDB Sharding works > db.posts.save( {age:40} ) > db.posts.save( {age:50} ) > db.posts.save( {age:60} ) Chunks: [-∞, +∞] ➜ [-∞, 40] [41, +∞] ➜ [41, 50] [51, +∞] ➜ [51, 60] [61, +∞]
  • 83. How MongoDB Sharding works shard1: [-∞, 40] [41, 50] [51, 60] [61, +∞]
  • 84. How MongoDB Sharding works > db.runCommand( { addshard : "shard2" } ); Chunks: [-∞, 40] [41, 50] [51, 60] [61, +∞]
  • 85. How MongoDB Sharding works > db.runCommand( { addshard : "shard2" } ); Shards: shard1 • Chunks: [-∞, 40] [41, 50] [51, 60] [61, +∞]
  • 86. How MongoDB Sharding works > db.runCommand( { addshard : "shard2" } ); Shards: shard1, shard2 • Chunks: [-∞, 40] [41, 50] [51, 60] [61, +∞]
  • 87. How MongoDB Sharding works > db.runCommand( { addshard : "shard2" } ); > db.runCommand( { addshard : "shard3" } ); Shards: shard1, shard2, shard3 • Chunks: [-∞, 40] [41, 50] [51, 60] [61, +∞]
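The chunk splitting walked through in these slides can be sketched in a few lines of plain JavaScript. This is a simplified model under stated assumptions: chunks are half-open ranges `[min, max)` over the shard key, the split points 41, 51 and 61 are taken from the slide diagrams, and `findChunk` and `splitChunk` are hypothetical helpers. Real splits are triggered by chunk size rather than by individual inserts.

```javascript
// Start with a single chunk covering the whole key space, as on the first sharding slide.
const initialChunks = [{ min: -Infinity, max: Infinity }];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

// Split whichever chunk contains splitPoint into two adjacent chunks.
function splitChunk(chunks, splitPoint) {
  return chunks.flatMap((c) =>
    splitPoint > c.min && splitPoint < c.max
      ? [{ min: c.min, max: splitPoint }, { min: splitPoint, max: c.max }]
      : [c]
  );
}

// Apply the three splits shown in the diagrams.
const splitChunks = [41, 51, 61].reduce(splitChunk, initialChunks);

console.log(splitChunks.length);             // 4
console.log(findChunk(splitChunks, 50).min); // 41
```

Because the ranges are contiguous and non-overlapping, every key value belongs to exactly one chunk, which is what lets whole chunks be moved to shard2 and shard3 once those shards are added.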
  • 89. Sharding Features • Shard data with no downtime • Automatic balancing as data is written • Commands routed (switched) to correct node • Inserts - must have the Shard Key • Updates - must have the Shard Key • Queries • With Shard Key - routed to nodes • Without Shard Key - scatter gather • Indexed Queries • With Shard Key - routed in order • Without Shard Key - distributed sort merge
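The difference between a routed query and a scatter-gather query comes down to whether the router can find the shard key in the query. A minimal sketch, assuming a shard key of `age`, equality matches only, and a hypothetical assignment of chunks to shards (`chunkMap` and `targetShards` are illustrative names, not part of mongos):

```javascript
const shardKey = "age";

// Hypothetical chunk-to-shard assignment after balancing.
const chunkMap = [
  { min: -Infinity, max: 41, shard: "shard1" },
  { min: 41, max: 61, shard: "shard2" },
  { min: 61, max: Infinity, shard: "shard3" },
];

// Decide which shards must receive a query.
function targetShards(query) {
  const allShards = [...new Set(chunkMap.map((c) => c.shard))];
  const value = query[shardKey];
  if (typeof value !== "number") return allShards; // no shard key: scatter-gather
  const owner = chunkMap.find((c) => value >= c.min && value < c.max);
  return owner ? [owner.shard] : allShards;         // shard key present: routed
}

console.log(targetShards({ age: 50 }).join(","));         // shard2
console.log(targetShards({ author: "roger" }).join(",")); // shard1,shard2,shard3
```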
  • 92. Cong Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
  • 93. Cong Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
  • 94. Shards • Can be master, master/slave or replica sets • Replica sets give sharding + full auto-failover • Regular mongod processes
  • 95. Shards • Can be master, master/slave or replica sets • Replica sets give sharding + full auto-failover • Regular mongod processes
  • 96. Mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic
  • 97. Mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic
  • 99. Priorities • Prior to 2.0.0 • {priority:0} // Never can be elected Primary • {priority:1} // Can be elected Primary • New in 2.0.0 • Priority, floating point number between 0 and 1000 • During an election • Most up to date • Highest priority • Allows weighting of members during failover
  • 100. Priorities - example [diagram: A p:2, B p:2, C p:1, D p:1, E p:0] • Assuming all members are up to date • Members A or B will be chosen first • Highest priority • Members C or D will be chosen next if • A and B are unavailable • A and B are not up to date • Member E is never chosen • priority:0 means it cannot be elected
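The selection rules in this example can be written down as a small function. This is a deliberately simplified sketch, not the actual election protocol: it assumes one vote per member, and the `up` and `optime` fields and the `electPrimary` helper are hypothetical names used for illustration.

```javascript
// Pick a primary: requires a reachable majority, then prefers the most
// up-to-date members, then the highest priority. priority 0 is never eligible.
function electPrimary(members) {
  const up = members.filter((m) => m.up);
  if (up.length * 2 <= members.length) return null; // no majority reachable
  const candidates = up.filter((m) => m.priority > 0);
  if (candidates.length === 0) return null;
  const newest = Math.max(...candidates.map((m) => m.optime));
  const best = candidates
    .filter((m) => m.optime === newest)
    .sort((a, b) => b.priority - a.priority)[0];
  return best.name;
}

// The five members from the slide, all reachable and equally up to date.
const replicaSet = [
  { name: "A", priority: 2, up: true, optime: 100 },
  { name: "B", priority: 2, up: true, optime: 100 },
  { name: "C", priority: 1, up: true, optime: 100 },
  { name: "D", priority: 1, up: true, optime: 100 },
  { name: "E", priority: 0, up: true, optime: 100 },
];

// Helper for the demo: mark the named members as unreachable.
const withDown = (names) =>
  replicaSet.map((m) => (names.includes(m.name) ? { ...m, up: false } : m));

console.log(electPrimary(replicaSet));                 // A
console.log(electPrimary(withDown(["A", "B"])));       // C
console.log(electPrimary(withDown(["A", "B", "C"])));  // null
```

The last call returns `null` because only two of five members remain, which is not a majority, so the survivors stay read only.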
  • 101. Tagging • New in 2.0.0 • Control over where data is written to • Each member can have one or more tags e.g. • tags: {dc: "ny"} • tags: {dc: "ny", ip: "192.168", rack: "row3rk7"} • Replica set defines rules for where data resides • Rules can change without changing application code
  • 102. Tagging - example { _id : "mySet", members : [ {_id : 0, host : "A", tags : {"dc": "ny"}}, {_id : 1, host : "B", tags : {"dc": "ny"}}, {_id : 2, host : "C", tags : {"dc": "sf"}}, {_id : 3, host : "D", tags : {"dc": "sf"}}, {_id : 4, host : "E", tags : {"dc": "cloud"}} ], settings : { getLastErrorModes : { allDCs : {"dc" : 3}, someDCs : {"dc" : 2} } } } > db.blogs.insert({...}) > db.runCommand({getLastError : 1, w : "allDCs"})
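A mode such as `allDCs : {"dc" : 3}` is satisfied once the write has reached members carrying three distinct values of the `dc` tag. The check can be sketched in plain JavaScript; `satisfiesMode` is a hypothetical helper, and the list of acknowledging hosts is supplied by hand here rather than reported by a server.

```javascript
// Members and modes copied from the slide's configuration.
const setMembers = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// True when the acknowledging hosts cover enough distinct values of every tag in the mode.
function satisfiesMode(mode, ackedHosts) {
  const acked = setMembers.filter((m) => ackedHosts.includes(m.host));
  return Object.entries(mode).every(([tag, required]) => {
    const distinct = new Set(
      acked.map((m) => m.tags[tag]).filter((v) => v !== undefined)
    );
    return distinct.size >= required;
  });
}

console.log(satisfiesMode(modes.someDCs, ["A", "B"]));     // false
console.log(satisfiesMode(modes.someDCs, ["A", "C"]));     // true
console.log(satisfiesMode(modes.allDCs, ["A", "C", "E"])); // true
```

Two acknowledgements from A and B do not satisfy `someDCs` because both members sit in the same data center; the count is over distinct tag values, not over members.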
  • 103. Use Cases - Multi Data Center • write to three data centers • allDCs : {"dc" : 3} • > db.runCommand({getLastError : 1, w : "allDCs"}) • write to two data centers and three availability zones • allDCsPlus : {"dc" : 2, "az": 3} • > db.runCommand({getLastError : 1, w : "allDCsPlus"}) [diagram: US-EAST-1 tag: {dc: "JFK", az: "r1"} • US-EAST-2 tag: {dc: "JFK", az: "r2"} • US-WEST-1 tag: {dc: "SFO", az: "r3"} • US-WEST-2 tag: {dc: "SFO", az: "r4"} • LONDON-1 tag: {dc: "LHR", az: "r5"}]
  • 104. Use Cases - Data Protection & High Availability • A and B will take priority during a failover • C or D will become primary if A and B become unavailable • E cannot be primary • D and E cannot be read from with a slaveOk() • D can be used for backups, feeding a Solr index, etc. • E provides a safeguard against operational or application error [diagram: A priority: 2 • B priority: 2 • C priority: 1 • D priority: 1, hidden: true • E priority: 0, hidden: true, slaveDelay: 3600]
  • 106.
  • 111. Goal Minimize memory turnover
  • 112. What is your data access pattern?
  • 113. 10 days of data RAM Disk
  • 114. http://spf13.com http://github.com/spf13 @spf13 Questions? download at mongodb.org PS: We’re hiring!! Contact us at jobs@10gen.com

Editor's Notes

  14-16. Remember, in 1995 there were around 10,000 websites. Mosaic, Lynx, Mozilla (pre-Netscape) and IE 2.0 were the only web browsers. Apache (Dec ’95), Java (’96), PHP (June ’95), and .NET didn’t exist yet. Linux just barely did (1.0 in ’94).
  25. By reducing the transactional semantics the db provides, one can still solve an interesting set of problems where performance is very important, and horizontal scaling then becomes easier.
  70. Sharding isn’t new.
  109. Write: add new paragraph. Read: read through book. Don't go into indexes yet.
  115. Webapp: recent data.