MongoDB
This presentation was given at the LDS Tech SORT Conference 2011 in Salt Lake City. The slides are quite comprehensive, covering many topics on MongoDB. Rather than a traditional presentation, this was presented as more of a Q & A session. Topics covered include an introduction to MongoDB, use cases, schema design, high availability (replication) and horizontal scaling (sharding).

  • Remember that in 1995 there were around 10,000 websites. Mosaic, Lynx, Mozilla (pre-Netscape) and IE 2.0 were the only web browsers. Apache (Dec ’95), Java (’96), PHP (June ’95), and .NET didn’t exist yet. Linux just barely did (1.0 in ’94).
  • By reducing the transactional semantics the db provides, one can still solve an interesting set of problems where performance is very important, and horizontal scaling then becomes easier.
  • Sharding isn’t new.
  • Write: add a new paragraph. Read: read through the book. Don't go into indexes yet.
  • Web app: recent data.
  • MongoDB

    1. MongoDB
    2. My name is Steve Francia, @spf13
    3. • 15+ years building the internet
       • BYU alumnus
       • Father, husband, skateboarder
       • Chief Solutions Architect @ 10gen
    4. Introduction to MongoDB
    5. Why MongoDB?
    6. Agility
       • Easily model complex data
       • Database speaks your languages (Java, .NET, PHP, etc.)
       • Schemaless data model enables a faster development cycle
    7. Scale
       • Easy and automatic scale-out
    8. Cost
       • Cost-effectively manage abundant data (clickstreams, logs, etc.)
    9. • Company behind MongoDB
         • (A)GPL license, own copyrights, engineering team
         • Support, consulting, commercial license revenue
       • Management
         • Google/DoubleClick, Oracle, Apple, NetApp
       • Funding: Sequoia, Union Square, Flybridge
       • Offices in NYC and Redwood Shores, CA
       • 50+ employees
    10. MongoDB Goals
        • Open source
        • Designed for today
          • Today’s hardware / environments
          • Today’s challenges
        • Easy development
        • Reliable
        • Scalable
    11. A bit of history
    12. 1974: The relational database is created
    13. 1979
    14. 1979, 1982‐1996
    15. 1979, 1982‐1996, 1995
    16. Computers in 1995
        • Pentium 100 MHz
        • 10BASE-T
        • 16 MB RAM
        • 200 MB HD
    17. Cell Phones in 2011
        • Dual-core 1.5 GHz
        • WiFi 802.11n (300+ Mbps)
        • 1 GB RAM
        • 64 GB solid state
    18. How about a DB designed for today?
    19. It started with DoubleClick
    20. Signs something needed
        • DoubleClick: 400,000 ads/second
        • People writing their own stores
        • Caching is de rigueur
        • Complex ORM frameworks
        • Computer architecture trends
        • Cloud computing
    21. Requirements
        • Need a good degree of functionality to handle a large set of use cases
          • Sometimes need strong consistency / atomicity
          • Secondary indexes
          • Ad hoc queries
    22. Trim unneeded features
        • Leave out a few things so we can scale
          • No choice but to leave out relational
          • Distributed transactions are hard to scale
    23. Needed a scalable data model
        • Some options:
          • Key/value
          • Columnar / tabular
          • Document oriented (JSON inspired)
        • Opportunity to innovate -> agility
    24. MongoDB philosophy
        • No longer one-size-fits-all, but not 12 tools either.
        • Non-relational (no joins) makes scaling horizontally practical
        • Document data models are good
        • Keep functionality when we can (key/value stores are great, but we need more)
        • Database technology should run anywhere: on your own servers or VMs, and also as a pay-for-what-you-use cloud service.
        • Ideally open source...
    25. MongoDB
        • JSON documents
        • Querying/indexing/updating similar to relational databases
        • Traditional consistency
        • Auto-sharding
    26. Under the hood
        • Written in C++
        • Available on most platforms
        • Data serialized to BSON
        • Extensive use of memory-mapped files
    27. Database Landscape
    28. MongoDB is:
        [Diagram labels: Application, Document Oriented, High Performance, Horizontally Scalable]
            { author: "steve", date: new Date(),
              text: "About MongoDB...", tags: ["tech", "database"] }
    29. This has led some to say:
        “MongoDB has the best features of key/value stores, document databases and relational databases in one.”
        John Nunemaker
    30. Use Cases
    31. Photo Meta-
        Problem:
        • Business needed more flexibility than Oracle could deliver
        Solution:
        • Used MongoDB instead of Oracle
        Results:
        • Developed application in one sprint cycle
        • 500% cost reduction compared to Oracle
        • 900% performance improvement compared to Oracle
    32. Customer Analytics
        Problem:
        • Deal with massive data volume across all customer sites
        Solution:
        • Used MongoDB to replace Google Analytics / Omniture options
        Results:
        • Less than one week to build a prototype and prove the business case
        • Rapid deployment of new features
    33. Online
        Problem:
        • MySQL could not scale to handle their 5B+ documents
        Solution:
        • Switched from MySQL to MongoDB
        Results:
        • Massive simplification of code base
        • Eliminated need for external caching system
        • 20x performance improvement over MySQL
    34. E-commerce
        Problem:
        • Multi-vertical e-commerce impossible to model (efficiently) in an RDBMS
        Solution:
        • Switched from MySQL to MongoDB
        Results:
        • Massive simplification of code base
        • Rapid builds, halving time to market (and cost)
        • Eliminated need for external caching system
        • 50x+ improvement over MySQL
    35. Tons more
        Pretty much anywhere you can use an RDBMS or a key/value store, MongoDB is a great fit
    36. In Good Company
    37. Schema Design
    38. Relational made normalized data look like this
    39. Document databases make normalized data look like this
    40. Terminology
        RDBMS             Mongo
        Table, View    ➜  Collection
        Row            ➜  JSON Document
        Index          ➜  Index
        Join           ➜  Embedded Document
        Partition      ➜  Shard
        Partition Key  ➜  Shard Key
    41. Tables to Documents
    42. Tables to Documents
            { title: "MongoDB",
              contributors: [
                { name: "Eliot Horowitz", email: "eh@10gen.com" },
                { name: "Dwight Merriman", email: "dm@10gen.com" } ],
              model: { relational: false, awesome: true } }
    43. DEMO TIME
    44. Documents: Blog Post Document
            > p = { author: "roger",
                    date: new Date(),
                    text: "About MongoDB...",
                    tags: ["tech", "databases"] }
            > db.posts.save(p)
    45. Querying
            > db.posts.find()
            { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
              author : "roger",
              date : "Sat Jul 24 2010 19:47:11",
              text : "About MongoDB...",
              tags : [ "tech", "databases" ] }
        Note: _id is unique, but can be anything you’d like
    46. Secondary Indexes
        Create an index on any field in a document
    47. Secondary Indexes
            // 1 means ascending, -1 means descending
            > db.posts.ensureIndex({author: 1})
            > db.posts.find({author: "roger"})
            { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
              author : "roger",
              ... }
    48. Conditional Query Operators
        $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type, $lt, $lte, $gt, $gte
    49. Conditional Query Operators
            // find posts with any tags
            > db.posts.find( {tags: {$exists: true}} )
            // find posts matching a regular expression
            > db.posts.find( {author: /^rog/i} )
            // count posts by author
            > db.posts.find( {author: "roger"} ).count()
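To make the operator semantics on the slide above concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB evaluates queries; `matches` is a hypothetical helper that covers only `$exists`, `$gt`, `$in`, `$ne`, regular expressions and plain equality, and the `votes` field is invented for the example.

```javascript
// Illustrative only: a tiny matcher for a handful of query operators.
// `matches` is a hypothetical helper, not part of any MongoDB driver.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    const value = doc[field];
    if (cond instanceof RegExp) {
      return typeof value === "string" && cond.test(value);
    }
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      return Object.entries(cond).every(([op, arg]) => {
        switch (op) {
          case "$exists": return (value !== undefined) === arg;
          case "$gt": return value > arg;
          case "$ne": return value !== arg;
          case "$in": return arg.includes(value);
          default: throw new Error("unsupported operator: " + op);
        }
      });
    }
    // Plain equality; an array field matches if any element equals the value.
    return Array.isArray(value) ? value.includes(cond) : value === cond;
  });
}

const posts = [
  { author: "roger", tags: ["tech", "databases"], votes: 5 },
  { author: "fred", votes: 1 },
];

const withTags = posts.filter((p) => matches(p, { tags: { $exists: true } }));
const byRegex = posts.filter((p) => matches(p, { author: /^rog/i }));
const popular = posts.filter((p) => matches(p, { votes: { $gt: 2 } }));
```

Each filter mirrors one of the shell queries: the first keeps documents that have a `tags` field, the second keeps authors starting with "rog", and the third uses a comparison operator.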
    50. Update Operations
        $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
    51. Update Operations
            > comment = { author: "fred",
                          date: new Date(),
                          text: "Best Movie Ever" }
            > db.posts.update( { _id: "..." },
                               { $push: {comments: comment} } );
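The update operators above modify a document in place on the server. As a rough sketch of what a few of them mean, the hypothetical `applyUpdate` below applies `$set`, `$unset`, `$inc`, `$push` and `$pull` to a plain JavaScript object. It only handles top-level fields and is an illustration under those assumptions, not the server's implementation; the `views` field is invented.

```javascript
// Illustrative only: apply a few update operators to a copy of a document.
// `applyUpdate` is a hypothetical helper and handles top-level fields only.
function applyUpdate(doc, update) {
  const out = { ...doc };
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      if (op === "$set") out[field] = arg;
      else if (op === "$unset") delete out[field];
      else if (op === "$inc") out[field] = (out[field] ?? 0) + arg;
      else if (op === "$push") out[field] = [...(out[field] ?? []), arg];
      else if (op === "$pull") out[field] = (out[field] ?? []).filter((v) => v !== arg);
      else throw new Error("unsupported operator: " + op);
    }
  }
  return out;
}

const post = { author: "roger", tags: ["tech", "databases"] };
const comment = { author: "fred", text: "Best Post Ever!" };

// $push creates the array if it is missing; $inc starts from 0.
const updated = applyUpdate(post, {
  $push: { comments: comment },
  $inc: { views: 1 },
});
```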
    52. Nested Documents
            { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
              author : "roger",
              date : "Sat Apr 24 2011 19:47:11",
              text : "About MongoDB...",
              tags : [ "tech", "databases" ],
              comments : [
                { author : "Fred",
                  date : "Sat Apr 25 2010 20:51:03 GMT-0700",
                  text : "Best Post Ever!" } ] }
    53. Secondary Indexes
            // Index nested documents
            > db.posts.ensureIndex( { "comments.author": 1 } )
            > db.posts.find( { "comments.author": "Fred" } )
            // Index on tags (multi-key index)
            > db.posts.ensureIndex( { tags: 1 } )
            > db.posts.find( { tags: "tech" } )
            // Geospatial index
            > db.posts.ensureIndex( { "author.location": "2d" } )
            > db.posts.find( { "author.location": { $near : [22, 42] } } )
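The idea behind a multi-key index is that an array field contributes one index entry per element, so a lookup on `tags: "tech"` can find every post carrying that tag. The sketch below models this with a `Map` from value to document positions. `buildMultikeyIndex` and `lookup` are hypothetical names, and a real index is a B-tree rather than a hash map.

```javascript
// Illustrative only: one index entry per array element.
// `buildMultikeyIndex` and `lookup` are hypothetical helpers.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) {
      if (value === undefined) continue; // documents without the field are skipped
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(position);
    }
  });
  return index;
}

function lookup(index, value) {
  return index.get(value) ?? [];
}

const posts = [
  { author: "roger", tags: ["tech", "databases"] },
  { author: "fred", tags: ["music"] },
  { author: "anna", tags: ["tech"] },
  { author: "sam" },
];

const tagIndex = buildMultikeyIndex(posts, "tags");
const techPosts = lookup(tagIndex, "tech").map((i) => posts[i].author);
```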
    54. Rich Documents
            { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
              line_items : [ { sku: "tt-123", name: "Coltrane: Impressions" },
                             { sku: "tt-457", name: "Davis: Kind of Blue" } ],
              address : { name: "Banker", street: "111 Main", zip: 10010 },
              payment : { cc: 4567, exp: new Date(2011, 7, 7) },
              subtotal : 2355 }
    55. High Availability
    56. MongoDB Replication
        • MongoDB replication is like MySQL replication (kinda)
        • Asynchronous master/slave
        • Variations
          • Master / slave
          • Replica sets
    57. Replica Set features
        • A cluster of N servers
        • Any (one) node can be primary
        • Consensus election of primary
        • Automatic failover
        • Automatic recovery
        • All writes go to the primary
        • Reads can go to the primary (default) or a secondary
    58. How MongoDB Replication works
        [Diagram: Member 1, Member 2, Member 3]
        • Set is made up of 2 or more nodes
    59. How MongoDB Replication works
        [Diagram: Member 1, Member 2 (PRIMARY), Member 3]
        • Election establishes the PRIMARY
        • Data replication from PRIMARY to SECONDARY
    60. How MongoDB Replication works
        [Diagram: Member 1 and Member 3 negotiate a new master; Member 2 is DOWN]
        • PRIMARY may fail
        • Automatic election of a new PRIMARY if a majority exists
    61. How MongoDB Replication works
        [Diagram: Member 1 (PRIMARY), Member 3, Member 2 (DOWN)]
        • New PRIMARY elected
        • Replica set re-established
    62. How MongoDB Replication works
        [Diagram: Member 1 (PRIMARY), Member 3, Member 2 (RECOVERING)]
        • Automatic recovery
    63. How MongoDB Replication works
        [Diagram: Member 1 (PRIMARY), Member 2, Member 3]
        • Replica set re-established
    64. Creating a Replica Set
            > cfg = {
                _id : "acme_a",
                members : [
                  { _id : 0, host : "sf1.acme.com" },
                  { _id : 1, host : "sf2.acme.com" },
                  { _id : 2, host : "sf3.acme.com" } ] }
            > use admin
            > db.runCommand( { replSetInitiate : cfg } )
    65. Replica Set Options
        • {arbiterOnly: true}
          • Can vote in an election
          • Does not hold any data
        • {hidden: true}
          • Not reported in isMaster()
          • Will not be sent slaveOk() reads
        • {priority: n}
        • {tags: }
    66. Using Replicas for Reads
        • slaveOk()
          • Driver will send read requests to Secondaries
          • Driver will always send writes to Primary
        • Java examples
          • DB.slaveOk()
          • Collection.slaveOk()
          • find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
    67. Safe Writes
        • db.runCommand({getLastError: 1, w : 1})
          • Ensures the write is synchronous
          • Command returns after the primary has written to memory
        • w=n or w=majority
          • n is the number of nodes the data must be replicated to
          • Driver will always send writes to Primary
        • w=myTag [MongoDB 2.0]
          • Each member is "tagged", e.g. "US_EAST", "EMEA", "US_WEST"
          • Ensures that the write is executed in each tagged "region"
    68. Safe Writes
        • fsync: true
          • Ensures changed disk blocks are flushed to disk
        • j: true
          • Ensures changes are flushed to the journal
    69. When are elections triggered?
        • When a given member sees that the Primary is not reachable
        • The member is not an Arbiter
        • The member has a priority greater than other eligible members
    70. Typical Deployments
        Use?  Set size  Data Protection  High Availability  Notes
        X     One       No               No                 Must use --journal to protect against crashes
              Two       Yes              No                 On loss of one member, surviving member is read only
              Three     Yes              Yes - 1 failure    On loss of one member, surviving two members can elect a new primary
        X     Four      Yes              Yes - 1 failure*   * On loss of two members, surviving two members are read only
              Five      Yes              Yes - 2 failures   On loss of two members, surviving three members can elect a new primary
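The "High Availability" column in the table above follows from majority arithmetic: a primary can only be elected while more than half of the set's members are reachable. A short sketch of that calculation, with hypothetical function names, looks like this:

```javascript
// Illustrative arithmetic behind the deployment table; function names are hypothetical.

// Smallest number of members that forms a strict majority of the set.
function majority(setSize) {
  return Math.floor(setSize / 2) + 1;
}

// How many members can fail while the survivors can still elect a primary.
function tolerableFailures(setSize) {
  return setSize - majority(setSize);
}

// Whether the members that are still up can elect a primary.
function canElectPrimary(setSize, membersUp) {
  return membersUp >= majority(setSize);
}

const table = [1, 2, 3, 4, 5].map((n) => ({
  setSize: n,
  failuresTolerated: tolerableFailures(n),
}));
```

Note that four members tolerate no more failures than three, which is consistent with the table marking the four-member set with an X.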
    71. Replication features
        • Reads from Primary are always consistent
        • Reads from Secondaries are eventually consistent
        • Automatic failover if a Primary fails
        • Automatic recovery when a node joins the set
        • Control of where writes occur
    72. Scaling: Sharding MongoDB
    73. What is Sharding
        • Ad-hoc partitioning
        • Consistent hashing
          • Amazon Dynamo
        • Range based partitioning
          • Google BigTable
          • Yahoo! PNUTS
          • MongoDB
    74. MongoDB Sharding
        • Automatic partitioning and management
        • Range based
        • Convert to sharded system with no downtime
        • Fully consistent
    75. How MongoDB Sharding Works
    76. How MongoDB Sharding works
            > db.runCommand( { addshard : "shard1" } );
            > db.runCommand( { shardCollection : "mydb.blogs", key : { age : 1 } } )
        Chunks: [ -∞ .. +∞ ]
        • Range keys from -∞ to +∞
        • Ranges are stored as "chunks"
    77. How MongoDB Sharding works
            > db.posts.save( {age:40} )
        Chunks: [ -∞ .. +∞ ]
                [ -∞ .. 40 ] [ 41 .. +∞ ]
        • Data is inserted
        • Ranges are split into more "chunks"
    78. How MongoDB Sharding works
            > db.posts.save( {age:40} )
            > db.posts.save( {age:50} )
        Chunks: [ -∞ .. +∞ ]
                [ -∞ .. 40 ] [ 41 .. +∞ ]
                             [ 41 .. 50 ] [ 51 .. +∞ ]
        • More data is inserted
        • Ranges are split into more "chunks"
    79. How MongoDB Sharding works
            > db.posts.save( {age:40} )
            > db.posts.save( {age:50} )
            > db.posts.save( {age:60} )
        Chunks: [ -∞ .. +∞ ]
                [ -∞ .. 40 ] [ 41 .. +∞ ]
                             [ 41 .. 50 ] [ 51 .. +∞ ]
                                          [ 51 .. 60 ] [ 61 .. +∞ ]
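The chunk splitting shown in the slides above can be sketched as a small simulation. This is an assumption-laden illustration: the slides split after every insert for clarity, whereas this sketch splits once a chunk holds more than a made-up `MAX_KEYS_PER_CHUNK` keys, and it uses half-open `[min, max)` ranges rather than the inclusive ranges drawn on the slides. `insertKey` is a hypothetical helper.

```javascript
// Illustrative only: range-based chunks that split when they grow too large.
// MAX_KEYS_PER_CHUNK is a made-up threshold chosen for this example.
const MAX_KEYS_PER_CHUNK = 2;

// Chunks cover half-open ranges [min, max) and stay sorted by min.
function insertKey(chunks, key) {
  const i = chunks.findIndex((c) => key >= c.min && key < c.max);
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort((a, b) => a - b);

  if (chunk.keys.length > MAX_KEYS_PER_CHUNK) {
    // Split at the median key so both halves cover contiguous ranges.
    const splitPoint = chunk.keys[Math.floor(chunk.keys.length / 2)];
    const left = { min: chunk.min, max: splitPoint, keys: chunk.keys.filter((k) => k < splitPoint) };
    const right = { min: splitPoint, max: chunk.max, keys: chunk.keys.filter((k) => k >= splitPoint) };
    // Skip the split if it would leave an empty chunk (e.g. all keys equal).
    if (left.keys.length > 0) chunks.splice(i, 1, left, right);
  }
  return chunks;
}

// Start with a single chunk covering the whole key space, as on the slides.
const chunks = [{ min: -Infinity, max: Infinity, keys: [] }];
[40, 50, 60, 45].forEach((age) => insertKey(chunks, age));
```

After these four inserts the key space is covered by two chunks, split at 50, and every key still falls in exactly one range.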
    81. How MongoDB Sharding works
        shard1: [ -∞ .. 40 ] [ 41 .. 50 ] [ 51 .. 60 ] [ 61 .. +∞ ]
    82. How MongoDB Sharding works
            > db.runCommand( { addshard : "shard2" } );
        Chunks: [ -∞ .. 40 ] [ 41 .. 50 ] [ 51 .. 60 ] [ 61 .. +∞ ]
    83. How MongoDB Sharding works
            > db.runCommand( { addshard : "shard2" } );
        shard1: [ -∞ .. 40 ] [ 41 .. 50 ] [ 51 .. 60 ] [ 61 .. +∞ ]
    84. How MongoDB Sharding works
            > db.runCommand( { addshard : "shard2" } );
        [Diagram: the chunks -∞ .. 40, 41 .. 50, 51 .. 60, 61 .. +∞ distributed across shard1 and shard2]
    85. How MongoDB Sharding works
            > db.runCommand( { addshard : "shard2" } );
            > db.runCommand( { addshard : "shard3" } );
        [Diagram: the chunks -∞ .. 40, 41 .. 50, 51 .. 60, 61 .. +∞ distributed across shard1, shard2 and shard3]
    86. How MongoDB Sharding Works
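The slides above show chunks spreading out as shards are added. A minimal sketch of that balancing idea follows. It assumes a simple rule (move one chunk at a time from the fullest shard to the emptiest until their counts differ by at most one), which is a simplification for illustration and not the actual balancer policy. `rebalance` is a hypothetical helper.

```javascript
// Illustrative only: even out chunk counts across shards.
// `rebalance` is a hypothetical helper using a simplified balancing rule.
function rebalance(shards) {
  const names = Object.keys(shards);
  for (;;) {
    names.sort((a, b) => shards[a].length - shards[b].length);
    const least = names[0];
    const most = names[names.length - 1];
    // Stop once the fullest and emptiest shard differ by at most one chunk.
    if (shards[most].length - shards[least].length <= 1) return shards;
    shards[least].push(shards[most].pop());
  }
}

// All four chunks start on shard1, as on slide 81.
const cluster = { shard1: ["-inf..40", "41..50", "51..60", "61..+inf"] };

cluster.shard2 = []; // addshard "shard2"
rebalance(cluster);

cluster.shard3 = []; // addshard "shard3"
rebalance(cluster);

const counts = Object.values(cluster).map((chunkList) => chunkList.length);
```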
    87. Sharding Features
        • Shard data with no downtime
        • Automatic balancing as data is written
        • Commands routed (switched) to correct node
          • Inserts - must have the Shard Key
          • Updates - must have the Shard Key
          • Queries
            • With Shard Key - routed to nodes
            • Without Shard Key - scatter gather
          • Indexed Queries
            • With Shard Key - routed in order
            • Without Shard Key - distributed sort merge
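The routing rule above (targeted when the query includes the shard key, scatter-gather otherwise) can be sketched as follows. The chunk map, the `targetShards` helper and the equality-only query shape are all assumptions made for this illustration; a real router also handles range queries and keeps its chunk map in sync with the config servers.

```javascript
// Illustrative only: decide which shards a query must visit.
// `chunkMap` and `targetShards` are hypothetical; ranges are half-open [min, max).
const chunkMap = [
  { min: -Infinity, max: 41, shard: "shard1" },
  { min: 41, max: 51, shard: "shard2" },
  { min: 51, max: Infinity, shard: "shard3" },
];

function targetShards(query, shardKey, chunks) {
  if (!(shardKey in query)) {
    // No shard key in the query: every shard has to be asked (scatter-gather).
    return [...new Set(chunks.map((c) => c.shard))];
  }
  // Shard key present (equality match only in this sketch): route to one chunk's shard.
  const key = query[shardKey];
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return [chunk.shard];
}

const targeted = targetShards({ age: 45 }, "age", chunkMap);
const scattered = targetShards({ author: "roger" }, "age", chunkMap);
```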
    88. Sharding Architecture
    89. Architecture
    90. Config Servers
        • 3 of them
        • Changes are made with 2-phase commit
        • If any are down, metadata goes read-only
        • System is online as long as 1 of the 3 is up
    92. Shards
        • Can be master, master/slave or replica sets
        • Replica sets give sharding + full auto-failover
        • Regular mongod processes
    94. Mongos
        • Sharding router
        • Acts just like a mongod to clients
        • Can have 1 or as many as you want
        • Can run on the app server, so no extra network traffic
    96. Advanced Replication
    97. Priorities
        • Prior to 2.0.0
          • {priority:0} // Can never be elected Primary
          • {priority:1} // Can be elected Primary
        • New in 2.0.0
          • Priority is a floating point number between 0 and 1000
          • During an election
            • Most up to date
            • Highest priority
          • Allows weighting of members during failover
    98. Priorities - example
        Members: A (p:2), B (p:2), C (p:1), D (p:1), E (p:0)
        • Assuming all members are up to date
        • Members A or B will be chosen first
          • Highest priority
        • Members C or D will be chosen next if
          • A and B are unavailable
          • A and B are not up to date
        • Member E is never chosen
          • priority:0 means it cannot be elected
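The selection rule in the priorities example can be sketched as a small function: ignore members that are down or have priority 0, keep only the most up-to-date candidates, then pick the highest priority. This is a simplification for illustration. The `optime` and `up` fields and the `electPrimary` helper are hypothetical, and the sketch ignores the majority vote a real election requires.

```javascript
// Illustrative only: pick a primary by freshness first, then priority.
// `optime` (higher means more up to date) and `up` are hypothetical fields.
function electPrimary(members) {
  const eligible = members.filter((m) => m.up && m.priority > 0);
  if (eligible.length === 0) return null;
  const newest = Math.max(...eligible.map((m) => m.optime));
  const freshest = eligible.filter((m) => m.optime === newest);
  // Ties on priority keep the earlier member in the list.
  return freshest.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

const members = [
  { name: "A", priority: 2, optime: 100, up: true },
  { name: "B", priority: 2, optime: 100, up: true },
  { name: "C", priority: 1, optime: 100, up: true },
  { name: "D", priority: 1, optime: 100, up: true },
  { name: "E", priority: 0, optime: 100, up: true },
];

const allHealthy = electPrimary(members);
const withoutAB = electPrimary(
  members.map((m) => (m.name === "A" || m.name === "B" ? { ...m, up: false } : m))
);
const staleAB = electPrimary(
  members.map((m) => (m.name === "A" || m.name === "B" ? { ...m, optime: 90 } : m))
);
```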
    99. Tagging
        • New in 2.0.0
        • Control over where data is written to
        • Each member can have one or more tags, e.g.
          • tags: {dc: "ny"}
          • tags: {dc: "ny", ip: "192.168", rack: "row3rk7"}
        • Replica set defines rules for where data resides
        • Rules can change without changing application code
    100. Tagging - example
            {
              _id : "mySet",
              members : [
                {_id : 0, host : "A", tags : {"dc": "ny"}},
                {_id : 1, host : "B", tags : {"dc": "ny"}},
                {_id : 2, host : "C", tags : {"dc": "sf"}},
                {_id : 3, host : "D", tags : {"dc": "sf"}},
                {_id : 4, host : "E", tags : {"dc": "cloud"}} ],
              settings : {
                getLastErrorModes : {
                  allDCs : {"dc" : 3},
                  someDCs : {"dc" : 2} } }
            }
            > db.blogs.insert({...})
            > db.runCommand({getLastError : 1, w : "allDCs"})
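A mode such as `allDCs : {"dc" : 3}` is satisfied once the write has reached members carrying three distinct values of the `dc` tag. The sketch below checks that condition against a list of members that have acknowledged a write. `satisfiesMode` is a hypothetical helper written for illustration; it is not a driver or server API.

```javascript
// Illustrative only: check a getLastErrorModes-style rule against acknowledging members.
// `satisfiesMode` is a hypothetical helper.
function satisfiesMode(mode, ackedMembers) {
  return Object.entries(mode).every(([tag, needed]) => {
    // Count distinct values of this tag among the members that acknowledged.
    const distinct = new Set(
      ackedMembers.map((m) => m.tags[tag]).filter((v) => v !== undefined)
    );
    return distinct.size >= needed;
  });
}

const A = { host: "A", tags: { dc: "ny" } };
const B = { host: "B", tags: { dc: "ny" } };
const C = { host: "C", tags: { dc: "sf" } };
const E = { host: "E", tags: { dc: "cloud" } };

const modes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// A, B and C cover only two data centers: enough for someDCs, not for allDCs.
const twoDCs = [A, B, C];
// A, C and E cover all three.
const threeDCs = [A, C, E];
```

With these inputs, `satisfiesMode(modes.someDCs, twoDCs)` holds while `satisfiesMode(modes.allDCs, twoDCs)` does not, which is why two acknowledgements from the same data center do not count twice.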
    101. Use Cases - Multi Data Center
         • Write to three data centers
           • allDCs : {"dc" : 3}
           • > db.runCommand({getLastError : 1, w : "allDCs"})
         • Write to two data centers and three availability zones
           • allDCsPlus : {"dc" : 2, "az": 3}
           • > db.runCommand({getLastError : 1, w : "allDCsPlus"})
         Members:
           US-EAST-1  tag : {dc: "JFK", az: "r1"}
           US-EAST-2  tag : {dc: "JFK", az: "r2"}
           US-WEST-1  tag : {dc: "SFO", az: "r3"}
           US-WEST-2  tag : {dc: "SFO", az: "r4"}
           LONDON-1   tag : {dc: "LHR", az: "r5"}
    102. Use Cases - Data Protection & High Availability
         • A and B will take priority during a failover
         • C or D will become primary if A and B become unavailable
         • E cannot be primary
         • D and E cannot be read from with slaveOk()
         • D can be used for backups, feeding a Solr index, etc.
         • E provides a safeguard against operational or application error
         Members:
           A  priority: 2
           B  priority: 2
           C  priority: 1
           D  priority: 1, hidden: true
           E  priority: 0, hidden: true, slaveDelay: 3600
    103. Optimizing app performance
    104. [Diagram: RAM, Disk]
    108. Goal: Minimize memory turnover
    109. What is your data access pattern?
    110. 10 days of data
         [Diagram: RAM, Disk]
    111. Questions?
         • http://spf13.com
         • http://github.com/spf13
         • @spf13
         • Download at mongodb.org
         • PS: We’re hiring!! Contact us at jobs@10gen.com