MongoDB Sharding - MongoBoston 2010



  • What is scaling?
    Well - hopefully for everyone here.

  • Scaling isn’t new, and neither is sharding.
    Manual re-balancing is painful at best.


  • Replica sets are for inconsistent (possibly stale) read scaling






  • don’t shard by date
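The note above can be illustrated with a small sketch. This is not MongoDB code: the split points and the `chunkFor` and `countWritesPerChunk` helpers are hypothetical, and the only assumption is range-based partitioning as described in the slides. Because timestamps only increase, every new entry sorts after all existing keys and falls into the last chunk, so a single shard ends up taking all the writes.

```javascript
// Hypothetical range-partitioned collection. Split points are illustrative,
// not MongoDB defaults. Chunk i holds keys in [splitPoints[i-1], splitPoints[i]).
function chunkFor(splitPoints, key) {
  let i = 0;
  while (i < splitPoints.length && key >= splitPoints[i]) i++;
  return i;
}

// Count how many of the given writes land in each chunk.
function countWritesPerChunk(splitPoints, keys) {
  const counts = new Array(splitPoints.length + 1).fill(0);
  for (const k of keys) counts[chunkFor(splitPoints, k)]++;
  return counts;
}

// Split points derived from data that is already stored.
const splitPoints = [100, 200, 300];

// Shard key = date: new entries always carry timestamps past the last split point.
const newTimestamps = [301, 302, 303, 304, 305, 306];
const byDate = countWritesPerChunk(splitPoints, newTimestamps);

// A key whose values are spread over the whole range, for comparison.
const spreadKeys = [15, 120, 250, 330, 80, 210];
const bySpreadKey = countWritesPerChunk(splitPoints, spreadKeys);

console.log(byDate);      // [ 0, 0, 0, 6 ]  every write hits the last chunk
console.log(bySpreadKey); // [ 2, 1, 2, 1 ]  writes are spread across chunks
```

A compound key such as (machine, date), one of the options listed on the Logging slide, avoids the single hot chunk as long as several machines are writing at once.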












  • MongoDB Sharding - MongoBoston 2010

    1. Scaling and Sharding
        • Eliot Horowitz, @eliothorowitz
        • MongoBoston, September 20, 2010
    2. Scaling
        • Data size only goes up
        • Operations/sec only go up
        • Vertical scaling is limited
        • Hard to scale vertically in the cloud
        • Can scale wider than higher
    3. Traditional Horizontal Scaling
        • Read-only slaves
        • Caching
        • Custom partitioning code
    4. Newer Scaling
        • Relational database clustering
        • Consistent hashing (Dynamo)
        • Range-based partitioning (BigTable/PNUTS)
    5. MongoDB Sharding
        • Scale horizontally for data size, index size, write and consistent read scaling
        • Distribute databases, collections, or objects in a collection
        • Auto-balancing, migrations, and management happen with no downtime
    6.
        • Choose how you partition data
        • Can convert from single master to sharded system with no downtime
        • Same features as non-sharded single master
        • Fully consistent
    7. Range Based
        • Collection is broken into chunks by range
        • Chunks default to 200 MB or 100,000 objects
    8. Architecture (diagram)
        • Shards: groups of mongod processes
        • Config Servers: mongod processes
        • Routers: mongos processes
        • client
    9. User Profiles
        • Partition by user_id
        • Secondary indexes on location, dates, etc.
        • Reads/writes know which shard to hit
    10. User Activity Stream
        • Shard by user_id
        • Loading a user’s stream hits a single shard
        • Writes are distributed across all shards
        • Can index on activity for deleting
    11. Photos
        • Can shard by photo_id for best read/write distribution
        • Secondary index on tags, date
    12. Logging: Possible Shard Keys
        • date
        • machine, date
        • logger name
    13. Config Servers
        • 3 of them
        • Changes are made with 2-phase commit
        • If any are down, metadata goes read-only
        • System is online as long as 1 of the 3 is up
    14. Shards
        • Can be master, master/slave, or replica sets
        • Replica sets give sharding + full auto-failover
        • Regular mongod processes
    15. mongos
        • Sharding router
        • Acts just like a mongod to clients
        • Can have 1 or as many as you want
        • Can run on the app server, so no extra network traffic
    16. Writes
        • Inserts: require shard key, routed
        • Removes: routed and/or scattered
        • Updates: routed or scattered
    17. Queries
        • By shard key: routed
        • Sorted by shard key: routed in order
        • By non-shard key: scatter-gather
        • Sorted by non-shard key: distributed merge sort
    18. Operations
        • split: breaking a chunk into 2
        • migrate: moving a chunk from 1 shard to another
        • balancing: moving chunks automatically to keep the system in balance
    19. Setting It Up
        • Start servers
        • Add shards: db.runCommand( { addshard : "10.1.1.5" } )
        • Turn on partitioning: db.runCommand( { enablesharding : "test" } )
        • Shard a collection: db.runCommand( { shardcollection : "test.data" , key : { num : 1 } } )
    20. Download MongoDB
        • http://www.mongodb.org
        • Let us know what you think: @eliothorowitz, @mongodb
        • 10gen is hiring! http://www.10gen.com/jobs
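The Range Based and Queries slides describe how chunks cover key ranges and how a query is either routed to one shard or scattered to all of them. The following is a rough in-memory sketch of that idea, not how mongos is implemented: the `chunks` table, the shard names, and the `targetShards` and `mergeSorted` helpers are all hypothetical, and a real router would read chunk metadata from the config servers.

```javascript
// Hypothetical chunk table: each chunk owns shard-key values in [min, max).
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" },
];

// Return the shards a query has to visit.
// With the shard key present the query is routed to one shard;
// without it every shard must be asked (scatter-gather).
function targetShards(query, shardKey) {
  if (Object.prototype.hasOwnProperty.call(query, shardKey)) {
    const value = query[shardKey];
    const chunk = chunks.find((c) => value >= c.min && value < c.max);
    return [chunk.shard];
  }
  return [...new Set(chunks.map((c) => c.shard))];
}

// Merge per-shard result lists that are each already sorted on `field`
// (the "distributed merge sort" case for sorting by a non-shard key).
function mergeSorted(lists, field) {
  const cursors = lists.map(() => 0);
  const out = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (cursors[i] >= lists[i].length) continue; // this list is exhausted
      if (best === -1 ||
          lists[i][cursors[i]][field] < lists[best][cursors[best]][field]) {
        best = i;
      }
    }
    if (best === -1) return out; // all lists exhausted
    out.push(lists[best][cursors[best]++]);
  }
}

const routed = targetShards({ user_id: 1500 }, "user_id");
const scattered = targetShards({ location: "Boston" }, "user_id");
const merged = mergeSorted([[{ age: 20 }, { age: 40 }], [{ age: 30 }], []], "age");

console.log(routed);    // [ 'shardB' ]
console.log(scattered); // [ 'shardA', 'shardB', 'shardC' ]
console.log(merged.map((d) => d.age)); // [ 20, 30, 40 ]
```

This is why the User Profiles and User Activity Stream slides shard by user_id: the common lookups carry the shard key, so they take the routed path and touch a single shard.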
