MongoDB - Scaling write performance
Comment: Excellent post. Thank you very much. I am interested to know if you have come across other real-world implementations in Mongo that supported thousands of inserts/second with secondary indexes.
MongoDB - Scaling write performance: Presentation Transcript

  • MongoDB: Scaling write performance (Junegunn Choi)
  • MongoDB
    • Document data store: JSON-like documents
    • Secondary indexes
    • Automatic failover
    • Automatic sharding
  • First impression: easy
    • Easy installation
    • Easy data model
    • No prior schema design
    • Native support for secondary indexes
  • Second thought: not so easy
    • No SQL
    • Coping with massive data growth
    • Setting up and operating a sharded cluster
    • Scaling write performance
  • Today we’ll talk about insert performance
  • Insert throughput on a replica set
  • Steady 5k inserts/sec
    • 1kB records, ObjectId as PK
    • WriteConcern: journal sync on majority
  • Insert throughput on a replica set with a secondary index
  • Culprit: B+Tree index
    • Good at sequential inserts, e.g. ObjectId, sequence #, timestamp
    • Poor at random inserts, i.e. indexes on randomly-distributed data
  • Sequential vs. random insert (B+Tree diagram)
    • Sequential insert ➔ small working set ➔ fits in RAM ➔ sequential I/O (bandwidth-bound)
    • Random insert ➔ large working set ➔ cannot fit in RAM ➔ random I/O (IOPS-bound)
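The working-set argument above can be sketched with a toy page model (the page size and key ranges here are made-up numbers, not MongoDB internals): sequential keys only ever touch the rightmost leaf page, while random keys scatter writes over every page.

```python
import random

PAGE_SIZE = 100  # keys per leaf page (illustrative only)

def pages_touched(keys):
    """Leaf pages a stream of inserts would dirty, assuming keys map to
    fixed-range pages holding PAGE_SIZE keys each."""
    return {k // PAGE_SIZE for k in keys}

sequential = list(range(10_000))                    # ObjectId-like monotonic keys
random_keys = random.sample(range(10_000), 10_000)  # randomly-distributed keys

# The last 100 sequential inserts all land on the rightmost page...
print(len(pages_touched(sequential[-100:])))  # 1
# ...while the last 100 random inserts dirty dozens of distinct pages.
print(len(pages_touched(random_keys[-100:])))
```

The "hot" set for sequential inserts stays one page regardless of table size; for random inserts it grows with the index, which is exactly when it stops fitting in RAM.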
  • So, what do we do now?
  • 1. Partitioning (diagram: Aug 2012, Sep 2012, Oct 2012 partitions)
    • The active partition's B+Tree fits in memory; older partitions' trees do not need to
  • 1. Partitioning
    • MongoDB doesn't support partitioning
    • Partitioning at the application level
    • e.g. daily log collections: logs_20121012
  • Switch collection every hour
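A minimal sketch of this application-level partitioning: derive the collection name from the event timestamp. The daily name matches the slide's logs_20121012 example; the hourly helper's naming scheme is an assumption based on the "switch collection every hour" note.

```python
from datetime import datetime

def daily_collection(ts: datetime) -> str:
    # e.g. "logs_20121012", as on the slide
    return ts.strftime("logs_%Y%m%d")

def hourly_collection(ts: datetime) -> str:
    # hourly variant (assumed naming scheme)
    return ts.strftime("logs_%Y%m%d%H")

ts = datetime(2012, 10, 12, 9, 30)
print(daily_collection(ts))   # logs_20121012
print(hourly_collection(ts))  # logs_2012101209
```

The writer then inserts into `db[daily_collection(now)]`, and expired partitions can be dropped as whole collections, which is far cheaper than deleting individual documents.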
  • 2. Better H/W
    • More RAM
    • More IOPS: RAID striping, SSD, AWS Provisioned IOPS (1k ~ 10k)
  • 3. More H/W: sharding
    • Automatic partitioning across nodes (SHARD1, SHARD2, SHARD3 behind a mongos router)
  • 3 shards (3x3)
  • 3 shards (3x3) on RAID 1+0
  • There's no free lunch
    • Manual partitioning: incidental complexity
    • Better H/W: $
    • Sharding: $$, operational complexity
  • “Do you really need that index?”
  • Scaling insert performance with sharding
  • = Choosing the right shard key
  • Shard key example: year_of_birth
    • 64MB chunks: ~1950, 1951~1970, 1971~1990, 1991~2005, 2006~2010, 2010~∞
    • USERS chunks spread across SHARD1, SHARD2, SHARD3 behind a mongos router
  • 5k inserts/sec w/o sharding
  • Sequential key
    • ObjectId as shard key
    • Sequence #
    • Timestamp
  • Worse throughput with 3x H/W.
  • Sequential key (diagram: chunks 1000 ~ 2000, 5000 ~ 7500, 9000 ~ ∞)
    • All inserts go into one chunk on SHARD-x: 9001, 9002, 9003, 9004, ...
    • Cannot scale insert performance
    • Chunk migration overhead
  • Sequential key
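The hotspot described above can be sketched in a few lines: with a monotonically increasing key, every new value is greater than every chunk boundary, so all inserts hit the single "max" chunk. The chunk boundaries below are illustrative, mirroring the slide's 9000 ~ ∞ example.

```python
import bisect

# Upper bounds of the first chunks; keys >= 9000 fall into the last chunk (9000 ~ ∞).
chunk_bounds = [1000, 2000, 5000, 7500, 9000]

def chunk_for(key: int) -> int:
    """Index of the chunk whose key range contains `key`."""
    return bisect.bisect_right(chunk_bounds, key)

# 100 sequential inserts: 9001, 9002, 9003, ...
hits = [chunk_for(k) for k in range(9001, 9101)]
print(set(hits))  # {5}: every insert lands in the single max chunk
```

One chunk means one shard takes all the write load, no matter how many shards exist, and the balancer keeps splitting and migrating that same hot chunk.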
  • Hash key
    • e.g. SHA1(_id) = 9f2feb0f1ef425b292f2f94 ...
    • Distributes inserts evenly across all chunks
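A sketch of such a hash key computed client-side with `hashlib` (the slide does not say at which layer the hash is computed; server-side hashed shard keys are another option):

```python
import hashlib

def hashed_key(_id: str) -> str:
    """SHA-1 of the _id, hex-encoded: effectively uniform over the key space."""
    return hashlib.sha1(_id.encode()).hexdigest()

k = hashed_key("507f1f77bcf86cd799439011")  # an example ObjectId string
print(k)  # 40 hex characters, uniformly distributed across chunks
```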
  • Hash key
    • Performance drops as the collection grows
    • Why? The mandatory index on the shard key: the B+Tree problem again!
  • Sequential key Hash key
  • Sequential + hash key
    • Coarse-grained sequential prefix
    • e.g. year-month + hash value: 201210_24c3a5b9
    • B+Tree access stays within the current prefix (diagram: 201208_*, 201209_*, 201210_*)
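A sketch of building such a key: the year-month prefix keeps current inserts in one region of the B+Tree, while the hash suffix spreads them across chunks within that region. The 8-character truncation matches the slide's 201210_24c3a5b9 example; the function name is hypothetical.

```python
import hashlib
from datetime import datetime

def month_hash_key(_id: str, ts: datetime) -> str:
    prefix = ts.strftime("%Y%m")                         # coarse sequential part
    suffix = hashlib.sha1(_id.encode()).hexdigest()[:8]  # 8 hex chars, as in the example
    return f"{prefix}_{suffix}"

key = month_hash_key("some-id", datetime(2012, 10, 12))
print(key)  # starts with "201210_"
```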
  • But what if... the working set spans several prefixes (diagram: 201208_*, 201209_*, 201210_*)? The B+Tree working set becomes large again.
  • Sequential + hash key
    • Can you predict the data growth rate?
    • Balancer not clever enough: only considers the # of chunks; migration is slow during heavy writes
  • Sequential key vs. hash key vs. sequential + hash key
  • Low-cardinality hash key (shard key range: A ~ D)
    • Small portion of the hash value, e.g. A~Z, 00~FF
    • Alleviates the B+Tree problem: sequential access on a fixed # of parts, each a local B+Tree
    • Parts per shard = cardinality / # of shards (diagram: A A A | B B B | C C C)
  • Low-cardinality hash key
    • Limits the # of possible chunks, e.g. 00 ~ FF ➔ 256 chunks
    • Chunks grow past 64MB; balancing becomes difficult
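A sketch of the 00 ~ FF variant: keep only the first byte of the hash, so the shard key takes at most 256 distinct values, and inserts within each value are appended sequentially to a small local region of the index.

```python
import hashlib

def low_card_key(_id: str) -> str:
    """First byte of SHA-1(_id) as two hex chars: "00" .. "ff"."""
    return hashlib.sha1(_id.encode()).hexdigest()[:2]

distinct = {low_card_key(f"doc-{i}") for i in range(10_000)}
print(len(distinct))  # at most 256 distinct shard-key values
```

This is also why balancing degrades: a chunk cannot be split below one key value, so once those 256 chunks exist, oversized chunks can no longer be divided.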
  • Sequential key vs. hash key vs. sequential + hash key vs. low-cardinality hash key
  • Low-cardinality hash prefix + sequential part (shard key range: A000 ~ C999)
    • e.g. short hash prefix + timestamp
    • Nice index access pattern: each prefix is a local B+Tree
    • Unlimited number of chunks (diagram: A000, A123, B000, B123, C000, C123)
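The final scheme might look like this sketch: a short hash prefix bounds the number of hot B+Tree regions, while the sequential part keeps inserts ordered within each region, so chunks can keep splitting on the sequential part. The two-hex-char prefix and the timestamp format are assumptions; the slide only shows keys like A000 ~ C999.

```python
import hashlib
from datetime import datetime

def prefix_seq_key(_id: str, ts: datetime) -> str:
    # Two hex chars -> at most 256 coarse buckets (assumed prefix alphabet).
    prefix = hashlib.sha1(_id.encode()).hexdigest()[:2]
    # Sequential within a bucket, so inserts append to a small local index region.
    return f"{prefix}{ts.strftime('%Y%m%d%H%M%S')}"

key = prefix_seq_key("some-id", datetime(2012, 10, 12, 9, 30, 0))
print(key)  # e.g. "xx20121012093000" for some two-hex-char prefix
```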
  • Finally, 2x throughput
  • Lessons learned
    • Know the performance impact of secondary indexes
    • Choose the right shard key
    • Test with large data sets
    • Linear scalability is hard: if you really need it, consider HBase or Cassandra, or SSDs
  • Thank you. Questions? Eungsub Yoo (rspeed@daumcorp.com), Junegunn Choi (gunn@daumcorp.com)