MongoDB: Scaling write performance | Devon 2012
1. MongoDB: Scaling write performance (Junegunn Choi)
2. First impression: Easy
   • Easy installation
   • Easy data model
   • No prior schema design
   • Native support for secondary indexes

3. Second thought: Not so easy
   • No SQL
   • Coping with massive data growth
   • Setting up and operating a sharded cluster
   • Scaling write performance
4. Today we’ll talk about insert performance

5. Insert throughput on a replica set

6. Steady 5k inserts/sec
   * 1 kB record, ObjectId as PK
   * WriteConcern: journal sync on majority
7. Insert throughput with a secondary index

8. Culprit: B+Tree index
   • Good at sequential insert: e.g. ObjectId, sequence #, timestamp
   • Poor at random insert: indexes on randomly-distributed data

9. Sequential vs. random insert (diagram of two B+Trees)
   • Sequential insert ➔ small working set ➔ fits in RAM ➔ sequential I/O (bandwidth-bound)
   • Random insert ➔ large working set ➔ cannot fit in RAM ➔ random I/O (IOPS-bound)
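Slide 9’s working-set argument can be sketched with a toy simulation (mine, not the speaker’s): model the index as a sorted list whose positions map to fixed-size leaf "pages", and count how many distinct pages the most recent inserts touch.

```python
import bisect
import random

PAGE_SIZE = 100   # keys per leaf "page" in this toy model
WINDOW = 500      # measure the pages touched by the last 500 inserts

def working_set(keys):
    """Insert keys one by one into a sorted list (a stand-in for a B+Tree)
    and return how many distinct leaf pages the last WINDOW inserts hit."""
    tree, touched = [], []
    for k in keys:
        pos = bisect.bisect(tree, k)
        tree.insert(pos, k)
        touched.append(pos // PAGE_SIZE)
    return len(set(touched[-WINDOW:]))

n = 10_000
seq_ws = working_set(range(n))                                # always-ascending keys
rand_ws = working_set(random.Random(1).sample(range(n), n))   # same keys, shuffled

print(f"sequential: {seq_ws} hot pages, random: {rand_ws} hot pages")
```

Sequential inserts keep landing on the rightmost few pages, which easily stay in RAM; the same keys inserted in random order scatter writes across nearly every page, which is why throughput collapses once the index outgrows memory.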
10. So, what do we do now?

11. 1. Partitioning (diagram: one B+Tree per month, Aug/Sep/Oct 2012; the current month’s tree fits in memory, the full unpartitioned index does not)

12. 1. Partitioning
   • MongoDB doesn’t support partitioning
   • Partitioning at application level
   • e.g. daily log collection: logs_20121012
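A minimal sketch of the application-level partitioning idea on slide 12, with a naming helper of my own; only the logs_YYYYMMDD pattern (and the hourly switch on slide 13) comes from the slides.

```python
from datetime import datetime, timezone

def log_collection_name(ts: datetime, granularity: str = "day") -> str:
    """Map a timestamp to a partition collection name, e.g. logs_20121012.
    Hourly granularity gives logs_2012101210 (slide 13's hourly switch)."""
    fmt = "%Y%m%d" if granularity == "day" else "%Y%m%d%H"
    return "logs_" + ts.strftime(fmt)

# Writes go to the current partition; expired partitions can be dropped
# wholesale, far cheaper than deleting documents out of one huge B+Tree.
now = datetime(2012, 10, 12, 10, 30, tzinfo=timezone.utc)
print(log_collection_name(now))          # logs_20121012
print(log_collection_name(now, "hour"))  # logs_2012101210
```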
13. Switch collection every hour

14. 2. Better H/W
   • More RAM
   • More IOPS: RAID striping, SSD, AWS Provisioned IOPS (1k ~ 10k)

15. 3. More H/W: Sharding
   • Automatic partitioning across nodes (diagram: mongos router in front of SHARD1, SHARD2, SHARD3)
16. 3 shards (3x3)

17. 3 shards (3x3) on RAID 1+0

18. There’s no free lunch
   • Manual partitioning: incidental complexity
   • Better H/W: $
   • Sharding: $$, operational complexity

19. “Do you really need that index?”

20. Scaling insert performance with sharding

21. = Choosing the right shard key
22. Shard key example: year_of_birth
    (diagram: 64 MB chunks of the USERS collection with ranges ~ 1950, 1951 ~ 1970, 1971 ~ 1990, 1991 ~ 2005, 2006 ~ 2010, 2010 ~ ∞, distributed across SHARD1, SHARD2, SHARD3 behind a mongos router)

23. 5k inserts/sec w/o sharding

24. Sequential key
   • ObjectId as shard key
   • Sequence #
   • Timestamp
25. Worse throughput with 3x H/W.

26. Sequential key
   • All inserts go into one chunk: the last range (9000 ~ ∞) receives 9001, 9002, 9003, 9004, ...
   • Chunk migration overhead
   (diagram: USERS chunks 1000 ~ 2000, 5000 ~ 7500, 9000 ~ ∞; every insert hits the one SHARD-x holding the last chunk)

27. Sequential key

28. Hash key
   • e.g. SHA1(_id) = 9f2feb0f1ef425b292f2f94 ...
   • Distributes evenly across all ranges
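A sketch of computing the hash shard key in the application, as slide 28’s SHA1(_id) example suggests. At the time of this talk the hash had to be precomputed and stored by the application; MongoDB 2.4 later added built-in hashed shard keys. The example _id below is illustrative.

```python
import hashlib

def hash_shard_key(doc_id: str) -> str:
    """Uniformly distributed shard key derived from _id (slide 28)."""
    return hashlib.sha1(doc_id.encode("utf-8")).hexdigest()

# The document is then inserted with this value stored in its shard key field.
print(hash_shard_key("507f1f77bcf86cd799439011"))
```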
29. Hash key
   • Performance drops as the collection grows
   • Why? The mandatory shard key index: the B+Tree problem again!

30. Sequential key / Hash key

31. Sequential + hash key
   • Coarse-grained sequential prefix
   • e.g. year-month + hash value: 201210_24c3a5b9
   (diagram: B+Tree regions 201208_*, 201209_*, 201210_*)
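The compound key of slide 31 can be sketched like this; the 8-character suffix length is my assumption, based on the 201210_24c3a5b9 example.

```python
import hashlib
from datetime import datetime, timezone

def seq_plus_hash_key(doc_id: str, ts: datetime) -> str:
    """Coarse sequential prefix (year-month) plus a hash suffix,
    e.g. 201210_24c3a5b9 (slide 31). Suffix length of 8 hex chars
    is an assumption from the slide's example."""
    suffix = hashlib.sha1(doc_id.encode("utf-8")).hexdigest()[:8]
    return ts.strftime("%Y%m") + "_" + suffix

ts = datetime(2012, 10, 15, tzinfo=timezone.utc)
key = seq_plus_hash_key("user-42", ts)
print(key)  # e.g. 201210_xxxxxxxx
```

Only the current month’s region of the B+Tree is hot, so the working set stays bounded, while the hash suffix spreads inserts across chunks within that month.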
32. But what if... (diagram: B+Tree with a large working set spanning 201208_*, 201209_*, 201210_*)

33. Sequential + hash key
   • Can you predict the data growth rate?
   • Balancer not clever enough: only considers # of chunks; migration is slow during heavy writes

34. Sequential key / Hash key / Sequential + hash key
35. Low-cardinality hash key (shard key range: A ~ D)
   • e.g. A~Z, 00~FF
   • Alleviates the B+Tree problem: sequential access on a fixed # of parts (local B+Tree: A A A B B B C C C)

36. Low-cardinality hash key
   • Limits the # of possible chunks: e.g. 00 ~ FF ➔ 256 chunks
   • Chunks grow past 64 MB
   • Balancing becomes difficult

37. Sequential key / Hash key / Sequential + hash key / Low-cardinality hash key

38. Low-cardinality hash prefix + sequential part (shard key range: A000 ~ C999)
   • e.g. short hash prefix + timestamp: FA1350005981
   • Nice index access pattern (local B+Tree: A000 A123 B000 B123 C000 C123)
   • Unlimited # of chunks
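Reading slide 38’s FA1350005981 example as a 2-hex-character prefix (256 possible values) followed by a Unix timestamp (1350005981 falls in October 2012) is my interpretation, not stated on the slide. Under that assumption the key can be built as:

```python
import hashlib

def prefix_plus_seq_key(doc_id: str, unix_ts: int) -> str:
    """Low-cardinality hash prefix + sequential part, e.g. FA1350005981
    (slide 38). The 2-hex-char prefix and Unix-timestamp suffix are my
    reading of the slide's example."""
    prefix = hashlib.sha1(doc_id.encode("utf-8")).hexdigest()[:2].upper()
    return f"{prefix}{unix_ts}"

key = prefix_plus_seq_key("user-42", 1350005981)
print(key)
```

Inserts then append to at most 256 hot spots in the index, each locally sequential so the working set stays small, while the trailing timestamp keeps every range open-ended so chunks can keep splitting.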
39. Finally, 2x throughput

40. Lessons learned
   • Know the performance impact of secondary indexes
   • Choose the right shard key
   • Test with large data sets
   • Linear scalability is hard
     • If you really need it, consider HBase or Cassandra
     • SSD

41. Thank you. Questions? gunn@daumcorp.com
