A New MongoDB Sharding Architecture, Leif Walsh (Tokutek)

  1. A New MongoDB Sharding Architecture for Higher Availability and Better Resource Utilization. Leif Walsh, @leifwalsh
  2. A Traditional MongoDB Cluster • 3 shards. • 3 replicas per shard.
  3. A Traditional MongoDB Cluster • 3x write throughput. • 3x read throughput.
  4. A Traditional MongoDB Cluster • 1 node can go down without losing availability.
  5. A Traditional MongoDB Cluster • Data can survive destruction of 2 nodes.
  6. General MongoDB Cluster • Sx write throughput. • Rx read throughput. • R/2 nodes can go down without losing availability. • Data can survive destruction of R-1 nodes. • S×R hardware & maintenance cost.
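As a quick sanity check of slide 6's arithmetic, here is a minimal Python sketch for an S-shard, R-replica cluster; the function and field names are my own, not MongoDB terminology. With S = R = 3 it reproduces the numbers from slides 2-5.

```python
# Sanity check of the slide's arithmetic for an S-shard, R-replica cluster.
# Function and field names are illustrative, not MongoDB terminology.
def cluster_properties(shards, replicas):
    return {
        "write_throughput_factor": shards,       # Sx: one primary per shard takes writes
        "read_throughput_factor": replicas,      # Rx: every replica can serve reads
        "nodes_tolerated_down": replicas // 2,   # ~R/2 per replica set without losing availability
        "copies_surviving_loss": replicas - 1,   # data survives destruction of R-1 nodes
        "machine_count": shards * replicas,      # SxR hardware & maintenance cost
    }

print(cluster_properties(shards=3, replicas=3))
# {'write_throughput_factor': 3, 'read_throughput_factor': 3,
#  'nodes_tolerated_down': 1, 'copies_surviving_loss': 2, 'machine_count': 9}
```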
  7. TokuMX: MongoDB with Fractal Trees • MongoDB fork. • Compression, performance, transactions. • Details about Fractal Trees after lunch.
  8. TokuMX: MongoDB with Fractal Trees • Read-free Replication • Fast Updates • Optimized Sharding Migrations • Ark Consensus for Replication Failover • Partitioned Collections • Clustering Indexes & Primary Keys • tokutek.com/tokumx
  9. Fractal Tree Performance Basics • Writes are cheap: O(1/B) I/Os per op (≈10k/s). • Reads are expensive: Ω(1) I/O per op (≈100/s).
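The slide's throughput figures follow from a back-of-envelope model; the sketch below assumes a disk doing roughly 100 random I/Os per second and a write-amortization factor B of 100, both illustrative constants rather than measurements.

```python
# Back-of-envelope model behind the slide's numbers. Both constants are
# illustrative assumptions, not measurements.
DISK_RANDOM_IOPS = 100   # a spinning disk does ~100 random I/Os per second
B = 100                  # each I/O carries ~B buffered writes down the tree (the 1/B factor)

writes_per_second = DISK_RANDOM_IOPS * B   # O(1/B) I/Os per write  -> ~10,000/s
reads_per_second = DISK_RANDOM_IOPS        # Omega(1) I/O per read  -> ~100/s
print(writes_per_second, reads_per_second)  # 10000 100
```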
  10. Read-free Replication Updates are reads + writes. Secondaries can trust the primary, so they only need to do writes.
  11. Read-free Replication Updates are reads + writes. Secondaries can trust the primary, so they only need to do writes. Looking at I/O utilization, secondaries are very cheap compared to primaries.
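To make the read/write split concrete, here is a conceptual Python sketch of read-free replication; the oplog format and function names are hypothetical, not TokuMX's actual implementation, but they show why a secondary never has to read the old document.

```python
# Conceptual sketch of read-free replication (data structures and field names
# are hypothetical, not TokuMX's actual oplog format).

def primary_apply_update(store, oplog, key, delta):
    old = store[key]                         # primary must READ the old document...
    new = {**old, **delta}                   # ...to compute the updated document
    store[key] = new                         # ...then WRITE it
    oplog.append({"key": key, "doc": new})   # ship the full post-image to secondaries

def secondary_apply(store, entry):
    # The secondary trusts the primary: no read of the old value is needed,
    # it simply writes the post-image it was given.
    store[entry["key"]] = entry["doc"]

primary_store, secondary_store, oplog = {"u1": {"visits": 1}}, {"u1": {"visits": 1}}, []
primary_apply_update(primary_store, oplog, "u1", {"visits": 2})
for entry in oplog:
    secondary_apply(secondary_store, entry)
print(secondary_store)  # {'u1': {'visits': 2}}
```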
  12. A Traditional TokuMX Cluster • 9 machines, only 3x throughput benefit. • Secondaries are under-utilized.
  13. A TokuMX Cluster With Read-free Replication • 3x write throughput. • 3x read throughput, possibly served separately (writes on primaries, reads on the now-idle secondaries).
  14. A TokuMX Cluster With Read-free Replication • 1 node can go down without losing availability.
  15. A TokuMX Cluster With Read-free Replication • Data can survive destruction of 2 nodes.
  16. A TokuMX Cluster With Read-free Replication • Only 3x hardware cost, down from 9x.
  17. Dynamo Architecture • Developed at Amazon. • Used by Cassandra, Riak, Voldemort. • Many components, I will focus on data partitioning. http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
  18. Dynamo Architecture • Servers are equal peers, not separate primaries and secondaries. • Store overlapping subsets of data (MongoDB shards store disjoint subsets). • Data partitioning determined by consistent hashing.
  19. Dynamo Partitioning • N servers in a ring. • hash(K) is a location around the ring. • Store data for K on the next R servers on the ring.
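A minimal sketch of this ring placement, assuming an MD5-based hash and illustrative node names (real Dynamo-style systems add virtual nodes and other refinements):

```python
# Minimal consistent-hashing sketch of the ring placement on this slide.
# The hash function and node names are illustrative.
import bisect
import hashlib

RING_SIZE = 2**32

def ring_position(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

def owners(key, nodes, replicas=3):
    # Walk clockwise from hash(key) and take the next `replicas` servers.
    ring = sorted((ring_position(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]
    start = bisect.bisect(positions, ring_position(key))
    return [ring[(start + i) % len(ring)][1] for i in range(replicas)]

nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
print(owners("user:42", nodes))  # the 3 servers clockwise from hash("user:42")
```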
  20. Dynamo Partitioning • All nodes accept writes: ~linear write scaling. • Data replicated R times: Rx read performance/ reliability.
  21. Dynamo-style Sharding in TokuMX • Each node is primary for some chunks, secondary for others. • Nodes store overlapping subsets of the data set.
  22. Dynamo-style Sharding in TokuMX • S primaries in the ring: Sx write throughput. • R copies of each chunk on separate machines: Rx read throughput, availability & recovery guarantees.
  23. Dynamo-style Sharding in TokuMX • Adding a node: – Move one secondary from each of the next 2 nodes to the new node. – Initialize a new replica set on the new node and the next 2 nodes.
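A conceptual sketch of that add-node step, assuming R = 3 and a simple "each node homes one replica set, replicated on the next R-1 nodes clockwise" layout (not TokuMX's actual balancer code). Diffing the membership map before and after the insertion shows exactly the two steps on the slide.

```python
# Conceptual sketch of the add-node step (R = 3 and the ring layout are
# assumptions matching the slides; this is not TokuMX's actual balancer code).
R = 3  # copies of each chunk

def replica_sets(ring):
    # Each node "homes" one replica set, replicated on the next R-1 nodes clockwise.
    return {home: [ring[(i + j) % len(ring)] for j in range(R)]
            for i, home in enumerate(ring)}

before = replica_sets(["A", "B", "C", "D"])
after = replica_sets(["A", "B", "NEW", "C", "D"])  # insert NEW between B and C

# The slide's two steps fall out of the diff: one secondary moves off each of
# NEW's next 2 nodes (C and D) onto NEW, and one new replica set is
# initialized on NEW and those same 2 nodes.
for home in after:
    if before.get(home) != after[home]:
        print(home, before.get(home), "->", after[home])
# A ['A', 'B', 'C'] -> ['A', 'B', 'NEW']
# B ['B', 'C', 'D'] -> ['B', 'NEW', 'C']
# NEW None -> ['NEW', 'C', 'D']
```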
  24. Future Work The chunk balancer is not sophisticated: • Adding/removing machines is rough and overloads the machine’s neighbors. • Can we use ideas from Cassandra & Riak to improve this? The MongoDB architecture requires managing multiple processes on each machine. • We can do better with good tools. Talk to me if you want to write them.
  25. Thanks! Come to my talk after lunch for details about Fractal Trees. leif@tokutek.com @leifwalsh tokutek.com/tokumx slidesha.re/13pxgH8