Virtual nodes: Operational Aspirin
Cassandra SF meetup, October 2013

Published in: Technology, Education

Transcript

  • 1. Virtual nodes: Operational Aspirin Nicolas Favre-Felix nicolas@acunu.com @yowgi Thursday, 24 October 13
  • 2. 1-minute recap on Cassandra distribution
      • Nodes are clustered in a “ring”
      • Each node has a token in [0, 2^127 - 1]:
          • 0
          • 42535295865117307932921825928971026432
          • 85070591730234615865843651857942052864
          • 127605887595351923798765477786913079296
      • Keys are hashed using MD5 (now Murmur3)
      • Each node owns a share of the key-space
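The four tokens listed on this slide are what you get by spacing four nodes evenly around a ring of 2^127 positions. A minimal sketch of that calculation follows; the function name `initial_tokens` is made up for illustration and is not part of Cassandra.

```python
def initial_tokens(node_count, ring_size=2**127):
    """Return one evenly spaced token per node on a ring of `ring_size` positions."""
    return [i * ring_size // node_count for i in range(node_count)]


# Tokens for a 4-node cluster, one per line.
for token in initial_tokens(4):
    print(token)
```

This is the arithmetic operators had to redo by hand whenever the node count changed, which is the pain point the following slides describe.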
  • 3. Cassandra distribution limitations
      • Operational complexity
      • Rebuild cost for capacity-bound clusters
      • Impact on maintenance operations
      • Impact on topology changes
      • No native support for heterogeneous hardware
  • 4. Adding a node to an existing cluster
  • 5. Insert the new node...
  • 6. Recalculate ranges and rebalance by hand
  • 7. Usually just double the number of nodes
  • 8. Add/remove node
      • Need to rebalance ranges between nodes
      • Move more data than is optimal
          • (optimal would be 1/N)
      • Impacts at most RF nodes
          • (prefer to spread load across cluster)
      • Manual, tedious, error-prone, painful...
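To make "move more data than is optimal" concrete, here is a rough sketch that measures how much of the key-space changes owner when a ring of N evenly spaced single-token nodes is rebalanced to N + 1 evenly spaced nodes. It is a simplified model, not Cassandra code: the ring is normalised to [0, 1), only primary ranges are counted (replication is ignored), and the names `owner` and `moved_fraction` are invented for this example.

```python
from bisect import bisect_left
from fractions import Fraction


def owner(point, ring):
    """Node owning `point`: the first token >= point, wrapping around the ring."""
    tokens = [token for token, _ in ring]
    return ring[bisect_left(tokens, point) % len(ring)][1]


def moved_fraction(n):
    """Fraction of the key-space that changes owner when n evenly spaced
    single-token nodes are rebalanced to n + 1 evenly spaced nodes."""
    before = [(Fraction(i, n), i) for i in range(n)]
    after = [(Fraction(i, n + 1), i) for i in range(n + 1)]
    # No token lies strictly between two consecutive cut points,
    # so each segment (prev, cut] has a single owner in both layouts.
    cuts = sorted({token for token, _ in before + after} | {Fraction(1)})
    moved, prev = Fraction(0), Fraction(0)
    for cut in cuts:
        if owner(cut, before) != owner(cut, after):
            moved += cut - prev
        prev = cut
    return moved


print(moved_fraction(4), "moved; optimal would be", Fraction(1, 5))
```

Under these assumptions, growing from 4 to 5 nodes moves 7/20 of the key-space, against an optimum of 1/5.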
  • 9. Removing a node
      • nodetool removetoken (removenode from 1.2)
      • Dead host's token removed from ring
      • Next host in ring assumes range
      • Replica count restored
      • Involves at most 2 * RF - 1 nodes
      • If we can make it faster, we can store more data!
  • 10. Virtual Nodes!
  • 11. Virtual nodes in Cassandra 1.2+
      • More than one token per node
      • Random token assignment
      • Incremental cluster resize, one node at a time
      • Streaming to/from all nodes, not just neighbors
      • Only random partitioners are supported
      • Multi-DC support still works in the same way
  • 12. Different virtual nodes strategies
        Strategy                  Number of partitions   Partition size
        Random (Cassandra 1.2+)   O(N)                   O(B/N)
        Fixed (Riak)              O(1)                   O(B)
        Auto-sharding (MongoDB)   O(B)                   O(1)
        N = number of nodes, B = size of dataset
        (read more at http://bit.ly/virtualnodes)
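The scaling behaviour in this table can be illustrated with a few lines of arithmetic. The sketch below is only a model of the three strategies: the function names are invented, and the default values (256 tokens per node, 64 fixed partitions, a 64 MiB size cap) are illustrative parameters rather than claims about any product's configuration.

```python
def random_vnodes(nodes, dataset_bytes, tokens_per_node=256):
    """Random strategy: partition count grows with the node count, O(N)."""
    partitions = nodes * tokens_per_node
    return partitions, dataset_bytes / partitions


def fixed_partitions(dataset_bytes, partitions=64):
    """Fixed strategy: count is chosen up front, so partition size grows with the data."""
    return partitions, dataset_bytes / partitions


def auto_sharding(dataset_bytes, max_partition_bytes=64 * 2**20):
    """Auto-sharding: partition size is capped, so the count grows with the data."""
    partitions = -(-dataset_bytes // max_partition_bytes)  # ceiling division
    return partitions, min(dataset_bytes, max_partition_bytes)


one_tib = 2**40
print("random, 10 nodes:", random_vnodes(10, one_tib))
print("fixed:", fixed_partitions(one_tib))
print("auto-sharding:", auto_sharding(one_tib))
```

Doubling the node count doubles the partition count only in the random strategy; doubling the dataset doubles it only under auto-sharding.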
  • 13. Virtual Nodes!
      • New in 1.2, enabled by default in 2.0
      • Set num_tokens: 256 in cassandra.yaml
  • 14. Adding nodes to a cluster
      • From a single node...
      • Multiple tokens
      • Ranges of different sizes
  • 15. Adding nodes to a cluster
      • We add a second node
      • “Steals” ranges from the existing node
  • 16. Adding nodes to a cluster
      • And a third one...
      • “Steals” ranges from the existing nodes
      • Distribution is close to 1/3 each
  • 17. An ideal distribution
  • 18. Actually more like this
  • 19. Bootstrap
      • Assign a new host T random tokens (T=256)
      • New tokens split ranges from existing nodes
      • Each existing node contributes to the bootstrap
      • Optimal data movement
      • No need to rebalance, or double cluster size
      • No need to calculate tokens
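The bootstrap behaviour described on this slide can be simulated in a few lines. This is a sketch under stated assumptions, not Cassandra's implementation: the ring size `RING` is arbitrary, only primary ranges are modelled, and the helper names `random_tokens` and `bootstrap_contributions` are hypothetical.

```python
import random
from bisect import bisect_left
from collections import Counter

RING = 2**64  # illustrative ring size, not Cassandra's actual token range


def random_tokens(rng, node, count=256):
    """Assign `count` random tokens to `node`."""
    return [(rng.randrange(RING), node) for _ in range(count)]


def bootstrap_contributions(old_ring, new_tokens, new_node):
    """Share of the ring that `new_node` takes from each existing node.

    A node owns the range (previous token, its token], so a new token splits
    the range of whichever existing token follows it on the ring.
    """
    old_ring = sorted(old_ring)
    old_tokens = [token for token, _ in old_ring]
    merged = sorted(old_ring + new_tokens)
    taken = Counter()
    for i, (token, node) in enumerate(merged):
        if node != new_node:
            continue
        prev = merged[i - 1][0] - (RING if i == 0 else 0)  # wrap at the start
        donor = old_ring[bisect_left(old_tokens, token) % len(old_ring)][1]
        taken[donor] += (token - prev) / RING
    return dict(taken)


rng = random.Random(42)
cluster = [pair for name in "ABCD" for pair in random_tokens(rng, name)]
shares = bootstrap_contributions(cluster, random_tokens(rng, "E"), "E")
for donor, share in sorted(shares.items()):
    print(f"{donor} gives up {share:.1%} of the ring")
print(f"E now owns {sum(shares.values()):.1%}")
```

Every existing node hands over a slice, and the new node ends up with roughly one fifth of the ring without any token calculation.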
  • 20. Removing nodes from a cluster
      • Removing the node with blue ranges:
  • 21. Removing nodes from a cluster
  • 22. Removing a node
      • nodetool removetoken is replaced by nodetool removenode
      • nodetool removenode <host_id>
      • Dead host's tokens removed from ring
      • Ranges recalculated & data moved
      • All nodes participate!
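A small simulation shows why "all nodes participate" with virtual nodes, while a single-token ring hands the dead node's range to one neighbour. It is a simplified model that tracks primary ranges only and ignores replication; `RING` and the function `receivers` are names invented for this sketch.

```python
import random
from bisect import bisect_right

RING = 2**64  # illustrative ring size


def receivers(ring, removed):
    """Nodes that take over at least one primary range of `removed`."""
    survivors = sorted(pair for pair in ring if pair[1] != removed)
    tokens = [token for token, _ in survivors]
    gained = set()
    for token, node in ring:
        if node == removed:
            # The range passes to the next surviving token, clockwise.
            nxt = bisect_right(tokens, token) % len(survivors)
            gained.add(survivors[nxt][1])
    return gained


names = ["n0", "n1", "n2", "n3", "n4"]
single = [(i * RING // 5, name) for i, name in enumerate(names)]
rng = random.Random(7)
vnodes = [(rng.randrange(RING), name) for name in names for _ in range(256)]

print("single token:", sorted(receivers(single, "n0")))
print("256 vnodes:  ", sorted(receivers(vnodes, "n0")))
```

With replication, more nodes are involved in the single-token case (the earlier slide gives at most 2 * RF - 1), but the contrast with the vnode ring remains.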
  • 23. nodetool ring becomes useless...
        $ nodetool ring
        Datacenter: datacenter1
        ==========
        Address        Rack   Status  State   Load      Owns    Token
                                                                908086307850
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -92138833317
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -91449505236
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -89961709812
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -89833237466
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -89829145910
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -88349645925
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -87940053784
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -87315744643
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -86833403935
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -86172092729
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -85207698040
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -85134888150
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -85110178049
        192.168.100.2  rack1  Up      Normal  48.18 KB  19.46%  -84965473082
  • 24. nodetool status
        $ nodetool status
        Datacenter: datacenter1
        =======================
        Status=Up/Down
        |/ State=Normal/Leaving/Joining/Moving
        --  Address        Load      Tokens  Owns   Host ID
        UN  192.168.100.2  48.18 KB  256     19.5%  bb84e34e-b929-41c2-a5
        UN  192.168.100.3  48.21 KB  256     19.9%  50e8c1b1-a28f-431a-85
        UN  192.168.100.1  46.19 KB  256     21.3%  67bdd989-1b34-4bbd-a5
        UN  192.168.100.5  48.13 KB  256     18.9%  1a88e040-84fd-4461-80
        UN  192.168.100.4  48.15 KB  256     20.5%  3be40484-8225-467c-82
  • 25. Heterogeneity
      • Fewer tokens, less data!
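"Fewer tokens, less data" can be checked with a quick simulation: give one node half as many random tokens as the others and compare ring ownership. This is a sketch with assumed names (`RING`, `ownership`, the node labels) and an arbitrary ring size, modelling primary ranges only.

```python
import random
from collections import Counter

RING = 2**64  # illustrative ring size


def ownership(tokens_per_node, seed=1):
    """Share of the ring owned by each node, given {node: number of tokens}."""
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING), node)
        for node, count in tokens_per_node.items()
        for _ in range(count)
    )
    owned = Counter()
    prev = ring[-1][0] - RING  # the first range wraps around the ring
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return {node: size / RING for node, size in owned.items()}


shares = ownership({"big-1": 256, "big-2": 256, "big-3": 256, "small": 128})
for node, share in sorted(shares.items()):
    print(f"{node}: {share:.1%}")
```

The node with 128 tokens ends up with roughly half the share of a 256-token node, so a weaker machine can be given a smaller num_tokens value.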
  • 26. Modeled with simulated token assignment
      • [Figure: histogram of range sizes. x-axis: range size (arbitrary units); y-axis: frequency; the mean range size is marked.]
  • 27. How does this lead to balanced load?!
      • Each host has the same distribution of range sizes
      • So will assume roughly equal portions of the key-space
      • Modelled with simulated data inserted into ranges...
  • 28. [Figure: normalised data load per virtual node. x-axis: virtual node (location in key-space); y-axis: normalised data load.]
  • 29. How balanced is balanced?
      • [Figure: histogram of load. x-axis: normalised load (arbitrary units); y-axis: frequency.]
  • 30. A balanced cluster
      • Keys are randomly distributed
      • V-node partition will assume the load proportional to its size
      • Load tends towards balance with increase in number of nodes
          • 2 nodes: 48.4% and 51.6%
          • 3 nodes: 34.3%, 33.0%, 32.7%
          • 4 nodes: 24.3%, 25.2%, 24.9%, 25.6%
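Percentages like the ones on this slide can be reproduced approximately with random token assignment. The sketch below is not the simulation used for the talk: the ring size, seed, and function name `shares` are assumptions, and the exact figures will differ from the slide.

```python
import random

RING = 2**64  # illustrative ring size


def shares(node_count, tokens_per_node=256, seed=0):
    """Ring share per node when every node picks random tokens."""
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING), node)
        for node in range(node_count)
        for _ in range(tokens_per_node)
    )
    owned = [0] * node_count
    prev = ring[-1][0] - RING  # the first range wraps around the ring
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return [size / RING for size in owned]


for n in (2, 3, 4):
    print(n, "nodes:", ", ".join(f"{share:.1%}" for share in shares(n)))
```

Each node lands within a few percentage points of 1/N, which matches the order of magnitude of the imbalance shown above.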
  • 31. Performance testing
      • 17-node EC2 m1.large cluster
      • Inserted 460 million keys at RF=3
      • Timed removenode and then bootstrap
      • Results at http://bit.ly/vnodesperf
  • 32. Performance testing
      • [Figure: bar chart of time in seconds (axis from 0 to 500) for removenode and bootstrap, Cassandra 1.2 versus Cassandra 1.1.]
  • 33. Migration path for a non-vnode cluster
      • Several techniques to migrate to vnodes
      • The “simplest” is to rebuild your cluster
          • With downtime: restore from backup
          • Without downtime: twice the hardware
      • “shuffle” is the proposed alternative
          • Migrate all nodes to vnodes, shuffle ranges
          • Very few success stories
  • 34. Conclusion
      • You should already be using virtual nodes!
      • Token management is a thing of the past
      • Embrace the randomness
      • Scale up and down without pain
  • 35. Thanks! @yowgi @acunu
  • 36. We’re hiring!
      • Acunu suggested and developed virtual nodes
      • Patches by @samoverton and @jericevans
      • Eric Evans also contributed much of CQL
      • We are looking for developers to work on Apache Cassandra, contributing features and enhancements to the open-source project