Breaking Open Apache Geode: How It Works and Why

SpringOne Platform 2019
Session Title: Breaking Open Apache Geode: How It Works and Why
Speakers: Dan Smith, Principal Software Engineer, Pivotal
Youtube: https://youtu.be/qUs3ftvsEoU

Published in: Software

  1. 1. Apache Geode Summit 2019: Breaking Open Apache Geode - Dan Smith, Pivotal
  2. 2. What is Geode?
  3. 3. What is Geode? ● Distributed key-value store [Diagram: Client, Put (key, value), three Servers]
  4. 4. What is Geode? ● Distributed key-value store ● Highly available [Diagram: Client, Put (key, value), three Servers]
  5. 5. What is Geode? ● Distributed key-value store ● Highly available ● Low latency [Diagram: Client, Put (key, value), two Servers, "< 1ms", "Whoa!"]
  6. 6. What is Geode? ● Distributed key-value store ● Highly available ● Low latency ● Consistent and partition tolerant [Diagram: Client, Put (key, value), two Servers, "Oh, no! A network partition!"]
  7. 7. What is Geode? ● Two types of regions [Diagram: Client, Put (A); Replicated: three Servers, each holding A]
  8. 8. What is Geode? ● Two types of regions [Diagram: Client, Put (A); Replicated: three Servers, each holding A; Partitioned: three Servers with entries A, B, A]
  9. 9. What is Geode ● Keys and Values are Objects (Java, C++, C#, JSON) ● Has ○ Secondary Indexes & Querying ○ Continuous Queries ○ Transactions ○ Persistence ○ WAN replication ○ Event delivery ○ Parallel functions ○ ...
  10. 10. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics
  11. 11. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics [Callout: Partitioned Regions]
  12. 12. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics [Callout: Partitioned Regions - Partitioning & Routing - High Availability - Consistency - Recovery and Rebalancing]
  13. 13. Partitioned Regions ● A partitioned region is divided into buckets [Diagram: Put ("Marie Tharp", value); Bucket 0, Bucket 1, Bucket 2, Bucket 3, Bucket N] hash = "Marie Tharp".hashCode(); bucket = hash % num_buckets
  14. 14. Partitioned Regions ● Buckets are mapped to servers [Diagram: Put ("Marie Tharp", value); Servers 1, 2, 3 hosting Bucket 0, Bucket 1, Bucket 2, Bucket 3, Bucket N] hash = "Marie Tharp".hashCode(); bucket = hash % num_buckets
  15. 15. The End
  16. 16. What about? ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  17. 17. Placing Buckets ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  18. 18. Be Lazy
  19. 19. Partitioned Regions - Lazy Creation [Diagram: Client with Hash Function and Routing Table (empty); Put (key, value); "Put in Bucket 2"; Proxy; Servers 1, 2, 3]
  20. 20. Partitioned Regions - Lazy Creation [Diagram: Client with Hash Function and Routing Table (empty); Put (key, value); Proxy; "Create Bucket!"; Servers 1, 2, 3; Bucket 2 (key=value)]
  21. 21. Partitioned Regions - Lazy Discovery [Diagram: Client with Routing Table (empty); Proxy; Servers 1, 2, 3; Bucket 2 (key=value); "Reply - Bucket Metadata Changed!"]
  22. 22. Partitioned Regions - Lazy Discovery [Diagram: Client with Routing Table; "Get Bucket Locations"; Proxy; Servers 1, 2, 3; Bucket 2 (key=value)]
  23. 23. Partitioned Regions - Lazy Discovery [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value); "Put in Bucket"; Servers 1, 2, 3; Bucket 2 (key=value)]
  24. 24. High Availability ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  25. 25. Duplicate Work
  26. 26. Partitioned Regions - High Availability [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value); "Put in Bucket"; Servers 1, 2, 3; one copy of Bucket 2 (key=value)]
  27. 27. Partitioned Regions - High Availability [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value); "Put in Bucket"; Servers 1, 2, 3; two copies of Bucket 2 (key=value)]
  28. 28. Partitioned Regions - Failover [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value); "Put in Bucket"; Servers 1, 2, 3; two copies of Bucket 2 (key=value)]
  29. 29. Consistency ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we add/remove servers?
  30. 30. Consistency - Ships Passing in the Night [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1, 2, 3; Bucket 2 copies holding key=value1 and key=value2]
  31. 31. Consistency - Ships Passing in the Night [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1, 2, 3; Bucket 2 copies holding key=value2 and key=value1]
  32. 32. Consistency ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  33. 33. Wait in Line
  34. 34. Consistency - Ships Passing in the Night [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1, 2, 3; Bucket 2 copies holding key=value2 and key=value1]
  35. 35. Consistency ● Operations on key serialized on primary [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1, 2, 3; both Bucket 2 copies holding key=value2]
  36. 36. Consistency - Lingering Operations [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value); "Put in Bucket"; Servers 1, 2, 3; two copies of Bucket 2 (key=value)]
  37. 37. Consistency - Lingering Operations [Diagram: Client with Hash Function and Routing Table (Bucket 2); Put (key, value1); Servers 2 and 3; Bucket 2 (key=value); Event Tracker (key, value, Event ID); old, lingering event (key, value, Event ID)]
  38. 38. Consistency - Network Partitions [Diagram: Client 1, Put (key, value1); Client 2; Servers 1 and 2; both Bucket 2 copies holding key=value2]
  39. 39. Consistency - Network Partitions [Diagram: Client 1, Put (key, value1); Client 2; Servers 1 and 2; both Bucket 2 copies holding key=value2]
  40. 40. Consistency - Network Partitions [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1 and 2; Bucket 2 copies holding key=value1 and key=value2]
  41. 41. Give Up
  42. 42. Consistency - Network Partitions [Diagram: Client 1, Put (key, value1); Client 2, Put (key, value2); Servers 1 and 2; both Bucket 2 copies holding key=value2]
  43. 43. Restoring Redundancy ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  44. 44. Tell Others What to Do
  45. 45. Partitioned Regions - Redundancy Recovery [Diagram: Servers 2, 3, 4, each with a Redundancy Provider; one copy of Bucket 2; "Start" on each provider]
  46. 46. Partitioned Regions - Redundancy Recovery [Diagram: Servers 2, 3, 4, each with a Redundancy Provider; one copy of Bucket 2; "Got a lock!"]
  47. 47. Partitioned Regions - Redundancy Recovery [Diagram: Servers 2, 3, 4, each with a Redundancy Provider; two copies of Bucket 2; "Make a copy!"; "Copy Bucket"]
  48. 48. Partitioned Regions - Redundancy Recovery [Diagram: Servers 2, 3, 4, each with a Redundancy Provider; two copies of Bucket 2; "Nothing to Do"]
  49. 49. Partitioned Regions - Redundancy Recovery [Diagram: Servers 2, 3, 4, each with a Redundancy Provider; two copies of Bucket 2; "Nothing to Do"]
  50. 50. Rebalancing ● How does data get to a bucket? ● How does geode handle failures? ● How does geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  51. 51. Be Greedy (Optimizer)
  52. 52. Rebalancing - What are we optimizing? ● Cost based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Servers 1, 2, 3; Buckets 1, 2, 3; values 60, 0, 0; Variance: 1600]
  53. 53. Rebalancing - What are we optimizing? ● Cost based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Servers 1, 2, 3; Buckets 1, 2, 3; values 45, 15, 0; Variance: 1050]
  54. 54. Rebalancing - What are we optimizing? ● Cost based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Servers 1, 2, 3; Buckets 1, 2, 3; values 30, 15, 15; Variance: 150]
  55. 55. Rebalancing - what does it do? Three Phases 1. Restore Redundancy 2. Optimize bucket distribution 3. Optimize primary distribution Membership changes start from phase 1 again.
  56. 56. Putting it Together ● Start with the simple idea: Hashing ● Using - Laziness, Duplication, Bossiness and Greed ● Get ○ High Availability ○ Low Latency ○ Consistency
  57. 57. Links ● Mailing List: dev-subscribe@geode.apache.org ● Internal Architecture: https://cwiki.apache.org/confluence/x/AolXAw
  58. 58. Q & A
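
Slides 13, 14, and 19 to 23 describe how a key is hashed to a bucket and how a client learns bucket locations lazily. That flow can be sketched in Python (used here for brevity; Geode itself is written in Java). This is a toy model under stated assumptions, not Geode's implementation: `java_string_hash` mimics Java's `String.hashCode()`, the names `bucket_for`, `Cluster`, and `Client` are hypothetical, the fewest-buckets placement rule is invented for illustration, and 113 is used as the bucket count because it is Geode's default `total-num-buckets`.

```python
def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode(): 32-bit signed, base-31 polynomial.

    Matches Java for characters in the Basic Multilingual Plane.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def bucket_for(key: str, num_buckets: int) -> int:
    # Python's % is never negative for a positive modulus, so a negative
    # hash still maps to a valid bucket id (Java's % would need extra care).
    return java_string_hash(key) % num_buckets


class Cluster:
    """Toy server side: a bucket is created only when first written to."""

    def __init__(self, servers, num_buckets=113):
        self.servers = list(servers)
        self.num_buckets = num_buckets
        self.bucket_owner = {}  # bucket id -> server name
        self.data = {}          # bucket id -> {key: value}

    def put(self, key, value) -> bool:
        bucket = bucket_for(key, self.num_buckets)
        created = bucket not in self.bucket_owner
        if created:
            # Hypothetical placement rule: the server hosting the fewest buckets.
            hosted = list(self.bucket_owner.values())
            self.bucket_owner[bucket] = min(self.servers, key=hosted.count)
            self.data[bucket] = {}
        self.data[bucket][key] = value
        return created  # True stands in for "bucket metadata changed"


class Client:
    """Toy client: the routing table starts empty and fills in on demand."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.routing_table = {}
        self.sent_to = []  # where each put was addressed, for inspection

    def put(self, key, value):
        bucket = bucket_for(key, self.cluster.num_buckets)
        self.sent_to.append(self.routing_table.get(bucket, "proxy"))
        if self.cluster.put(key, value) or bucket not in self.routing_table:
            # The reply signalled stale metadata: fetch bucket locations.
            self.routing_table = dict(self.cluster.bucket_owner)


cluster = Cluster(["server1", "server2", "server3"])
client = Client(cluster)
client.put("Marie Tharp", "value")   # routing table is empty: goes via a proxy
client.put("Marie Tharp", "value2")  # routing table is filled: goes to the owner
print(client.sent_to)
```

The first entry printed is `proxy` and the second is the server that owns the bucket, mirroring the "be lazy" idea: neither the bucket nor the client's routing entry exists until an operation needs it.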

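Slides 35 to 37 state that operations on a key are serialized on the primary and that an event tracker recognizes old, lingering events. A minimal Python sketch of those two ideas follows. It assumes an event ID can be modeled as a (client id, sequence number) pair and uses a single lock per bucket for brevity; the class and method names are hypothetical and are not Geode's internal classes.

```python
import threading


class BucketCopy:
    """One copy of a bucket: the data plus a tiny event tracker."""

    def __init__(self):
        self.data = {}
        self.last_seq = {}  # client id -> highest sequence number applied

    def apply(self, client_id, seq, key, value) -> bool:
        if seq <= self.last_seq.get(client_id, -1):
            return False  # old, lingering event: already applied, so ignore it
        self.last_seq[client_id] = seq
        self.data[key] = value
        return True


class PrimaryBucket(BucketCopy):
    """The primary applies each operation, then forwards it in the same order."""

    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = list(secondaries)
        self.lock = threading.Lock()  # assumption: one lock for the whole bucket

    def put(self, client_id, seq, key, value) -> bool:
        with self.lock:  # writers wait in line here
            applied = self.apply(client_id, seq, key, value)
            if applied:
                for copy in self.secondaries:
                    copy.apply(client_id, seq, key, value)
            return applied


secondary = BucketCopy()
primary = PrimaryBucket([secondary])
primary.put("client1", 1, "key", "value1")
primary.put("client2", 1, "key", "value2")
# A retry of client1's first put arrives late and is dropped.
primary.put("client1", 1, "key", "value1")
print(primary.data, secondary.data)
```

Because every write passes through the primary's lock before reaching the secondary, both copies end with the same value, and the late retry cannot overwrite the newer write.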
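
Slides 52 to 54 describe a greedy, cost-based rebalancer that maximizes the improvement in variance per byte moved. The Python sketch below assumes the "Variance" figure on the slides is the sum of squared deviations from the mean load, because that reproduces the 1050 and 150 shown for 45/15/0 and 30/15/15. It gives 2400 rather than the slide's 1600 for 60/0/0, so the exact cost function remains an assumption. Bucket sizes of 30, 15, and 15 are also assumed, since the slides show only per-server totals, and the function names are hypothetical.

```python
def imbalance(loads):
    """Sum of squared deviations from the mean load (assumed cost function)."""
    mean = sum(loads) / len(loads)
    return sum((x - mean) ** 2 for x in loads)


def greedy_rebalance(servers):
    """Move one bucket at a time, always taking the best gain per byte moved.

    `servers` maps a server name to a list of positive bucket sizes and is
    modified in place. Returns the moves as (source, destination, size).
    """
    moves = []
    while True:
        loads = {name: sum(sizes) for name, sizes in servers.items()}
        current = imbalance(list(loads.values()))
        best = None  # (gain per byte, source, destination, size)
        for src in servers:
            for size in sorted(set(servers[src])):
                for dst in servers:
                    if dst == src:
                        continue
                    trial = dict(loads)
                    trial[src] -= size
                    trial[dst] += size
                    gain = (current - imbalance(list(trial.values()))) / size
                    if gain > 1e-9 and (best is None or gain > best[0]):
                        best = (gain, src, dst, size)
        if best is None:
            return moves  # no single move improves the balance any further
        _, src, dst, size = best
        servers[src].remove(size)
        servers[dst].append(size)
        moves.append((src, dst, size))


servers = {"server1": [30, 15, 15], "server2": [], "server3": []}
for move in greedy_rebalance(servers):
    print(move)
print({name: sum(sizes) for name, sizes in servers.items()})
```

Starting from 60/0/0, moving a 15-unit bucket improves the cost by 90 per unit moved while moving the 30-unit bucket improves it by only 60, so the greedy step picks the smaller bucket first. The loads then pass through 45/15/0 and end at 30/15/15, the same sequence the slides show.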