Dynamic Reconfiguration of ZooKeeper

             Alex Shraer
    (presented by Benjamin Reed)
Why ZooKeeper?




• Lots of servers
• Lots of processes
• High volumes of data
• Highly complex software systems
• … mere mortal developers
What ZooKeeper gives you
●   Simple programming model
●   Coordination of distributed processes
●   Fast notification of changes
●   Elasticity
●   Easy setup
●   High availability
ZooKeeper Configuration

• Membership
• Role of each server
  – E.g., follower or observer
• Quorum System spec
  – ZooKeeper: majority or hierarchical
• Network addresses & ports
• Timeouts, directory paths, etc.
ZooKeeper - distributed and replicated
                                 ZooKeeper Service
                                    Leader

             Server     Server        Server            Server        Server




    Client   Client   Client     Client        Client        Client     Client   Client


• All servers store a copy of the data (in memory)
• A leader is elected at startup
• Reads served by followers, all updates go through leader
• Update acked when a quorum of servers have persisted the
  change (on disk)
• ZooKeeper uses Zab, its own atomic broadcast protocol
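As a rough illustration of the quorum rule on this slide, here is a minimal Python sketch that assumes the default majority quorum system. The function names are hypothetical and are not part of ZooKeeper.

```python
def majority_quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_acked(acks: set, ensemble: set) -> bool:
    """An update counts as acked once a majority of the ensemble persisted it."""
    return len(acks & ensemble) >= majority_quorum_size(len(ensemble))


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that may fail while a majority can still be formed."""
    return ensemble_size - majority_quorum_size(ensemble_size)
```

With five servers, three acks are needed and two failures can be tolerated; with three servers, two acks are needed and one failure can be tolerated.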
Dynamic Membership Changes
• Necessary in every long-lived system!
• Examples:
   – Cloud computing: adapt to changing load, don’t pre-allocate!
   – Failures: replacing failed nodes with healthy ones
   – Upgrades: replacing out-of-date nodes with up-to-date ones
   – Free up storage space: decreasing the number of replicas
   – Moving nodes: within the network or the data center
   – Increase resilience by changing the set of servers
  Example: asynchronous replication works as long as more than half of the servers (> #servers/2) are operating
Hazards of Manual Reconfiguration
                                     E
       A

                       C


        {A, B, C}

        B              {A, B, C}     D




           {A, B, C}


       • Goal: add servers E and D
Hazards of Manual Reconfiguration
                                           E
        A

                           C


       {A, B, C, D, E}                      {A, B, C, D, E}

         B               {A, B, C, D, E}   D




       {A, B, C, D, E}
                                           {A, B, C, D, E}


        •    Goal: add servers E and D
        •    Change Configuration
        •    Restart Servers
        •    Lost    and    !
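The hazard can be made concrete with a small sketch. Assuming majority quorums, the Python below enumerates minimal quorums of the old configuration {A, B, C} and the new configuration {A, B, C, D, E} and lists the pairs that share no server. The helper names are hypothetical; this is an illustration, not ZooKeeper code.

```python
from itertools import combinations


def majority_quorums(ensemble):
    """Yield every minimal majority quorum of the given ensemble."""
    size = len(ensemble) // 2 + 1
    for members in combinations(sorted(ensemble), size):
        yield frozenset(members)


def disjoint_quorum_pairs(old, new):
    """Return (old quorum, new quorum) pairs that have no server in common."""
    return [
        (q_old, q_new)
        for q_old in majority_quorums(old)
        for q_new in majority_quorums(new)
        if not q_old & q_new
    ]


old_config = {"A", "B", "C"}
new_config = {"A", "B", "C", "D", "E"}
pairs = disjoint_quorum_pairs(old_config, new_config)
```

One such pair is {A, B} and {C, D, E}: if servers restart at different times, an update acknowledged only by A and B under the old configuration may be unknown to a quorum formed by C, D and E under the new one.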

          Just use a coordination service!
     • ZooKeeper is the coordination service
        – Don’t want to deploy another system to coordinate it!


     • Who will reconfigure that system?
        – GFS has 3 levels of coordination services


     • More system components -> more management overhead


     • Use ZooKeeper to reconfigure itself!
        – Other systems store configuration information in ZooKeeper
        – Can we do the same?
        – Only if there are no failures
Recovery in ZooKeeper

                  C               E

                           B


setData(/x, 5)

                                  D
                      A
This doesn’t work for reconfigurations!
                                                                        E
                               C


                                                     B
                               {A, B, C, D, E}                          {A, B, C, D, E}


setData(/zookeeper/config, {A, B, F})
                                                      {A, B, C, D, E}   D
      remove C, D, E add F



             F
                                                                        {A, B, C, D, E}
                                        A




                                         {A, B, C, D, E}
This doesn’t work for reconfigurations!
                                                                          E
                               C


                                                        B
                               {A, B, C, D, E}                            {A, B, C, D, E}


setData(/zookeeper/config, {A, B, F})
                                                        {A, B, C, D, E}   D
      remove C, D, E add F



             F
                                                                          {A, B, C, D, E}
                                        A



 {A, B, F}
                                            {A, B, F}
This doesn’t work for reconfigurations!
                                                                              E
                                   C


                                                            B
                                   {A, B, C, D, E}                            {A, B, C, D, E}


    setData(/zookeeper/config, {A, B, F})
                                                            {A, B, C, D, E}   D
          remove C, D, E add F



                  F
                                                                              {A, B, C, D, E}
                                            A



      {A, B, F}
                                                {A, B, F}

•   Must persist the decision to reconfigure in the old
    config before activating the new config!
•   Once such a decision is reached, must not allow further
    ops to be committed in the old config
Our Solution
•   Correct
•   Fully automatic
•   No external services or additional components
•   Minimal changes to ZooKeeper
•   Usually unnoticeable to clients
    – Pause operations only in rare circumstances
    – Clients work with a single configuration
• Rebalances clients across servers in new configuration

• Reconfigures immediately

• Speculative Reconfiguration
    – Reconfiguration (and commands that follow it) speculatively sent out by the
      primary, similarly to all other updates
Principles
●   Commit reconfig in a quorum of the old ensemble
    –   Submit reconfig op just like any other update
●   Make sure new ensemble has latest state before
    becoming active
    –   Get quorum of synced followers from new config
    –   Get acks from both old and new ensembles before committing
        updates proposed between reconfig op and activation
    –   Activate new configuration when reconfig commits
●   Once the new ensemble is active, the old ensemble cannot
    commit or propose new updates
●   Gossip activation through leader election and syncing
●   Verify configuration id of leader and follower
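The double-quorum rule in these principles can be sketched as follows. This is a simplified Python illustration that assumes majority quorums; `has_majority` and `can_commit_during_reconfig` are hypothetical names, not ZooKeeper functions, and the sketch ignores syncing and leader election.

```python
def has_majority(acks, ensemble):
    """True if the acking servers include a strict majority of the ensemble."""
    return len(set(acks) & set(ensemble)) > len(ensemble) // 2


def can_commit_during_reconfig(acks, old, new):
    """Between the reconfig op and activation, an update needs acks from a
    quorum of the old ensemble and a quorum of the new ensemble."""
    return has_majority(acks, old) and has_majority(acks, new)


old = {"A", "B", "C"}
new = {"A", "B", "C", "D", "E"}
```

For example, acks from {A, B} satisfy the old ensemble but not the new one, acks from {C, D, E} satisfy the new ensemble but not the old one, and acks from {A, B, D} satisfy both.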
Failure free flow
Reconfiguration scenario 1
                                 E
   A

                   C


    {A, B, C}                        {A, B, C}

    B              {A, B, C}     D




       {A, B, C}
                                      {A, B, C}


   • Goal: add servers E and D
Reconfiguration scenario 1
                               E
   A

                   C


    {A, B, C}

    B              {A, B, C}   D




       {A, B, C}


   • Goal: add servers E and D
   •    doesn't commit until quorums of
     both ensembles ack
Reconfiguration scenario 1
                               E
   A

                   C


    {A, B, C}                      {A, B, C}

    B              {A, B, C}   D




       {A, B, C}
                                    {A, B, C}


   • Goal: add servers E and D
   •    doesn't commit until quorums of
     both ensembles ack
Reconfiguration scenario 1
                                 E
    A

                     C


   {A, B, C, D, E}               {A, B, C, D, E}

     B               {A, B, C}   D




   {A, B, C, D, E}
                                  {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
Reconfiguration scenario 1
                                 E
    A

                     C


   {A, B, C, D, E}               {A, B, C, D, E}

     B               {A, B, C}   D




   {A, B, C, D, E}
                                  {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
    • E and D gossip new configuration
      to C
Reconfiguration scenario 1
                                       E
    A

                       C


   {A, B, C, D, E}                     {A, B, C, D, E}

     B               {A, B, C, D, E}   D




   {A, B, C, D, E}
                                        {A, B, C, D, E}


    • Goal: add servers E and D
    •    doesn't commit until quorums of
      both ensembles ack
    • E and D gossip new configuration
      to C
Example - reconfig using CLI
reconfig -add 1=host1.com:1234:1235:observer;1239
         -add 2=host2.com:1236:1237:follower;1231 -remove 5
●   Change follower 1 to an observer and change its ports
●   Add follower 2 to the ensemble
●   Remove follower 5 from the ensemble

reconfig -file myNewConfig.txt -v 234547
●   Change the current config to the one in myNewConfig.txt
●   But only if current config version is 234547

getConfig -w -c
●   Set a watch on /zookeeper/config
●   -c means we only want the new connection string for clients
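To make the server specification strings in the commands above easier to read, here is a small Python sketch that splits one into its parts. The field names (`quorum_port`, `election_port`, `client_port`) are an assumption based on ZooKeeper's dynamic configuration format; the slide itself does not label the ports, and `ServerSpec` and `parse_server_spec` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    server_id: int
    host: str
    quorum_port: int    # assumed meaning of the first port
    election_port: int  # assumed meaning of the second port
    role: str           # e.g. "follower" or "observer"
    client_port: int    # the port after the semicolon


def parse_server_spec(spec: str) -> ServerSpec:
    """Parse a string such as '1=host1.com:1234:1235:observer;1239'."""
    server_id, rest = spec.split("=", 1)
    address, client_port = rest.split(";", 1)
    host, quorum_port, election_port, role = address.split(":")
    return ServerSpec(
        int(server_id), host, int(quorum_port), int(election_port),
        role, int(client_port),
    )
```

The sketch only handles the exact shape used on this slide, with the role always present and the client port given as a bare number.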
When it will not work
●   A quorum of the new ensemble is not in sync
●   Another reconfig is in progress
●   The version condition check fails
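The three rejection conditions can be modelled with a toy in-memory ensemble. This Python sketch is purely illustrative: `ToyEnsemble` and `ReconfigRejected` are hypothetical, the version is treated as a simple counter (an assumption made for brevity), and majority quorums are assumed.

```python
class ReconfigRejected(Exception):
    """Raised when the toy ensemble refuses a reconfiguration request."""


class ToyEnsemble:
    def __init__(self, members, version):
        self.members = set(members)
        self.version = version
        self.reconfig_in_progress = False

    def reconfig(self, new_members, synced, expected_version=None):
        new_members = set(new_members)
        if self.reconfig_in_progress:
            raise ReconfigRejected("another reconfig is in progress")
        if expected_version is not None and expected_version != self.version:
            raise ReconfigRejected("version condition check failed")
        # Require a strict majority of the new ensemble to be in sync.
        if len(set(synced) & new_members) <= len(new_members) // 2:
            raise ReconfigRejected("no quorum of the new ensemble is in sync")
        self.members = new_members
        self.version += 1  # assumption: a counter stands in for the real version
        return self.version
```

A caller that passes `expected_version` gets the conditioned behaviour of `reconfig -v`; leaving it as `None` corresponds to a blind reconfiguration.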
How do you know you are done
●   Write something somewhere
The “client side” of reconfiguration
• When system changes, clients need to stay connected
   – The usual solution: directory service (e.g., DNS)
• Re-balancing load during reconfiguration is also important!
• Goal: uniform #clients per server with minimal client migration
   – Migration should be proportional to change in membership




                  X 10   X 10   X 10
Our approach - Probabilistic Load Balancing
• Example 1 :


                X 10   X 10   X 10
Our approach - Probabilistic Load Balancing
• Example 1:
                         X6         X6       X6           X6      X6
    –   Each client moves to a random new server with probability 1 – 3/5 = 0.4
    –   In expectation, 40% of clients will move off of each server
• Example 2:
                                                   4/18        4/18    10/18
                         X6        X6       X6            X6      X6
    –   Connected clients don’t move
    –   Disconnected clients move to each old server with probability 4/18 and to the
        new one with probability 10/18
    –   In expectation, 8 clients will move from A, B, C to D, E and 10 to F
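The numbers in both examples can be reproduced with a short calculation. The Python sketch below is an interpretation, not the authors' algorithm: it assumes that in Example 2 a disconnected client picks a target with probability proportional to how far that target is below the uniform load, which matches the 4/18 and 10/18 figures on the slide. Function names are hypothetical.

```python
from fractions import Fraction


def scale_out_move_probability(old_count, new_count):
    """Probability that a client leaves its server when servers are only added."""
    return 1 - Fraction(old_count, new_count)


def reconnect_probabilities(load_on_survivors, added, total_clients):
    """For clients whose server was removed: probability of picking each
    remaining server, proportional to its deficit from the uniform load.
    Assumes no surviving server is already above the uniform load."""
    targets = list(load_on_survivors) + list(added)
    uniform = Fraction(total_clients, len(targets))
    deficit = {s: uniform - load_on_survivors.get(s, 0) for s in targets}
    total = sum(deficit.values())
    return {s: d / total for s, d in deficit.items()}


# Example 1: 3 servers with 10 clients each grow to 5 servers.
p_move = scale_out_move_probability(3, 5)

# Example 2: A, B, C are removed, D and E keep 6 clients each, F is added.
p_target = reconnect_probabilities({"D": 6, "E": 6}, ["F"], total_clients=30)
```

In Example 1 each of the three original servers is expected to lose 10 × 0.4 = 4 clients, leaving 6 on every server. In Example 2 the 18 disconnected clients are expected to split as 4 to D, 4 to E and 10 to F.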
Current Load Balancing
Probabilistic Load Balancing
 When moving from config. S to S’:
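 \[
 E(\mathrm{load}(i, S')) = \mathrm{load}(i, S)
   + \sum_{j \in S \wedge j \neq i} \mathrm{load}(j, S) \cdot \Pr(j \to i)
   - \mathrm{load}(i, S) \sum_{j \in S' \wedge j \neq i} \Pr(i \to j)
 \]
 where
   – E(load(i, S’)) is the expected #clients connected to i in S’ (10 in last example)
   – load(i, S) is the #clients connected to i in S
   – the first sum is the #clients moving to i from other servers in S
   – the last term is the #clients moving from i to other servers in S’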
 Solving for Pr we get case-specific probabilities.
 Input: each client answers locally
 Question 1: Are there more servers now, or fewer?
 Question 2: Is my server being removed?
 Output: 1) disconnect or stay connected to my server
          if disconnect 2) Pr(connect to one of the old servers)
                 and Pr(connect to newly added server)
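As a check on the expected-load formula, the following Python sketch evaluates it for the earlier example in which A, B and C are removed, D and E stay, and F is added. The function `expected_load` is a hypothetical helper that transcribes the formula term by term; the probabilities are the 4/18 and 10/18 values from that example.

```python
from fractions import Fraction


def expected_load(i, load, pr, old, new):
    """E(load(i, S')) = load(i, S) + incoming from S - outgoing to S'."""
    current = load.get(i, 0)
    incoming = sum(load[j] * pr.get((j, i), 0) for j in old if j != i)
    outgoing = current * sum(pr.get((i, j), 0) for j in new if j != i)
    return current + incoming - outgoing


old = ["A", "B", "C", "D", "E"]           # configuration S
new = ["D", "E", "F"]                     # configuration S'
load = {server: 6 for server in old}      # 6 clients per server in S

# Pr(j -> i): only clients of removed servers move.
pr = {}
for removed in ["A", "B", "C"]:
    pr[(removed, "D")] = Fraction(4, 18)
    pr[(removed, "E")] = Fraction(4, 18)
    pr[(removed, "F")] = Fraction(10, 18)

expected = {server: expected_load(server, load, pr, old, new)
            for server in ["A", "D", "E", "F"]}
```

Each server in S’ ends up with an expected load of 10 clients, and a removed server such as A ends up with 0, which agrees with the "10 in last example" annotation.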
Implementation
• Implemented in ZooKeeper (Java & C), integration ongoing
   – 3 new ZooKeeper API calls: reconfig, getConfig, updateServerList
   – Feature requested since 2008, expected in 3.5.0 release (July 2012)
• Dynamic changes to:
   –   Membership
   –   Quorum System
   –   Server roles
   –   Addresses & ports
• Reconfiguration modes:
   – Incremental (add servers E and D, remove server B)
   – Non-incremental (new config = {A, C, D, E})
   – Blind or conditioned (reconfig only if current config is #5)
• Subscriptions to config changes
   – Client can invoke client-side re-balancing upon change

                                      Summary
     • Design and implementation of reconfiguration for Apache ZooKeeper
        – being contributed into ZooKeeper codebase


     • Much simpler than state of the art, using properties already provided by ZooKeeper

     • Many nice features:
        – Doesn’t limit concurrency
        – Reconfigures immediately
        – Preserves primary order
        – Doesn’t stop client ops
        – ZooKeeper is used by online systems, so any delay must be avoided
        – Clients work with a single configuration at a time
        – No external services
        – Includes client-side rebalancing

More Related Content

What's hot

Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best PracticesOracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
Markus Michalewicz
 
New lessons in connection management
New lessons in connection managementNew lessons in connection management
New lessons in connection management
Toon Koppelaars
 
Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3
SANG WON PARK
 
Understanding and Improving Code Generation
Understanding and Improving Code GenerationUnderstanding and Improving Code Generation
Understanding and Improving Code Generation
Databricks
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
ScyllaDB
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
confluent
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Altinity Ltd
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
ScaleGrid.io
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
Saurav Haloi
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Databricks
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
Ververica
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Altinity Ltd
 
Logical replication with pglogical
Logical replication with pglogicalLogical replication with pglogical
Logical replication with pglogical
Umair Shahid
 
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
Amy W. Tang
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
chrislusf
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestrator
YoungHeon (Roy) Kim
 

What's hot (20)

Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best PracticesOracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
Oracle Real Application Clusters (RAC) 12c Rel. 2 - Operational Best Practices
 
New lessons in connection management
New lessons in connection managementNew lessons in connection management
New lessons in connection management
 
Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3Apache kafka performance(latency)_benchmark_v0.3
Apache kafka performance(latency)_benchmark_v0.3
 
Understanding and Improving Code Generation
Understanding and Improving Code GenerationUnderstanding and Improving Code Generation
Understanding and Improving Code Generation
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
Introduction to the Mysteries of ClickHouse Replication, By Robert Hodges and...
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 
Logical replication with pglogical
Logical replication with pglogicalLogical replication with pglogical
Logical replication with pglogical
 
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
Espresso: LinkedIn's Distributed Data Serving Platform (Talk)
 
SeaweedFS introduction
SeaweedFS introductionSeaweedFS introduction
SeaweedFS introduction
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestrator
 

Similar to Dynamic Reconfiguration of Apache ZooKeeper

Understanding histogramppt.prn
Understanding histogramppt.prnUnderstanding histogramppt.prn
Understanding histogramppt.prn
Leyi (Kamus) Zhang
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
MLconf
 
NYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeNYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKee
Rizwan Habib
 
Graph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraphGraph analysis platform comparison, pregel/goldenorb/giraph
Graph analysis platform comparison, pregel/goldenorb/giraph
Andrew Yongjoon Kong
 
Presentation v mware roi tco calculator
Presentation   v mware roi tco calculatorPresentation   v mware roi tco calculator
Presentation v mware roi tco calculator
solarisyourep
 
Dynamic Reconfiguration of Apache ZooKeeper

  • 1. Dynamic Reconfiguration of ZooKeeper Alex Shraer (presented by Benjamin Reed)
  • 2. Why ZooKeeper? • Lots of servers • Lots of processes • High volumes of data • Highly complex software systems • … mere mortal developers
  • 3. What ZooKeeper gives you ● Simple programming model ● Coordination of distributed processes ● Fast notification of changes ● Elasticity ● Easy setup ● High availability
  • 4. ZooKeeper Configuration • Membership • Role of each server – E.g., follower or observer • Quorum System spec – ZooKeeper: majority or hierarchical • Network addresses & ports • Timeouts, directory paths, etc.
  • 5. ZooKeeper - distributed and replicated [Diagram: a ZooKeeper Service of five servers (one labeled Leader) serving eight clients] • All servers store a copy of the data (in memory) • A leader is elected at startup • Reads served by followers, all updates go through leader • Update acked when a quorum of servers have persisted the change (on disk) • ZooKeeper uses ZAB - its own atomic broadcast protocol
  • 6. Dynamic Membership Changes • Necessary in every long-lived system! • Examples: – Cloud computing: adapt to changing load, don’t pre-allocate! – Failures: replacing failed nodes with healthy ones – Upgrades: replacing out-of-date nodes with up-to-date ones – Free up storage space: decreasing the number of replicas – Moving nodes: within the network or the data center – Increase resilience by changing the set of servers Example: asynch. replication works as long as > #servers/2 operate:
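The "> #servers/2" rule on this slide can be made concrete with a few lines of arithmetic. The following is a minimal sketch, not ZooKeeper code; the function names `majority` and `tolerated_failures` are made up for illustration, and only the majority quorum system is modeled (not the hierarchical one).

```python
def majority(n_servers: int) -> int:
    """Smallest group size that is strictly more than half of the ensemble."""
    return n_servers // 2 + 1


def tolerated_failures(n_servers: int) -> int:
    """Number of servers that may fail while a majority still operates."""
    return n_servers - majority(n_servers)


if __name__ == "__main__":
    # Growing the ensemble from 3 to 5 servers raises the number of
    # failures the service can survive, which is one reason to reconfigure.
    for n in (3, 4, 5, 7):
        print(f"{n} servers: quorum={majority(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Note that an even-sized ensemble tolerates no more failures than the next smaller odd size, which is why changing the set of servers is usually done in steps that end on an odd count.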
  • 12. Hazards of Manual Reconfiguration E A C {A, B, C} B {A, B, C} D {A, B, C} • Goal: add servers E and D
  • 13. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration
  • 14. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers
  • 17. Hazards of Manual Reconfiguration E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • Change Configuration • Restart Servers • Lost and !
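The hazard in the slides above can be made concrete by enumerating majority quorums. The following is a minimal sketch, assuming plain majority quorums and using hypothetical helper names (`majority_quorums`, `disjoint_quorum_pairs`) that are not part of ZooKeeper. It lists the pairs of quorums, one from {A, B, C} and one from {A, B, C, D, E}, that share no server. If servers run with different configurations during a manual restart, such a pair could act independently.

```python
from itertools import combinations


def majority_quorums(servers):
    """Return every minimal majority quorum of the given server set."""
    size = len(servers) // 2 + 1
    return [frozenset(q) for q in combinations(sorted(servers), size)]


def disjoint_quorum_pairs(old, new):
    """Return (old quorum, new quorum) pairs that have no server in common."""
    return [
        (q_old, q_new)
        for q_old in majority_quorums(old)
        for q_new in majority_quorums(new)
        if not (q_old & q_new)
    ]


old_config = {"A", "B", "C"}
new_config = {"A", "B", "C", "D", "E"}

pairs = disjoint_quorum_pairs(old_config, new_config)
for q_old, q_new in pairs:
    print(sorted(q_old), "does not intersect", sorted(q_new))
```

For example, {A, B} is a majority of the old configuration and {C, D, E} is a majority of the new one, so an update acknowledged only by A and B may be unknown to a leader elected by C, D, and E.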
  • 18. Just use a coordination service! • Zookeeper is the coordination service – Don’t want to deploy another system to coordinate it! • Who will reconfigure that system? – GFS has 3 levels of coordination services • More system components -> more management overhead • Use Zookeeper to reconfigure itself! – Other systems store configuration information in Zookeeper – Can we do the same? – Only if there are no failures
  • 19. Recovery in Zookeeper C E B setData(/x, 5) D A
  • 23. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, C, D, E}
  • 24. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, F} {A, B, F}
  • 25. This doesn’t work for reconfigurations! E C B {A, B, C, D, E} {A, B, C, D, E} setData(/zookeeper/config, {A, B, F}) {A, B, C, D, E} D remove C, D, E add F F {A, B, C, D, E} A {A, B, F} {A, B, F} • Must persist the decision to reconfigure in the old config before activating the new config! • Once such decision is reached, must not allow further ops to be committed in old config
  • 26. Our Solution • Correct • Fully automatic • No external services or additional components • Minimal changes to Zookeeper • Usually unnoticeable to clients – Pause operations only in rare circumstances – Clients work with a single configuration • Rebalances clients across servers in new configuration • Reconfigures immediately • Speculative Reconfiguration – Reconfiguration (and commands that follow it) speculatively sent out by the primary, similarly to all other updates
  • 27. Principles ● Commit reconfig in a quorum of the old ensemble – Submit reconfig op just like any other update ● Make sure new ensemble has latest state before becoming active – Get quorum of synced followers from new config – Get acks from both old and new ensembles before committing updates proposed between reconfig op and activation – Activate new configuration when reconfig commits ● Once new ensemble active old ensemble cannot commit or propose new updates ● Gossip activation through leader election and syncing ● Verify configuration id of leader and follower
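The acknowledgement rule in these principles can be sketched as a small predicate. This is an illustration under stated assumptions, not ZooKeeper's implementation: it assumes majority quorums, and the function names `has_majority` and `can_commit` are hypothetical.

```python
def has_majority(acks, ensemble):
    """True if the acknowledging servers include a majority of the ensemble."""
    return len(set(acks) & set(ensemble)) > len(ensemble) // 2


def can_commit(acks, old_ensemble, new_ensemble=None):
    """Commit rule sketched from the slide.

    new_ensemble is None when no reconfiguration is pending. Otherwise the
    update was proposed between the reconfig op and activation, so it needs
    acks from a quorum of the old ensemble and a quorum of the new one.
    """
    if not has_majority(acks, old_ensemble):
        return False
    return new_ensemble is None or has_majority(acks, new_ensemble)


old = ["A", "B", "C"]
new = ["A", "B", "C", "D", "E"]

print(can_commit({"A", "B"}, old))            # quorum of old only, no reconfig pending
print(can_commit({"A", "B"}, old, new))       # 2 of 5 is not a quorum of new
print(can_commit({"A", "B", "D"}, old, new))  # quorum of both ensembles
```

Requiring both quorums during the transition is what keeps a quorum of the new ensemble from being formed without the latest state.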
  • 29. Reconfiguration scenario 1 E A C {A, B, C} {A, B, C} B {A, B, C} D {A, B, C} {A, B, C} • Goal: add servers E and D
  • 30. Reconfiguration scenario 1 E A C {A, B, C} B {A, B, C} D {A, B, C} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 31. Reconfiguration scenario 1 E A C {A, B, C} {A, B, C} B {A, B, C} D {A, B, C} {A, B, C} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 33. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack
  • 34. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack • E and D gossip new configuration to C
  • 35. Reconfiguration scenario 1 E A C {A, B, C, D, E} {A, B, C, D, E} B {A, B, C, D, E} D {A, B, C, D, E} {A, B, C, D, E} • Goal: add servers E and D • doesn't commit until quorums of both ensembles ack • E and D gossip new configuration to C
  • 36. Example - reconfig using CLI
      reconfig -add 1=host1.com:1234:1235:observer;1239 -add 2=host2.com:1236:1237:follower;1231 -remove 5
        ● Change follower 1 to an observer and change its ports
        ● Add follower 2 to the ensemble
        ● Remove follower 5 from the ensemble
      reconfig -file myNewConfig.txt -v 234547
        ● Change the current config to the one in myNewConfig.txt
        ● But only if current config version is 234547
      getConfig -w -c
        ● Set a watch on /zookeeper/config
        ● -c means we only want the new connection string for clients
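To make the layout of the `-add` argument easier to read, here is a minimal sketch that splits a server spec such as `1=host1.com:1234:1235:observer;1239` into its parts. The parser and the field names (`quorum_port`, `election_port`, `client_port`) are labels assumed for this illustration. They are not taken from the slide, and the sketch only handles the exact shape shown in the example.

```python
from typing import NamedTuple


class ServerSpec(NamedTuple):
    server_id: int
    host: str
    quorum_port: int    # assumed label for the first port
    election_port: int  # assumed label for the second port
    role: str           # "follower" or "observer" in the slide's example
    client_port: int    # the value after the semicolon


def parse_server_spec(spec: str) -> ServerSpec:
    """Parse 'id=host:port1:port2:role;clientPort' as used in the slide."""
    server_id, rest = spec.split("=", 1)
    address, client_port = rest.split(";", 1)
    host, quorum_port, election_port, role = address.split(":")
    return ServerSpec(
        int(server_id), host, int(quorum_port), int(election_port),
        role, int(client_port),
    )


spec = parse_server_spec("1=host1.com:1234:1235:observer;1239")
print(spec)
```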
  • 37. When it will not work ● Quorum of new ensemble must be in sync ● Another reconfig in progress ● Version condition check fails
  • 38. How do you know you are done ● Write something somewhere
  • 39. The “client side” of reconfiguration • When system changes, clients need to stay connected – The usual solution: directory service (e.g., DNS) • Re-balancing load during reconfiguration is also important! • Goal: uniform #clients per server with minimal client migration – Migration should be proportional to change in membership X 10 X 10 X 10
  • 41. Our approach - Probabilistic Load Balancing • Example 1 : X 10 X 10 X 10
  • 43. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server
  • 44. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : X6 X6 X6 X6 X6
  • 46. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : 4/18 4/18 10/18 X6 X6 X6 X6 X6 – Connected clients don’t move – Disconnected clients move to old servers with prob 4/18 and new one with prob 10/18 – Exp. 8 clients will move from A, B, C to D, E and 10 to F
  • 47. Our approach - Probabilistic Load Balancing • Example 1 : X6 X6 X6 X6 X6 – Each client moves to a random new server with probability 0.4 – 1 – 3/5 = 0.4 – Exp. 40% clients will move off of each server ● Example 2 : 4/18 4/18 10/18 X 10 X 10 X 10 – Connected clients don’t move – Disconnected clients move to old servers with prob 4/18 and new one with prob 10/18 – Exp. 8 clients will move from A, B, C to D, E and 10 to F
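The probabilities in Example 2 can be reproduced with a short calculation. This sketch assumes the reading that servers A, B, C (6 clients each) are removed, D and E stay, and F is added. It also assumes that no surviving server is already above the new per-server target. The function name `reconnect_probabilities` is hypothetical.

```python
from fractions import Fraction


def reconnect_probabilities(current_load, new_servers):
    """Probability that a disconnected client picks each server in the new config.

    current_load maps each old server to its client count. Each new-config
    server is weighted by how many clients it still needs to reach the
    uniform target.
    """
    total = sum(current_load.values())
    target = Fraction(total, len(new_servers))
    displaced = sum(n for s, n in current_load.items() if s not in new_servers)
    return {
        s: (target - current_load.get(s, 0)) / displaced
        for s in new_servers
    }


load = {"A": 6, "B": 6, "C": 6, "D": 6, "E": 6}
probs = reconnect_probabilities(load, ["D", "E", "F"])

displaced_clients = 18  # the clients of A, B and C
for server, p in probs.items():
    print(server, p, "expected arrivals:", p * displaced_clients)
```

The fractions are printed in lowest terms, so 4/18 appears as 2/9 and 10/18 as 5/9. The expected arrivals are 4 clients at each of D and E and 10 at F, which matches the slide.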
  • 49. Probabilistic Load Balancing When moving from config. S to S’:
    $$E(\mathrm{load}(i, S')) = \mathrm{load}(i, S) + \sum_{j \in S \,\wedge\, j \neq i} \mathrm{load}(j, S) \cdot \Pr(j \to i) \;-\; \mathrm{load}(i, S) \sum_{j \in S' \,\wedge\, j \neq i} \Pr(i \to j)$$
    – E(load(i, S’)): expected #clients connected to i in S’ (10 in last example)
    – load(i, S): #clients connected to i in S
    – Second term: #clients moving to i from other servers in S
    – Third term: #clients moving from i to other servers in S’
    Solving for Pr we get case-specific probabilities. Input: each client answers locally Question 1: Are there more servers now or fewer? Question 2: Is my server being removed? Output: 1) disconnect or stay connected to my server; if disconnect, 2) Pr(connect to one of the old servers) and Pr(connect to newly added server)
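As a sanity check, the expectation equation on this slide can be evaluated for Example 1 (three servers with 10 clients each, growing to five servers). The sketch below assumes that a moving client picks one of the newly added servers uniformly at random. The function and variable names are hypothetical.

```python
from fractions import Fraction


def expected_load(i, load, old, new, pr):
    """Evaluate E(load(i, S')) from the slide's equation.

    pr(j, i) is the probability that a client connected to j moves to i.
    """
    current = load.get(i, 0)
    inflow = sum(load[j] * pr(j, i) for j in old if j != i)
    outflow = current * sum(pr(i, j) for j in new if j != i)
    return current + inflow - outflow


old = ["A", "B", "C"]
new = ["A", "B", "C", "D", "E"]
load = {s: 10 for s in old}
added = [s for s in new if s not in old]

# Each client leaves its server with probability 1 - 3/5 = 2/5.
p_move = 1 - Fraction(len(old), len(new))


def pr(src, dst):
    # Assumption: movers go only to newly added servers, chosen uniformly.
    if src in old and dst in added:
        return p_move / len(added)
    return Fraction(0)


expected = {s: expected_load(s, load, old, new, pr) for s in new}
print(expected)
```

Every server ends with an expected load of 6 clients, the uniform value shown for Example 1.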
  • 50. Implementation • Implemented in Zookeeper (Java & C), integration ongoing – 3 new Zookeeper API calls: reconfig, getConfig, updateServerList – Feature requested since 2008, expected in 3.5.0 release (July 2012) • Dynamic changes to: – Membership – Quorum System – Server roles – Addresses & ports • Reconfiguration modes: – Incremental (add servers E and D, remove server B) – Non-incremental (new config = {A, C, D, E}) – Blind or conditioned (reconfig only if current config is #5) • Subscriptions to config changes – Client can invoke client-side re-balancing upon change
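The reconfiguration modes listed above can be illustrated with set operations. This is a toy model, not the ZooKeeper API: the `reconfigure` function, the `ConfigVersionMismatch` exception, and the rule that the version increases by one are all assumptions made for the example.

```python
class ConfigVersionMismatch(Exception):
    """Raised when a conditioned reconfig sees an unexpected config version."""


def reconfigure(current, version, *, add=(), remove=(),
                new_members=None, expected_version=None):
    """Return (new membership, new version) for one reconfiguration request.

    Incremental mode uses add/remove; non-incremental mode passes new_members.
    A request is conditioned when expected_version is given, blind otherwise.
    """
    if expected_version is not None and expected_version != version:
        raise ConfigVersionMismatch(
            f"expected config #{expected_version}, current is #{version}"
        )
    if new_members is not None:
        members = set(new_members)
    else:
        members = (set(current) - set(remove)) | set(add)
    return members, version + 1  # assumed versioning scheme


current = {"A", "B", "C"}

# Incremental and conditioned: add E and D, remove B, only if config is #5.
incremental, v1 = reconfigure(current, 5, add={"E", "D"}, remove={"B"},
                              expected_version=5)

# Non-incremental and blind: state the full new membership.
full, v2 = reconfigure(current, 5, new_members={"A", "C", "D", "E"})

print(sorted(incremental), sorted(full))
```

Both requests reach the same membership {A, C, D, E}. The conditioned form fails instead of applying if another reconfiguration has changed the version in the meantime.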
  • 51. Summary • Design and implementation of reconfiguration for Apache Zookeeper – being contributed into Zookeeper codebase • Much simpler than state of the art, using properties already provided by Zookeeper • Many nice features: – Doesn’t limit concurrency – Reconfigures immediately – Preserves primary order – Doesn’t stop client ops – Zookeeper used by online systems, any delay must be avoided – Clients work with a single configuration at a time – No external services – Includes client-side rebalancing