What is the Architecture of ZooKeeper?
•ZooKeeper is itself a distributed application as well as a coordination service
for distributed systems.
•It has a simple client-server model in which both clients and servers are nodes
(i.e. machines).
•ZooKeeper clients make use of the services, and servers provide them.
•Applications make calls to ZooKeeper through a client library.
•The client library handles the interaction with the ZooKeeper servers.
Sessions are central to the operation of ZooKeeper. Requests in a session are
executed in FIFO order. Once a client connects to a server, a session is established
and a session id is assigned to the client.
The client sends heartbeats at a regular interval to keep the session valid. If the
ZooKeeper ensemble does not receive a heartbeat from a client for longer than the
configured session timeout, it decides that the client has died.
Session timeouts are usually expressed in milliseconds. When a session ends for
any reason, the ephemeral znodes created during that session are also deleted.
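The session behaviour described above can be sketched with a small in-memory model. This is only an illustration, not a real ZooKeeper client: `ToyEnsemble` and all of its methods are hypothetical names, and time is passed in explicitly as milliseconds so the example stays deterministic.

```python
import itertools


class ToyEnsemble:
    """In-memory stand-in for a ZooKeeper ensemble (hypothetical, not a real client)."""

    def __init__(self, session_timeout_ms):
        self.session_timeout_ms = session_timeout_ms
        self._ids = itertools.count(1)
        self.last_heartbeat = {}   # session id -> time of last heartbeat (ms)
        self.ephemerals = {}       # session id -> set of ephemeral znode paths

    def connect(self, now_ms):
        """Establish a session and hand back its session id."""
        session_id = next(self._ids)
        self.last_heartbeat[session_id] = now_ms
        self.ephemerals[session_id] = set()
        return session_id

    def heartbeat(self, session_id, now_ms):
        """Record a heartbeat; only sessions that are still alive are refreshed."""
        if session_id in self.last_heartbeat:
            self.last_heartbeat[session_id] = now_ms

    def create_ephemeral(self, session_id, path):
        """Tie an ephemeral znode to the session that created it."""
        self.ephemerals[session_id].add(path)

    def expire_sessions(self, now_ms):
        """End sessions that were silent past the timeout and delete their ephemeral znodes."""
        expired = [sid for sid, seen in self.last_heartbeat.items()
                   if now_ms - seen > self.session_timeout_ms]
        for sid in expired:
            del self.last_heartbeat[sid]
            del self.ephemerals[sid]
        return expired

    def live_paths(self):
        return sorted(p for paths in self.ephemerals.values() for p in paths)


ensemble = ToyEnsemble(session_timeout_ms=3000)
a = ensemble.connect(now_ms=0)
b = ensemble.connect(now_ms=0)
ensemble.create_ephemeral(a, "/workers/a")
ensemble.create_ephemeral(b, "/workers/b")
ensemble.heartbeat(a, now_ms=2500)               # only client a keeps its session alive
expired = ensemble.expire_sessions(now_ms=4000)  # client b has been silent for 4000 ms
print(expired, ensemble.live_paths())
```

Client b misses the timeout, so its session ends and `/workers/b` disappears, while client a's znode survives.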
Apache ZooKeeper is an open source distributed
coordination service that helps to manage a large set of
hosts. Management and coordination in a distributed
environment are tricky. ZooKeeper automates this process
and allows developers to focus on building software
features rather than worrying about its distributed nature.
ZooKeeper helps you to maintain configuration
information, naming, and group services for distributed
applications. It implements different protocols on the
cluster so that applications do not have to implement them on
their own. It provides a single coherent view of multiple
Why use ZooKeeper?
•It allows for mutual exclusion and cooperation between server processes
•It ensures that your application runs consistently.
•A transaction is never partially completed. It either succeeds or fails.
•The distributed state can be delayed, but it is never wrong
•Regardless of the server it connects to, a client sees the same view of the
service
•It helps you to encode data according to a specific set of rules
•It helps to maintain a standard hierarchical namespace, similar to files and
directories
•Computers, which run as a single system which can be locally or geographically
•It tracks nodes joining or leaving the cluster, and node status, in real time
•You can increase performance by deploying more machines
•It allows you to elect a node as a leader for better coordination
•ZooKeeper works fast with workloads where reads of the data are more common
than writes
Data Model in ZooKeeper
The namespace provided by ZooKeeper is much like that of a standard file system.
A name is a sequence of path elements separated by a slash (/). In ZooKeeper’s
namespace, every node is identified by a path. Moreover, each node in a ZooKeeper
namespace can have data associated with it as well as children. It is like a
file system that permits a file to also be a directory.
•The ZooKeeper data model follows a hierarchical namespace where each node is called a
ZNode. A node, by contrast, is a machine on which the cluster runs.
•Every ZNode has data. It may or may not have children
•ZNode paths are canonical, slash-separated and absolute
•Paths do not use any relative references
•Names may contain Unicode characters
•Each ZNode maintains a stat structure, including a version number for data changes.
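As a rough illustration of the path rules listed above (absolute, slash-separated, canonical, no relative references, Unicode allowed), here is a simplified checker. The function name `is_valid_znode_path` is hypothetical, and the real ZooKeeper validator applies further restrictions (for example on certain control characters) that this sketch leaves out.

```python
def is_valid_znode_path(path):
    """Check the path rules from the list above; a simplified sketch only."""
    if not path.startswith("/"):
        return False                  # must be absolute
    if path == "/":
        return True                   # the root znode
    if path.endswith("/"):
        return False                  # canonical: no trailing slash
    for element in path[1:].split("/"):
        if element == "":
            return False              # canonical: no empty elements such as "//"
        if element in (".", ".."):
            return False              # no relative references
    return True


for candidate in ["/app/config", "app/config", "/app/../config", "/app//config", "/应用/配置"]:
    print(candidate, is_valid_znode_path(candidate))
```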
There are three types of Znodes:
Persistent znode: This type of znode stays alive even after the client that created it
is disconnected. By default, all znodes in ZooKeeper are persistent unless specified
otherwise.
Ephemeral znode: This type of znode stays alive only as long as the client that created
it is alive. When the client is disconnected from ZooKeeper, the znode is deleted.
Moreover, ephemeral znodes are not allowed to have children.
Sequential znode: Sequential znodes can be either ephemeral or persistent. When a
new znode is created as a sequential znode, ZooKeeper sets its path by
appending a 10-digit sequence number to the original name.
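The naming of sequential znodes can be sketched as follows. This is a toy model under stated assumptions: `ToyParent` and `create_sequential` are hypothetical names, the counter is assumed to be kept per parent and to start at 0, and the sequence number is rendered as a zero-padded 10-digit decimal.

```python
class ToyParent:
    """Toy parent znode that hands out sequence numbers to its children (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.next_seq = 0      # assumed per-parent counter, starting at 0
        self.children = []

    def create_sequential(self, name):
        """Append a zero-padded 10-digit sequence number to the requested name."""
        child = f"{name}{self.next_seq:010d}"
        self.next_seq += 1
        self.children.append(child)
        return f"{self.path}/{child}"


locks = ToyParent("/locks")
first = locks.create_sequential("lock-")
second = locks.create_sequential("lock-")
print(first)
print(second)
```

Because the suffix has a fixed width, sorting the child names as plain strings also sorts them by creation order.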
Apache ZooKeeper Applications
Apache ZooKeeper is used for the following purposes:
•Managing the configuration
•Choosing the leader
•Queuing the messages
•Managing the notification system
•Distributed Cluster Management
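To make "choosing the leader" more concrete, here is a minimal sketch of one common approach: every candidate creates an ephemeral sequential znode, and the candidate with the lowest sequence number is treated as the leader. The znode names and the helper functions `sequence_of` and `elect_leader` are made up for illustration, and no real ZooKeeper server is involved.

```python
def sequence_of(znode_name):
    """Read the 10-digit sequence number at the end of a sequential znode name."""
    return int(znode_name[-10:])


def elect_leader(candidates):
    """Pick the candidate znode with the lowest sequence number."""
    return min(candidates, key=sequence_of)


# Hypothetical ephemeral sequential znodes, one per candidate server.
candidates = ["node-0000000007", "node-0000000003", "node-0000000012"]
leader = elect_leader(candidates)
print(leader)

# If the leader's session ends, its ephemeral znode is deleted
# and the next-lowest candidate takes over.
candidates.remove(leader)
new_leader = elect_leader(candidates)
print(new_leader)
```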
Disadvantages of using Zookeeper
•Data loss may occur if you are adding new ZooKeeper servers
•No migration allowed for users
•Does not offer support for rack placement and awareness
•To prevent accidental data loss, ZooKeeper does not allow you to
reduce the number of pods
•You can’t switch the service to host networking without a full
reinstallation when the service is deployed on a virtual network
•The service doesn’t support changing volume requirements once the
initial deployment is over
•A large number of nodes is involved, so there could be more
than one point of failure
•Messages can be lost in the communication network, which
requires special software to recover them
In the ZooKeeper architecture, all writes go through the master node, so they are
guaranteed to be applied sequentially. When a client performs a write operation,
the data is stored on the servers as well as on the master server. This keeps
all servers updated with the data. However, this also means that writes cannot
be performed concurrently.
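The write path described above can be illustrated with a toy model in which a single master numbers every write and copies it to the other servers. This is a deliberate simplification and not ZooKeeper's actual replication protocol: the class names `ToyServer` and `ToyMaster` are hypothetical, and replication here is a plain synchronous loop over all servers.

```python
class ToyServer:
    """Toy server that applies writes in the order it receives them (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.log = []    # ordered list of (txn_id, path, data)
        self.data = {}   # path -> latest data

    def apply(self, txn_id, path, data):
        self.log.append((txn_id, path, data))
        self.data[path] = data


class ToyMaster(ToyServer):
    """Toy master: the single point through which every write passes."""

    def __init__(self, name, followers):
        super().__init__(name)
        self.followers = followers
        self.next_txn_id = 1

    def write(self, path, data):
        """Number the write, apply it locally, then copy it to every other server."""
        txn_id = self.next_txn_id
        self.next_txn_id += 1
        self.apply(txn_id, path, data)
        for follower in self.followers:
            follower.apply(txn_id, path, data)
        return txn_id


followers = [ToyServer("s2"), ToyServer("s3")]
master = ToyMaster("s1", followers)
first_id = master.write("/config", "v1")
second_id = master.write("/config", "v2")
print(first_id, second_id, [s.data["/config"] for s in [master] + followers])
```

Since one node hands out the transaction numbers, every server ends up with the same log in the same order, which is the price paid for not accepting concurrent writes.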
•ZooKeeper has a directory-like data model. Each directory is called a znode.
A znode stores stat data, such as version details, and user data of up to
1 MB in size.
•If the master node fails, another master node is quickly selected and takes
over from the previous master node. In addition to masters and slaves, there
are also observers in ZooKeeper.
•Multiple server nodes are collectively called a ZooKeeper ensemble. A ZooKeeper
client uses one server at any given time.
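The version details and the size limit mentioned above can be sketched as a version-checked update on a toy znode. All names here (`ToyZnode`, `BadVersion`, `MAX_DATA_BYTES`) are hypothetical, the exact byte limit is an assumption based on the "up to 1 MB" figure, and an expected version of -1 is taken to mean "skip the version check".

```python
MAX_DATA_BYTES = 1024 * 1024  # assumed value for the "up to 1 MB" limit in the text


class BadVersion(Exception):
    """Raised when the caller's expected version does not match the znode's version."""


class ToyZnode:
    """Toy znode holding user data plus a data version (hypothetical model)."""

    def __init__(self, data=b""):
        self.data = data
        self.version = 0   # incremented on every successful data change

    def set_data(self, data, expected_version=-1):
        """Replace the data only if the caller has seen the current version."""
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("data exceeds the size limit")
        if expected_version != -1 and expected_version != self.version:
            raise BadVersion(f"expected {expected_version}, found {self.version}")
        self.data = data
        self.version += 1
        return self.version


znode = ToyZnode(b"v1")
new_version = znode.set_data(b"v2", expected_version=0)   # succeeds
try:
    znode.set_data(b"v3", expected_version=0)             # stale version, rejected
    stale_write_rejected = False
except BadVersion:
    stale_write_rejected = True
print(new_version, znode.data, stale_write_rejected)
```

The version check lets two clients detect that they are both trying to update the same znode from the same starting point.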