Binu George
www.globinch.com
www.java.globinch.com
Agenda
• Coordination headache in distributed applications
• What is ZooKeeper?
• ZooKeeper Overview
• ZooKeeper Service & Data Model
• Zab (ZooKeeper Atomic Broadcast protocol)
• ZooKeeper Read, Write and Watch
• ZooKeeper Use Cases
• ZooKeeper Client API Example
Coordination headache in distributed applications
• Problem of Detection
• Cluster Size Estimation
• Leader Election
• Group Membership
• Configuration Management
• Consensus among multiple parties is difficult in dynamic distributed systems because members may leave, fail, or join at any time
• Consistency, Availability and Partition tolerance
What is ZooKeeper?
• Apache ZooKeeper is a highly consistent, scalable and reliable cluster coordination service.
• In CAP terms, ZooKeeper is a CP system: Consistency and Partition tolerance are guaranteed.
• Off-the-shelf support for implementing consensus, group management, leader election, etc.
• A ZooKeeper cluster is called an ensemble.
• An ensemble has one leader; the remaining servers are followers.
• ZooKeeper is a distributed coordination service for distributed applications.
• ZooKeeper was originally developed by the Yahoo! Research team and later became a top-level Apache project.
• ZooKeeper is used by many enterprises, including Rackspace, Yahoo! and eBay.
ZooKeeper Overview
• ZooKeeper uses the ZooKeeper Atomic Broadcast protocol (Zab) for ordering and integrity.
• Achieves high availability and performance via replication.
• Low latency and high throughput via an in-memory image of the data tree, backed by transaction logs and snapshots.
• Uses a "quorum" model: the service is available only while a strict majority of servers is alive.
• ZooKeeper relies on a quorum for durability as well.
• The pipelined architecture of ZooKeeper executes operations from a single client in FIFO order.
• Updates are ordered through Zab (a leader-based atomic broadcast protocol), which provides ordering, integrity, and exactly-once delivery.
• Election of a new leader is followed by a synchronization phase before any new changes are accepted.
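The quorum rule above comes down to simple arithmetic. The following is a minimal sketch of that arithmetic only; the class and method names (`QuorumMath`, `quorumSize`, `toleratedFailures`) are illustrative and not part of the ZooKeeper API.

```java
/** Majority-quorum arithmetic for an ensemble of n servers (illustrative only). */
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble size must be >= 1");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a strict majority is still alive. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.printf("servers=%d quorum=%d tolerated failures=%d%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
    }
}
```

Under this rule a 4-server ensemble tolerates the same single failure as a 3-server one, which is why ensembles are usually sized with an odd number of servers.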
ZooKeeper Service & Data Model
• At any given time, a ZooKeeper client is connected to exactly one ZooKeeper server.
• If the server to which the client is connected fails, the client automatically connects to another server in the ensemble, with transparent session transfer.
• A znode is an in-memory data node in ZooKeeper; znodes are organized in a hierarchical namespace (the data tree).
• The data model of ZooKeeper is essentially a file system with a simplified API and associated metadata and versions.
• Regular, ephemeral and sequential znodes are supported.
• Each znode is identified by a UNIX-style path. For example, /A/B/C denotes the path to znode C, where C has B as its parent and B has A as its parent.
• All znodes store data, and all znodes except ephemeral znodes can have children.
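The path and parent/child rules above can be modelled in a few lines. This is a toy in-memory sketch, not the ZooKeeper implementation or its client API; `ZnodeTree`, `parentOf`, `create` and `exists` are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the znode namespace: UNIX-style paths, no children under ephemeral nodes. */
final class ZnodeTree {
    private static final class Znode {
        final byte[] data;
        final boolean ephemeral;

        Znode(byte[] data, boolean ephemeral) {
            this.data = data;
            this.ephemeral = ephemeral;
        }
    }

    private final Map<String, Znode> nodes = new HashMap<>();

    ZnodeTree() {
        // The root always exists and is a regular (non-ephemeral) node.
        nodes.put("/", new Znode(new byte[0], false));
    }

    /** Returns the parent path, e.g. "/A/B/C" -> "/A/B" and "/A" -> "/". */
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    /** Creates a znode; the parent must exist and must not be ephemeral. */
    void create(String path, byte[] data, boolean ephemeral) {
        if (!path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("invalid path: " + path);
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node already exists: " + path);
        }
        Znode parent = nodes.get(parentOf(path));
        if (parent == null) {
            throw new IllegalStateException("parent does not exist: " + parentOf(path));
        }
        if (parent.ephemeral) {
            throw new IllegalStateException("ephemeral znodes cannot have children");
        }
        nodes.put(path, new Znode(data.clone(), ephemeral));
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }
}
```

Creating /A, then an ephemeral /A/B, succeeds; a later attempt to create /A/B/C is rejected because its parent is ephemeral.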
ZooKeeper read/write/watch
• Clients use the ZooKeeper API to update (write) the ZooKeeper data tree.
• All updates are ordered and managed by the leader, each with a unique zxid, and are considered complete when a quorum confirms the update.
• The leader executes the update request and broadcasts the change to the ZooKeeper state through Zab.
• By default, Zab uses simple majority quorums to decide on a proposal.
• Read and watch requests on a znode sent by a client to a ZooKeeper server are processed locally by that server.
• Watch: allows a client to be notified when information it has already read is updated.
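The watch idea in the last bullet can be sketched with a small in-memory store. This sketch assumes the classic one-time trigger behaviour, where a watch fires once and must be registered again to see later changes; `WatchSketch` and its methods are hypothetical names, and the real client API (the `Watcher` interface and its events) is not modelled here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy key/value store with one-time watches, loosely modelled on znode watches. */
final class WatchSketch {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Reads the value at path; if watcher is non-null, registers it as a one-time watch. */
    String read(String path, Consumer<String> watcher) {
        if (watcher != null) {
            watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
        }
        return data.get(path);
    }

    /** Writes the value and fires (then discards) every watch registered on the path. */
    void write(String path, String value) {
        data.put(path, value);
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(w -> w.accept(path));
        }
    }
}
```

A client that wants to follow every change would read again with a new watcher inside its callback; a client that registers once sees only the first change after its read.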
ZooKeeper read/write/watch
• Read and watch requests on a znode are processed locally, but write requests are forwarded to the leader and go through consensus before a response is generated.
• Read throughput increases with the number of servers, while write throughput decreases with the number of servers.
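The write path above, in which a proposal completes once a quorum has confirmed it, can be sketched as a simple acknowledgement counter. This is only an illustration of the majority rule, not Zab itself; `ProposalTracker` and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

/** Tracks acknowledgements for one proposed update until a strict majority has confirmed it. */
final class ProposalTracker {
    private final int ensembleSize;
    private final long zxid;
    private final Set<Integer> acks = new HashSet<>();

    ProposalTracker(int ensembleSize, long zxid) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble size must be >= 1");
        }
        this.ensembleSize = ensembleSize;
        this.zxid = zxid;
    }

    /** Records an acknowledgement from a server; duplicate acks from the same server count once. */
    boolean ack(int serverId) {
        acks.add(serverId);
        return isCommitted();
    }

    /** True once more than half of the ensemble has acknowledged the proposal. */
    boolean isCommitted() {
        return acks.size() > ensembleSize / 2;
    }

    long zxid() {
        return zxid;
    }
}
```

With five servers the proposal commits on the third distinct acknowledgement. A larger ensemble needs more acknowledgements per write, which is consistent with write throughput dropping as servers are added.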
ZooKeeper use cases
• ZooKeeper offers a library for creating and managing synchronization primitives.
• ZooKeeper avoids a single point of failure. Typical use cases:
• Naming service
• Configuration management
• Synchronization
• Leader election
• Message queue
• Notification system
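As one example from the list above, leader election is commonly built on sequential znodes: each candidate creates a sequential child under an election node, and the candidate holding the lowest sequence number acts as leader. The sketch below shows only the selection step on a list of child names. The `n_` prefix and the `LeaderElectionSketch` class are assumptions for illustration; fetching the children from a live ensemble is not shown.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Picks a leader from sequential znode names by choosing the lowest sequence number. */
final class LeaderElectionSketch {
    private LeaderElectionSketch() {}

    /** Parses the numeric suffix of a sequential znode name, e.g. "n_0000000007" -> 7. */
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.lastIndexOf('_') + 1));
    }

    /** Returns the child with the smallest sequence number, or empty if there are no candidates. */
    static Optional<String> electLeader(List<String> children) {
        return children.stream()
                .min(Comparator.comparingInt(LeaderElectionSketch::sequenceOf));
    }
}
```

If the candidates are also ephemeral, a crashed leader's znode disappears with its session and the next-lowest candidate takes over.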
ZooKeeper Java Client API Examples
• A simple example.
• For the complete example, see the link below.
• http://java.globinch.com/enterprise-services/zookeeper/apache-zookeeper-explained-tutorial-cases-zookeeper-java-api-examples/

Apache Zookeeper Explained: Tutorial, Use Cases and Zookeeper Java API Examples


Editor's Notes

  • #4 1) Number of non-faulty nodes currently in the system (heartbeat, Distributed Hash Tables (DHT)); leader/master detection and availability; membership notifications, etc. 2)