In this session you will learn:
ZooKeeper
3. Page 3Classification: Restricted
ZooKeeper
•ZooKeeper is a distributed coordination service.
•Partial failures are intrinsic in distributed systems.
•ZooKeeper gives you a set of tools to build distributed applications that can
safely handle partial failures
A scenario
•A group of servers provides a service to clients. Maintaining the list of
these servers in one place is a challenge: it cannot be stored on a single
node (a single point of failure), and even if it is stored on multiple
machines, removing a failed server from the list is difficult.
•ZooKeeper provides a group membership service that meets this
requirement.
Group membership in ZK
•ZK provides a highly available, file-system-like namespace.
•It does not have files and directories, though; it has znodes.
•Znodes contain data (like a file) and can also contain other znodes (like a
directory).
•Znodes form a hierarchical namespace, and a natural way to build a
membership list is to create a parent znode with the name of the group and
child znodes with the names of the group members (servers)
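
The membership list described above can be sketched with a small in-memory model. This is not the ZooKeeper client API: `ToyNamespace`, its methods, and the `/zoo` group with its member names are all hypothetical, chosen only to show a parent znode per group and one child znode per member.

```python
class ToyNamespace:
    """In-memory stand-in for a znode tree (hypothetical, not the real API)."""

    def __init__(self):
        self.nodes = {"/": b""}  # path -> data stored at that znode

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = data

    def delete(self, path):
        if self.get_children(path):
            raise ValueError(f"{path} still has children")
        del self.nodes[path]

    def get_children(self, path):
        # A child is any path exactly one level below `path`.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


ns = ToyNamespace()
ns.create("/zoo")                      # parent znode named after the group
for member in ("duck", "cow", "goat"):
    ns.create("/zoo/" + member)        # one child znode per group member

members = ns.get_children("/zoo")      # listing the group = listing children
ns.delete("/zoo/cow")                  # a member leaving = deleting its znode
remaining = ns.get_children("/zoo")
```

Listing the group is just listing the children of the group znode, so no separate membership list has to be kept in sync.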
ZK data model
•ZK maintains a hierarchical tree of nodes called znodes. A znode stores data
and has an associated ACL (access control list).
•ZooKeeper is designed for coordination (which typically uses small
data files), not high-volume data storage, so there is a limit of 1 MB on the
amount of data that may be stored in any znode.
•Data access is atomic: a read returns all the data stored in a znode, and a
write replaces all of it.
•A write either succeeds or fails; it is never partially applied. ZooKeeper
does not support an append operation.
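
A minimal sketch of these write semantics, as a toy model rather than the ZooKeeper API. `ToyZnode`, `set_data`, and `get_data` are hypothetical names, the payloads are made up, and the limit is assumed here to be exactly 1024 * 1024 bytes.

```python
MAX_ZNODE_BYTES = 1024 * 1024  # assumed reading of the "1 MB" limit


class ToyZnode:
    """Toy znode payload: whole-value reads and writes, no append."""

    def __init__(self):
        self.data = b""

    def set_data(self, new_data):
        # Validate first, so a rejected write leaves the old data untouched.
        if len(new_data) > MAX_ZNODE_BYTES:
            raise ValueError("payload too large; znode left unchanged")
        self.data = bytes(new_data)  # replaces everything stored before

    def get_data(self):
        return self.data


node = ToyZnode()
node.set_data(b"host=10.0.0.1")
node.set_data(b"port=2181")          # replaces the first value, no merge

try:
    node.set_data(b"x" * (MAX_ZNODE_BYTES + 1))
    oversized_rejected = False
except ValueError:
    oversized_rejected = True        # the write failed as a whole

# With no append, "adding" data means read, modify, write back.
node.set_data(node.get_data() + b";host=10.0.0.1")
```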
ZK data model – node types
•Znodes are either ephemeral or persistent
•A znode’s type is set at creation time and may not be changed later
•An ephemeral znode is deleted by ZooKeeper when the creating client’s
session ends
•A persistent znode is not tied to the client’s session and is deleted only
when explicitly deleted by a client (not necessarily the one that created it)
•An ephemeral znode may not have children, not even ephemeral ones.
•Even though ephemeral nodes are tied to a client session, they are visible to
all clients (subject to their ACL policies, of course)
•Ephemeral znodes are ideal for building applications that need to know
when certain distributed resources are available
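
The lifecycle rules above can be illustrated with a toy model. This is a sketch, not the ZooKeeper API: `ToyServer`, the integer session ids, and the `/workers` paths are hypothetical.

```python
class ToyServer:
    """Toy model of ephemeral vs. persistent znodes (hypothetical API)."""

    def __init__(self):
        # path -> owning session id for ephemeral znodes, None for persistent
        self.nodes = {}

    def create(self, path, session_id, ephemeral=False):
        parent = path.rsplit("/", 1)[0]
        if parent:
            if parent not in self.nodes:
                raise KeyError(f"parent {parent} does not exist")
            if self.nodes[parent] is not None:
                raise ValueError("an ephemeral znode may not have children")
        # The type is fixed here, at creation time.
        self.nodes[path] = session_id if ephemeral else None

    def close_session(self, session_id):
        # Ending a session deletes every ephemeral znode it created.
        for path in [p for p, owner in self.nodes.items() if owner == session_id]:
            del self.nodes[path]


server = ToyServer()
server.create("/workers", session_id=1)                  # persistent
server.create("/workers/w1", session_id=1, ephemeral=True)
server.create("/workers/w2", session_id=2, ephemeral=True)

server.close_session(1)             # w1 disappears, /workers survives
after_close = sorted(server.nodes)

try:
    server.create("/workers/w2/child", session_id=2, ephemeral=True)
    child_allowed = True
except ValueError:
    child_allowed = False           # even ephemeral children are refused
```

Any client that lists `/workers` after session 1 ends sees only `w2`, which is how ephemeral znodes signal whether a resource is still available.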
ZK data model – sequence numbers
•A sequential znode is given a sequence number by ZooKeeper as a part of its
name
•If a znode is created with the sequential flag set, then the value of a
monotonically increasing counter (maintained by the parent znode) is
appended to its name.
•If a client asks to create a sequential znode with the name /a/b-, for
example, the znode created may actually have the name /a/b-3. If, later on,
another sequential znode with the name /a/b- is created, it will be given a
unique name with a larger value of the counter—for example, /a/b-5
•Sequence numbers can be used to impose a global ordering on events in a
distributed system, and clients can use them to infer that ordering. You
can use this to build a lock service.
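
A minimal sketch of the per-parent counter, as a toy model rather than the ZooKeeper API. `ToySequencer`, `create_sequential`, and `sequence_of` are hypothetical names. The sketch keeps the unpadded form used in the example above; real ZooKeeper zero-pads the counter to ten digits.

```python
from collections import defaultdict


class ToySequencer:
    """Toy model of sequential znode naming (hypothetical API)."""

    def __init__(self):
        self.counters = defaultdict(int)  # parent path -> next sequence number

    def create_sequential(self, path_prefix):
        parent = path_prefix.rsplit("/", 1)[0] or "/"
        number = self.counters[parent]
        self.counters[parent] += 1        # only ever increases
        return f"{path_prefix}{number}"   # counter appended to the name


def sequence_of(path):
    """Extract the numeric suffix that follows the last '-'."""
    return int(path.rsplit("-", 1)[1])


seq = ToySequencer()
first = seq.create_sequential("/a/b-")
other = seq.create_sequential("/a/c-")    # same parent, so same counter
second = seq.create_sequential("/a/b-")

# Clients can recover the creation order from the names alone.
creation_order = sorted([second, other, first], key=sequence_of)
```

Because the counter belongs to the parent, other children created in between consume numbers too, which is why two `/a/b-` znodes need not get consecutive values.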
ZK data model – Watches
•Watches allow clients to get notifications when a znode changes in some
way
•Watches are set by operations on the ZooKeeper service and are triggered
by other operations on the service
•For example, a client might call the exists operation on a znode, placing a
watch on it at the same time. If the znode doesn’t exist, the exists operation
will return false. If, some time later, the znode is created by a second client,
the watch is triggered, notifying the first client of the znode’s creation
•Watches are triggered only once. To receive further notifications, the
client must register the watch again.
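
The exists-plus-watch example above can be sketched as a toy model. This is not the ZooKeeper API: `ToyWatchedStore`, its callback signature, and the `/config` path are hypothetical.

```python
class ToyWatchedStore:
    """Toy model of one-shot watches (hypothetical API)."""

    def __init__(self):
        self.nodes = set()
        self.watches = {}  # path -> callbacks waiting for the next change

    def exists(self, path, watch=None):
        # The read operation is also what places the watch.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return path in self.nodes

    def create(self, path):
        if path not in self.nodes:
            self.nodes.add(path)
            self._fire(path, "created")

    def delete(self, path):
        if path in self.nodes:
            self.nodes.remove(path)
            self._fire(path, "deleted")

    def _fire(self, path, event):
        # pop() removes the watches before they run, so each fires only once.
        for callback in self.watches.pop(path, []):
            callback(event, path)


events = []
store = ToyWatchedStore()

# First client: checks for the znode and leaves a watch behind.
found = store.exists("/config", watch=lambda ev, p: events.append((ev, p)))

store.create("/config")   # second client creates it: the watch fires
store.delete("/config")   # no watch is registered any more: nothing fires
```

To also hear about the deletion, the first client would have to call `exists` with a watch again after handling the first notification.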