Distributed system coordination by zookeeper and introduction to kazoo python library

Distributed System Coordination by
Zookeeper and Introduction to
Kazoo Python Library
Jimmy Lai
r97922028 [at] ntu.edu.tw
Dec. 22th, 2014
1

Outline
1. Overview
2. Basics
3. Deployment
4. Recipes
5. References
2

A Distributed System - Master-Worker
• Coordination tasks:
1. elect new master when the master crashes
2. master assign tasks to worker
3. when worker crashes, re-assign the task to other
worker
4. When worker ﬁnished their task, master assign new
tasks to it
Master
Worker Worker Worker Worker Worker Worker
4

Distributed System
• An application consists of programs run on a
group of computers.
• Coordination is more difﬁcult than writing a
standalone program.
• Developer may take too much times to handle
the coordination or create a fragile (e.g. race
condition, single point failure) distributed system.
5

Easy Distributed System by Zookeeper
• Common coordination tasks:
• Naming service
• Conﬁguration management
• Synchronization
• Leader election
• Message queue
• Notiﬁcation system
• Zookeeper provides highly reliable API for those
common coordination tasks
http://en.wikipedia.org/wiki/Apache_ZooKeeper#Typical_use_cases
6

Powered By Zookeeper
• Zookeeper is built by Yahoo Research
• Customers:
• Hadoop, Hbase
• Solr
• Neo4j
• Flume
• Facebook messages
7

Beneﬁts of Zookeeper
• With Zookeeper:
• simplify the development of distributed
system, more agile and robust
• zookeeper is simple, fast and replicated
• Without Zookeeper:
• more difﬁcult
8

• Servers replicate data
• Client connect to one of the
server
• Throughput test
• Hardware: dual 2Ghz Xeon and
two SATA 15K RPM drives
Beneﬁts of Zookeeper
9

Znode (1/2)
• Based on shared storage
model, each client store/
acquire data from
zookeeper service
• File system-like API
• Znode: hierarchical tree
contains optional data or
optional znodes.
• Persistent znode will
disappear after delete
operation
• Ephemeral znode will
disappear when the
client creator crashes or
close the connection, or
deleted by any client
11

Znode (2/2)
• Sequential znode will
be assigned a
monotonically
increasing integer at
the end of path. E.g. /
path-1, /path-2
• Versions: each node
have a version and
will be increased
when its data
changes
12

Operations
• Primitive operations:
• create /path data
• delete /path
• exists /path
• setData /path data
• getData /path
• getChildren /path
13

Notiﬁcation
• set a watch on a znode operation (getData,
getChildren, exist) and then get the notiﬁcation
when there is a change at the target
• Watch is:
• one-time trigger
• with ordering guarantee: all the event received
in client side will preserve the order of time
14

Session
• Session: client create a session connection
to one of the server and start operations
• Session states:
• connecting
• connected
• closed
• not_connected
15

Example - implement a lock
• Spec: n clients try to get the lock at the same
time, but only one of them can get the lock.
• Solution: clients try to create a ephemeral
znode e.g. /lock. the ﬁrst one will get the lock
and the rest of them which fail to create the
znode set up a watch to know when the lock
is released and then try to acquire again.
16

Example - implement master-worker
• Spec:
• client submit tasks
• master watches for new workers and tasks,
assign tasks to available workers
• backup master takes over when the master fails
• workers register themselves and then watch for
new tasks
17

Example - implement master-worker
• Solution:
• ephemeral znode /master for master election
• backup masters sets up a watch for /master
• persistent znode /workers
• master set up with for /workers
• worker create a znode in /workers, e.g. /workers/host1
• persistent sequential znode /tasks
• client submit tasks by creating znode under /tasks
• persistent znode /assign
• workers set up watch on their corresponding znode under /assign e.g. /assign/
host1
• master assign task to worker by create znode under /assign, e.g. /assign/host1/
task1
• worker mark the task as done by update the data of task as “done”
18

Zookeeper Server Run Modes
• Standalone: single server
• Quorum: multiple servers replicate the data
• the cluster apply majority vote to keep the
consistency so a cluster can afford less than
half of nodes crash
• default ports: client(2181), quorum(2182),
election(2183)
20

Clients
• Native primitive operations
• C library
• Java library
• Recipes (3rd party high level API)
• Java: Curator (by Netﬂix)
• Python: kazoo (by Mozilla and Zope)
21

Java Client Console
• bin/zkCli.sh -server 127.0.0.1:2181
• Commands
• get path [watch]
• ls path [watch]
• set path data [version]
• createpath data acl
• delete path [version]
• setquota -n|-b val path
22

Python client - kazoo
• from kazoo.client import KazooClient
• zk = KazooClient(hosts='127.0.0.1:2181')
• zk.start()
• zk.stop()
https://kazoo.readthedocs.org/en/latest/
23

from kazoo.client import KazooClient 
from kazoo.client import KazooState
def my_listener(state): 
if state == KazooState.LOST: 
print 'lost session' 
elif state == KazooState.SUSPENDED: 
print 'disconnected from Zookeeper' 
elif state == KazooState.CONNECTED: 
# try to become the master 
print 'connected'
zk = KazooClient(hosts='127.0.0.1:2181') 
zk.add_listener(my_listener) 
zk.start() 
lock = zk.Lock('/master', '%s-%d' %(socket.gethostname(), os.getpid()))
24
zk.ensure_path("/path") 
zk.set("/path", “data_string".encode('utf8'))
start_key, stat = zk.get("/path")

Common Recipes
• lock
• election
• counter
• barrier
• partitioner
• party
• queue
• watch
26

Lock
zk = KazooClient() 
lock = zk.Lock("/lockpath", "my-identiﬁer") 
with lock: # blocks waiting for lock acquisition 
# do something with the lock
lock.release()
27

Election
election = zk.Election("/electionpath", "my-identiﬁer") 
 
# blocks until the election is won, then calls 
# my_leader_function() 
election.run(my_leader_function)
28

counter = zk.Counter("/int") 
counter += 2 
counter -= 1 
counter.value == 1 
 
counter = zk.Counter("/ﬂoat", default=1.0) 
counter += 2.0 
counter.value == 3.0
Counter
29

Barrier
barrier = zk.Barrier("/barrier") 
barrier.create() 
 
barrier.wait() 
 
# master release the barrier by 
barrier.remove()
30

Partitioner 
from kazoo.client import KazooClient 
client = KazooClient() 
qp = client.SetPartitioner( 
path='/work_queues', set=('queue-1', 'queue-2', 'queue-3')) 
 
while 1: 
if qp.failed: 
raise Exception("Lost or unable to acquire partition") 
elif qp.release: 
qp.release_set() 
elif qp.acquired: 
for partition in qp: 
# Do something with each partition 
elif qp.allocating: 
qp.wait_for_acquire() 31

Party
party1 = zk.Party("/party1", "my-identifier") 
party2 = zk.Party("/party2", "my-identifier") 
party1.join() 
"my-identifier" in party1 
"my-identifier" not in party2
32

Queue
queue = zk.LockingQueue("/queue") 
for task in tasks: 
queue.put(task.encode('utf8')) 
 
task = queue.get()
33

Watch: watch znode continuously
@zk.DataWatch('/last_scanned_card_key') 
def my_func(data, stat, event): 
print("Data is %s" % data) 
print("Version is %s" % stat.version) 
print("Event is %s" % event)
34

References
35
• Flavio Junqueira, Benjamin Reed, ZooKeeper: Distributed Process Coordination,
O'Reilly Media, Inc., November 25, 2013
• Zookeeper website, http://zookeeper.apache.org/

Distributed system coordination by zookeeper and introduction to kazoo python library

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Distributed system coordination by zookeeper and introduction to kazoo python library

Similar to Distributed system coordination by zookeeper and introduction to kazoo python library (20)

More from Jimmy Lai

More from Jimmy Lai (19)

Recently uploaded

Recently uploaded (20)

Distributed system coordination by zookeeper and introduction to kazoo python library