So we’re running ZooKeeper. Now What?
Camille Fournier, Rent the Runway
@skamille
Big Data. Big Systems.
• Outages
• Coordination
• Operational Complexity

Common Challenges
• Consistency guarantees

Common Deficiency
Storm uses Zookeeper for coordinating the cluster.
Zookeeper is not used for message passing, so the
load Storm places on ...
A centralized service for maintaining
configuration information, naming,
providing distributed synchronization,
and provid...
• Tolerates the loss of a minority of ensemble members, ⌈n/2⌉ – 1 of n,
and still functions

Highly Available
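
The availability rule comes down to majority arithmetic: the ensemble keeps working while a strict majority of its servers can still talk to each other. A minimal sketch of that arithmetic follows; the function names are illustrative and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 6, 7):
    print(f"ensemble={n} quorum={quorum_size(n)} tolerates={tolerated_failures(n)}")
```

Under this rule an even-sized ensemble tolerates no more failures than the odd size just below it, which is one reason ensembles are usually sized at 3, 5, or 7.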
• All data is stored in memory
• Performance measured around 50,000 operations/second
• Particularly fast for read perfor...
• Atomic Writes
• In the order you sent them
• Changes always seen in the order they occurred

• Reliable, no writes acked...
[Diagram, three slides: an ensemble of one leader and two followers with six clients (cli); the third slide shows only one follower.]

Basics: Cluster Interactions
[Diagram: a client (cli) and a tree of nodes]
/a
/a/b
/a/b/d
/a/b/myNode
/a/c
/a/c/e000001

Data Structure
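
To make the tree above concrete, here is a toy in-memory model of it, covering ephemeral and sequential nodes. It is only a sketch and not a ZooKeeper client: the class `ToyTree`, its method names, and the session ids `s1` and `s2` are all hypothetical. The sequence suffix uses the 10-digit zero-padded form that ZooKeeper itself appends, so names come out longer than the abbreviated `e000001` on the slide, and the per-parent counter is a simplification.

```python
class ToyTree:
    """In-memory stand-in for a ZooKeeper node tree (not a real client)."""

    def __init__(self):
        self.nodes = {"/": b""}   # path -> data
        self.owners = {}          # ephemeral path -> owning session id
        self.counters = {}        # parent path -> next sequence number

    def create(self, path, data=b"", session=None, ephemeral=False, sequential=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if parent in self.owners:
            raise ValueError("ephemeral nodes cannot have children")
        if ephemeral and session is None:
            raise ValueError("an ephemeral node needs an owning session")
        if sequential:
            # Append an ever-increasing, zero-padded counter kept per parent.
            number = self.counters.get(parent, 0)
            self.counters[parent] = number + 1
            path = f"{path}{number:010d}"
        if path in self.nodes:
            raise ValueError(f"{path} already exists")
        self.nodes[path] = data
        if ephemeral:
            self.owners[path] = session
        return path

    def children(self, path):
        """Names of the direct children of path, sorted."""
        prefix = path.rstrip("/") + "/"
        names = []
        for node in self.nodes:
            rest = node[len(prefix):]
            if node.startswith(prefix) and rest and "/" not in rest:
                names.append(rest)
        return sorted(names)

    def close_session(self, session):
        """Delete every ephemeral node the session created."""
        for path in [p for p, s in self.owners.items() if s == session]:
            del self.owners[path]
            del self.nodes[path]


tree = ToyTree()
tree.create("/a")
tree.create("/a/c")
first = tree.create("/a/c/e", session="s1", ephemeral=True, sequential=True)
second = tree.create("/a/c/e", session="s2", ephemeral=True, sequential=True)
before = tree.children("/a/c")
tree.close_session("s1")   # session s1 ends, so its node disappears
after = tree.children("/a/c")
print(first, second, before, after)
```

The same mechanics underlie discovery: a server registers an ephemeral node under a well-known path, and the entry vanishes by itself when that server's session ends.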
• Nodes can contain data, have children, or both
• Ephemeral nodes are associated with the session that
created them
• The...
[Diagram, four slides: clients, a leader, and two followers]
1. A client calls getData “/foo” true (watch flag set); the call returns “mydata”.
2. A client calls setData “/foo” “bar”.
3. The watching client receives a NOTIFICATION.
4. The client calls getData “/foo” true again; the call returns “bar”.

Watches
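
The watch sequence above (read with a watch, write, notification, re-read) can be sketched with a toy store. This is an illustrative model rather than a client library: `ToyWatchedStore` and its methods are hypothetical names. It assumes two behaviors of ZooKeeper watches: they fire once and must be set again, and the notification names the path without carrying the new data.

```python
class ToyWatchedStore:
    """Toy key/value store with one-shot watches (not a ZooKeeper client)."""

    def __init__(self):
        self.data = {}
        self.watches = {}   # path -> callbacks waiting for the next change

    def get_data(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data[path]

    def set_data(self, path, value):
        self.data[path] = value
        # One-shot: registered watches are removed before they fire.
        for callback in self.watches.pop(path, []):
            callback(path)   # the event carries the path, not the new value


store = ToyWatchedStore()
store.data["/foo"] = "mydata"
events = []

first_read = store.get_data("/foo", watch=events.append)    # sets a watch
store.set_data("/foo", "bar")                               # fires the watch
second_read = store.get_data("/foo", watch=events.append)   # resets the watch

store.set_data("/foo", "baz")   # fires the reset watch
store.set_data("/foo", "qux")   # no watch is set, so no notification
print(first_read, second_read, events, store.data["/foo"])
```

The last two writes show why changes can be missed: only one notification arrives for them, and a client that wants the current value has to read again.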
• Set against data or path changes
• Ordered with respect to other events, other watches, and
asynchronous replies.
• A cl...
• create
• delete
• setData

Basics: Creation API
• exists
• getData
• getChildren

Basics: Get/Watch API
• multi * new in 3.4
• sync

Basics: API
Service Management
Distributed Locking

Common Uses
• In Storm, ZooKeeper is the source of
communication between Nimbus and
Supervisors
• Nimbus finds Supervisors via ZooKeep...
Find servers doing job “Products”
Encode as path in ZooKeeper:
/servers/products
Servers register as ephemeral nodes under...
Read config from nodes
Watch nodes for config changes

Configuration
Shared Locks
Barriers and Latches
Leader Election
Two-Phase Commit

Locking
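
Locks and leader election are commonly built on ephemeral sequential nodes: each contender creates one under a shared path, the lowest sequence number wins, and every other contender watches only the node just ahead of it. The sketch below covers only the ordering decision, as a pure function over child names. The `lock-` prefix and the function names are hypothetical, and it assumes the 10-digit sequence suffix that ZooKeeper appends.

```python
def sequence_of(name: str) -> int:
    """Parse the 10-digit counter at the end of a sequential node name."""
    return int(name[-10:])


def lock_status(children, mine):
    """Return (holds_lock, node_to_watch) for the contender named `mine`."""
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(mine)
    if index == 0:
        return True, None               # lowest sequence number holds the lock
    return False, ordered[index - 1]    # otherwise watch the predecessor only


children = ["lock-0000000012", "lock-0000000003", "lock-0000000007"]
for name in children:
    print(name, lock_status(children, name))
```

Watching only the predecessor, instead of the whole parent path, means a release wakes a single waiter rather than all of them. Client libraries such as Curator and Kazoo ship tested versions of these recipes.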
And Now, The Scary Part
The State Machine
NOPE
• Curator (Java)
• Kazoo (Python)
• Twitter Commons for Discovery

Recommended Clients
ZooKeeper Owns Your Availability

(maybe)

Be Aware

Thank you to @zaa for the format of the slide on watches
Tweet me! @skamille
Email me! camille@apache.org
Kaz...

So we're running Apache ZooKeeper. Now What? By Camille Fournier


The ZooKeeper framework was originally built at Yahoo! to make it easy for the company’s applications to access configuration information in a robust and easy-to-understand way, but it has since grown to offer a lot of features that help coordinate work across distributed clusters. Apache ZooKeeper has become a de facto standard coordination service and is used by Storm, Hadoop, HBase, ElasticSearch, and other distributed computing frameworks.

Published in: Technology
  • Examples of outages: nodes going down, network partitions, disk corruption. Coordination: task assignment. Operational complexity: finding other cluster members, dynamic configuration, group membership.
  • If you use the sync call before a read, ZooKeeper provides linearizability for sync+read and write operations (this is true with certain timing assumptions made in ZooKeeper for efficiency).

    1. So we’re running ZooKeeper. Now What? Camille Fournier, Rent the Runway @skamille
    2. Big Data. Big Systems.
    3. • Outages • Coordination • Operational Complexity Common Challenges
    4. • Consistency guarantees Common Deficiency
    5. Storm uses Zookeeper for coordinating the cluster. Zookeeper is not used for message passing, so the load Storm places on Zookeeper is quite low ZooKeeper in Storm
    6. A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services • Distributed, Consistent Data Store • Highly Available • High performance • Strictly ordered access ZooKeeper
    7. • Tolerates the loss of a minority of ensemble members, ⌈n/2⌉ – 1 of n, and still functions Highly Available
    8. • All data is stored in memory • Performance measured around 50,000 operations/second • Particularly fast for read performance, built for read-dominant workloads High Performance
    9. • Atomic Writes • In the order you sent them • Changes always seen in the order they occurred • Reliable, no writes acked will be dropped Strictly Ordered Access
    10. leader cli cli follower follower cli cli cli cli Basics: Cluster Interactions
    11. leader cli cli follower follower cli cli cli cli Basics: Cluster Interactions
    12. leader cli cli follower cli cli cli cli Basics: Cluster Interactions
    13. cli /a/b/myNode /a/b /a/b/d /a /a/c Data Structure /a/c/e000001
    14. • Nodes can contain data, have children, or both • Ephemeral nodes are associated with the session that created them • They cannot have children, and disappear when that session ends • Sequential nodes have an ever-increasing number attached to them Basics: Data Structure
    15. client leader getData “/foo” true client client Watches return “mydata” follower follower
    16. client leader client follower setData “/foo” “bar” client Watches follower
    17. client leader NOTIFICATION client follower client follower Watches
    18. client leader getData “/foo” true client client Watches return “bar” follower follower
    19. • Set against data or path changes • Ordered with respect to other events, other watches, and asynchronous replies. • A client will see a watch event for a node it is watching before seeing the new data that corresponds to that node. • The order of watch events corresponds to the order of the updates as seen by the ZooKeeper service • One-time notifications; must be reset, changes can be missed between notification and reset of the watch Basics: Watches
    20. • create • delete • setData Basics: Creation API
    21. • exists • getData • getChildren Basics: Get/Watch API
    22. • multi * new in 3.4 • sync Basics: API
    23. Service Management Distributed Locking Common Uses
    24. • In Storm, ZooKeeper is the source of communication between Nimbus and Supervisors • Nimbus finds Supervisors via ZooKeeper Coordination
    25. Find servers doing job “Products” Encode as path in ZooKeeper: /servers/products Servers register as ephemeral nodes under this path with details about location, other connection info Discovery (Naming)
    26. Read config from nodes Watch nodes for config changes Configuration
    27. Shared Locks Barriers and Latches Leader Election Two-Phase Commit Locking
    28. And Now, The Scary Part
    29. The State Machine
    30. NOPE
    31. • Curator (Java) • Kazoo (Python) • Twitter Commons for Discovery Recommended Clients
    32. ZooKeeper Owns Your Availability (maybe) Be Aware
    33. Thank you to @zaa for the format of the slide on watches Tweet me! @skamille Email me! camille@apache.org Kazoo: http://kazoo.readthedocs.org/en/latest/ Curator: http://curator.incubator.apache.org/ Twitter commons: http://twitter.github.io/commons/ Credits and Contact