1. Exhibitor: Netflix’s ZooKeeper Management System
Jordan Zimmerman, Senior Platform Engineer, Netflix, Inc.
jzimmerman@netflix.com @rangalt
2. The Problem
3. • ZooKeeper is statically configured
• Limited tools for managing the ensemble
• Backup/restore is sometimes needed
• Visualization is desperately needed
• Prior to 3.4.x, periodic cleanup needed
4. The Goal
5. • Chaos Monkey-able
  • See http://techblog.netflix.com/2011/07/netflix-simian-army.html: “A tool that randomly disables our production instances to make sure we can survive common types of failure without any customer impact.”
• Completely unmanned
• Bringing up a new ensemble should be turn-key/push-button
7. Instance Monitoring
Each Exhibitor instance monitors the ZooKeeper server running on the same server. If ZooKeeper is not running, Exhibitor will write the zoo.cfg file, etc. and start it. If ZooKeeper crashes for some reason, Exhibitor will restart it.
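A minimal sketch of one monitoring pass, to make the check/write/start cycle concrete. This is not Exhibitor's code: `render_zoo_cfg`, `monitor_once` and the three callbacks are hypothetical names, and the peer/election ports 2888/3888 are simply the conventional ZooKeeper defaults.

```python
from typing import Callable, Dict


def render_zoo_cfg(settings: Dict[str, str], servers: Dict[int, str]) -> str:
    """Render a zoo.cfg body from plain settings plus the ensemble members."""
    lines = [f"{key}={value}" for key, value in sorted(settings.items())]
    # Ensemble entries have the form server.<id>=<host>:<peer port>:<election port>.
    # 2888/3888 are assumed here; a real deployment may use other ports.
    lines += [f"server.{sid}={host}:2888:3888" for sid, host in sorted(servers.items())]
    return "\n".join(lines) + "\n"


def monitor_once(
    is_running: Callable[[], bool],
    write_config: Callable[[], None],
    start: Callable[[], None],
) -> str:
    """One monitoring pass: if ZooKeeper is down, rewrite its config and start it."""
    if is_running():
        return "running"
    write_config()  # always rewrite zoo.cfg so the server starts with current settings
    start()
    return "restarted"
```

A supervisor would call `monitor_once` on a timer; the callbacks keep the sketch independent of how the process is actually checked or launched.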
8. Log Cleanup
In versions prior to ZooKeeper 3.4.x, log file maintenance is necessary. Exhibitor will periodically do this maintenance.
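As a rough illustration of what such maintenance involves, the sketch below picks which files in a data directory could be purged. It assumes the usual `snapshot.<hex zxid>` and `log.<hex zxid>` file naming and a deliberately conservative retention rule; `files_to_delete` and its `keep` parameter are hypothetical, not Exhibitor's implementation.

```python
from typing import Iterable, List


def files_to_delete(names: Iterable[str], keep: int = 3) -> List[str]:
    """Return snapshot/log file names that are safe to purge (simplified rule)."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    snaps, logs = [], []
    for name in names:
        prefix, _, suffix = name.partition(".")
        try:
            zxid = int(suffix, 16)  # the suffix is the zxid in hexadecimal
        except ValueError:
            continue  # not a snapshot/log file (e.g. "myid"); never touch it
        if prefix == "snapshot":
            snaps.append((zxid, name))
        elif prefix == "log":
            logs.append((zxid, name))
    snaps.sort()
    logs.sort()
    if len(snaps) <= keep:
        return []
    oldest_kept = snaps[-keep][0]
    doomed = [name for _, name in snaps[:-keep]]
    # A log that starts before the oldest retained snapshot may still hold
    # transactions after it, so the newest such log is kept as well.
    older_logs = [entry for entry in logs if entry[0] < oldest_kept]
    doomed += [name for _, name in older_logs[:-1]]
    return sorted(doomed)
```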
9. Backup/Restore
Backups in a ZooKeeper ensemble are more complicated than for a traditional data store (e.g. an RDBMS). Generally, most of the data in ZooKeeper is ephemeral. It would be harmful to blindly restore an entire ZooKeeper data set. What is needed is selective restoration to prevent accidental damage to a subset of the data set. Exhibitor enables this.
Exhibitor will periodically back up the ZooKeeper transaction files. Once backed up, you can index any of these transaction files. Once indexed, you can search for individual transactions and “replay” them to restore a given ZNode to ZooKeeper.
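The index-then-replay idea can be sketched as follows. Everything here is illustrative: `Txn`, `build_index` and `replay` are hypothetical names, and a plain dict stands in for the live ZooKeeper data so the example stays self-contained.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Txn(NamedTuple):
    zxid: int                     # transaction id, increasing over time
    op: str                       # "create", "set" or "delete"
    path: str                     # ZNode the transaction applies to
    data: Optional[bytes] = None


def build_index(txns: Iterable[Txn]) -> Dict[str, List[Txn]]:
    """Index backed-up transactions by ZNode path, ordered by zxid."""
    index = defaultdict(list)
    for txn in txns:
        index[txn.path].append(txn)
    for entries in index.values():
        entries.sort(key=lambda t: t.zxid)
    return dict(index)


def replay(index: Dict[str, List[Txn]], path: str, store: Dict[str, Optional[bytes]],
           up_to_zxid: Optional[int] = None) -> None:
    """Replay only the transactions for one path into the store."""
    for txn in index.get(path, []):
        if up_to_zxid is not None and txn.zxid > up_to_zxid:
            break
        if txn.op == "delete":
            store.pop(path, None)
        else:
            store[path] = txn.data
```

Because only the chosen path is replayed, other nodes in the store (for example live ephemeral nodes) are left untouched, which is the point of selective restoration.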
10. Cluster-wide Configuration
Exhibitor presents a single console for your entire ZooKeeper ensemble. Configuration changes made in Exhibitor will be applied to the entire ensemble.
11. Rolling Ensemble Changes
Exhibitor can update the servers in the ensemble in a rolling fashion so that the ZooKeeper ensemble can stay up and in quorum while the changes are being made.
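A minimal sketch of a rolling change, assuming a strict-majority quorum and one server taken down at a time. `has_quorum`, `rolling_update` and the two callbacks are hypothetical; how Exhibitor actually sequences the rollout is not shown in these slides.

```python
from typing import Callable, List, Sequence


def has_quorum(up: int, ensemble_size: int) -> bool:
    """A ZooKeeper ensemble is in quorum when a strict majority of servers is up."""
    return up > ensemble_size // 2


def rolling_update(
    servers: Sequence[str],
    apply_change: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Apply a change to one server at a time, stopping if a server fails to return."""
    # With one server down at a time, the remaining servers must still form a majority.
    if not has_quorum(len(servers) - 1, len(servers)):
        raise RuntimeError("ensemble too small to stay in quorum during a rolling change")
    updated = []
    for server in servers:
        apply_change(server)          # e.g. push new config and restart this server
        if not is_healthy(server):    # do not move on until it has rejoined
            raise RuntimeError(f"{server} did not come back; stopping the rollout")
        updated.append(server)
    return updated
```

For example, a three-server ensemble tolerates one server being down (2 of 3 is a majority), while a two-server ensemble does not.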
12. Visualizer
Exhibitor provides a graphical tree view of the ZooKeeper ZNode hierarchy.
13. ZooKeeper Data Mutation
When enabled, Exhibitor can create/update/delete nodes in the ZooKeeper hierarchy.
14. Curator Integration
Exhibitor and Curator (Cur/Ex!) can be configured to work together so that Curator instances are updated for changes in the ensemble.
[Diagram: Curator clients periodically query the Exhibitor instances (A, B, ...) round robin for the servers list.]
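The round-robin polling in the diagram might look like the sketch below. It is not Curator's API: `RoundRobinEnsemblePoller` and the injected `fetch` callable (which would wrap whatever call returns an Exhibitor instance's servers list) are hypothetical, and network failures are assumed to surface as `OSError`.

```python
from itertools import cycle
from typing import Callable, List, Sequence


class RoundRobinEnsemblePoller:
    """Keeps a ZooKeeper connection string fresh by polling Exhibitor instances in turn."""

    def __init__(self, exhibitors: Sequence[str],
                 fetch: Callable[[str], List[str]], initial: str) -> None:
        self._hosts = cycle(exhibitors)   # round robin over the Exhibitor instances
        self._count = len(exhibitors)
        self._fetch = fetch               # hypothetical: returns ["host:port", ...]
        self.connection_string = initial  # used until a poll succeeds

    def poll(self) -> str:
        """Try each Exhibitor at most once; keep the last known list if all fail."""
        for _ in range(self._count):
            host = next(self._hosts)
            try:
                servers = self._fetch(host)
            except OSError:
                continue                  # this instance is unreachable; try the next
            if servers:
                self.connection_string = ",".join(sorted(servers))
                break
        return self.connection_string
```

A client would call `poll()` on a timer and reconnect when the returned string changes. Falling back to the last known list means a temporarily unreachable Exhibitor does not break existing clients.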
15. How it Works
16. Shared Configuration
• S3
• File System
• Etc.
[Diagram: Exhibitor instances A, B, ... all connected to a single Shared Config.]
17. Coming Soon...
18. • Auto-register new instances
• Auto-remove old instances
• Alerting
• ???
19. Using / Integration
20. • Stand-alone application
- or -
• Library/JAR