
Introduction to ZooKeeper - TriHUG May 22, 2012

Presentation given at TriHUG (Triangle Hadoop User Group) on May 22, 2012. Gives a basic overview of Apache ZooKeeper, along with some common use cases, third-party libraries, and "gotchas".

Demo code available at https://github.com/mumrah/trihug-zookeeper-demo

Published in: Technology


  1. Apache ZooKeeper
     An Introduction and Practical Use Cases
  2. Who am I
     ● David Arthur
     ● Engineer at Lucid Imagination
     ● Hadoop user
     ● Python enthusiast
     ● Father
     ● Gardener
  3. Play along!
     Grab the source for this presentation at GitHub: github.com/mumrah/trihug-zookeeper-demo
     You'll need Java, Ant, and bash.
  4. Apache ZooKeeper
     ● Formerly a Hadoop sub-project
     ● ASF TLP (top level project) since Nov 2010
     ● 7 PMC members, 8 committers - most from Yahoo! and Cloudera
     ● Ugly logo
  5. One liner
     "ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers" - ZooKeeper wiki
  6. Who uses it?
     Everyone*
     ● Yahoo!
     ● HBase
     ● Solr
     ● LinkedIn (Kafka, Hedwig)
     ● Many more
     * https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy
  7. What is it good for?
     ● Configuration management - machines bootstrap config from a centralized source, which simplifies deployment/provisioning
     ● Naming service - like DNS, mappings of names to addresses
     ● Distributed synchronization - locks, barriers, queues
     ● Leader election - a common problem in distributed coordination
     ● Centralized and highly reliable (simple) data registry
  8. Namespace (ZNodes)
         parent : "foo"
         |-- child1 : "bar"
         |-- child2 : "spam"
         `-- child3 : "eggs"
             `-- grandchild1 : "42"
     Every znode has data (given as byte[]) and can optionally have children.
  9. Sequential znode
     Nodes created in "sequential" mode will append a 10-digit, zero-padded, monotonically increasing number to the name.
         create("/demo/seq-", ..., ..., PERSISTENT_SEQUENTIAL) x4
         /demo
         |-- seq-0000000000
         |-- seq-0000000001
         |-- seq-0000000002
         `-- seq-0000000003
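The naming scheme on the sequential-znode slide can be imitated locally. This is a minimal sketch, assuming a hypothetical helper class `SequentialNames` that is not part of ZooKeeper; on a real server the counter is kept on the server side, and the client only supplies the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: mimics the suffix appended to sequential znodes.
final class SequentialNames {
    // Append a 10-digit, zero-padded counter to the requested prefix.
    static String withSuffix(String prefix, int counter) {
        return String.format(Locale.ROOT, "%s%010d", prefix, counter);
    }

    // Names produced by n consecutive creations under the same prefix, starting at 0.
    static List<String> firstN(String prefix, int n) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            names.add(withSuffix(prefix, i));
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : firstN("/demo/seq-", 4)) {
            System.out.println(name);
        }
    }
}
```

Because the suffix has a fixed width, sorting the child names as plain strings also sorts them by creation order, which is what recipes such as locks and leader election tend to rely on.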
  10. Ephemeral znode
      Nodes created in "ephemeral" mode will be deleted when the originating client goes away.
          create("/demo/foo", ..., ..., PERSISTENT);
          create("/demo/bar", ..., ..., EPHEMERAL);

          Connected      Disconnected
          /demo          /demo
          |-- foo        `-- foo
          `-- bar
  11. Simple API
      Pretty much everything lives under the ZooKeeper class
      ● create
      ● exists
      ● delete
      ● getData
      ● setData
      ● getChildren
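  12. Synchronicity
      Sync and async versions of the API methods
          // synchronous: blocks and returns the Stat (or null if the znode is absent)
          exists("/demo", null);

          // asynchronous: returns immediately, result is delivered to the callback
          exists("/demo", null, new StatCallback() {
              @Override
              public void processResult(int rc, String path, Object ctx, Stat stat) {
                  ...
              }
          }, null);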
  13. Watches
      Watches are a one-shot callback mechanism for changes in connection and znode state
      ● Client connects/disconnects
      ● ZNode data changes
      ● ZNode children change
  14. Demo time!
      For those playing along, you'll need to get ZooKeeper running. Using the default port (2181), run:
          ant zk
      Or specify a port like:
          ant zk -Dzk.port=2181
  15. Things to "watch" out for
      ● Watches are one-shot - if you want continuous monitoring of a znode, you have to reset the watch after each event
      ● Too many client watches on a single znode create a "herd effect" - lots of clients get notifications at the same time and cause spikes in load
      ● Potential for missing changes
      ● All watches are executed in a single, separate thread (be careful about synchronization)
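The one-shot behaviour is easier to see in a toy model. The sketch below is an in-memory stand-in, not the ZooKeeper client API: `OneShotWatchModel` and `ResettingWatcher` are hypothetical names, and everything runs synchronously in one thread, so it cannot show the missed-change window that exists on a real cluster between an event firing and the watch being reset.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy in-memory model (hypothetical, NOT the ZooKeeper client API).
final class OneShotWatchModel {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Read a value and optionally leave a one-shot watch on the path.
    String getData(String path, Consumer<String> watch) {
        if (watch != null) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watch);
        }
        return data.get(path);
    }

    // Write a value, then fire and discard every watch registered on the path.
    void setData(String path, String value) {
        data.put(path, value);
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> w : fired) {
                w.accept(path);
            }
        }
    }
}

// A watcher that re-reads the data and re-registers itself on every event.
final class ResettingWatcher implements Consumer<String> {
    private final OneShotWatchModel zk;
    final List<String> seen = new ArrayList<>();

    ResettingWatcher(OneShotWatchModel zk) {
        this.zk = zk;
    }

    @Override
    public void accept(String path) {
        // Passing `this` again is the "reset the watch" step from the slide.
        seen.add(zk.getData(path, this));
    }

    public static void main(String[] args) {
        OneShotWatchModel zk = new OneShotWatchModel();
        List<String> firedOnce = new ArrayList<>();
        zk.getData("/demo", firedOnce::add);   // never reset
        ResettingWatcher resetting = new ResettingWatcher(zk);
        zk.getData("/demo", resetting);        // reset on every event
        zk.setData("/demo", "a");
        zk.setData("/demo", "b");
        System.out.println("one-shot events: " + firedOnce.size());
        System.out.println("resetting watcher saw: " + resetting.seen);
    }
}
```

The watcher that never resets is notified for the first write only; the resetting one sees both writes because each callback reads the data and registers again in the same call.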
  16. Building blocks
      ● Hierarchical nodes
      ● Parent and leaf nodes can have data
      ● Two special types of nodes - ephemeral and sequential
      ● Watch mechanism
      ● Consistency guarantees
        ○ Order of updates is maintained
        ○ Updates are atomic
        ○ Znodes are versioned for MVCC
        ○ Many more
  17. The Fun Stuff
      Recipes:
      ● Lock
      ● Barrier
      ● Queue
      ● Two-phase commit
      ● Leader election
      ● Group membership
  18. Demo Time!
      Group membership (i.e., the easy one)
      Recipe:
      ● Members register a sequential ephemeral node under the group node
      ● Everyone keeps a watch on the group node for new children
  19. Lots of boilerplate
      ● Synchronizing the asynchronous connection (using a latch or something)
      ● Handling disconnects/reconnects
      ● Exception handling
      ● Ensuring paths exist (nothing like mkdir -p)
      ● Resetting watches
      ● Cleaning up
  20. What happens?
      ● Everyone writes their own high level wrapper/connection manager
        ○ ZooKeeperWrapper
        ○ ZooKeeperSession
        ○ (\w+)ZooKeeper
        ○ ZooKeeper(\w+)
  21. Open Source, FTW!
      Luckily, some smart people have open sourced their ZooKeeper utilities/wrappers
      ● Netflix Curator - Netflix/curator
      ● LinkedIn - linkedin/linkedin-zookeeper
      ● Many others
  22. Netflix Curator
      ● Handles the connection management
      ● Implements many recipes
        ○ leader election
        ○ locks, queues, and barriers
        ○ counters
        ○ path cache
      ● Bonus: service discovery implementation (we use this)
  23. Demo Time!
      Group membership refactored with Curator
      ● EnsurePath is nice
      ● Robust connection management is awesome
      ● Exceptions are more sane
  24. Thoughts on Curator
      i.e., my non-expert subjective opinions
      ● Good level of abstraction - doesn't do anything "magical"
      ● Doesn't hide ZooKeeper
      ● Weird API design (builder soup)
      ● Extensive, well tested recipe support
      ● It works!
  25. ZooKeeper in the wild
      Some use cases
  26. Use case: Solr 4.0
      Used in "Solr cloud" mode for:
      ● Cluster management - what machines are available and where are they located
      ● Leader election - used for picking a shard as the "leader"
      ● Consolidated config storage
      ● Watches allow for a very non-chatty steady state
      ● Herd effect not really an issue
  27. Use case: Kafka
      ● LinkedIn's distributed pub/sub system
      ● Queues are persistent
      ● Clients request a slice of a queue (offset, length)
      ● Brokers are registered in ZooKeeper, clients load balance requests among live brokers
      ● Client state (last consumed offset) is stored in ZooKeeper
      ● Client rebalancing algorithm, similar to leader election
  28. Use case: LucidWorks Big Data
      ● We use Curator's service discovery to register REST services
      ● Nice for SOA
      ● Took 1 dev (me) 1 day to get something functional (mostly reading Curator docs)
      ● So far, so good!
  29. Review of "gotchas"
      ● Watch execution is single threaded and synchronized
      ● Can't reliably get every change for a znode
      ● Excessive watchers on the same znode (herd effect)
      Some new ones
      ● GC pauses: if your application is prone to long GC pauses, make sure your session timeout is sufficiently long
      ● Catch-all watches: if you use one Watcher for everything, it can be tedious to infer exactly what happened
  30. Four letter words
      The ZooKeeper server responds to a few "four letter word" commands via TCP or Telnet*
          > echo ruok | nc localhost 2181
          imok
      I'm glad you're OK, ZooKeeper - really I am.
      * http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkCommands
  31. Quorums
      In a multi-node deployment (aka, ZooKeeper Quorum), it is best to use an odd number of machines.
      ZooKeeper uses majority voting, so it can tolerate ceil(N/2)-1 machine failures and still function properly.
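The arithmetic behind the odd-number advice can be worked through directly. This is a small sketch with a hypothetical class name, `QuorumMath`; it only evaluates the ceil(N/2)-1 formula from the slide and the matching majority size.

```java
// Hypothetical helper: evaluates the quorum arithmetic from the slide.
final class QuorumMath {
    // Smallest number of servers that forms a strict majority of n.
    static int majority(int n) {
        return n / 2 + 1;
    }

    // Failures tolerated: ceil(n/2) - 1, written with integer arithmetic.
    static int toleratedFailures(int n) {
        return (n + 1) / 2 - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.println(n + " servers: majority " + majority(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

Going from 3 to 4 servers, or from 5 to 6, raises the majority size without raising the number of tolerated failures, so the extra even-numbered machine adds cost but no fault tolerance.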
  32. Multi-tenancy
      ZooKeeper supports "chroot" at the session level. You can add a path to the connection string that will be implicitly prefixed to everything you do:
          new ZooKeeper("localhost:2181/my/app");
      Curator also supports this, but at the application level:
          CuratorFrameworkFactory.builder() .namespace("/my/app");
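What "implicitly prefixed" means can be sketched as a pure path transformation. The class `ChrootPaths` below is hypothetical and only illustrates the idea; it is not the client's actual implementation, and it assumes a connection string with a single trailing chroot path as in the slide's example.

```java
// Hypothetical illustration of chroot-style path prefixing (not the real client code).
final class ChrootPaths {
    // Extract the chroot suffix from a connection string such as "host:port/my/app".
    static String chrootOf(String connectString) {
        int slash = connectString.indexOf('/');
        return slash < 0 ? "" : connectString.substring(slash);
    }

    // Map a path as the client sees it onto the path stored on the server.
    static String resolve(String chroot, String clientPath) {
        if (chroot == null || chroot.isEmpty()) {
            return clientPath;
        }
        // The chroot itself stands in for the client's root.
        if (clientPath.equals("/")) {
            return chroot;
        }
        return chroot + clientPath;
    }

    public static void main(String[] args) {
        String chroot = chrootOf("localhost:2181/my/app");
        System.out.println(resolve(chroot, "/demo"));
        System.out.println(resolve(chroot, "/"));
    }
}
```

Two applications that connect with different chroot paths can both use "/demo" without colliding, which is the multi-tenancy benefit the slide describes.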
  33. Python client
      Dumb wrapper around C client, not very Pythonic
          import zookeeper
          zk_handle = zookeeper.init("localhost:2181")
          zookeeper.exists(zk_handle, "/demo")
          zookeeper.get_children(zk_handle, "/demo")
      Stuff in contrib didn't work for me, I used a statically linked version: zc-zookeeper-static
  34. Other clients
      Included in ZooKeeper under src/contrib:
      ● C (this is what the Python client uses)
      ● Perl (again, using the C client)
      ● REST (JAX-RS via Jersey)
      ● FUSE? (strange)
      3rd-party client implementations:
      ● Scala, courtesy of Twitter
      ● Several others
  35. Overview
      ● Basics of ZooKeeper (znode types, watches)
      ● High-level recipes (group membership, et al.)
      ● Lots of boilerplate for basic functionality
      ● 3rd party helpers (Curator, et al.)
      ● Gotchas and other miscellany
  36. Questions?
      David Arthur
      mumrah@gmail.com
      github.com/mumrah/trihug-zookeeper-demo
