Introduction to Curator - the Netflix ZooKeeper library

Speaker notes:
  • Background - ZK issues, the need for a wrapper, etc. (can go more in depth on this); why Curator was written; low-level details of the client/framework - error handling, assumptions, etc.; mention that this will be very technical - lots of code
  • Mention that you contributed the part on recoverable errors
  • Becomes a persistent, unchanging handle to the ZK ensemble
  • Kishore Gopalakrishna from LinkedIn

    1. Curator: The Netflix ZooKeeper Client Library
       Jordan Zimmerman, Senior Platform Engineer, Netflix, Inc.
       jzimmerman@netflix.com | @randgalt
    2. Agenda
    3. Agenda
       • Background
       • Overview of Curator
       • The Recipes
       • Some Low-Level Details
       • Q&A
    4. Background
    5. What’s wrong with this code?
       ZooKeeper client = new ZooKeeper(...);
       client.create("/foo", data, ...);
    6. ZooKeeper Surprise
    7. ZooKeeper Surprise
       • Almost no ZK client call is safe
       • You cannot assume success
       • You must handle exceptions
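       To make the slide concrete, here is a minimal sketch of what a "safe" create requires with the raw ZooKeeper client; the MAX_RETRIES constant and the choice to treat NodeExists as success are assumptions for illustration, not deck content:

           import org.apache.zookeeper.CreateMode;
           import org.apache.zookeeper.KeeperException;
           import org.apache.zookeeper.ZooDefs;
           import org.apache.zookeeper.ZooKeeper;

           // Hand-rolled retry around a single create() call
           private static final int MAX_RETRIES = 3;  // illustrative

           public void safeCreate(ZooKeeper client, String path, byte[] data) throws Exception
           {
               for ( int i = 0; i < MAX_RETRIES; ++i )
               {
                   try
                   {
                       client.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                       return;
                   }
                   catch ( KeeperException.NodeExistsException e )
                   {
                       return;  // an earlier "lost" attempt may have succeeded server-side
                   }
                   catch ( KeeperException.ConnectionLossException e )
                   {
                       // retriable: the connection dropped and the operation may or
                       // may not have been applied - loop and try again
                   }
               }
               throw new KeeperException.ConnectionLossException();
           }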
    8. The Recipes Are Hard
       Locks
       Fully distributed locks that are globally synchronous, meaning at any snapshot in time no two clients think they hold the same lock. These can be implemented using ZooKeeper. As with priority queues, first define a lock node.
       Note: There now exists a Lock implementation in the ZooKeeper recipes directory. This is distributed with the release -- src/recipes/lock directory of the release artifact.
       Clients wishing to obtain a lock do the following:
       1. Call create() with a pathname of "_locknode_/guid-lock-" and the sequence and ephemeral flags set. The guid is needed in case the create() result is missed. See the note below.
       2. Call getChildren() on the lock node without setting the watch flag (this is important to avoid the herd effect).
       3. If the pathname created in step 1 has the lowest sequence number suffix, the client has the lock and the client exits the protocol.
       4. The client calls exists() with the watch flag set on the path in the lock directory with the next lowest sequence number.
       5. If exists() returns false, go to step 2. Otherwise, wait for a notification for the pathname from the previous step before going to step 2.
       The unlock protocol is very simple: clients wishing to release a lock simply delete the node they created in step 1.
       Here are a few things to notice:
       • The removal of a node will only cause one client to wake up since each node is watched by exactly one client. In this way, you avoid the herd effect.
       • There is no polling or timeouts.
       • Because of the way you implement locking, it is easy to see the amount of lock contention, break locks, debug locking problems, etc.
       Recoverable Errors and the GUID
       • If a recoverable error occurs calling create(), the client should call getChildren() and check for a node containing the guid used in the path name. This handles the case (noted above) of the create() succeeding on the server but the server crashing before returning the name of the new node.
    9. Even the Distribution Has Issues
       from org.apache.zookeeper.recipes.lock.WriteLock
       if (id == null) {
           long sessionId = zookeeper.getSessionId();
           String prefix = "x-" + sessionId + "-";
           // lets try look up the current ID if we failed
           // in the middle of creating the znode
           findPrefixInChildren(prefix, zookeeper, dir);
           idName = new ZNodeName(id);
       }
    10. Even the Distribution Has Issues
        from org.apache.zookeeper.recipes.lock.WriteLock
        if (id == null) {
            long sessionId = zookeeper.getSessionId();
            String prefix = "x-" + sessionId + "-";
            // lets try look up the current ID if we failed
            // in the middle of creating the znode
            findPrefixInChildren(prefix, zookeeper, dir);
            idName = new ZNodeName(id);
        }
        Bad handling of the Ephemeral-Sequential issue!
    11. What About ZKClient?
        • Unclear if it’s still being supported - eleven open issues (back to 10/1/2009)
        • README: “+ TBD”
        • No docs
        • Little or no retries
        • Design problems:
          • All exceptions converted to RuntimeException
          • Recipes/management code highly coupled
          • Lots of foreground synchronization
          • Small number of tests
          • ... etc ...
    12. Introducing Curator
    13. Introducing Curator
        Curator, n. ˈkyo͝orˌātər: a keeper or custodian of a museum or other collection - a ZooKeeper Keeper
        Three components:
        • Client - a replacement/wrapper for the bundled ZooKeeper class
        • Framework - a high-level API that greatly simplifies using ZooKeeper
        • Recipes - implementations of some of the common ZooKeeper "recipes", built on top of the Curator Framework
    14. Overview of Curator
    15. The Curator Stack
        • Client
        • Framework
        • Recipes
        • Extensions
    16. The Curator Stack
        • Client
        • Framework
        • Recipes
        • Extensions
        (stack diagram: Curator Recipes over Curator Framework over Curator Client over ZooKeeper)
    17. Curator is a platform for writing ZooKeeper Recipes
    18. Curator Client manages the ZooKeeper Connection
    19. (stack diagram: Curator Recipes / Curator Framework / Curator Client / ZooKeeper)
        Curator Client manages the ZooKeeper Connection
    20. Curator Framework uses retry for all operations and provides a friendlier API
    21. (stack diagram: Curator Recipes / Curator Framework / Curator Client / ZooKeeper)
        Curator Framework uses retry for all operations and provides a friendlier API
    22. Curator Recipes: implementations of all recipes listed on the ZK website (and more)
    23. (stack diagram: Curator Recipes / Curator Framework / Curator Client / ZooKeeper)
        Curator Recipes: implementations of all recipes listed on the ZK website (and more)
    24. The Recipes
    25. The Recipes
        • Leader Selector
        • Distributed Locks
        • Queues
        • Barriers
        • Counters
        • Atomics
        • ...
    26. CuratorFramework Instance
        CuratorFrameworkFactory.newClient(...)
        ---------------------
        CuratorFrameworkFactory.builder()
            .connectString("...")
            ...
            .build()
        Usually injected as a singleton
    27. Must Be Started
        client.start();
        // client is now ready for use
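       Putting slides 26 and 27 together, a minimal sketch of building and starting a client; the connect string and retry values are illustrative (the com.netflix.curator packages match the Maven coordinates shown at the end of the deck):

           import com.netflix.curator.framework.CuratorFramework;
           import com.netflix.curator.framework.CuratorFrameworkFactory;
           import com.netflix.curator.retry.ExponentialBackoffRetry;

           // Build the (usually singleton) client
           CuratorFramework client = CuratorFrameworkFactory.builder()
               .connectString("localhost:2181")                     // illustrative
               .retryPolicy(new ExponentialBackoffRetry(1000, 3))   // base sleep 1s, max 3 retries
               .build();

           client.start();  // must be called before any operation

           // Simple use: every call below is retried per the policy
           client.create().forPath("/example", "hello".getBytes());
           byte[] data = client.getData().forPath("/example");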
    28. Leader Selector
        By far the most common usage of ZooKeeper: a distributed lock with a notification mechanism
    29. Sample
    30. public class CleanupLeader implements LeaderSelectorListener
        {
            ...
            @Override
            public void takeLeadership(CuratorFramework client) throws Exception
            {
                while ( !Thread.currentThread().isInterrupted() )
                {
                    sleepUntilNextPeriod();
                    doPeriodicCleanup();
                }
            }
        }
        ...
        LeaderSelector leaderSelector = new LeaderSelector(client, path, new CleanupLeader());
        leaderSelector.start();
    31. Distributed Locks
        • InterProcessMutex
        • InterProcessReadWriteLock
        • InterProcessMultiLock
        • InterProcessSemaphore
    32. Distributed Locks
        • InterProcessMutex
        • InterProcessReadWriteLock
        • InterProcessMultiLock
        • InterProcessSemaphore
        Very similar to JDK locks
    33. Sample
    34. InterProcessMutex mutex = new InterProcessMutex(client, lockPath);
        mutex.acquire();
        try
        {
            // do work in critical section
        }
        finally
        {
            mutex.release();
        }
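       InterProcessMutex also offers a timed acquire that returns a boolean instead of blocking indefinitely; a sketch, with an illustrative timeout:

           import java.util.concurrent.TimeUnit;

           if ( mutex.acquire(10, TimeUnit.SECONDS) )  // false on timeout
           {
               try
               {
                   // do work in critical section
               }
               finally
               {
                   mutex.release();
               }
           }
           else
           {
               // lock not acquired - handle contention (skip, retry, alarm, ...)
           }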
    35. Low-Level Details
    36. public void process(WatchedEvent event)
        {
            boolean wasConnected = isConnected.get();
            boolean newIsConnected = wasConnected;
            if ( event.getType() == Watcher.Event.EventType.None )
            {
                newIsConnected = (event.getState() == Event.KeeperState.SyncConnected);
                if ( event.getState() == Event.KeeperState.Expired )
                {
                    handleExpiredSession();
                }
            }
            if ( newIsConnected != wasConnected )
            {
                isConnected.set(newIsConnected);
                connectionStartMs = System.currentTimeMillis();
            }
            ...
        }
    37. public static boolean shouldRetry(int rc)
        {
            return (rc == KeeperException.Code.CONNECTIONLOSS.intValue()) ||
                   (rc == KeeperException.Code.OPERATIONTIMEOUT.intValue()) ||
                   (rc == KeeperException.Code.SESSIONMOVED.intValue()) ||
                   (rc == KeeperException.Code.SESSIONEXPIRED.intValue());
        }

        public void takeException(Exception exception) throws Exception
        {
            boolean rethrow = true;
            if ( isRetryException(exception) )
            {
                if ( retryPolicy.allowRetry(retryCount++, System.currentTimeMillis() - startTimeMs) )
                {
                    rethrow = false;
                }
            }
            if ( rethrow )
            {
                throw exception;
            }
        }
    38. byte[] responseData = RetryLoop.callWithRetry(
            client.getZookeeperClient(),
            new Callable<byte[]>()
            {
                @Override
                public byte[] call() throws Exception
                {
                    return client.getZooKeeper().getData(path, ...);
                }
            }
        );
        return responseData;
    39. client.withProtectedEphemeralSequential()
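       In the fluent API this hangs off the create builder; a sketch of creating a protected ephemeral-sequential node (the path and the payload variable are illustrative):

           // If the create() response is lost, a retry first searches for the node
           // made by the earlier attempt (found via a GUID embedded in its name)
           // instead of creating a duplicate - see the next slide
           String actualPath = client.create()
               .withProtectedEphemeralSequential()
               .forPath("/example/item-", payload);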
    40. final AtomicBoolean firstTime = new AtomicBoolean(true);
        String returnPath = RetryLoop.callWithRetry(
            client.getZookeeperClient(),
            new Callable<String>()
            {
                @Override
                public String call() throws Exception
                {
                    ...
                    String createdPath = null;
                    if ( !firstTime.get() && doProtectedEphemeralSequential )
                    {
                        createdPath = findProtectedNodeInForeground(localPath);
                    }
                    ...
                }
            }
        );
    41. public interface ConnectionStateListener
        {
            public void stateChanged(CuratorFramework client, ConnectionState newState);
        }

        public enum ConnectionState
        {
            SUSPENDED, RECONNECTED, LOST
        }
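       A sketch of registering one of these listeners, assuming the framework's getConnectionStateListenable() accessor; the per-state comments are illustrative guidance, not deck content:

           client.getConnectionStateListenable().addListener(new ConnectionStateListener()
           {
               @Override
               public void stateChanged(CuratorFramework client, ConnectionState newState)
               {
                   switch ( newState )
                   {
                       case SUSPENDED:    // connection lost - pause work, it may recover
                           break;
                       case RECONNECTED:  // connection restored - resume, re-check state
                           break;
                       case LOST:         // session expired - ephemerals and watches are gone
                           break;
                   }
               }
           });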
    42. if ( e instanceof KeeperException.ConnectionLossException )
        {
            connectionStateManager.addStateChange(ConnectionState.LOST);
        }

        private void validateConnection(CuratorEvent curatorEvent)
        {
            if ( curatorEvent.getType() == CuratorEventType.WATCHED )
            {
                if ( curatorEvent.getWatchedEvent().getState() == Watcher.Event.KeeperState.Disconnected )
                {
                    connectionStateManager.addStateChange(ConnectionState.SUSPENDED);
                    internalSync(this, "/", null);
                }
                else if ( curatorEvent.getWatchedEvent().getState() == Watcher.Event.KeeperState.Expired )
                {
                    connectionStateManager.addStateChange(ConnectionState.LOST);
                }
                else if ( curatorEvent.getWatchedEvent().getState() == Watcher.Event.KeeperState.SyncConnected )
                {
                    connectionStateManager.addStateChange(ConnectionState.RECONNECTED);
                }
            }
        }
    43. Testing Utilities
    44. • TestingServer: manages an internally running ZooKeeper server
          // Create the server using a random port
          public TestingServer()
        • TestingCluster: manages an internally running ensemble of ZooKeeper servers
          // Creates an ensemble comprised of n servers.
          // Each server will use a temp directory and random ports
          public TestingCluster(int instanceQty)
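       A sketch of wiring a test to these utilities, assuming TestingServer lives in the com.netflix.curator.test module; the RetryOneTime policy and test path are illustrative:

           import com.netflix.curator.framework.CuratorFramework;
           import com.netflix.curator.framework.CuratorFrameworkFactory;
           import com.netflix.curator.retry.RetryOneTime;
           import com.netflix.curator.test.TestingServer;

           // In-process server on a random port; no external ZooKeeper needed
           TestingServer server = new TestingServer();
           CuratorFramework client = CuratorFrameworkFactory.newClient(
               server.getConnectString(), new RetryOneTime(1));
           try
           {
               client.start();
               client.create().forPath("/test", new byte[0]);
           }
           finally
           {
               client.close();
               server.close();
           }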
    45. Extensions
        • Discovery
        • Discovery REST Server
        • Exhibitor
        • ???
    46. Extensions
        • Discovery
        • Discovery REST Server
        • Exhibitor
        • ???
        (stack diagram: Curator Recipes / Curator Framework + Extensions / Curator Client / ZooKeeper)
    47. Exhibitor Sneak Peek
    48. Exhibitor Sneak Peek
    49. Exhibitor Sneak Peek
    50. Exhibitor Sneak Peek
    51. Exhibitor Sneak Peek
        March or April 2012 - open source on GitHub
    52. Netflix GitHub
    53. Netflix GitHub
        Netflix’s home for Open Source
    54. Maven Central
    55. Maven Central
        Binaries pushed to Maven Central:
        <dependency>
            <groupId>com.netflix.curator</groupId>
            <artifactId>curator-recipes</artifactId>
            <version>1.1.0</version>
        </dependency>
    56. (photo caption: “Much younger - much thinner”)
        Jordan Zimmerman
        jzimmerman@netflix.com
        @randgalt
    57. Q&A
        (photo caption: “Much younger - much thinner”)
        Jordan Zimmerman
        jzimmerman@netflix.com
        @randgalt
