http://zookeeper.apache.org/



GNUnify - 2013                    1
Saurav Haloi
    Engineer at Symantec
    Works on Hadoop & distributed systems
    FOSS enthusiast




A distributed system consists of multiple
          computers that communicate through a
         computer network and interact with each
             other to achieve a common goal.
                                         - Wikipedia




The network is reliable.
     Latency is zero.
     Bandwidth is infinite.
     The network is secure.
     Topology doesn't change.
     There is one administrator.
     Transport cost is zero.
     The network is homogeneous.
Reference: http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing


Coordination: An act that multiple nodes must
    perform together.
    Examples:
           Group membership
           Locking
           Publisher/Subscriber
           Leader Election
           Synchronization
    Getting node coordination correct is very hard!

ZooKeeper allows distributed processes to
        coordinate with each other through a shared
        hierarchical name space of data registers.
                                        - ZooKeeper Wiki

                 ZooKeeper is much more than a
                    distributed lock server!


An open source, high-performance coordination
    service for distributed applications.
    Exposes common services through a simple interface:
           naming
           configuration management
           locks & synchronization
           group services
                 … developers don't have to write them from scratch
    Build your own on it for specific needs.

Configuration Management
           Cluster member nodes bootstrapping configuration from a
           centralized source in an unattended way
           Easier, simpler deployment/provisioning
    Distributed Cluster Management
           Node join / leave
           Node statuses in real time
    Naming service – e.g. DNS
    Distributed synchronization - locks, barriers, queues
    Leader election in a distributed system.
    Centralized and highly reliable (simple) data registry

The ZooKeeper service is replicated over a set of machines.
    All machines store a copy of the data (in memory).
    A leader is elected on service startup.
    Each client connects to a single ZooKeeper server and maintains a TCP connection to it.
    A client can read from any ZooKeeper server; writes go through the leader and need
    majority consensus.

 Image: https://cwiki.apache.org/confluence/display/ZOOKEEPER/ProjectDescription

ZooKeeper has a hierarchical name space.
    Each node in the namespace is called a ZNode.
    Every ZNode has data (given as byte[]) and can
    optionally have children.
       parent : "foo"
         |-- child1 : "bar"
         |-- child2 : "spam"
         `-- child3 : "eggs"
                  `-- grandchild1 : "42"
    ZNode paths:
            canonical, absolute, slash-separated
            no relative references.
            names can have Unicode characters
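
The path rules above can be illustrated with a small check. This is a simplified sketch, not ZooKeeper's own validation code; the class name ZnodePathCheck and the method isValid are hypothetical names chosen for this example.

```java
final class ZnodePathCheck {
    /** Returns true if the path is absolute, slash-separated and free of relative segments. */
    static boolean isValid(String path) {
        if (path == null || !path.startsWith("/")) {
            return false;               // paths must be absolute
        }
        if (path.equals("/")) {
            return true;                // the root itself
        }
        if (path.endsWith("/")) {
            return false;               // canonical form has no trailing slash
        }
        for (String segment : path.substring(1).split("/", -1)) {
            if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
                return false;           // no empty or relative segments
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("/parent/child3/grandchild1")); // true
        System.out.println(isValid("parent/child1"));              // false: not absolute
        System.out.println(isValid("/parent/../child1"));          // false: relative reference
    }
}
```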

Each znode maintains a stat structure with
   version numbers for data
   changes, ACL changes and
   timestamps.
   Version numbers increase with
   changes.
   Data is read and written in its
   entirety.
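
A data version makes conditional writes possible: a writer states the version it last read, and the write is rejected if the data has changed since. The sketch below models this in memory only. VersionedRegister is a hypothetical class, not part of the ZooKeeper client, and it assumes the convention that an expected version of -1 matches any version.

```java
final class VersionedRegister {
    private byte[] data;
    private int dataVersion = 0;

    VersionedRegister(byte[] initial) {
        this.data = initial.clone();
    }

    /** Replaces the whole payload if expectedVersion matches (or is -1); returns true on success. */
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != dataVersion) {
            return false;               // another writer changed the data first
        }
        data = newData.clone();         // data is always replaced in its entirety
        dataVersion++;                  // every successful write bumps the version
        return true;
    }

    synchronized byte[] getData() {
        return data.clone();
    }

    synchronized int getDataVersion() {
        return dataVersion;
    }

    public static void main(String[] args) {
        VersionedRegister r = new VersionedRegister(new byte[] {1});
        System.out.println(r.setData(new byte[] {2}, 0)); // true: version was 0
        System.out.println(r.setData(new byte[] {3}, 0)); // false: version is now 1
    }
}
```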




                           Image: http://helix.incubator.apache.org/Architecture.html

Persistent Nodes
            exist until explicitly deleted
    Ephemeral Nodes
            exist as long as the session is active
            can’t have children
    Sequence Nodes (Unique Naming)
            append a monotonically increasing counter to the
            end of path
            applies to both persistent & ephemeral nodes
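
How a sequence suffix produces unique, ordered names can be sketched as follows. SequenceNamer is a hypothetical in-memory stand-in for the counter a parent znode keeps; the 10-digit zero-padded format follows the ZooKeeper documentation, while the prefix "n_" is only an example.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

final class SequenceNamer {
    // One counter per parent znode; this sketch models a single parent.
    private final AtomicInteger counter = new AtomicInteger();

    /** Appends the next counter value, zero-padded to 10 digits, to the requested path. */
    String next(String pathPrefix) {
        return pathPrefix + String.format(Locale.ROOT, "%010d", counter.getAndIncrement());
    }

    public static void main(String[] args) {
        SequenceNamer namer = new SequenceNamer();
        System.out.println(namer.next("/svc/election-path/n_")); // /svc/election-path/n_0000000000
        System.out.println(namer.next("/svc/election-path/n_")); // /svc/election-path/n_0000000001
    }
}
```

Because the suffix has a fixed width, sorting the names as plain strings gives the same order as sorting by counter value, provided the prefix is the same.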



Operation                    Type

                 create                       Write
                 delete                       Write
                 exists                       Read
                 getChildren                  Read
                 getData                      Read
                 setData                      Write
                 getACL                       Read
                 setACL                       Write
                 sync                         Read

                    ZNodes are the main entities that a programmer accesses.

[zk: localhost:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
     connect host:port
     get path [watch]
     ls path [watch]
     set path data [version]
     rmr path
     delquota [-n|-b] path
     quit
     printwatches on|off
     create [-s] [-e] path data acl
     stat path [watch]
     close
     ls2 path [watch]
     history
     listquota path
     setAcl path acl
     getAcl path
     sync path
     redo cmdno
     addauth scheme auth
     delete path [version]
     setquota -n|-b val path

[zk: localhost:2181(CONNECTED) 1] ls /
[hbase, zookeeper]
[zk: localhost:2181(CONNECTED) 2] ls2 /zookeeper
[quota]
cZxid = 0x0
ctime = Tue Jan 01 05:30:00 IST 2013
mZxid = 0x0
mtime = Tue Jan 01 05:30:00 IST 2013
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1
[zk: localhost:2181(CONNECTED) 3] create /test-znode HelloWorld
Created /test-znode
[zk: localhost:2181(CONNECTED) 4] ls /
[test-znode, hbase, zookeeper]
[zk: localhost:2181(CONNECTED) 5] get /test-znode
HelloWorld

Clients can set watches on znodes:
            NodeChildrenChanged
            NodeCreated
            NodeDataChanged
            NodeDeleted
    Changes to a znode trigger the watch and ZooKeeper
    sends the client a notification.
    Watches are one-time triggers.
    Watches are always ordered.
    A client sees the watch event before it sees the new znode data.
    Clients should account for the delay between receiving
    an event and sending a new request to set the watch again;
    changes made in that window are not notified.
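
The one-time nature of watches can be modelled with a tiny in-memory registry. This is an illustrative sketch only; OneShotWatches and its methods are hypothetical and do not mirror the ZooKeeper client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class OneShotWatches {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Registers a callback that will be notified of the next event on the path only. */
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    /** Fires and discards every watch registered on the path. */
    void trigger(String path, String eventType) {
        // Remove first, so a callback may safely re-register itself while being notified.
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(callback -> callback.accept(eventType));
        }
    }

    public static void main(String[] args) {
        OneShotWatches registry = new OneShotWatches();
        registry.watch("/members", event -> System.out.println("got " + event));
        registry.trigger("/members", "NodeChildrenChanged"); // notifies the watcher
        registry.trigger("/members", "NodeChildrenChanged"); // nobody is watching any more
    }
}
```

A client that wants continuous notifications has to call watch again from inside its callback, and read the current state at the same time to catch changes it missed.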
API methods come in synchronous and asynchronous forms
    // exists() is a method of org.apache.zookeeper.ZooKeeper;
    // StatCallback is org.apache.zookeeper.AsyncCallback.StatCallback
    Sync:
      exists("/test-cluster/CONFIGS", null);
    Async:
      exists("/test-cluster/CONFIGS", null, new StatCallback() {
         @Override
         public void processResult(int rc, String path, Object ctx, Stat stat)
         {
                   // process the result when called back later
         }
      }, null);


Read requests are processed locally at the ZooKeeper server to
    which the client is currently connected
    Write requests are forwarded to the leader and go through
    majority consensus before a response is generated.
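
The majority rule determines how many servers must acknowledge a write and how many failures an ensemble survives. A minimal sketch of the arithmetic, with a hypothetical helper class Quorum:

```java
final class Quorum {
    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still available. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.println(n + " servers: majority " + majority(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

Note that 4 servers tolerate only 1 failure, the same as 3, which is why ensembles are usually sized with an odd number of servers.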
                   Image: http://www.slideshare.net/scottleber/apache-zookeeper

Sequential Consistency: Updates are applied in the order in which they were sent.
    Atomicity: Updates either succeed or fail; there are no partial results.
    Single System Image: A client sees the same view of
    the service regardless of the ZK server it connects to.
    Reliability: Once applied, an update persists until
    a client overwrites it.
    Timeliness: The clients’ view of the system is
    guaranteed to be up-to-date within a certain time
    bound. (Eventual Consistency)




Each Client Host i, i:=1 .. N
1. Watch on /members
2. Create /members/host-${i} as
     ephemeral nodes
3. Node Join/Leave generates alert
4. Keep updating /members/host-${i}
     periodically for node status
     changes
    (load, memory, CPU etc.)

Cluster
    /members
        host-1
        host-2
        host-N
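
When the watch on /members fires, a client can compare the previous and current child lists to see who joined and who left. The sketch below shows only that comparison; MembershipDiff is a hypothetical helper, and fetching the child lists from ZooKeeper is left out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class MembershipDiff {
    /** Members present now that were absent before. */
    static Set<String> joined(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(after);
        result.removeAll(before);
        return result;
    }

    /** Members present before that are absent now. */
    static Set<String> left(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(before);
        result.removeAll(after);
        return result;
    }

    public static void main(String[] args) {
        Set<String> before = new TreeSet<>(Arrays.asList("host-1", "host-2"));
        Set<String> after = new TreeSet<>(Arrays.asList("host-2", "host-3"));
        System.out.println("joined: " + joined(before, after)); // joined: [host-3]
        System.out.println("left: " + left(before, after));     // left: [host-1]
    }
}
```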




1. Pick a znode, say "/svc/election-path".
2. All participants of the election process
   create an ephemeral-sequential node
   on the same election path.
3. The node with the smallest sequence
   number is the leader.
4. Each “follower” node listens to the
   node with the next lower seq. number
5. Upon leader removal go to
   election-path and find a new leader,
   or become the leader if it has the
   lowest sequence number.
6. Upon session expiration check the
   election state and go to election if
   needed
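
The ordering logic behind steps 3 and 4 can be sketched on plain child names. ElectionOrder is a hypothetical helper; it assumes every child name ends with the 10-digit sequence counter, and the "n_" prefix is only an example. Talking to ZooKeeper itself is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class ElectionOrder {
    /** Parses the 10-digit counter at the end of a sequential znode name. */
    static int sequenceOf(String name) {
        return Integer.parseInt(name.substring(name.length() - 10));
    }

    static List<String> sorted(List<String> children) {
        List<String> copy = new ArrayList<>(children);
        copy.sort(Comparator.comparingInt(ElectionOrder::sequenceOf));
        return copy;
    }

    /** The child with the smallest sequence number is the leader. */
    static String leader(List<String> children) {
        return sorted(children).get(0);
    }

    /** The node to watch: next lower sequence number; empty for the leader or an unknown node. */
    static Optional<String> predecessor(List<String> children, String self) {
        List<String> order = sorted(children);
        int index = order.indexOf(self);
        return index > 0 ? Optional.of(order.get(index - 1)) : Optional.empty();
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("n_0000000003", "n_0000000001", "n_0000000002");
        System.out.println(leader(children));                      // n_0000000001
        System.out.println(predecessor(children, "n_0000000003")); // Optional[n_0000000002]
    }
}
```

Watching only the predecessor, rather than the leader, means a leader failure wakes a single follower instead of all of them.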



 Image: http://techblog.outbrain.com/2011/07/leader-election-with-zookeeper/

Assuming there are N clients trying to acquire a lock:
   Each client creates an
   ephemeral, sequential znode under
   the path /Cluster/_locknode_
   Each client requests the list of children of
   the lock znode (i.e. _locknode_)
   The client with the lowest ID according to
   natural ordering holds the lock.
   Every other client sets a watch on the
   znode with the ID immediately preceding
   its own ID
   and checks for the lock again when it is
   notified.
   A client wishing to release the lock
   deletes its node, which triggers the
   next client in line to acquire the lock.

ZK
|---Cluster
    +---config
    +---memberships
    +---_locknode_
        +---host1-3278451
        +---host2-3278452
        +---host3-3278453
        +--- …
        ---hostN-3278XXX
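
The hand-over described above can be modelled as an ordered queue of waiters. This is an in-memory sketch under simplifying assumptions: LockQueue is a hypothetical class, sequence numbers start at 0, and removing an entry stands for the znode being deleted, whether by an explicit release or by a session ending.

```java
import java.util.TreeMap;

final class LockQueue {
    private final TreeMap<Integer, String> waiters = new TreeMap<>();
    private int nextSequence = 0;

    /** Models creating an ephemeral, sequential child; returns its sequence number. */
    int enqueue(String client) {
        int sequence = nextSequence++;
        waiters.put(sequence, client);
        return sequence;
    }

    /** The client with the lowest sequence number holds the lock, or null if nobody waits. */
    String holder() {
        return waiters.isEmpty() ? null : waiters.firstEntry().getValue();
    }

    /** Models the child znode being deleted. */
    void remove(int sequence) {
        waiters.remove(sequence);
    }

    public static void main(String[] args) {
        LockQueue lock = new LockQueue();
        int first = lock.enqueue("host1");
        lock.enqueue("host2");
        System.out.println(lock.holder()); // host1
        lock.remove(first);                // host1 releases the lock
        System.out.println(lock.holder()); // host2
    }
}
```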

ZooKeeper ships client libraries in:
            Java
            C
            Perl
            Python
    Community contributed client bindings available for
    Scala, C#, Node.js, Ruby, Erlang, Go, Haskell
      https://cwiki.apache.org/ZOOKEEPER/zkclientbindings.html




Watches are one-time triggers
            Continuous watching on znodes requires resetting the watch
            after every event / trigger
    Too many watches on a single znode create the “herd
    effect”, causing bursts of traffic and limiting scalability
    A znode may change multiple times between getting the
    event and setting the watch again; handle this case carefully!
    Keep session time-outs long enough to handle long
    garbage-collection pauses in applications.
    Set Java max heap size correctly to avoid swapping.
    Use a dedicated disk for the ZooKeeper transaction log


Companies:                               Projects in FOSS:
   •    Yahoo!                              •   Apache Map/Reduce (Yarn)
   •    Zynga                               •   Apache HBase
   •    Rackspace                           •   Apache Solr
   •    LinkedIn                            •   Neo4j
   •    Netflix                             •   Katta
   •    and many more…                      •   and many more…




Reference: https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy

Used within Twitter for service discovery
    How?
           Services register themselves in ZooKeeper
           Clients query the production cluster for service “A” in data
           center “XYZ”
           An up-to-date host list for each service is maintained
           Whenever new capacity is added the client will
           automatically be aware
           Also, enables load balancing across all servers.




      Reference: http://engineering.twitter.com/

The Chubby lock service for loosely-coupled distributed systems
      Google Research (7th USENIX Symposium on Operating Systems Design and
      Implementation (OSDI), USENIX (2006))
    ZooKeeper: Wait-free coordination for Internet-scale systems
      Yahoo Research (USENIX Annual Technical Conference 2010)
    Apache ZooKeeper Home: http://zookeeper.apache.org/
    Presentations:
            http://www.slideshare.net/mumrah/introduction-to-zookeeper-trihug-may-22-2012
            http://www.slideshare.net/scottleber/apache-zookeeper
            https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeperPresentations




The Google File System
    The Hadoop Distributed File System
    MapReduce: Simplified Data Processing on Large Clusters
    Bigtable: A Distributed Storage System for Structured Data
    PNUTS: Yahoo!’s Hosted Data Serving Platform
    Dynamo: Amazon's Highly Available Key-value Store
    Spanner: Google's Globally Distributed Database
    Centrifuge: Integrated Lease Management and Partitioning
    for Cloud Services (Microsoft)
    ZAB: A simple totally ordered broadcast protocol (Yahoo!)
    Paxos Made Simple by Leslie Lamport.
    Eventually Consistent by Werner Vogels (CTO, Amazon)
    http://www.highscalability.com/

Questions?


Thank You!
                 Saurav Haloi
                 saurav.haloi@yahoo.com
                 Twitter: sauravhaloi




 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
Figma AI Design Generator_ In-Depth Review.pdf
Figma AI Design Generator_ In-Depth Review.pdfFigma AI Design Generator_ In-Depth Review.pdf
Figma AI Design Generator_ In-Depth Review.pdf
 
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
The importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT StandardizationThe importance of Quality Assurance for ICT Standardization
The importance of Quality Assurance for ICT Standardization
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
 

Introduction to Apache ZooKeeper

  • 2. Saurav Haloi: Engineer at Symantec. Works on Hadoop and distributed systems. FOSS enthusiast.
  • 3. "A distributed system consists of multiple computers that communicate through a computer network and interact with each other to achieve a common goal." - Wikipedia
  • 4. Fallacies of distributed computing: The network is reliable. Latency is zero. Bandwidth is infinite. The network is secure. Topology doesn't change. There is one administrator. Transport cost is zero. The network is homogeneous. Reference: http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
  • 5. Coordination: an act that multiple nodes must perform together. Examples: group membership, locking, publisher/subscriber, leader election, synchronization. Getting node coordination correct is very hard!
  • 7. "ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers." - ZooKeeper Wiki. ZooKeeper is much more than a distributed lock server!
  • 8. An open source, high-performance coordination service for distributed applications. Exposes common services through a simple interface: naming, configuration management, locks and synchronization, group services. Developers don't have to write them from scratch. Build your own on top of it for specific needs.
  • 9. Configuration management: cluster member nodes bootstrap their configuration from a centralized source, unattended, which makes deployment and provisioning easier and simpler. Distributed cluster management: node join/leave, node statuses in real time. Naming service, e.g. DNS. Distributed synchronization: locks, barriers, queues. Leader election in a distributed system. Centralized and highly reliable (simple) data registry.
  • 10. The ZooKeeper service is replicated over a set of machines. All machines store a copy of the data (in memory). A leader is elected on service startup. Each client connects to a single ZooKeeper server and maintains a TCP connection to it. A client can read from any ZooKeeper server; writes go through the leader and need majority consensus. Image: https://cwiki.apache.org/confluence/display/ZOOKEEPER/ProjectDescription
  • 11. ZooKeeper has a hierarchical name space. Each node in the namespace is called a ZNode. Every ZNode has data (given as byte[]) and can optionally have children.
        parent : "foo"
        |-- child1 : "bar"
        |-- child2 : "spam"
        `-- child3 : "eggs"
            `-- grandchild1 : "42"
        ZNode paths are canonical, absolute, and slash-separated, with no relative references. Names can contain Unicode characters.
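The path rules on this slide can be sketched as a small check. This is only an illustration of the rules as stated (absolute, slash-separated, no relative references); `ZNodePaths` and `isValid` are hypothetical names, not part of the ZooKeeper client, and the real server applies further restrictions that are not modeled here.

```java
/** Hypothetical helper: checks only the path rules listed on the slide. */
final class ZNodePaths {
    private ZNodePaths() {}

    static boolean isValid(String path) {
        // Paths must be absolute, so they start with a slash.
        if (path == null || !path.startsWith("/")) {
            return false;
        }
        // The root znode is the only path allowed to end with a slash.
        if (path.equals("/")) {
            return true;
        }
        if (path.endsWith("/")) {
            return false;
        }
        // No empty segments and no relative references such as "." or "..".
        for (String segment : path.substring(1).split("/", -1)) {
            if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
                return false;
            }
        }
        return true;
    }
}
```

For the tree above, `/parent/child3/grandchild1` passes, while `parent/child1` and `/parent/../child1` are rejected.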
  • 12. Each ZNode maintains a stat structure with version numbers for data changes and ACL changes, plus timestamps. Version numbers increase with each change. Data is read and written in its entirety. Image: http://helix.incubator.apache.org/Architecture.html
  • 13. Persistent nodes: exist until explicitly deleted. Ephemeral nodes: exist as long as the session is active; cannot have children. Sequence nodes (unique naming): append a monotonically increasing counter to the end of the path; applies to both persistent and ephemeral nodes.
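As a rough illustration of sequence naming: ZooKeeper formats the appended counter as ten zero-padded digits, so sorting names as strings matches sorting by counter. The class and method names below are hypothetical, and the counter is passed in by hand here, whereas in ZooKeeper the server assigns it.

```java
import java.util.Locale;

/** Hypothetical helper that mimics the name of a sequence znode. */
final class SequenceNames {
    private SequenceNames() {}

    /** Appends the counter as 10 zero-padded digits, e.g. "lock-" + 7 becomes "lock-0000000007". */
    static String withSequence(String pathPrefix, int counter) {
        return String.format(Locale.ROOT, "%s%010d", pathPrefix, counter);
    }

    /** Reads the counter back from the last 10 characters of a node name. */
    static int sequenceOf(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.length() - 10));
    }
}
```

The zero padding is what lets a client compare its own node against its siblings with a plain sort.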
  • 14. Operations on ZNodes:
        Operation     Type
        create        Write
        delete        Write
        exists        Read
        getChildren   Read
        getData       Read
        setData       Write
        getACL        Read
        setACL        Write
        sync          Read
        ZNodes are the main entity that a programmer accesses.
  • 15. Command-line client, help output:
        [zk: localhost:2181(CONNECTED) 0] help
        ZooKeeper -server host:port cmd args
                connect host:port
                get path [watch]
                ls path [watch]
                set path data [version]
                rmr path
                delquota [-n|-b] path
                quit
                printwatches on|off
                create [-s] [-e] path data acl
                stat path [watch]
                close
                ls2 path [watch]
                history
                listquota path
                setAcl path acl
                getAcl path
                sync path
                redo cmdno
                addauth scheme auth
                delete path [version]
                setquota -n|-b val path
        Example session:
        [zk: localhost:2181(CONNECTED) 1] ls /
        [hbase, zookeeper]
        [zk: localhost:2181(CONNECTED) 2] ls2 /zookeeper
        [quota]
        cZxid = 0x0
        ctime = Tue Jan 01 05:30:00 IST 2013
        mZxid = 0x0
        mtime = Tue Jan 01 05:30:00 IST 2013
        pZxid = 0x0
        cversion = -1
        dataVersion = 0
        aclVersion = 0
        ephemeralOwner = 0x0
        dataLength = 0
        numChildren = 1
        [zk: localhost:2181(CONNECTED) 3] create /test-znode HelloWorld
        Created /test-znode
        [zk: localhost:2181(CONNECTED) 4] ls /
        [test-znode, hbase, zookeeper]
        [zk: localhost:2181(CONNECTED) 5] get /test-znode
        HelloWorld
  • 16. Clients can set watches on znodes. Event types: NodeChildrenChanged, NodeCreated, NodeDataChanged, NodeDeleted. Changes to a znode trigger the watch, and ZooKeeper sends the client a notification. Watches are one-time triggers. Watches are always ordered. A client sees the watch event before it sees the new znode data. Clients should handle the latency between receiving the event and sending a new request to set the watch again.
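To make the one-time-trigger behavior concrete, here is a small in-memory simulation. It does not use the ZooKeeper client at all; `OneShotWatches`, `watch`, and `fire` are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for a server that keeps one-time watches per path. */
final class OneShotWatches {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Registers a watcher that will receive at most one event for this path. */
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    /** Simulates a change to the path and returns how many watchers were notified. */
    int fire(String path, String eventType) {
        // Watches are removed before delivery, so each one fires only once.
        List<Consumer<String>> fired = watches.remove(path);
        if (fired == null) {
            return 0;
        }
        for (Consumer<String> watcher : fired) {
            watcher.accept(eventType);
        }
        return fired.size();
    }
}
```

A second change to the same path notifies nobody unless the client has called `watch` again, and any change that happens before the re-registration goes unseen by that client.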
  • 17. API methods are available in both synchronous and asynchronous forms.
        Sync:
            exists("/test-cluster/CONFIGS", null);
        Async:
            exists("/test-cluster/CONFIGS", null, new StatCallback() {
                @Override
                public void processResult(int rc, String path, Object ctx, Stat stat) {
                    // process the result when called back later
                }
            }, null);
  • 18. Read requests are processed locally at the ZooKeeper server to which the client is currently connected. Write requests are forwarded to the leader and go through majority consensus before a response is generated. Image: http://www.slideshare.net/scottleber/apache-zookeeper
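The arithmetic behind "majority consensus" is short enough to write out. This sketch assumes a plain majority quorum over an ensemble of n servers; the class and method names are hypothetical.

```java
/** Hypothetical helper for majority-quorum arithmetic. */
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still reachable. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }
}
```

With 5 servers a write needs 3 acknowledgements and the service survives 2 failures. With 4 servers it still needs 3 and survives only 1, which is why ensembles usually have an odd number of servers.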
  • 19. Sequential consistency: updates are applied in order. Atomicity: updates either succeed or fail. Single system image: a client sees the same view of the service regardless of the ZooKeeper server it connects to. Reliability: updates persist once applied, until overwritten by a client. Timeliness: the clients' view of the system is guaranteed to be up to date within a certain time bound (eventual consistency).
  • 20. Each client host i, i := 1 .. N:
        1. Sets a watch on /members.
        2. Creates /members/host-${i} as an ephemeral node.
        3. A node join or leave generates an alert.
        4. Keeps updating /members/host-${i} periodically with node status changes (load, memory, CPU, etc.).
        Diagram: a cluster whose /members znode has the children host-1, host-2, host-N.
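When the watch on /members fires, a client typically re-reads the child list and compares it with the list it saw before. A minimal sketch of that comparison, with hypothetical names and plain sets standing in for the results of two child listings:

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: derives join/leave alerts from two snapshots of /members. */
final class MembershipDiff {
    private MembershipDiff() {}

    /** Hosts present now that were not present before. */
    static Set<String> joined(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(after);
        result.removeAll(before);
        return result;
    }

    /** Hosts present before that are gone now, for example because their session ended. */
    static Set<String> left(Set<String> before, Set<String> after) {
        Set<String> result = new TreeSet<>(before);
        result.removeAll(after);
        return result;
    }
}
```

Because the member znodes are ephemeral, a crashed host shows up in `left` once its session expires, without the host having to deregister itself.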
  • 21. Leader election:
        1. Pick a znode as the election path, say "/svc/election-path".
        2. All participants in the election create an ephemeral-sequential node under the same election path.
        3. The node with the smallest sequence number is the leader.
        4. Each "follower" node watches the node with the next lower sequence number.
        5. When the leader is removed, a follower goes to the election path and finds the new leader, or becomes the leader itself if it has the lowest sequence number.
        6. On session expiration, a participant checks the election state and rejoins the election if needed.
        Image: http://techblog.outbrain.com/2011/07/leader-election-with-zookeeper/
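Steps 3 and 4 above come down to sorting the children of the election path by sequence number. The sketch below works on a plain list of child names and assumes each name ends in a ten-digit sequence suffix; `ElectionSketch` and its methods are hypothetical names, and no ZooKeeper calls are made.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: decides leader and watch target from the election path's children. */
final class ElectionSketch {
    private ElectionSketch() {}

    /** Assumes the name ends with a 10-digit sequence suffix. */
    static int sequenceOf(String name) {
        return Integer.parseInt(name.substring(name.length() - 10));
    }

    static List<String> sortedBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(ElectionSketch::sequenceOf));
        return sorted;
    }

    /** Step 3: the child with the smallest sequence number leads. Requires a non-empty list. */
    static String leader(List<String> children) {
        return sortedBySequence(children).get(0);
    }

    /** Step 4: the node to watch is the one just below our own; the leader watches nothing. */
    static Optional<String> nodeToWatch(List<String> children, String self) {
        List<String> sorted = sortedBySequence(children);
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("own node is not among the children: " + self);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

Watching only the immediate predecessor, rather than the leader, means that a leader failure wakes up one participant instead of all of them.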
  • 22. Assuming there are N clients trying to acquire a lock:
        Each client creates an ephemeral, sequential znode under the path /Cluster/_locknode_.
        Each client requests the list of children of the lock znode (i.e. _locknode_).
        The client with the lowest ID according to natural ordering holds the lock.
        Every other client sets a watch on the znode whose ID immediately precedes its own.
        When notified, a client checks again whether it now holds the lock.
        A client wishing to release the lock deletes its node, which triggers the next client in line to acquire the lock.
        ZK
        |---Cluster
            +---config
            +---memberships
            +---_locknode_
                +---host1-3278451
                +---host2-3278452
                +---host3-3278453
                +--- …
                ---hostN-3278XXX
  • 23. ZooKeeper ships client libraries in Java, C, Perl, and Python. Community-contributed client bindings are available for Scala, C#, Node.js, Ruby, Erlang, Go, and Haskell: https://cwiki.apache.org/ZOOKEEPER/zkclientbindings.html
  • 24. Watches are one-time triggers: continuous watching of a znode requires resetting the watch after every event. Too many watches on a single znode create the "herd effect", causing bursts of traffic and limiting scalability. A znode may change multiple times between receiving an event and setting the watch again; handle this case carefully! Keep session timeouts long enough to tolerate long garbage-collection pauses in applications. Set the Java max heap size correctly to avoid swapping. Use a dedicated disk for the ZooKeeper transaction log.
  • 25. Companies: Yahoo!, Zynga, Rackspace, LinkedIn, Netflix, and many more… Projects in FOSS: Apache Map/Reduce (Yarn), Apache HBase, Apache Solr, Neo4j, Katta, and many more… Reference: https://cwiki.apache.org/confluence/display/ZOOKEEPER/PoweredBy
  • 26. Used within Twitter for service discovery. How? Services register themselves in ZooKeeper. Clients query, for example, the production cluster for service "A" in data center "XYZ". An up-to-date host list is maintained for each service. Whenever new capacity is added, clients become aware of it automatically. This also enables load balancing across all servers. Reference: http://engineering.twitter.com/
  • 27. References. "The Chubby lock service for loosely-coupled distributed systems", Google Research (7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006). "ZooKeeper: Wait-free coordination for Internet-scale systems", Yahoo Research (USENIX Annual Technical Conference 2010). Apache ZooKeeper home: http://zookeeper.apache.org/ Presentations: http://www.slideshare.net/mumrah/introduction-to-zookeeper-trihug-may-22-2012 http://www.slideshare.net/scottleber/apache-zookeeper https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeperPresentations
  • 28. Further reading: The Google File System; The Hadoop Distributed File System; MapReduce: Simplified Data Processing on Large Clusters; Bigtable: A Distributed Storage System for Structured Data; PNUTS: Yahoo!'s Hosted Data Serving Platform; Dynamo: Amazon's Highly Available Key-value Store; Spanner: Google's Globally Distributed Database; Centrifuge: Integrated Lease Management and Partitioning for Cloud Services (Microsoft); ZAB: A simple totally ordered broadcast protocol (Yahoo!); Paxos Made Simple by Leslie Lamport; Eventually Consistent by Werner Vogels (CTO, Amazon); http://www.highscalability.com/
  • 30. Thank You! Saurav Haloi, saurav.haloi@yahoo.com, Twitter: sauravhaloi