© 2015 IBM Corporation
ZooKeeper And Embedded ZooKeeper
IBM InfoSphere Streams Version 4.0
Yip-Hing Ng
Senior Software Engineer
Streams Platform Team
yipng@us.ibm.com
Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL
PURPOSES ONLY.
WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE
INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY
OF ANY KIND, EXPRESS OR IMPLIED.
IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY,
WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.
IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR
OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR
THEIR SUPPLIERS AND/OR LICENSORS); OR
• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT
GOVERNING THE USE OF IBM SOFTWARE.
IBM’s statements regarding its plans, directions, and intent are subject to change or
withdrawal without notice at IBM’s sole discretion. Information regarding potential
future products is intended to outline our general product direction and it should not
be relied on in making a purchasing decision. The information mentioned regarding
potential future products is not a commitment, promise, or legal obligation to deliver
any material, code or functionality. Information about potential future products may
not be incorporated into any contract. The development, release, and timing of any
future features or functionality described for our products remains at our sole
discretion.
Agenda
• Apache ZooKeeper Overview
• ZooKeeper Architecture
• ZooKeeper Data Model
• ZooKeeper Consistency Guarantees
• Embedded ZooKeeper
• External ZooKeeper
• ZooKeeper Guidelines/Best Practices
• Questions
Apache ZooKeeper Overview
• A highly scalable, open source coordination service for distributed applications
• Key component and prerequisite of Streams Version 4.0
  – Requires ZooKeeper v3.4.6 or later
• Apache Software Foundation project
  – Used in the Apache Hadoop and HBase projects
• Provides a set of primitives for implementing higher-level constructs in a distributed system, such as:
  – Configuration maintenance
  – Synchronization
  – Leader election
  – Groups and naming services
  – Work queues
• High availability
  – Replication
ZooKeeper Architecture
[Diagram: a ZooKeeper ensemble of three servers, with a follower on Host A, the leader on Host B, and a follower on Host C. Clients connect to the servers; most clients read and one client writes.]
ZooKeeper Data Model
• Hierarchical namespace (similar to a distributed file system)
• Each node, called a znode, can have its own data and child nodes
• A path is always a canonical absolute path (there are no relative paths)
  – e.g.: /app1/p1
• Each znode maintains a stat structure
  – Version (used for conditional updates)
  – ACL
• Watches notify clients of data changes; each watch triggers only once
• Znode types
  – Persistent
    – Exists until it is explicitly deleted
  – Ephemeral
    – Deleted when the session that created it expires
    – Not allowed to have children
  – Sequential
    – Can be persistent or ephemeral
    – Monotonic sequence counter, helpful for synchronization, e.g.: /app2/p1-0000000001
Example namespace:
/
  /app1
    /app1/p1
    /app1/p2
    /app1/p3
  /app2
    /app2/p1
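The znode rules above can be illustrated with a toy in-memory model. This is a sketch only: `ZnodeTree` and its methods are hypothetical names invented for this example and are not the ZooKeeper client API. It assumes the sequence counter is kept per parent znode and appended as a 10-digit, zero-padded number, which matches the /app2/p1-0000000001 example.

```python
class ZnodeTree:
    """Toy in-memory model of the znode rules (not the ZooKeeper client API)."""

    def __init__(self):
        # Maps absolute path -> node record; the root always exists.
        self.nodes = {"/": self._new_node(b"", False)}

    @staticmethod
    def _new_node(data, ephemeral):
        return {"data": data, "version": 0, "ephemeral": ephemeral, "next_seq": 0}

    def create(self, path, data=b"", ephemeral=False, sequential=False):
        if not path.startswith("/"):
            raise ValueError("paths must be absolute")
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("parent does not exist: " + parent)
        if self.nodes[parent]["ephemeral"]:
            raise ValueError("ephemeral znodes cannot have children")
        if sequential:
            # The parent hands out a monotonically increasing counter.
            seq = self.nodes[parent]["next_seq"]
            self.nodes[parent]["next_seq"] = seq + 1
            path = "%s%010d" % (path, seq)
        if path in self.nodes:
            raise KeyError("node already exists: " + path)
        self.nodes[path] = self._new_node(data, ephemeral)
        return path

    def set_data(self, path, data, expected_version):
        # Conditional update: succeeds only if the caller saw the latest version.
        node = self.nodes[path]
        if node["version"] != expected_version:
            raise ValueError("version mismatch")
        node["data"] = data
        node["version"] += 1
        return node["version"]

    def expire_session(self):
        # Simplification: a single session owns every ephemeral znode.
        for path in [p for p, n in self.nodes.items() if n["ephemeral"]]:
            del self.nodes[path]


tree = ZnodeTree()
tree.create("/app2")
print(tree.create("/app2/p1-", sequential=True))
print(tree.create("/app2/p1-", sequential=True))
```

The version check in `set_data` is what makes a conditional update safe: a writer that read a stale version is rejected instead of silently overwriting newer data.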
ZooKeeper Consistency Guarantees
• Sequential Consistency
  – Updates from a client are applied in the order in which they were sent
• Atomicity
  – Updates either succeed or fail; there are no partial results
• Reliability
  – Once an update has been applied, it persists until another update overwrites it
• Timeliness
  – A client's view is guaranteed to be up to date within a certain time bound
• Single System Image
  – A client sees the same view of the service regardless of the ZooKeeper server it connects to
Embedded ZooKeeper
• Managed by Streams (start, stop, etc.) to simplify the Streams prerequisites
• Basic domain creation by Domain Manager or via streamtool
  – e.g.: streamtool mkdomain -d streamsdomain1 --embeddedzk
• Primarily used for a single-node developer environment; not recommended for a production environment
• A supervisor process (watchdog) runs side by side with embedded ZooKeeper
• Can be started or stopped manually via streamtool (when there is no active domain)
  – To start it: streamtool embeddedzk --start
  – To stop it: streamtool embeddedzk --stop
  – To get its status: streamtool embeddedzk --status
Embedded ZooKeeper (cont.)
• Embedded ZooKeeper configuration can be set via streamtool
• ZooKeeper server configuration parameters are prefixed with:
  – streams.zookeeper.property.
  – e.g.: To change the client port to 21810:
    streamtool setbootproperty streams.zookeeper.property.clientPort=21810
• Default embedded ZooKeeper dataDir location
  – $HOME/.streams/var/embeddedzk/datadir
• Default embedded ZooKeeper and ZKMonitor log/trace file location
  – $HOME/.streams/var/embeddedzk
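The prefix rule above is mechanical, so a small helper can build the command line for any ZooKeeper server parameter. This is a minimal sketch: the function names are hypothetical, and the only command form used is the `streamtool setbootproperty name=value` form shown on the slide.

```python
# Prefix that the slide gives for ZooKeeper server parameters.
ZK_PROPERTY_PREFIX = "streams.zookeeper.property."


def boot_property(zk_name, value):
    """Return the name=value argument for one ZooKeeper server parameter."""
    return "%s%s=%s" % (ZK_PROPERTY_PREFIX, zk_name, value)


def setbootproperty_command(zk_name, value):
    """Return the full command as an argument list (one property per command)."""
    return ["streamtool", "setbootproperty", boot_property(zk_name, value)]


print(" ".join(setbootproperty_command("clientPort", 21810)))
```

The argument list can be handed to `subprocess.run` on a host where streamtool is installed; the sketch only prints it.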
Embedded ZooKeeper (cont.)
[Diagram: a single host running embedded ZooKeeper and its ZK Monitor alongside the Controller and the Streams components Audit Log, SWS, JMX, AAS, SAM, SRM, View, APP, SCH, and HC.]
External ZooKeeper
• Not managed by Streams
• Standalone or replicated mode
• Specify the STREAMS_ZKCONNECT environment variable or the streamtool --zkconnect option
  – e.g.: streamtool mkdomain -d streamsdomain2 --zkconnect zkserver1:2181,zkserver2:2181,zkserver3:2181
• Enterprise domain, used for multiple users and hosts
• For reliability and high availability in a production environment, it is recommended to run ZooKeeper as an ensemble of servers.
• ZooKeeper ensemble
  – Writes
    – All writes go through the leader
    – Global ordering (zxid)
  – Reads
    – Served from memory
    – Followers track the leader (a follower can lag behind it, but is eventually consistent)
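The connection string passed to --zkconnect is a comma-separated list of host:port pairs. A minimal sketch of how a script might split it for validation follows; `parse_zkconnect` is a hypothetical helper, the fallback to port 2181 when no port is given is an assumption of this example, and IPv6 literals are not handled.

```python
def parse_zkconnect(connect_string, default_port=2181):
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.partition(":")
        servers.append((host, int(port) if sep else default_port))
    return servers


servers = parse_zkconnect("zkserver1:2181,zkserver2:2181,zkserver3:2181")
print(servers)
```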
External ZooKeeper (Standalone Mode)
[Diagram: a single host running one external ZooKeeper server alongside the Controller and the Streams components Audit Log, SWS, JMX, AAS, SAM, SRM, View, APP, SCH, and HC.]
External ZooKeeper (Replicated Mode)
[Diagram: six hosts. Hosts A, B, and C each run a ZooKeeper server (follower, leader, and follower respectively), a Controller, and the components AAS, SAM, Audit Log, JMX, SRM, SCH, and View; Host A also runs SWS. Hosts D, E, and F each run a Controller, HC, and APP.]
ZooKeeper Guidelines/Best Practices
• The ZooKeeper Admin Guide does not recommend standalone mode in a production
environment. ZooKeeper runs as an ensemble of ZooKeeper servers. For reliability and
availability, run ZooKeeper on at least 3 hosts. Running ZooKeeper on 5 hosts is preferred.
• For optimal performance and response time, run the ZooKeeper server on a dedicated
machine, and use a dedicated device for the transaction log.
• Having a supervisory process that manages each of the ZooKeeper server processes
ensures that if the ZooKeeper process exits abnormally, it is restarted automatically and
rejoins the cluster.
• If you use the default ZooKeeper configuration, ZooKeeper does not remove old snapshots
and log files that are stored in the data directory. To configure automatic purging of the old
files, you can use the autopurge.snapRetainCount and autopurge.purgeInterval
parameters.
• Ensure that the value of the maxClientCnxns configuration parameter is high enough to
avoid the loss of connections.
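The advice to run 3 or 5 hosts follows from majority quorums: an ensemble stays available only while more than half of its servers are running. The arithmetic can be sketched as follows; the function names are hypothetical and chosen for this example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Number of servers that can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 3, 4, 5):
    print("%d servers: quorum %d, tolerates %d failure(s)"
          % (n, quorum_size(n), tolerated_failures(n)))
```

Three servers tolerate one failure and five tolerate two. A fourth server raises the quorum to three without tolerating any more failures than three servers do, which is why odd ensemble sizes are the usual choice.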
ZooKeeper Guidelines/Best Practices (cont.)
• ZooKeeper keeps data in memory and in a persistent store. The amount of data that
InfoSphere Streams stores in ZooKeeper depends on the application runtime size. A typical
amount is three times the application description language (ADL) file size.
• The default Java™ heap size for ZooKeeper is the JVM default for the system. If the
maximum heap size is not sufficient for the ZooKeeper runtime system and data in memory,
increase the size by using the JVMFLAGS environment variable.
• Tune JVM GC flags to avoid long garbage collection pauses (Parallel/CMS/Incremental GC)
• To avoid disk swapping, ensure that the Java heap size is less than the unused physical
memory.
• The ZooKeeper Administrator’s Guide recommends having a dedicated disk for the
dataLogDir directory that is separate from the dataDir directory. Set the dataLogDir
parameter in the ZooKeeper-installation-directory/conf/zoo.cfg file.
• Periodically backing up the ZooKeeper data and data log directory is a good practice.
Recovering from backups might be necessary in case a catastrophic failure, such as a
corrupted disk, occurs.
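The zoo.cfg parameters named in these guidelines can be gathered in one place and rendered as name=value lines. The sketch below is illustrative only: `render_zoo_cfg` is a hypothetical helper, every value is a placeholder to be tuned per site, and both directory paths are invented for the example. It checks only that dataLogDir differs from dataDir; it cannot verify that the two sit on separate devices.

```python
def render_zoo_cfg(settings):
    """Render a dict of ZooKeeper settings as zoo.cfg 'name=value' lines."""
    if "dataLogDir" in settings and settings["dataLogDir"] == settings.get("dataDir"):
        raise ValueError("dataLogDir should differ from dataDir")
    return "\n".join("%s=%s" % (name, value) for name, value in settings.items())


# Placeholder values; the directory paths are hypothetical.
example = {
    "tickTime": 2000,
    "clientPort": 2181,
    "dataDir": "/var/lib/zookeeper/data",
    "dataLogDir": "/zk-txlog/log",
    "autopurge.snapRetainCount": 3,   # number of snapshots to keep
    "autopurge.purgeInterval": 24,    # hours between purge runs
    "maxClientCnxns": 200,
}
print(render_zoo_cfg(example))
```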
ZooKeeper Guidelines/Best Practices (cont.)
• If a ZooKeeper follower throws an exception stating that it has failed to follow the leader, the cause may be:
  – Network issues
  – Disk I/O contention
  – A ZooKeeper snapshot that is too large
• This can be resolved by:
  – Monitoring the network
  – Reducing I/O contention
  – Increasing initLimit and syncLimit on all ZooKeeper servers and restarting them
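initLimit and syncLimit are counted in ticks, so their effect in wall-clock time depends on tickTime (in milliseconds). A short sketch of the conversion follows; the function name and the sample values are assumptions for illustration, not recommended settings.

```python
def limit_in_seconds(limit_ticks, tick_time_ms):
    """Convert a limit expressed in ticks to seconds."""
    return limit_ticks * tick_time_ms / 1000.0


tick_time_ms = 2000   # sample tickTime
init_limit = 10       # sample initLimit, in ticks
sync_limit = 5        # sample syncLimit, in ticks

print("initial sync allowance: %.1f s" % limit_in_seconds(init_limit, tick_time_ms))
print("follower lag allowance: %.1f s" % limit_in_seconds(sync_limit, tick_time_ms))
```

With a large snapshot, a follower needs longer to load and synchronize at startup, which is the case where raising initLimit helps; syncLimit bounds how far a follower may fall behind afterwards.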
Questions?
