SlideShare a Scribd company logo
1 of 27
Kathleen Ting | Cloudera
Building an Impenetrable
ZooKeeper
Who Am I?
•  Kathleen Ting
– Apache Sqoop Committer, PMC Member
– Author of Apache Sqoop Cookbook
– Customer Operations Engr Mgr, Cloudera
– @kate_ting, kathleen@apache.org
2
What is ZooKeeper?
•  Coordinator of distributed applications
•  Small clusters reliably serve many coordination
needs
•  Canary in the Hadoop coal mine
3
3 ZooKeeper Ensemble
4
ZooKeeper’s Data Model
5
Why is ZooKeeper Important?
•  High Availability
–  Replicate to withstand machine failures
•  Distributed Coordination
–  One consistent framework to rule coordination across all
systems
–  Observe every operation by every client in exactly the same
order
6
Who Uses ZooKeeper?
•  HBase
•  MapReduce (YARN)
•  HDFS (High Availability)
•  Solr
•  Kafka
•  S4
•  Storm
•  Accumulo
•  Numerous custom solutions:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/
poweredby
7
ZooKeeper Coordination Example
8
JVM / Linux
ZooKeeper HDFS
HBase
App MR
Disk/Network
What are Misconfigurations?
•  Any diagnostic ticket requiring a change to ZooKeeper (HBase,
Hadoop..) or to OS config files
•  Comprise 44% of tickets
•  e.g. resource-allocation: memory, file-handles, disk-space
9
Ticket Breakdown by Type
44%
34%
10%
8%
4% Misconfig
Bug
App
JVM/Linux
Disk/NW
10
Ticket Breakdown by Component
11
43%	
  
34%	
  
7%	
  
3%	
  
3%	
   10%	
   ZooKeeper	
  
HBase	
  
Pig	
  
	
  Flume	
  OG	
  
HDFS	
  
System	
  
ZooKeeper Anti-Patterns
•  Connection Mismanagement
•  Time Mismanagement
•  Disk Mismanagement
12
ZooKeeper Anti-Patterns
•  Connection Mismanagement
•  Time Mismanagement
•  Disk Mismanagement
13
1. Too Many Connections
WARN [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too
many connections from /xx.x.xx.xxx - max is 60 !
How can it be resolved?
•  Running out of ZK connections?
–  Set maxClientCnxns=200 (per ZK node) in zoo.cfg!
•  HBase or Hive client leaking connections?
–  Manually close connections
–  Fixed in HBASE-3777, HBASE-4773, HBASE-5466,
HIVE-3723
•  Turn off IPv6 or java.net.preferIPv4Stack=true!
14
2. Connection Closes Prematurely
ERROR:
org.apache.hadoop.hbase.ZooKeeperConnectionException:
HBase is able to connect to ZooKeeper but the
connection closes immediately.!
How can it be resolved?
•  If hbase.cluster.distributed = true in hbase-site,
then in zoo.cfg, quorum can’t be set to localhost
•  In hbase-site, set
hbase.zookeeper.recoverable.waittime=30000ms!
–  Provides enough time for HBase client to try another ZK
server
–  Fixed in HBASE-3065
15
3. Pig Hangs Connecting to HBase
WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for
server null, unexpected error, closing socket connection
and attempting reconnect java.net.ConnectionException:
Connection refused!
What causes this?
•  Location of ZK quorum is not known to Pig (default 127.0.0.1:2181
fails)
How can it be resolved?
•  Use Pig 10, which includes PIG-2115
•  If there is overlap between TaskTrackers and ZK quorum nodes
–  Set hbase.zookeeper.quorum to final in hbase-site.xml
–  Otherwise, add
"hbase.zookeeper.quorum=hadoophbasemaster.lan:
2181" to "pig.properties” (fixed in PIG-2821)
16
ZooKeeper Anti-Patterns
•  Connection Mismanagement
•  Time Mismanagement
•  Disk Mismanagement
17
4. Client Session Timed Out
INFO org.apache.zookeeper.server.ZooKeeperServer:
Expiring session <id>, timeout of 40000ms exceeded!
How can it be resolved?
•  ZK and HBase need same session timeout values:
–  zoo.cfg: maxSessionTimeout=180000!
–  hbase-site.xml: zookeeper.session.timeout=180000!
•  Don’t co-locate ZK with IO-intense DataNode or RegionServer
•  Make sure your session timeout is sufficiently long
•  Specify right amount of heap and tune GC flags
–  Turn on Parallel/CMS/Incremental GC
18
5. Clients Lose Connections
WARN org.apache.zookeeper.ClientCnxn - Session <id>
for server <name>, unexpected error, closing socket
connection and attempting reconnect!
java.io.IOException: Broken pipe!
Don’t use SSD drive for ZK transaction log
•  ZK optimized for mechanical spindles and for sequential IO
•  SSD provides little benefit and suffers from high latency spikes
–  http://storagemojo.com/2012/06/07/the-ssd-write-cliff-in-real-
life/
–  SSD pre-allocates disk extents to avoid directory updates
but that doubles the load on the SSD
–  SSD disk stops for 40 sec (which is greater than session
timeout)
19
ZooKeeper Anti-Patterns
•  Connection Mismanagement
•  Time Mismanagement
•  Disk Mismanagement
20
6. Unable to Load Database –
Unable to Run Quorum Server
FATAL Unable to load database on disk !
java.io.IOException: Failed to process transaction
type: 2 error: KeeperErrorCode = NoNode for <file> at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.
restore(FileTxnSnapLog.java:152)!
How can it be resolved?
•  Archive and wipe /var/zookeeper/version-2 if other two ZK
servers are running
21
7. Unable to Load Database –
Unreasonable Length Exception
FATAL Unable to load database on disk !
java.io.IOException: Unreasonable length = 1048583 !
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInp
utArchive.java:100) !
How can it be resolved?
•  Server allows a client to set data larger than the server can read
from disk
•  If a znode is not readable, increase jute.maxbuffer
–  Look for "Packet len <xx> is out of range" in the client log
–  Increase it by 20%
–  Set in JVMFLAGS="-Djute.maxbuffer=yy" bin/zkCli.sh22
8. Failure to Follow Leader
WARN org.apache.zookeeper.server.quorum.Learner:
Exception when following the leader
java.net.SocketTimeoutException: Read timed out !
What causes this?
•  Disk IO contention, Network Issues
•  ZK snapshot is too large (lots of ZK nodes)
How can it be resolved?
•  Reduce IO contention by putting dataDir on dedicated spindle
•  Increase initLimit on all ZK servers and restart, see
ZOOKEEPER-1521
•  Monitor network (e.g. ifconfig)23
Optimal Ensemble Size
24
# of ZK
Servers
Purpose
1 Coordination
3 Reliability for production environment
5 Permits taking one server down for maintenance
Why not run 11 ZK servers?
Trust But Verify
•  zk-smoketest
–  https://github.com/phunt/zk-smoketest
–  Verify new, updated, & existing installations
–  Identify latency issues
•  zk-top
–  https://github.com/phunt/zktop
–  Unix “top” like utility for ZK
•  4 letter words/JMX (e.g. ruok, srvr)
–  http://zookeeper.apache.org/doc/current/
zookeeperAdmin.html#sc_zkCommands
–  Use "stat" to get an idea what your request latency looks like
25
ZooKeeper Patterns
DOs
•  Separate spindles for dataDir & dataLogDir
–  Avoids competition between logging and snapshots
–  Improves throughput and latency
•  Allocate 3 or 5 ZK servers
•  Tune Garbage Collection
•  Run zkCleanup.sh script via cron
DON’Ts
•  Don’t co-locate ZK with I/O intense DataNode or RegionServer
–  ZK is latency sensitive
•  Don’t use SSD drive for ZK transaction log
26
Configure ZooKeeper
Correctly..
..and it’ll be as impenetrable as a
distributed system allows.
Questions?
27

More Related Content

What's hot

Where Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsWhere Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsConcentric Sky
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance CachingNGINX, Inc.
 
Rails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeRails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeMichael May
 
Nginx: Accelerate Rails, HTTP Tricks
Nginx: Accelerate Rails, HTTP TricksNginx: Accelerate Rails, HTTP Tricks
Nginx: Accelerate Rails, HTTP TricksAdam Wiggins
 
Altitude SF 2017: Debugging Fastly VCL 101
Altitude SF 2017: Debugging Fastly VCL 101Altitude SF 2017: Debugging Fastly VCL 101
Altitude SF 2017: Debugging Fastly VCL 101Fastly
 
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014zshoylev
 
Tips for going fast in a slow world: Michael May at OSCON 2015
Tips for going fast in a slow world: Michael May at OSCON 2015Tips for going fast in a slow world: Michael May at OSCON 2015
Tips for going fast in a slow world: Michael May at OSCON 2015Fastly
 
Care and feeding notes
Care and feeding notesCare and feeding notes
Care and feeding notesPerrin Harkins
 
MetaZeta Clusters Overview
MetaZeta Clusters OverviewMetaZeta Clusters Overview
MetaZeta Clusters OverviewPaul Baclace
 
Building a better web
Building a better webBuilding a better web
Building a better webFastly
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13Dave Gardner
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disquszeeg
 
Use case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in productionUse case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in production知教 本間
 
Drupal Performance : DrupalCamp North
Drupal Performance : DrupalCamp NorthDrupal Performance : DrupalCamp North
Drupal Performance : DrupalCamp NorthPhilip Norton
 
Wordpress optimization
Wordpress optimizationWordpress optimization
Wordpress optimizationAlmog Baku
 
Happy Browser, Happy User! WordSesh 2019
Happy Browser, Happy User! WordSesh 2019Happy Browser, Happy User! WordSesh 2019
Happy Browser, Happy User! WordSesh 2019Katie Sylor-Miller
 
Resources, Providers, and Helpers Oh My!
Resources, Providers, and Helpers Oh My!Resources, Providers, and Helpers Oh My!
Resources, Providers, and Helpers Oh My!Brian Stajkowski
 

What's hot (19)

Where Django Caching Bust at the Seams
Where Django Caching Bust at the SeamsWhere Django Caching Bust at the Seams
Where Django Caching Bust at the Seams
 
NGINX High-performance Caching
NGINX High-performance CachingNGINX High-performance Caching
NGINX High-performance Caching
 
Rails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeRails Caching Secrets from the Edge
Rails Caching Secrets from the Edge
 
Alfresco tuning part1
Alfresco tuning part1Alfresco tuning part1
Alfresco tuning part1
 
Nginx: Accelerate Rails, HTTP Tricks
Nginx: Accelerate Rails, HTTP TricksNginx: Accelerate Rails, HTTP Tricks
Nginx: Accelerate Rails, HTTP Tricks
 
Altitude SF 2017: Debugging Fastly VCL 101
Altitude SF 2017: Debugging Fastly VCL 101Altitude SF 2017: Debugging Fastly VCL 101
Altitude SF 2017: Debugging Fastly VCL 101
 
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014
Taming the Cloud Database with Apache jclouds, ApacheCon Europe 2014
 
Tips for going fast in a slow world: Michael May at OSCON 2015
Tips for going fast in a slow world: Michael May at OSCON 2015Tips for going fast in a slow world: Michael May at OSCON 2015
Tips for going fast in a slow world: Michael May at OSCON 2015
 
Care and feeding notes
Care and feeding notesCare and feeding notes
Care and feeding notes
 
MetaZeta Clusters Overview
MetaZeta Clusters OverviewMetaZeta Clusters Overview
MetaZeta Clusters Overview
 
Building a better web
Building a better webBuilding a better web
Building a better web
 
Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disqus
 
Ehcache 3 @ BruJUG
Ehcache 3 @ BruJUGEhcache 3 @ BruJUG
Ehcache 3 @ BruJUG
 
Use case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in productionUse case for using the ElastiCache for Redis in production
Use case for using the ElastiCache for Redis in production
 
Drupal Performance : DrupalCamp North
Drupal Performance : DrupalCamp NorthDrupal Performance : DrupalCamp North
Drupal Performance : DrupalCamp North
 
Wordpress optimization
Wordpress optimizationWordpress optimization
Wordpress optimization
 
Happy Browser, Happy User! WordSesh 2019
Happy Browser, Happy User! WordSesh 2019Happy Browser, Happy User! WordSesh 2019
Happy Browser, Happy User! WordSesh 2019
 
Resources, Providers, and Helpers Oh My!
Resources, Providers, and Helpers Oh My!Resources, Providers, and Helpers Oh My!
Resources, Providers, and Helpers Oh My!
 

Similar to Building an Impenetrable ZooKeeper - Kathleen Ting

HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trencheswchevreuil
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxManish Maheshwari
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaManish Maheshwari
 
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...confluent
 
Real-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaReal-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaJason Shih
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudRose Toomey
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudDatabricks
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutSander Temme
 
Open source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packagesOpen source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packagesRogue Wave Software
 
Capacity Management/Provisioning (Cloud's full, Can't build here)
Capacity Management/Provisioning (Cloud's full, Can't build here)Capacity Management/Provisioning (Cloud's full, Can't build here)
Capacity Management/Provisioning (Cloud's full, Can't build here)andyhky
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataacelyc1112009
 
cache concepts and varnish-cache
cache concepts and varnish-cachecache concepts and varnish-cache
cache concepts and varnish-cacheMarc Cortinas Val
 
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...confluent
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool ManagementBIOVIA
 
Performance_Out.pptx
Performance_Out.pptxPerformance_Out.pptx
Performance_Out.pptxsanjanabal
 

Similar to Building an Impenetrable ZooKeeper - Kathleen Ting (20)

HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trenches
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptx
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling Impala
 
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
 
Real-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using ImpalaReal-time Big Data Analytics Engine using Impala
Real-time Big Data Analytics Engine using Impala
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
Open source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packagesOpen source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packages
 
Capacity Management/Provisioning (Cloud's full, Can't build here)
Capacity Management/Provisioning (Cloud's full, Can't build here)Capacity Management/Provisioning (Cloud's full, Can't build here)
Capacity Management/Provisioning (Cloud's full, Can't build here)
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsData
 
cache concepts and varnish-cache
cache concepts and varnish-cachecache concepts and varnish-cache
cache concepts and varnish-cache
 
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...
From Newbie to Highly Available, a Successful Kafka Adoption Tale (Jonathan S...
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management
 
Performance_Out.pptx
Performance_Out.pptxPerformance_Out.pptx
Performance_Out.pptx
 
2 7
2 72 7
2 7
 

Recently uploaded

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Building an Impenetrable ZooKeeper - Kathleen Ting

  • 1. Kathleen Ting | Cloudera Building an Impenetrable ZooKeeper
  • 2. Who Am I? •  Kathleen Ting – Apache Sqoop Committer, PMC Member – Author of Apache Sqoop Cookbook – Customer Operations Engr Mgr, Cloudera – @kate_ting, kathleen@apache.org 2
  • 3. What is ZooKeeper? •  Coordinator of distributed applications •  Small clusters reliably serve many coordination needs •  Canary in the Hadoop coal mine 3
  • 6. Why is ZooKeeper Important? •  High Availability –  Replicate to withstand machine failures •  Distributed Coordination –  One consistent framework to rule coordination across all systems –  Observe every operation by every client in exactly the same order 6
  • 7. Who Uses ZooKeeper? •  HBase •  MapReduce (YARN) •  HDFS (High Availability) •  Solr •  Kafka •  S4 •  Storm •  Accumulo •  Numerous custom solutions: https://cwiki.apache.org/confluence/display/ZOOKEEPER/ poweredby 7
  • 8. ZooKeeper Coordination Example 8 JVM / Linux ZooKeeper HDFS HBase App MR Disk/Network
  • 9. What are Misconfigurations? •  Any diagnostic ticket requiring a change to ZooKeeper (HBase, Hadoop..) or to OS config files •  Comprise 44% of tickets •  e.g. resource-allocation: memory, file-handles, disk-space 9
  • 10. Ticket Breakdown by Type 44% 34% 10% 8% 4% Misconfig Bug App JVM/Linux Disk/NW 10
  • 11. Ticket Breakdown by Component 11 43%   34%   7%   3%   3%   10%   ZooKeeper   HBase   Pig    Flume  OG   HDFS   System  
  • 12. ZooKeeper Anti-Patterns •  Connection Mismanagement •  Time Mismanagement •  Disk Mismanagement 12
  • 13. ZooKeeper Anti-Patterns •  Connection Mismanagement •  Time Mismanagement •  Disk Mismanagement 13
  • 14. 1. Too Many Connections WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /xx.x.xx.xxx - max is 60 ! How can it be resolved? •  Running out of ZK connections? –  Set maxClientCnxns=200 (per ZK node) in zoo.cfg! •  HBase or Hive client leaking connections? –  Manually close connections –  Fixed in HBASE-3777, HBASE-4773, HBASE-5466, HIVE-3723 •  Turn off IPv6 or java.net.preferIPv4Stack=true! 14
  • 15. 2. Connection Closes Prematurely ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately.! How can it be resolved? •  If hbase.cluster.distributed = true in hbase-site, then in zoo.cfg, quorum can’t be set to localhost •  In hbase-site, set hbase.zookeeper.recoverable.waittime=30000ms! –  Provides enough time for HBase client to try another ZK server –  Fixed in HBASE-3065 15
  • 16. 3. Pig Hangs Connecting to HBase WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectionException: Connection refused! What causes this? •  Location of ZK quorum is not known to Pig (default 127.0.0.1:2181 fails) How can it be resolved? •  Use Pig 10, which includes PIG-2115 •  If there is overlap between TaskTrackers and ZK quorum nodes –  Set hbase.zookeeper.quorum to final in hbase-site.xml –  Otherwise, add "hbase.zookeeper.quorum=hadoophbasemaster.lan: 2181" to "pig.properties” (fixed in PIG-2821) 16
  • 17. ZooKeeper Anti-Patterns •  Connection Mismanagement •  Time Mismanagement •  Disk Mismanagement 17
  • 18. 4. Client Session Timed Out INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session <id>, timeout of 40000ms exceeded! How can it be resolved? •  ZK and HBase need same session timeout values: –  zoo.cfg: maxSessionTimeout=180000! –  hbase-site.xml: zookeeper.session.timeout=180000! •  Don’t co-locate ZK with IO-intense DataNode or RegionServer •  Make sure your session timeout is sufficiently long •  Specify right amount of heap and tune GC flags –  Turn on Parallel/CMS/Incremental GC 18
  • 19. 5. Clients Lose Connections WARN org.apache.zookeeper.ClientCnxn - Session <id> for server <name>, unexpected error, closing socket connection and attempting reconnect! java.io.IOException: Broken pipe! Don’t use SSD drive for ZK transaction log •  ZK optimized for mechanical spindles and for sequential IO •  SSD provides little benefit and suffers from high latency spikes –  http://storagemojo.com/2012/06/07/the-ssd-write-cliff-in-real- life/ –  SSD pre-allocates disk extents to avoid directory updates but that doubles the load on the SSD –  SSD disk stops for 40 sec (which is greater than session timeout) 19
  • 20. ZooKeeper Anti-Patterns •  Connection Mismanagement •  Time Mismanagement •  Disk Mismanagement 20
  • 21. 6. Unable to Load Database – Unable to Run Quorum Server FATAL Unable to load database on disk ! java.io.IOException: Failed to process transaction type: 2 error: KeeperErrorCode = NoNode for <file> at org.apache.zookeeper.server.persistence.FileTxnSnapLog. restore(FileTxnSnapLog.java:152)! How can it be resolved? •  Archive and wipe /var/zookeeper/version-2 if other two ZK servers are running 21
  • 22. 7. Unable to Load Database – Unreasonable Length Exception FATAL Unable to load database on disk ! java.io.IOException: Unreasonable length = 1048583 ! at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInp utArchive.java:100) ! How can it be resolved? •  Server allows a client to set data larger than the server can read from disk •  If a znode is not readable, increase jute.maxbuffer –  Look for "Packet len <xx> is out of range" in the client log –  Increase it by 20% –  Set in JVMFLAGS="-Djute.maxbuffer=yy" bin/zkCli.sh22
  • 23. 8. Failure to Follow Leader WARN org.apache.zookeeper.server.quorum.Learner: Exception when following the leader java.net.SocketTimeoutException: Read timed out ! What causes this? •  Disk IO contention, Network Issues •  ZK snapshot is too large (lots of ZK nodes) How can it be resolved? •  Reduce IO contention by putting dataDir on dedicated spindle •  Increase initLimit on all ZK servers and restart, see ZOOKEEPER-1521 •  Monitor network (e.g. ifconfig)23
  • 24. Optimal Ensemble Size 24 # of ZK Servers Purpose 1 Coordination 3 Reliability for production environment 5 Permits taking one server down for maintenance Why not run 11 ZK servers?
  • 25. Trust But Verify •  zk-smoketest –  https://github.com/phunt/zk-smoketest –  Verify new, updated, & existing installations –  Identify latency issues •  zk-top –  https://github.com/phunt/zktop –  Unix “top” like utility for ZK •  4 letter words/JMX (e.g. ruok, srvr) –  http://zookeeper.apache.org/doc/current/ zookeeperAdmin.html#sc_zkCommands –  Use "stat" to get an idea what your request latency looks like 25
  • 26. ZooKeeper Patterns DOs •  Separate spindles for dataDir & dataLogDir –  Avoids competition between logging and snapshots –  Improves throughput and latency •  Allocate 3 or 5 ZK servers •  Tune Garbage Collection •  Run zkCleanup.sh script via cron DON’Ts •  Don’t co-locate ZK with I/O intense DataNode or RegionServer –  ZK is latency sensitive •  Don’t use SSD drive for ZK transaction log 26
  • 27. Configure ZooKeeper Correctly.. ..and it’ll be as impenetrable as a distributed system allows. Questions? 27