Trends in Supporting Production
Apache HBase Clusters
Jonathan Hsieh | @jmhsieh | Software Engineer at Cloudera /
HBase PMC Member
Kevin O’Dell | kevin.odell@cloudera | Systems Engineer at Cloudera
June 26, 2013
Who are we?
Jonathan Hsieh
• Cloudera:
• Software Engineer
• Apache HBase committer /
PMC
• Apache Flume founder
Kevin O’Dell
• Cloudera:
• Systems Engineer
• Apache HBase contributor
• Cloudera HBase Support Lead
2 6/26/13 Hadoop Summit / O'Dell, Hsieh
What is Apache HBase?
Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access.
[Diagram: ZK, HDFS, App, MR]
HBase Architecture
• HBase is designed to be fault tolerant
and highly available
• It depends on other systems being fault tolerant and highly available as well.
• Replication for fault tolerance
• Serve regions from any Region server
• Failover HMasters
• ZK Quorums
• HDFS Block replication on Data Nodes
From the trenches at Cloudera Customer Operations
Trends Supporting HBase
Customers in 2011-12 vs in 2012-13
0.90.x / CDH3 era
• Red Hat 5.x
• Java jvm 1.6.13
• 4-8 disk machines
• 24-48 GB RAM
• Dual 4-core HT
• CDH3
• Apache HBase 0.90
• Apache Hadoop 0.20.x
0.92.x/0.94.x / CDH4 era
• Red Hat 6.x
• Java jvm 1.6.31
• 12-15 disk machines
• 48-96 GB RAM
• Dual 6-core HT
• CDH4
• Apache HBase 0.92/0.94
• Apache Hadoop 2.0
Support Incidents 6/2011-6/2012
• Patched Bug
• Patch delivered, or
• Fixed in next version
• Operational Workaround
• Misconfiguration
• Schema design / tuning
• hbck used to fix
• Network/HW/OS
• Problems with underlying
systems.
6/11-6/12 - CDH3 / 0.90.x HBase Support Tickets
• Patched: 12%
• Workaround (hbck): 28%
• Workaround (config): 44%
• Net/HW/OS: 16%
Comparing 6/11-6/12 to 6/12-6/13
6/11-6/12 - CDH3 / 0.90.x HBase Support Tickets
• Patched: 12%
• Workaround (hbck): 28%
• Workaround (config): 44%
• Net/HW/OS: 16%
6/12-6/13 - CDH3+CDH4 HBase Support Tickets
• Patched: 14%
• Workaround (config/hbck): 36% (merged config/hbck; much smaller!)
• Net/HW/OS: 42% (this is bigger!)
• Documentation: 8% (new category)
Comparing 2011 to 2012
• The majority of customers upgraded to CDH4.
• More customers, but a similar volume of support incidents
• Shrank CDH3’s largest trouble spots significantly.
• Larger number of issues
due to underlying systems.
• This is actually a good thing!
HBase Operations Challenges
Operations pain points from 6/12 – 6/13
• Hardware (Net/OS/HW)
• Upgrade (0.90 -> 0.92)
• HBase configuration
Hardware / Network / Operating System
• Leap second
• Transparent huge pages
• Bad 10 Gb Ethernet firmware
Cloudera Manager (CM) system host checker
Upgrade Issues
• Old .edits (HBASE-6440)
• 0-length HLogs (HBASE-6443)
• Bad region refs (HBASE-7199)
• Invalid HFile (Heisenbug)
Upgrade Assistance
• Parcels
• simplified distribution
• flexibility of install location
• side by side installs for rolling upgrades
• Rolling upgrades via CM
• hot fixes
• minor version upgrades
• Automated tests for upgrades and compatibility
Configuration / Feature
• Continuous Bulk Load
• Avoid, and use Puts instead
• Region tuning
• Updated defaults + CM
• GC tuning
• Updated defaults + CM
• Balancer
• Manual / custom tools
• Bad Schema
• Trial and Error
CM helps
• Sanity checks on configurations
• Wizard based installation and setup
• Wizard based rolling upgrades (minor versions)
• Wizard based backup and disaster recovery strategies
Configuration Management
Support improvement wishlist
• Improved “Ergonomics”
• Better default configuration and guard rails
• “I’m sorry Dave, I can’t let you do that”
• Improved error messaging
• Suggest likely root causes in logs
• Improve log signal-to-noise ratio
• Improved ops tooling and frameworks for app development
Good news
• All bug fixes go into the Apache versions before CDH
• HBase is maturing
• Higher percentage of incidents caused by underlying OS/HW/NW
• More performance and tuning oriented questions
• Similar percentage of incidents caused by bugs
• We’re getting better
• Lower percentage of incidents managed with workarounds
• More tools in place to help operational support
• hbck, CM, defaults
• We can still do better!
Getting rid of workarounds
Trends Developing HBase
Developer Community
• Vibrant, Highly
Active community!
• We’re Growing!
Upstream Development Improvements for 0.95+
• Improving Usability
• Improving Reliability
• Improving Predictability
Improving Usability
Metrics and Frameworks
Usability Concerns
• Administering HBase has been too hard.
• Difficult to see what is happening in HBase
• Easy to make bad design decisions early without realizing it
• New Developments
• Metrics Revamp
• HTrace
• Frameworks for Schema design
Metrics Options
• Cloudera Manager
• OpenTSDB
• Ganglia
Ganglia image from: http://www.flickr.com/photos/hongiiv/
HTrace
• Problem:
• Where is time being spent inside HBase?
• Solution: HTrace Framework
• Inspired by Google Dapper
• Threaded through HBase and HDFS
• Tracks time spent in calls in a distributed system by tracking spans*
on different machines.
*Some assembly still required.
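To make the span idea concrete, here is a minimal sketch of span tracking. It is not the HTrace API: the `Span` and `Tracer` classes, their fields, and the way the trace and parent IDs are handed from one "machine" to another are all hypothetical, chosen only to illustrate how timed spans on different hosts can be stitched into one trace.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One timed unit of work (hypothetical structure, not HTrace's)."""
    trace_id: str             # shared by every span in one request
    span_id: str              # unique to this span
    parent_id: Optional[str]  # span that caused this one, if any
    description: str
    host: str
    start: float = 0.0
    stop: float = 0.0

    @property
    def duration(self) -> float:
        return self.stop - self.start


class Tracer:
    """Collects the spans recorded on a single host."""

    def __init__(self, host: str):
        self.host = host
        self.spans: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, description: str,
             trace_id: Optional[str] = None,
             parent_id: Optional[str] = None):
        # Inherit IDs from the enclosing local span unless the caller
        # passes IDs that arrived with a remote call.
        local_parent = self._stack[-1] if self._stack else None
        s = Span(
            trace_id=trace_id or (local_parent.trace_id if local_parent else uuid.uuid4().hex),
            span_id=uuid.uuid4().hex,
            parent_id=parent_id or (local_parent.span_id if local_parent else None),
            description=description,
            host=self.host,
            start=time.perf_counter(),
        )
        self._stack.append(s)
        try:
            yield s
        finally:
            s.stop = time.perf_counter()
            self._stack.pop()
            self.spans.append(s)
```

In this sketch a client would open a root span, send `root.trace_id` and `root.span_id` along with its request, and the server would open a child span with those IDs. Sorting all collected spans by `trace_id` and following `parent_id` then shows where the time went.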
HBase Schemas
• HBase Application developers must iterate to find a suitable HBase
schema
• Schema critical for Performance at Scale
• How can we make this easier?
• How can we reduce the expertise required to do this?
• Today:
• Lots of tuning knobs
• Developers need to understand Column Families, Rowkey design, Data
encoding, …
• Some are expensive to change after the fact
Row key design techniques
• Numeric Keys and lexicographic sort
• Store numbers big-endian.
• Pad ASCII numbers with 0’s.
• Use reversal to have most significant traits first.
• Reverse URL.
• Reverse timestamp to get most recent first.
• (MAX_LONG - ts) so “time” gets monotonically smaller.
• Use composite keys to make keys distribute nicely and work well with sub-scans
• Ex: User-ReverseTimeStamp
• Do not use current timestamp as first part of row key!
Zero-padding example (lexicographic sort order):
Row100, Row3, Row31
vs.
Row003, Row031, Row100
Reverse URL example:
blog.cloudera.com, hbase.apache.org, strataconf.com
vs.
com.cloudera.blog, com.strataconf, org.apache.hbase
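The row key techniques above can be sketched in a few lines. This is an illustrative sketch, not HBase client code: the function names, the NUL byte used as a field separator, and the fixed 8-byte encoding are assumptions made for the example.

```python
import struct

MAX_LONG = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def padded_key(prefix: str, n: int, width: int = 3) -> bytes:
    """Zero-pad an ASCII number so byte order matches numeric order."""
    return f"{prefix}{n:0{width}d}".encode("ascii")


def big_endian_key(n: int) -> bytes:
    """Encode a non-negative integer as 8 big-endian bytes."""
    return struct.pack(">Q", n)


def reverse_domain(host: str) -> str:
    """blog.cloudera.com -> com.cloudera.blog"""
    return ".".join(reversed(host.split(".")))


def reverse_timestamp(ts_millis: int) -> int:
    """Newer timestamps map to smaller values, so they sort first."""
    return MAX_LONG - ts_millis


def user_event_key(user: str, ts_millis: int) -> bytes:
    """Composite key: user, separator (assumed NUL), reversed timestamp."""
    return user.encode("utf-8") + b"\x00" + struct.pack(">q", reverse_timestamp(ts_millis))
```

Leading with the user keeps writes spread across users while a prefix scan on one user returns that user's most recent events first. Leading with the raw current timestamp would instead send every new write to the same end of the key space.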
MTTR
Improving Reliability
Reliable / Highly Available
• Reliable:
• Ability to recover service if a
component fails, without losing data.
• Highly Available:
• Ability to quickly recover service if a
component fails, without losing data.
• Goal: Minimize downtime!
Mean Time To Recovery (MTTR)
• Average time taken to automatically recover from a failure.
• Detection time
• Repair Time
• Notification Time
• Measure: HTrace (Dapper) Infrastructure (0.96+)
[Timeline: Detect, then Repair, then Notify]
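As a worked example of the definition above, MTTR can be computed as the average of detection, repair, and notification time over a set of recoveries. The `Recovery` record and its field names are hypothetical, and any numbers fed to it are made-up inputs rather than measurements from the talk.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass(frozen=True)
class Recovery:
    """Phase durations, in seconds, for one automatic recovery (hypothetical record)."""
    detect_s: float  # failure occurs -> failure is detected
    repair_s: float  # failure detected -> service restored
    notify_s: float  # service restored -> clients learn of the recovery

    @property
    def total_s(self) -> float:
        return self.detect_s + self.repair_s + self.notify_s


def mttr(recoveries: Iterable[Recovery]) -> float:
    """Mean time to recovery: average of the summed phases."""
    return mean(r.total_s for r in recoveries)
```

Splitting the total into phases shows which one dominates, which is the phase worth shortening first.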
Reduce Detection Time
• Proactive notification of HMaster failure (0.95)
• Proactive notification of RS failure (0.95)
• Fast server failover (Hardware)
Reduce Recovery Time
• Distributed Log Splitting (0.92)
• Distributed Log Replay (0.95)
• Fast Write recovery (0.95)
• Pristine Read recovery (0.96+)
Reduce Notification Time
• Notify client on recovery
• Async Client rewrite (0.96+)
Compactions
Improving Predictability
Reliable / Highly Available / Latency Tolerant
• Reliable:
• Ability to recover service if a component
fails, without losing data.
• Highly Available:
• Ability to quickly recover service if a
component fails, without losing data.
• Latency Tolerant:
• Ability to perform and recover in a
predictable amount of time, without
losing data.
• New Goal: Predictable performance
Common causes of performance variability
• Compaction
• Garbage Collection
• Locality Loss
Compaction
• Compactions optimize read layout by rewriting files
• Reduce the seeks required to read a row
• Improve random read performance
• Age off expired or deleted data
• Assumes uniformly distributed write workload
• But we have new workloads:
• Continuous Bulk load write pattern
• Time-series write pattern
Compactions: Put workload
• Minor compactions
• Optimize a subset of adjacent files
• Major compactions
• Optimize all files
• Choosing:
• Assume: older files should be larger than newer files.
• If “new” files are larger than “older” files: major compaction
• Else, look at newer files and select files for a minor compaction
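The choosing rule above can be modelled as a toy function. This is a sketch of the rule as the slide states it, not HBase's actual selection code: the `ratio` and `min_files` parameters, and the way the tail of newer files is picked for a minor compaction, are assumptions made for illustration.

```python
from typing import List, Tuple


def choose_compaction(
    sizes_oldest_first: List[int],
    ratio: float = 1.2,   # hypothetical tuning knob
    min_files: int = 3,   # hypothetical minimum files per minor compaction
) -> Tuple[str, List[int]]:
    """Return ("major" | "minor" | "none", indices of the files to rewrite)."""
    sizes = sizes_oldest_first
    n = len(sizes)

    # Age/size assumption: each file is at least as large as the next newer one.
    # If a newer file is larger than an older one, rewrite everything.
    if any(newer > older for older, newer in zip(sizes, sizes[1:])):
        return "major", list(range(n))

    # Otherwise skip old files that dwarf everything newer than them
    # and compact the remaining tail of newer files.
    start = 0
    while start < n and sizes[start] > ratio * sum(sizes[start + 1:]):
        start += 1
    selected = list(range(start, n))
    if len(selected) >= min_files:
        return "minor", selected
    return "none", []
```

With a Put workload, flushed files arrive small and grow through compaction, so the first branch rarely fires. A large file arriving late, for example from a bulk load, trips it immediately.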
Compactions: Bulkload workload
• Functionality for loading data en
masse
• Intended for Bootstrapping
HBase tables
• New write workload:
frequently ingest data only via
bulk load
• Problem:
• Breaks age/size assumption!
• Major Compaction Storms!
• Compactions unnecessarily
rewrite large files.
Bulkload: Exploring Compactor
• Explore all compaction possibilities
• Choose minor compactions that reduce the number of files while incurring the least IO.
• “The best bang for the buck”
• Compaction workload is more manageable
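One way to picture the exploring approach is to enumerate every contiguous run of files, discard runs dominated by a single large file, and keep the run that removes the most files for the least rewritten data. The sketch below follows that idea under stated assumptions: the function name, the `min_files`, `max_files`, and `ratio` parameters, and the tie-breaking order are illustrative and not taken from the slides.

```python
from typing import List


def explore_compaction(
    sizes_oldest_first: List[int],
    min_files: int = 3,    # assumed smallest run worth compacting
    max_files: int = 10,   # assumed largest run considered
    ratio: float = 1.2,    # assumed limit on how much one file may dominate
) -> List[int]:
    """Return indices of the contiguous run to compact, or [] if none qualifies."""
    sizes = sizes_oldest_first
    n = len(sizes)
    best: List[int] = []
    best_bytes = 0
    for start in range(n):
        last_end = min(start + max_files, n)
        for end in range(start + min_files, last_end + 1):
            window = sizes[start:end]
            total = sum(window)
            # Reject runs where one file outweighs the rest combined (times ratio):
            # rewriting it would cost a lot of IO for little gain.
            if any(s > ratio * (total - s) for s in window):
                continue
            more_files = len(window) > len(best)
            same_files_less_io = len(window) == len(best) and total < best_bytes
            if more_files or same_files_less_io:
                best = list(range(start, end))
                best_bytes = total
    return best
```

Because large bulk-loaded files fail the dominance check, they are left alone and only the small neighbouring files get merged, which avoids rewriting large files unnecessarily.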
Conclusions
Comparing 6/11-6/12 to 6/12-6/13
Development and tooling efforts continue to reduce
HBase is becoming more robust
Improved testing
Summary by Version
0.90
• HBase Developer Expertise
• True Durability
• Distributed log splitting*
• Metrics
• Recovery in Hours
0.92 / 0.94
• HBase Operational Experience
• Consistency
• Performance
• Distributed log splitting
• CF+Region Granularity Metrics
• Recovery in Minutes
0.95-dev / 0.96
• Distributed Systems Admin Experience
• MTTR
• Protobufs
• Snapshots
• Table locks
• Distributed log splitting
• Distributed log replay†
• Fast Write Recovery†
• CF+Region Granularity Metrics
• Improved failure detection time
• Recovery in Seconds (for writes)
0.98 / trunk
• (Predictability)
• (File Block Affinity†)
• Distributed log splitting
• Distributed log replay†
• Fast Write Recovery†
• (Pristine Region Read Recovery)
• CF+Region Granularity Metrics
• Improved failure detection time
• (HTrace)
• Recovery in Seconds
† experimental   (in progress)   * backported in CDH
Questions?
@kevinrodell
@jmhsieh

More Related Content

What's hot

Deep Dive - Usage of on premises data gateway for hybrid integration scenarios
Deep Dive - Usage of on premises data gateway for hybrid integration scenariosDeep Dive - Usage of on premises data gateway for hybrid integration scenarios
Deep Dive - Usage of on premises data gateway for hybrid integration scenarios
Sajith C P Nair
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
DataWorks Summit/Hadoop Summit
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
Chris Nauroth
 
Multitenancy At Bloomberg - HBase and Oozie
Multitenancy At Bloomberg - HBase and OozieMultitenancy At Bloomberg - HBase and Oozie
Multitenancy At Bloomberg - HBase and Oozie
DataWorks Summit
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Cedric CARBONE
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and Improvements
DataWorks Summit/Hadoop Summit
 
Sharing metadata across the data lake and streams
Sharing metadata across the data lake and streamsSharing metadata across the data lake and streams
Sharing metadata across the data lake and streams
DataWorks Summit
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in Hadoop
Rommel Garcia
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
alanfgates
 
Deep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profitDeep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profit
DataWorks Summit/Hadoop Summit
 
Realtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaseRealtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaselarsgeorge
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Global Business Events
 
Big Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeNBig Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeN
DataWorks Summit
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
Edureka!
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
DataWorks Summit
 
Big data and mstr bridge the elephant
Big data and mstr   bridge the elephantBig data and mstr   bridge the elephant
Big data and mstr bridge the elephant
Kognitio
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks
 
From Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFiFrom Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applications
hadooparchbook
 
Format Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and ParquetFormat Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and Parquet
DataWorks Summit
 

What's hot (20)

Deep Dive - Usage of on premises data gateway for hybrid integration scenarios
Deep Dive - Usage of on premises data gateway for hybrid integration scenariosDeep Dive - Usage of on premises data gateway for hybrid integration scenarios
Deep Dive - Usage of on premises data gateway for hybrid integration scenarios
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 
Multitenancy At Bloomberg - HBase and Oozie
Multitenancy At Bloomberg - HBase and OozieMultitenancy At Bloomberg - HBase and Oozie
Multitenancy At Bloomberg - HBase and Oozie
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
 
Operating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and ImprovementsOperating and Supporting Apache HBase Best Practices and Improvements
Operating and Supporting Apache HBase Best Practices and Improvements
 
Sharing metadata across the data lake and streams
Sharing metadata across the data lake and streamsSharing metadata across the data lake and streams
Sharing metadata across the data lake and streams
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in Hadoop
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
 
Deep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profitDeep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profit
 
Realtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaseRealtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBase
 
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
Justin Sheppard & Ankur Gupta from Sears Holdings Corporation - Single point ...
 
Big Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeNBig Data Simplified - Is all about Ab'strakSHeN
Big Data Simplified - Is all about Ab'strakSHeN
 
Advanced Security In Hadoop Cluster
Advanced Security In Hadoop ClusterAdvanced Security In Hadoop Cluster
Advanced Security In Hadoop Cluster
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
 
Big data and mstr bridge the elephant
Big data and mstr   bridge the elephantBig data and mstr   bridge the elephant
Big data and mstr bridge the elephant
 
Hortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices WorkshopHortonworks Technical Workshop - Operational Best Practices Workshop
Hortonworks Technical Workshop - Operational Best Practices Workshop
 
From Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFiFrom Zero to Data Flow in Hours with Apache NiFi
From Zero to Data Flow in Hours with Apache NiFi
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applications
 
Format Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and ParquetFormat Wars: from VHS and Beta to Avro and Parquet
Format Wars: from VHS and Beta to Avro and Parquet
 

Similar to Trends in Supporting Production Apache HBase Clusters

Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase DeploymentsMulti-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
DataWorks Summit
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
StampedeCon
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataPentaho
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
markgrover
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
Jeff Holoman
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
larsgeorge
 
Chicago Data Summit: Geo-based Content Processing Using HBase
Chicago Data Summit: Geo-based Content Processing Using HBaseChicago Data Summit: Geo-based Content Processing Using HBase
Chicago Data Summit: Geo-based Content Processing Using HBase
Cloudera, Inc.
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
sudhakara st
 
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Dataconomy Media
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
Cloudera, Inc.
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
EMC
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
Seeling Cheung
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
Hitachi Vantara
 
Analysis of historical movie data by BHADRA
Analysis of historical movie data by BHADRAAnalysis of historical movie data by BHADRA
Analysis of historical movie data by BHADRA
Bhadra Gowdra
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
Hortonworks
 
Hadoop cluster configuration
Hadoop cluster configurationHadoop cluster configuration
Hadoop cluster configurationprabakaranbrick
 
Hbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBaseHbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBase
phanleson
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015
Apekshit Sharma
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 

Similar to Trends in Supporting Production Apache HBase Clusters (20)

Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase DeploymentsMulti-tenant, Multi-cluster and Multi-container Apache HBase Deployments
Multi-tenant, Multi-cluster and Multi-container Apache HBase Deployments
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in HadoopBackup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Chicago Data Summit: Geo-based Content Processing Using HBase
Chicago Data Summit: Geo-based Content Processing Using HBaseChicago Data Summit: Geo-based Content Processing Using HBase
Chicago Data Summit: Geo-based Content Processing Using HBase
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
Introduction to Kudu: Hadoop Storage for Fast Analytics on Fast Data - Rüdige...
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
 
Analysis of historical movie data by BHADRA
Analysis of historical movie data by BHADRAAnalysis of historical movie data by BHADRA
Analysis of historical movie data by BHADRA
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
 
Hadoop cluster configuration
Hadoop cluster configurationHadoop cluster configuration
Hadoop cluster configuration
 
Hbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBaseHbase in action - Chapter 09: Deploying HBase
Hbase in action - Chapter 09: Deploying HBase
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 

Trends in Supporting Production Apache HBase Clusters

  • 1. Trends in Supporting Production Apache HBase Clusters. Jonathan Hsieh | @jmhsieh | Software Engineer at Cloudera / HBase PMC Member. Kevin O’Dell | kevin.odell@cloudera | Systems Engineer at Cloudera. June 26, 2013
  • 2. Who are we? Jonathan Hsieh • Cloudera: Software Engineer • Apache HBase committer / PMC • Apache Flume founder. Kevin O’Dell • Cloudera: Systems Engineer • Apache HBase contributor • Cloudera HBase Support Lead. 6/26/13 Hadoop Summit / O'Dell, Hsieh
  • 3. What is Apache HBase? Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access. [Diagram: ZK, HDFS, App, MR]
  • 4. HBase Architecture [Diagram: ZK, HDFS, App, MR] • HBase is designed to be fault tolerant and highly available • It depends on other systems being fault tolerant and highly available as well • Replication for fault tolerance • Serve regions from any region server • Failover HMasters • ZK quorums • HDFS block replication on DataNodes
  • 5. Trends Supporting HBase: From the trenches at Cloudera Customer Operations
  • 6. Customers in 2011-12 vs. in 2012-13. 0.90.x / CDH3 era: • Red Hat 5.x • Java JVM 1.6.13 • 4-8 disk machines • 24-48 GB RAM • Dual 4-core HT • CDH3 • Apache HBase 0.90 • Apache Hadoop 0.20.x. 0.92.x/0.94.x / CDH4 era: • Red Hat 6.x • Java JVM 1.6.31 • 12-15 disk machines • 48-96 GB RAM • Dual 6-core HT • CDH4 • Apache HBase 0.92/0.94 • Apache Hadoop 2.0
  • 7. Support Incidents 6/2011-6/2012 • Patched bug: patch delivered, or fixed in next version • Operational workaround: misconfiguration; schema design / tuning; hbck used to fix • Network/HW/OS: problems with underlying systems. [Pie chart, 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Patched 12%, Workaround (hbck) 28%, Workaround (config) 44%, Net/HW/OS 16%]
  • 8. Comparing 6/11-6/12 to 6/12-6/13. [Pie chart, 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Patched 12%, Workaround (hbck) 28%, Workaround (config) 44%, Net/HW/OS 16%] [Pie chart, 6/12-6/13, CDH3+CDH4 HBase support tickets: Patched 14%, Workaround (config/hbck) 36%, Net/HW/OS 42%, Documentation 8%] Chart annotations: “Much smaller!”, “Merged config/hbck”, “New category”, “This is bigger!”
  • 9. Comparing 2011 to 2012 • The majority of customers upgraded to CDH4. • More customers, but a similar volume of support incidents • Shrank CDH3’s largest trouble spots significantly. • Larger share of issues due to underlying systems. • This is actually a good thing!
  • 11. Operations pain points from 6/12 to 6/13 • Hardware (Net/OS/HW) • Upgrade (0.90 -> 0.92) • HBase configuration
  • 12. Hardware / Network / Operating System • Leap second • Transparent huge pages • Bad 10 Gb Ethernet firmware
  • 13. Cloudera Manager (CM) system host checker
  • 14. Upgrade Issues • Old .edits (HBASE-6440) • 0-length HLogs (HBASE-6443) • Bad region refs (HBASE-7199) • Invalid HFile (Heisenbug)
  • 15. Upgrade Assistance • Parcels: simplified distribution; flexibility of install location; side-by-side installs for rolling upgrades • Rolling upgrades via CM: hot fixes; minor version upgrades • Automated tests for upgrades and compatibility
  • 16. Configuration / Feature (issue: current remedy) • Continuous bulk load: avoid it and use Puts • Region tuning: updated defaults + CM • GC tuning: updated defaults + CM • Balancer: manual / custom tools • Bad schema: trial and error
  • 17. CM helps • Sanity checks on configurations • Wizard-based installation and setup • Wizard-based rolling upgrades (minor versions) • Wizard-based backup and disaster recovery strategies
  • 18. Configuration Management
  • 19. Support improvement wishlist • Improved “ergonomics”: better default configuration and guard rails (“I’m sorry Dave, I can’t let you do that”) • Improved error messaging: suggest likely root causes in logs; improve log signal-to-noise ratio • Better ops tooling and frameworks for app development
  • 20. Good news • All bug fixes go into the Apache versions before CDH • HBase is maturing: higher percentage of incidents caused by the underlying OS/HW/network; more performance- and tuning-oriented questions; similar percentage of incidents caused by bugs • We’re getting better: lower percentage of incidents managed with workarounds; more tools in place to help operational support (hbck, CM, defaults) • We can still do better!
  • 21. Trends Developing HBase: Getting rid of workarounds
  • 22. Developer Community • Vibrant, highly active community! • We’re growing!
  • 23. Upstream Development Improvements for 0.95+ • Improving usability • Improving reliability • Improving predictability
  • 25. Usability Concerns • Administering HBase has been too hard. • Difficult to see what is happening in HBase • Easy to make bad design decisions early without realizing it • New developments: metrics revamp; HTrace; frameworks for schema design
  • 26. Metrics Options: Cloudera Manager, OpenTSDB, Ganglia. (Ganglia image from: http://www.flickr.com/photos/hongiiv/)
  • 27. HTrace • Problem: where is time being spent inside HBase? • Solution: HTrace framework • Inspired by Google Dapper • Threaded through HBase and HDFS • Tracks time spent in calls in a distributed system by tracking spans* on different machines. *Some assembly still required.
  • 28. HBase Schemas • HBase application developers must iterate to find a suitable HBase schema • Schema is critical for performance at scale • How can we make this easier? • How can we reduce the expertise required to do this? • Today: lots of tuning knobs; developers need to understand column families, row key design, data encoding, …; some choices are expensive to change after the fact
  • 29. Row key design techniques • Numeric keys and lexicographic sort: store numbers big-endian; pad ASCII numbers with 0’s (Row100, Row3, Row31 vs. Row003, Row031, Row100) • Use reversal to have the most significant traits first: reverse URLs (blog.cloudera.com, hbase.apache.org, strataconf.com vs. com.cloudera.blog, com.strataconf, org.apache.hbase); reverse timestamps to get the most recent first, using (MAX_LONG - ts) so “time” gets monotonically smaller • Use composite keys to make keys distribute nicely and work well with sub-scans (ex: User-ReverseTimeStamp) • Do not use the current timestamp as the first part of the row key!
  • 32. Reliable / Highly Available • Reliable: ability to recover service if a component fails, without losing data. • Highly available: ability to quickly recover service if a component fails, without losing data. • Goal: minimize downtime!
  • 33. Mean Time To Recovery (MTTR) • Average time taken to automatically recover from a failure: detection time, repair time, notification time • Measure: HTrace (Dapper) infrastructure (0.96+) [Timeline: Detect, Repair, Notify]
  • 34. Reduce Detection Time • Proactive notification of HMaster failure (0.95) • Proactive notification of RS failure (0.95) • Fast server failover (hardware)
  • 36. Reduce Recovery Time • Distributed log splitting (0.92) • Distributed log replay (0.95) • Fast write recovery (0.95) • Pristine read recovery (0.96+)
  • 38. Reduce Notification Time • Notify client on recovery • Async client rewrite (0.96+)
  • 42. Reliable / Highly Available / Latency Tolerant • Reliable: ability to recover service if a component fails, without losing data. • Highly available: ability to quickly recover service if a component fails, without losing data. • Latency tolerant: ability to perform and recover in a predictable amount of time, without losing data. • New goal: predictable performance
  • 43. Common causes of performance variability • Compaction • Garbage collection • Locality loss
  • 44. Compaction • Compactions optimize read layout by rewriting files: reduce the seeks required to read a row; improve random read performance; age off expired or deleted data • Assumes a uniformly distributed write workload • But we have new workloads: continuous bulk load write pattern; time-series write pattern
  • 45. Compactions: Put workload • Minor compactions: optimize a subset of adjacent files • Major compactions: optimize all files • Choosing: assume older files should be larger than newer files. If “new” files are “larger” than “older” files, run a major compaction; else, look at newer files and select files for a minor compaction. [Diagram: newly flushed HFiles merged by minor and major compactions]
  • 46. Compactions: Bulkload workload • Functionality for loading data en masse • Intended for bootstrapping HBase tables • New write workload: frequently ingest data only via bulk load • Problem: breaks the age/size assumption! Major compaction storms! Compactions unnecessarily rewrite large files. [Diagram: newly bulk loaded HFiles and newly flushed HFiles; major compactions]
  • 47. Bulkload: Exploring Compactor • Explore all compaction possibilities • Choose minor compactions that reduce the number of files while incurring the least I/O • “The best bang for the buck” • Compaction workload is more manageable • [Diagram: newly bulk loaded HFiles and newly flushed HFiles; explore, then minor compactions]
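The idea of exploring all compaction possibilities can be sketched as below. This is a simplified illustration under stated assumptions, not the HBase implementation: the function name `explore_selection`, the rule that no file in a candidate window may exceed `ratio` times the combined size of the other files in that window, and the tie-break (most files first, then least total size) are all choices made for this example.

```python
def explore_selection(sizes, ratio=1.2, min_files=3, max_files=10):
    """Try every contiguous window of files and keep the best candidate.

    sizes: file sizes ordered oldest -> newest (hypothetical units).
    Returns (start, end) with end exclusive, or None if no window qualifies.
    """
    n = len(sizes)
    best = None          # (start, end) of the best window so far
    best_count = 0       # number of files in the best window
    best_total = 0       # bytes that the best window would rewrite

    for start in range(n):
        last_end = min(n, start + max_files)
        for end in range(start + min_files, last_end + 1):
            window = sizes[start:end]
            total = sum(window)
            # Reject windows dominated by one large file: rewriting it
            # costs a lot of I/O for little reduction in file count.
            if any(size > ratio * (total - size) for size in window):
                continue
            count = end - start
            if (best is None
                    or count > best_count
                    or (count == best_count and total < best_total)):
                best, best_count, best_total = (start, end), count, total
    return best


# Large bulk loaded files at both ends, small flushed files in between.
print(explore_selection([5000, 10, 10, 10, 1000]))
```

For this input only the three small files in the middle qualify, so neither the 5000-unit nor the 1000-unit file is rewritten. That is the sense in which the compaction reduces the file count while incurring little I/O.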
  • 49. Comparing 6/11-6/12 to 6/12-6/13 • 6/11-6/12, CDH3 / 0.90.x HBase support tickets: Patched 12%; Workaround (hbck) 28%; Workaround (config) 44%; Net/HW/OS 16% • 6/12-6/13, CDH3+CDH4 HBase support tickets: Patched 14%; Workaround (config/hbck) 36%; Net/HW/OS 42%; Documentation 8% • Development and tooling efforts continue to reduce • HBase is becoming more robust • Improved testing
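The year-over-year comparison can be made explicit with a little arithmetic. The percentages below are copied from the two pie charts; the dictionary keys and the helper name `workaround_share` are illustrative only.

```python
# Share of support tickets by resolution category, in percent.
tickets_2011_12 = {
    "patched": 12,
    "workaround_hbck": 28,
    "workaround_config": 44,
    "net_hw_os": 16,
}
tickets_2012_13 = {
    "patched": 14,
    "workaround_config_hbck": 36,
    "net_hw_os": 42,
    "documentation": 8,
}


def workaround_share(tickets):
    """Combined share of tickets resolved by an operational workaround."""
    return sum(pct for name, pct in tickets.items()
               if name.startswith("workaround"))


before = workaround_share(tickets_2011_12)
after = workaround_share(tickets_2012_13)
print(f"Workaround share: {before}% -> {after}%")
print(f"Net/HW/OS share: {tickets_2011_12['net_hw_os']}% -> "
      f"{tickets_2012_13['net_hw_os']}%")
```

These are shares of each year's tickets, not absolute counts, so the figures show a shift in the mix of incidents and say nothing on their own about total ticket volume.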
  • 50. Summary by Version • 0.90 | 0.92/0.94 | 0.95-dev/0.96 | 0.98/trunk • HBase Developer Expertise • HBase Operational Experience • Distributed Systems Admin Experience • True Durability • Consistency • Performance • MTTR • Protobufs • Snapshots • Table locks • (Predictability) • (File Block Affinity†) • Distributed log splitting* | Distributed log splitting | Distributed log splitting, Distributed log replay†, Fast Write Recovery† | Distributed log splitting, Distributed log replay†, Fast Write Recovery†, (Pristine Region Read Recovery) • Metrics | CF+Region Granularity Metrics | CF+Region Granularity Metrics, Improved failure detection time | CF+Region Granularity Metrics, Improved failure detection time, (HTrace) • Recovery in Hours | Recovery in Minutes | Recovery in Seconds (for writes) | Recovery in Seconds • † experimental (in progress); * backported in CDH
  • 51. Questions? @kevinrodell @jmhsieh

Editor's Notes

  1. HBase is a project that solves this problem. In a sentence, HBase is an open source, distributed, sorted map modeled after Google’s BigTable. Open-source: Apache HBase is an open source project with an Apache 2.0 license. Distributed: HBase is designed to use multiple machines to store and serve data. Sorted Map: HBase stores data as a map, and guarantees that adjacent keys will be stored next to each other on disk. HBase is modeled after BigTable, a system that is used for hundreds of applications at Google.
  2. This pie chart is the product of analyzing critical production HBase tickets over the past 6 months: misconfig 44%, patch 12%, hw/nw 16%, repair 28%. In the misconfiguration cases, correcting the misconfiguration was all that it took to bring HBase back up again. As you can see, misconfigurations and bugs break the most HBase clusters. Fixing bugs is up to the community. Fixing misconfigurations is up to you and is the focus of the next segment. Because they are hard to diagnose, misconfigurations are not what you want to spend your time on. If your cluster is broken, it’s probably a misconfiguration. This is a hard problem because the error messages are not tightly tied to the root cause.
  3. Old edits – HBASE-6440; 0-Length – HBASE-6443; Bad refs – HBASE-7199 (hbck); Invalid HFile - Cosmic
  4. Hannibal helped a lot with identifying balance issues.