ACHIEVING HBASE MULTI-TENANCY: REGIONSERVER GROUPS AND FAVORED NODES
Francis Liu & Thiruvel Thirumoolan
HBase, Yahoo!
HBase @ Y!
Multi-tenancy
HBase Multi-tenancy @ Y!
•  ~45 Tenants
•  ~940 RegionServers
•  ~300k regions
•  RS peak of 115k requests/sec
RegionServer Groups
•  Group Membership
   •  Table
   •  RegionServer
•  Coarse Isolation
•  Namespace Integration
[Diagram] Group Foo: RegionServers 1–4 (RS1–RS4), each hosting regions of Table1 and Table2. Group Bar: RegionServers 5–8 (RS5–RS8), each hosting regions of Table3 and Table4.
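To make the table/server grouping concrete, here is a minimal sketch using the RSGroupAdminClient Java API from the hbase-rsgroup module; the exact class and method names vary between HBase releases, and the group name, hostnames, ports, and table name are placeholders.

```java
import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.net.Address;
import org.apache.hadoop.hbase.rsgroup.RSGroupAdminClient;

public class GroupFooSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf)) {
      RSGroupAdminClient rsGroupAdmin = new RSGroupAdminClient(conn);
      // Create the group and move a RegionServer into it (host/port are placeholders).
      rsGroupAdmin.addRSGroup("foo");
      rsGroupAdmin.moveServers(
          Collections.singleton(Address.fromParts("rs1.example.com", 16020)), "foo");
      // Move a table into the group; its regions will only be assigned to group members.
      rsGroupAdmin.moveTables(Collections.singleton(TableName.valueOf("Table1")), "foo");
      // Trigger a balance restricted to this group.
      rsGroupAdmin.balanceRSGroup("foo");
    }
  }
}
```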
Divide and Conquer
[Diagram] Groups A through E, each with its own set of RegionServers.
Multi-tenancy with RegionServer Groups
•  ~45 namespaces
•  ~45 RegionServer groups
•  4 to hundreds of servers per group
•  Up to 2000+ regions per server
Architecture
[Diagram] The HMaster hosts the RSGroupAdminEndpoint coprocessor and the RSGroupBasedLoadBalancer, which wraps the stock LoadBalancer and filters assignment plans by group (e.g. foo, bar). The RSGroupInfoManager persists group membership in the RSGroup table and in ZooKeeper.
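In the HBase versions contemporary with this talk, the rsgroup machinery in the diagram is wired in on the Master through two hbase-site.xml properties; the sketch below sets them programmatically only to keep the example self-contained.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RSGroupConfigSketch {
  public static Configuration rsGroupEnabledConf() {
    Configuration conf = HBaseConfiguration.create();
    // Normally set in hbase-site.xml on the Master:
    // the balancer that filters assignment plans by RegionServer group...
    conf.set("hbase.master.loadbalancer.class",
        "org.apache.hadoop.hbase.rsgroup.RSGroupBasedLoadBalancer");
    // ...and the master coprocessor that exposes the rsgroup admin API.
    conf.set("hbase.coprocessor.master.classes",
        "org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint");
    return conf;
  }
}
```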
Group Metric Tag
[Diagram] RegionServer metrics carry a group tag, so they can be aggregated per group (Groups A through E).
Dead RegionServer Thresholds
[Diagram] Dead RegionServer thresholds are tracked per group (RSGroup A, RSGroup B, RSGroup C).
Dead RegionServer Processing
▪ Per Group Queue
[Diagram] The HBase Master learns of dead RegionServers via ZooKeeper and processes them from a per-group queue.
Group Aware Replication
[Diagram] A replication source on a Group A RegionServer ships WALEdits (WalEdit1–WalEdit4 from RS 1–RS 4) only to sink RegionServers in the matching Group A on the peer cluster (RS 5, RS 6).
RSGroups @ Y!
•  Per Group configurations
   •  hbase-site.xml
   •  hbase-env.sh
•  System Group
   •  Isolate system tables
•  Rolling Upgrade/Restart Per Group
•  Different balancing strategies per Group
•  Alerting/Monitoring Per Group
•  Namespace Integration (see the sketch below)
   •  Users run DDL on their own tables in a sandbox
   •  Table and Region Quotas
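As an illustration of the namespace integration and quota bullets above, a namespace can carry its group binding and its table/region quotas as namespace properties. The property keys below (hbase.rsgroup.name, hbase.namespace.quota.maxtables, hbase.namespace.quota.maxregions) are assumptions based on the upstream code and may differ by version; the namespace and group names are placeholders.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TenantNamespaceSketch {
  public static void createTenantNamespace(Connection conn) throws Exception {
    try (Admin admin = conn.getAdmin()) {
      // Namespace "tenant_foo" pinned to rsgroup "foo", with table/region quotas.
      NamespaceDescriptor ns = NamespaceDescriptor.create("tenant_foo")
          .addConfiguration("hbase.rsgroup.name", "foo")             // group binding (assumed key)
          .addConfiguration("hbase.namespace.quota.maxtables", "20") // table quota
          .addConfiguration("hbase.namespace.quota.maxregions", "2000") // region quota
          .build();
      admin.createNamespace(ns);
    }
  }

  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create())) {
      createTenantNamespace(conn);
    }
  }
}
```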
Favored Nodes
Overview
▪ HDFS
›  File level block placement hint (on file creation)
›  Pass a set of preferred hosts to client to replicate data
›  preferred hosts => “Favored Nodes” or hints
▪ HBase
›  Region level block placement hint
›  Select 3 favored nodes for each region - primary, secondary, tertiary
›  Constraint: Favored Nodes on 2 racks (where possible)
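The HDFS-side hint described above is exposed as a create(...) overload on DistributedFileSystem that accepts favored-node addresses. The sketch below is illustrative only (path, DataNode hostnames, ports, and sizes are placeholders); HBase passes these hints internally when writing HFiles.

```java
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class FavoredNodeWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    if (fs instanceof DistributedFileSystem) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // Favored nodes: DataNodes we would like to hold the block replicas (placeholders).
      InetSocketAddress[] favoredNodes = new InetSocketAddress[] {
          new InetSocketAddress("dn1.example.com", 50010),
          new InetSocketAddress("dn2.example.com", 50010),
          new InetSocketAddress("dn3.example.com", 50010)
      };
      // File-level placement hint supplied at create time; the NameNode treats it as best effort.
      try (FSDataOutputStream out = dfs.create(new Path("/tmp/fn-demo"),
          FsPermission.getFileDefault(), true /* overwrite */, 4096 /* bufferSize */,
          (short) 3 /* replication */, 128L * 1024 * 1024 /* blockSize */,
          null /* progress */, favoredNodes)) {
        out.write("hello".getBytes("UTF-8"));
      }
    }
  }
}
```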
Motivation
▪ Data Locality
▪ Performance
▪ Network utilization
▪ Datanode isolation
▪ Previous work from FB and Community
›  HBASE-4755 (HBase based block placement in DFS)
Enabling Favored Nodes
▪ HBase
›  Use Favored node balancer
›  Setup tool for creating FN for existing regions
▪ HDFS
›  Set “dfs.namenode.replication.considerLoad” to false
›  Recommend disabling HDFS balancer
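A hedged sketch of the settings involved: the balancer class shown is the FavoredStochasticBalancer mentioned later in this deck (its package has moved between releases), and dfs.namenode.replication.considerLoad is the HDFS property quoted above. In practice both would live in hbase-site.xml and hdfs-site.xml rather than being set in code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class FavoredNodeConfigSketch {
  public static Configuration favoredNodeConf() {
    Configuration conf = HBaseConfiguration.create();
    // hbase-site.xml: pick a favored-node aware balancer
    // (class/package name assumed; it differs across HBase releases).
    conf.set("hbase.master.loadbalancer.class",
        "org.apache.hadoop.hbase.master.balancer.FavoredStochasticBalancer");
    // hdfs-site.xml on the NameNode: don't let DataNode load override the placement hints.
    conf.set("dfs.namenode.replication.considerLoad", "false");
    return conf;
  }
}
```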
Flow
[Diagram] Favored nodes for each region (e.g. RG1 → RS1, RS2, RS3) are stored in hbase:meta in the info:fn column. The Master keeps them in an FN cache used by the favored balancer and by the Assignment Manager when it calls openRegion on a RegionServer. On flush and compaction, the RegionServer passes the corresponding DataNodes (RG1 → DN1, DN2, DN3) to HDFS as block placement hints.
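To inspect the starting point of this flow, the favored nodes recorded for each region can be read directly from hbase:meta. The sketch below scans the info:fn column shown in the diagram; it deliberately prints only the raw value length rather than assuming a particular serialization format.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DumpFavoredNodes {
  public static void main(String[] args) throws Exception {
    byte[] info = Bytes.toBytes("info");
    byte[] fn = Bytes.toBytes("fn");
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table meta = conn.getTable(TableName.META_TABLE_NAME)) {
      Scan scan = new Scan().addColumn(info, fn);  // only the favored-node column
      try (ResultScanner scanner = meta.getScanner(scan)) {
        for (Result row : scanner) {
          byte[] value = row.getValue(info, fn);
          if (value != null) {
            // The cell holds serialized favored-node data; report its size per region.
            System.out.println(Bytes.toStringBinary(row.getRow())
                + " -> " + value.length + " bytes of FN data");
          }
        }
      }
    }
  }
}
```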
Enhancements - Summary
▪ Umbrella JIRA HBASE-15531 (design doc)
▪ Balancer
›  FavoredStochasticBalancer (HBASE-16942)
›  FavoredGroupBalancer – RSGroup version (HBASE-15533)
›  Splits/Merges inherit FN
▪ Admin APIs/tools
›  redistribute (HBASE-18064)
›  complete_redistribute (HBASE-18065)
›  removeFN (HBASE-18062)
›  checkFN (HBASE-18063)
›  hbck (HBASE-17153)
Favored Node Balancers
▪ FavoredStochasticBalancer
›  Assigns only to FN of a region (user tables)
›  New Candidate Generators (FNLocality and FNLoad)
›  Recommended same cost for load and locality generators
›  Future – Work with Region Replicas
›  Future - WALs
▪ FavoredRSGroupLoadBalancer
›  Uses FavoredStochasticBalancer
›  Recommended minimum 4 nodes per group
›  Generates FN within the group's servers
Region Split and Merge
▪ Splits
›  Each daughter inherits 2 FN from the parent
›  One FN is randomly generated (see the sketch after this list)
›  Locality vs Distribution
›  FN within rsgroup servers (if enabled)
▪ Merge
›  Inherited from one of the parents
›  Preserve locality
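The split rule can be paraphrased as a small helper: each daughter keeps two of the parent's favored nodes and draws one fresh node from the region's group. This is an illustrative sketch of the rule as stated on the slide, not the actual HBase implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SplitFavoredNodesSketch {
  /**
   * Daughter favored nodes: keep two of the parent's three FN and add one
   * randomly chosen server from the region's group (excluding nodes already picked).
   */
  static List<String> daughterFavoredNodes(List<String> parentFn, List<String> groupServers,
      Random rng) {
    List<String> daughter = new ArrayList<>(parentFn.subList(0, 2)); // inherit 2 FN
    List<String> candidates = new ArrayList<>(groupServers);
    candidates.removeAll(daughter);                                   // avoid duplicates
    daughter.add(candidates.get(rng.nextInt(candidates.size())));     // 1 random FN
    return daughter;
  }

  public static void main(String[] args) {
    List<String> parent = List.of("rs1:16020", "rs2:16020", "rs3:16020");
    List<String> group = List.of("rs1:16020", "rs2:16020", "rs3:16020", "rs4:16020", "rs5:16020");
    System.out.println(daughterFavoredNodes(parent, group, new Random()));
  }
}
```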
Distribution
▪ Replica count distribution across favored nodes (FNReplica)
▪ Why is it important?
›  Balancer assigns only to FN
›  RegionServer crashes
›  Uniform load
▪ Sample replica load for a group from production
SN=Rack1_RS1  Primary=695  Secondary=19   Tertiary=11   Total=725
SN=Rack1_RS2  Primary=142  Secondary=398  Tertiary=185  Total=725
SN=Rack2_RS1  Primary=93   Secondary=376  Tertiary=256  Total=725
SN=Rack2_RS1  Primary=36   Secondary=173  Tertiary=514  Total=723
Modifying Distribution
▪ Spread FN across all region servers
▪ redistribute:
›  Balance of FNReplicas
›  Also used when adding new servers
›  Only one FN is changed per region (constraint: the two remaining FN retain >= 80% locality)
›  Current assignment not changed
›  FN moved from overloaded servers to underloaded servers (see the sketch after this list)
▪ complete_redistribute:
›  Round robin generation of FNReplicas
›  Locality is lost and regions reassigned
▪ removeFN - Decommissioning a favored node
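A rough sketch of the redistribute idea described above: swap exactly one favored node of a region from an overloaded server to an underloaded one, leaving the other two FN (and therefore most of the locality) untouched. Types are made up and the 80% locality check is omitted; this is not the real tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RedistributeSketch {
  /**
   * Move one FN replica of a region from an overloaded server to an underloaded one.
   * The region's other two favored nodes and its current assignment are untouched.
   */
  static void redistributeOne(List<String> regionFavoredNodes, String overloaded,
      String underloaded, Map<String, Integer> fnReplicaCount) {
    int idx = regionFavoredNodes.indexOf(overloaded);
    if (idx < 0 || regionFavoredNodes.contains(underloaded)) {
      return; // nothing to move, or target is already a favored node of this region
    }
    regionFavoredNodes.set(idx, underloaded);            // swap exactly one FN
    fnReplicaCount.merge(overloaded, -1, Integer::sum);  // FNReplica bookkeeping
    fnReplicaCount.merge(underloaded, 1, Integer::sum);
  }

  public static void main(String[] args) {
    List<String> fn = new ArrayList<>(List.of("rs1:16020", "rs2:16020", "rs3:16020"));
    Map<String, Integer> counts = new HashMap<>(Map.of("rs1:16020", 725, "rs4:16020", 300));
    redistributeOne(fn, "rs1:16020", "rs4:16020", counts);
    System.out.println(fn + " " + counts);
  }
}
```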
Adding servers - redistribute
[Diagram] RS Group A before and after adding a new server (RS5/DN5): running redistribute spreads favored nodes from the existing servers (RS1–RS4, DN1–DN4) onto the newly added node.
Decommissioning a node - removeFN
[Diagram] RS Group A before and after decommissioning a node (RS5/DN5): removeFN moves the favored nodes held by the decommissioned server onto the remaining servers (RS1–RS4, DN1–DN4).
Motivation (Revisited)
▪ Data Locality
▪ Performance
▪ Network utilization
▪ Datanode isolation
Data Locality - Fault Testing
▪ Locality preserved on chaos monkey tests
[Chart] percentfileslocal over time, with Favored Nodes vs. without Favored Nodes.
Data Locality - Rolling Restart and Balancer
[Chart] percentfileslocal and region count across three phases: ← RS balanced → ← Rolling Restart → ← Favored Balancer →
Datanode Isolation – Tenant specific
▪ diskUsed% changes after FN (2 racks). Tenant #1 – Storage heavy
[Chart] diskused% per DataNode for Tenant #1 and Tenants #2,3, before and after FN was enabled: diskUsed spread across all nodes before, tenant specific after.
Remote DN reads…
▪ Cluster-level remote DataNode reads dropped significantly despite a 2x increase in reads
[Chart] Remote DN read rate, before vs. after Favored Nodes.
HBase - Read Request Rate (Cluster level)
[Chart] Cluster read request rate, before vs. after Favored Nodes.
Network Utilization (Cluster level N/W traffic)
[Chart] Max network traffic on the cluster (x units): Max Input and Max Output plotted from 2016-01-15 to 2017-06-02, with annotations marking scheduled maintenance and the point where Favored Nodes were enabled.
▪ Max Network Traffic
▪ HDFS + User data
▪ 2x User traffic
Monitoring/Operations
▪ hbck checks various factors
›  No FN or incorrect FN
›  Regions with dead FN
›  Out-of-rsgroup favored nodes
›  System tables
▪ Check dead FN (tool, JMX)
▪ Master UI - RIT indicates when all FN dead
Production Experience
▪ Steady increase in data locality (percentfileslocal)
▪ Redistribute runs once a day for all groups
›  FN distribution more or less equally spread across group nodes
›  Adding 10% servers to an rsgroup – equal distribution
▪ FN hints not chosen when DN in decommission
›  DFSClient logs warning when hints not chosen, NN logs too
›  Sometimes DN takes a long time to decommission
›  HDFS rolling upgrades or system updates cause DN downtime
▪ Regions in transition due to FN
›  All FN dead (missed alert)
›  Non-rsgroup servers as FN (bug in code)
Data Locality - Rolling Restart
▪ Region Count varies, but locality is preserved across multiple rolling restarts
[Chart] percentfileslocal and region count: ← Balanced → ← Rolling Restart →
Data growth
• Same set of tenants across 2 racks
[Chart] Store file size (0 to 4 TB) on each of the two racks, with the point where Favored Nodes were enabled annotated.
Network Utilization
▪ Cluster-level writeRequestRate before and after FN (3x increase)
[Chart] Write request rate, before vs. after Favored Nodes.
