HBase At Tencent
Andrew Cheng | 程广旭
Tencent | HBase Committer
Content
01. HBase Service In Tencent
02. Applications
03. Practices & Optimization
01. HBase Service In Tencent
HBase Story in Tencent
• In use since 2013
• Versions used
  • 0.94.17 -> 0.98.6 -> 1.2.5 -> 2.2.0 (in progress)
• Largest cluster: more than 500 nodes
• 90+ clusters
• 4000+ nodes
• 10PB+ data
• 3 trillion+ requests per day (RPD)
Overview
HBase users come from 6 groups, with more than 100 different applications
Architecture
[Diagram: Tencent HBase and the systems around it. Labels: ZooKeeper, OpenTSDB, S2Graph, Spark Toolkit, HBase API, TDBank, Lhotse, RestServer, ThriftServer, Kylin, Phoenix, Doss, monitoring, TNM2, Deploy Center, Tenpay, Wepay, Game, Advertisement, …]
02. Applications
Tencent Ads – Real-Time Logjoin System
[Diagram: layered pipeline]
Data Source: Mixer, Exposure, Click, …
Transport: TDBank
Logical: LogJoin (several instances)
Storage: Tencent HBase (Association Table, Flow Table)
Consumer: Model learning, Freshness, Budget control, Report
Tenpay - Transaction record
[Diagram: data flow]
Data Source: MySQL
Binlog Parser: DBSync
Cache: Hippo
Storage: Tencent HBase
Other labels: Thrift Server, Application (C++) with Read/Write arrows, Application (Java) with Read/Write arrows, TDSort
03. Practices & Optimization
Practices–Data migration
Data migration that is transparent to business applications
[Diagram: migration from Cluster A to Cluster B. Step labels: add_peer, disable_peer, Set REPLICATION_SCOPE => '1', snapshot, ExportSnapshot, clone_snapshot, enable_peer, Check Data, Client switch to new cluster, Set REPLICATION_SCOPE => '0', delete_snapshot]
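
One plausible ordering of the steps in the diagram can be sketched as below. The slide only names the commands, so the ordering is an assumption, as are the peer ID, table, column family, snapshot name, cluster key, and HDFS URL passed in. The idea is that the peer is added and disabled before the snapshot is taken, so edits written after the snapshot queue up on Cluster A and are replayed once the snapshot has been cloned on Cluster B and the peer is enabled.

```python
def migration_plan(table, family, peer_id, cluster_key, snapshot, dest_hdfs):
    """Return (where, command) pairs for a snapshot-plus-replication migration.

    All argument values are placeholders supplied by the caller; the step
    order is an assumed reading of the slide, not a confirmed runbook.
    """
    scope = "alter '{t}', {{NAME => '{f}', REPLICATION_SCOPE => '{s}'}}"
    return [
        ("A shell", f"add_peer '{peer_id}', CLUSTER_KEY => \"{cluster_key}\""),
        ("A shell", f"disable_peer '{peer_id}'"),
        ("A shell", scope.format(t=table, f=family, s="1")),
        ("A shell", f"snapshot '{table}', '{snapshot}'"),
        ("A cli", "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
                  f"-snapshot {snapshot} -copy-to {dest_hdfs}"),
        ("B shell", f"clone_snapshot '{snapshot}', '{table}'"),
        ("A shell", f"enable_peer '{peer_id}'"),
        ("manual", "check data on Cluster B, then switch clients to Cluster B"),
        ("A shell", scope.format(t=table, f=family, s="0")),
        ("A shell", f"delete_snapshot '{snapshot}'"),
    ]


if __name__ == "__main__":
    for where, cmd in migration_plan("t", "cf", "1", "zk:2181:/hbase",
                                     "t_snap", "hdfs://cluster-b/hbase"):
        print(f"[{where}] {cmd}")
```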
Practices–Table
• Create one table per day when
  • the amount of data is large
  • the TTL is short
• Benefits
  • Reduces the amount of data in compaction
  • Makes expired data easy to delete
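
A minimal sketch of the table-per-day idea, assuming a hypothetical `<base>_YYYYMMDD` naming scheme (the slides do not show the actual names). Writers pick the table from the current date, and a cleanup job finds whole tables that are older than the TTL.

```python
from datetime import date, timedelta


def table_for(base, day):
    """Name of the daily table that holds rows written on `day`."""
    return f"{base}_{day:%Y%m%d}"


def expired_tables(existing, base, today, ttl_days):
    """Daily tables that are entirely older than the TTL and can be dropped."""
    cutoff = table_for(base, today - timedelta(days=ttl_days))
    prefix = base + "_"
    # YYYYMMDD suffixes sort lexicographically in date order.
    return sorted(
        t for t in existing
        if t.startswith(prefix)
        and len(t) == len(cutoff)
        and t[len(prefix):].isdigit()
        and t < cutoff
    )
```

Dropping an expired table removes its files in one step, so the expired cells never have to be rewritten or purged by compaction.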
Optimization - Bandwidth
[Diagram: write path across RS1, RS2, and RS3, showing where network traffic is generated]
① Input data
② WAL data to RS2 and RS3
③ Flush data to RS2 and RS3
④ Small compaction data to RS2 and RS3
⑤ Large compaction data to RS2 and RS3
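
The diagram can be read as a rough traffic estimate. The sketch below assumes that every WAL, flush, and compaction write on RS1 is copied to two other nodes (RS2 and RS3), and that each compaction kind rewrites a caller-supplied multiple of the input. The rewrite factors and the function name are hypothetical, and compression is ignored.

```python
def network_bytes(input_bytes, remote_replicas=2,
                  small_compact_rewrite=1.0, large_compact_rewrite=1.0):
    """Estimate network traffic caused by writing `input_bytes` of data."""
    stages = {
        "input": input_bytes,                                  # step 1
        "wal": input_bytes * remote_replicas,                  # step 2
        "flush": input_bytes * remote_replicas,                # step 3
        "small_compact": input_bytes * small_compact_rewrite * remote_replicas,  # step 4
        "large_compact": input_bytes * large_compact_rewrite * remote_replicas,  # step 5
    }
    stages["total"] = sum(stages.values())
    return stages
```

Under these assumptions each stage after the input doubles the bytes on the wire, so shrinking the WAL, flush, and compaction volume has a direct effect on bandwidth.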
Optimization - Bandwidth
• Enable compression of CellBlocks
• Enable WAL compression
• Increase the memstore size
• Reduce the number of compaction threads
• Turn off major compaction
• Create tables by day
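
As an illustration, the first five bullets appear to correspond to stock HBase properties. The mapping below is an assumption (the slides do not list property names), and the values are examples, not the ones used at Tencent. The helper only renders the settings as an `hbase-site.xml` fragment.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the bullets above to stock HBase properties.
# The values are illustrative only.
BANDWIDTH_SETTINGS = {
    "hbase.client.rpc.compressor": "org.apache.hadoop.io.compress.GzipCodec",
    "hbase.regionserver.wal.enablecompression": "true",
    "hbase.hregion.memstore.flush.size": str(256 * 1024 * 1024),
    "hbase.regionserver.thread.compaction.small": "1",
    "hbase.regionserver.thread.compaction.large": "1",
    "hbase.hregion.majorcompaction": "0",  # 0 disables time-based major compaction
}


def to_hbase_site(settings):
    """Render a dict of properties as an hbase-site.xml <configuration> element."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```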
Optimization - Online filtering of dirty data
• Problem: a large amount of data shares the same rowkey
• How to find the rowkeys to filter?
  • ResponseTooSlow log entries
• How to set the rowkeys to filter?
  • hbase.hregion.filter.rowkeys
• How to refresh the rowkeys to filter?
  • update_config
[Flowchart: Input Data -> Filter enabled? No: Write. Yes: rowkey matches the filter? Yes: Filter (drop). No: Write.]
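
The write-path decision can be sketched as below. Only the property name `hbase.hregion.filter.rowkeys` comes from the slide. The comma-separated value format, the class, and its methods are hypothetical, and the filter is treated as enabled whenever at least one rowkey is configured.

```python
class RowkeyFilter:
    """Drops writes whose rowkey is on a configurable deny list."""

    def __init__(self, conf):
        self.reload(conf)

    def reload(self, conf):
        # Called again whenever the configuration is refreshed
        # (the slide uses update_config for this).
        raw = conf.get("hbase.hregion.filter.rowkeys", "")
        self.rowkeys = {k.strip() for k in raw.split(",") if k.strip()}
        self.enabled = bool(self.rowkeys)

    def should_write(self, rowkey):
        """True if the row should be written, False if it should be dropped."""
        if not self.enabled:
            return True
        return rowkey not in self.rowkeys
```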
Optimization - Prefix Bloom Filter(HBASE-20636)
• ROWPREFIX_FIXED_LENGTH
• ROWPREFIX_DELIMITED
[Figure: rowkey composed of uin, ts, and action; the Bloom filter is built on a prefix of the rowkey]
[Screenshots: "Create Table" and "File info"]
Optimization - Prefix Bloom Filter(HBASE-20636)
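[Flowchart: Scan (read path)]
1. Take the scan's {StartKey, EndKey}.
2. Is the key length >= prefix_length? No: do not filter the StoreFile.
3. Get the prefix key by prefix_length. Do StartKey and EndKey share the same prefix? No: do not filter the StoreFile.
4. Compute the hash value. Does it hit the BloomFilter? Yes: do not filter the StoreFile. No: filter the StoreFile.

[Flowchart: Write path]
1. Read the rowkey from the input data.
2. Get the prefix key by prefix_length.
3. Compute the hash value and set the BloomFilter.
4. Last line? No: read the next rowkey. Yes: write the BloomFilter information to the StoreFile metadata.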
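
The read and write flows above can be illustrated with a toy fixed-length prefix Bloom filter. This is a sketch of the idea, not the HBASE-20636 implementation: the class, the SHA-256 based hashing, and the bitmap size are all assumptions chosen for brevity.

```python
import hashlib


class PrefixBloom:
    """Toy Bloom filter keyed on the first `prefix_length` bytes of a rowkey."""

    def __init__(self, prefix_length, bits=1024, hashes=3):
        self.prefix_length = prefix_length
        self.bits = bits  # must be a multiple of 8
        self.hashes = hashes
        self.bitmap = bytearray(bits // 8)

    def _positions(self, prefix):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + prefix).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, rowkey):
        # Write path: hash the rowkey prefix and set the matching bits.
        prefix = rowkey[:self.prefix_length]
        for pos in self._positions(prefix):
            self.bitmap[pos // 8] |= 1 << (pos % 8)

    def may_contain_range(self, start_key, end_key):
        """Scan path: False means the StoreFile can be skipped."""
        n = self.prefix_length
        if len(start_key) < n or len(end_key) < n:
            return True  # keys too short: the filter cannot be used
        if start_key[:n] != end_key[:n]:
            return True  # range spans several prefixes
        return all(self.bitmap[p // 8] & (1 << (p % 8))
                   for p in self._positions(start_key[:n]))
```

A scan whose start and end keys share a prefix that was never written lets the reader skip the StoreFile entirely. Any other scan falls back to reading the file, so the filter never hides data.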
Optimization - RestServer
[Diagram: User -> Nginx -> RestServer A, RestServer B, RestServer C, RestServer D -> Cluster A, Cluster B, Cluster C]
Optimization - RestServer
[Diagram: User -> Nginx -> RestServer A, RestServer B, RestServer C -> Cluster A, Cluster B, Cluster C, with MySQL alongside the RestServers]
Optimization - RestServer
• Only one configuration to maintain
• Resources are used effectively
• User-friendly access
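
The slides do not say what is stored in MySQL. One possible reading is that every RestServer looks up the target cluster in a shared routing table, so any RestServer behind Nginx can serve any cluster. The sketch below illustrates that reading only: the schema, the function names, and the example table names are hypothetical, and `sqlite3` stands in for MySQL so the example is self-contained.

```python
import sqlite3


def open_registry():
    # sqlite3 stands in for the shared MySQL database in the diagram.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE routes (hbase_table TEXT PRIMARY KEY, zk_quorum TEXT)")
    return db


def register(db, hbase_table, zk_quorum):
    """Record which cluster (identified by its ZooKeeper quorum) hosts a table."""
    db.execute("INSERT OR REPLACE INTO routes VALUES (?, ?)",
               (hbase_table, zk_quorum))


def resolve(db, hbase_table):
    """Return the cluster that serves `hbase_table`, or None if unknown."""
    row = db.execute("SELECT zk_quorum FROM routes WHERE hbase_table = ?",
                     (hbase_table,)).fetchone()
    return row[0] if row else None
```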
HBase Community
• 1 Committer, 2 Contributors
• Total commits: 80+
• Features
  • HBASE-20636 Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
  • HBASE-19799 Add web UI to rsgroup
  • HBASE-20243 [Shell] Add shell command to create a new table by cloning the existent table
  • HBASE-19483 Add proper privilege check for rsgroup commands
  • …
Join Us
[QR codes: Personal WeChat, Dept. WeChat]
Thanks!

HBaseConAsia 2019: HBase at Tencent
