Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

HBaseConAsia2018 Track1-3: HBase at Xiaomi

141 views

Published on

Guanghao Zhang of Xiaomi

Published in: Internet
  • Be the first to comment

  • Be the first to like this

HBaseConAsia2018 Track1-3: HBase at Xiaomi

  1. 1. hosted by HBase at Xiaomi Guanghao Zhang zghao@apache.org
  2. 2. hosted by About Xiaomi ● Founded in 2010 ● Worldwide smartphone No.4 shipments, Q2 2018 ● 300+ million global users ● Products: smart phone, TV, AI speaker, smart band…
  3. 3. hosted by Our HBase Team ● 2 PMC members ● 1 Committer ● 3 HBase Contributers
  4. 4. hosted by Content 01 02 03 Xiaomi HBase Quota and Throttling Synchronous Replication
  5. 5. hosted by Xiaomi HBase01
  6. 6. hosted by Architecture HDFS HBase Yarn SDS FDS EMQ MR Spark Zookeeper OpenTSDB
  7. 7. hosted by Clusters IDC ● 5 data centers (China), 30+ online clusters / 3 offline clusters AWS / Alibaba Cloud / KS Cloud / Azure ● 16 clusters (China/Singapore/US/Europe/India/Russia)
  8. 8. hosted by Backup RegionServer RegionServer RegionServer WAL Active cluster RegionServer RegionServer Standby cluster replication Snapshot Object Storage (S3) Message Queue replication Hot Backup Cold Backup Point-in-time recovery HDFS
  9. 9. hosted by Multi-tenancy Physical isolation ● Independent HDFS cluster for HBase ● Independent HBase cluster for important business ● Online serving cluster: SSD vs HDD ● Offline processing cluster Quota and throttling
  10. 10. hosted by Quota and throttling02
  11. 11. hosted by Rpc Throttling Channel Channel Channel Listener Reader Reader Reader Scheduler Queue Queue Queue Handler Handler Handler Handler Handler Handler Throttling when handler start to run request
  12. 12. hosted by Quota Type User Table Namespace User ⇒ Table User ⇒ Namespace Read Write Requests number Requests size 1. Hard limit 2. If any one quota exceed, then throttling
  13. 13. hosted by More Quota Types at Xiaomi RegionServer Quota: hard limit Request Unit: calculate both number and size ● Read capacity unit: 1KB/sec ● Write capacity unit: 1KB/sec Soft limit: allow to exceed user’s quota when regionserver quota not exceed
  14. 14. hosted by Other Improvements Switch to start/stop throttling Metrics and UI support Handle ThrottlingException in client ● DoNotRetryNowException ● Avoid MR/Spark job failed by throttling Punishment mechanism for huge requests
  15. 15. hosted by Synchronous Replication03
  16. 16. hosted by Async Replication RegionServer RegionServer RegionServer WAL Active cluster Client sync async RegionServer RegionServer RegionServer Standby cluster replication write read
  17. 17. hosted by Async Replication RegionServer RegionServer RegionServer WAL Active cluster Client sync async RegionServer RegionServer RegionServer Standby cluster replication read inconsistent data crashed crashed crashed
  18. 18. hosted by Sync Replication RegionServer RegionServer RegionServer WAL Active cluster Client sync async RegionServer RegionServer RegionServer Standby cluster replication write read Remote WAL sync
  19. 19. hosted by Sync Replication RegionServer RegionServer RegionServer WAL Active cluster Client sync async RegionServer RegionServer RegionServer Standby cluster replication read Remote WAL sync crashed crashed crashed replay
  20. 20. hosted by Sync Replication State Active Downgrade Active Standby
  21. 21. hosted by Setup Sync Replication Active Downgrade Active StandbyDowngrade Active source cluster peer cluster Downgrade Active Standby replication Step 1 Step 2 Step 3 1. Client read/write active cluster 2. Standby cluster accept async replication request replication replication
  22. 22. hosted by DualAsyncFSWAL Add a remoteWALDir to ReplicationPeerConfig Concurrent write WAL and remote WAL Write WAL Write remote WAL Client Success Success Success Same data Success Fail/Timeout Timeout Source cluster may has more data Fail/Timeout Success Timeout Peer cluster may has more data Fail Fail Fail Same data Timeout Timeout Timeout Source or peer cluster may has more data
  23. 23. hosted by One RegionServer Crashed Active Standby source cluster peer cluster Standby replication Step 1 Step 2 Step 3 When one regionserver crashed, it need copy WALs to remote WAL dir first. Then failover with normal way. replication replication Active Active StandbyRegionServercrashed RegionServercrashed WAL Failover
  24. 24. hosted by One RegionServer Crashed Active source cluster peer cluster StandbyStep 3 replication Failover Case 1: source cluster has more data Solution: It will replicate data to peer cluster, so the data will eventually consistent. Case 2: peer cluster has more data Solution: Active cluster will copy WAL (which need to split) to remote WAL dir first, and the original remote WAL will be replaced. So the data will eventually consistent.
  25. 25. hosted by Standby Cluster Crashed Active StandbyDowngrade Active source cluster peer cluster Standby replication Step 1 Step 2 Step 3 The async replication will stuck when standby crashed. The block data will be replicated to standby cluster when it come back. replication replication Active Standbycrashed crashed
  26. 26. hosted by Standby Cluster Crashed Active source cluster peer cluster StandbyStep 3 replication Case 1: source cluster has more data Solution: It will replicate data to peer cluster, so the data will eventually consistent. Case 2: peer cluster has more data Solution: The remote WALs not used anymore and will be deleted when the async replication replicate the original WALs to peer cluster. So the data will eventually consistent.
  27. 27. hosted by Active Cluster Crashed source cluster peer cluster replication Step 1 Step 2 Step 3 When source cluster come back and transit it to standby, it will skip to replay the original WALs. replication replication Active Active crashed crashed Standby Downgrade Active ActiveStandby 1. Rename remote WAL dir 2. Replay remote WALs 3. Client read/write peer cluster 4. peer cluster doesn’t accept replication request
  28. 28. hosted by Active Cluster Crashed source cluster peer cluster Step 3 replication ActiveStandby Case 1: source cluster has more data Solution: It will skip to replay the original WALs(which has been replayed in peer cluster) and accept replication request which from peer cluster. So the data will eventually consistent. Case 2: peer cluster has more data Solution: It will replay the remote WALs and replicate it to source cluster. So the data will eventually consistent.
  29. 29. hosted by Sync Replication State Constraint Active Downgrade Active Standby Write remote WAL Yes No No Accept client’s read/write request Yes Yes No Accept async replication request No No Yes
  30. 30. hosted by Replication: Async vs Sync Async Replication Sync Replication Read Path No affect No affect Write Path No affect Write extra remote WAL Network bandwidth 100% for async replication. 100% for async + 100% for remote WAL. Storage space No affect Need extra space for remote WAL Eventual Consistency No if active cluster crashed Yes Availability Unavailable when master crash Few time for waiting replay remote log. Operational Complexity Simple More complex, Need to transition cluster state by hand or your services.
  31. 31. hosted by Performance YCSB 10^8 operations, 100% Put Qps: Almost 14% decline compared with async replication regionserver.heapsize=8g, datanode.heapsize=4g, CMS-gc, disk=1*4T(HDD), cpu=24core Qps Avg latency P99 latency P999 latency Async replication 43K 3ms <49.6ms <567ms Sync replication 37K 3.5ms <74ms <558ms
  32. 32. hosted by Performance
  33. 33. hosted by Sync Replication HBASE-19064: The idea comes from Alibaba's presentation on HBaseCon Asia 2017 HBASE-20422: Synchronous replication for HBase phase II (OPEN) Sync Replication Xiaomi solution Alibaba solution Base branch master 0.94 Open source will be released in hbase 3.0 internal solution Qps 14% decline compared with async replication 2% decline compared with async replication
  34. 34. hosted by Thanks Guanghao Zhang zghao@apache.org

×