Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes

1,647 views

Published on

Zhiyong Bai

As a high performance and scalable key value database, Zhihu use HBase to provide online data store system along with Mysql and Redis. Zhihu’s platform team had accumulated some experience in technology of container, and this time, based on Kubernetes, we build flexible platform of online HBase system, create multiple logic isolated HBase clusters on the shared physical cluster with fast rapid,and provide customized service for different business needs. Combined with Consul and DNS server, we implement high available access of HBase using client mainly written with Python. This presentation is mainly shared the architecture of online HBase platform in Zhihu and some practical experience in production environment.

hbaseconasia2017 hbasecon hbase

Published in: Technology

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes

  1. 1. Building Online HBase Cluster
 of 
 Zhihu Based on Kubernetes
  2. 2. HBase at Zhihu Agenda PPT模板:www.1ppt.com/moban/ PPT素材:www.1ppt.com/sucai/ PPT背景:www.1ppt.com/beijing/ PPT图表:www.1ppt.com/tubiao/ PPT下载:www.1ppt.com/xiazai/ PPT教程: www.1ppt.com/powerpoint/ 资料下载:www.1ppt.com/ziliao/ 范⽂下载:www.1ppt.com/fanwen/ 试卷下载:www.1ppt.com/shiti/ 教案下载:www.1ppt.com/jiaoan/ PPT论坛:www.1ppt.cn PPT课件:www.1ppt.com/kejian/ 语⽂课件:www.1ppt.com/kejian/yuwen/ 数学课件:www.1ppt.com/kejian/shuxue/ 英语课件:www.1ppt.com/kejian/yingyu/ 美术课件:www.1ppt.com/kejian/meishu/ 科学课件:www.1ppt.com/kejian/kexue/ 物理课件:www.1ppt.com/kejian/wuli/ 化学课件:www.1ppt.com/kejian/huaxue/ ⽣物课件:www.1ppt.com/kejian/shengwu/ 地理课件:www.1ppt.com/kejian/dili/ 历史课件:www.1ppt.com/kejian/lishi/ Using Kubernetes HBase Online Platform
  3. 3. HBase Online Platform Using Kubernetes HBase at Zhihu
  4. 4. • Offline • Physical machine, hundreds of nodes. • Work with Spark/Hadoop. • Online • Based on Kubernetes, more than 300 containers. HBase at Zhihu 01 02
  5. 5. Our online storage 01 02 03 MySQL used in most business some need scale, some need transform all SSD,expensive Redis cache and partial storage no shard expensive HBase / Cassandra / RocksDB etc. ?
  6. 6. Challenges at the beginning • All business at one big cluster • Also runs NodeManager and ImpalaServer • Basically operation • Physical node level monitor
  7. 7. What we want • From Business Sight • environment isolation • SLA definition • business level monition • From Operation Sight • balance resource ( CPU, I/O, RAM ) • friendly api • controllable costs 01 02
  8. 8. Make HBase as a Service. In short:
  9. 9. HBase Online Platform Using Kubernetes HBase at Zhihu
  10. 10. Zhihu’s Unified Cluster Manage Platfom
  11. 11. HBase online cluster • Platform controls cluster • Kubernetes schedule resources • Shared HDFS and ZK • Expose ZK address or ThriftServer to user
  12. 12. Kubernetes Cluster resource manager and scheduler Using container to isolate resource Application management Perfect API and active community 01 02 03 04
  13. 13. Component Design • Pod • infrastructure component • one Pod per component • ReplicationController -> HA • Define A cluster • 1 HMaster RC ( replica = 2 ) • 1 RegionServer RC ( replica = n, n >=1 ) • 1 ThriftServer RC ( replica = m, m>=0 )
  14. 14. Failover Design Data Replication Component Level Cluster Level
  15. 15. • HMaster -> use ZooKeeper • RegionServer -> Stateless designed • ThriftServer -> use proxy • HFile -> ??? Component Level
  16. 16. Component Level - HFile • Shared HDFS Cluster • Keep the whole cluster stateless
  17. 17. Cluster Level • What if cluster Pod is down ? • Kubernetes ReplicationController • What if Kubernetes is down ? • Mixed deployment • Few physical nodes with high CPU && RAM
  18. 18. Data Replication • Replication in cluster • HDFS built in ( 3 replicas) • period hdfs fsck • Replication between clusters • snapshot + bulk load • offline cluster doing MR / Spark 01 02
  19. 19. HBase Online Platform Using Kubernetes HBase at Zhihu
  20. 20. Physical Node Resource CPU: 2 * 12 core Disk: 4 T Memory: 128 G
  21. 21. Resource Definition (1) • Minimize the resource • Business scaled by number of containers • Pros • maximum resource usage on nodes • simplified debug • ease scale • Cons • minimum resource not easy to define by business • hardly tune params for RAMs and GC
  22. 22. Resource Definition (2) • Customize container resource by business • Business scaled by number of containers • Pros • flexible RAM config and tuning • used in production
  23. 23. Container Configuration • JAVA_HOME HBASE_HOME • inject to container via ENV • hdfs-site.xml core-site.xml • add xml config to container • hbase-site.xml hbase-env.sh • use start-env.sh to init configuration • Modify params during cluster running is permitted
  24. 24. RegionServer Configuration • Basie Config • hbase.hregion.majorcompaction = 0 • hbase.regionserver.handler.count = 50 • hbase.regionserver.codecs = snappy • hfile.block.cache.size = 0.4 • Using G1GC ( thanks to Xiaomi )
  25. 25. Network • Dedicated ip per pod • DNS register/deregister automatically • Modified /etc/hosts for pod
  26. 26. Client Design • For Java/Scala • native HBase client • only offer ZK address to business • For Python • happybase • client proxy • service discovery
  27. 27. API Server • A Bridge between Kubernetes and user • Encapsulate component of a HBase cluster • Restful API • Friendly interface
  28. 28. Painful Points • Cons: • fully scan still impact whole cluster • speed limited coprocessor • locality && short circuit • SSD Disk
  29. 29. Monitor Cluster • Physical nodes Level • nodes cpu loads && usage ( via IT ) • Cluster Level • Pods cpu loads ( via cAdvisor) • read && write rate , P95, cacheHit ( via JMX) • Table Level • client write speed && read latency ( via tracing ) • thrift server ( via JMX )
  30. 30. Current Situation • 10+ online business, 300+ Pods • P95 average 20-30 ms • 99.99% SLA in 9 months
  31. 31. Benefits Easy Isolate Flexible
  32. 32. • Almost no code needed • HBase container publish independently • Deployment and orchestration straight forward • Decoupled from physical nodes Easy
  33. 33. • Resource isolation • CPU • memory • Business isolation • data • proxy • monitor Isolate
  34. 34. • Multi version • mostly cdh5.5.0-hbase1.0.0 • one upgrade to 1.2 (HBASE-14283) • customize version easily • Configuration motivated by business • low latency -> read replica • etc. Flexible
  35. 35. • Enhance performance • Netty on ThriftServer • Python HBase Client • SSD for Datanode • Auto scale • by RegionServer number • by JVM heap • etc. Next
  36. 36. Thanks!

×