Improvements to Apache HBase and Its Applications in Alibaba Search

720 views

Published on

Yu Li and Shaoxuan Wang (Alibaba)

HBase is the core storage system in Alibaba’s Search Infrastructure. In this session, we will talk about the details of how we use HBase to serve such high-throughput, low-latency, mixed workloads and the various improvements we made to HBase to meet these challenges.

Published in: Software
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
720
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
0
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

Improvements to Apache HBase and Its Applications in Alibaba Search

  1. 1. HBase in Alibaba Search ShaoXuan Wang, Yu Li {shaoxuan.wsx, jueding.ly} @alibaba-inc.com
  2. 2. Agenda   n  History  of  HBase  in  Alibaba  Search   n  Scenarios   l  Search  Indexing   l  Machine  Learning  for  Recommendation  and  Targeting   n  HBase  Improvements   l  Multiple  WAL,  Fine-grained  IO  Control,  etc.   n  HBase  Extensions   l  HQueue:  light-weight  message  queue  impl  on  HBase   l  HTunnel:  update  notification  service  based  on  HBase/HQueue   n  Challenges  and  future  
  3. 3. About  Us   n  Alibaba   l  Operates  the  world  largest  online  and  mobile  marketplace,  thriving  in  the  world  largest  e-commerce  market.   l  Annual  GMV  $394Billion  in  year  2015   l  alibaba.com,  aliexpress.com,  taobao.com,  tmall.com   n  Alibaba  Search   l  Serving  410  million  monthly  active  users   l  Personalized  recommendation  and  targeting  via  machine  learning  system   l  Major  contributor  to  GMV  
  4. 4. HBase  in  Alibaba  Search   n HBase  is  the  core  storage  in  Alibaba  search  system,  since  2010   n History  of  version  used  online   l  2010~2014:  0.20.6à0.90.3à0.92.1à0.94.1à0.94.2à0.94.5   l  2014~2015:  0.94à0.98.1à0.98.4à0.98.8à0.98.12   l  2016:  0.98.12à1.1.2   n Current  scale   l  3  clusters  each  with  1,000+  nodes   l  Shared  with  Flink/Yarn   l  Serving  over  10Million/s  Ops  throughout  the  day  
  5. 5. Data  Platform  for  Search  Indexing   n Data  Storage  for  Batch  and  Streaming  Processing   Data  Source   Hadoop  cluster   HBase   Batch  &   Streaming  Event   Offline  &  Real   Time  Processing   Exporting   Ali  ODPS   MySQL   Search  Engines   HBase   HBase   HDFS   HDFS   HDFS  
  6. 6. Data  Platform  for  Search  Indexing   n Continuous  Updated  Materialized  View  on  HBase   Streaming Data Join (Apply UDF) Source Source Source Join (Apply UDF) Materialized View Materialized View User Defined Processing DAG Batch Data Continuous Updated Result
  7. 7. Database  and  Queue  for  Machine  Learning   UDF UDF UDFHQueue Online log Parsing Log Training User Models Training Item ModelsItem ID User ID HQueue Aggregate Updates Machine Learning Models Online System Δw Export Model Updates Model Flink+MR Processing over Yarn
  8. 8. HBase  Improvements:  Multiple  WAL   n Multiple  WAL:  HBASE-14457   l  Fix  replication  under  multiwal  (HBASE-6617)   l  Namespace-based  region  grouping  strategy  (HBASE-14456)   l  Performance  improvement  observed  in  production  usage   n  Pure  SATA  disk:  ~20%  better  than  single  WAL   n  ONE_SSD  storage  policy  with  SATA-SSD:  ~40%  better   n  PCIe-SSD:  hsync  acceptable  with  multiwal  
  9. 9. HBase  Improvements:  IO  Isolation   n Challenge  from  shared  storage/computing  nodes   n Take  usage  of  Storage  Policy   l  ALL_SSD  for  WALs   l  ONE_SSD  for  HFile:  Support  CF-level  storage  policy  (HBASE-14061)   l  Support  setting  storage  policy  in  Bulkload  (HBASE-15172)   l  Only  use  SATA  disk  for  MR  temporary  data  (mapreduce.cluster.local.dir)  
  10. 10. HBase  Improvements:  IO  Isolation   n Better  control  of  Disk/Network  IO  spike   l  Compaction  throttling  (HBASE-8329)  +  Flush  throttling  (HBASE-14969)   l  Per-CF  flush  improvement:  further  less  flush  on  small  CF  (HBASE-14906)  
  11. 11. HBase  Improvements:  Machine  Learning  specific   n Remove  some  synchronous  in  the  RpcServer  responder  (HBASE-11297)   l  Verified  to  be  important  if  heavy  access  from  one  single  client   l  Not  included  in  0.98,  but  nice  to  have   n Improve  parallel  reading  a  single  key  from  BucketCache  (HBASE-14463)   l  Synchronous  =>  read/write  lock   l  Not  included  in  0.98,  but  nice  to  have  
  12. 12. HBase  Improvements:  Monitoring   n Add  metrics  for  get/scan/multi/mutate  count  separately  (HBASE-15163)     n Add  back  HFile  HDFS  op  latency  metrics  (HBASE-15160)  
  13. 13. HBase  Extension:  HQueue   n A  light-weight  message  queue  service  embedded  in  HBase   n Why  not  simply  using  Kafka?   l  Easier  deployment/management:  queue  and  storage  service  in  a  whole   l  Comparable  perf/features  through  native  features  HBase  supplies   n  TTL   n  Load  balance   n  Fault  tolerant/Failover   n  High  throughput   n  Good  scalability   n  Replication  
  14. 14. HBase  Extension:  HQueue   n Message  write  process  
  15. 15. HBase  Extension:  HQueue   n Message  read  process  
  16. 16. n An  update-notification  service  based  on  HBase/HQueue   n Similar  mechanism  to  replication   n Architecture   HBase  Extension:  HTunnel  
  17. 17. HBase  Extension:  HTunnel   n Working  process  
  18. 18. Challenges  and  future   n Meet  hardware  revolution   l  PCIe-SSD  plus  10Gb  network  card   l  Optimization  required  on  RPC/HDFS  layer   l  CPU  may  become  bottleneck   n Meet  challenges  from  new  database   l  HBase  v.s.  Rocksdb  
  19. 19. Q  &  A   Thank  You!  

×