Storage infrastructure using HBase behind LINE messages

1. Storage infrastructure using HBase behind LINE messages
   NHN Japan Corp., LINE Server Task Force
   Shunsuke Nakamura (@sunsuk7tp)
   Hadoop Conference Japan 2013 Winter, 2013-01-21
2. To support LINE's users, we have built message storage that is:
   • Large scale (tens of billions of rows/day)
   • Responsive (under 10 ms)
   • Highly available (dual clusters)
3. Outline
   • About LINE
   • LINE & storage requirements
   • What we achieved
   • Today's topics
     – IDC online migration
     – NN failover
     – Stabilizing the LINE message cluster
   • Conclusion
4. LINE - a global messenger powered by NHN Japan
   • Devices: 5 different mobile platforms + desktop support
7. New Year 2013 in Japan
   • Chart: number of requests in an HBase cluster, plotted per minute - the New Year 2013 peak was about 3x the usual peak hours
   • New Year greetings ("あけおめ!", "新年好!" - "Happy New Year!") caused a 3x traffic explosion
   • LINE Storage had no problems :)
8. LINE on Hadoop (architecture overview; diagram labels only)
   • Storage for service, backup and logs
   • For HBase, M/R and log archive
   • Bulk migration and ad-hoc analysis
   • For HBase and Sharded-Redis
   • Collecting Apache and Tomcat logs
   • KPI and log analysis
10. LINE service requirements
    LINE is a…
    • Messaging service - should be fast
    • Global service - downtime not allowed
    • But not a simple messaging service: message synchronization between phones & PCs
      – Messages should be kept for a while
11. LINE's storage requirements
    • No data loss
    • Eventual consistency
    • Low latency
    • HA
    • Flexible schema management
    • Easy scale-out
12. Our selection is HBase
    • Low latency for large amounts of data
    • Linearly scalable
    • Relatively low operating cost
      – Replication by nature
      – Automatic failover
    • Data model fits our requirements
      – Semi-structured
      – Timestamp
13. Stored rows per day in a cluster (chart; y-axis: 2–10 billion rows/day)
14. What we achieved with HBase
    • No data loss
      – Persistence
      – Data replication
    • Automatic recovery from server failure
    • Reasonable performance for large data sets
      – Hundreds of billions of rows
      – Write: ~1 ms
      – Read: 1–10 ms
15. Many issues we had
    • Heterogeneous storage coordination
    • IDC online migration
    • Flush & compaction storms caused by "too many HLogs"
    • Row & column distribution
    • Secondary index
    • Region management
      – Load and size balancing
      – RS allocation
      – META region
      – M/R
    • Monitoring for diagnostics
    • Traffic burst caused by decommissioning
    • NN problems
    • Performance degradation
      – Hotspot problem
      – Timeout burst
      – GC problem
    • Client bugs
      – Thread blocking on server failure (HBASE-6364)
16. Today's topics
    • IDC online migration
    • NN failover
    • Stabilizing the LINE message cluster
17. (Section) IDC online migration
18. Why?
    • Move whole HBase clusters and their data
    • To a better network infrastructure
    • Without downtime
19. IDC online migration - before migration
    • The app server writes only to src-HBase; dst-HBase is not yet in use
20. IDC online migration - dual writes
    • Write to both clusters (client-level replication): the app server writes to both src-HBase and dst-HBase
21. IDC online migration - catching up
    • New data: incremental replication
    • Old data: bulk migration
    • dst's timestamp equals src's timestamp (see the sketch below)
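The client-level dual write can be as simple as issuing the same Put, with the same explicit timestamp, against both clusters. This is only a minimal sketch under assumptions: the DualWriter class, the "message" table and the column coordinates are hypothetical, not LINE's actual storage layer; the API shown is the 0.94-era HBase client.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

// Hypothetical dual-writer used by the app server during IDC migration.
public class DualWriter {
    private final HTable srcTable;   // src-HBase (old IDC), still the source of truth
    private final HTable dstTable;   // dst-HBase (new IDC)

    public DualWriter(Configuration srcConf, Configuration dstConf) throws Exception {
        this.srcTable = new HTable(srcConf, "message");   // table name is an assumption
        this.dstTable = new HTable(dstConf, "message");
    }

    public void write(byte[] rowKey, byte[] family, byte[] qualifier, byte[] value)
            throws Exception {
        // One timestamp chosen by the client and shared by both clusters,
        // so that dst's timestamp equals src's.
        long ts = System.currentTimeMillis();

        Put srcPut = new Put(rowKey);
        srcPut.add(family, qualifier, ts, value);
        Put dstPut = new Put(rowKey);
        dstPut.add(family, qualifier, ts, value);

        srcTable.put(srcPut);   // write to the current cluster first
        dstTable.put(dstPut);   // then mirror the same cell to the new cluster
    }
}
```

Sharing the explicit timestamp means that later replication or bulk copies of old data overwrite these cells in place instead of creating divergent versions.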
22. LINE HBase Replicator & BulkMigrator
    • Replicator handles incremental replication
    • BulkMigrator handles bulk migration
23. LINE HBase Replicator
    • Our own implementation
    • Prefers pull over push (HBase's built-in replication pushes from src-HBase; the LINE HBase Replicator pulls into dst-HBase)
    • Throughput throttling
    • Workload isolation between the replicator and the RS
    • Rowkey conversion and filtering
24. LINE HBase Replicator - a simple daemon that replicates local regions
    1. HLogTracker reads a checkpoint and selects the next HLog
    2. For each entry in the HLog:
       1. Filter & convert the HLog.Entry
       2. Create Puts and batch them to the dst HBase
    • Periodic checkpointing
    • Entries are generally replicated within seconds (a rough sketch of this loop follows)
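A rough sketch of the loop described above, not the actual implementation. HLogTracker, HLogEntry, the checkpoint calls, shouldReplicate and convertRowKey are hypothetical placeholders; only the client Put/batch calls are standard HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

// Hypothetical sketch of the per-RS replicator daemon loop.
public class ReplicatorLoop {
    // Hypothetical interfaces standing in for the real HLogTracker/HLog reader:
    interface HLogEntry { List<KeyValue> keyValues(); }
    interface HLogTracker {
        Path nextHLog();                              // read ckpt, pick the next HLog
        Iterable<HLogEntry> readEntries(Path hlog);   // iterate entries of that HLog
        void checkpoint(Path hlog, HLogEntry last);   // record progress
    }

    private final HLogTracker tracker;
    private final HTable dstTable;                    // destination table on dst-HBase
    private static final int BATCH_SIZE = 1000;

    public ReplicatorLoop(HLogTracker tracker, HTable dstTable) {
        this.tracker = tracker;
        this.dstTable = dstTable;
    }

    public void runOnce() throws Exception {
        Path hlog = tracker.nextHLog();               // 1. select next HLog from checkpoint
        List<Put> batch = new ArrayList<Put>();
        for (HLogEntry entry : tracker.readEntries(hlog)) {
            if (!shouldReplicate(entry)) continue;    // 2-1. filter
            for (KeyValue kv : entry.keyValues()) {
                Put put = new Put(convertRowKey(kv.getRow()));   // 2-1. rowkey conversion
                // Preserve the source timestamp so src and dst cells stay identical.
                put.add(kv.getFamily(), kv.getQualifier(), kv.getTimestamp(), kv.getValue());
                batch.add(put);
                if (batch.size() >= BATCH_SIZE) {
                    dstTable.put(batch);              // 2-2. batch to dst HBase
                    batch.clear();
                    tracker.checkpoint(hlog, entry);  // periodic checkpointing
                    Thread.sleep(10);                 // crude throughput throttling
                }
            }
        }
        if (!batch.isEmpty()) dstTable.put(batch);
    }

    private boolean shouldReplicate(HLogEntry e) { return true; }        // table/row filter
    private byte[] convertRowKey(byte[] srcRow) { return srcRow; }       // rowkey conversion
}
```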
25. Bulk migration
    1. MapReduce between any two storages
       – Map tasks only
       – Read from the source, write to the destination
       – Task scheduling depends on region allocation
    2. Non-MapReduce version (BulkMigrator)
       – Our own implementation
       – HBase → HBase
       – On each RS, scan & batch one region at a time
       – Throughput throttling
       – Slow, but easy to implement and debug (see the sketch below)
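A minimal sketch of the non-MapReduce idea: scan one region's key range from the source cluster, re-emit the cells with their original timestamps into the destination cluster, and sleep between batches to throttle. The batch size, sleep interval and class name are assumptions, not the real BulkMigrator.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class RegionBulkCopy {
    // Copy one region's key range [startKey, endKey) from src to dst, with crude throttling.
    public static void copyRegion(HTable src, HTable dst,
                                  byte[] startKey, byte[] endKey) throws Exception {
        Scan scan = new Scan(startKey, endKey);
        scan.setCaching(500);                          // fetch rows in chunks
        ResultScanner scanner = src.getScanner(scan);
        List<Put> batch = new ArrayList<Put>();
        try {
            for (Result r : scanner) {
                Put put = new Put(r.getRow());
                for (KeyValue kv : r.raw()) {
                    // Keep the source timestamp so cells already written by the
                    // dual-write path are simply overwritten in place.
                    put.add(kv.getFamily(), kv.getQualifier(), kv.getTimestamp(), kv.getValue());
                }
                batch.add(put);
                if (batch.size() >= 1000) {
                    dst.put(batch);
                    batch.clear();
                    Thread.sleep(50);                  // throughput throttling
                }
            }
            if (!batch.isEmpty()) dst.put(batch);
        } finally {
            scanner.close();
        }
    }
}
```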
26. (Section) NN failover
27. Background
    • Our HBase has a SPOF: the NameNode
    • Based on "Apache Hadoop HA Configuration" - http://blog.cloudera.com/blog/2009/07/hadoop-ha-configuration/
    • Furthermore, we added Pacemaker
      – Heartbeat alone can't detect whether the NN process is running
28. Previous setup: HA-NN (DRBD + VIP + Pacemaker)
29. NameNode failure in 2012.10
30. HA-NN failover failed
    • The failure was not in the NameNode process itself
    • Incorrect leader election under network partitioning
    • Complicated configuration
      – Easy to misconfigure, difficult to control
      – Pacemaker scripting was not straightforward
      – VIP is risky for HDFS
    • DRBD split-brain problem
      – Protocol C
      – Unable to re-sync while the service is online
31. Now: in-house NN failure handling
    • Bye-bye, old HA-NN
      – Had to restart whole HBase clusters after NN failover
    • Alternative ideas
      – Quorum-based leader election (using ZK)
      – Using an L4 switch
      – Implementing our own AvatarNode
    • We chose the safer solution, accepting a little downtime
32. In-house NN failure handling (1)
    • rsync with --link-dest, run periodically
33. In-house NN failure handling (2) - NN failure occurs (diagram)
34. In-house NN failure handling (3)
35. (Section) Stabilizing the LINE message cluster
36. Stabilizing the LINE message cluster - overview (diagram)
    • Case 1: "Too many HLogs"
    • Case 2: Hotspot performance problems
    • Case 3: META region workload isolation
    • Case 4: Region mappings to RS
    • Also: H/W failure handling, RS GC storms
37. Case 1: "Too many HLogs"
    • Effect
      – MemStore flush storm
      – Compaction storm
    • Cause
      – Uneven region growth
      – Heterogeneous tables on one RS
    • Solution
      – Region balancing
      – External flush scheduler (see the sketch below)
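The slides don't show the external flush scheduler itself; this is only a sketch of the idea under assumptions: flush tables proactively and round-robin so old HLogs can be rolled away before the "too many HLogs" threshold forces every region to flush at once. The table names and intervals are made up; HBaseAdmin.flush is the standard 0.94-era admin call.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ExternalFlushScheduler {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        String[] tables = {"message", "inbox"};        // hypothetical table names
        while (true) {
            for (String table : tables) {
                // Flush MemStores on our own schedule, spread out over time,
                // instead of letting "too many HLogs" trigger a flush storm.
                admin.flush(table);
                Thread.sleep(60 * 1000L);              // pause between tables
            }
            Thread.sleep(30 * 60 * 1000L);             // one pass every 30 minutes (assumption)
        }
    }
}
```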
38. Case 1: Number of HLogs over time (chart)
    • Better case: only periodic flushes; no forced flushes, even at peak
    • Worse case: repeated forced flushes leading to a flush storm
39. Case 2: Hotspot problems
    • Effect
      – Excessive GC
      – RS performance degradation (high CPU usage)
    • Cause
      – Get/Scan against:
        • A row or column updated too frequently
        • A row with too many columns (+ tombstones)
    • Solution
      – Schema design and row/column distribution are important (see the sketch below)
      – Hotspot region isolation
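One common way to get the row/column distribution the slide recommends is to prefix row keys with a short hash of the entity id, so different hot entities land in different regions while each entity's rows stay contiguous for Get/Scan. This is a generic sketch, not LINE's actual schema; the key layout is an assumption.

```java
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.MD5Hash;

public class SaltedKeys {
    // Hypothetical row key: hash-prefix + userId + messageId.
    // The hash prefix spreads users across regions; rows for one user remain
    // adjacent, so per-user Get/Scan still works with a prefix scan.
    public static byte[] saltedRowKey(String userId, long messageId) {
        byte[] salt = Bytes.toBytes(
                MD5Hash.getMD5AsHex(Bytes.toBytes(userId)).substring(0, 4));
        byte[] natural = Bytes.add(Bytes.toBytes(userId), Bytes.toBytes(messageId));
        return Bytes.add(salt, natural);
    }
}
```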
40. Case 3: META region workload isolation
    • Effect
      1. RS high CPU
      2. Excessive timeouts
      3. META lookup timeouts
    • Cause
      – Inefficient exception handling in the HBase client
      – Hotspot region and META on the same RS
    • Solution
      – A META-only RS (see the sketch below)
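The slides don't show how the META-only RS was set up; a minimal sketch of one way to do it with the 0.94-era admin API is to move .META. onto a dedicated server (the server name below is a placeholder). Keeping user-table regions off that server is a separate balancer/operational concern not shown here.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class MoveMetaRegion {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        // Destination RS in "host,port,startcode" form - placeholder value.
        byte[] dest = Bytes.toBytes("meta-rs.example.com,60020,1358700000000");
        // Move the .META. region onto the dedicated region server.
        admin.move(HRegionInfo.FIRST_META_REGIONINFO.getEncodedNameAsBytes(), dest);
    }
}
```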
41. Case 4: Region mappings to RS
    • Effect
      – Region-to-RS mappings are not restored on RS restart
      – Some mappings aren't restored properly even after a graceful restart
        • graceful_stop.sh --restart --reload
    • Cause
      – HBase does not support this well
    • Solution
      – Periodically dump the mappings and restore them (see the sketch below)
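A minimal sketch of the "dump and restore" idea, assuming the 0.94-era HTable.getRegionLocations() and HBaseAdmin.move() calls; how the dump is persisted (here just an in-memory map) and error handling are left out.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionMappingSnapshot {
    // Dump the current region -> region server mapping of a table.
    public static Map<String, String> dump(HTable table) throws Exception {
        Map<String, String> mapping = new HashMap<String, String>();
        for (Map.Entry<HRegionInfo, ServerName> e : table.getRegionLocations().entrySet()) {
            mapping.put(e.getKey().getEncodedName(), e.getValue().getServerName());
        }
        return mapping;
    }

    // Restore a previously dumped mapping by moving regions back to their old servers.
    public static void restore(HBaseAdmin admin, HTable table,
                               Map<String, String> saved) throws Exception {
        for (Map.Entry<HRegionInfo, ServerName> e : table.getRegionLocations().entrySet()) {
            String region = e.getKey().getEncodedName();
            String wanted = saved.get(region);
            if (wanted != null && !wanted.equals(e.getValue().getServerName())) {
                admin.move(Bytes.toBytes(region), Bytes.toBytes(wanted));
            }
        }
    }
}
```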
42. Summary
    • IDC online migration
      – Without downtime
      – LINE HBase Replicator & BulkMigrator
    • NN failover
      – A solution simple enough even for someone who asks "What's Hadoop?"
    • Stabilizing the LINE message cluster
      – Improved RS response time
43. Conclusion
    • We won 100M users while adopting HBase
    • LINE Storage is a successful example of a messaging service using HBase
