HBaseCon 2015: HBase @ CyberAgent
Toshihiro Suzuki, Hirotaka Kakishima
Who We Are
● Hirotaka Kakishima
o Database Engineer, CyberAgent, Inc.
● Toshihiro Suzuki
o Software Engineer, CyberAgent, Inc.
o Worked on HBase since 2012
o @brfrn169
Who We Are
We authored
Beginner’s Guide to HBase
in Japanese
Who We Are
Our office is located
in Akihabara, Japan
Agenda
● About CyberAgent & Ameba
● HBase @ CyberAgent
Our HBase History
Use Case: Social Graph Database
About CyberAgent
● Advertising (agency, tech)
● Games
● Ameba
https://www.cyberagent.co.jp/en/
CyberAgent, Inc.
What’s Ameba?
● Blogging/Social Networking/Game Platform
● 40 million users
What’s Ameba?
Ranking of Domestic Internet Services
by Nielsen 2014
http://www.nielsen.com/jp/ja/insights/newswire-j/press-release-chart/nielsen-news-release-20141216.html
(Table: rank, website name, and monthly unique visitors, for Desktop and Smartphone)
Ameba Blog 1.9 billion blog articles
Ameba Pigg
… and More
Platform
HBase @ CyberAgent
We Use HBase for
Log Analysis
Social Graph
Recommendations
Advertising Tech
Our HBase History (1st Gen.)
● For Log Analysis
● HBase 0.90 (CDH3)
(Diagram: logs from our web application are transferred, via log collection or SCP, to an HDFS sink; MapReduce jobs process them and store the results.)
Our HBase History (2nd Gen.)
● For Social Graph Database, 24/7
● HBase 0.92 (CDH4b1), HDFS CDH3u3
● NameNode using Fault Tolerant Server
http://www.nec.com/en/global/prod/express/fault_tolerant/technology.html
Our HBase History (2nd Gen.)
● Replication using our own WAL-apply method
● 10TB (not counting HDFS replicas)
● 6 million requests per minute
● Average Latency < 20ms
Our HBase History (3rd Gen.)
● For other social graph, recommendations
● HBase 0.94 (CDH4.2 – CDH4.7)
● NameNode HA
● Chef
● Master-slave replication (some clusters patched HBASE-8207)
Our HBase History (4th Gen.)
● For advertising tech (DSP, DMP, etc.)
● HBase 0.98 (CDH5.3)
● Amazon EC2
● Master-master replication
● Cloudera Manager
Currently
● 10 Clusters in Production
● 10 ~ 50 RegionServers / Cluster
● uptime:
16 months (0.92) : Social Graph
24 months (0.94) : Other Social Graph
2 months (0.98) : Advertising tech
We Cherish the Basics
● Learning the architecture
● Considering table schema (very important)
● Having enough RAM, disks, and network bandwidth
● Splitting large regions and running major compactions at off-peak times
● Monitoring metrics & tuning configuration parameters
● Catching up on bug reports in JIRA
Next Challenge
● We are going to migrate our cluster
from 0.92 to 1.0
Case: Ameba’s Social Graph
Graph data
Platform for Smartphone Apps
Requirements
● Scalability
o growing social graph data
● High availability
o 24/7
● Low latency
o for online access
Why HBase
● Auto sharding
● Auto failover
● Low latency
We decided to use HBase and developed
a graph database built on it
How we use HBase
as a Graph Database
System Overview
(Diagram: clients send requests to Gateway servers, which access HBase.)
Data Model
● Property Graph
(Diagram: node1 {name: Taro, age: 24}, node2 {name: Ichiro, age: 31}, and node3 {name: Jiro, age: 54}, connected by "follow" relationships, each carrying a date property: 5/7, 4/1, 3/31.)
API
// Obtain a graph instance (construction omitted)
Graph g = ...
// Create two nodes, each with a "name" property
Node node1 = g.addNode();
node1.setProperty("name", valueOf("Taro"));
Node node2 = g.addNode();
node2.setProperty("name", valueOf("Ichiro"));
// Create a "follow" relationship from node1 to node2 with a "date" property
Relationship rel = node1.addRelationship("follow", node2);
rel.setProperty("date", valueOf("2015-02-19"));
// List node1's outgoing and node2's incoming "follow" relationships
List<Relationship> outRels = node1.out("follow").list();
List<Relationship> inRels = node2.in("follow").list();
Schema Design
● RowKey
o <hash(nodeId)>-<nodeId>
● Column
o n:
o r:<direction>-<type>-<nodeId>
● Value
o Serialized properties
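A minimal sketch of how this row key and column layout might be encoded, assuming an MD5-based hash prefix, hex encoding, and "-" separators (the slides do not specify the hash function or serialization format):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Sketch of the row key and column qualifier encoding described above.
// Assumptions (not stated in the slides): MD5 as the hash, a short hex
// prefix, and "-" as the separator.
public class GraphSchema {
    // RowKey: <hash(nodeId)>-<nodeId>, so rows spread evenly across regions
    // while the full nodeId stays recoverable from the key.
    static String rowKey(String nodeId) {
        return hash(nodeId) + "-" + nodeId;
    }

    // Column for node properties: "n:"
    static String nodeColumn() {
        return "n:";
    }

    // Column for a relationship: r:<direction>-<type>-<nodeId>
    static String relColumn(String direction, String type, String nodeId) {
        return "r:" + direction + "-" + type + "-" + nodeId;
    }

    static String hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 4; i++) { // a short prefix is enough to distribute keys
                sb.append(String.format("%02x", d[i]));
            }
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rowKey("nodeId1"));
        System.out.println(relColumn("OUTGOING", "follow", "nodeId2"));
    }
}
```

Keeping the full nodeId after the hash prefix matters: the hash alone would distribute load, but the readable suffix lets the gateway map a row back to its node without a reverse lookup.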
Schema Design (Example)
Node node1 = g.addNode();
node1.setProperty("name", valueOf("Taro"));
Node node2 = g.addNode();
node2.setProperty("name", valueOf("Ichiro"));
Node node3 = g.addNode();
node3.setProperty("name", valueOf("Jiro"));
(Diagram: node1, node2, and node3 are created, with no relationships yet.)
Schema Design (Example)
RowKey | Column | Value
hash(nodeId1)-nodeId1 | n: | {“name”: “Taro”}
hash(nodeId2)-nodeId2 | n: | {“name”: “Ichiro”}
hash(nodeId3)-nodeId3 | n: | {“name”: “Jiro”}
Schema Design (Example)
Relationship rel1 = node1.addRelationship("follow", node2);
rel1.setProperty("date", valueOf("2015-02-19"));
Relationship rel2 = node1.addRelationship("follow", node3);
rel2.setProperty("date", valueOf("2015-02-20"));
Relationship rel3 = node3.addRelationship("follow", node2);
rel3.setProperty("date", valueOf("2015-04-12"));
(Diagram: node1 follows node2 and node3; node3 follows node2.)
Schema Design (Example)
RowKey | Column | Value
hash(nodeId1)-nodeId1 | n: | {“name”: “Taro”}
 | r:OUTGOING-follow-nodeId2 | {“date”: “2015-02-19”}
 | r:OUTGOING-follow-nodeId3 | {“date”: “2015-02-20”}
hash(nodeId2)-nodeId2 | n: | {“name”: “Ichiro”}
 | r:INCOMING-follow-nodeId1 | {“date”: “2015-02-19”}
 | r:INCOMING-follow-nodeId3 | {“date”: “2015-04-12”}
hash(nodeId3)-nodeId3 | n: | {“name”: “Jiro”}
 | r:OUTGOING-follow-nodeId2 | {“date”: “2015-04-12”}
 | r:INCOMING-follow-nodeId1 | {“date”: “2015-02-20”}
Schema Design (Example)
List<Relationship> outRels = node1.out("follow").list();
(Diagram: node1's outgoing "follow" relationships to node2 and node3 are highlighted.)
To answer this query, the r:OUTGOING-follow-* columns of node1's row are read:
RowKey | Column | Value
hash(nodeId1)-nodeId1 | n: | {“name”: “Taro”}
 | r:OUTGOING-follow-nodeId2 | {“date”: “2015-02-19”}
 | r:OUTGOING-follow-nodeId3 | {“date”: “2015-02-20”}
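Because all of a node's relationship columns share the r:<direction>-<type>- prefix and HBase stores column qualifiers in sorted order, listing relationships is a single-row read over a qualifier prefix. A minimal sketch of that lookup, with a TreeMap standing in for one node's row (class and method names here are illustrative, not the actual implementation; the real system would use a Get with a column-prefix filter):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Simulates listing a node's relationships from its row.
// A TreeMap stands in for the sorted column qualifiers of one HBase row.
public class RelationshipLookup {
    public static List<String> list(TreeMap<String, String> row,
                                    String direction, String type) {
        String prefix = "r:" + direction + "-" + type + "-";
        List<String> nodeIds = new ArrayList<>();
        // subMap selects exactly the qualifiers starting with the prefix,
        // because the qualifiers are stored in sorted order.
        SortedMap<String, String> hits =
                row.subMap(prefix, prefix + Character.MAX_VALUE);
        for (String qualifier : hits.keySet()) {
            // The target nodeId is the suffix after the prefix.
            nodeIds.add(qualifier.substring(prefix.length()));
        }
        return nodeIds;
    }

    public static void main(String[] args) {
        TreeMap<String, String> node1Row = new TreeMap<>();
        node1Row.put("n:", "{\"name\": \"Taro\"}");
        node1Row.put("r:OUTGOING-follow-nodeId2", "{\"date\": \"2015-02-19\"}");
        node1Row.put("r:OUTGOING-follow-nodeId3", "{\"date\": \"2015-02-20\"}");
        // node1.out("follow") resolves to [nodeId2, nodeId3]
        System.out.println(list(node1Row, "OUTGOING", "follow"));
    }
}
```

The same single-row scan answers in("follow") by switching the prefix direction to INCOMING, which is why the schema mirrors every relationship into both nodes' rows.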
Schema Design (Example)
List<Relationship> inRels = node2.in("follow").list();
(Diagram: node2's incoming "follow" relationships from node1 and node3 are highlighted.)
To answer this query, the r:INCOMING-follow-* columns of node2's row are read:
RowKey | Column | Value
hash(nodeId2)-nodeId2 | n: | {“name”: “Ichiro”}
 | r:INCOMING-follow-nodeId1 | {“date”: “2015-02-19”}
 | r:INCOMING-follow-nodeId3 | {“date”: “2015-04-12”}
Consistency Problem
● HBase has no native cross-row transaction
support
● Possibility of inconsistency between the
OUTGOING and INCOMING rows
Consistency Problem
RowKey | Column | Value
hash(nodeId1)-nodeId1 | n: | {“name”: “Taro”}
 | r:OUTGOING-follow-nodeId2 | {“date”: “2015-02-19”}
hash(nodeId2)-nodeId2 | n: | {“name”: “Ichiro”}
 | r:INCOMING-follow-nodeId1 | {“date”: “2015-02-19”} ← Inconsistency if this second write fails
hash(nodeId3)-nodeId3 | n: | {“name”: “Jiro”}
Coprocessor
● Endpoints
o like a stored procedure in RDBMS
o push your business logic into RegionServer
● Observers
o like a trigger in RDBMS
o insert user code by overriding upcall methods
Using Observers
● We use 2 observers
o WALObserver#postWALWrite
o RegionObserver#postWALRestore
● Both run the same logic
o write the INCOMING row
● Eventual consistency
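The observer logic above can be sketched in plain Java, with a map standing in for HBase and a method standing in for the WALObserver#postWALWrite hook firing after the WAL append. All class and method names here are illustrative, not the actual coprocessor code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch: after the OUTGOING cell is durably written (in the real system,
// after its WAL entry is persisted), the hook derives the mirror INCOMING
// cell and writes it to the target node's row. If the RegionServer dies
// first, replaying the same WAL entry through the analogous
// postWALRestore hook reproduces the INCOMING write, giving eventual
// consistency.
public class IncomingRowWriter {
    // table: rowKey -> (column qualifier -> value)
    final Map<String, TreeMap<String, String>> table = new HashMap<>();

    // Client path: writes only the OUTGOING cell, then the hook fires.
    void addRelationship(String fromId, String type, String toId, String value) {
        put(rowKey(fromId), "r:OUTGOING-" + type + "-" + toId, value);
        postWALWrite(fromId, type, toId, value); // stands in for the observer
    }

    // Hook: derive the INCOMING cell from the OUTGOING one and write it.
    void postWALWrite(String fromId, String type, String toId, String value) {
        put(rowKey(toId), "r:INCOMING-" + type + "-" + fromId, value);
    }

    void put(String row, String column, String value) {
        table.computeIfAbsent(row, k -> new TreeMap<>()).put(column, value);
    }

    static String rowKey(String nodeId) {
        return "hash-" + nodeId; // hash prefix elided in this sketch
    }

    public static void main(String[] args) {
        IncomingRowWriter g = new IncomingRowWriter();
        g.addRelationship("nodeId1", "follow", "nodeId2", "{\"date\": \"2015-02-19\"}");
        // Both mirror cells now exist:
        System.out.println(g.table.get("hash-nodeId1"));
        System.out.println(g.table.get("hash-nodeId2"));
    }
}
```

The key property this models: the INCOMING write is always derived from the same durable OUTGOING record, so whichever hook runs (post-write or post-restore), it produces the same mirror cell.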
Using Observers (Normal Case)
(Diagram: Client → RegionServer (Memstore, WALObserver#postWALWrite) → HDFS (WALs))
1. The client writes only an OUTGOING row
2. The RegionServer writes it to the Memstore
3. The WAL is written to HDFS
4. WALObserver#postWALWrite writes the INCOMING row
5. The RegionServer responds to the client
Using Observers (Abnormal Case)
(Diagram: Client → RegionServer (Memstore, WALObserver#postWALWrite) → HDFS (WALs))
1. The client writes only an OUTGOING row
2. The RegionServer writes it to the Memstore
3. The WAL is written to HDFS, but the RegionServer fails before the INCOMING row is written
(Diagram: Another RegionServer (Memstore, RegionObserver#postWALRestore) ← HDFS (WALs))
4. Another RegionServer replays the WAL of the OUTGOING row
5. RegionObserver#postWALRestore writes the INCOMING row
Summary
● We have used HBase in several projects
o Log Analysis, Social Graph, Recommendations,
Advertising tech
● We developed a graph database built on HBase
o HBase is good for storing social graphs
o We use coprocessors to resolve the consistency problem
If you have any questions,
please tweet @brfrn169.
Questions

HBaseCon 2015: HBase @ CyberAgent

  • 1.
    HBase @ CyberAgent ToshihiroSuzuki, Hirotaka Kakishima
  • 2.
    Who We Are ●Hirotaka Kakishima o Database Engineer, CyberAgent, Inc. ● Toshihiro Suzuki o Software Engineer, CyberAgent, Inc. o Worked on HBase since 2012 o @brfrn169
  • 3.
    Who We Are Weauthored Beginner’s Guide to HBase in Japanese
  • 4.
    Who We Are Ouroffice is located in Akihabara, Japan
  • 5.
    Agenda ● About CyberAgent& Ameba ● HBase @ CyberAgent Our HBase History Use Case: Social Graph Database
  • 6.
  • 7.
    ● Advertising (agency,tech) ● Games ● Ameba https://www.cyberagent.co.jp/en/ CyberAgent, Inc.
  • 8.
  • 9.
    ● Blogging/Social Networking/GamePlatform ● 40 million users What’s Ameba?
  • 10.
    Ranking of DomesticInternet Services Desktop Smartphone by Nielsen 2014 http://www.nielsen.com/jp/ja/insights/newswire-j/press-release-chart/nielsen-news-release-20141216.html Rank WebSite Name Monthly Unique Visitors WebSite Name Monthly Unique VisitorsRank
  • 11.
    Ameba Blog 1.9billion blog articles
  • 12.
  • 13.
  • 14.
  • 15.
    We Use HBasefor Log Analysis Social Graph Recommendations Advertising Tech
  • 16.
    ● For LogAnalysis ● HBase 0.90 (CDH3) Our HBase History (1st Gen.) Log or SCP Transfer & HDFS Sink M/R & Store Results Our Web Application
  • 17.
    Our HBase History(2nd Gen.) ● For Social Graph Database, 24/7 ● HBase 0.92 (CDH4b1), HDFS CDH3u3 ● NameNode using Fault Tolerant Server http://www.nec.com/en/global/prod/express/fault_tolerant/technology.html
  • 18.
    Our HBase History(2nd Gen.) ● Replication using original WAL apply method ● 10TB (not considering HDFS Replicas) ● 6 million requests per minutes ● Average Latency < 20ms
  • 19.
    Our HBase History(3rd Gen.) ● For other social graph, recommendations ● HBase 0.94 (CDH4.2 〜 CDH4.7) ● NameNode HA ● Chef ● Master-slave replication (some clusters patched HBASE-8207)
  • 20.
    Our HBase History(4th Gen.) ● For advertising tech (DSP, DMP, etc.) ● HBase 0.98 (CDH5.3) ● Amazon EC2 ● Master-master replication ● Cloudera Manager
  • 21.
    Currently ● 10 Clustersin Production ● 10 ~ 50 RegionServers / Cluster ● uptime: 16 months (0.92) : Social Graph 24 months (0.94) : Other Social Graph 2 months (0.98) : Advertising tech
  • 22.
    We Cherish theBasics ● Learning architecture ● Considering Table Schema (very important) ● Having enough RAM, DISKs, Network Bandwidth ● Splitting large regions and running major compaction at off-peak ● Monitoring metrics & tuning configuration parameters ● Catching up BUG reports @ JIRA
  • 23.
    Next Challenge ● Weare going to migrate cluster from 0.92 to 1.0
  • 24.
  • 25.
    Graph data Platform forSmartphone Apps
  • 26.
    Requirements ● Scalability o growingsocial graph data ● High availability o 24/7 ● Low latency o for online access
  • 27.
    Why HBase ● Autosharding ● Auto failover ● Low latency We decided to use HBase and developed graph database built on it
  • 28.
    How we useHBase as a Graph Database
  • 29.
  • 30.
    Data Model ● PropertyGraph follow follow follow node1 node2 node3
  • 31.
    Data Model ● PropertyGraph follow follow follow node1 node2 node3 name Taro age 24 date 5/7 name Ichiro age 31 date 4/1 date 3/31 name Jiro age 54
  • 32.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 33.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 34.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 35.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 36.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 37.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 38.
    API Graph g =... Node node1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Relationship rel = node1.addRelationship("follow", node2); rel.setProperty("date", valueOf("2015-02-19")); List<Relationship> outRels = node1.out("follow").list(); List<Relationship> inRels = node2.in("follow").list();
  • 39.
    Schema Design ● RowKey o<hash(nodeId)>-<nodeId> ● Column o n: o r:<direction>-<type>-<nodeId> ● Value o Serialized properties
  • 40.
    Schema Design (Example) Nodenode1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Node node3 = g.addNode(); node3.setProperty("name", valueOf("Jiro"));
  • 41.
    Schema Design (Example) Nodenode1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Node node3 = g.addNode(); node3.setProperty("name", valueOf("Jiro")); node1
  • 42.
    Schema Design (Example) Nodenode1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Node node3 = g.addNode(); node3.setProperty("name", valueOf("Jiro")); node1 node2
  • 43.
    Schema Design (Example) Nodenode1 = g.addNode(); node1.setProperty("name", valueOf("Taro")); Node node2 = g.addNode(); node2.setProperty("name", valueOf("Ichiro")); Node node3 = g.addNode(); node3.setProperty("name", valueOf("Jiro")); node1 node3 node2
  • 44.
  • 45.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”}
  • 46.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”}
  • 47.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”}
  • 48.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”}
  • 49.
    Schema Design (Example) Relationshiprel1 = node1.addRelationship("follow", node2); rel1.setProperty("date", valueOf("2015-02-19")); Relationship rel2 = node1.addRelationship("follow", node3); rel2.setProperty("date", valueOf("2015-02-20")); Relationship rel3 = node3.addRelationship("follow", node2); rel3.setProperty("date", valueOf("2015-04-12")); node1 node3 node2
  • 50.
    Schema Design (Example) Relationshiprel1 = node1.addRelationship("follow", node2); rel1.setProperty("date", valueOf("2015-02-19")); Relationship rel2 = node1.addRelationship("follow", node3); rel2.setProperty("date", valueOf("2015-02-20")); Relationship rel3 = node3.addRelationship("follow", node2); rel3.setProperty("date", valueOf("2015-04-12")); node1 node3 node2 follow
  • 51.
    Schema Design (Example) Relationshiprel1 = node1.addRelationship("follow", node2); rel1.setProperty("date", valueOf("2015-02-19")); Relationship rel2 = node1.addRelationship("follow", node3); rel2.setProperty("date", valueOf("2015-02-20")); Relationship rel3 = node3.addRelationship("follow", node2); rel3.setProperty("date", valueOf("2015-04-12")); node1 node3 node2 follow follow
  • 52.
    Schema Design (Example) Relationshiprel1 = node1.addRelationship("follow", node2); rel1.setProperty("date", valueOf("2015-02-19")); Relationship rel2 = node1.addRelationship("follow", node3); rel2.setProperty("date", valueOf("2015-02-20")); Relationship rel3 = node3.addRelationship("follow", node2); rel3.setProperty("date", valueOf("2015-04-12")); node1 node3 node2 follow followfollow
  • 53.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”}
  • 54.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”}
  • 55.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”}
  • 56.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} r:OUTGOING-follow-nodeId3 {“date”: “2015-02-20”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-20”}
  • 57.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} r:OUTGOING-follow-nodeId3 {“date”: “2015-02-20”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-20”}
  • 58.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} r:OUTGOING-follow-nodeId3 {“date”: “2015-02-20”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} r:INCOMING-follow-nodeId3 {“date”: “2015-04-12”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-04-12”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-20”}
  • 59.
    Schema Design (Example) RowKeyColumn Value hash(nodeId1)-nodeId1 n: {“name”: “Taro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-02-19”} r:OUTGOING-follow-nodeId3 {“date”: “2015-02-20”} hash(nodeId2)-nodeId2 n: {“name”: “Ichiro”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-19”} r:INCOMING-follow-nodeId3 {“date”: “2015-04-12”} hash(nodeId3)-nodeId3 n: {“name”: “Jiro”} r:OUTGOING-follow-nodeId2 {“date”: “2015-04-12”} r:INCOMING-follow-nodeId1 {“date”: “2015-02-20”}
  • 60.
    Schema Design (Example) List<Relationship>outRels = node1.out("follow").list(); node3 node2 follow followfollow node1
  • 61.
    Schema Design (Example) List<Relationship>outRels = node1.out("follow").list(); node3 node2 follow followfollow node1
  • 62.
    Schema Design (Example)
    List<Relationship> inRels = node2.in("follow").list();
    (diagram: node1 → node2, node1 → node3, node3 → node2, all “follow” relationships)
  • 67.
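    Because qualifiers sort lexicographically within a row, all relationships of one direction and type share a common prefix, so the lookups above are single contiguous range reads. Below is a minimal in-memory simulation of that prefix lookup; a real deployment would use an HBase Scan with a column prefix filter, and the sorted map here merely stands in for one row's “r” family.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Simulates scanning one node's "r" family for all relationships of a
// given direction and type by exploiting the sorted qualifier layout.
public class RelScan {

    public static SortedMap<String, String> byPrefix(
            SortedMap<String, String> columns, String direction, String type) {
        String prefix = direction + "-" + type + "-";
        // All qualifiers sharing this prefix form one contiguous key
        // range, so a sub-map (or an HBase prefix scan) finds them all.
        return columns.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    // node1's relationship columns from the example schema.
    public static SortedMap<String, String> node1Columns() {
        TreeMap<String, String> cols = new TreeMap<>();
        cols.put("OUTGOING-follow-nodeId2", "{\"date\": \"2015-02-19\"}");
        cols.put("OUTGOING-follow-nodeId3", "{\"date\": \"2015-02-20\"}");
        return cols;
    }
}
```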
    Consistency Problem
    ● HBase has no native cross-row transactional support
    ● Possibility of inconsistency between outgoing and incoming rows
  • 73.
    Consistency Problem

    RowKey                | Column | Value
    hash(nodeId1)-nodeId1 | n:     | {“name”: “Taro”}
    hash(nodeId2)-nodeId2 | n:     | {“name”: “Ichiro”}
    hash(nodeId3)-nodeId3 | n:     | {“name”: “Jiro”}
  • 74.
    Consistency Problem

    RowKey                | Column                    | Value
    hash(nodeId1)-nodeId1 | n:                        | {“name”: “Taro”}
                          | r:OUTGOING-follow-nodeId2 | {“date”: “2015-02-19”}
    hash(nodeId2)-nodeId2 | n:                        | {“name”: “Ichiro”}
                          | r:INCOMING-follow-nodeId1 | {“date”: “2015-02-19”}
    hash(nodeId3)-nodeId3 | n:                        | {“name”: “Jiro”}

    Inconsistency
  • 75.
    Coprocessor
    ● Endpoints
      o like a stored procedure in an RDBMS
      o push your business logic into the RegionServer
    ● Observers
      o like a trigger in an RDBMS
      o insert user code by overriding upcall methods
  • 76.
    Using Observers
    ● We use 2 observers
      o WALObserver#postWALWrite
      o RegionObserver#postWALRestore
    ● Both run the same logic
      o write an INCOMING row
    ● Eventual consistency
  • 77.
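    The logic shared by the two observers can be sketched as deriving the mirror INCOMING cell from the OUTGOING cell the client wrote. Running that same derivation from postWALWrite (normal path) and from postWALRestore (WAL replay after a crash) is what makes the two rows eventually consistent. This is a hypothetical reconstruction assuming the hyphen-separated qualifier layout shown earlier; all names are illustrative.

```java
// Given the source node id and an OUTGOING qualifier, derive where the
// mirror INCOMING cell must be written.
public class MirrorEdge {

    // Returns { destination node id, INCOMING qualifier }.
    public static String[] incomingFor(String srcNodeId, String outQualifier) {
        // outQualifier looks like "OUTGOING-follow-<dstNodeId>"
        String[] parts = outQualifier.split("-", 3);
        String type = parts[1];
        String dstNodeId = parts[2];
        // The INCOMING cell lives on the destination node's row and
        // points back at the source node.
        return new String[] { dstNodeId, "INCOMING-" + type + "-" + srcNodeId };
    }
}
```

    Because the derivation is deterministic, replaying the same OUTGOING WAL entry always reproduces the same INCOMING cell, so replays are idempotent.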
    Using Observers (Normal Case)
    1. Client writes only an OUTGOING row
    2. RegionServer writes to Memstore
    3. RegionServer writes the WAL to HDFS
    4. WALObserver#postWALWrite writes the INCOMING row
    5. RegionServer responds to the client
  • 82.
    Using Observers (Abnormal Case)
    1. Client writes only an OUTGOING row
    2. RegionServer writes to Memstore
    3. RegionServer writes the WAL to HDFS
    (RegionServer goes down here, so WALObserver#postWALWrite never writes the INCOMING row)
  • 86.
    Using Observers (Abnormal Case)
    1. Another RegionServer replays the WAL of the OUTGOING row
    2. RegionObserver#postWALRestore writes the INCOMING row
  • 89.
    Summary
    ● We have used HBase in several projects
      o Log Analysis, Social Graph, Recommendations, Advertising Tech
    ● We developed a graph database built on HBase
      o HBase is good for storing social graphs
      o We use coprocessors to resolve consistency problems
  • 90.
    Questions
    If you have any questions, please tweet @brfrn169.

Editor's Notes

  • #2 Hi, thank you for coming to this session. Today, we are going to talk to you about HBase @ CyberAgent.
  • #3 I am Hirotaka Kakishima. I work for CyberAgent as a Database Engineer, and I will present the first part of this talk. And the second part of this talk will be done by Toshihiro Suzuki. He is a Software Engineer at CyberAgent.
  • #4 We authored beginner’s Guide to HBase in Japanese this year.
  • #5 Our office is located in Akihabara, Japan.
  • #6 This is today’s agenda. We are going to introduce our company and services. And we will talk about our hbase history as well as our use case of HBase.
  • #7 About CyberAgent
  • #8 CyberAgent is an internet service company in Japan. Our business is Advertising, Games, and Ameba We have more than 30% of the smartphone advertising market in Japan. We provide smartphone games for iOS, Android, and Web Browsers. Another big business is Ameba.
  • #9 What’s Ameba?
  • #10 Ameba is a Blog, Social Networking and Game service platform. We have 40 million Ameba users.
  • #11 Here’s the ranking of domestic internet services by the number of visitors in Japan announced by Nielsen last year. We ranked 10th in desktop visitors ranking and 9th in smartphone visitor ranking.
  • #12 To give you a better idea about Ameba, we will introduce Ameba Blog and Ameba Pigg. This is “Ameba Blog”. It is used by more than 10 thousand Japanese celebrities, like TV personalities, sports players and statesmen. We have more than 1.9 billion blog articles as of September 2014.
  • #13 This is “Ameba Pigg”. It is 2D virtual world. You can create your avatar, chat, go fishing and much more in this virtual world.
  • #14 And we have more services on our platform.
  • #15 Now we will explain how we use HBase @ CyberAgent.
  • #16 We use HBase for Social Graph , Recommendations, Advertising technology, and Log Analysis. Toshihiro will talk about how we use HBase as a Social Graph Database later. I will talk about our HBase history.
  • #17 We have used HBase since 2011. Originally, we used HDFS and HBase for log analysis. We transferred logs using Flume and stored them in HDFS. Then we ran M/R jobs through Hive and stored the results in HBase. Finally, our analysts and managers obtained the results through our web application. We deployed HBase 0.90 with CDH3 on physical servers. This is how we gained our first know-how of HDFS and HBase.
  • #18 Next, we tried HBase for a 24/7 online social graph database. This time we used HBase 0.92, but because of performance problems, we switched to a different CDH version for HDFS. In this version, the NameNode didn’t have HA functionality, so we used a Fault Tolerant Server from NEC.
  • #19 Because of bugs in HBase replication, we copied WALs to backup clusters using our own method, which we still use on one cluster. We have 10TB of social graph data (not counting HDFS replicas), 6 million requests per minute, and an average latency under 20ms.
  • #20 Next is the 3rd generation. Here we upgraded our log analysis system and deployed more clusters for recommendations, trend detection and other social graphs. We used HBase 0.94 with NameNode HA, and we provisioned clusters with Chef. We replicated data between HBase clusters using master-slave replication, but because many of our hostnames include hyphens, some clusters had the HBASE-8207 patch applied.
  • #21 Recently, we started using HBase 0.98 for Advertising technology. We deployed clusters with Master-Master replication in Amazon EC2. And we started using Cloudera Manager to install, configure and keep the cluster up and running.
  • #22 Currently we have 10 clusters in production, and each cluster has between 10 and 50 RegionServers. Almost all clusters have been stable for over a year.
  • #23 To run HBase stably, we cherish the basics: learning the architecture; considering table schema (very important); having enough RAM, disks, and network bandwidth; splitting large regions and running major compactions at off-peak hours; monitoring metrics and tuning configuration parameters; and catching up on bug reports in JIRA.
  • #24 Then, we are going to migrate a cluster from 0.92 to 1.0 this year. From now, Toshihiro will continue this presentation. He will talk about how we use HBase as a Social Graph Database. Thank you.
  • #25 Hello, everyone. My name is Toshihiro Suzuki. I'm going to talk about the Ameba’s social graph, one of the systems where we extensively use HBase.
  • #26 We provide a platform for smartphone applications where a lot of services are running. For example, games, social networking and message board services. There is a lot of graph data such as users and connections between users like friends and followers. So we needed a large scale graph database when we began the development of the platform.
  • #27 Our requirements for the graph database are scalability, high availability and low latency. First, the graph database has to be scalable because web services can grow rapidly and unpredictably. Second, our services are used 24/7, so the graph database needs to be highly available. If a service goes down, it not only reduces our sales but also discourages our users. In addition, our applications have strict response time requirements because they are user-facing applications for online access. So the graph database has to have low latency.
  • #28 So we considered using HBase. HBase has auto sharding and auto failover, and because HBase is designed for distributed environments, administration is relatively easy. HBase can scale by adding more RegionServers to the cluster as needed, and with auto failover it can recover quickly if any RegionServer goes down. Also, HBase provides low latency access. After considerable research and experimentation, we decided to use HBase and developed a graph database built on it.
  • #29 Next I'll talk about how we use HBase as a Graph Database.
  • #30 Here is the system overview of our graph database. When accessing graph data, clients don’t communicate with HBase directly, but via Gateways. Gateways talk to HBase when storing or retrieving graph data.
  • #31 Next I will explain about Data Model. The graph database provides Property Graph Model. In this model, there are nodes and relationships that are the connection between nodes. A relationship has a type and a direction. In this picture, there are 3 nodes -- "node1", "node2" and "node3", and 3 relationships. This relationship has a "follow" type and a direction from "node1" to "node2". This relationship has a "follow" type and a direction from "node2" to "node3".
  • #32 Nodes and relationships also have properties in key-value format. In this picture, "node1" has 2 properties, name:Taro and age:24, and this relationship has a property, date:May 7th.
  • #33 Here is the graph database’s API. It’s very simple.
  • #34 First, you create a graph object.
  • #35 Next, you call addNode method to create a Node, and set a property “name” and its value “Taro”.
  • #36 After that, You create another node and set a property “name” and its value “Ichiro”.
  • #37 Then, you add a relationship from “node1” to “node2”, a type “follow” and set a property “date” and its value.
  • #38 Next You can get outgoing relationships from “node1”.
  • #39 Finally, you can get incoming relationships to “node2”
  • #40 Here is the graph database schema design. A row key consists of a hash value of a node id and the node id. There are 2 Column Families "n" and "r". All nodes are stored with ColumnFamily "n" and empty Qualifier. All relationships are stored with ColumnFamily "r" and Qualifier that consists of direction, type and node id. Properties are serialized and stored as Value.
  • #41 For example, you create 3 nodes and set “name” properties to them,
  • #42 node1
  • #43 node2
  • #44 node3
  • #45 And in HBase,
  • #46 node1
  • #47 node2
  • #48 node3
  • #49 As you can see, the node data are stored in HBase like this. As mentioned before, the row key consists of a hash value of a node id and the node id. The Node’s Column Family is “n” and the Qualifier is empty. Properties are serialized and stored as Value.
  • #50 Then, you create 3 relationships and set “date” properties to them,
  • #51 First relationship,
  • #52 Second relationship,
  • #53 And third relationship,
  • #54 And this is how it is reflected in HBase,
  • #55 First relationship,
  • #57 Second relationship,
  • #59 And third relationship,
  • #60 As you can see, the relationship’s row key is the same as the node’s. The Column Family is “r” and the Qualifier consists of the direction (“OUTGOING” or “INCOMING”), the type (“follow”) and the node id. Similar to nodes, properties are serialized and stored as the Value.
  • #61 The next example is how to get “OUTGOING” relationships.
  • #62 When you want to get “OUTGOING” relationships from “node1”,
  • #63 You can scan with
  • #64 the row key “nodeId1” and its hash value
  • #65 the column family “r” and the qualifier whose prefix is “OUTGOING” and “follow”.
  • #66 Then you can get these relationships.
  • #67 Next,
  • #68 When you want to get “INCOMING” relationships to “node2”,
  • #69 You can scan with
  • #70 the row key “nodeId2” and its hash value,
  • #71 the column family “r” and the qualifier whose prefix is “INCOMING” and “follow”.
  • #72 Then you can get these relationships.
  • #73 There is a potential consistency problem. As you know, HBase has no native cross-row transactional support. So there is a possibility of inconsistency between outgoing and incoming rows.
  • #74 For instance, when you try to add a relationship and the system goes down at the same time,
  • #75 The data inconsistency between outgoing and incoming rows may occur like this.
  • #76 To resolve this kind of problem, we use Coprocessors. Coprocessors come in two flavors: Endpoints and Observers. Endpoints are like stored procedures in an RDBMS: you can push your business logic into the RegionServer. Observers are like triggers in an RDBMS: you can insert user code by overriding upcall methods.
  • #77 We use observers to resolve inconsistency problems. We use two observers, postWALWrite method of WALObserver and postWALRestore method of RegionObserver. The postWALWrite method is hooked after writing to WAL. And postWALRestore method is hooked after restoring WAL in a failover process. We implement these observers to insert the same logic for writing an INCOMING row. Thus we ensure eventual consistency between incoming and outgoing rows.
  • #78 Next I’ll show you how we use observers to resolve inconsistency problems with this animation. First, let’s look at the normal case.
  • #79 The client sends a put request to RegionServer to write only an outgoing row.
  • #80 Then, RegionServer writes the data to Memstore and then to WAL in HDFS
  • #81 Then, RegionServer executes our logic in postWALWrite method of WALObserver and it writes the incoming row.
  • #82 Finally, RegionServer responds to the client. Normally, we ensure consistency like this.
  • #83 Next, let’s consider a failure.
  • #84 First of all, the client sends a put request to RegionServer to write only an outgoing row.
  • #85 Then, RegionServer writes the data to Memstore and then to WAL in HDFS
  • #86 If the RegionServer goes down at that time, our logic in postWALWrite method isn’t executed and the incoming row isn’t written. So a data inconsistency is going to occur.
  • #87 Our logic in postWALRestore method of RegionObserver resolves this problem.
  • #88 In HBase, when RegionServer goes down, another RegionServer restores data from WALs.
  • #89 And, If RegionServer replays the WAL of an outgoing row, then our logic in postWALRestore method is executed and it writes the incoming row. As a result, the data inconsistency doesn’t occur even if any RegionServer goes down.
  • #90 To summarize, we have used HBase in several projects: Log Analysis, Social Graph, Recommendations, and Advertising technology. I talked about the Social Graph, which is one of our use cases. In our experience, HBase is good for storing social graphs, and we use coprocessors to resolve consistency problems. Thank you for listening.
  • #91 If you have any questions, please tweet @brfrn169. Thank you.