How to build a scalable SNS using HBase

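This talk shares how to use HBase to build a scalable system. The outline is as follows:

1. HBase brief introduction: a short look at how HBase works
2. Rowkey (schema) design: rowkey design is closely tied to application performance, so how to design rowkeys is a very important topic in HBase
3. Best practice in Java: how to avoid some pitfalls when working with HBase
4. API Blueprint: how to turn the dataflow designed for HBase into documentation
5. HBase Dataflow: a tool for handing the designed dataflow on to others so that it is preserved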

* Keyword: HBase, Rowkey Design, Dataflow
* HBase Dataflow: http://kewangtw.github.io/hbase-dataflow

Published in: Technology

How to build a scalable SNS using HBase

  1. 1. How to build a scalable SNS using HBase Kewang 三竹資訊
  2. 2. Who I am ● 王慕羣 ● Java / Node.js / AngularJS ● SQL-like / HBase Github: kewang Facebook: kewangtw Linkedin: kewangtw Slideshare: kewang Mail: cpckewang@gmail.com
  3. 3. Who Mitake is 三竹資訊
  4. 4. Who Mitake is 三竹資訊 Everyone pronounces it "Mitake"
  5. 5. Who Mitake is 三竹資訊 Everyone pronounces it "Mitake", but at our company we pronounce it "Mitake"
  6. 6. Who Mitake is 三竹資訊 Mitake is not pronounced "MiTAC"!!!
  7. 7. Who Mitake is 三竹資訊
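  8. 8. Who Mitake is 三竹資訊 ● SMS platform
  9. 9. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading:
  10. 10. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading:
  11. 11. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless
  12. 12. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking:
  13. 13. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking: Bank of Taiwan, Land Bank of Taiwan, Fubon, Taishin, Union Bank of Taiwan, Taiwan Business Bank, Far Eastern International Bank, Hua Nan, ANZ, Chunghwa Post, Taiwan Cooperative Bank, Standard Chartered ... 18 banks in total
  14. 14. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking: Bank of Taiwan, Land Bank of Taiwan, Fubon, Taishin, Union Bank of Taiwan, Taiwan Business Bank, Far Eastern International Bank, Hua Nan, ANZ, Chunghwa Post, Taiwan Cooperative Bank, Standard Chartered ... 18 banks in total ● Property and life insurance:
  15. 15. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking: Bank of Taiwan, Land Bank of Taiwan, Fubon, Taishin, Union Bank of Taiwan, Taiwan Business Bank, Far Eastern International Bank, Hua Nan, ANZ, Chunghwa Post, Taiwan Cooperative Bank, Standard Chartered ... 18 banks in total ● Property and life insurance: TransGlobe, Mingtai, Shin Kong, Tokio Marine Newa, Fubon ... etc.
  16. 16. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking: Bank of Taiwan, Land Bank of Taiwan, Fubon, Taishin, Union Bank of Taiwan, Taiwan Business Bank, Far Eastern International Bank, Hua Nan, ANZ, Chunghwa Post, Taiwan Cooperative Bank, Standard Chartered ... 18 banks in total ● Property and life insurance: TransGlobe, Mingtai, Shin Kong, Tokio Marine Newa, Fubon ... etc. ● Others:
  17. 17. Who Mitake is 三竹資訊 ● SMS platform ● Mobile trading: countless ● Mobile banking: Bank of Taiwan, Land Bank of Taiwan, Fubon, Taishin, Union Bank of Taiwan, Taiwan Business Bank, Far Eastern International Bank, Hua Nan, ANZ, Chunghwa Post, Taiwan Cooperative Bank, Standard Chartered ... 18 banks in total ● Property and life insurance: TransGlobe, Mingtai, Shin Kong, Tokio Marine Newa, Fubon ... etc. ● Others: udn 買東西, 手機逛週年慶, 財政園地, Taiwan Stock Exchange, individual income tax filing ... etc.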
  18. 18. System Architecture
  19. 19. 19 System Architecture
  20. 20. 20 System Architecture (Backend)
  21. 21. 21 System Architecture (Frontend)
  22. 22. 22 System Architecture (Frontend) MOPCON 2014 CfP
  23. 23. 23 Agenda ● Rowkey design ● Best Practice in Java ● API Blueprint ● HBase Dataflow
  24. 24. 24 Rowkey design
  25. 25. 25 Rowkey design - Avoid hotspotting
  26. 26. 26 Rowkey design - Avoid hotspotting ● Sorted lexicographically
  27. 27. 27 Rowkey design - Avoid hotspotting ● Sorted lexicographically Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4
  28. 28. 28 Rowkey design - Avoid hotspotting ● Sorted lexicographically Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4
  29. 29. 29 Rowkey design - Avoid hotspotting ● Sorted lexicographically Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4
  30. 30. 30 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing
  31. 31. 31 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4
  32. 32. 32 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4 BOX
  33. 33. 33 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4 BOX
  34. 34. 34 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4 BOX a-foo-1
  35. 35. 35 Rowkey design - Avoid hotspotting ● Salting, Hashing or Reversing Region 3 Region 1 Region 2 foo-1 foo-2 foo-3 foo-4 BOX a-foo-1 b-foo-2 c-foo-3 d-foo-4
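
The a-, b-, c-, d- prefixes on the last slide can be produced in several ways. The following is a minimal sketch, assuming the salt is derived from a hash of the original key modulo a fixed bucket count. The slides do not say how the prefix is chosen, and `SaltedRowkey`, `salt` and `buckets` are illustrative names, not part of the talk or of HBase.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not from the talk: spreads sequential rowkeys across regions.
class SaltedRowkey {
    // Returns "<letter>-<rowkey>", where the letter is one of the first
    // `buckets` letters of the alphabet, picked by hashing the rowkey.
    static String salt(String rowkey, int buckets) {
        if (buckets < 1 || buckets > 26) {
            throw new IllegalArgumentException("buckets must be between 1 and 26");
        }
        int hash = 0;
        for (byte b : rowkey.getBytes(StandardCharsets.UTF_8)) {
            hash = 31 * hash + (b & 0xff);
        }
        int bucket = Math.floorMod(hash, buckets);
        return (char) ('a' + bucket) + "-" + rowkey;
    }
}
```

Because the prefix depends only on the key, a Get can recompute it. A range Scan, on the other hand, would have to run once per bucket.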
  36. 36. 36 Rowkey design - Refining ID
  37. 37. 37 Rowkey design - Refining ID ● SHA1: 40 bytes – 3204c3aefcca4a556f0f7547d056235fa823af3a
  38. 38. 38 Rowkey design - Refining ID ● SHA1: 40 bytes – 3204c3aefcca4a556f0f7547d056235fa823af3a ● UUID: 36 bytes – 22bfad60-39d2-11e4-916c-0800200c9a66
  39. 39. 39 Rowkey design - Refining ID ● SHA1: 40 bytes – 3204c3aefcca4a556f0f7547d056235fa823af3a ● UUID: 36 bytes – 22bfad60-39d2-11e4-916c-0800200c9a66 ● MD5: 32 bytes – 27734b3f4f98e709f58c6ddb0193164e
  40. 40. Too long !!!
  41. 41. 41 Rowkey design - Refining ID ● X Algorithm
  42. 42. 42 Rowkey design - Refining ID ● X Algorithm: 12 bytes
  43. 43. 43 Rowkey design - Refining ID ● X Algorithm: 12 bytes – Auto increment
  44. 44. 44 Rowkey design - Refining ID ● X Algorithm: 12 bytes – Auto increment – Ordered
  45. 45. 45 Rowkey design - Refining ID ● X Algorithm: 12 bytes – Auto increment – Ordered – Counts to 2.17E21
  46. 46. Rowkey design - Refining ID ● X Algorithm: 12 bytes – Auto increment – Ordered – Counts to 2.17E21 – e.g: H00000001B12
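
The slides do not describe the "X Algorithm" itself. As an illustration of the listed properties only (short, auto-incrementing, ordered), here is a sketch that encodes a counter as a fixed-width, zero-padded base-36 string behind a type prefix. The class name, alphabet and width are assumptions. With a 2-character prefix and 10 digits it holds about 3.66E15 values per prefix, so it is not the author's algorithm and does not reach the 2.17E21 quoted above.

```java
// Hypothetical ID encoder, not the talk's "X Algorithm".
class OrderedId {
    // Digits are listed in ascending ASCII order, so for a fixed width the
    // lexicographic order of the encoded strings matches the numeric order.
    private static final String DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    static String encode(String prefix, long counter, int width) {
        if (counter < 0) {
            throw new IllegalArgumentException("counter must not be negative");
        }
        char[] out = new char[width];
        long n = counter;
        for (int i = width - 1; i >= 0; i--) {
            out[i] = DIGITS.charAt((int) (n % 36));
            n /= 36;
        }
        if (n != 0) {
            throw new IllegalArgumentException("counter does not fit in " + width + " digits");
        }
        return prefix + new String(out);
    }
}
```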
  47. 47. 47 Rowkey design - Authenticating
  48. 48. 48 Rowkey design - Authenticating ● Get frequently
  49. 49. 49 Rowkey design - Authenticating ● Get frequently User Id ID0000001A3B Access Token d66e3b70-3666-11e4-8c21-0800200c9a66 Expired Time 1410077636
  50. 50. 50 Rowkey design - Authenticating ● Get frequently ● Multi-login
  51. 51. 51 Rowkey design - Authenticating ● Get frequently ● Multi-login User Id ID0000001A3B Token 0 d66e3b70-3666-11e4-8c21-0800200c9a66+1410077636+Device1 Token 1 92e84bf9-7852-492d-b56a-13ba7acb8fb5+1410123456+Device2
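
In the multi-login row, each token cell appears to pack three fields joined by "+": the access token, the expiry time and the device. A minimal sketch of reading one such cell value follows. `TokenEntry` is a hypothetical name, and treating the number as epoch seconds is an assumption based on the value 1410077636.

```java
// Hypothetical parser for one "Token N" cell value, not from the talk.
class TokenEntry {
    final String accessToken;
    final long expiredTime; // assumed to be epoch seconds
    final String device;

    private TokenEntry(String accessToken, long expiredTime, String device) {
        this.accessToken = accessToken;
        this.expiredTime = expiredTime;
        this.device = device;
    }

    // Parses "<access token>+<expired time>+<device>".
    static TokenEntry parse(String cellValue) {
        // "+" is a regex metacharacter, so it is escaped for String.split.
        String[] parts = cellValue.split("\\+", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + cellValue);
        }
        return new TokenEntry(parts[0], Long.parseLong(parts[1]), parts[2]);
    }

    // The token is usable strictly before its expiry time.
    boolean isValidAt(long nowEpochSeconds) {
        return nowEpochSeconds < expiredTime;
    }
}
```

With this layout a single Get on the User Id returns every device's token, which fits the "Get frequently" requirement.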
  52. 52. 52 Rowkey design - Rice dumpling
  53. 53. 53 Rowkey design - Rice dumpling
  54. 54. 54 Rowkey design - Rice dumpling
  55. 55. 55 Rowkey design - Rice dumpling Id ME00000024AC Title Announce Content We are hiring
  56. 56. 56 Rowkey design - Rice dumpling Id ME00000024AC Title Announce Content We are hiring Id ME00000024AC.ME00000037ZZ Title (n/a) Content I want to join your team !!!
  57. 57. 57 Rowkey design - Rice dumpling Id ME00000024AC Title Announce Content We are hiring Id ME00000024AC.ME00000037ZZ Title (n/a) Content I want to join your team !!!
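
Reading the example rows, a reply's rowkey appears to be the parent message Id, a dot, and the reply's own Id, so a message and its replies sit next to each other in sorted order. The sketch below imitates a prefix scan over sorted rows with a `TreeMap` instead of a real HBase table. `ThreadScan` and `thread` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a prefix scan, not from the talk.
class ThreadScan {
    // Returns the rowkey of a message followed by the rowkeys stored under
    // "<messageId>.", in sorted order.
    static List<String> thread(TreeMap<String, String> rows, String messageId) {
        List<String> keys = new ArrayList<>();
        for (String key : rows.tailMap(messageId, true).keySet()) {
            if (!key.startsWith(messageId)) {
                break; // rows are sorted, so nothing later can match the prefix
            }
            if (key.equals(messageId) || key.startsWith(messageId + ".")) {
                keys.add(key);
            }
        }
        return keys;
    }
}
```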
  58. 58. 58 Rowkey design - Access controlling
  59. 59. 59 Rowkey design - Access controlling
  60. 60. 60 Rowkey design - Access controlling Only A, B can see it.
  61. 61. 61 Rowkey design - Access controlling Only A, B can see it. Of course, including me.
  62. 62. 62 Rowkey design - Access controlling ● When posting a message (Write)
  63. 63. 63 Rowkey design - Access controlling ● When posting a message (Write) – Generate ACL Id
  64. 64. 64 Rowkey design - Access controlling ● When posting a message (Write) – Generate ACL Id – Put the ACL Id into the message and into the readers' ACLs
  65. 65. 65 Rowkey design - Access controlling ● When posting a message (Write) – Generate ACL Id – Put the ACL Id into the message and into the readers' ACLs ● When reading my messages (Read)
  66. 66. 66 Rowkey design - Access controlling ● When posting a message (Write) – Generate ACL Id – Put the ACL Id into the message and into the readers' ACLs ● When reading my messages (Read) – Scan my ACLs and all messages
  67. 67. 67 Rowkey design - Access controlling ● When posting a message (Write) – Generate ACL Id – Put the ACL Id into the message and into the readers' ACLs ● When reading my messages (Read) – Scan my ACLs and all messages – If my ACLs contain the message's ACL Id, SHOW it
  68. 68. 68 Rowkey design - Access controlling Write
  69. 69. 69 Rowkey design - Access controlling ACL hash hash(A, B, K)+C+R ACL Id AI0070AD Write
  70. 70. 70 Rowkey design - Access controlling ACL hash hash(A, B, K)+C+R ACL Id AI0070AD Message Id ME00000024AC Title Announce Content We are hiring ACL Id AI0070AD Write
  71. 71. 71 Rowkey design - Access controlling ACL Id+User Id AI0070AD+A AI0070AD+B AI0070AD+K Create 1 1 1 Read 1 1 1 Update 0 0 0 Delete 0 0 0 Write
  72. 72. 72 Rowkey design - Access controlling User Id+ACL Id A+AI0070AD B+AI0070AD K+AI0070AD Create 1 1 1 Read 1 1 1 Update 0 0 0 Delete 0 0 0 ACL Id+User Id AI0070AD+A AI0070AD+B AI0070AD+K Create 1 1 1 Read 1 1 1 Update 0 0 0 Delete 0 0 0 Write
  73. 73. 73 Rowkey design - Access controlling Read
  74. 74. 74 Rowkey design - Access controlling User Id+ACL Id K+AI0070AD K+AI028577 Create 1 1 Read 1 1 Update 0 1 Delete 0 1 Read
  75. 75. 75 Rowkey design - Access controlling User Id+ACL Id K+AI0070AD K+AI028577 Create 1 1 Read 1 1 Update 0 1 Delete 0 1 Read Message Id ME00000024AC Title Announce Content We are hiring ACL Id AI0070AD
  76. 76. 76 Rowkey design - Access controlling User Id+ACL Id K+AI0070AD K+AI028577 Create 1 1 Read 1 1 Update 0 1 Delete 0 1 Read Message Id ME00000024AC Title Announce Content We are hiring ACL Id AI0070AD
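
The read path above comes down to this: look up the ACL Id stored on the message, then check whether the reader has a "User Id+ACL Id" row whose Read flag is 1. The following sketch uses plain maps in place of HBase tables. `AclIndex`, its method names and the four-character flag string are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the ACL rows, not from the talk.
class AclIndex {
    // "User Id+ACL Id" -> flags in Create, Read, Update, Delete order, e.g. "1100".
    private final Map<String, String> userAcl = new HashMap<>();
    // Message Id -> ACL Id stored on the message.
    private final Map<String, String> messageAcl = new HashMap<>();

    void grant(String userId, String aclId, String crudFlags) {
        if (!crudFlags.matches("[01]{4}")) {
            throw new IllegalArgumentException("expected four 0/1 flags: " + crudFlags);
        }
        userAcl.put(userId + "+" + aclId, crudFlags);
    }

    void post(String messageId, String aclId) {
        messageAcl.put(messageId, aclId);
    }

    // True only if the message exists and the user holds its ACL Id with Read = 1.
    boolean canRead(String userId, String messageId) {
        String aclId = messageAcl.get(messageId);
        if (aclId == null) {
            return false;
        }
        String flags = userAcl.get(userId + "+" + aclId);
        return flags != null && flags.charAt(1) == '1';
    }
}
```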
  77. 77. 77 Rowkey design - Statistics
  78. 78. 78 Rowkey design - Statistics ● Variety of types – e.g., Likes, Comments, Registrations
  79. 79. 79 Rowkey design - Statistics ● Variety of types – e.g., Likes, Comments, Registrations ● By unit – i.e., hourly,daily,weekly,monthly,yearly
  80. 80. 80 Rowkey design - Statistics ● Variety of types – e.g., Likes, Comments, Registrations ● By unit – i.e., hourly,daily,weekly,monthly,yearly ● By user
  81. 81. 81 Rowkey design - Statistics Unit+Time Base+User Id+Type H+20140908+AAA+Like 11 7 15 22 21 15 Unit+Time Base+User Id+Type D+201409+AAA+Like 08 44 11 58
  82. 82. 82 Rowkey design - Statistics ● Sum counts from 2014/9/7 to 2014/9/20 group by user or counting type Unit+Time Base+User Id+Type D+201409+AAA+Like 02 20 08 52 09 41 ... ... 20 55
  83. 83. 83 Rowkey design - Statistics ● Sum counts from 2014/9/7 to 2014/9/20 group by user or counting type Unit+Time Base+User Id+Type D+201409+AAA+Like 02 20 08 52 09 41 ... ... 20 55
  84. 84. 84 Rowkey design - Statistics ● Sum AAA's counts from 2014/9/7 to 2014/9/20 group by counting type Unit+Time Base+User Id+Type D+201409+AAA+Like 02 20 08 52 09 41 ... ... 20 55
  85. 85. 85 Rowkey design - Statistics ● Sum AAA's like counts from 2014/9/7 to 2014/9/20 Unit+Time Base+User Id+Type D+201409+AAA+Like 02 20 08 52 09 41 ... ... 20 55
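
With the daily row keyed as Unit+Time Base+User Id+Type and one column per day, the last query (AAA's likes between two dates in the same month) becomes a sum over a range of column qualifiers in a single row. The sketch below models that row as a sorted map. `DailyCounter` and its methods are illustrative names, and the two-digit day qualifiers follow the example rows.

```java
import java.util.TreeMap;

// Hypothetical model of one statistics row, not from the talk.
class DailyCounter {
    // Builds a rowkey such as "D+201409+AAA+Like".
    static String rowkey(String unit, String timeBase, String userId, String type) {
        return String.join("+", unit, timeBase, userId, type);
    }

    // Sums the columns whose qualifier lies in [fromDay, toDay], both inclusive.
    // Qualifiers are zero-padded ("07", "20"), so string order matches day order.
    static long sum(TreeMap<String, Long> columns, String fromDay, String toDay) {
        long total = 0;
        for (long value : columns.subMap(fromDay, true, toDay, true).values()) {
            total += value;
        }
        return total;
    }
}
```

The broader queries on the earlier slides (grouping by user or by counting type) would need one such sum per matching row, since User Id and Type come after the time base in the key.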
  86. 86. 86 Rowkey design - Summary ● Avoid hotspotting ● Refining ID ● Authenticating ● Rice dumpling ● Access controlling ● Statistics
  87. 87. 87 Best Practice in Java
  88. 88. 88 No. 1
  89. 89. 89 No. 1 USE HashMap
  90. 90. 90 No. 1 USE HashMap NoSQL is different from RDBMS
  91. 91. 91 No. 1 USE HashMap OLD
  92. 92. 92 No. 1 USE HashMap (OLD)
      public class Validation1 {
          private String accessToken;
          private long expiredTime;

          public Validation1() {
              accessToken = null;
              expiredTime = -1;
          }

          public String getAccessToken() {
              return accessToken;
          }

          public void setAccessToken(String accessToken) {
              this.accessToken = accessToken;
          }

          public long getExpiredTime() {
              return expiredTime;
          }

          public void setExpiredTime(long expiredTime) {
              this.expiredTime = expiredTime;
          }
      }
  93. 93. 93 No. 1 USE HashMap NEW
  94. 94. 94 No. 1 USE HashMap (NEW)
      public static final String ACCESS_TOKEN = "access token";
      private Map<String, byte[]> putMap;

      public Validation1() {
          super();
      }

      public Validation1(Result result) {
          super(result);
      }

      public String getAccessToken() {
          return Bytes.toString(putMap.get(ACCESS_TOKEN));
      }

      public void setAccessToken(String accessToken) {
          putMap.put(ACCESS_TOKEN, Bytes.toBytes(accessToken));
      }
  95. 95. 95 No. 1 USE HashMap (NEW)
      public static final String ACCESS_TOKEN = "access token";
      private Map<String, byte[]> putMap;

      public Validation1() {
          super();
      }

      public Validation1(Result result) {
          super(result);
      }

      public String getAccessToken() {
          return Bytes.toString(putMap.get(ACCESS_TOKEN));
      }

      public void setAccessToken(String accessToken) {
          putMap.put(ACCESS_TOKEN, Bytes.toBytes(accessToken));
      }
  96. 96. 96 No. 1 USE HashMap ● Bytes.toXXX() always returns type XXX or null
  97. 97. 97 No. 1 USE HashMap ● Bytes.toXXX() always returns type XXX or null – Or throws an exception
  98. 98. No. 1 USE HashMap ● Bytes.toXXX() always returns type XXX or null – Or throws an exception ● Improves on default values in Java
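
To make the default-value point concrete without an HBase dependency, here is a sketch of a map-backed object that uses only the JDK. It replaces `Bytes.toString` and `Bytes.toBytes` with `String`/`ByteBuffer` conversions, so it is a stand-in for the slide's class rather than a copy of it. `MapBackedValidation` and the `EXPIRED_TIME` column name are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical JDK-only stand-in for the map-backed Validation1 on the slide.
class MapBackedValidation {
    static final String ACCESS_TOKEN = "access token";
    static final String EXPIRED_TIME = "expired time"; // assumed column name

    private final Map<String, byte[]> putMap = new HashMap<>();

    // A missing column simply reads back as null; no sentinel field is needed.
    String getAccessToken() {
        byte[] raw = putMap.get(ACCESS_TOKEN);
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    void setAccessToken(String accessToken) {
        putMap.put(ACCESS_TOKEN, accessToken.getBytes(StandardCharsets.UTF_8));
    }

    // A primitive long cannot be null, so the fallback is chosen explicitly here.
    long getExpiredTime() {
        byte[] raw = putMap.get(EXPIRED_TIME);
        if (raw == null || raw.length != Long.BYTES) {
            return -1L;
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    void setExpiredTime(long expiredTime) {
        putMap.put(EXPIRED_TIME, ByteBuffer.allocate(Long.BYTES).putLong(expiredTime).array());
    }
}
```

Only the columns that were actually set end up in the map, so the same map can be turned into a Put without writing placeholder values for fields that were never assigned.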
  99. 99. 99 No. 2
  100. 100. 100 No. 2 ONE table, MULTI domains
  101. 101. 101 No. 2 ONE table, MULTI domains NoSQL is different from RDBMS
  102. 102. 102 No. 2 ONE table, MULTI domains ● In RDBMS – – – ● In NoSQL – – –
  103. 103. 103 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – – – ● In NoSQL – – –
  104. 104. 104 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – – – ● In NoSQL (at runtime) – – –
  105. 105. 105 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – – ● In NoSQL (at runtime) – – –
  106. 106. 106 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – – ● In NoSQL (at runtime) – Rowkey always changes – –
  107. 107. 107 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – Schema is fixed – ● In NoSQL (at runtime) – Rowkey always changes – –
  108. 108. 108 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – Schema is fixed – ● In NoSQL (at runtime) – Rowkey always changes – Schema always changes –
  109. 109. 109 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – Schema is fixed – DAO serves one domain ● In NoSQL (at runtime) – Rowkey always changes – Schema always changes –
  110. 110. 110 No. 2 ONE table, MULTI domains ● In RDBMS (at design time) – Primary key affects only one column – Schema is fixed – DAO serves one domain ● In NoSQL (at runtime) – Rowkey always changes – Schema always changes – DAO serves many domains
  111. 111. 111 No. 2 ONE table, MULTI domains User Id ID0000001A3B Access Token d66e3b70-3666-11e4-8c21-0800200c9a66 Expired Time 1410077636
  112. 112. 112 No. 2 ONE table, MULTI domains User Id ID0000001A3B Access Token d66e3b70-3666-11e4-8c21-0800200c9a66 Expired Time 1410077636 User Id+ACL Id ID0000001A3B+AI0070AD Create 1 Read 1 Update 0 Delete 0
  113. 113. 113 No. 2 ONE table, MULTI domains User Id ID0000001A3B Access Token d66e3b70-3666-11e4-8c21-0800200c9a66 Expired Time 1410077636 User Id+ACL Id ID0000001A3B+AI0070AD Create 1 Read 1 Update 0 Delete 0
  114. 114. 114 No. 2 ONE table, MULTI domains ● A DAO maps to a domain in RDBMS
  115. 115. 115 No. 2 ONE table, MULTI domains ● A DAO maps to a domain in RDBMS DB DAO Domain A
  116. 116. 116 No. 2 ONE table, MULTI domains ● A DAO maps to multiple domains in NoSQL
  117. 117. 117 No. 2 ONE table, MULTI domains ● A DAO maps to multiple domains in NoSQL DB DAO Domain A1 Domain A2 Domain A3
  118. 118. 118 No. 2 ONE table, MULTI domains ● A DAO maps to multiple domains in NoSQL ● Build a middle layer to translate multiple domains DB DAO Domain A1 Domain A2 Domain A3
  119. 119. 119 No. 2 ONE table, MULTI domains ● A DAO maps to multiple domains in NoSQL ● Build a middle layer to translate multiple domains DB DAO Domain A1 Domain A2 Domain A3 Schema
  120. 120. 120 No. 2 ONE table, MULTI domains ● A DAO maps to multiple domains in NoSQL ● Build a middle layer to translate multiple domains DB DAO Domain A1 Domain A2 Domain A3 Schema
  121. 121. 121 No. 2 ONE table, MULTI domains private String checkDomainType(Result result) { if (result.isEmpty()) { return null; } else { String rowkey = Bytes.toString(result.getRow()); String[] splitKey = rowkey.split(DIVIDER); if (splitKey.length == 1) { } } }
  122. 122. 122 No. 2 ONE table, MULTI domains private String checkDomainType(Result result) { if (result.isEmpty()) { return null; } else { String rowkey = Bytes.toString(result.getRow()); String[] splitKey = rowkey.split(DIVIDER); if (splitKey.length == 1) { return DOMAIN_TYPE_VALIDATION1; } } }
  123. 123. 123 No. 2 ONE table, MULTI domains private String checkDomainType(Result result) { if (result.isEmpty()) { return null; } else { String rowkey = Bytes.toString(result.getRow()); String[] splitKey = rowkey.split(DIVIDER); if (splitKey.length == 1) { return DOMAIN_TYPE_VALIDATION1; } else if (splitKey.length == 2) { return DOMAIN_TYPE_VALIDATION2; } } }
  124. 124. 124 No. 2 ONE table, MULTI domains private String checkDomainType(Result result) { if (result.isEmpty()) { return null; } else { String rowkey = Bytes.toString(result.getRow()); String[] splitKey = rowkey.split(DIVIDER); if (splitKey.length == 1) { return DOMAIN_TYPE_VALIDATION1; } else if (splitKey.length == 2) { return DOMAIN_TYPE_VALIDATION2; } else { return DOMAIN_TYPE_VALIDATION3; } } }
  125. 125. 125 No. 2 ONE table, MULTI domains (Customize)
      private String checkDomainType(Result result) {
          if (result.isEmpty()) {
              return null;
          } else {
              // The number of rowkey segments decides which domain the row belongs to.
              String rowkey = Bytes.toString(result.getRow());
              String[] splitKey = rowkey.split(DIVIDER);

              if (splitKey.length == 1) {
                  return DOMAIN_TYPE_VALIDATION1;
              } else if (splitKey.length == 2) {
                  return DOMAIN_TYPE_VALIDATION2;
              } else {
                  return DOMAIN_TYPE_VALIDATION3;
              }
          }
      }
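
One detail to watch in code like this: `String.split` takes a regular expression, so if the divider is the "+" shown in the composite rowkeys earlier, passing it unquoted throws a `PatternSyntaxException`. The slides do not show the value of `DIVIDER`. The sketch below assumes it is "+" and uses placeholder strings for the domain type constants.

```java
import java.util.regex.Pattern;

// Hypothetical JDK-only variant of checkDomainType, working on the rowkey string.
class DomainTypes {
    static final String DIVIDER = "+"; // assumption: same separator as the composite keys

    static String classify(String rowkey) {
        if (rowkey == null || rowkey.isEmpty()) {
            return null;
        }
        // Pattern.quote makes split treat the divider as a literal string.
        int parts = rowkey.split(Pattern.quote(DIVIDER), -1).length;
        if (parts == 1) {
            return "VALIDATION1";
        } else if (parts == 2) {
            return "VALIDATION2";
        } else {
            return "VALIDATION3";
        }
    }
}
```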
  126. 126. 126 No. 3
  127. 127. 127 No. 3 NoSQL is different from RDBMS
  128. 128. 128 No. 3 NoSQL is different from RDBMS REALLY !!!
  129. 129. 129 API Blueprint
  130. 130. 130
  131. 131. 131 API Blueprint - Introduction
  132. 132. 132 API Blueprint - Introduction ● Web API Language
  133. 133. 133 API Blueprint - Introduction ● Web API Language ● Pure Markdown
  134. 134. 134 API Blueprint - Introduction ● Web API Language ● Pure Markdown ● Design for Humans
  135. 135. 135 API Blueprint - Introduction ● Web API Language ● Pure Markdown ● Design for Humans ● Understandable by Machines
  136. 136. 136 API Blueprint - Introduction ● Web API Language ● Pure Markdown ● Design for Humans ● Understandable by Machines ● Powerful Tooling
  137. 137. 137 API Blueprint - Introduction ● Web API Language ● Pure Markdown ● Design for Humans ● Understandable by Machines ● Powerful Tooling ● Easy Lifecycle
  138. 138. 138 API Blueprint - Hello World
  139. 139. 139 API Blueprint - Hello World
  140. 140. 140 API Blueprint - Complex
  141. 141. 141 API Blueprint - Complex
  142. 142. 142 HBase dataflow
  143. 143. 143 HBase dataflow conserve your domain know-how
  144. 144. 144 HBase dataflow - Solve what ?
  145. 145. 145 HBase dataflow - Solve what ? ● How to conserve system know-how about Put, Get, Scan or other operations in HBase ?
  146. 146. 146 Paper & Pen ?
  147. 147. 147 Paper & Pen ?
  148. 148. 148 Redmine / KM ?
  149. 149. 149 Redmine / KM ?
  150. 150. 150 http://kewangtw.github.io/hbase-dataflow/
  151. 151. 151 HBase dataflow - introduction
  152. 152. 152 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters
  153. 153. 153 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export
  154. 154. 154 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export – to JSON / Markdown
  155. 155. 155 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export – to JSON / Markdown – to PNG / PDF
  156. 156. 156 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export – to JSON / Markdown – to PNG / PDF ● Import from JSON
  157. 157. 157 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export – to JSON / Markdown – to PNG / PDF ● Import from JSON ● Write title & summary
  158. 158. 158 HBase dataflow - introduction ● HBase operation - Put, Delete, Get, Scan, Filters ● Export – to JSON / Markdown – to PNG / PDF ● Import from JSON ● Write title & summary ● Open source
  159. 159. 159 Live DEMO
  160. 160. 160 Design API Step by Step
  161. 161. 161 Design API Step by Step 1. Paper & pen are always your friends
  162. 162. 162 Design API Step by Step 1. Paper & pen are always your friends 2. Use HBase dataflow to simulate the data's flow
  163. 163. 163 Design API Step by Step 1. Paper & pen are always your friends 2. Use HBase dataflow to simulate the data's flow 3. Export it
  164. 164. 164
  165. 165. 165 References ● HBase in Action ● apiblueprint, aglio ● HBase dataflow
  166. 166. 166
  167. 167. 167
