2. stack@apache.org
● PMC* Chair of Apache HBase Project
● Caretaker/Janitor
● Member of the Hadoop PMC
● Engineer at Cloudera in San Francisco
* Project Management Committee
3. Table of Contents:
● What is HBase?
● Who uses it?
● Who runs the project?
● HBase Today
● Tomorrow
● Ecosystem
4. HBase is...
“...an open source, distributed, scalable, consistent, low latency, non-relational, random access database”
5. Built on Apache Hadoop
● Hadoop core is:
– Distributed file system (HDFS)
– MapReduce
● HBase persists all data to HDFS
● Uses Apache ZooKeeper
– Cluster coordination
6. Project Goal:
“Billions of rows X millions
of columns on clusters of
'commodity hardware'”
http://www.flickr.com/photos/ag_gilmore/8170021483/in/photostream/
7. Inspiration
A Google technology described in a 2006 paper,
“Bigtable: A Distributed Storage System for Structured Data”
by Chang et al.
15. DataModel: A Bigtable!
● 0-N Bigtable(s)
● Bigtable has:
– Rows x Column Families
– Rows have primary key
● Column Families have:
– Any number of Columns
– By access/attributes
– CF prefix and qualifier, e.g. attribute:mimetype
[Figure: Bigtable A, with row keys a through x and two column families, Column Family A and Column Family B. Highlighted cell = Cell @ bigtable 'A', row key 'p', CF 'B:red']
16. Datamodel: Regions
● Bigtable splits into “regions”
● Automatically as table grows
● Region has contiguous rows
● Known by [startRow, endRow)
● Distributed over cluster
● 0-100s per server
[Figure: row keys a, b, c, ... grouped into Region a-e, Region e-j, Region k-o, etc.]
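To make the [startRow, endRow) idea concrete, here is a minimal sketch of how a row key could be mapped to its region with a binary search over sorted start keys. It is a toy model only: the boundaries, the server names, and the functions `locate_region` and `locate_server` are hypothetical and are not part of the HBase client API.

```python
import bisect

# Hypothetical region boundaries for one table. Each region covers
# [start_row, end_row); the empty start key means "from the beginning",
# and a region ends where the next one starts.
REGION_START_KEYS = [b"", b"e", b"k", b"p"]
# Hypothetical assignment of each region to a server.
REGION_SERVERS = ["server-1", "server-2", "server-1", "server-3"]


def locate_region(row_key: bytes) -> int:
    """Return the index of the region whose [start, end) range holds row_key."""
    # bisect_right finds the first start key strictly greater than row_key;
    # the region just before that position contains the key.
    return bisect.bisect_right(REGION_START_KEYS, row_key) - 1


def locate_server(row_key: bytes) -> str:
    """Return the (hypothetical) server hosting the region for row_key."""
    return REGION_SERVERS[locate_region(row_key)]
```

Because the start key is inclusive and the end key exclusive, a row key equal to a boundary (here `b"e"`) belongs to the region that starts there, not the one that ends there.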
17. DataModel: Sorted & Versioned
● All is byte[]
● No native 'types'
● Minor schema or schema-less (NoSQL)
● All is SORTED
● Rows in byte-lexicographical order
● Columns sorted along row
● VERSIONED
● Cells are “versioned”
● 3D (timestamp)
[Figure: the cells of Region a-e stacked along a third dimension (3D), one layer per timestamp]
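A small sketch of what byte-lexicographical ordering and cell versions mean in practice. It uses plain Python bytes and dicts, not HBase itself. The row keys, the zero-padding width, and the helper `read_cell` are illustrative assumptions.

```python
# Row keys are plain bytes, so they sort byte by byte, not numerically.
row_keys = [b"row-10", b"row-9", b"row-1", b"row-2"]
sorted_keys = sorted(row_keys)
# Byte order puts b"row-10" before b"row-2" because b"1" < b"2".

# Zero-padding (an assumed key design, width 4 here) makes byte order
# agree with numeric order.
padded_keys = sorted(b"row-%04d" % n for n in (10, 9, 1, 2))

# Versions: model one cell as {timestamp: value}.
cell_versions = {1000: b"v1", 2000: b"v2", 3000: b"v3"}


def read_cell(versions, max_versions=1):
    """Return up to max_versions (timestamp, value) pairs, newest first."""
    newest_first = sorted(versions.items(), reverse=True)
    return newest_first[:max_versions]
```

The practical consequence is that key design matters: since there are no native types, any ordering the application wants has to be encoded into the bytes of the key.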
18. Datamodel: Strongly consistent
● Favors consistency over availability
“Designing applications to cope with concurrency anomalies in their data is very error-prone, time-consuming, and ultimately not worth the performance gains” -- F1: A Distributed SQL Database That Scales
● Row modifications are atomic
● Even if thousands of columns on a row
19. Datamodel: in short
“...a sparse, distributed, persistent
multidimensional sorted map”
– Bigtable Paper (2006)
(Table, Row, ColumnFamily, Qualifier, Timestamp) → Value
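The key-to-value mapping above can be modelled as a toy in-memory map for a single table. This is a sketch for intuition only, not how HBase stores data. The class `TinyTable` and its methods are hypothetical names, and timestamps are plain integers supplied by the caller.

```python
from typing import Dict, List, Optional, Tuple

# (row, column family, qualifier, timestamp)
Key = Tuple[bytes, bytes, bytes, int]


class TinyTable:
    """Toy stand-in for one table: a sparse map from Key to value."""

    def __init__(self) -> None:
        # Sparse: only cells that were written take up an entry.
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: bytes, family: bytes, qualifier: bytes,
            timestamp: int, value: bytes) -> None:
        self._cells[(row, family, qualifier, timestamp)] = value

    def get(self, row: bytes, family: bytes, qualifier: bytes) -> Optional[bytes]:
        """Return the newest value for one column, or None if absent."""
        versions = [(key[3], value) for key, value in self._cells.items()
                    if key[:3] == (row, family, qualifier)]
        return max(versions)[1] if versions else None

    def scan(self, start_row: bytes, stop_row: bytes) -> List[Tuple[Key, bytes]]:
        """Cells with start_row <= row < stop_row, sorted by row, then
        column, then newest timestamp first."""
        hits = [(key, value) for key, value in self._cells.items()
                if start_row <= key[0] < stop_row]
        return sorted(hits, key=lambda kv: (kv[0][0], kv[0][1], kv[0][2], -kv[0][3]))
```

A short usage example:

```python
table = TinyTable()
table.put(b"p", b"B", b"red", 1, b"old")
table.put(b"p", b"B", b"red", 2, b"new")
table.put(b"a", b"A", b"x", 1, b"first")
newest = table.get(b"p", b"B", b"red")   # the value written at timestamp 2
cells = table.scan(b"a", b"q")           # row a first, then row p newest first
```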
21. Features
• Classes for running MapReduce over HBase tables
– Hive, Pig, etc.
• Query predicate push down via server side filters
• Coprocessors (stored procedures/triggers)
– e.g. security, secondary indices
• Java clients
– REST and Thrift too
• Extensible JRuby-based (JIRB) shell
• Replication
• Security
– Table/Column Family
– Kerberos authentication, ACLs
23. What to expect
• Writes:
– 1-3ms, 1k-20k writes/sec per node
• Reads:
– 0-3ms cached, 10-30ms disk
– 10-40k reads / second / node from cache
– Higher if SSD
• Cell size
– 0-3MB preferred
• Column-oriented so wide tables are OK
• Sparsely populated rows OK
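As a rough back-of-the-envelope aid, the per-node write range quoted above (1k-20k writes/sec) can be turned into a node-count estimate. This is a sketch under strong assumptions: it ignores replication, hot spots, and read load. The function name and the 100k writes/sec target are hypothetical.

```python
import math


def nodes_for_write_load(target_writes_per_sec: int,
                         per_node_writes_per_sec: int) -> int:
    """Smallest node count whose combined write rate meets the target."""
    return math.ceil(target_writes_per_sec / per_node_writes_per_sec)


# Hypothetical target of 100k writes/sec, bracketed by the quoted range.
pessimistic = nodes_for_write_load(100_000, 1_000)    # at 1k writes/sec/node
optimistic = nodes_for_write_load(100_000, 20_000)    # at 20k writes/sec/node
```

The wide gap between the two estimates is the point: where a workload lands in the range depends on cell size, hardware, and configuration, so measuring on a test cluster beats extrapolating.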
27. ● OLTP & Batch
● Messages
○ 1B+ users
○ Tens of PBs (compressed)
○ Thousands of machines, Pods of ~200
● ODS/Real-time monitoring/Timeseries
○ Metrics from every server @ FB
○ 2.5B writes/16k reads per minute
● Post Search Store
○ MapReduce to build index
○ 1 Trillion posts
28. ● All on AWS
● 5 production clusters and growing
● Mix of SSD and SATA
● Billions of page views per month
29. ● Long time HBase user
● Two clusters of 1k nodes each
○ Master-Master replicating
● Separate low-latency cluster
○ Up to 1M reads a second
30. Cassini
● eBay item search indexing
● 600M active items in HBase tables
● 1.4TB of data processed each day
● 400M puts to HBase each day
● 250M search metrics per day
● Two datacenters
● Growing clusters...
– 500->1k
31. Deploy types
• Multitenant multifarious feature store
o a.k.a. dumping ground
o StumbleUpon, Y!, Salesforce
• Reconciliation store
o eBay
• Timeseries
o Salesforce, FB ODS
• Lots-o-entities store
o Flurry, genome
o Lots-o-entities BLOBs, FB Messages
51. Namespaces
• Grouping of tables
– Like database in MySQL
• System/User
– hbase:meta
• Quota
• Coming
– Security by namespace
– Grouping on cluster by namespace
52. And more...
• X-row (in-region) Transactions
• Query tracing
• New UI
• Online Region Merge
• Client-side types
• Metrics2
o Radical revamp
• Windows!
54. • Branched, released soon
• Rolling upgrade from 0.96.0
• In-line Cell-tags
– Security++
● ACL down to the Cell-level
● Cell-level visibility labels
● Encryption
• Reverse Scan
55. HBase 2014
HBase 1.0.0
● Reining in the 99th percentiles
● Multi-WAL
● Speculative replica reads
● More support for multi-tenancy
● Off-heap
58. Haeinsa
What is Haeinsa?
A linearly scalable multi-row, multi-table transaction library for HBase. Haeinsa uses two-phase locking and optimistic concurrency control to implement transactions. The isolation level of transactions is serializable.
● Inspired by Google Percolator
● VCNC
68. TODO
● DBA: R (read), W (write), C (create), X (execute), A (admin).
● Cell-level security. Every cell in an Accumulo store can have a label, stored effectively as part of the key, which is used to determine whether a value is visible to a given subject or not. The label is not an ACL; it is a different way of expressing security policy.
● A label instead turns this on its head and describes the sensitivity of the information to a decision engine that then figures out if the subject is authorized to view data of that sensitivity based on (potentially, many) factors.
● Then, as of HBASE-7662, HBase can store into and apply ACLs from cell tags, extending the current HBase ACL model down to the cell.
● Finally, we have also contributed transparent server side encryption, as HBASE-7544, for additional assurance against accidental leakage of data at rest, which is at this time an HBase-only feature.
● Auto-manages partitioning
● Storage machinery in the RS
● I like the Latency/Throughput/Read/Write axis in Nick