The State of Apache HBase
Michael Stack

stack@apache.org
● PMC* Chair of Apache HBase Project
● Caretaker/Janitor
● Member of the Hadoop PMC
● Engineer at Cloudera in San Francisco

* Project Management Committee

Table of Contents:
● What is HBase?
● Who uses it?
● Who runs the project?
● HBase Today
● Tomorrow
● Ecosystem

HBase is...
“...an open source,
distributed, scalable,
consistent, low latency,
non-relational, random
access database”

Built on Apache Hadoop
● Hadoop core is:
– Distributed file system (HDFS)
– MapReduce
● HBase persists all data to HDFS
● Uses Apache ZooKeeper
– Cluster coordination

Project Goal:
“Billions of rows X millions
of columns on clusters of
'commodity hardware'”

http://www.flickr.com/photos/ag_gilmore/8170021483/in/photostream/

Inspiration
A Google technology described in a 2006 paper,
“Bigtable: A Distributed Storage System for Structured Data”
by Chang et al.

First commit...

commit 454a9dbe046194f8eef3dddc3e5942910dd5b7a1
Author: Douglass Cutting <cutting@apache.org>
Date: Tue Apr 3 20:34:28 2007 +0000

HADOOP-1045. Add contrib/hbase, a BigTable-like online database.

Low-latency,
online, random
read/writes
+ “Simple” access patterns

Datamodel*

* Like the Google Bigtable model, only with different nomenclature

DataModel: A Bigtable!
● 0-N Bigtable(s)
● Bigtable has:
– Rows x Column Families
– Rows have primary key
● Column Families have:
– Any number of Columns
– By access/attributes
– CF prefix and qualifier, e.g. attribute:mimetype

= Cell @ bigtable 'A', row key 'p', CF 'B:red'

[Figure: Bigtable A, a grid with row keys a through x down the side and columns grouped into Column Family A and Column Family B]

Datamodel: Regions
● Bigtable splits into “regions”
● Automatically as table grows
● Region has contiguous rows
● Known by [startRow, endRow)
● Distributed over cluster
● 0-100s per server

[Figure: rows a through o of a table grouped into Region a-e, Region e-j, Region k-o, etc.]
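
Because every region is known by a half-open [startRow, endRow) range, finding the region for a row key comes down to a sorted lookup on region start keys. The following is a minimal sketch of that idea using only the Java standard library. The class name `RegionLookupSketch`, the region names, and the split points are hypothetical, and keys are plain strings here rather than byte arrays, so this is an illustration and not how the HBase client is actually implemented.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: find the region whose [startRow, endRow) range holds a row key. */
class RegionLookupSketch {
    // Regions indexed by start key; each region ends where the next one starts.
    // An empty start key means "from the beginning of the table".
    // Note: String ordering stands in for byte-lexicographical ordering here.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startRow, String regionName) {
        regionsByStartKey.put(startRow, regionName);
    }

    /** Returns the region with the greatest start key that is <= rowKey. */
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no region covers row " + rowKey);
        }
        return entry.getValue();
    }

    /** Builds a table with four made-up regions. */
    static RegionLookupSketch example() {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "region-1");   // ["", "f")
        table.addRegion("f", "region-2");  // ["f", "k")
        table.addRegion("k", "region-3");  // ["k", "p")
        table.addRegion("p", "region-4");  // ["p", end of table)
        return table;
    }

    public static void main(String[] args) {
        RegionLookupSketch table = example();
        System.out.println(table.locate("c"));   // region-1
        System.out.println(table.locate("f"));   // region-2: the start key is inclusive
        System.out.println(table.locate("j"));   // region-2
        System.out.println(table.locate("zz"));  // region-4
    }
}
```

Splitting a region in this sketch would amount to adding one more start key, which is why contiguous, sorted row ranges make automatic splitting cheap to describe.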

DataModel: Sorted & Versioned
● All is byte []
● No native 'types'
● Minor schema or schema-less (NoSQL)
● All is SORTED
● Rows in byte-lexicographical order
● Columns sorted along row
● VERSIONED
● Cells are “versioned”
● 3D (timestamp)

[Figure: Region a-e drawn in 3D, with cell versions stacked along the timestamp axis]
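
Since everything is a byte array with no native types, and rows sort in byte-lexicographical order, row keys that embed numbers sort as text unless they are padded to a fixed width. The sketch below shows the effect with the standard library only (it assumes Java 9 or later for `Arrays.compareUnsigned`). The class name `ByteOrderSketch` and the sample keys are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

/** Hypothetical sketch of byte-lexicographical row ordering (needs Java 9+). */
class ByteOrderSketch {
    // Compare row keys byte by byte, treating each byte as unsigned (0-255).
    static final Comparator<byte[]> BYTE_LEXICOGRAPHIC =
            (a, b) -> Arrays.compareUnsigned(a, b);

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the given keys, space separated, in byte-lexicographical order. */
    static String sorted(String... keys) {
        TreeSet<byte[]> rows = new TreeSet<>(BYTE_LEXICOGRAPHIC);
        for (String key : keys) {
            rows.add(bytes(key));
        }
        StringBuilder out = new StringBuilder();
        for (byte[] row : rows) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(new String(row, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unpadded numbers sort as text: prints "row-10 row-2 row-9"
        System.out.println(sorted("row-9", "row-10", "row-2"));
        // Zero-padded numbers sort numerically: prints "row-02 row-09 row-10"
        System.out.println(sorted("row-09", "row-10", "row-02"));
    }
}
```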

Datamodel: Strongly consistent
● Favors consistency over availability

“Designing applications to cope with concurrency
anomalies in their data is very error-prone, time-consuming,
and ultimately not worth the performance
gains” -- F1: A Distributed SQL Database That Scales

● Row modifications are atomic
– Even if thousands of columns on a row

Datamodel: in short
“...a sparse, distributed, persistent
multidimensional sorted map”
– Bigtable Paper (2006)
(Table, Row, ColumnFamily, Qualifier, Timestamp) → Value
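
One way to picture the (Table, Row, ColumnFamily, Qualifier, Timestamp) → Value mapping is as nested sorted maps for a single table. The sketch below is an in-memory illustration only, built on `java.util.TreeMap`. The class name `SortedMapSketch` and its methods are hypothetical, keys and values are strings instead of byte arrays, and nothing here reflects how HBase stores data on disk.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;

/** Hypothetical in-memory sketch of one table as a sparse, sorted, multidimensional map. */
class SortedMapSketch {
    // row -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    /** Stores one version of a cell; absent cells take no space (sparse). */
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest version of a cell, or null if the cell was never written. */
    String get(String row, String column) {
        return get(row, column, Long.MAX_VALUE);
    }

    /** Returns the newest version written at or before the timestamp, or null. */
    String get(String row, String column, long timestamp) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) {
            return null;
        }
        // Timestamps are ordered newest first, so ceilingEntry finds the
        // largest timestamp that is <= the requested one.
        Map.Entry<Long, String> entry = versions.ceilingEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    /** Row keys in sorted order. */
    NavigableSet<String> rowKeys() {
        return rows.navigableKeySet();
    }

    public static void main(String[] args) {
        SortedMapSketch table = new SortedMapSketch();
        table.put("p", "B:red", 1L, "v1");
        table.put("p", "B:red", 2L, "v2");
        System.out.println(table.get("p", "B:red"));      // v2 (newest version)
        System.out.println(table.get("p", "B:red", 1L));  // v1 (version as of timestamp 1)
        System.out.println(table.get("p", "A:x"));        // null (cell never written)
    }
}
```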

Architecture: Birds-eye view
[Diagram: Application, MapReduce, Impala, Thrift/REST Gateway, HBase Java Client, ZooKeeper, HBase Master, HBase RegionServer, HDFS]

Features

• Classes to MapReduce HBase tables
– HIVE, PIG, etc.
• Query predicate push down via server side filters
• Coprocessors (stored procedures/triggers)
– e.g. security, secondary indices
• Java clients
– REST and Thrift too
• Extensible jruby-based (JIRB) shell
• Replication
• Security
– Table/Column Family
– Kerberos Authentication, ACLs

API
● get
● put
● delete
● multi
● scan
● increment
● append
● checkAnd*
● MapReduce
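
The checkAnd* operations are compare-and-set style calls: the mutation is applied only if a cell currently holds an expected value, and the check and the write happen as one atomic step on the row. The sketch below imitates that contract in memory with `ConcurrentHashMap.compute`. The class `CheckAndPutSketch` and its method signature are hypothetical stand-ins, not the HBase client API, and the convention that a null expected value means "the cell must not exist yet" is stated here as an assumption of the sketch.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical in-memory sketch of check-and-put semantics on the cells of one row. */
class CheckAndPutSketch {
    private final ConcurrentHashMap<String, String> cells = new ConcurrentHashMap<>();

    /**
     * Writes newValue only if the cell currently equals expected.
     * A null expected value means the cell must be absent (sketch assumption).
     * newValue must not be null. Returns true if the write was applied.
     */
    boolean checkAndPut(String cell, String expected, String newValue) {
        Objects.requireNonNull(newValue, "newValue");
        AtomicBoolean applied = new AtomicBoolean(false);
        // compute() runs the check and the write atomically for this key.
        cells.compute(cell, (key, current) -> {
            if (Objects.equals(current, expected)) {
                applied.set(true);
                return newValue;
            }
            return current;  // leave the cell unchanged
        });
        return applied.get();
    }

    String get(String cell) {
        return cells.get(cell);
    }

    public static void main(String[] args) {
        CheckAndPutSketch row = new CheckAndPutSketch();
        System.out.println(row.checkAndPut("cf:state", null, "new"));    // true: cell was absent
        System.out.println(row.checkAndPut("cf:state", "old", "done"));  // false: value is "new"
        System.out.println(row.checkAndPut("cf:state", "new", "done"));  // true
        System.out.println(row.get("cf:state"));                          // done
    }
}
```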

What to expect
• Writes:
– 1-3ms, 1k-20k writes/sec per node
• Reads:
– 0-3ms cached, 10-30ms disk
– 10-40k reads / second / node from cache
– More if SSD
• Cell size
– 0-3MB preferred
• Column-oriented so wide tables are OK
• Sparsely populated rows OK

● OLTP & Batch
● Messages
○ 1B+ users
○ Tens of PBs (compressed)
○ Thousands of machines, Pods of ~200
● ODS/Real-time monitoring/Timeseries
○ Metrics from every server @ FB
○ 2.5B writes/16k reads per minute
● Post Search Store
○ MapReduce to build index
○ 1 Trillion posts

● All on AWS
● 5 production clusters and growing
● Mix of SSD and SATA
● Billions of page views per month

● Long time HBase user
● Two clusters of 1k nodes each
○ Master-Master replicating
● Separate low-latency cluster
○ Up to 1M reads a second

Cassini
● eBay item search indexing
● 600M active items in HBase tables
● 1.4TB of data processed each day
● 400M puts to HBase each day
● 250M search metrics per day
● Two datacenters
● Growing clusters...
– 500->1k

Deploy types
• Multitenant multifarious feature store
o a.k.a. dumping ground
o StumbleUpon, Y!, Salesforce
• Reconciliation store
o eBay
• Timeseries
o Salesforce, FB ODS
• Lots-o-entities store
o Flurry, genome
o Lots-o-entities BLOBs, FB Messages

Diverse team*

COMMITTERS!
Preferably ALIVE!
* http://hbase.apache.org/team-list.html

# of commits
Total Files 2021
Total Lines of Code 832122
Total Commits 6615 (~ 3/day)
Authors 39

(https://www.ohloh.net/p/hbase)

Commits/Month Over Time
(0.94/trunk)

• Release every month
• Each more stable
• & more performant
• Some features…
• Currently at 0.94.13

Wire compatible between releases

http://www.flickr.com/photos/sysli/3026288256/sizes/o/in/photostream/

hbase-0.96.0
– Released October 19th, 2013
– 18 months in the making
– >2000 fixes

Big Themes
● Stability
● Operability
– Insight, tools
● Scalability
● Evolvability

Sampler
● Pluggable Compaction
– Smarter triggers
● Hadoop1 AND Hadoop2
● Smarter Region Balancer
● Region Assignment & Replication
– Hardened
● Coprocessors
– More hooks

http://www.flickr.com/photos/allspaw/5815258929/sizes/o/in/photostream/

http://www.flickr.com/photos/38595542@N02/3690830720/sizes/o/in/photostream/

• System tables
• Filesystem
• Up in ZooKeeper
• Over the wire

Snapshots
• By Table
o Snapshot, clone, restore, export
• Inexpensive
o Just metadata
• Good for...
o Backups
o Replication
o Offline processing

Namespaces
• Grouping of tables
– Like database in MySQL
• System/User
– hbase:meta
• Quota
• Coming
– Security by namespace
– Grouping on cluster by namespace

And more...
• X-row (in-region) Transactions
• Query tracing
• New UI
• Online Region Merge
• Client-side types
• Metrics2
o Radical revamp

• Windows!

• Branched, released soon
• Rolling upgrade from 0.96.0
• In-line Cell-tags
– Security++
● ACL down to the Cell-level
● Cell-level visibility labels
● Encryption
• Reverse Scan

HBase 2014

● HBase 1.0.0
● Reining in the 99th percentiles
● Multi-WAL
● Speculative replica reads
● More support for multi-tenancy
● Off-heap

OpenTSDB

● Timeseries
● Store, index and serve metrics at large scale
● Make data easily accessible and graphable

Haeinsa
What is Haeinsa?

Haeinsa is a linearly scalable multi-row, multi-table
transaction library for HBase. Haeinsa uses
two-phase locking and optimistic concurrency
control to implement transactions. The
isolation level of transactions is serializable.
● Inspired by Google Percolator
● VCNC

How to make it
easier to write
applications
against HBase?

Frameworks: Kiji.org

• Entity-centric, simple model
o Types, complex, compound types.
• Each cell is schema versioned
• Works across MR & REST, etc.
• Machine-learning libs
• Examples, tutorials
• Production users
• Open-source

Frameworks: CDK
• APIs providing Dataset abstraction
– get/put/delete API in Avro objects
• Highlights:
– Supports multiple components
● flume, morphlines, hive, crunch, hcat
– Types using Avro and Parquet formats
– Manages schema evolution
• Open source by Cloudera
– http://cloudera.github.io/cdk/docs/current

● Client-embedded JDBC driver
○ Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
● Alternate HBase Client API (SQL)
● Fast!
○ Exploits HBase Coprocessors/Filters
○ Types
○ Aggregations
○ Skip scans
○ Secondary indices
End

Thank You!
stack@apache.org

TODO
● DBA: R (read), W (write), C (create), X (execute), A (admin).
● Cell-level security. Every cell in an Accumulo store can have a label, stored effectively as part of
the key, which is used to determine whether a value is visible to a given subject or not. The label is
not an ACL, it is a different way of expressing security policy.
● A label instead turns this on its head and describes the sensitivity of the information to a decision
engine that then figures out if the subject is authorized to view data of that sensitivity based on
(potentially, many) factors.
● Then, as of HBASE-7662, HBase can store into and apply ACLs from cell tags, extending the
current HBase ACL model down to the cell.
● Finally, we have also contributed transparent server side encryption, as HBASE-7544, for
additional assurance against accidental leakage of data at rest, which is at this time an HBase-only feature.
● Auto-manages partitioning
● Storage machinery in the RS
● I like the Latency/Throughput/Read/Write axis in Nick
