1. HBase is a NoSQL (Not Only SQL) database. HBase store data
rows in labeled tables, each row have sortable key and an
arbitrary number of columns.
v HBase is linear and modular scaling
v It have automatic region server failover
v Tables are distributed on the cluster via
regions which supports automatic
sharding
v Hadoop/HDFS integration
v Supports MPP (massively parallelized
processing)
v Support Thrift and REST API
v Supports Block Cache and Bloom Filters
for high volume query optimization
v Provides build-in web pages for
operational insight
v Supports strongly consistent
reads/writes
v No joins supported as RDBMS
v Supports Tables with column family,
rows and columns
v All columns belong to column family
v Have table cells with intersection of row
and column coordinates and are
versioned {row, column, version}
v You can run get command to select row,
put to insert or update a row, scan to do
a loop for multiple rows and Delete to
delete record
By: Milind Zodge
v If you have millions or billions of rows,
then HBase is a good candidate
v You do not need advanced query
language SQL and can leave without
secondary indexes and typed columns
v You have enough hardware available as
even HDFS doesn’t do well with
anything less than 5 Data Node so you
will at least need 5 nodes cluster
v If your application has a variable
schema where each row is slightly
different
v If you data I stored in collections
v If you need key based access to data
when storing or retrieving
About HBase When to use HBase