1 
HBASE: overview 
Jean-Baptiste Poullet 
Consultant @Stat'Rgy
2 
Contents 
● What is HBase ? 
● HBase vs RDBMS (like MySQL or PostgreSQL) 
● Backup ? CRUD operations ? ACID compliant ? 
● Hardware/OS 
● HBase DB Design 
● UI ? Let's do a demo.
3 
What is HBase ? 
● Wikipedia definition: HBase is an open source, non-relational, 
distributed database modeled after Google's BigTable and 
written in Java. It is developed as part of Apache Software 
Foundation's Apache Hadoop project and runs on top of HDFS 
(Hadoop Distributed Filesystem), providing BigTable-like 
capabilities for Hadoop. That is, it provides a fault-tolerant way of 
storing large quantities of sparse data (small amounts of 
information caught within a large collection of empty or 
unimportant data, such as finding the 50 largest items in a group 
of 2 billion records, or finding the non-zero items representing less 
than 0.1% of a huge collection).
4 
HBase is used by the largest companies
5 
HBase features 
No real indexes 
● Rows are stored sequentially, as are the columns within each row. Therefore there are no issues with index bloat, and insert performance is 
independent of table size. 
Automatic partitioning 
● As your tables grow, they are automatically split into regions and distributed across all available nodes. 
Scale linearly and automatically with new nodes 
● Add a node, point it to the existing cluster, and run the region server. Regions rebalance automatically and load spreads evenly. 
Commodity hardware 
● Clusters are built on $1,000–$5,000 nodes rather than $50,000 nodes. RDBMSs are I/O hungry and require more costly hardware. 
Fault tolerance 
● With lots of nodes, each one is relatively insignificant. No need to worry about individual node downtime. 
Batch processing 
● MapReduce integration allows fully parallel, distributed jobs against your data with locality awareness (see the MapReduce sketch below).
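To make the batch-processing point concrete, here is a minimal, hedged sketch of a map-only MapReduce job that scans an HBase table through TableMapReduceUtil and counts rows with a job counter. It assumes the classic TableMapper API; the table name "mytable" is a placeholder.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class RowCountSketch {

  // The mapper is fed one HBase row per call; data locality is handled by the framework.
  static class CountMapper extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable rowKey, Result row, Context ctx)
        throws IOException, InterruptedException {
      ctx.getCounter("sketch", "rows").increment(1);   // count rows via a job counter
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hbase-row-count");        // Job.getInstance(...) on newer Hadoop
    job.setJarByClass(RowCountSketch.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // fewer RPC round-trips during the scan
    scan.setCacheBlocks(false);  // avoid polluting the block cache with a full table scan

    TableMapReduceUtil.initTableMapperJob("mytable", scan, CountMapper.class,
        NullWritable.class, NullWritable.class, job);   // "mytable" is a placeholder
    job.setOutputFormatClass(NullOutputFormat.class);   // map-only, no file output
    job.setNumReduceTasks(0);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```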
6 
HBase vs RDBMS 
Why should I migrate to HBase ? 
● Scalability / dealing with sparse matrix 
– In RDBMS, NULL cells need to be set and occupy space 
– In HBase, NULL cells are simply not stored 
When ? 
If you stay up at night worrying about your database (uptime, scale, or speed), then you should seriously 
consider making a jump from the RDBMS world to HBase. 
How ? 
● ETL (Sqoop, Scalding/Cascading, Scala, Python, BI ETL tools, etc.)
7 
CRUD operations in HBase 
CRUD operations for many clients (a client-API sketch follows below) 
Single-row transactions (multi-row transactions are possible since version 0.94 if the 
rows are on the same region server) 
Selecting specific columns and versions is possible 
Atomic read-modify-write on stored data => concurrent access is not an issue 
Co-processors are the equivalent of stored procedures in an RDBMS: they 
allow user code to be pushed into the address space of the server, 
give access to server-local data, and 
implement lightweight batch jobs, data pre-processing and data summarization 
HFiles are persistent, ordered, immutable maps from key to value 
Deleting data: a delete marker (tombstone marker) is written to indicate that a given key is 
deleted. During reads, data marked as deleted is skipped. 
DDI stands for Denormalization, Duplication and Intelligent Keys 
• Denormalization: a replacement for JOINs 
• Duplication: design for reads 
• Intelligent Keys: implement indexing and sorting, optimize reads
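As an illustration of these operations, here is a minimal sketch using the 0.94-era Java client API (HTable, Put.add); the table name "mytable", family "cf" and the qualifiers are placeholders, and newer HBase versions use Connection/Table and Put.addColumn instead.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class CrudSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
    HTable table = new HTable(conf, "mytable");          // "mytable" and "cf" are placeholder names
    try {
      // CREATE / UPDATE: a Put against a single row is atomic, even across column families.
      Put put = new Put(Bytes.toBytes("row-1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value-1"));
      table.put(put);

      // READ: select a specific column; older versions can be requested with get.setMaxVersions().
      Get get = new Get(Bytes.toBytes("row-1"));
      get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"));
      Result result = table.get(get);
      System.out.println(
          Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"))));

      // DELETE: writes a tombstone marker; the data disappears from reads immediately
      // and is physically removed at the next major compaction.
      Delete delete = new Delete(Bytes.toBytes("row-1"));
      table.delete(delete);
    } finally {
      table.close();
    }
  }
}
```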
8 
Is HBase ACID ? 
● ACID = Atomicity, Consistency, Isolation, and Durability 
● HBase guarantees: 
– Atomic: All row level operations within a table are atomic. This guarantee 
is maintained even when there’s more than one column family within a row. 
– Consistency: Scan operations return a consistent view of the data stored 
in HBase at some point in the past. Concurrent client interaction could 
update a row during a multi-row scan, but all rows returned by a scan 
operation will always contain valid data from some point in the past. 
– Durability: Any data that can be retrieved from HBase has also been made 
durable to disk (persisted to HDFS, in other words). 
When ACID properties are required by HBase clients, design the 
HBase schema such that cross row or cross table data operations 
are not required. Keeping data within a row provides atomicity.
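A small sketch of the single-row atomicity guarantee in practice, again against the 0.94-era HTable API: checkAndPut applies a Put only if a cell still holds an expected value, and incrementColumnValue bumps a counter, both under the row lock. Table, family and qualifier names are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AtomicRowSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "accounts");   // "accounts", "cf", "status" are placeholder names
    try {
      byte[] row = Bytes.toBytes("user-42");
      byte[] cf  = Bytes.toBytes("cf");

      // Atomic compare-and-set on a single row: the Put is applied only if the
      // current value of cf:status equals "pending".
      Put activate = new Put(row);
      activate.add(cf, Bytes.toBytes("status"), Bytes.toBytes("active"));
      boolean applied = table.checkAndPut(row, cf, Bytes.toBytes("status"),
          Bytes.toBytes("pending"), activate);
      System.out.println("checkAndPut applied: " + applied);

      // Atomic counter update on a single cell, a common read-modify-write pattern.
      long newCount = table.incrementColumnValue(row, cf, Bytes.toBytes("logins"), 1L);
      System.out.println("login count: " + newCount);
    } finally {
      table.close();
    }
  }
}
```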
9 
HBase cluster – Failure Candidates 
● Data Center: geo distributed data 
● Cluster: avoid redundant clusters; rather, have one big cluster with high redundancy 
● Rack: Hadoop has built-in rack awareness 
● Network Switch: redundant network within each node 
● Power Strip: redundant power within each node 
● Region Server or Data Node: can be added/removed dynamically for regular 
maintenance => hence a replication factor of 3 or 4 is needed 
● Zookeeper Node: ZooKeeper nodes are distributed and can be added/removed 
dynamically; there must be an odd number of them because of the quorum (best practice: 5 or 7) 
● HBase Master or Name Node: multiple HMasters (best practice: 2-3, 1 per rack)
10 
Backup built-in 
● HBase is highly distributed and has built-in versioning and a data 
retention policy 
– No need to backup just for redundancy 
– Point-in-time restore: 
● Use TTL/Table/CF/C and keep the history for X hours/days 
– Accidental deletes: 
● Use 'KeepDeletedCells' to keep all deleted data 
HDFS is a key enabling technology not only for Hadoop but also for HBase. By 
storing data in HDFS, HBase offers reliability, availability, seamless scalability, 
high performance and much more — all on cost effective distributed servers.
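As a hedged sketch of how TTL-based retention and KeepDeletedCells might be configured through the 0.94-era admin API (later versions use an enum for KeepDeletedCells and the TableName/descriptor-builder API); the table name "events", family "cf" and retention values are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class RetentionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      HTableDescriptor table = new HTableDescriptor("events");   // placeholder table name
      HColumnDescriptor cf = new HColumnDescriptor("cf");        // placeholder family name
      cf.setMaxVersions(10);              // keep up to 10 versions per cell
      cf.setTimeToLive(7 * 24 * 3600);    // drop cells older than 7 days (TTL is in seconds)
      cf.setKeepDeletedCells(true);       // retain deleted cells so accidental deletes can be recovered
      table.addFamily(cf);
      admin.createTable(table);
    } finally {
      admin.close();
    }
  }
}
```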
11 
Backup - Tools 
● Use export/import tool: 
– Based on timestamps; use it for point-in-time backup/restore 
● Use region snapshots 
– Take HFile snapshots and copy them over to new storage 
location 
– Copy HLog files for point-in-time roll-forward from the snapshot time 
(replay them using WALPlayer after the import) 
● Table snapshots (0.94.6+)
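A minimal sketch of the table-snapshot workflow through HBaseAdmin (available from 0.94.6 onwards); the snapshot and table names are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SnapshotSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // Take a named snapshot of a table (cheap: it records references to the current HFiles).
      admin.snapshot("events_snap_20150101", "events");   // placeholder snapshot and table names

      // Restore the table to the snapshot state (the table must be disabled first) ...
      admin.disableTable("events");
      admin.restoreSnapshot("events_snap_20150101");
      admin.enableTable("events");

      // ... or materialise the snapshot as a brand new table without touching the original.
      admin.cloneSnapshot("events_snap_20150101", "events_restored");
    } finally {
      admin.close();
    }
  }
}
```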
12 
Hardware/Disk/OS best practices 
● 1U or 2U preferred, avoid 4U or NAS or expensive systems 
● JBOD on slaves, RAID 1+0 on masters 
● No SSDs, No virtualized storage 
● Good number of cores (4-16), HyperThreading enabled on CPUs 
● Good amount of RAM (24-72G) 
● Dual 1G network, 10G or InfiniBand 
● SATA, 7/10/15K, the cheaper the better 
● Use drives with RAID firmware for faster error detection, and let disks fail fast on hardware errors 
● Ext3/Ext4/XFS 
● RHEL or CentOS or Ubuntu 
● Swappiness=0 and no swap files 
● Automation with Puppet (e.g. for deploying an HBase cluster) and Fabric (e.g. for deploying new HBase 
release with zero downtime)
13 
Alerting system 
● A proper alerting system is needed 
– JMX exposes all metrics 
– Ops Dashboards (Ganglia, Cacti, OpenTSDB, NewRelic) 
– Small Dashboard for critical events 
– Define proper levels for escalation 
– Critical 
● Losing a Master or ZooKeeper node 
● +/- 10% drop in performance or latency 
● Key thresholds (load, swap, I/O) 
● Losing 2 or more slave nodes 
● Disk failures 
● Unbalanced nodes 
● FATAL errors in logs
14 
Tables in HBase 
• Tables are sorted by Row in lexicographical order 
• Table schema only defines its column families 
• Each family consists of any number of columns 
• Each column consists of any number of versions 
• Columns only exist when inserted, NULLs are free 
• Columns within a family are sorted and stored together 
• Everything except the table name is raw bytes (byte[]) 
KeyValue: 
(Table, Row, Family:Column, Timestamp) -> Value 
KeyValue instances are not split across blocks. 
For example, if there is an 8 MB KeyValue, 
even if the block size is 64 KB this KeyValue will 
be read in as a coherent block. For more 
information, see the KeyValue source code. 
The KeyValue format inside a byte array is: 
• keylength 
• valuelength 
• key 
• value 
The Key is further decomposed as: 
• rowlength 
• row (i.e., the rowkey) 
• columnfamilylength 
• columnfamily 
• columnqualifier 
• timestamp 
• keytype (e.g., Put, Delete, 
DeleteColumn, DeleteFamily)
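To make the KeyValue layout tangible, here is a small sketch that builds a KeyValue by hand and reads its parts back through the 0.94-era accessors (later versions deprecate these in favour of the Cell/CellUtil API); all row, family, qualifier and value strings are placeholders:

```java
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyValueSketch {
  public static void main(String[] args) {
    // Build a KeyValue by hand: (row, family, qualifier, timestamp, value).
    KeyValue kv = new KeyValue(
        Bytes.toBytes("row-1"),          // rowkey
        Bytes.toBytes("cf"),             // column family
        Bytes.toBytes("col"),            // column qualifier
        1383859000000L,                  // timestamp in ms (placeholder value)
        Bytes.toBytes("some value"));    // cell value

    // The accessors mirror the key layout listed above.
    System.out.println("row       : " + Bytes.toString(kv.getRow()));
    System.out.println("family    : " + Bytes.toString(kv.getFamily()));
    System.out.println("qualifier : " + Bytes.toString(kv.getQualifier()));
    System.out.println("timestamp : " + kv.getTimestamp());
    System.out.println("value     : " + Bytes.toString(kv.getValue()));
    System.out.println("key type  : " + KeyValue.Type.codeToType(kv.getType()));  // e.g. Put
  }
}
```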
15 
What about the schema design ? 
Schema design is a combination of 
• Designing the keys (rows and columns) 
• Segregating data into column families 
• Choosing compression and block sizes 
CONFIG file: conf/hbase-site.xml
Designing the keys: READ or WRITE design 
● Sequential keys ([timestamp]) would be more appropriate for BridgeIris, since the 
writing process can be done in batch mode. 
● Interactive queries require fast access to the data. 
● Risk of hotspotting on regions with continuous sequential writes (OK if bulk loads 
are used instead). 
16
17 
Designing the keys
18 
Designing keys 
• Tall-Narrow Tables (many rows, few columns) vs Flat-Wide Tables (few rows, 
many columns) 
 Tall-Narrow is recommended 
 Store part of the cell data in the row key 
• Rows do not split across regions => avoid overly large rows. 
• Keep dimensions that are queried together in the same column family, since 
those columns will be stored in the same low-level storage file (HFile on HDFS). 
• Atomicity is at the row level => not an issue in BridgeIris: we can build the 
row/column keys such that we never need to update several rows together (see the row-key sketch below).
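As a hedged sketch of the tall-narrow, hotspot-avoiding key design discussed above: part of the cell data (a sample id and a position, both hypothetical fields, not the actual BridgeIris schema) is pushed into the row key, and a short hash prefix salts the key so sequential writes spread across regions.

```java
import java.security.MessageDigest;

import org.apache.hadoop.hbase.util.Bytes;

public class RowKeySketch {
  // Tall-narrow design: identifying data goes into the row key instead of into wide rows.
  // The 2-byte hashed prefix (a placeholder salt length) spreads writes across regions.
  public static byte[] rowKey(String sampleId, long position) throws Exception {
    byte[] hash = MessageDigest.getInstance("MD5").digest(Bytes.toBytes(sampleId));
    byte[] prefix = new byte[] { hash[0], hash[1] };
    return Bytes.add(prefix, Bytes.toBytes(sampleId + ":"), Bytes.toBytes(position));
  }

  public static void main(String[] args) throws Exception {
    // Keys for the same sample stay contiguous (good for scans), while different
    // samples land on different regions because of the hashed prefix.
    System.out.println(Bytes.toStringBinary(rowKey("sample-0001", 1383834L)));
    System.out.println(Bytes.toStringBinary(rowKey("sample-0002", 1383834L)));
  }
}
```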
What about the cluster and HBase config ? 
19 
• Data node and region server should be co-located. Same cluster 
• Replication: at least 3 => OK with HDFS 
• Too many or too small regions are not good. 
• When does a region split ? Region size ? Keep default or set to 1 GB 
• A region splits when a store grows larger than hbase.hregion.max.filesize (10 GB in HBase 
v0.94 as used by EMR) after a major compaction. For a 10-node cluster it is better to have 10 
regions of 0.4 GB than one big region of 4 GB, but too many regions generate memory 
overhead (MSLAB requires 2 MB per family per region). 
• How is a region assigned to a region server ? Keep default 
– Automated to ensure a balance between the region servers (manual command in HBase 
shell: balance_switch, hbase.balancer.period property) 
• What is the best block size ? Keep default 
– The block size can be configured for each column family (default 64 KB). 
– Column families can be flagged as in-memory (quick read access) => are there columns that 
will almost always be requested by the user ? 
• Should blocks be compressed ? How ? No compression, or Snappy if 
needed 
– Compression can be set per column family. GZIP (built in), SNAPPY (to be installed on 
all nodes). GZIP gives better compression but is slower; if compression is used, SNAPPY 
would be more appropriate (see the column-family tuning sketch below).
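A hedged sketch of per-column-family tuning through the 0.94-era admin API (the Compression class moved to org.apache.hadoop.hbase.io.compress in later releases); the table name, family names and block size are placeholders chosen for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class FamilyTuningSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      HTableDescriptor table = new HTableDescriptor("coverage");   // placeholder table name

      // Hot, frequently requested columns: keep in memory, default 64 KB block size.
      HColumnDescriptor hot = new HColumnDescriptor("hot");
      hot.setInMemory(true);

      // Bulk data: Snappy compression (native libs must be installed on all nodes);
      // larger blocks favour sequential scans over random point reads.
      HColumnDescriptor bulk = new HColumnDescriptor("bulk");
      bulk.setCompressionType(Compression.Algorithm.SNAPPY);
      bulk.setBlocksize(128 * 1024);

      table.addFamily(hot);
      table.addFamily(bulk);
      admin.createTable(table);
    } finally {
      admin.close();
    }
  }
}
```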
20 
Benchmarking is key 
● No single configuration fits all 
● Simulate use cases and run the tests: 
– Bulk loading 
– Random access, read/write 
– Batch processing 
– Scan, filter 
● Factors that can hurt performance 
– Replication factor 
– Zookeeper nodes 
– Network latency 
– Slower disks, CPUs 
– Hot regions, Bad row keys or Bulk loading without pre-splits
21 
MySQL to HBase 
Row key      Column family:{column qualifier:version:value} 
0000000001   gatk_change_stats:  {'chr':1383859:'5', 'pos':1383834:'3932', …} 
             gatk_gene_coverage: {'id_project':38398:'38', 'gene_symbol':3938:'ENSG00003433'} 
0000000002   gatk_change_stats:  {'chr':1383859:'2', 'pos':1383834:'3232', …} 
             gatk_gene_coverage: {'id_project':38398:'8', 'gene_symbol':3938:'ENSG000033890'} 
SQOOP 
http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_connecting_to_a_database_server
22 
Some demo ...
23 
Thanks !
