Apache HBase for Architects

Apache HBase
For Architects
Nick Dimiduk, Hortonworks
Strata/Hadoop World Barcelona, 2014-11-21
Licensed under a Creative Commons
Attribution-ShareAlike 3.0 Unported License.
Page 1

Agenda
• Background
– (how did we get here?)
• TL;DR
– (don’t waste my time!)
• High-level Architecture
– (where are we?)
• Anatomy of a RegionServer
– (how does this thing work?)
• By Example
– (how do I use it?)
• Resources
– (where do we go from here?)
Page 3

Background
Page 4

So what is HBase anyway?
• Bigtable paper from Google, 2006, Chang et al.
– “A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map.”
– http://research.google.com/archive/bigtable.html
• Key Features:
– Distributed storage across cluster of machines
– Random, online read and write data access
– Schemaless data model (“NoSQL”)
– Self-managed data partitions
Page 5

Apache Hadoop Dependencies
• Apache Hadoop Distributed Filesystem (HDFS)
– Distributed, fault-tolerant, throughput-optimized data storage
– The Google File System, 2003, Ghemawat et al.
– http://research.google.com/archive/gfs.html
• Apache ZooKeeper (ZK)
– Distributed, available, reliable coordination system
– The Chubby Lock Service …, 2006, Burrows
– http://research.google.com/archive/chubby.html
• Apache Hadoop MapReduce (MR)
– Distributed, fault-tolerant, batch-oriented data processing
– MapReduce: …, 2004, Dean and Ghemawat
– http://research.google.com/archive/mapreduce.html
Page 6

TL;DR
Page 7

Page 8
[Figure: C1 tree on Disk, C0 tree in Memory]
Figure 2.1. Schematic picture of an LSM-tree of two components
Figure 2.1 reproduced from O’Neil, Patrick, et al. "The log-structured
merge-tree (LSM-tree)." Acta Informatica 33.4 (1996): 351-385.
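The two-component idea in Figure 2.1 can be sketched in a few lines. This is a simplified illustration, not the algorithm from the paper: it flushes the whole in-memory component as one sorted run instead of doing a rolling merge, and the "disk" component is just a Python list. The class name `TinyLSM` and the `flush_threshold` parameter are made up for this example.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: C0 is a dict in memory, C1 is a list of sorted runs."""

    def __init__(self, flush_threshold=3):
        self.c0 = {}            # in-memory component (C0)
        self.c1_runs = []       # stand-in for the on-disk component (C1), newest run last
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # Writes only touch memory; a full C0 is flushed as one sorted run.
        self.c0[key] = value
        if len(self.c0) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.c0:
            self.c1_runs.append(sorted(self.c0.items()))
            self.c0 = {}

    def get(self, key):
        # Newest data wins: check C0 first, then runs from newest to oldest.
        if key in self.c0:
            return self.c0[key]
        for run in reversed(self.c1_runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; later (newer) runs overwrite older values.
        merged = {}
        for run in self.c1_runs:
            merged.update(run)
        self.c1_runs = [sorted(merged.items())] if merged else []


lsm = TinyLSM(flush_threshold=2)
lsm.put("a", 1)
lsm.put("b", 2)   # C0 is full, so it is flushed to a C1 run
lsm.put("a", 3)   # newer value for "a" lives in C0 only
print(lsm.get("a"), lsm.get("b"))
```

The read has to consult both components, which is why reads in this design tend to cost more than writes.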

Page 9
[Figure labels: DataNode; RegionServer; C1 tree (Disk) and C0 tree (Memory), repeated; Cache]

Page 10
[Figure: DataNode and RegionServer pairs; each RegionServer holds several C1 tree (Disk) / C0 tree (Memory) pairs and a Cache. Behind the diagram, partially visible text and the caption "Figure 3.7 HBase RegionServer and HDFS DataNode processes".]

Page 11
[Figure: DataNode and RegionServer pairs; each RegionServer holds several C1 tree (Disk) / C0 tree (Memory) pairs and a Cache. Background text on the slide:]
… store/access data on HDFS. The master process does the distribution of regions among RegionServers, and each RegionServer typically hosts multiple regions.
Given that the underlying data is stored in HDFS, which is available to all clients as a single namespace, all RegionServers have access to the same persisted files in the file system and can therefore host any region (figure 3.8). By physically collocating DataNodes and RegionServers, you can use the data locality property; that is, RegionServers can theoretically read and write to the local DataNode as the primary DataNode.
You may wonder where the TaskTrackers are in this scheme of things. In some HBase deployments, the MapReduce framework isn’t deployed at all if the workload is primarily random reads and writes. In other deployments, where the MapReduce processing is also a part of the workloads, TaskTrackers, DataNodes, and HBase RegionServers can run together.
Figure 3.6 A table consists of multiple smaller chunks called regions.
Figure 3.7 HBase RegionServer and HDFS DataNode processes are typically collocated on the same host.

Page 12
[Figure: the DataNode/RegionServer diagram and background text again, overlaid with an example row:]
a   cf1   "bar"          1368394583   7
                         1368394261   "hello"
          "foo"          1368394583   22
                         1368394925   13.6
                         1368393847   "world"
    cf2   "2011-07-04"   1368396302   "fourth of July"
          1.0001         1368387684   "almost the loneliest number"

High-level Architecture
Page 13

Logical Data Model
Page 14
Table A
rowkey   column family   column qualifier   timestamp    value
a        cf1             "bar"              1368394583   7
                                            1368394261   "hello"
                         "foo"              1368394583   22
                                            1368394925   13.6
                                            1368393847   "world"
         cf2             "2011-07-04"       1368396302   "fourth of July"
                         1.0001             1368387684   "almost the loneliest number"
b        cf2             "thumb"            1368387247   [3.6 kb png data]
[Callouts: Rows; Column Families]

Logical Architecture
Page 15
Table A
Rowkeys: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p
Regions: Region 1, Region 2, Region 3, Region 4
Region Server 7
Table A, Region 1
Table A, Region 2
Table G, Region 1070
Table L, Region 25
Region Server 86
Table A, Region 3
Table C, Region 30
Table F, Region 160
Table F, Region 776
Region Server 367
Table A, Region 4
Table C, Region 17
Table E, Region 52
Table P, Region 1116
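As a rough sketch of what this picture implies for a client: a sorted table is cut into contiguous rowkey ranges (regions), and finding the server for a rowkey is a range lookup. The region-to-server assignment below is the one drawn on the slide for Table A; the split points `e`, `i`, `m` are an assumption made for this example, since the slide does not say where the table is split. The function name `locate` is also hypothetical and is not an HBase API.

```python
import bisect

# Hypothetical split points: the slide does not say where Table A is split.
# Region i holds rowkeys in [REGION_START_KEYS[i], REGION_START_KEYS[i + 1]).
REGION_START_KEYS = ["", "e", "i", "m"]
REGION_NAMES = ["Region 1", "Region 2", "Region 3", "Region 4"]

# Region-to-server assignment as drawn on the slide for Table A.
ASSIGNMENT = {
    "Region 1": "Region Server 7",
    "Region 2": "Region Server 7",
    "Region 3": "Region Server 86",
    "Region 4": "Region Server 367",
}


def locate(rowkey):
    """Return (region, server) for a rowkey of Table A."""
    # The owning region is the last one whose start key is <= rowkey.
    idx = bisect.bisect_right(REGION_START_KEYS, rowkey) - 1
    region = REGION_NAMES[idx]
    return region, ASSIGNMENT[region]


for key in ["a", "f", "k", "p"]:
    print(key, locate(key))
```

Because the lookup depends only on sort order, rows with neighbouring keys land in the same region, which is what makes range scans cheap and poorly chosen rowkeys a hotspot risk.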

Physical Architecture
Page 16
[Figure labels: HBase Client; ZooKeeper (x2); Master (x2); NameNode; RegionServer paired with DataNode (x4, ...); layers: HBase, HDFS]

User API
• {rowkey => {family => {qualifier => {version => value}}}}
– Think: nested TreeMap (Java), OrderedDictionary (C#), OrderedDict (Python)
• Basic data operations: GET, PUT, DELETE
• SCAN over range of key-values
– benefit of the sorted rowkey business
– this is how you implement any kind of “complex query” *
• GET, SCAN support Filters
– Push application logic to RegionServers
• INCREMENT, APPEND, CheckAnd{Put,Delete}
– Server-side, atomic data operations, can be contentious!
Page 17
* This is also a foundational component in what we refer to as
“schema design” in this “schemaless” database.
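The nested-map view of the API can be made concrete with a toy model. This is only a sketch of the data model, assuming plain in-memory dictionaries; it is not the HBase client API, and the class `ToyTable`, its method signatures, and the sample rowkeys are invented for illustration. Real deletes and versions behave differently in detail.

```python
from collections import defaultdict


class ToyTable:
    """{rowkey => {family => {qualifier => {version => value}}}} in memory."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, rowkey, family, qualifier, value, ts):
        self.rows[rowkey][family][qualifier][ts] = value

    def get(self, rowkey, family, qualifier):
        # Return the value with the highest version (timestamp), if any.
        versions = self.rows.get(rowkey, {}).get(family, {}).get(qualifier, {})
        return versions[max(versions)] if versions else None

    def delete(self, rowkey):
        self.rows.pop(rowkey, None)

    def scan(self, start, stop):
        # Rowkeys are visited in sorted order; the range is [start, stop).
        for rowkey in sorted(self.rows):
            if start <= rowkey < stop:
                yield rowkey


t = ToyTable()
t.put("user#001", "cf1", "name", "ann", ts=1)
t.put("user#001", "cf1", "name", "anne", ts=2)   # newer version of the same cell
t.put("user#002", "cf1", "name", "bob", ts=1)
t.put("video#001", "cf1", "title", "cats", ts=1)

print(t.get("user#001", "cf1", "name"))
print(list(t.scan("user#", "user$")))            # all rowkeys with the "user#" prefix
```

The scan shows why rowkey design matters: putting the entity type first in the key turns "all users" into one contiguous range.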

Anatomy of a RegionServer
Page 18

Page 19
[Figure labels: C1 tree (Disk) and C0 tree (Memory), repeated; Cache]

Storage Machinery
Page 20
[Figure: a RegionServer (HBase) drawn with a DataNode (Hadoop DFS). The RegionServer holds an HLog (WAL), a BlockCache, and several HRegions; each HRegion holds several HStores; each HStore holds a MemStore and several StoreFiles, each backed by an HFile.]

Storage Machinery
Page 21
[Figure: the Storage Machinery diagram (HLog (WAL), BlockCache, HRegion, HStore, MemStore, StoreFile/HFile) annotated with LSM-tree labels: C1 and C0 on each of the four HStores, and Cache on the BlockCache.]

Storage Machinery: Write Path
Page 22
[Figure: the Storage Machinery diagram (HLog (WAL), BlockCache, HRegion, HStore, MemStore, StoreFile/HFile) with the write path marked as steps 1 to 5.]

Storage Machinery: Read Path
Page 23
[Figure: the Storage Machinery diagram (HLog (WAL), BlockCache, HRegion, HStore, MemStore, StoreFile/HFile) with the read path marked as steps 1 to 5; steps 2 and 3 each appear twice.]

By Example
Page 24

Database Dichotomy
Page 25
[Figure: tuning options arranged on two axes, Write vs. Read and Latency vs. Throughput]
• Predicate Pushdown
• Row+Column Bloomfilter
• Lossy WAL Durability
• Larger Block Size
• Bulkload HFiles
• Smaller Block Size
• Larger BlockCache
• Increase Blocking Storefiles
• Increased Scanner Caching
• Larger Flush Size
• Compression (appears four times)
• Smart Rowkey Design (appears four times)

Web-scale Database
Page 26
[Figure labels: App server (x5); Counters; Sessions; User profiles; Social Media; Application Data; axes: Write/Read, Latency/Throughput]

“BigIndex”
Page 27
[Figure labels: "BigIndex"; Document store; Search (x5); App server (x5); axes: Write/Read, Latency/Throughput]

Materialized View
Page 28
[Figure labels: App server (x5); axes: Write/Read, Latency/Throughput]

ETL Assist
Write Read
Page 29
Latency
Throughput

Lambda Architecture
Page 30
[Figure labels: App server (x5)]

Resources
Page 31

Join the Community!
• hbase.apache.org
– hbase.apache.org/book.html
– blogs.apache.org/hbase/
– hbase.apache.org/mail-lists.html
• IRC: irc.freenode.net #hbase
• JIRA: issues.apache.org/jira/browse/HBASE
• Source: git clone git://git.apache.org/hbase.git
• In person
– HBaseCon, hbasecon.com
– Hadoop Summit, hadoopsummit.org
– Strata / Hadoop World, strataconf.com
– Local meetup near you!
Page 32

HBase 0.98/1.0
• Hardening
– Stability, Reliability, Availability, Performance
• Horizontal Scalability
– 1000s of machines
• Availability
– Speed of Recovery (MTTR), Region Replicas
• Improved Multi-tenancy
– RPC priorities/QoS, namespace management
• Cell-level security
• Semantic Versioning
• Client API cleanup
Page 33

Thanks!
Page 34
MANNING
Nick Dimiduk
Amandeep Khurana
Foreword by Michael Stack
hbaseinaction.com
Nick Dimiduk
github.com/ndimiduk
@xefyr
n10k.com
strataeucftw
slideshare.net/xefyr/hbase-for-architects
