Meet HBase 2.0 and Phoenix 5.0

Meet HBase 2.0 and Phoenix 5.0
Enis Soztutar
Rajeshbabu Chintaguntla
15th June 2017

2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
About Us
 Enis Soztutar
– Apache HBase PMC from 2012
– Release Manager for 1.0
– HBase dev @ Hortonworks
– @enissoz
 Rajeshbabu Chintaguntla
– Apache HBase Committer
– Apache Phoenix PMC
– HBase dev @ Hortonworks
– @rajeshhcu32

Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix

GIANT DISCLAIMER!!!
HBase-2.0.0 / Phoenix-5.0.0 is NOT released yet (as of
June2017)
Community is working hard on these releases
Expected to arrive before end of year 2017
Keep an eye out for scheduled alpha and beta releases

Agenda
Changes in Behavior
Phoenix 5.0

Semantic Versioning
Starting with the 1.0 release, HBase works toward
Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: only BC bug fixes.
MINOR: BC new features
MAJOR: Incompatible changes

To be, or not to be (Compatible)
Compatibility is NOT a simple yes or no
Many dimensions
source, binary, wire, command line,
dependencies etc
What is client interface?
Read
https://hbase.apache.org/book.html#upgradin
g
7

Major Minor Patch
Client-Server Wire
Compatibility
✗ ✓ ✓
Server-Server Compatibility ✗ ✓ ✓
File Format Compatibility ✗* ✓ ✓
Client API Compatibility ✗ ✓ ✓
Client Binary Compatibility ✗ ✗ ✓
Server Side Limited API
Compatibility
✗ ✗*/✓* ✓
Dependency Compatibility ✗ ✓ ✓
Operation Compatibility ✗ ✗ ✓

Agenda
Changes in Behavior
Phoenix 5.0

Changes in behavior – HBase-2.0
JDK-8+ only
Likely Hadoop-2.7+ and Hadoop-3 support
Filter and Coprocessor changes
 Managed connections are gone.
Deprecated client interfaces are gone
Target is “Rolling upgradable” from 1.x
Updated dependencies
10

How to prepare for HBase-2.0
2.0 contains more API clean up
Cleanup PB and guava “leaks” into the API
Some deprecated APIs (HConnection, HTable,
HBaseAdmin, etc) going away
Start using JDK-8 (and G1). You will like it.
1.x client should be able to do read / write /
scan against 2.0 clusters
Some DDL / Admin operations may not work
11

Agenda
Changes in Behavior
Phoenix 5.0

Offheaping Read Path
Goals
–Make use of all available RAM
–Reduce temporary garbage for reading
–Make bucket cache as fast as LRU cache
HBase block cache can be configured as L1 / L2
“Bucket cache” can be backed by onheap, off-
heap or SSD
Different sized buckets in 4MB slices
HFile blocks placed in different buckets
Cell comparison only used to deal with byte[]
Copy the whole block to heap
13

https://www.slideshare.net/HBaseCon/offheaping-the-apache-hbase-read-path

https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in1

Offheap Read-Path in Production - The Alibaba story (https://blogs.apache.org/hbase/)

Offheaping Write Path
Reduce GC, improve throughput predictability
Use off-heap buffer pools to read RPC requests
Remove garbage created in Codec parsing, MSLAB
copying, etc
MSLAB is used for heap fragmentation
Now MSLAB can allocate offheap pooled buffers
[2MB]
17

Offheaping Write Path
Allows bigger total memstore space
Cell data goes off-heap
Cell metadata (and CSLM#Entry, etc) is on-heap
(~100 byte overhead)
Async WAL is used to avoid copying Cells into
heap
Works with new “Compacting Memstore”
18

Compacting Memstore
In memory compaction
–Eliminate duplicates
–Gains are proportional to duplicates
In memory index reduction
–Reduce index overhead per cell
–Gains proportional to cell size
Reduce flushes to disk
–Reduce file creation
–Reduce write amplification

Region Assignment
Region assignment is one of the pain points
Complete overhaul of region assignment
Based on the internal “Procedure V2” framework
Series of operations are broken down into states
in State Machine
State’s are serialized to durable storage
Uses a WAL on HDFS
Backup master can recover the state from where
left off
20

Async Client
A complete new client maintained in HBase
Different than OpenTSDB/asynchbase
Fully async end to end
Based on Netty and JDK-8 CompletableFutures
Includes most critical functionality including
Admin / DDL
Some advanced functionality (read replicas, etc)
are not done yet
Have both sync and async client.
21

Async Client

C++ Client
-std=c++14
Synchronous client built fully on async client
Uses folly and wangle from FB (like Netty, JDK-8
Futures)
Follow the same architecture as the async client
Gets / Puts / Multi-Gets are working. Scan next
23

C++ Client

Backup / Restore
HBase needs a reliable backup / restore
mechanism
Full backups and Incremental backups
Backup destination to a remote FS
HDFS, S3, ADLS, WASB, and any Hadoop-
compatible FS
Full backup based on “snapshots”
Incremental backup ships Write-Ahead-Logs
Backup + Replication are complimentary (for a full
DR solution)
25

FileSystem Quotas
HBase supports quotas
Request count or size per sec,min,hour,day (per
server)
Num tables / num regions in namespace
New work enables quotas for disk space usage
Master periodically gathers info about disk space
usage
Regionservers enforce limit when notified from
master
Limits per table or per namespace (not per user)
Violation policy: {DISABLE,
NO_WRITES_COMPACTIONS, NO_WRITES,
NO_INSERTS} 26

Will be released for the first time
These will be released for the first time in Apache
(although other backports exist)
Medium Object Support
Region Server groups
Favored Nodes enhancements
HBase-Spark module
WAL’s and data files in separate FileSystems

Misc
NettyRpcClient
NettyRpcServer
API Cleanup
Replication V2
New metrics API
Metrics API for Coprocessors
Tons of other improvements

Agenda
Changes in Behavior
Phoenix 5.0

Apache Phoenix-5.0
Phoenix + Calcite introduction
Examples of How Calcite helps to optimize the
queries.
Development is happening in a branch for more
than a year

Current Phoenix Architecture
Parser
Algebra
Query Plan
Runtime
HBase Data
Stage1:Parse Node Tree
Stage2:Normalization
Secondary index rewrite
Stage3:Expression Tree
Phoenix Schema

Current Calcite Architecture
Parser
Algebra
Operators,
Rules,
Statistics,
Cost Model
Schema API
ENGINE
DATA
ENGINE
DATA
ENGINE
DATA

Why calcite integration?
 Phoenix query optimizer applies minimal static rules to decide the
query plan and for selecting the tables to be used in the query.
 But with calcite all query optimization decisions will be based on cost
and query optimizations are modeled as pluggable rules
– Filter push down; range scan vs. skip scan
– Hash aggregate vs. stream aggregate vs. partial stream aggregate
– Sort optimized out; sort/limit push through; fwd/rev/unordered
scan
– Hash join vs. merge join; join ordering
– Use of data table vs. index table
– All above (and many others) COMBINED
 Can make use of Phoenix Statistics for cost based decisions

Phoenix+Calcite Architecture
Parser
Algebra
Logical + Phoenix Operators,
Builtin + Phoenix Rules,
Phoenix Statistics,
Phoenix Cost Model
Schema API
JDBC(OPTIONA
L)
DATA
QUERY PLAN
Other(optio
nal)
DATADATA
Phoenix
Runtime

SELECT ItemName, SUM(Price*Quantity) as
OrderValue
FROM Items
JOIN Orders
ON Items.ItemID=Orders.ItemID
WHERE Orders.CustomerID > ‘C002’
GROUP BY ItemName
ORDER BY SUM(Price*Quantity) DESC
sort
aggregate
filter
join
scan
[Items]
Scan
[Orders]
Translate SQL
to relational
algebra
Example 1: CALCITE ALGEBRA

Example 1: FilterInto Join Rule
SELECT ItemName,
SUM(Price*Quantity) as OrderValue
FROM Items
JOIN Orders
ON Items.ItemID=Orders.ItemID
WHERE Orders.CustomerID > ‘C002’
GROUP BY ItemName
ORDER BY SUM(Price*Quantity) DESC
sort sort
aggregate aggregate
Filter
Filterjoin
Join
scan
[Items]
Scan
[Orders]
scan
[Items]
Scan
[Orders]
FilterInto Join rule
Translate SQL
to relational
algebra

SELECT O.OrderID, C.CustomerName, C.Country,
O.Date
FROM Orders AS O
INNER JOIN Customers AS C
ON O.CustomerID = C.CustomerID
ORDER BY CustomerID
Project
sort
join
scan
[Orders]
Scan
[Customers]
Translate SQL
to relational
algebra
Example 2: CALCITE ALGEBRA

Example 2: Plan Candidates
proje
ct
sort
Hash-
join
Scan
[Orders]
Scan
[Customers]
Candidate 1 :
hash-join
*also what
standalone Phoenix
compiler would
generate
Candidate 2 :
Merge-join
Win
sort
Scan
[Orders]
Scan
[Costomers]
SortRemove Rule
Sorted on [CustomerID]
SortRemove Rule
Sorted on [O.CustomerID]
1. Very little difference in all other operators: project, scan, hash-join or merge-join
2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps”
proje
ct
Merge-
join

Advantages with Phoenix-Calcite integration
Strong SQL standard compliance
Can make use of Calcite features like Window
functions, scalar subquery, FILTER etc.
Interop with other Calcite adapters
–Already used by Drill, Hive, Kylin, Samza, etc..
–Supports any JDBC source
39

Phoenix – Calcite Integration work status
We are working separate phoenix branch “calcite”
for calcite integration work.
Making changes in calcite to support missing
features like dynamic columns, ordered-set and
hypothetical set aggregation functions etc.
Need to fix some issues in Phoenix Calcite
integration as well
~70% test cases are passing.

Agenda
Changes in Behavior
Phoenix 5.0

Local index hardening and performance improvements
Region Server
Region
CLIENT
1.Write
request
prepare index updates
Data cf Index cf
2.batch call
Mem
Store
Me
mSto
re
Index
updates
Data updates
4.Merge data and
index updates
5.Write to
MemStores
WAL
6.Write to WAL
100% Atomic
and consistent
local updates
with data
updates

Local index hardening and performance improvements
- 4 node cluster
- Tested with 5 local indexes on the base table of 25 columns with 10 regions.
- Ingested 50M rows.
- 3x faster upsert time comparing to global indexes
- 5x less network RX/TX utilizations during write comparing to global indexes
- Similar read performance comparing to global indexes with queries like
aggregations, group by, limit etc.

Column mapping and data encoding
 Maps columns with numbers of variable size based on number of
columns.
 It helps to reduce disk space utilization ,makes easy alteration of
column names and improve the performance.
 Supporting a new encoding scheme for immutable tables that
packs all the values into a single cell for column family.
 Supported from Phoenix 4.10.0+
CREATE TABLE T
(
a_string varchar not null,
col1 integer
CONSTRAINT pk PRIMARY KEY (a_string)
)
COLUMN_ENCODED_BYTES = 1;

Apache Hive integration
 Apache Phoenix supports Hive integration to access Phoenix
Table data from HiveQL using Phoenix Storage Handler.
create <external> table phoenix_table (
s1 string,
i1 int,
)
STORED BY
'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
"phoenix.table.name" = "phoenix_table",
"phoenix.zookeeper.quorum" = "localhost",
"phoenix.zookeeper.znode.parent" = "/hbase",
"phoenix.zookeeper.client.port" = "2181",
"phoenix.rowkeys" = "s1",
"phoenix.column.mapping" = "s1:s1, i1:i1",
);

Apache Kapka Integration
 Apache Kafka plugin helps to reliably and efficiently stream
large amounts of data onto HBase using Phoenix API. As part
of this plugin we are providing PhoenixConsumer to receive
messages from KafkaProducer.

Misc
 Use Hbase snapshorts for MR Based queries and async index
building – 4.11.0
 Support for forward-only cursors – 4.11.0
 Integration with spark 2.0 – 4.10.0
 Support for HBase 1.3.1+ - 4.11.0
 Server side upsert select – 4.10.0
 Omid transactions Integration – IN PROGRESS

Thank You
Q & A?

Meet HBase 2.0 and Phoenix 5.0

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Meet HBase 2.0 and Phoenix 5.0

Similar to Meet HBase 2.0 and Phoenix 5.0 (20)

More from DataWorks Summit

More from DataWorks Summit (20)

Recently uploaded

Recently uploaded (20)

Meet HBase 2.0 and Phoenix 5.0