SlideShare a Scribd company logo
Meet HBase 2.0 and Phoenix 5.0
Enis Soztutar
Rajeshbabu Chintaguntla
15th June 2017
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
About Us
 Enis Soztutar
– Apache HBase PMC from 2012
– Release Manager for 1.0
– HBase dev @ Hortonworks
– @enissoz
 Rajeshbabu Chintaguntla
– Apache HBase Committer
– Apache Phoenix PMC
– HBase dev @ Hortonworks
– @rajeshhcu32
3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
GIANT DISCLAIMER!!!
HBase-2.0.0 / Phoenix-5.0.0 is NOT released yet (as of
June2017)
Community is working hard on these releases
Expected to arrive before end of year 2017
Keep an eye out for scheduled alpha and beta releases
5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Semantic Versioning
Starting with the 1.0 release, HBase works toward
Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: only BC bug fixes.
MINOR: BC new features
MAJOR: Incompatible changes
7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
To be, or not to be (Compatible)
Compatibility is NOT a simple yes or no
Many dimensions
source, binary, wire, command line,
dependencies etc
What is client interface?
Read
https://hbase.apache.org/book.html#upgradin
g
7
8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Major Minor Patch
Client-Server Wire
Compatibility
✗ ✓ ✓
Server-Server Compatibility ✗ ✓ ✓
File Format Compatibility ✗* ✓ ✓
Client API Compatibility ✗ ✓ ✓
Client Binary Compatibility ✗ ✗ ✓
Server Side Limited API
Compatibility
✗ ✗*/✓* ✓
Dependency Compatibility ✗ ✓ ✓
Operation Compatibility ✗ ✗ ✓
9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Changes in behavior – HBase-2.0
JDK-8+ only
Likely Hadoop-2.7+ and Hadoop-3 support
Filter and Coprocessor changes
 Managed connections are gone.
Deprecated client interfaces are gone
Target is “Rolling upgradable” from 1.x
Updated dependencies
10
11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
How to prepare for HBase-2.0
2.0 contains more API clean up
Cleanup PB and guava “leaks” into the API
Some deprecated APIs (HConnection, HTable,
HBaseAdmin, etc) going away
Start using JDK-8 (and G1). You will like it.
1.x client should be able to do read / write /
scan against 2.0 clusters
Some DDL / Admin operations may not work
11
12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Offheaping Read Path
Goals
–Make use of all available RAM
–Reduce temporary garbage for reading
–Make bucket cache as fast as LRU cache
HBase block cache can be configured as L1 / L2
“Bucket cache” can be backed by onheap, off-
heap or SSD
Different sized buckets in 4MB slices
HFile blocks placed in different buckets
Cell comparison only used to deal with byte[]
Copy the whole block to heap
13
14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
https://www.slideshare.net/HBaseCon/offheaping-the-apache-hbase-read-path
15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Offheaping Read Path
https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in1
16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Offheaping Read Path
Offheap Read-Path in Production - The Alibaba story (https://blogs.apache.org/hbase/)
17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Offheaping Write Path
Reduce GC, improve throughput predictability
Use off-heap buffer pools to read RPC requests
Remove garbage created in Codec parsing, MSLAB
copying, etc
MSLAB is used for heap fragmentation
Now MSLAB can allocate offheap pooled buffers
[2MB]
17
18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Offheaping Write Path
Allows bigger total memstore space
Cell data goes off-heap
Cell metadata (and CSLM#Entry, etc) is on-heap
(~100 byte overhead)
Async WAL is used to avoid copying Cells into
heap
Works with new “Compacting Memstore”
18
19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Compacting Memstore
In memory compaction
–Eliminate duplicates
–Gains are proportional to duplicates
In memory index reduction
–Reduce index overhead per cell
–Gains proportional to cell size
Reduce flushes to disk
–Reduce file creation
–Reduce write amplification
20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Region Assignment
Region assignment is one of the pain points
Complete overhaul of region assignment
Based on the internal “Procedure V2” framework
Series of operations are broken down into states
in State Machine
State’s are serialized to durable storage
Uses a WAL on HDFS
Backup master can recover the state from where
left off
20
21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Async Client
A complete new client maintained in HBase
Different than OpenTSDB/asynchbase
Fully async end to end
Based on Netty and JDK-8 CompletableFutures
Includes most critical functionality including
Admin / DDL
Some advanced functionality (read replicas, etc)
are not done yet
Have both sync and async client.
21
22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Async Client
23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
C++ Client
-std=c++14
Synchronous client built fully on async client
Uses folly and wangle from FB (like Netty, JDK-8
Futures)
Follow the same architecture as the async client
Gets / Puts / Multi-Gets are working. Scan next
23
24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
C++ Client
25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Backup / Restore
HBase needs a reliable backup / restore
mechanism
Full backups and Incremental backups
Backup destination to a remote FS
HDFS, S3, ADLS, WASB, and any Hadoop-
compatible FS
Full backup based on “snapshots”
Incremental backup ships Write-Ahead-Logs
Backup + Replication are complimentary (for a full
DR solution)
25
26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
FileSystem Quotas
HBase supports quotas
Request count or size per sec,min,hour,day (per
server)
Num tables / num regions in namespace
New work enables quotas for disk space usage
Master periodically gathers info about disk space
usage
Regionservers enforce limit when notified from
master
Limits per table or per namespace (not per user)
Violation policy: {DISABLE,
NO_WRITES_COMPACTIONS, NO_WRITES,
NO_INSERTS} 26
27 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Will be released for the first time
These will be released for the first time in Apache
(although other backports exist)
Medium Object Support
Region Server groups
Favored Nodes enhancements
HBase-Spark module
WAL’s and data files in separate FileSystems
28 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Misc
NettyRpcClient
NettyRpcServer
API Cleanup
Replication V2
New metrics API
Metrics API for Coprocessors
Tons of other improvements
29 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Phoenix-5.0
Phoenix + Calcite introduction
Examples of How Calcite helps to optimize the
queries.
Development is happening in a branch for more
than a year
31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Current Phoenix Architecture
Parser
Algebra
Query Plan
Runtime
HBase Data
Stage1:Parse Node Tree
Stage2:Normalization
Secondary index rewrite
Stage3:Expression Tree
Phoenix Schema
32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Current Calcite Architecture
Parser
Algebra
Operators,
Rules,
Statistics,
Cost Model
Schema API
ENGINE
DATA
ENGINE
DATA
ENGINE
DATA
33 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Why calcite integration?
 Phoenix query optimizer applies minimal static rules to decide the
query plan and for selecting the tables to be used in the query.
 But with calcite all query optimization decisions will be based on cost
and query optimizations are modeled as pluggable rules
– Filter push down; range scan vs. skip scan
– Hash aggregate vs. stream aggregate vs. partial stream aggregate
– Sort optimized out; sort/limit push through; fwd/rev/unordered
scan
– Hash join vs. merge join; join ordering
– Use of data table vs. index table
– All above (and many others) COMBINED
 Can make use of Phoenix Statistics for cost based decisions
34 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Phoenix+Calcite Architecture
Parser
Algebra
Logical + Phoenix Operators,
Builtin + Phoenix Rules,
Phoenix Statistics,
Phoenix Cost Model
Schema API
JDBC(OPTIONA
L)
DATA
QUERY PLAN
Other(optio
nal)
DATADATA
Phoenix
Runtime
35 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
SELECT ItemName, SUM(Price*Quantity) as
OrderValue
FROM Items
JOIN Orders
ON Items.ItemID=Orders.ItemID
WHERE Orders.CustomerID > ‘C002’
GROUP BY ItemName
ORDER BY SUM(Price*Quantity) DESC
sort
aggregate
filter
join
scan
[Items]
Scan
[Orders]
Translate SQL
to relational
algebra
Example 1: CALCITE ALGEBRA
36 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Example 1: FilterInto Join Rule
SELECT ItemName,
SUM(Price*Quantity) as OrderValue
FROM Items
JOIN Orders
ON Items.ItemID=Orders.ItemID
WHERE Orders.CustomerID > ‘C002’
GROUP BY ItemName
ORDER BY SUM(Price*Quantity) DESC
sort sort
aggregate aggregate
Filter
Filterjoin
Join
scan
[Items]
Scan
[Orders]
scan
[Items]
Scan
[Orders]
FilterInto Join rule
Translate SQL
to relational
algebra
37 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
SELECT O.OrderID, C.CustomerName, C.Country,
O.Date
FROM Orders AS O
INNER JOIN Customers AS C
ON O.CustomerID = C.CustomerID
ORDER BY CustomerID
Project
sort
join
scan
[Orders]
Scan
[Customers]
Translate SQL
to relational
algebra
Example 2: CALCITE ALGEBRA
38 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Example 2: Plan Candidates
proje
ct
sort
Hash-
join
Scan
[Orders]
Scan
[Customers]
Candidate 1 :
hash-join
*also what
standalone Phoenix
compiler would
generate
Candidate 2 :
Merge-join
Win
sort
Scan
[Orders]
Scan
[Costomers]
SortRemove Rule
Sorted on [CustomerID]
SortRemove Rule
Sorted on [O.CustomerID]
1. Very little difference in all other operators: project, scan, hash-join or merge-join
2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps”
proje
ct
Merge-
join
39 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Advantages with Phoenix-Calcite integration
Strong SQL standard compliance
Can make use of Calcite features like Window
functions, scalar subquery, FILTER etc.
Interop with other Calcite adapters
–Already used by Drill, Hive, Kylin, Samza, etc..
–Supports any JDBC source
39
40 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Phoenix – Calcite Integration work status
We are working separate phoenix branch “calcite”
for calcite integration work.
Making changes in calcite to support missing
features like dynamic columns, ordered-set and
hypothetical set aggregation functions etc.
Need to fix some issues in Phoenix Calcite
integration as well
~70% test cases are passing.
41 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Versioning / Compatibility
Changes in Behavior
Major features in HBase 2.0
Phoenix 5.0
Recent Developments in Phoenix
42 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Local index hardening and performance improvements
Region Server
Region
CLIENT
1.Write
request
prepare index updates
Data cf Index cf
2.batch call
Mem
Store
Me
mSto
re
Index
updates
Data updates
4.Merge data and
index updates
5.Write to
MemStores
WAL
6.Write to WAL
100% Atomic
and consistent
local updates
with data
updates
43 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Local index hardening and performance improvements
- 4 node cluster
- Tested with 5 local indexes on the base table of 25 columns with 10 regions.
- Ingested 50M rows.
- 3x faster upsert time comparing to global indexes
- 5x less network RX/TX utilizations during write comparing to global indexes
- Similar read performance comparing to global indexes with queries like
aggregations, group by, limit etc.
44 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Column mapping and data encoding
 Maps columns with numbers of variable size based on number of
columns.
 It helps to reduce disk space utilization ,makes easy alteration of
column names and improve the performance.
 Supporting a new encoding scheme for immutable tables that
packs all the values into a single cell for column family.
 Supported from Phoenix 4.10.0+
CREATE TABLE T
(
a_string varchar not null,
col1 integer
CONSTRAINT pk PRIMARY KEY (a_string)
)
COLUMN_ENCODED_BYTES = 1;
45 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Hive integration
 Apache Phoenix supports Hive integration to access Phoenix
Table data from HiveQL using Phoenix Storage Handler.
create <external> table phoenix_table (
s1 string,
i1 int,
)
STORED BY
'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
"phoenix.table.name" = "phoenix_table",
"phoenix.zookeeper.quorum" = "localhost",
"phoenix.zookeeper.znode.parent" = "/hbase",
"phoenix.zookeeper.client.port" = "2181",
"phoenix.rowkeys" = "s1",
"phoenix.column.mapping" = "s1:s1, i1:i1",
);
46 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Kapka Integration
 Apache Kafka plugin helps to reliably and efficiently stream
large amounts of data onto HBase using Phoenix API. As part
of this plugin we are providing PhoenixConsumer to receive
messages from KafkaProducer.
47 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Misc
 Use Hbase snapshorts for MR Based queries and async index
building – 4.11.0
 Support for forward-only cursors – 4.11.0
 Integration with spark 2.0 – 4.10.0
 Support for HBase 1.3.1+ - 4.11.0
 Server side upsert select – 4.10.0
 Omid transactions Integration – IN PROGRESS
48 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Thank You
Q & A?

More Related Content

What's hot

Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query Server
Josh Elser
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
DataWorks Summit/Hadoop Summit
 
Apache Hive on ACID
Apache Hive on ACIDApache Hive on ACID
Apache Hive on ACID
DataWorks Summit/Hadoop Summit
 
The Heterogeneous Data lake
The Heterogeneous Data lakeThe Heterogeneous Data lake
The Heterogeneous Data lake
DataWorks Summit/Hadoop Summit
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
enissoz
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
DataWorks Summit
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016
Josh Elser
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
enissoz
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
DataWorks Summit/Hadoop Summit
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
Josh Elser
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon
 
Spark + HBase
Spark + HBase Spark + HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
April 2014 HUG : Apache Phoenix
April 2014 HUG : Apache PhoenixApril 2014 HUG : Apache Phoenix
April 2014 HUG : Apache Phoenix
Yahoo Developer Network
 
Empower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and HadoopEmpower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and Hadoop
DataWorks Summit/Hadoop Summit
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
DataWorks Summit/Hadoop Summit
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, Solutions
DataWorks Summit
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Dancing with the elephant h base1_final
Dancing with the elephant   h base1_finalDancing with the elephant   h base1_final
Dancing with the elephant h base1_finalasterix_smartplatf
 

What's hot (20)

Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query Server
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Apache Hive on ACID
Apache Hive on ACIDApache Hive on ACID
Apache Hive on ACID
 
The Heterogeneous Data lake
The Heterogeneous Data lakeThe Heterogeneous Data lake
The Heterogeneous Data lake
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
Spark + HBase
Spark + HBase Spark + HBase
Spark + HBase
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
April 2014 HUG : Apache Phoenix
April 2014 HUG : Apache PhoenixApril 2014 HUG : Apache Phoenix
April 2014 HUG : Apache Phoenix
 
Empower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and HadoopEmpower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and Hadoop
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, Solutions
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Dancing with the elephant h base1_final
Dancing with the elephant   h base1_finalDancing with the elephant   h base1_final
Dancing with the elephant h base1_final
 

Similar to Meet HBase 2.0 and Phoenix 5.0

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
Ankit Singhal
 
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSenseStreamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
Hortonworks
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
DataWorks Summit
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData Day
John Park
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, Scale
Hortonworks
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
DataWorks Summit/Hadoop Summit
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
DataWorks Summit
 
Hive acid and_2.x new_features
Hive acid and_2.x new_featuresHive acid and_2.x new_features
Hive acid and_2.x new_features
Alberto Romero
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
DataWorks Summit
 
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, JapanApache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Ankit Singhal
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
enissoz
 
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
HortonworksJapan
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
DataWorks Summit/Hadoop Summit
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
Chris Nauroth
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
DataWorks Summit/Hadoop Summit
 
Hadoop in adtech
Hadoop in adtechHadoop in adtech
Hadoop in adtech
Yuta Imai
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
Gergely Devenyi
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
alanfgates
 

Similar to Meet HBase 2.0 and Phoenix 5.0 (20)

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSenseStreamline Apache Hadoop Operations with Apache Ambari and SmartSense
Streamline Apache Hadoop Operations with Apache Ambari and SmartSense
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData Day
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, Scale
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
 
Hive acid and_2.x new_features
Hive acid and_2.x new_featuresHive acid and_2.x new_features
Hive acid and_2.x new_features
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, JapanApache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
Hive 3.0 - HDPの最新バージョンで実現する新機能とパフォーマンス改善
 
LLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in HiveLLAP: Sub-Second Analytical Queries in Hive
LLAP: Sub-Second Analytical Queries in Hive
 
Hadoop & cloud storage object store integration in production (final)
Hadoop & cloud storage  object store integration in production (final)Hadoop & cloud storage  object store integration in production (final)
Hadoop & cloud storage object store integration in production (final)
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in ProductionHadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
 
Hadoop in adtech
Hadoop in adtechHadoop in adtech
Hadoop in adtech
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 

Recently uploaded (20)

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 

Meet HBase 2.0 and Phoenix 5.0

  • 1. Meet HBase 2.0 and Phoenix 5.0 Enis Soztutar Rajeshbabu Chintaguntla 15th June 2017
  • 2. 2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved About Us  Enis Soztutar – Apache HBase PMC from 2012 – Release Manager for 1.0 – HBase dev @ Hortonworks – @enissoz  Rajeshbabu Chintaguntla – Apache HBase Committer – Apache Phoenix PMC – HBase dev @ Hortonworks – @rajeshhcu32
  • 3. 3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 4. 4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved GIANT DISCLAIMER!!! HBase-2.0.0 / Phoenix-5.0.0 is NOT released yet (as of June2017) Community is working hard on these releases Expected to arrive before end of year 2017 Keep an eye out for scheduled alpha and beta releases
  • 5. 5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 6. 6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Semantic Versioning Starting with the 1.0 release, HBase works toward Semantic Versioning MAJOR.MINOR.PATCH[-identifiers] PATCH: only BC bug fixes. MINOR: BC new features MAJOR: Incompatible changes
  • 7. 7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved To be, or not to be (Compatible) Compatibility is NOT a simple yes or no Many dimensions source, binary, wire, command line, dependencies etc What is client interface? Read https://hbase.apache.org/book.html#upgradin g 7
  • 8. 8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Major Minor Patch Client-Server Wire Compatibility ✗ ✓ ✓ Server-Server Compatibility ✗ ✓ ✓ File Format Compatibility ✗* ✓ ✓ Client API Compatibility ✗ ✓ ✓ Client Binary Compatibility ✗ ✗ ✓ Server Side Limited API Compatibility ✗ ✗*/✓* ✓ Dependency Compatibility ✗ ✓ ✓ Operation Compatibility ✗ ✗ ✓
  • 9. 9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 10. 10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Changes in behavior – HBase-2.0 JDK-8+ only Likely Hadoop-2.7+ and Hadoop-3 support Filter and Coprocessor changes  Managed connections are gone. Deprecated client interfaces are gone Target is “Rolling upgradable” from 1.x Updated dependencies 10
  • 11. 11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved How to prepare for HBase-2.0 2.0 contains more API clean up Cleanup PB and guava “leaks” into the API Some deprecated APIs (HConnection, HTable, HBaseAdmin, etc) going away Start using JDK-8 (and G1). You will like it. 1.x client should be able to do read / write / scan against 2.0 clusters Some DDL / Admin operations may not work 11
  • 12. 12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 13. 13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Offheaping Read Path Goals –Make use of all available RAM –Reduce temporary garbage for reading –Make bucket cache as fast as LRU cache HBase block cache can be configured as L1 / L2 “Bucket cache” can be backed by onheap, off- heap or SSD Different sized buckets in 4MB slices HFile blocks placed in different buckets Cell comparison only used to deal with byte[] Copy the whole block to heap 13
  • 14. 14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved https://www.slideshare.net/HBaseCon/offheaping-the-apache-hbase-read-path
  • 15. 15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Offheaping Read Path https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in1
  • 16. 16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Offheaping Read Path Offheap Read-Path in Production - The Alibaba story (https://blogs.apache.org/hbase/)
  • 17. 17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Offheaping Write Path Reduce GC, improve throughput predictability Use off-heap buffer pools to read RPC requests Remove garbage created in Codec parsing, MSLAB copying, etc MSLAB is used for heap fragmentation Now MSLAB can allocate offheap pooled buffers [2MB] 17
  • 18. 18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Offheaping Write Path Allows bigger total memstore space Cell data goes off-heap Cell metadata (and CSLM#Entry, etc) is on-heap (~100 byte overhead) Async WAL is used to avoid copying Cells into heap Works with new “Compacting Memstore” 18
  • 19. 19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Compacting Memstore In memory compaction –Eliminate duplicates –Gains are proportional to duplicates In memory index reduction –Reduce index overhead per cell –Gains proportional to cell size Reduce flushes to disk –Reduce file creation –Reduce write amplification
  • 20. 20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Region Assignment Region assignment is one of the pain points Complete overhaul of region assignment Based on the internal “Procedure V2” framework Series of operations are broken down into states in State Machine State’s are serialized to durable storage Uses a WAL on HDFS Backup master can recover the state from where left off 20
  • 21. 21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Async Client A complete new client maintained in HBase Different than OpenTSDB/asynchbase Fully async end to end Based on Netty and JDK-8 CompletableFutures Includes most critical functionality including Admin / DDL Some advanced functionality (read replicas, etc) are not done yet Have both sync and async client. 21
  • 22. 22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Async Client
  • 23. 23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved C++ Client -std=c++14 Synchronous client built fully on async client Uses folly and wangle from FB (like Netty, JDK-8 Futures) Follow the same architecture as the async client Gets / Puts / Multi-Gets are working. Scan next 23
  • 24. 24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved C++ Client
  • 25. 25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Backup / Restore HBase needs a reliable backup / restore mechanism Full backups and Incremental backups Backup destination to a remote FS HDFS, S3, ADLS, WASB, and any Hadoop- compatible FS Full backup based on “snapshots” Incremental backup ships Write-Ahead-Logs Backup + Replication are complimentary (for a full DR solution) 25
  • 26. 26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved FileSystem Quotas HBase supports quotas Request count or size per sec,min,hour,day (per server) Num tables / num regions in namespace New work enables quotas for disk space usage Master periodically gathers info about disk space usage Regionservers enforce limit when notified from master Limits per table or per namespace (not per user) Violation policy: {DISABLE, NO_WRITES_COMPACTIONS, NO_WRITES, NO_INSERTS} 26
  • 27. 27 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Will be released for the first time These will be released for the first time in Apache (although other backports exist) Medium Object Support Region Server groups Favored Nodes enhancements HBase-Spark module WAL’s and data files in separate FileSystems
  • 28. 28 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Misc NettyRpcClient NettyRpcServer API Cleanup Replication V2 New metrics API Metrics API for Coprocessors Tons of other improvements
  • 29. 29 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 30. 30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Phoenix-5.0 Phoenix + Calcite introduction Examples of How Calcite helps to optimize the queries. Development is happening in a branch for more than a year
  • 31. 31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Current Phoenix Architecture Parser Algebra Query Plan Runtime HBase Data Stage1:Parse Node Tree Stage2:Normalization Secondary index rewrite Stage3:Expression Tree Phoenix Schema
  • 32. 32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Current Calcite Architecture Parser Algebra Operators, Rules, Statistics, Cost Model Schema API ENGINE DATA ENGINE DATA ENGINE DATA
  • 33. 33 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Why calcite integration?  Phoenix query optimizer applies minimal static rules to decide the query plan and for selecting the tables to be used in the query.  But with calcite all query optimization decisions will be based on cost and query optimizations are modeled as pluggable rules – Filter push down; range scan vs. skip scan – Hash aggregate vs. stream aggregate vs. partial stream aggregate – Sort optimized out; sort/limit push through; fwd/rev/unordered scan – Hash join vs. merge join; join ordering – Use of data table vs. index table – All above (and many others) COMBINED  Can make use of Phoenix Statistics for cost based decisions
  • 34. 34 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Phoenix+Calcite Architecture Parser Algebra Logical + Phoenix Operators, Builtin + Phoenix Rules, Phoenix Statistics, Phoenix Cost Model Schema API JDBC(OPTIONA L) DATA QUERY PLAN Other(optio nal) DATADATA Phoenix Runtime
  • 35. 35 © Hortonworks Inc. 2011 – 2017. All Rights Reserved SELECT ItemName, SUM(Price*Quantity) as OrderValue FROM Items JOIN Orders ON Items.ItemID=Orders.ItemID WHERE Orders.CustomerID > ‘C002’ GROUP BY ItemName ORDER BY SUM(Price*Quantity) DESC sort aggregate filter join scan [Items] Scan [Orders] Translate SQL to relational algebra Example 1: CALCITE ALGEBRA
  • 36. 36 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Example 1: FilterInto Join Rule SELECT ItemName, SUM(Price*Quantity) as OrderValue FROM Items JOIN Orders ON Items.ItemID=Orders.ItemID WHERE Orders.CustomerID > ‘C002’ GROUP BY ItemName ORDER BY SUM(Price*Quantity) DESC sort sort aggregate aggregate Filter Filterjoin Join scan [Items] Scan [Orders] scan [Items] Scan [Orders] FilterInto Join rule Translate SQL to relational algebra
  • 37. 37 © Hortonworks Inc. 2011 – 2017. All Rights Reserved SELECT O.OrderID, C.CustomerName, C.Country, O.Date FROM Orders AS O INNER JOIN Customers AS C ON O.CustomerID = C.CustomerID ORDER BY CustomerID Project sort join scan [Orders] Scan [Customers] Translate SQL to relational algebra Example 2: CALCITE ALGEBRA
  • 38. 38 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Example 2: Plan Candidates proje ct sort Hash- join Scan [Orders] Scan [Customers] Candidate 1 : hash-join *also what standalone Phoenix compiler would generate Candidate 2 : Merge-join Win sort Scan [Orders] Scan [Costomers] SortRemove Rule Sorted on [CustomerID] SortRemove Rule Sorted on [O.CustomerID] 1. Very little difference in all other operators: project, scan, hash-join or merge-join 2. Candidate 1 would sort “emps join depts”, while candidate 2 would only sort “emps” proje ct Merge- join
  • 39. 39 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Advantages with Phoenix-Calcite integration Strong SQL standard compliance Can make use of Calcite features like Window functions, scalar subquery, FILTER etc. Interop with other Calcite adapters –Already used by Drill, Hive, Kylin, Samza, etc.. –Supports any JDBC source 39
  • 40. 40 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Phoenix – Calcite Integration work status We are working separate phoenix branch “calcite” for calcite integration work. Making changes in calcite to support missing features like dynamic columns, ordered-set and hypothetical set aggregation functions etc. Need to fix some issues in Phoenix Calcite integration as well ~70% test cases are passing.
  • 41. 41 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda Versioning / Compatibility Changes in Behavior Major features in HBase 2.0 Phoenix 5.0 Recent Developments in Phoenix
  • 42. 42 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Local index hardening and performance improvements Region Server Region CLIENT 1.Write request prepare index updates Data cf Index cf 2.batch call Mem Store Me mSto re Index updates Data updates 4.Merge data and index updates 5.Write to MemStores WAL 6.Write to WAL 100% Atomic and consistent local updates with data updates
  • 43. 43 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Local index hardening and performance improvements - 4 node cluster - Tested with 5 local indexes on the base table of 25 columns with 10 regions. - Ingested 50M rows. - 3x faster upsert time comparing to global indexes - 5x less network RX/TX utilizations during write comparing to global indexes - Similar read performance comparing to global indexes with queries like aggregations, group by, limit etc.
  • 44. 44 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Column mapping and data encoding  Maps columns with numbers of variable size based on number of columns.  It helps to reduce disk space utilization ,makes easy alteration of column names and improve the performance.  Supporting a new encoding scheme for immutable tables that packs all the values into a single cell for column family.  Supported from Phoenix 4.10.0+ CREATE TABLE T ( a_string varchar not null, col1 integer CONSTRAINT pk PRIMARY KEY (a_string) ) COLUMN_ENCODED_BYTES = 1;
  • 45. 45 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Hive integration  Apache Phoenix supports Hive integration to access Phoenix Table data from HiveQL using Phoenix Storage Handler. create <external> table phoenix_table ( s1 string, i1 int, ) STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ( "phoenix.table.name" = "phoenix_table", "phoenix.zookeeper.quorum" = "localhost", "phoenix.zookeeper.znode.parent" = "/hbase", "phoenix.zookeeper.client.port" = "2181", "phoenix.rowkeys" = "s1", "phoenix.column.mapping" = "s1:s1, i1:i1", );
  • 46. 46 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Kapka Integration  Apache Kafka plugin helps to reliably and efficiently stream large amounts of data onto HBase using Phoenix API. As part of this plugin we are providing PhoenixConsumer to receive messages from KafkaProducer.
  • 47. 47 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Misc  Use Hbase snapshorts for MR Based queries and async index building – 4.11.0  Support for forward-only cursors – 4.11.0  Integration with spark 2.0 – 4.10.0  Support for HBase 1.3.1+ - 4.11.0  Server side upsert select – 4.10.0  Omid transactions Integration – IN PROGRESS
  • 48. 48 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Thank You Q & A?