Apache Phoenix: Use Cases and New Features
@ApachePhoenix
http://phoenix.apache.org
V5
Agenda
• Phoenix Use Cases
– Argus: Time-series data with Phoenix (Tom Valine, Salesforce.com)
– Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster
(Masayasu Suzuki, Sony)
– Phoenix & eHarmony, a perfect match (Vijay Vangapandu, eHarmony)
• What’s new in Phoenix
– ACID Transactions with Tephra (Poorna Chandra, Cask)
– Cost-based Query Optimization with Calcite (Maryann Xue, Intel)
• Q & A
–PhoenixCon tomorrow 9am-1pm @ Salesforce.com, 1 Market St, SF
Argus: Time-series data with Phoenix
Tom Valine
Salesforce.com
OpenTSDB Limitations
OpenTSDB is good, but we need more
•Tag Cardinality
– Total number of tags per metric is limited to 8
– Performance decreases drastically as tag values increase.
•UID Exhaustion
– Hard limit of 16M UIDs
•Ad hoc querying not possible
– Join to other data sources
– Joins of time series and events
– Simplification of Argus’ transform grammar
Phoenix-backed Argus TSDB Service
• 3 day hackathon
• Modeled metric as Phoenix VIEW
–Leverage ROW_TIMESTAMP optimization
•Tag values inlined in row key
–Uses SKIP_SCAN filter optimization
–Allows for secondary indexes on particular metric + tags
•Metric and tag names managed outside of data as metadata
•Eventually leverage Drillix (Phoenix + Drill)
–Cross cluster queries
–Joins to other data sources
Write Performance
•OpenTSDB - ~25M data points/min
•Phoenix - ~18M data points/min
Using 2 clients to write in parallel. Phoenix is using 10 writer threads per client
Read Performance
• Metrics with one tag (60 distinct values)
– OpenTSDB and Phoenix performance comparable for small aggregations
– Phoenix outperforms OpenTSDB as aggregation size increases
• Leverages statistics collection for query parallelization
Disk usage
• Phoenix & OTSDB use approximately the same amount of space with FAST_DIFF
and Snappy compression
Five major tips to maximize
performance on a 200+ SQL
HBase/Phoenix cluster
Masayasu “Mas” Suzuki
Shinji Nagasaka
Takanari Tamesue
Sony Corporation
Who we are, and why we chose HBase/Phoenix
• We are DevOps members from
Sony’s News Suite team
http://socialife.sony.net/
• HBase/Phoenix was chosen
because of
a. Scalability,
b. SQL compatibility, and
c. secondary indexing support
Our use case
Performance test apparatus & results
• Test apparatus
– Number of records: 1.2 billion (1 KB each)
– Number of indexes: 8 orthogonal indexes
– ZooKeeper: 3 nodes (ZooKeeper 3.4.5, m3.xlarge x 3)
– HMaster: 3 servers (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, m3.xlarge x 3)
– RegionServers: 200 (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, r3.xlarge x 199, c4.8xlarge x 1)
– Clients: 100 x c4.xlarge
• Test results
– Number of queries: 51,053 queries/sec
– Response time (average): 46 ms
Five major tips to maximize performance
using HBase/Phoenix
Ordered by effectiveness (most effective at the top)
1. Use SQL hint clause when using a secondary index
– An extra RPC is issued when the client runs a SQL statement that uses a secondary index
– Using a SQL hint clause can mitigate this
– From Ver. 4.7, changing “UPDATE_CACHE_FREQUENCY” may also work (we have yet to test this)
2. Use memory aggressively
– A memory-rich node should be selected for RegionServers so as to minimize disk access
3. Manually split Region files if possible, but never over-split them
4. Scale out instead of scaling up
5. Avoid running power-intensive tasks simultaneously
– As an example, running major compaction and index creation simultaneously should be avoided
Details will be presented at PhoenixCon tomorrow (May 25)
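Tip 1 relies on Phoenix's optimizer hint comment. The slides do not say which hint was used, so the following is only a minimal sketch assuming the documented INDEX hint. The helper merely builds the hinted SQL text, and the table, index and column names (NEWS_ITEMS, IDX_CATEGORY, ITEM_ID, TITLE, CATEGORY) are hypothetical, not taken from the Sony deployment.

```java
class IndexHintSketch {

    // Builds: SELECT /*+ INDEX(<table> <index>) */ <columns> FROM <table> WHERE <predicate>
    // The hint asks Phoenix to use the named secondary index for this statement.
    static String withIndexHint(String table, String index, String columns, String predicate) {
        return "SELECT /*+ INDEX(" + table + " " + index + ") */ " + columns
                + " FROM " + table + " WHERE " + predicate;
    }

    public static void main(String[] args) {
        // Hypothetical names; in real code the '?' would be bound through a JDBC PreparedStatement.
        System.out.println(withIndexHint("NEWS_ITEMS", "IDX_CATEGORY", "ITEM_ID, TITLE", "CATEGORY = ?"));
    }
}
```

The resulting string would normally be handed to a PreparedStatement obtained from a Phoenix JDBC connection; whether the hint removes the extra RPC in a given version is something to measure rather than assume.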
Vijay Vangapandu
Principal Platform Engineer
eHarmony and Phoenix a perfect match
NEED FOR
● Handling 30+ million events during a batch run
● Serving low-latency queries on 16+ billion records
● Response times: 75th percentile 800 ms, 95th percentile 2 s, 99th percentile 4 s
LAMBDA FOR THE SAVE
• Layered architecture provides fault tolerance
• HBase as batch storage for write throughput with reasonable read latency
• Apache Phoenix as query layer to work with complex queries with confidence
• Redis as speed layer cache
PERFORMANCE
[Chart: Get Matches API response times, annotated at the point where Phoenix/HBase goes live]
[Chart: Save Match API response times, annotated at the point where Phoenix/HBase goes live]
WHY HBASE AND PHOENIX
HBASE
• Highly consistent and fault tolerant
• Need for store-level filtering and sorting
APACHE PHOENIX
• Apache Phoenix helped us build an abstract, high-performance query layer on top of HBase.
• Eased the development process.
• Reduced boilerplate code, which improves maintainability.
• Build complex queries with confidence.
• Secondary indexes.
• JDBC connection.
• Good community support
JAVA ORM LIBRARY (PHO)
• Apache Phoenix helped us build PHO (Phoenix-HBase ORM)
• PHO lets you annotate your entity beans and provides interfaces to build DSL-like queries.
// OR together one equality restriction per requested status
Disjunction disjunction = new Disjunction();
for (int statusFilter : statusFilters) {
    disjunction.add(Restrictions.eq("status", statusFilter));
}
// Build a SELECT over FeedItemDto filtered by user, spotlight end date and status
QueryBuilder.builderFor(FeedItemDto.class).select()
    .add(Restrictions.eq("userId", userId))
    .add(Restrictions.gte("spotlightEnd", spotlightEndDate))
    .add(disjunction)
    .setReturnFields(projection)
    .addOrder(orderings)
    .setMaxResults(maxResults)
    .build();
http://eharmony.github.io/
OPEN SOURCE REPOSITORY
https://github.com/eHarmony/pho
http://www.eharmony.com/about/careers/
*Please Join us for more details at PhoenixCon tomorrow (May 25)
ACID Transactions with Tephra
Poorna Chandra
Cask
Why Transactions?
• All-or-none semantics simplify life for the
developer
– Ensures every client has a consistent view of data
–Protects against concurrent updates
– No need to reason about what state data is left in
if write fails
– Guaranteed consistency between data and index
Apache Tephra
• Transactions on HBase
– Across regions, tables and RPC calls
• ACID semantics
• Tephra Powers
– CDAP (Cask Data Application Platform)
– Apache Phoenix (4.7 onwards)
Apache Tephra Architecture
[Diagram components: ZooKeeper; Tx Manager (active) and Tx Manager (standby); HBase with Master 1, Master 2 and RegionServers RS 1 to RS 4; Client 1, Client 2, ... Client N]
Tephra Components
• TransactionAware client
• Coordinates transaction lifecycle with manager
• Communicates directly with HBase for reads and writes
• Transaction Manager
• Assigns transaction IDs
• Maintains state on in-progress, committed and invalid transactions
• Transaction Processor coprocessor
• Applies server-side filtering for reads
• Cleans up data from failed transactions, and no longer visible versions
Snapshot Isolation
•Multi-version concurrency control
–Cell version (timestamp) = transaction ID
–Reads exclude other uncommitted transactions
(for isolation)
•Optimistic Concurrency Control
–Avoids cost of locking rows and tables
–Good if conflicts are rare: short transaction,
disjoint partitioning of work
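To make the optimistic concurrency idea concrete, here is a minimal sketch of commit-time conflict detection. It is not Tephra's code: the class and method names are hypothetical, a single logical clock stands in for transaction IDs, and read visibility, the invalid-transaction list and rollback are all left out.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeMap;

class OccSketch {
    private long clock = 0;
    // commit time -> row keys written by the transaction that committed at that time
    private final TreeMap<Long, Set<String>> committed = new TreeMap<>();

    static final class Tx {
        final long startTime;
        final Set<String> writes = new HashSet<>();
        Tx(long startTime) { this.startTime = startTime; }
        void write(String rowKey) { writes.add(rowKey); }
    }

    // No locks are taken here; the transaction only records when it started.
    synchronized Tx begin() {
        return new Tx(++clock);
    }

    // Succeeds only if no transaction that committed after tx began wrote the same row.
    synchronized boolean commit(Tx tx) {
        for (Set<String> other : committed.tailMap(tx.startTime, false).values()) {
            for (String key : tx.writes) {
                if (other.contains(key)) {
                    return false; // write-write conflict: the caller must roll back
                }
            }
        }
        committed.put(++clock, new HashSet<>(tx.writes));
        return true;
    }

    public static void main(String[] args) {
        OccSketch manager = new OccSketch();
        Tx a = manager.begin();
        Tx b = manager.begin();
        a.write("row1");
        b.write("row1");
        System.out.println(manager.commit(a)); // true
        System.out.println(manager.commit(b)); // false
    }
}
```

Because conflicts are only checked at commit, short transactions over disjoint rows never wait on each other, which is the case the slide describes as a good fit.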
Performance
• Single client using 10 threads in parallel with 5K batch size
• No performance penalty for non-transactional tables
Future Work
• Partitioned Transaction Manager
• Automatic pruning of invalid transaction list
• Read-only transactions
• Performance optimizations
–Conflict detection
–Appends to transaction edit log
Cost-based Query Optimization with Calcite
Maryann Xue
Intel
Integration model (top to bottom of the stack)
• JDBC Client
• Calcite Parser & Validator (SQL + Phoenix-specific grammar)
• Calcite Query Optimizer (built-in rules + Phoenix-specific rules)
• Phoenix Query Plan Generator
• Phoenix Runtime
• Phoenix Tables over HBase
Cost-based query optimizer
with Apache Calcite
• Base all query optimization decisions on cost
– Filter push down; range scan vs. skip scan
– Hash aggregate vs. stream aggregate vs. partial stream aggregate
– Sort optimized out; sort/limit push through; fwd/rev/unordered scan
– Hash join vs. merge join; join ordering
– Use of data table vs. index table
– All of the above (and many others) COMBINED
• Query optimizations are modeled as pluggable rules
Beyond Phoenix 4.8
with Apache Calcite
• Get the missing SQL support
– WITH, UNNEST, Scalar subquery, etc.
• Materialized views
– To allow other forms of indices (maybe defined as external), e.g., a filter
view, a join view, or an aggregate view.
• Interop with other Calcite adaptors
– Already used by Drill, Hive, Kylin, Samza, etc.
– Supports any JDBC source
– Initial version of Drill-Phoenix integration already working
Query Example - no cost-based optimizer
select empid, e.name, d.deptno, d.name, location
from emps e join depts d on e.deptno = d.deptno
order by e.deptno
Plan produced by the Phoenix Compiler:
1. scan ‘depts’
2. send ‘depts’ over to RS & build hash-cache
3. scan ‘emps’ hash-join ‘depts’
4. sort joined table on ‘e.deptno’
Query Example - with cost-based optimizer
(sort optimization combined with join algorithm decision)
select empid, e.name, d.deptno, d.name, location
from emps e join depts d on e.deptno = d.deptno
order by e.deptno
Logical plan:
LogicalSort (key: deptno)
  LogicalProject (empid, e.name, d.deptno, d.name, location)
    LogicalJoin (inner, e.deptno = d.deptno)
      LogicalTableScan (emps)
      LogicalTableScan (depts)
Optimizer (optimization rules + Phoenix operator conversion rules) produces the physical plan:
PhoenixClientProject (empid, e.name, d.deptno, d.name, location)
  PhoenixMergeJoin (inner, e.deptno = d.deptno)
    PhoenixServerSort (key: deptno)
      PhoenixServerProject (empid, name, deptno)
        PhoenixTableScan (emps)
    PhoenixServerProject (deptno, name, location)
      PhoenixTableScan (depts)
Edge labels in the diagram (empid; deptno; e.deptno; d.deptno) appear to mark the row ordering between operators.
Query Example - with cost-based optimizer
(sort optimization combined with join algorithm decision)
Phoenix Implementor turns the physical plan (PhoenixClientProject over PhoenixMergeJoin, with PhoenixServerSort on the ‘emps’ side only) into these execution steps:
• scan ‘emps’, sort by ‘deptno’
• scan ‘depts’
• merge-join ‘emps’ and ‘depts’
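The merge-join step can be sketched in a few lines. This is an illustration only, not Phoenix's implementation: the Emp and Dept classes are hypothetical stand-ins for the example tables, both inputs are assumed to arrive sorted by deptno, and deptno is assumed to be unique in depts.

```java
import java.util.ArrayList;
import java.util.List;

class MergeJoinSketch {

    static final class Emp {
        final int empid;
        final int deptno;
        Emp(int empid, int deptno) { this.empid = empid; this.deptno = deptno; }
    }

    static final class Dept {
        final int deptno;
        final String location;
        Dept(int deptno, String location) { this.deptno = deptno; this.location = location; }
    }

    // Inner merge join: walks both sorted inputs once, with no hash cache to build or ship.
    static List<String> mergeJoin(List<Emp> emps, List<Dept> depts) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < emps.size() && j < depts.size()) {
            Emp e = emps.get(i);
            Dept d = depts.get(j);
            if (e.deptno < d.deptno) {
                i++; // no matching department for this employee
            } else if (e.deptno > d.deptno) {
                j++; // this department has no more employees
            } else {
                out.add(e.empid + "," + d.deptno + "," + d.location);
                i++; // keep d: the next employee may belong to the same department
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Emp> emps = new ArrayList<>();
        emps.add(new Emp(3, 20));
        emps.add(new Emp(1, 10));
        emps.add(new Emp(2, 10));
        // Only emps needs sorting; depts is assumed to be stored in deptno order already.
        emps.sort((a, b) -> Integer.compare(a.deptno, b.deptno));
        List<Dept> depts = new ArrayList<>();
        depts.add(new Dept(10, "SF"));
        depts.add(new Dept(20, "NY"));
        System.out.println(mergeJoin(emps, depts));
    }
}
```

The output is already ordered by deptno, so no sort of the joined result is needed afterwards.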
Query Example - Comparison
(w/o = query plan without cost-based optimizer; w/ = query plan with cost-based optimizer)
• Scan of ‘emps’, ‘depts’
– w/o: first ‘depts’, then ‘emps’
– w/: 2 tables in parallel
• Hash-cache send & build
– w/o: proportional to size of ‘depts’; might cause exception if too large
– w/: none
• Hash-cache look-up
– w/o: 1 look-up per ‘emps’ row
– w/: none
• Sorting
– w/o: sort ‘emps’ join ‘depts’
– w/: sort ‘emps’ only
• Optimization approach
– w/o: local, serial optimization processes
– w/: cost-based, rule-driven, integrated
• Performance (single node, 2M * 2K rows)
– w/o: 19.46 s
– w/: 13.92 s
Drillix: Interoperability with Drill
select deptno, sum(salary) from emps group by deptno
• Stage 1: Local partial aggregation
– Phoenix Table Scan (emps) on Phoenix Tables over HBase
– Phoenix Partial Aggregation (deptno, sum(salary))
• Stage 2: Shuffle partial results
– Drill Shuffle
• Stage 3: Final aggregation
– Drill Final Aggregation (deptno, sum(salary))
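The three stages amount to a two-phase aggregation, which the following sketch imitates in plain Java. It does not use Drill or Phoenix APIs: the class and method names are hypothetical, each partition is an in-memory array of {deptno, salary} rows, and the shuffle is reduced to collecting the partial maps in one list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TwoPhaseAggSketch {

    // Stage 1: each partition sums salary per deptno locally. Rows are {deptno, salary}.
    static Map<Integer, Long> partialAggregate(int[][] rows) {
        Map<Integer, Long> partial = new HashMap<>();
        for (int[] row : rows) {
            partial.merge(row[0], (long) row[1], Long::sum);
        }
        return partial;
    }

    // Stages 2 and 3: the partial results are brought together and summed once more.
    static Map<Integer, Long> finalAggregate(List<Map<Integer, Long>> partials) {
        Map<Integer, Long> result = new TreeMap<>();
        for (Map<Integer, Long> partial : partials) {
            partial.forEach((deptno, sum) -> result.merge(deptno, sum, Long::sum));
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] partitionA = {{10, 100}, {20, 50}, {10, 25}};
        int[][] partitionB = {{10, 5}, {30, 70}};
        List<Map<Integer, Long>> partials = new ArrayList<>();
        partials.add(partialAggregate(partitionA));
        partials.add(partialAggregate(partitionB));
        System.out.println(finalAggregate(partials));
    }
}
```

Only one partial row per deptno leaves each partition, so the shuffle moves far less data than shipping every emps row would.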
Thank you! Questions?
Join us tomorrow for PhoenixCon
Salesforce.com, 1 Market St, SF 9am-1pm
(some companies using Phoenix)

More Related Content

What's hot

HBaseCon 2013: Full-Text Indexing for Apache HBase
HBaseCon 2013: Full-Text Indexing for Apache HBaseHBaseCon 2013: Full-Text Indexing for Apache HBase
HBaseCon 2013: Full-Text Indexing for Apache HBaseCloudera, Inc.
 
Apache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixApache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixNick Dimiduk
 
Apache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL DatabaseApache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL DatabaseDataWorks Summit
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0enissoz
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the unionenissoz
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixRajeshbabu Chintaguntla
 
Mapreduce over snapshots
Mapreduce over snapshotsMapreduce over snapshots
Mapreduce over snapshotsenissoz
 
Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query ServerJosh Elser
 
A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoYu Liu
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Josh Elser
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicasenissoz
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseNick Dimiduk
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloudgluent.
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandJosh Elser
 
Dancing with the elephant h base1_final
Dancing with the elephant   h base1_finalDancing with the elephant   h base1_final
Dancing with the elephant h base1_finalasterix_smartplatf
 

What's hot (20)

Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 
HBaseCon 2013: Full-Text Indexing for Apache HBase
HBaseCon 2013: Full-Text Indexing for Apache HBaseHBaseCon 2013: Full-Text Indexing for Apache HBase
HBaseCon 2013: Full-Text Indexing for Apache HBase
 
Apache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - PhoenixApache Big Data EU 2015 - Phoenix
Apache Big Data EU 2015 - Phoenix
 
Apache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL DatabaseApache Phoenix: Transforming HBase into a SQL Database
Apache Phoenix: Transforming HBase into a SQL Database
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache Phoenix
 
Mapreduce over snapshots
Mapreduce over snapshotsMapreduce over snapshots
Mapreduce over snapshots
 
Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query Server
 
A TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with PrestoA TPC Benchmark of Hive LLAP and Comparison with Presto
A TPC Benchmark of Hive LLAP and Comparison with Presto
 
Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016Apache Phoenix Query Server PhoenixCon2016
Apache Phoenix Query Server PhoenixCon2016
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBase
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
 
Dancing with the elephant h base1_final
Dancing with the elephant   h base1_finalDancing with the elephant   h base1_final
Dancing with the elephant h base1_final
 

Viewers also liked

Vaikundarajan Reviews Vijay’s Upcoming Film Their
Vaikundarajan Reviews Vijay’s Upcoming Film TheirVaikundarajan Reviews Vijay’s Upcoming Film Their
Vaikundarajan Reviews Vijay’s Upcoming Film TheirVaikundarajan S
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impalahuguk
 
Apache ranger meetup
Apache ranger meetupApache ranger meetup
Apache ranger meetupnvvrajesh
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoopnvvrajesh
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon
 
HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg
HBaseCon 2015 General Session: The Evolution of HBase @ BloombergHBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg
HBaseCon 2015 General Session: The Evolution of HBase @ BloombergHBaseCon
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory HBaseCon
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...Cloudera, Inc.
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path HBaseCon
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseHBaseCon
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiHBaseCon
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon
 
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon
 

Viewers also liked (20)

Vaikundarajan Reviews Vijay’s Upcoming Film Their
Vaikundarajan Reviews Vijay’s Upcoming Film TheirVaikundarajan Reviews Vijay’s Upcoming Film Their
Vaikundarajan Reviews Vijay’s Upcoming Film Their
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Apache ranger meetup
Apache ranger meetupApache ranger meetup
Apache ranger meetup
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
Decision trees in hadoop
Decision trees in hadoopDecision trees in hadoop
Decision trees in hadoop
 
HBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBaseHBaseCon 2015 General Session: State of HBase
HBaseCon 2015 General Session: State of HBase
 
HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg
HBaseCon 2015 General Session: The Evolution of HBase @ BloombergHBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg
HBaseCon 2015 General Session: The Evolution of HBase @ Bloomberg
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0
 
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTraceHBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
HBaseCon 2015: Solving HBase Performance Problems with Apache HTrace
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
 
Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory Breaking the Sound Barrier with Persistent Memory
Breaking the Sound Barrier with Persistent Memory
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBase
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at Xiaomi
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
 
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBaseHBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
HBaseCon 2015 General Session: Zen - A Graph Data Model on HBase
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond Panel
 
Giraph+Gora in ApacheCon14
Giraph+Gora in ApacheCon14Giraph+Gora in ApacheCon14
Giraph+Gora in ApacheCon14
 

Similar to eHarmony @ Hbase Conference 2016 by vijay vangapandu.

HBaseCon2015-final
HBaseCon2015-finalHBaseCon2015-final
HBaseCon2015-finalMaryann Xue
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at AlibabaMichael Stack
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impalamarkgrover
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptxNParakh1
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesAmazon Web Services
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestHBaseCon
 
SQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopSQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopMukund Babbar
 
Stream processing on mobile networks
Stream processing on mobile networksStream processing on mobile networks
Stream processing on mobile networkspbelko82
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Architectural Evolution Starting from Hadoop
Architectural Evolution Starting from HadoopArchitectural Evolution Starting from Hadoop
Architectural Evolution Starting from HadoopSpagoWorld
 
Real time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackReal time fraud detection at 1+M scale on hadoop stack
Real time fraud detection at 1+M scale on hadoop stackDataWorks Summit/Hadoop Summit
 
Apache conbigdata2015 christiantzolov-federated sql on hadoop and beyond- lev...
Apache conbigdata2015 christiantzolov-federated sql on hadoop and beyond- lev...Apache conbigdata2015 christiantzolov-federated sql on hadoop and beyond- lev...
Apache conbigdata2015 christiantzolov-federated sql on hadoop and beyond- lev...Christian Tzolov
 
Streaming Solutions for Real time problems
Streaming Solutions for Real time problemsStreaming Solutions for Real time problems
Streaming Solutions for Real time problemsAbhishek Gupta
 
Etu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和SparkEtu Solution Day 2014 Track-D: 掌握Impala和Spark
Etu Solution Day 2014 Track-D: 掌握Impala和SparkJames Chen
 
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and SparkHBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and SparkMichael Stack
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014cdmaxime
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Chris Fregly
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...ssuserd3a367
 

Similar to eHarmony @ Hbase Conference 2016 by vijay vangapandu. (20)

HBaseCon2015-final
HBaseCon2015-finalHBaseCon2015-final
HBaseCon2015-final
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impala
 
messaging.pptx
messaging.pptxmessaging.pptx
messaging.pptx
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series



eHarmony @ HBase Conference 2016 by Vijay Vangapandu.

  • 1. Use Cases and New Features @ApachePhoenix http://phoenix.apache.org V5
  • 2. Agenda • Phoenix Use Cases – Argus: Time-series data with Phoenix (Tom Valine, Salesforce.com) – Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster (Masayasu Suzuki, Sony) – Phoenix & eHarmony, a perfect match (Vijay Vangapandu, eHarmony) • What’s new in Phoenix – ACID Transactions with Tephra (Poorna Chandra, Cask) – Cost-based Query Optimization with Calcite (Maryann Xue, Intel) • Q & A –PhoenixCon tomorrow 9am-1pm @ Salesforce.com, 1 Market St, SF
  • 3. Argus: Time-series data with Phoenix Tom Valine Salesforce.com
  • 4. OpenTSDB Limitations OpenTSDB is good, but we need more •Tag Cardinality – Total number of tags per metric is limited to 8 – Performance decreases drastically as tag values increase. •UID Exhaustion – Hard limit of 16M UIDs •Ad hoc querying not possible – Join to other data sources – Joins of time series and events – Simplification of Argus’ transform grammar
  • 5. Phoenix-backed Argus TSDB Service • 3 day hackathon • Modeled metric as Phoenix VIEW –Leverage ROW_TIMESTAMP optimization •Tag values inlined in row key –Uses SKIP_SCAN filter optimization –Allows for secondary indexes on particular metric + tags •Metric and tag names managed outside of data as metadata •Eventually leverage Drillix (Phoenix + Drill) –Cross cluster queries –Joins to other data sources
  • 6. Write Performance •OpenTSDB - ~25M data points/min •Phoenix - ~18M data points/min Using 2 clients to write in parallel. Phoenix is using 10 writer threads per client
  • 7. Read Performance • Metrics with one tag (60 distinct values) – OpenTSDB and Phoenix performance comparable for small aggregations – Phoenix outperforms OpenTSDB as aggregation size increases • Leverages statistics collection for query parallelization
  • 8. Disk usage • Phoenix & OpenTSDB use approximately the same amount of space with FAST_DIFF and Snappy compression
  • 9. Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster Masayasu “Mas” Suzuki Shinji Nagasaka Takanari Tamesue Sony Corporation
  • 10. Who we are, and why we chose HBase/Phoenix • We are DevOps members from Sony’s News Suite team http://socialife.sony.net/ • HBase/Phoenix was chosen because of a. Scalability, b. SQL compatibility, and c. secondary indexing support
  • 12. Performance test apparatus & results
      Test apparatus (specs):
        – Number of records: 1.2 billion records (1 KB each)
        – Number of indexes: 8 orthogonal indexes
        – Servers: 3 ZooKeeper nodes (ZooKeeper 3.4.5, m3.xlarge x 3); 3 HMaster servers (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, m3.xlarge x 3); 200 RegionServers (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, r3.xlarge x 199, c4.8xlarge x 1)
        – Clients: 100 x c4.xlarge
      Test results:
        – Number of queries: 51,053 queries/sec
        – Response time (average): 46 ms
  • 13. Five major tips to maximize performance using HBase/Phoenix, ordered by effectiveness (most effective first)
      1. Use SQL hint clause when using a secondary index
        – An extra RPC is issued when the client runs a SQL statement that uses a secondary index
        – Using a SQL hint clause can mitigate this
        – From Ver. 4.7, changing “UPDATE_CACHE_FREQUENCY” may also work (we have yet to test this)
      2. Use memory aggressively
        – A memory-rich node should be selected for RegionServers so as to minimize disk access
      3. Manually split Region files if possible, but never over-split them
      4. Scale out instead of scaling up
      5. Avoid running power-intensive tasks simultaneously
        – As an example, running major compaction and index creation simultaneously should be avoided
      Details will be presented at PhoenixCon tomorrow (May 25)
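Tip 1 can be illustrated with a small sketch. The slide does not say which hint the Sony team used; the example below assumes Phoenix's `/*+ INDEX(table index) */` hint, and the table, index, and column names are hypothetical.

```java
import java.util.List;

public class IndexHintExample {

    // Builds "SELECT <hint> col1, col2 FROM table WHERE filterColumn = ?".
    // The hint asks Phoenix to use the named secondary index.
    // All identifiers passed in by the caller are illustrative, not from the talk.
    public static String hintedSelect(String table, String index,
                                      List<String> columns, String filterColumn) {
        String hint = "/*+ INDEX(" + table + " " + index + ") */";
        return "SELECT " + hint + " " + String.join(", ", columns)
                + " FROM " + table
                + " WHERE " + filterColumn + " = ?";
    }

    public static void main(String[] args) {
        String sql = hintedSelect("NEWS_ITEMS", "IDX_NEWS_CATEGORY",
                List.of("ITEM_ID", "TITLE"), "CATEGORY");
        System.out.println(sql);
        // With a Phoenix JDBC connection, this string would be passed to
        // connection.prepareStatement(sql) and the CATEGORY value bound as parameter 1.
    }
}
```

Whether the hint actually removes the extra RPC described on the slide depends on the Phoenix version and schema, so it is worth confirming with EXPLAIN on a real cluster.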
  • 15. eHarmony and Phoenix a perfect match NEED FOR ● Handling 30+ million events during batch run ● Serving low-latency queries on 16+ billion records (75th percentile: 800 ms; 95th percentile: 2 s; 99th percentile: 4 s)
  • 16. eHarmony and Phoenix a perfect match LAMBDA FOR THE SAVE • Layered architecture provides fault tolerance • HBase as batch storage for write throughput with reasonable read latency • Apache Phoenix as query layer to work with complex queries with confidence • Redis as speed layer cache
  • 17. eHarmony and Phoenix a perfect match PERFORMANCE [Charts: Get Matches API response times and Save Match API response times, each marked at the point where Phoenix/HBase goes live]
  • 18. eHarmony and Phoenix a perfect match WHY HBASE AND PHOENIX
      HBASE
        • Highly consistent and fault tolerant
        • Need for store-level filtering and sorting
      APACHE PHOENIX
        • Apache Phoenix helped us build an abstract, high-performance query layer on top of HBase.
        • Eased the development process.
        • Reduced boilerplate code, which improves maintainability.
        • Build complex queries with confidence.
        • Secondary indexes.
        • JDBC connection.
        • Good community support
  • 19. eHarmony and Phoenix a perfect match JAVA ORM LIBRARY (PHO) • Apache Phoenix helped us build PHO (Phoenix-HBase ORM) • PHO provides the ability to annotate your entity bean and provides interfaces to build DSL-like queries.
      Disjunction disjunction = new Disjunction();
      for (int statusFilter : statusFilters) {
          disjunction.add(Restrictions.eq("status", statusFilter));
      }
      QueryBuilder.builderFor(FeedItemDto.class).select()
          .add(Restrictions.eq("userId", userId))
          .add(Restrictions.gte("spotlightEnd", spotlightEndDate))
          .add(disjunction)
          .setReturnFields(projection)
          .addOrder(orderings)
          .setMaxResults(maxResults)
          .build();
  • 20. eHarmony and Phoenix a perfect match OPEN SOURCE REPOSITORY: https://github.com/eHarmony/pho • http://eharmony.github.io/ • http://www.eharmony.com/about/careers/ • Please join us for more details at PhoenixCon tomorrow (May 25)
  • 22. Why Transactions? • All or none semantics simplifies life of developer – Ensures every client has a consistent view of data –Protects against concurrent updates – No need to reason about what state data is left in if write fails – Guaranteed consistency between data and index
  • 23. Apache Tephra • Transactions on HBase – Across regions, tables and RPC calls • ACID semantics • Tephra Powers – CDAP (Cask Data Application Platform) – Apache Phoenix (4.7 onwards)
  • 24. Apache Tephra Architecture [Diagram components: Client 1, Client 2 ... Client N; Tx Manager (active) and Tx Manager (standby); ZooKeeper; HBase with Master 1, Master 2 and RegionServers RS 1 to RS 4]
  • 25. Tephra Components • TransactionAware client • Coordinates transaction lifecycle with manager • Communicates directly with HBase for reads and writes • Transaction Manager • Assigns transaction IDs • Maintains state on in-progress, committed and invalid transactions • Transaction Processor coprocessor • Applies server-side filtering for reads • Cleans up data from failed transactions, and no longer visible versions
  • 26. Snapshot Isolation •Multi-version concurrency control –Cell version (timestamp) = transaction ID –Reads exclude other uncommitted transactions (for isolation) •Optimistic Concurrency Control –Avoids cost of locking rows and tables –Good if conflicts are rare: short transaction, disjoint partitioning of work
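The optimistic scheme on this slide can be sketched in a few lines. This is a simplified in-memory model, not Tephra's API or implementation: the class and method names are hypothetical, a single logical clock stands in for transaction IDs, and a transaction aborts at commit time if any row it wrote was committed by another transaction after it started.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OptimisticTxSketch {
    private long clock = 0;                                        // logical time source
    private final Map<String, Long> lastCommit = new HashMap<>();  // row key -> commit time of last writer

    public static final class Tx {
        final long startTime;                                      // snapshot point of this transaction
        final Set<String> writeSet = new HashSet<>();              // rows this transaction intends to change
        Tx(long startTime) { this.startTime = startTime; }
        public void write(String rowKey) { writeSet.add(rowKey); }
    }

    // No locks are taken here; the transaction only records when it started.
    public synchronized Tx begin() {
        return new Tx(++clock);
    }

    // Conflict detection happens only at commit: abort if another transaction
    // committed a write to one of our rows after our snapshot was taken.
    public synchronized boolean commit(Tx tx) {
        for (String key : tx.writeSet) {
            Long committedAt = lastCommit.get(key);
            if (committedAt != null && committedAt > tx.startTime) {
                return false;                                      // write-write conflict, caller must retry
            }
        }
        long commitTime = ++clock;
        for (String key : tx.writeSet) {
            lastCommit.put(key, commitTime);
        }
        return true;
    }

    public static void main(String[] args) {
        OptimisticTxSketch manager = new OptimisticTxSketch();
        Tx a = manager.begin();
        Tx b = manager.begin();
        a.write("row1");
        b.write("row1");
        System.out.println("a commits: " + manager.commit(a));
        System.out.println("b commits: " + manager.commit(b));
    }
}
```

In `main`, the two overlapping transactions touch the same row, so the first commit succeeds and the second is rejected. Transactions working on disjoint rows never conflict, which is why the slide calls the approach a good fit for short transactions and disjoint partitioning of work.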
  • 27. Performance • Single client using 10 threads in parallel with 5K batch size • No performance penalty for non-transactional tables
  • 28. Future Work • Partitioned Transaction Manager • Automatic pruning of invalid transaction list • Read-only transactions • Performance optimizations –Conflict detection –Appends to transaction edit log
  • 30. Integration model: JDBC Client → Calcite Parser & Validator (SQL + Phoenix specific grammar) → Calcite Query Optimizer (built-in rules + Phoenix specific rules) → Phoenix Query Plan Generator → Phoenix Runtime → Phoenix Tables over HBase
  • 31. Cost-based query optimizer with Apache Calcite • Base all query optimization decisions on cost – Filter push down; range scan vs. skip scan – Hash aggregate vs. stream aggregate vs. partial stream aggregate – Sort optimized out; sort/limit push through; fwd/rev/unordered scan – Hash join vs. merge join; join ordering – Use of data table vs. index table – All above (and many others) COMBINED • Query optimizations are modeled as pluggable rules
  • 32. Beyond Phoenix 4.8 with Apache Calcite • Get the missing SQL support – WITH, UNNEST, Scalar subquery, etc. • Materialized views – To allow other forms of indices (maybe defined as external), e.g., a filter view, a join view, or an aggregate view. • Interop with other Calcite adaptors – Already used by Drill, Hive, Kylin, Samza, etc. – Supports any JDBC source – Initial version of Drill-Phoenix integration already working
  • 33. Query Example - no cost-based optimizer
      Query: select empid, e.name, d.deptno, d.name, location from emps e, depts d using deptno order by e.deptno
      Plan produced by the Phoenix Compiler:
        – scan ‘depts’
        – send ‘depts’ over to RS & build hash-cache
        – scan ‘emps’, hash-join ‘depts’
        – sort joined table on ‘e.deptno’
  • 34. Query Example - with cost-based optimizer (sort optimization combined with join algorithm decision)
      Query: select empid, e.name, d.deptno, d.name, location from emps e, depts d using deptno order by e.deptno
      Logical plan operators: LogicalSort (key: deptno); LogicalProject (empid, e.name, d.deptno, d.name, location); LogicalJoin (inner, e.deptno = d.deptno); LogicalTableScan emps; LogicalTableScan depts
      Optimizer: optimization rules + Phoenix operator conversion rules
      Physical plan operators: PhoenixClientProject (empid, e.name, d.deptno, d.name, location); PhoenixMergeJoin (inner, e.deptno = d.deptno); PhoenixServerSort (key: deptno); PhoenixServerProject (empid, name, deptno); PhoenixTableScan emps; PhoenixServerProject (deptno, name, location); PhoenixTableScan depts
      [Diagram: edges between operators are labeled with column names such as empid and deptno]
  • 35. Query Example - with cost-based optimizer (sort optimization combined with join algorithm decision)
      Physical plan operators: PhoenixClientProject (empid, e.name, d.deptno, d.name, location); PhoenixMergeJoin (inner, e.deptno = d.deptno); PhoenixServerSort (key: deptno); PhoenixServerProject (empid, name, deptno); PhoenixTableScan emps; PhoenixServerProject (deptno, name, location); PhoenixTableScan depts
      Runtime steps produced by the Phoenix Implementor:
        – scan ‘emps’
        – sort by ‘deptno’
        – scan ‘depts’
        – merge-join ‘emps’ and ‘depts’
  • 36. Query Example - Comparison (query plan w/o cost-based optimizer vs. query plan w/ cost-based optimizer)
        – scan ‘emps’, ‘depts’: w/o: first ‘depts’, then ‘emps’; w/: 2 tables in parallel
        – hash-cache send & build: w/o: proportional to size of ‘depts’, might cause exception if too large; w/: none
        – hash-cache look-up: w/o: 1 look-up per ‘emps’ row; w/: none
        – sorting: w/o: sort ‘emps’ join ‘depts’; w/: sort ‘emps’ only
        – optimization approach: w/o: local, serial optimization processes; w/: cost-based, rule-driven, integrated
        – performance (single node, 2M * 2K rows): w/o: 19.46 s; w/: 13.92 s
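The two plans compared above can be mimicked on in-memory lists. This is a toy sketch, not Phoenix code: the class names are hypothetical, `deptno` is assumed to be unique in `depts`, and `depts` is assumed to arrive already ordered by `deptno` (as it would if `deptno` were its primary key).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {
    public static final class Emp {
        final int empid; final int deptno;
        public Emp(int empid, int deptno) { this.empid = empid; this.deptno = deptno; }
    }
    public static final class Dept {
        final int deptno; final String name;
        public Dept(int deptno, String name) { this.deptno = deptno; this.name = name; }
    }

    private static String row(Emp e, Dept d) {
        return d.deptno + ":" + e.empid + ":" + d.name;
    }

    // Plan without the optimizer: build a hash cache on depts, probe it once per
    // emps row, then sort the matched rows by deptno.
    public static List<String> hashJoinThenSort(List<Emp> emps, List<Dept> depts) {
        Map<Integer, Dept> cache = new HashMap<>();
        for (Dept d : depts) cache.put(d.deptno, d);
        List<Emp> matched = new ArrayList<>();
        for (Emp e : emps) if (cache.containsKey(e.deptno)) matched.add(e);
        matched.sort(Comparator.comparingInt((Emp e) -> e.deptno));
        List<String> out = new ArrayList<>();
        for (Emp e : matched) out.add(row(e, cache.get(e.deptno)));
        return out;
    }

    // Plan with the optimizer: sort emps only, then merge the two ordered inputs.
    // No hash cache is built and the output is already ordered by deptno.
    public static List<String> sortThenMergeJoin(List<Emp> emps, List<Dept> deptsSortedByDeptno) {
        List<Emp> sorted = new ArrayList<>(emps);
        sorted.sort(Comparator.comparingInt((Emp e) -> e.deptno));
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < sorted.size() && j < deptsSortedByDeptno.size()) {
            Emp e = sorted.get(i);
            Dept d = deptsSortedByDeptno.get(j);
            if (e.deptno < d.deptno) i++;          // emp has no matching dept
            else if (e.deptno > d.deptno) j++;     // move on to the next dept
            else { out.add(row(e, d)); i++; }      // match; keep dept for further emps
        }
        return out;
    }

    public static void main(String[] args) {
        List<Emp> emps = List.of(new Emp(100, 20), new Emp(101, 10), new Emp(102, 20), new Emp(103, 30));
        List<Dept> depts = List.of(new Dept(10, "Sales"), new Dept(20, "Eng"), new Dept(40, "Ops"));
        System.out.println(hashJoinThenSort(emps, depts));
        System.out.println(sortThenMergeJoin(emps, depts));
    }
}
```

Both methods return the same rows in the same order. The difference is in the work done: the hash variant holds all of `depts` in memory and sorts after joining, while the merge variant sorts only `emps` before joining.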
  • 37. Drillix: Interoperability with Drill
      Query: select deptno, sum(salary) from emps group by deptno
        – Stage 1: Local partial aggregation: Phoenix Table Scan (emps) feeding Phoenix Partial Aggregation (deptno, sum(salary)), over Phoenix Tables over HBase
        – Stage 2: Shuffle partial results: Drill Shuffle
        – Stage 3: Final aggregation: Drill Final Aggregation (deptno, sum(salary))
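The three stages can be sketched on plain collections. This is only an illustration of the partial/final aggregation pattern, not Drill or Phoenix code: each inner list stands for the rows scanned by one Phoenix-side partition, every row is a hypothetical `{deptno, salary}` pair, and the shuffle is reduced to handing all partial maps to one final step.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartialAggregationSketch {

    // Stage 1: each partition sums salaries per deptno over its own rows only.
    public static Map<Integer, Long> partialSum(List<int[]> rows) {
        Map<Integer, Long> sums = new TreeMap<>();
        for (int[] row : rows) {
            sums.merge(row[0], (long) row[1], Long::sum);
        }
        return sums;
    }

    // Stages 2 and 3: the partial results are brought together and the
    // per-partition sums for the same deptno are added up.
    public static Map<Integer, Long> finalSum(List<Map<Integer, Long>> partials) {
        Map<Integer, Long> totals = new TreeMap<>();
        for (Map<Integer, Long> partial : partials) {
            partial.forEach((deptno, sum) -> totals.merge(deptno, sum, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<int[]> partition1 = List.of(new int[]{10, 100}, new int[]{20, 200}, new int[]{10, 50});
        List<int[]> partition2 = List.of(new int[]{20, 300}, new int[]{30, 10});
        Map<Integer, Long> totals = finalSum(List.of(partialSum(partition1), partialSum(partition2)));
        System.out.println(totals);
    }
}
```

Because a sum of sums equals the overall sum, only one small map per partition has to cross the network rather than every scanned row.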
  • 38. Thank you! Questions? Join us tomorrow for PhoenixCon: Salesforce.com, 1 Market St, SF, 9am-1pm (some companies using Phoenix)