Apache Phoenix: Use Cases and New Features

James Taylor (Salesforce) and Maryann Xue (Intel)

This talk will be broken into two parts: Phoenix use cases and new Phoenix features. Three use cases will be presented as lightning talks by individuals from 1) Sony about its social media NewsSuite app, 2) eHarmony on its matching service, and 3) Salesforce.com on its time-series metrics engine. Two new features will be discussed in detail by the engineers who developed them: ACID transactions in Phoenix through Apache Tephra, and cost-based query optimization through Apache Calcite. The focus will be on helping end users more easily develop scalable applications on top of Phoenix.

  1. 1. Use Cases and New Features @ApachePhoenix http://phoenix.apache.org V5
  2. 2. Agenda • Phoenix Use Cases – Argus: Time-series data with Phoenix (Tom Valine, Salesforce.com) – Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster (Masayasu Suzuki, Sony) – Phoenix & eHarmony, a perfect match (Vijay Vangapandu, eHarmony) • What’s new in Phoenix – ACID Transactions with Tephra (Poorna Chandra, Cask) – Cost-based Query Optimization with Calcite (Maryann Xue, Intel) • Q & A – PhoenixCon tomorrow 9am-1pm @ Salesforce.com, 1 Market St, SF
  3. 3. Argus: Time-series data with Phoenix Tom Valine Salesforce.com
  4. 4. OpenTSDB Limitations OpenTSDB is good, but we need more • Tag Cardinality – Total number of tags per metric is limited to 8 – Performance decreases drastically as tag values increase • UID Exhaustion – Hard limit of 16M UIDs • Ad hoc querying not possible – Join to other data sources – Joins of time series and events – Simplification of Argus’ transform grammar
  5. 5. Phoenix-backed Argus TSDB Service • 3 day hackathon • Modeled metric as Phoenix VIEW – Leverage ROW_TIMESTAMP optimization • Tag values inlined in row key – Uses SKIP_SCAN filter optimization – Allows for secondary indexes on particular metric + tags • Metric and tag names managed outside of data as metadata • Eventually leverage Drillix (Phoenix + Drill) – Cross cluster queries – Joins to other data sources
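     As a rough illustration of the modeling approach on the slide above (a sketch only; table, view, and column names are invented, not Argus’ actual schema), the Java/JDBC snippet below creates a Phoenix table whose row key inlines the metric id and tag values, declares the time column as ROW_TIMESTAMP, and exposes one metric as a Phoenix VIEW with a secondary index:

     import java.sql.Connection;
     import java.sql.DriverManager;
     import java.sql.Statement;

     public class MetricsSchemaSketch {
         public static void main(String[] args) throws Exception {
             // "localhost" stands in for the HBase cluster's ZooKeeper quorum.
             try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                  Statement stmt = conn.createStatement()) {
                 // Metric id and inlined tag values lead the row key, which is what
                 // lets Phoenix apply the SKIP_SCAN filter over tag combinations;
                 // the DATE column is declared ROW_TIMESTAMP so scans can be pruned
                 // by time range via the HBase cell timestamp.
                 stmt.execute(
                     "CREATE TABLE IF NOT EXISTS METRICS (" +
                     "  METRIC_ID BIGINT NOT NULL," +
                     "  TAGS VARCHAR NOT NULL," +       // e.g. "host=web1,dc=sfo"
                     "  TS DATE NOT NULL," +
                     "  VAL DOUBLE" +
                     "  CONSTRAINT PK PRIMARY KEY (METRIC_ID, TAGS, TS ROW_TIMESTAMP))");
                 // One Phoenix VIEW per metric; metric and tag *names* live in
                 // separate metadata, only their encoded values live in the data.
                 stmt.execute(
                     "CREATE VIEW IF NOT EXISTS CPU_USER AS " +
                     "SELECT * FROM METRICS WHERE METRIC_ID = 42");
                 // Secondary index on a particular metric + tag combination.
                 stmt.execute(
                     "CREATE INDEX IF NOT EXISTS CPU_USER_IDX " +
                     "ON CPU_USER (TAGS, TS) INCLUDE (VAL)");
             }
         }
     }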
  6. 6. Write Performance – 2 clients writing in parallel, Phoenix using 10 writer threads per client
  7. 7. Read Performance • Metrics with one tag (60 distinct values) – OpenTSDB and Phoenix performance comparable for small aggregations – Phoenix outperforms OpenTSDB as aggregation size increases
  8. 8. Disk usage • Phoenix & OTSDB use approximately the same amount of space with FAST_DIFF and Snappy compression
  9. 9. Five major tips to maximize performance on a 200+ SQL HBase/Phoenix cluster Masayasu “Mas” Suzuki Shinji Nagasaka Takanari Tamesue Sony Corporation
  10. 10. Who we are, and why we chose HBase/Phoenix • We are DevOps members from Sony’s News Suite team http://socialife.sony.net/ • HBase/Phoenix was chosen because of a. Scalability, b. SQL compatibility, and c. secondary indexing support
  11. 11. Our use case
  12. 12. Performance test apparatus & results • Test apparatus – Number of records: 1.2 billion records (1 KB each) – Number of indexes: 8 orthogonal indexes – Servers: 3 ZooKeepers (ZooKeeper 3.4.5, m3.xlarge x 3), 3 HMaster servers (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, m3.xlarge x 3), 200 RegionServers (Hadoop 2.5.0, HBase 0.98.6, Phoenix 4.3.0, r3.xlarge x 199, c4.8xlarge x 1) – Clients: 100 x c4.xlarge • Test results – Number of queries: 51,053 queries/sec – Response time (average): 46 ms
  13. 13. Five major tips to maximize performance using HBase/Phoenix Ordered by effectiveness (most effective on the very top) 1. Use SQL hint clause when using a secondary index – An extra RPC is issued when the client runs a SQL statement that uses a secondary index – Using a SQL hint clause can mitigate this (see the sketch below) – From Ver. 4.7, changing “UPDATE_CACHE_FREQUENCY” may also work (we have yet to test this) 2. Use memory aggressively – A memory-rich node should be selected for use in RegionServers so as to minimize disk access 3. Manually split Region files if possible but never over split them 4. Scale out instead of scale up 5. Avoid running power-intensive tasks simultaneously – As an example, running major compaction and index creation simultaneously should be avoided Details will be presented at PhoenixCon tomorrow (May 25)
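     A minimal sketch of tip 1 over JDBC (table, index, and column names are illustrative, not the News Suite schema): Phoenix's INDEX hint names the data table and the secondary index to use, which, per the tip above, avoids the extra RPC otherwise issued when a statement resolves a secondary index.

     import java.sql.Connection;
     import java.sql.DriverManager;
     import java.sql.PreparedStatement;
     import java.sql.ResultSet;

     public class IndexHintSketch {
         public static void main(String[] args) throws Exception {
             try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                 // The hint takes the data table name followed by the index name.
                 String sql = "SELECT /*+ INDEX(ARTICLES ARTICLES_BY_USER_IDX) */ " +
                              "ARTICLE_ID, TITLE FROM ARTICLES WHERE USER_ID = ?";
                 try (PreparedStatement ps = conn.prepareStatement(sql)) {
                     ps.setLong(1, 12345L);
                     try (ResultSet rs = ps.executeQuery()) {
                         while (rs.next()) {
                             System.out.println(rs.getLong(1) + "\t" + rs.getString(2));
                         }
                     }
                 }
             }
         }
     }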
  14. 14. Vijay Vangapandu Principal Platform Engineer
  15. 15. eHarmony and Phoenix a perfect match NEED FOR ● Handling 30+ million events during batch run ● Serving low-latency queries on 16+ billion records (75th%: 800 ms, 95th%: 2 sec, 99th%: 4 sec)
  16. 16. eHarmony and Phoenix a perfect match LAMBDA FOR THE SAVE • Layered architecture provides fault tolerance • HBase as batch storage for write throughput with reasonable read latency • Apache Phoenix as query layer to work with complex queries with confidence • Redis as speed layer cache
  17. 17. eHarmony and Phoenix a perfect match PERFORMANCE (charts: Get Matches API response times and Save Match API response times, each annotated where Phoenix/HBase goes live)
  18. 18. eHarmony and Phoenix a perfect match WHY HBASE AND PHOENIX • HBASE – Highly consistent and fault tolerant – Need for store-level filtering and sorting • APACHE PHOENIX – Apache Phoenix helped us build an abstract high-performance query layer on top of HBase – Eased the development process – Reduced boilerplate code, which provides maintainability – Build complex queries with confidence – Secondary indexes – JDBC connection – Good community support
  19. 19. eHarmony and Phoenix a perfect match JAVA ORM LIBRARY (PHO) • Apache Phoenix helped us build PHO (Phoenix-HBase ORM) • PHO provides the ability to annotate your entity bean and provides interfaces to build DSL-like queries:
      Disjunction disjunction = new Disjunction();
      for (int statusFilter : statusFilters) {
          disjunction.add(Restrictions.eq("status", statusFilter));
      }
      QueryBuilder.builderFor(FeedItemDto.class).select()
          .add(Restrictions.eq("userId", userId))
          .add(Restrictions.gte("spotlightEnd", spotlightEndDate))
          .add(disjunction)
          .setReturnFields(projection)
          .addOrder(orderings)
          .setMaxResults(maxResults)
          .build();
  20. 20. eHarmony and Phoenix a perfect match OPEN SOURCE REPOSITORY – http://eharmony.github.io/ – https://github.com/eHarmony/pho Careers: http://www.eharmony.com/about/careers/ *Please join us for more details at PhoenixCon tomorrow (May 25)
  21. 21. ACID Transactions + Poorna Chandra Cask
  22. 22. Why Transactions? • All-or-none semantics simplify life for the developer – Ensures every client has a consistent view of data – Protects against concurrent updates – No need to reason about what state data is left in if a write fails – Guaranteed consistency between data and index
  23. 23. Apache Tephra • Transactions on HBase – Across regions, tables and RPC calls • ACID semantics • Tephra Powers – CDAP (Cask Data Application Platform) – Apache Phoenix (4.7 onwards)
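     A minimal sketch of what this looks like from the application side, assuming Phoenix 4.7+, phoenix.transactions.enabled=true on client and server, and a running Tephra transaction manager (table and column names are illustrative): the table is declared TRANSACTIONAL, and everything upserted before commit() becomes visible atomically, including any secondary-index updates.

     import java.sql.Connection;
     import java.sql.DriverManager;
     import java.sql.SQLException;
     import java.sql.Statement;

     public class PhoenixTransactionSketch {
         public static void main(String[] args) throws Exception {
             try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                 try (Statement ddl = conn.createStatement()) {
                     // TRANSACTIONAL=true puts the table under Tephra's control.
                     ddl.execute("CREATE TABLE IF NOT EXISTS ACCOUNTS (" +
                                 "ID BIGINT PRIMARY KEY, BALANCE DOUBLE) TRANSACTIONAL=true");
                 }
                 conn.setAutoCommit(false);  // group the writes into one transaction
                 try (Statement stmt = conn.createStatement()) {
                     stmt.executeUpdate("UPSERT INTO ACCOUNTS VALUES (1, 900.0)");
                     stmt.executeUpdate("UPSERT INTO ACCOUNTS VALUES (2, 1100.0)");
                     conn.commit();          // both rows (and index rows) appear together
                 } catch (SQLException e) {
                     conn.rollback();        // nothing from this transaction becomes visible
                     throw e;
                 }
             }
         }
     }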
  24. 24. Apache Tephra Architecture (diagram: Client 1 … Client N talk to the active Tx Manager, with a standby Tx Manager coordinated through Zookeeper, alongside HBase Master 1, Master 2, and RegionServers RS 1–RS 4)
  25. 25. Tephra Components • TransactionAware client – Coordinates transaction lifecycle with the Transaction Manager – Communicates directly with HBase for reads and writes • Transaction Manager – Assigns transaction IDs – Maintains state on in-progress, committed and invalid transactions • Transaction Processor coprocessor – Applies server-side filtering for reads – Cleans up data from failed transactions and versions that are no longer visible
  26. 26. Snapshot Isolation • Multi-version concurrency control – Cell version (timestamp) = transaction ID – Reads exclude other uncommitted transactions (for isolation) • Optimistic Concurrency Control – Avoids cost of locking rows and tables – Good if conflicts are rare: short transaction, disjoint partitioning of work
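     To make the optimistic model concrete, here is a hedged sketch of a write-write conflict, reusing the illustrative ACCOUNTS table from the previous sketch: two connections update the same row, neither blocks on a lock, and the transaction that commits second fails at commit time (surfaced as a SQLException), leaving the caller to retry.

     import java.sql.Connection;
     import java.sql.DriverManager;
     import java.sql.SQLException;
     import java.sql.Statement;

     public class WriteConflictSketch {
         public static void main(String[] args) throws Exception {
             try (Connection c1 = DriverManager.getConnection("jdbc:phoenix:localhost");
                  Connection c2 = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                 c1.setAutoCommit(false);
                 c2.setAutoCommit(false);
                 // Both transactions write the same row; neither sees the other's
                 // uncommitted write (snapshot isolation) and no row lock is taken.
                 try (Statement s1 = c1.createStatement(); Statement s2 = c2.createStatement()) {
                     s1.executeUpdate("UPSERT INTO ACCOUNTS VALUES (1, 800.0)");
                     s2.executeUpdate("UPSERT INTO ACCOUNTS VALUES (1, 700.0)");
                 }
                 c1.commit();                // first committer wins
                 try {
                     c2.commit();            // conflicting write is detected at commit time
                 } catch (SQLException conflict) {
                     c2.rollback();
                     System.out.println("Conflict detected, retry the work: " + conflict.getMessage());
                 }
             }
         }
     }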
  27. 27. Performance – Single client using 10 threads in parallel with 5K batch size – No performance penalty for non-transactional tables
  28. 28. Concurrent Write Performance – 2 write threads per client, 1000-row batch size, 15-column table
  29. 29. Future Work • Partitioned Transaction Manager • Automatic pruning of invalid transaction list • Read-only transactions • Performance optimizations • Conflict detection • Appends to transaction edit log
  30. 30. + Cost-based Query Optimization Maryann Xue Intel
  31. 31. Integration model (diagram): JDBC Client (SQL + Phoenix-specific grammar) → Calcite Parser & Validator → Calcite Query Optimizer (built-in rules + Phoenix-specific rules) → Phoenix Query Plan Generator → Phoenix Runtime → Phoenix Tables over HBase
  32. 32. Cost-based query optimizer with Apache Calcite • Base all query optimization decisions on cost – Filter push down; range scan vs. skip scan – Hash aggregate vs. stream aggregate vs. partial stream aggregate – Sort optimized out; sort/limit push through; fwd/rev/unordered scan – Hash join vs. merge join; join ordering – Use of data table vs. index table – All of the above (and many others) COMBINED • Query optimizations are modeled as pluggable rules
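     Until the Calcite-based optimizer lands, end users can already inspect which physical choices Phoenix made for a statement (range scan vs. skip scan, hash join vs. sort-merge join, data table vs. index table) by running EXPLAIN over JDBC. A small sketch with an illustrative query; the plan comes back as one row per step:

     import java.sql.Connection;
     import java.sql.DriverManager;
     import java.sql.ResultSet;
     import java.sql.Statement;

     public class ExplainPlanSketch {
         public static void main(String[] args) throws Exception {
             try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                  Statement stmt = conn.createStatement();
                  ResultSet rs = stmt.executeQuery(
                      "EXPLAIN SELECT deptno, count(*) FROM emps " +
                      "WHERE deptno IN (10, 20, 30) GROUP BY deptno")) {
                 while (rs.next()) {
                     System.out.println(rs.getString(1));  // each row is one plan step
                 }
             }
         }
     }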
  33. 33. Beyond Phoenix 4.8 with Apache Calcite • Get the missing SQL support – WITH, UNNEST, scalar subquery, etc. • Materialized views – To allow other forms of indices (maybe defined as external), e.g., a filter view, a join view, or an aggregate view • Interop with other Calcite adapters – Already used by Drill, Hive, Kylin, Samza, etc. – Supports any JDBC source – Initial version of Drill-Phoenix integration already working
  34. 34. Query Example - no cost-based optimizer select empid, e.name, d.deptno, d.name, location from emps e, depts d using deptno order by e.deptno – Phoenix Compiler produces: scan ‘depts’, send ‘depts’ over to RS & build hash-cache; scan ‘emps’, hash-join ‘depts’, sort joined table on ‘e.deptno’
  35. 35. Query Example - with cost-based optimizer (sort optimization combined with join algorithm decision) select empid, e.name, d.deptno, d.name, location from emps e, depts d using deptno order by e.deptno – Logical plan: LogicalSort (key: deptno) over LogicalProject (empid, e.name, d.deptno, d.name, location) over LogicalJoin (inner, e.deptno = d.deptno) of LogicalTableScan (emps) and LogicalTableScan (depts) – Optimizer (optimization rules + Phoenix operator conversion rules) produces: PhoenixClientProject (empid, e.name, d.deptno, d.name, location) over PhoenixMergeJoin (inner, e.deptno = d.deptno) of [PhoenixServerSort (key: deptno) over PhoenixServerProject (empid, name, deptno) over PhoenixTableScan (emps)] and [PhoenixServerProject (deptno, name, location) over PhoenixTableScan (depts)]
  36. 36. Query Example - with cost-based optimizer (sort optimization combined with join algorithm decision) – The Phoenix Implementor turns that physical plan (PhoenixClientProject over PhoenixMergeJoin of the sorted, projected scan of emps and the projected scan of depts) into the runtime plan: scan ‘emps’, sort by ‘deptno’, merge-join ‘emps’ and ‘depts’, scan ‘depts’
  37. 37. Query Example - Comparison (query plan w/o cost-based optimizer vs. query plan w/ cost-based optimizer) – Scan of ‘emps’, ‘depts’: first ‘depts’, then ‘emps’ vs. 2 tables in parallel – Hash-cache send & build: proportional to size of ‘depts’, might cause exception if too large vs. none – Hash-cache look-up: 1 look-up per ‘emps’ row vs. none – Sorting: sort the joined ‘emps’ and ‘depts’ vs. sort ‘emps’ only – Optimization approach: local, serial optimization processes vs. cost-based, rule-driven, integrated – Performance (single node, 2M * 2K rows): 19.46 s vs. 13.92 s
  38. 38. Drillix: Interoperability with Drill select deptno, sum(salary) from emps group by deptno – Stage 1: Local partial aggregation (Phoenix Table Scan of emps over HBase, then Phoenix Partial Aggregation of deptno, sum(salary)) – Stage 2: Shuffle partial results (Drill Shuffle) – Stage 3: Final aggregation (Drill Final Aggregation of deptno, sum(salary))
  39. 39. Thank you! Questions? Join us tomorrow for PhoenixCon Salesforce.com, 1 Market St, SF 9am-1pm (some companies using Phoenix)
