Cursor Implementation in Apache Phoenix


  1. Cursors in Apache Phoenix. Anirudha Jadhav (ajadhav2@bloomberg.net), Biju Nair (bnair10@bloomberg.net). HBaseCon West 2017, June 12, 2017. © 2017 Bloomberg Finance L.P. All rights reserved.
  2. Bloomberg: leading data and analytics provider for the financial industry
  3. Bloomberg is a data company
  4. Reality of working with data • The data model changes over time • Users querying the data model don’t necessarily change • Alternate query patterns for the same dataset • Data infrastructure usage needs to be simple
  5. Apache Phoenix • Recipes of best practices for using HBase over a familiar SQL’ish grammar • It is so much more than SQL o User-defined functions for push-down o Secondary indices o Statistics collection, optimizations based on heuristics o ORM libraries o JDBC, ODBC support with Query servers o Integrations: Spark, Kafka, MR and others
  6. Extending Apache Phoenix • A very active and helpful community • Our ongoing work o Apache Calcite o Distributed tests and nightly performance build o Multi-DC replication o Deep paging with cursor implementation
  7. HBase (architecture diagram). Components shown: Application, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, three HDFS DataNodes
  8. Phoenix (architecture diagram; sources: https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase and http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf). Components shown: Application, Phoenix Client, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, three HDFS DataNodes, two Phoenix RPC endpoints, Phoenix Coprocessors, SYSTEM.CATALOG, SYSTEM.STATS
  9. Phoenix Client (diagram) • Authentication (HBase Client) • SQL Parsing (ANTLR4) • Query rewrite/Optimization (Hints/Rules) • Query Plan Generation (Rules) • Transaction Management (Tephra) • Connection Management (HBase Client)
  10. Phoenix query execution: Connection con = DriverManager.getConnection("jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile"); … PreparedStatement statement = con.prepareStatement("select * from TBL"); … ResultSet rset = statement.executeQuery(); … while (rset.next()) … rset.close(); …
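
The connection string on this slide follows the Phoenix JDBC URL layout: `jdbc:phoenix`, then the ZooKeeper quorum, port, root znode and, for secured clusters, a principal and keytab file, all separated by colons. As a minimal sketch, a small helper can assemble that string from its parts. `PhoenixUrl` and its `build` method are hypothetical names introduced here for illustration; they are not part of Phoenix.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Phoenix) that assembles a Phoenix JDBC URL. */
final class PhoenixUrl {
    private PhoenixUrl() {}

    /**
     * Joins the URL parts with ':' in the order used on the slide.
     * The principal and keytab are appended only when both are given.
     */
    static String build(String zkQuorum, int port, String rootNode,
                        String principal, String keytab) {
        List<String> parts = new ArrayList<>();
        parts.add("jdbc:phoenix");
        parts.add(zkQuorum);
        parts.add(Integer.toString(port));
        parts.add(rootNode);
        if (principal != null && keytab != null) {
            parts.add(principal);
            parts.add(keytab);
        }
        return String.join(":", parts);
    }
}
```

Calling `PhoenixUrl.build("zkquorum", 2181, "/hbase", "principal", "keytabfile")` reproduces the URL from the slide, which can then be passed to `DriverManager.getConnection`.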
  11. Phoenix query execution (flow diagram) • JDBC calls: getConnection, prepareStatement, executeQuery, close() • Steps: Connect to HBase, Parse SQL statement, Read/Cache metadata, Validate SQL statement, Create query plan, Optimize query plan, Create Result Iterator, Create Phoenix ResultSet, Close ResultSet
  12. Phoenix Server (diagram) • Requests from the Application (Phoenix Client, HBase Client): Meta Data Request, Index Write Request, Read Request • Components shown on the RegionServers: MetaDataEndPointImpl, MetaDataRegionObserver, UngroupedAggregateRO, GroupedAggregateRO, ScanRegionObserver, Indexer, ServerCachingEndpointImpl • Tables: SYSTEM.CATALOG, USER_TABLE
  13. Cursors • To support row pagination o Should support forward and backward traversal • Support required for select queries only • Data needs to be consistent during traversal
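
To make the forward and backward traversal requirement concrete, here is a minimal in-memory sketch, not Phoenix code. It wraps a forward-only `Iterator` (standing in for a result set) and remembers every row it has handed out, so earlier rows can be served again. `PagingCursor`, `fetchNext` and `fetchPrior` are hypothetical names, and the position semantics (a position between rows, as in `java.util.ListIterator`) are an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical in-memory sketch of a pageable cursor; not Phoenix code. */
final class PagingCursor<T> {
    private final Iterator<T> source;                // forward-only, like a ResultSet
    private final List<T> seen = new ArrayList<>();  // rows already pulled from the source
    private int position = 0;                        // index of the next row FETCH NEXT returns

    PagingCursor(Iterator<T> source) {
        this.source = source;
    }

    /** Returns up to n rows after the current position and advances past them. */
    List<T> fetchNext(int n) {
        List<T> page = new ArrayList<>();
        while (page.size() < n) {
            if (position == seen.size()) {
                if (!source.hasNext()) {
                    break;                           // source exhausted
                }
                seen.add(source.next());             // cache the row for later backward reads
            }
            page.add(seen.get(position++));
        }
        return page;
    }

    /** Returns up to n rows before the current position, nearest first, and moves back. */
    List<T> fetchPrior(int n) {
        List<T> page = new ArrayList<>();
        while (page.size() < n && position > 0) {
            page.add(seen.get(--position));
        }
        return page;
    }
}
```

This sketch keeps every row it has seen, so its memory use grows with the number of rows traversed; a real implementation would need to bound that cache.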
  14. Cursors • DECLARE tCursor CURSOR FOR SELECT * FROM TBL • OPEN tCursor • FETCH NEXT 10 ROWS FROM tCursor • FETCH PRIOR 5 ROWS FROM tCursor • CLOSE tCursor
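
The five statements above have a regular shape, so a client could generate them rather than concatenate strings by hand. The following is a small sketch of such a generator. `CursorSql` and its methods are hypothetical helpers introduced here, not a Phoenix API, and the identifier check is an assumption added to keep arbitrary text out of the statement.

```java
/** Hypothetical helper (not a Phoenix API) that renders the cursor statements from the slide. */
final class CursorSql {
    private CursorSql() {}

    static String declare(String cursor, String select) {
        return "DECLARE " + name(cursor) + " CURSOR FOR " + select;
    }

    static String open(String cursor) {
        return "OPEN " + name(cursor);
    }

    static String fetchNext(String cursor, int rows) {
        return "FETCH NEXT " + count(rows) + " ROWS FROM " + name(cursor);
    }

    static String fetchPrior(String cursor, int rows) {
        return "FETCH PRIOR " + count(rows) + " ROWS FROM " + name(cursor);
    }

    static String close(String cursor) {
        return "CLOSE " + name(cursor);
    }

    // Assumption for this sketch: cursor names are plain, unquoted identifiers.
    private static String name(String cursor) {
        if (cursor == null || !cursor.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("invalid cursor name: " + cursor);
        }
        return cursor;
    }

    private static int count(int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("row count must be positive: " + rows);
        }
        return rows;
    }
}
```

For example, `CursorSql.fetchNext("tCursor", 10)` yields the `FETCH NEXT 10 ROWS FROM tCursor` statement shown on the slide.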
  15. Implementation options • PHOENIX-2606 • Use row value constructors o Requires query rewrite and is complex • Wrapper over available query ResultSets o Can leverage ResultSets, so relatively simple
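
To illustrate why the row value constructor option involves query rewriting, the sketch below builds the kind of "next page" query that approach needs: the primary key columns are compared as a tuple against the last row of the previous page. `RvcPaging`, the table name and the key columns `K1` and `K2` are hypothetical and used only for this example; the rewrite inside Phoenix would have to handle arbitrary user queries, which is where the complexity comes from.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a row-value-constructor paging query; not Phoenix internals. */
final class RvcPaging {
    private RvcPaging() {}

    /**
     * Builds a query that returns the page following the row whose primary key
     * values are bound to the '?' placeholders, in primary key order.
     */
    static String nextPageQuery(String table, List<String> pkColumns, int pageSize) {
        String cols = String.join(", ", pkColumns);
        String binds = String.join(", ", Collections.nCopies(pkColumns.size(), "?"));
        return "SELECT * FROM " + table
                + " WHERE (" + cols + ") > (" + binds + ")"
                + " ORDER BY " + cols
                + " LIMIT " + pageSize;
    }
}
```

The caller binds the key values of the last row it received to the placeholders, so each page starts right after the previous one without keeping any server-side state.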
  16. Cursor lifecycle: PreparedStatement statement = con.prepareStatement("DECLARE tCursor CURSOR FOR SELECT * FROM TBL"); statement.execute(); … statement = con.prepareStatement("OPEN tCursor"); statement.execute(); statement = con.prepareStatement("FETCH NEXT FROM tCursor"); ResultSet rset = statement.executeQuery(); while (rset.next()) … statement = con.prepareStatement("CLOSE tCursor"); statement.execute();
  17. Cursor lifecycle (flow diagram) • Statements: DECLARE CURSOR, OPEN CURSOR, FETCH, CLOSE • Steps: Parse SQL statement, Create/Optimize QueryPlan, Create CursorWrapper, Set cursor status to open, Execute CursorFetchPlan, Create CursorResultIterator, Create Phoenix ResultSet, Close cursor
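
The lifecycle above can be read as a small state machine: a cursor is declared, opened, fetched from any number of times, and closed. The sketch below models only that ordering. `CursorLifecycle` and its `State` enum are hypothetical names, and the transition rules (for instance, that a closed cursor cannot be reopened) are assumptions made for the example rather than documented Phoenix behavior.

```java
/** Hypothetical state machine for the cursor lifecycle; the transition rules are assumptions. */
final class CursorLifecycle {
    enum State { DECLARED, OPEN, CLOSED }

    // A new instance represents a cursor that has just been declared.
    private State state = State.DECLARED;

    State state() {
        return state;
    }

    /** OPEN: allowed only directly after DECLARE in this sketch. */
    void open() {
        require(State.DECLARED, "OPEN");
        state = State.OPEN;
    }

    /** FETCH: allowed any number of times while the cursor is open. */
    void fetch() {
        require(State.OPEN, "FETCH");
    }

    /** CLOSE: allowed only on an open cursor. */
    void close() {
        require(State.OPEN, "CLOSE");
        state = State.CLOSED;
    }

    private void require(State expected, String statement) {
        if (state != expected) {
            throw new IllegalStateException(
                    statement + " requires state " + expected + " but cursor is " + state);
        }
    }
}
```

With this model, fetching before `OPEN` or after `CLOSE` fails fast with an `IllegalStateException` instead of silently returning no rows.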
  18. Cursor Challenges • Data Consistency o Query start timestamp provides snapshot consistency • Optimization o Use Scan object for non-aggregate queries • Cache sizing o Dynamic sizing
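
The snapshot consistency point can be illustrated with a toy versioned store, in the spirit of timestamped cell versions: every write carries a timestamp, and a reader that fixes its timestamp at query start never sees later writes. This is an in-memory sketch under that assumption, not how Phoenix is implemented; `SnapshotStore` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory versioned store illustrating snapshot reads; not Phoenix code. */
final class SnapshotStore {
    // row key -> (write timestamp -> value)
    private final Map<String, TreeMap<Long, String>> rows = new TreeMap<>();

    /** Records a new version of the row at the given write timestamp. */
    void put(String key, long timestamp, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
    }

    /** Returns the latest value written at or before snapshotTs, or null if there is none. */
    String get(String key, long snapshotTs) {
        TreeMap<Long, String> versions = rows.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(snapshotTs);
        return entry == null ? null : entry.getValue();
    }
}
```

A cursor that captures one timestamp when it is opened and passes it to every `get` sees the same data on each page, even if rows are updated while it is being traversed.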
  19. Contributors • Gabriel Jimenez (MIT) • Anirudha Jadhav (Bloomberg) • Biju Nair (Bloomberg) • Ankit Singhal (Hortonworks)
  20. Thank You • "Hardening Apache Phoenix" @PhoenixCon tomorrow, 2 PM PT • Q&A
