© 2017 Bloomberg Finance L.P. All rights reserved.
HBaseCon West 2017
June 12, 2017
Anirudha Jadhav
ajadhav2@bloomberg.net
Biju Nair
bnair10@bloomberg.net
Cursors in Apache Phoenix
Leading data and analytics provider for the financial industry
Bloomberg
Bloomberg is a data company
Reality of working with data
• The data model changes over time
• Users querying the data model don’t necessarily change
• Alternate query patterns for the same dataset
• Data infrastructure usage needs to be simple
Apache Phoenix
• Recipes of best practices for using HBase, exposed through a familiar SQL-like grammar
• It is much more than SQL
o User-defined functions for push-down
o Secondary indices
o Statistics collection and heuristics-based optimizations
o ORM libraries
o JDBC and ODBC support through query servers
o Integrations: Spark, Kafka, MapReduce and others
Extending Apache Phoenix
• A very active and helpful community
• Our ongoing work
o Apache Calcite
o Distributed tests and nightly performance build
o Multi-DC replication
o Deep paging with cursor implementation
HBase
[Architecture diagram: Application with HBase Client; ZooKeeper Quorum; HBase Master; three RegionServers; three HDFS DataNodes]
Phoenix
[Architecture diagram: Application with Phoenix Client and HBase Client; ZooKeeper Quorum; HBase Master; three RegionServers with Phoenix RPC endpoints and Phoenix Coprocessors; SYSTEM.CATALOG and SYSTEM.STATS tables; three HDFS DataNodes]
Sources:
https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase
http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf
Phoenix Client
• Authentication
• SQL Parsing (ANTLR4)
• Query rewrite/Optimization (Hints/Rules)
• Query Plan Generation (Rules)
• Transaction Management (Tephra)
• Connection Management
• HBase Client
Phoenix query execution
// Uses java.sql.Connection, DriverManager, PreparedStatement, ResultSet
Connection con = DriverManager.getConnection(
    "jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile");
…
PreparedStatement statement = con.prepareStatement("select * from TBL");
…
ResultSet rset = statement.executeQuery();
…
while (rset.next())  // next() returns a boolean, false once the rows are exhausted
…
rset.close();
…
Phoenix query execution
[Flow diagram relating the JDBC calls getConnection, prepareStatement, executeQuery and close() to these steps:]
• Connect to HBase
• Parse SQL statement
• Read/cache metadata
• Validate SQL statement
• Create query plan
• Optimize query plan
• Create result iterator
• Create Phoenix ResultSet
• Close ResultSet
Phoenix Server
[Diagram: Application → Phoenix Client → HBase Client, sending Meta Data, Read and Index Write requests to the RegionServers]
• Tables shown: SYSTEM.CATALOG, USER_TABLE
• Server-side components shown on the RegionServers: MetaDataEndPointImpl, MetaDataRegionObserver, UngroupedAggregateRO, GroupedAggregateRO, ScanRegionObserver, Indexer, ServerCachingEndpointImpl
Cursors
• To support row pagination
o Should support forward and backward traversal
• Support required for select queries only
• Data needs to be consistent during traversal
Cursors
• DECLARE tCursor CURSOR FOR SELECT * FROM TBL
• OPEN tCursor
• FETCH NEXT 10 ROWS FROM tCursor
• FETCH PRIOR 5 ROWS FROM tCursor
• CLOSE tCursor
Implementation options
• PHOENIX-2606
• Use row value constructors
o Requires query rewriting and is complex
• Wrapper over existing query ResultSets
o Reuses ResultSets, so relatively simple
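For comparison, the row value constructor option amounts to rewriting each page request into a query that resumes after the last primary key seen. The sketch below only builds such a query string; the class name, the helper name, and the PK1/PK2 column names are hypothetical, and it is an illustration of the idea rather than the code discussed in PHOENIX-2606.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class RvcPaging {
    /**
     * Builds a "next page" query that resumes after the last primary key seen.
     * The caller binds the last row's key values to the "?" placeholders.
     */
    static String nextPageQuery(String table, List<String> pkColumns, int pageSize) {
        String cols = String.join(", ", pkColumns);
        // One bind placeholder per primary-key column.
        String binds = pkColumns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "SELECT * FROM " + table
            + " WHERE (" + cols + ") > (" + binds + ")"
            + " ORDER BY " + cols
            + " LIMIT " + pageSize;
    }

    public static void main(String[] args) {
        // PK1 and PK2 are hypothetical primary-key columns of TBL.
        System.out.println(nextPageQuery("TBL", Arrays.asList("PK1", "PK2"), 10));
        // prints: SELECT * FROM TBL WHERE (PK1, PK2) > (?, ?) ORDER BY PK1, PK2 LIMIT 10
    }
}
```

Every page needs its own rewritten query and the caller has to carry the last key forward, which is one way to read the "complex" label on this option.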
Cursor Lifecycle
PreparedStatement statement = con.prepareStatement(
    "DECLARE tCursor CURSOR FOR SELECT * FROM TBL");
statement.execute();
…
statement = con.prepareStatement("OPEN tCursor");
statement.execute();
statement = con.prepareStatement("FETCH NEXT FROM tCursor");
ResultSet rset = statement.executeQuery();  // execute() returns a boolean, not a ResultSet
while (rset.next())
…
statement = con.prepareStatement("CLOSE tCursor");
statement.execute();
Cursor lifecycle
[Flow diagram relating the statements DECLARE CURSOR, OPEN CURSOR, FETCH and CLOSE to these steps:]
• Parse SQL statement
• Create/optimize QueryPlan
• Create CursorWrapper
• Set cursor status to open
• Execute CursorFetchPlan
• Create CursorResultIterator
• Create Phoenix ResultSet
• Close cursor
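The wrapper idea behind this lifecycle can be pictured with a small in-memory model: a cursor that sits on top of an existing row iterator, remembers the rows it has already pulled, and serves forward and backward fetches from that cache. This is a simplified sketch, not Phoenix's CursorWrapper; the class name, the method names, and the FETCH PRIOR semantics (rewind n rows and return them in forward order) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class CursorSketch<T> {
    private final Iterator<T> source;                 // stands in for the underlying result iterator
    private final List<T> cache = new ArrayList<>();  // rows already pulled from the source
    private int position = 0;                         // index of the next row a forward fetch returns
    private boolean open = false;

    CursorSketch(Iterator<T> source) { this.source = source; }

    void open() { open = true; }

    void close() { open = false; cache.clear(); position = 0; }

    /** Returns up to n rows starting at the current position and advances past them. */
    List<T> fetchNext(int n) {
        requireOpen();
        // Pull from the source only when the cache does not yet cover the requested rows.
        while (cache.size() < position + n && source.hasNext()) {
            cache.add(source.next());
        }
        int end = Math.min(position + n, cache.size());
        List<T> page = new ArrayList<>(cache.subList(position, end));
        position = end;
        return page;
    }

    /** Rewinds up to n rows and returns them in forward order (assumed semantics). */
    List<T> fetchPrior(int n) {
        requireOpen();
        int start = Math.max(0, position - n);
        List<T> page = new ArrayList<>(cache.subList(start, position));
        position = start;
        return page;
    }

    private void requireOpen() {
        if (!open) throw new IllegalStateException("cursor is not open");
    }

    public static void main(String[] args) {
        CursorSketch<Integer> c = new CursorSketch<>(Arrays.asList(1, 2, 3, 4, 5, 6).iterator());
        c.open();
        System.out.println(c.fetchNext(4));   // [1, 2, 3, 4]
        System.out.println(c.fetchPrior(2));  // [3, 4]
        System.out.println(c.fetchNext(10));  // [3, 4, 5, 6]
        c.close();
    }
}
```

The cache here grows without bound, which is exactly why cache sizing matters once the result set is large.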
Cursor Challenges
• Data Consistency
o Query start timestamp provides snapshot consistency
• Optimization
o Use the Scan object directly for non-aggregate queries
• Cache sizing
o Dynamic sizing
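From the client side, the closest analogue to a fixed query start timestamp is Phoenix's CurrentSCN connection property, which makes a connection read data as of a given timestamp. The sketch below only builds the connection properties; the class and method names are hypothetical, and whether the cursor implementation pins its timestamp this way is an assumption, not something the slides state.

```java
import java.util.Properties;

final class SnapshotProps {
    /** Builds connection properties that pin reads to the given timestamp (epoch millis). */
    static Properties pinnedAt(long timestampMillis) {
        Properties props = new Properties();
        // "CurrentSCN" is the Phoenix connection property for reading as of a timestamp.
        props.setProperty("CurrentSCN", Long.toString(timestampMillis));
        return props;
    }

    public static void main(String[] args) {
        Properties props = pinnedAt(System.currentTimeMillis());
        // With the Phoenix driver on the classpath, these properties would be passed as
        // DriverManager.getConnection("jdbc:phoenix:zkquorum:2181:/hbase", props);
        System.out.println(props.getProperty("CurrentSCN"));
    }
}
```

Every fetch issued on such a connection would see the same snapshot, which is the property a cursor needs while paging back and forth.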
Contributors
• Gabriel Jimenez (MIT)
• Anirudha Jadhav (Bloomberg)
• Biju Nair (Bloomberg)
• Ankit Singhal (Hortonworks)
Thank You
Hardening Apache Phoenix @PhoenixCon tomorrow 2 PM PT
Q&A
