HiveServer2 for Apache Hive

12,724 views

Published on

Published in: Technology
1 Comment
7 Likes
Statistics
Notes
  • http://dbmanagement.info/Tutorials/Apache_Hive.htm
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Views
Total views
12,724
On SlideShare
0
From Embeds
0
Number of Embeds
188
Actions
Shares
0
Downloads
113
Comments
1
Likes
7
Embeds 0
No embeds

No notes for slide

HiveServer2 for Apache Hive

  1. 1. June 2012HiveServer2 Project (WIP)Carl Steinbach | Platform Engineering
  2. 2. Hive Background: What is it? An ETL/Data Warehouse system for Hadoop: •  SQL->MR Compiler and Execution Engine •  SerDes: Pluggable Data Format Handlers •  MetaStore: Persistent Metadata Storage2 ©2012 Cloudera, Inc. All Rights Reserved.
  3. 3. Hive Evolution •  Original Vision: –  Let users express their queries in a high-level language without having to write MR programs •  Now more and more: –  A parallel SQL DBMS that happens to use Hadoop for its storage and execution layer.3 ©2012 Cloudera, Inc. All Rights Reserved.
  4. 4. What do users expect from a DBMS? •  Sessions/Concurrency –  Persistent client state on the server-side –  Ability to run multiple client concurrently •  ODBC/JDBC –  SQL IDEs, BI, ETL, … •  Authentication/Authorization •  Auditing/Logging4 ©2012 Cloudera, Inc. All Rights Reserved.
  5. 5. What’s Missing? •  Sessions/Concurrency –  Current Thrift API can’t support concurrency •  ODBC/JDBC –  Thrift API doesn’t support common ODBC/JDBC •  Authentication/Authorization –  Incomplete implementations •  Auditing/Logging –  Multiple plugin interfaces in need of consolidation5 ©2012 Cloudera, Inc. All Rights Reserved.
  6. 6. What’s Missing Concurrency/Sessions •  Current Thrift API can’t support multiple connections or client sessions. •  User/Global Configuration and Session Info •  Query compiler memory leaks6 ©2012 Cloudera, Inc. All Rights Reserved.
  7. 7. What’s Missing ODBC/JDBC •  Thrift API can’t support common ODBC/ JDBC calls: –  SQLGetInfo –  SQLGetTypeInfo –  SQLCancel –  SQLGetFunctions7 ©2012 Cloudera, Inc. All Rights Reserved.
  8. 8. What’s Missing Authentication/Authorization •  SASL Authentication for HiveServer •  Hive supports GRANT/ROLE based authorization, but implementation is incomplete. •  Code injection vectors: ADD JAR, TRANSFORM, SET x, …8 ©2012 Cloudera, Inc. All Rights Reserved.
  9. 9. Project Milestones •  HiveServer2 Thrift API Spec •  JDBC/ODBC HiveServer2 Drivers •  Concurrent Thrift clients –  Fix query compiler memory leaks –  User/Global session/configuration information •  Authentication (Kerberos) •  Authorization –  Extend to configuration, ADD x, TRANSFORM, …9 ©2012 Cloudera, Inc. All Rights Reserved.
  10. 10. Who’s working on it? •  Carl Steinbach –  carl@cloudera •  Prasad Mujumdar –  prasadm@cloudera10 ©2011 Cloudera, Inc. All Rights Reserved.
  11. 11. Resources •  HIVE-2935: Implement HiveServer2 •  HiveServer API Proposal: –  https://cwiki.apache.org/confluence/display/ Hive/HiveServer2+Thrift+API11 ©2011 Cloudera, Inc. All Rights Reserved.
  12. 12. Questions? •  Questions? •  Questions? •  Questions? •  Questions?12 ©2012 Cloudera, Inc. All Rights Reserved.

×