HiveServer2 for Apache Hive
Published in: Technology
Transcript

  • 1. June 2012
      HiveServer2 Project (WIP)
      Carl Steinbach | Platform Engineering
  • 2. Hive Background: What is it?
      An ETL/Data Warehouse system for Hadoop:
      •  SQL->MR Compiler and Execution Engine
      •  SerDes: Pluggable Data Format Handlers
      •  MetaStore: Persistent Metadata Storage
      ©2012 Cloudera, Inc. All Rights Reserved.
  • 3. Hive Evolution
      •  Original Vision:
          –  Let users express their queries in a high-level language without having to write MR programs
      •  Now more and more:
          –  A parallel SQL DBMS that happens to use Hadoop for its storage and execution layer.
  • 4. What do users expect from a DBMS?
      •  Sessions/Concurrency
          –  Persistent client state on the server side
          –  Ability to run multiple clients concurrently
      •  ODBC/JDBC
          –  SQL IDEs, BI, ETL, …
      •  Authentication/Authorization
      •  Auditing/Logging
  • 5. What’s Missing?
      •  Sessions/Concurrency
          –  Current Thrift API can’t support concurrency
      •  ODBC/JDBC
          –  Thrift API doesn’t support common ODBC/JDBC calls
      •  Authentication/Authorization
          –  Incomplete implementations
      •  Auditing/Logging
          –  Multiple plugin interfaces in need of consolidation
  • 6. What’s Missing: Concurrency/Sessions
      •  Current Thrift API can’t support multiple connections or client sessions.
      •  User/Global Configuration and Session Info
      •  Query compiler memory leaks
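The split between user-level and global configuration on this slide can be pictured with a small sketch. This is only an illustration under stated assumptions: `SessionSketch`, its methods, and the key `example.key` are hypothetical names, not HiveServer2 classes or Hive settings, and the deck does not describe how sessions are actually implemented. It shows each client session holding its own overrides on top of shared server defaults, so concurrent clients do not see each other's settings.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: every name here is illustrative, not a HiveServer2 class.
final class SessionSketch {
    // Server-wide defaults shared by every session.
    private final Map<String, String> globalConf = new ConcurrentHashMap<>();
    // Per-session overrides, keyed by an opaque session handle.
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    SessionSketch(Map<String, String> defaults) {
        globalConf.putAll(defaults);
    }

    // Opens a session with no overrides and returns its handle.
    String openSession() {
        String handle = UUID.randomUUID().toString();
        sessions.put(handle, new ConcurrentHashMap<>());
        return handle;
    }

    // Sets a value that is visible to this session only.
    void set(String handle, String key, String value) {
        requireSession(handle).put(key, value);
    }

    // A session override wins; otherwise fall back to the global value (may be null).
    String get(String handle, String key) {
        String value = requireSession(handle).get(key);
        return value != null ? value : globalConf.get(key);
    }

    // Drops all state held for the session.
    void closeSession(String handle) {
        sessions.remove(handle);
    }

    private Map<String, String> requireSession(String handle) {
        Map<String, String> conf = sessions.get(handle);
        if (conf == null) {
            throw new IllegalArgumentException("unknown session: " + handle);
        }
        return conf;
    }

    public static void main(String[] args) {
        SessionSketch server = new SessionSketch(Map.of("example.key", "global"));
        String a = server.openSession();
        String b = server.openSession();
        server.set(a, "example.key", "override");
        System.out.println(server.get(a, "example.key")); // prints "override"
        System.out.println(server.get(b, "example.key")); // prints "global"
    }
}
```

Keying all state by a handle is one way to let a single server process serve several clients at once; closing the session releases its overrides.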
  • 7. What’s Missing: ODBC/JDBC
      •  Thrift API can’t support common ODBC/JDBC calls:
          –  SQLGetInfo
          –  SQLGetTypeInfo
          –  SQLCancel
          –  SQLGetFunctions
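A call such as SQLCancel needs some way to name the statement that is currently running. One possible shape for that, sketched here as an assumption rather than as the design the deck proposes, is to return an operation handle when a statement is submitted and accept that handle in a later cancel request. `OperationSketch` and its methods are hypothetical; only the `java.util.concurrent` calls are real library APIs.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: illustrates handle-based cancellation, not a real server API.
final class OperationSketch {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    // Running or finished operations, keyed by an opaque operation handle.
    private final Map<String, Future<String>> operations = new ConcurrentHashMap<>();

    // Starts the work in the background and returns a handle immediately.
    String executeAsync(Callable<String> query) {
        String handle = UUID.randomUUID().toString();
        operations.put(handle, pool.submit(query));
        return handle;
    }

    // Interrupts the operation named by the handle.
    // Returns false if the handle is unknown or the operation already completed.
    boolean cancel(String handle) {
        Future<String> operation = operations.get(handle);
        return operation != null && operation.cancel(true);
    }

    // Reports whether the operation was cancelled before completing.
    boolean isCancelled(String handle) {
        Future<String> operation = operations.get(handle);
        return operation != null && operation.isCancelled();
    }

    // Stops worker threads so the JVM can exit.
    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        OperationSketch server = new OperationSketch();
        // A stand-in for a long-running query.
        String op = server.executeAsync(() -> {
            Thread.sleep(60_000);
            return "done";
        });
        System.out.println(server.cancel(op));      // prints "true"
        System.out.println(server.isCancelled(op)); // prints "true"
        server.shutdown();
    }
}
```

Because `executeAsync` returns before the work finishes, the client keeps a reference it can use for a cancel request from the same or another call.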
  • 8. What’s Missing: Authentication/Authorization
      •  SASL Authentication for HiveServer
      •  Hive supports GRANT/ROLE based authorization, but implementation is incomplete.
      •  Code injection vectors: ADD JAR, TRANSFORM, SET x, …
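The code injection vectors named on this slide can be made concrete with a rough sketch of a gate that flags such statements before they run. Everything here is an assumption for illustration: `StatementGateSketch`, its methods, and the text patterns are hypothetical, and matching on statement text is a crude stand-in for an authorization check made on the parsed statement, which is what a real server would more plausibly do.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical sketch: a text-level stand-in for a parser-level authorization check.
final class StatementGateSketch {
    // Statements that load resources, e.g. ADD JAR.
    private static final Pattern ADD = Pattern.compile("^add\\s+\\S+");
    // Statements that change configuration, e.g. SET x=y.
    private static final Pattern SET = Pattern.compile("^set\\b");
    // Queries that stream rows through a user script, e.g. SELECT TRANSFORM(...).
    private static final Pattern TRANSFORM = Pattern.compile("\\btransform\\s*\\(");

    // True if the statement uses one of the vectors listed on the slide.
    static boolean isRestricted(String sql) {
        String s = sql.trim().toLowerCase(Locale.ROOT);
        return ADD.matcher(s).find() || SET.matcher(s).find() || TRANSFORM.matcher(s).find();
    }

    // Rejects restricted statements unless the caller is privileged.
    static void check(String sql, boolean userIsPrivileged) {
        if (isRestricted(sql) && !userIsPrivileged) {
            throw new SecurityException("statement not permitted: " + sql);
        }
    }

    public static void main(String[] args) {
        System.out.println(isRestricted("ADD JAR /tmp/udfs.jar"));                  // prints "true"
        System.out.println(isRestricted("SELECT TRANSFORM(a) USING 'cat' FROM t")); // prints "true"
        System.out.println(isRestricted("SELECT a FROM t"));                        // prints "false"
    }
}
```

A pattern match like this can misfire, for example on a string literal that contains `transform(`, which is one reason to treat it only as a picture of the problem.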
  • 9. Project Milestones
      •  HiveServer2 Thrift API Spec
      •  JDBC/ODBC HiveServer2 Drivers
      •  Concurrent Thrift clients
          –  Fix query compiler memory leaks
          –  User/Global session/configuration information
      •  Authentication (Kerberos)
      •  Authorization
          –  Extend to configuration, ADD x, TRANSFORM, …
  • 10. Who’s working on it?
      •  Carl Steinbach
          –  carl@cloudera
      •  Prasad Mujumdar
          –  prasadm@cloudera
  • 11. Resources
      •  HIVE-2935: Implement HiveServer2
      •  HiveServer API Proposal:
          –  https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API
  • 12. Questions?
