Cloudera Hue/Beeswax

Cloudera Hue/Beeswax Presentation Transcript

  • 1. Hue/Beeswax – A Hive UI 2010.7.6 bc Wong <bcwalrus [at] cloudera.com>
  • 2. A Demo Is Worth 2^10 Slides
  • 3. Beeswax Server as the Hive Connector
      ● A proxy that supports concurrent queries
      ● Needed to work around HiveServer
      ● 617 lines of code
      ● Apache License v2.0
  • 4. Beeswax Server Talks Thrift
      QueryHandle query(Query);
      QueryState get_state(QueryHandle);
      Results fetch(QueryHandle, bool start_over);
      ResultsMetadata get_results_metadata(QueryHandle);
      QueryExplanation explain(Query);
      string get_log(LogContextId);
  • 5. A Handle for Each Query
      QueryHandle query(Query);
      Query {
        string query;
        list<string> configuration;
        string user;
        list<string> groups;
      }
      QueryHandle {
        string id;
        LogContextId;
      }
  • 6. Asynchronous Query Processing
      QueryHandle query(Query);
      MAIN THREAD
        handle = new QueryHandle(...);
        query = new RunningQueryState(handle, ...);
        query.compile();
        <thread pool execution>
        return handle;
      WORKER THREAD
        Hive.closeCurrent();
        Hive.get(conf);
        driver.execute();
        notify();
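
The main-thread/worker-thread split on slide 6 can be sketched roughly as follows. This is a minimal Python stand-in, not Beeswax's Java code: `QueryServer`, `RunningQuery`, and `wait` are hypothetical names, and the "execution" is a placeholder where a real server would call into Hive.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor


class RunningQuery:
    """Hypothetical stand-in for the server-side state of one query."""

    def __init__(self, handle, statement):
        self.handle = handle
        self.statement = statement
        self.state = "CREATED"
        self.done = threading.Event()

    def compile(self):
        # Runs on the caller's thread, before query() returns.
        self.state = "COMPILED"

    def execute(self):
        # Runs on a worker thread; a real server would drive Hive here.
        self.state = "RUNNING"
        try:
            self.result = self.statement.upper()  # placeholder work
            self.state = "FINISHED"
        except Exception:
            self.state = "EXCEPTION"
        finally:
            self.done.set()  # plays the role of notify() on the slide


class QueryServer:
    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._queries = {}
        self._lock = threading.Lock()

    def query(self, statement):
        handle = uuid.uuid4().hex
        running = RunningQuery(handle, statement)
        running.compile()
        with self._lock:
            self._queries[handle] = running
        self._pool.submit(running.execute)
        return handle  # returned without waiting for execution

    def get_state(self, handle):
        with self._lock:
            return self._queries[handle].state

    def wait(self, handle, timeout=None):
        with self._lock:
            running = self._queries[handle]
        return running.done.wait(timeout)


server = QueryServer()
handle = server.query("SELECT 1")
server.wait(handle, timeout=5)
print(server.get_state(handle))  # FINISHED
```

Because `query()` returns as soon as the work is submitted, several queries can be in flight at once, which is the concurrency the proxy is there to provide.
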
  • 7. Hue Polls on the QueryState
      QueryState get_state(QueryHandle);
      QueryState: CREATED, INITIALIZED, COMPILED, RUNNING, FINISHED, EXCEPTION
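
A client-side polling loop over `get_state` might look like the sketch below. The state names come from slide 7; treating FINISHED and EXCEPTION as the terminal states, the `wait_for_query` helper, and its interval and timeout parameters are assumptions for illustration, not Hue's actual code.

```python
import time

# Assumption: these two states mean the query will not change state again.
TERMINAL_STATES = {"FINISHED", "EXCEPTION"}


def wait_for_query(get_state, handle, interval=0.5, timeout=60.0, sleep=time.sleep):
    """Poll get_state(handle) until a terminal state is seen or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state(handle)
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query {handle} still {state} after {timeout}s")
        sleep(interval)


# Usage with a fake server that advances one state per poll.
states = iter(["COMPILED", "RUNNING", "RUNNING", "FINISHED"])
final = wait_for_query(lambda handle: next(states), "q-1", interval=0.0)
print(final)  # FINISHED
```
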
  • 8. Results Delimiter Is a Big Problem
      driver.getResults();
      Results fetch(QueryHandle, bool start_over);
      Results {
        bool ready;
        list<string> columns;
        list<string> data;  // list of tab delimited fields
        int start_row;
        bool has_more;
      }
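
One way the tab-delimited `data` rows can go wrong is sketched below. The slide does not spell out the failure mode, so this is an assumed illustration: a field value that itself contains a tab becomes indistinguishable from two fields. `split_row` and `is_well_formed` are hypothetical helpers.

```python
def split_row(line, delim="\t"):
    """Split one result row into fields on the delimiter."""
    return line.split(delim)


def is_well_formed(line, columns, delim="\t"):
    """A row is only trustworthy if it yields one field per column."""
    return len(split_row(line, delim)) == len(columns)


columns = ["id", "comment"]
clean = "1\thello world"
tricky = "2\thello\tworld"  # the comment value itself contains a tab

print(split_row(clean))                 # ['1', 'hello world']
print(split_row(tricky))                # ['2', 'hello', 'world']
print(is_well_formed(tricky, columns))  # False
```

A field-count check like this can detect the damage but cannot repair it, since the original field boundaries are lost once the row is flattened into a string.
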
  • 9. Metadata on Results for Direct Access
      ResultsMetadata get_results_metadata(QueryHandle);
      // Only for “selects”
      ResultsMetadata {
        hive_metastore.Schema;
        string results_dir;  // temporary results dir
        string table_name;   // for select *
        string delim;
      }
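
If "direct access" means reading the result files straight from `results_dir` and splitting on `delim`, a reader could be sketched as below. That interpretation is an assumption, as are the `read_results` helper, the local temporary directory standing in for the real results location, and the illustrative file name `000000_0`. The `\x01` (Ctrl-A) delimiter used in the example is Hive's default field separator.

```python
import os
import tempfile


def read_results(results_dir, delim):
    """Yield rows from every regular file in results_dir, split on delim."""
    for name in sorted(os.listdir(results_dir)):
        path = os.path.join(results_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as f:
            for line in f:
                yield line.rstrip("\n").split(delim)


# Usage against a throwaway local directory standing in for results_dir.
with tempfile.TemporaryDirectory() as results_dir:
    with open(os.path.join(results_dir, "000000_0"), "w", encoding="utf-8") as f:
        f.write("1\x01alice\n2\x01bob\n")
    rows = list(read_results(results_dir, "\x01"))

print(rows)  # [['1', 'alice'], ['2', 'bob']]
```
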
  • 10. Logs Collected Per Query
      string get_log(LogContextId);
      Query_a <-- LogContextId_a <-- [dynamic thread to ctx map] -- Thread_1
      Query_b <-- LogContextId_b <-- [dynamic thread to ctx map] -- Thread_2
      LogDivertAppender
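
The thread-to-context map on slide 10 can be approximated with Python's `logging` module. This is a sketch of the idea only, not the Java `LogDivertAppender`: the `LogDivertHandler` class, its `bind`/`unbind` methods, and the logger name are hypothetical.

```python
import logging
import threading
from collections import defaultdict


class LogDivertHandler(logging.Handler):
    """Route each record to the buffer of the context its thread is bound to."""

    def __init__(self):
        super().__init__()
        self._thread_to_ctx = {}        # thread ident -> log context id
        self._logs = defaultdict(list)  # log context id -> formatted lines
        self._map_lock = threading.Lock()

    def bind(self, ctx_id):
        # Called by a worker thread before it starts running a query.
        with self._map_lock:
            self._thread_to_ctx[threading.get_ident()] = ctx_id

    def unbind(self):
        with self._map_lock:
            self._thread_to_ctx.pop(threading.get_ident(), None)

    def emit(self, record):
        # record.thread is the ident of the thread that logged the record.
        with self._map_lock:
            ctx_id = self._thread_to_ctx.get(record.thread)
            if ctx_id is not None:
                self._logs[ctx_id].append(self.format(record))

    def get_log(self, ctx_id):
        with self._map_lock:
            return "\n".join(self._logs.get(ctx_id, []))


logger = logging.getLogger("beeswax_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = LogDivertHandler()
logger.addHandler(handler)


def run(ctx_id, message):
    handler.bind(ctx_id)
    try:
        logger.info(message)
    finally:
        handler.unbind()


threads = [
    threading.Thread(target=run, args=("ctx_a", "compiling query a")),
    threading.Thread(target=run, args=("ctx_b", "compiling query b")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(handler.get_log("ctx_a"))  # compiling query a
print(handler.get_log("ctx_b"))  # compiling query b
```

The map has to be dynamic because pooled threads are reused: a thread is bound to a context only for the duration of one query, and records from unbound threads are dropped rather than misattributed.
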