What's on the Menu
● Hue Architecture
○ Many interfaces to implement
○ How do I list HDFS files, how do I submit a job...?
○ SDK
● Hue UI: Dynamic Workflow Editor
○ Why improve the user experience?
○ How can we improve the user experience?
○ Design Considerations
○ Design and Code Deep Dive
Integrate YARN
● JobBrowser MR2, Oozie
● No JT, 4 more REST APIs
● MR to History Server, missing logs...
● MR1/2 API not 100% compatible
(like Beeswax/HiveServer2, Beeswax
UI/Impala switches)
Integrate security
● 'hue' superuser: JT, Shell setuid root:hue
● One 'hue' Kerberos ticket
● Hive Server 2 ?
● 'hue' Proxy User / doAs
HDFS
Oozie
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
<value>*</value>
</property>
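On the HDFS side, impersonation over WebHDFS is expressed with the `doas` query parameter. The sketch below only builds such a URL; `listStatusUrl`, the host name, and the port are hypothetical, and it assumes pseudo authentication via `user.name` rather than Kerberos.

```javascript
// Hypothetical helper (not part of Hue): builds a WebHDFS LISTSTATUS URL in
// which the 'hue' service user acts on behalf of the logged-in end user.
// Assumes simple (pseudo) authentication; with Kerberos, user.name is not used.
function listStatusUrl(baseUrl, path, endUser) {
  const params = new URLSearchParams({
    op: 'LISTSTATUS',     // WebHDFS operation that lists a directory
    'user.name': 'hue',   // the authenticated proxy user
    doas: endUser         // the user being impersonated
  });
  // The path is appended as-is; a real client would also escape it.
  return baseUrl.replace(/\/$/, '') + '/webhdfs/v1' + path + '?' + params.toString();
}

// Assumed NameNode address, for illustration only.
const exampleUrl = listStatusUrl('http://namenode.example.com:50070', '/user/alice', 'alice');
console.log(exampleUrl);
```

The request only succeeds if the cluster allows 'hue' to act as a proxy user, which is what the `hosts` and `groups` properties configure.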
Why Improve User Experience
● Users like things that are easy to use
● Intuition and ease of use
How to Improve User Experience
● How can we do this for Oozie?
○ Hue users are not engineers
○ Most users are not familiar with shortcuts and
command lines
○ Windowing systems have taught us drag and drop is
good
Drag and drop everything in a Workflow!
Design Constraints
● Existing backend from Hue 2.1
○ Need to be able to easily migrate from Hue 2.1 to
Hue 2.2
● KnockoutJS and jQuery already chosen
○ Rudimentary templating
○ Subscription based bindings
○ Observables for arrays and JavaScript literals only
○ Event delegation
● Existing UI from Hue 2.1
○ Provides basic node movement through form
submission (reloads the page)
○ Not dynamic
Other Design Considerations
● Serializing should be trivial
● Basic API
○ Save a workflow
○ Validate a node
○ Read a workflow
● Difference in representation between Hue
2.1 backend and the KnockoutJS way of
doing things
● New nodes need an ID
Design - High Level Components
● Left out
○ Many event bindings and custom events
○ Views left out
Purpose of the Node Model
● Provides defaults for data:
var NodeModel = ModelModule($);
$.extend(NodeModel.prototype, {
  id: 0,
  name: '',
  description: '',
  node_type: '',
  workflow: 0,
  child_links: []
});
● Sent over the wire
● Mimics Django models
Model - ModelView Separation
● ModelViews should be the "shield" and
Models the source of truth.
● Models are more serializable if they do not
carry extraneous data.
● Subscribed update through KnockoutJS:
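$.each(mapping, function(key, value) {
  // Only observable fields of the ModelView are mirrored into the model.
  if (ko.isObservable(self[key])) {
    self[key].subscribe(function(newValue) {
      // Store a plain JS copy so the model stays serializable.
      model[key] = ko.mapping.toJS(newValue);
    });
  }
});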
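The same idea can be tried without KnockoutJS. In this minimal sketch, `observable` is a hand-written stand-in that only imitates the read/write/subscribe behaviour of `ko.observable`. It is not the KnockoutJS implementation, and `ModelView` here is a hypothetical simplification.

```javascript
// Minimal stand-in for ko.observable (an assumption for illustration only).
function observable(initial) {
  let current = initial;
  const subscribers = [];
  function obs(next) {
    if (arguments.length === 0) { return current; }      // read
    current = next;                                       // write
    subscribers.forEach(function (fn) { fn(current); });  // notify
    return obs;
  }
  obs.subscribe = function (fn) { subscribers.push(fn); };
  return obs;
}

// The model holds only plain, serializable data.
const model = { id: 1, name: '', description: '' };

// The ModelView wraps every model field in an observable and writes
// each change straight back into the model.
function ModelView(model) {
  const self = this;
  Object.keys(model).forEach(function (key) {
    self[key] = observable(model[key]);
    self[key].subscribe(function (newValue) { model[key] = newValue; });
  });
}

const view = new ModelView(model);
view.name('mapreduce-step');         // the UI edits the ModelView...
const wire = JSON.stringify(model);  // ...and the model is ready to send
console.log(wire);
```

Because the observables live only on the ModelView, the model can be passed to `JSON.stringify` directly.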
Purpose of the Registry
● Construction optimization
● Constant time node lookup
● Looking towards the future and storage
● Simple start:
var self = this;
self.nodes = {};
module.prototype.initialize.apply(self, arguments);
return self;
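A registry built on a plain object keyed by node ID could look like the sketch below. The `add`, `get`, and `remove` methods are hypothetical names chosen for illustration and are not taken from the Hue source.

```javascript
// Hypothetical registry sketch: a plain object used as an id -> node map.
function Registry() {
  this.nodes = {};
}

// Register a node under its id and return it for chaining.
Registry.prototype.add = function (node) {
  this.nodes[node.id] = node;
  return node;
};

// Look a node up by id; property access is constant time on average.
Registry.prototype.get = function (id) {
  return Object.prototype.hasOwnProperty.call(this.nodes, id) ? this.nodes[id] : null;
};

// Forget a node, for example when it is deleted from the workflow.
Registry.prototype.remove = function (id) {
  delete this.nodes[id];
};

const registry = new Registry();
registry.add({ id: 'mapreduce:1', name: 'count-words' });
registry.add({ id: 'pig:2', name: 'clean-logs' });

const found = registry.get('pig:2');     // the pig node
const missing = registry.get('hive:3');  // null, never registered
console.log(found.name, missing);
```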
Purpose of ID Generation
● Unique identifier for new nodes (e.g. mapreduce:1).
● Assists in creating parent-child relationships through
links.
var IdGeneratorModule = function($) {
  return function(options) {
    var self = this;
    $.extend(self, options);
    self.counter = 1;
    self.nextId = function() {
      return ((self.prefix) ? self.prefix + ':' : '') +
        self.counter++;
    };
  };
};
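A usage sketch of the generator follows. The `$` object is a stand-in that maps `extend` to `Object.assign` so the snippet runs without jQuery, and the `{ parent, child }` link shape is an assumption made for illustration.

```javascript
// Stand-in for jQuery: only the shallow $.extend used by the module.
const $ = { extend: Object.assign };

// The generator as shown on the slide.
var IdGeneratorModule = function($) {
  return function(options) {
    var self = this;
    $.extend(self, options);
    self.counter = 1;
    self.nextId = function() {
      return ((self.prefix) ? self.prefix + ':' : '') +
        self.counter++;
    };
  };
};

const IdGenerator = IdGeneratorModule($);
const generator = new IdGenerator({ prefix: 'mapreduce' });

// Two new nodes get distinct temporary ids before the backend has seen them.
const parent = { id: generator.nextId(), child_links: [] };
const child = { id: generator.nextId(), child_links: [] };

// Hypothetical link shape: the ids are enough to relate parent and child.
parent.child_links.push({ parent: parent.id, child: child.id });

console.log(parent.id, child.id);
```

Each generator keeps its own counter, so IDs are unique per generator instance rather than globally.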
Transpose to Show
● KnockoutJS supports 3 kinds of observables
○ Observables for literals
○ Observable arrays
○ Computed Observables
● DAG received is represented as a tree
● DAG represented as a list of lists when we display...
MVVM restriction
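One possible reading of the tree to list-of-lists step is a level-by-level walk, where each inner list is one row to render. The `name` and `children` fields and the `toRows` function are assumptions for illustration. The actual Hue representation may differ.

```javascript
// Hypothetical sketch: flatten a tree into rows, one row per depth level.
function toRows(root) {
  const rows = [];
  let level = [root];
  while (level.length > 0) {
    // Each row holds the names of all nodes at the current depth.
    rows.push(level.map(function (node) { return node.name; }));
    // The next level is every child of every node in this level.
    level = level.flatMap(function (node) { return node.children || []; });
  }
  return rows;
}

const tree = {
  name: 'start',
  children: [{
    name: 'fork',
    children: [
      { name: 'mapreduce', children: [] },
      { name: 'pig', children: [] }
    ]
  }]
};

const rows = toRows(tree);
console.log(JSON.stringify(rows));
```

A list of lists fits observable arrays directly, which may be why the display side prefers it. Note that a node reachable through two parents would appear twice in this naive walk.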
Other Difficulties
● Decision node representation
● JSON.stringify does not include parent class
members
● Memory consumption
● Cycles, cycles, cycles
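The JSON.stringify difficulty can be reproduced in a few lines: properties that live on a prototype are skipped. The `toPlainObject` helper below is a hypothetical workaround, not the approach Hue necessarily took.

```javascript
// Defaults live on the prototype, as in the node model.
function NodeModel(attrs) {
  Object.assign(this, attrs);
}
Object.assign(NodeModel.prototype, { id: 0, name: '', node_type: '' });

const node = new NodeModel({ name: 'start' });

// JSON.stringify only serializes own enumerable properties,
// so the inherited defaults are silently dropped.
const naive = JSON.stringify(node);

// Hypothetical workaround: copy own AND inherited enumerable data
// properties into a plain object before serializing.
function toPlainObject(obj) {
  const out = {};
  for (const key in obj) {  // for...in also walks the prototype chain
    if (typeof obj[key] !== 'function') {
      out[key] = obj[key];
    }
  }
  return out;
}

const full = JSON.stringify(toPlainObject(node));
console.log(naive);
console.log(full);
```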
Next steps
● Integrate
○ Pig, Hive Server 2
○ Oozie Bundles, SLA
○ Document model, "Editors", git
○ SDK revamp, language agnostic, proxy app
● UX
○ Impala real time UI
○ Redesign overall layout
● Sqoop 2, HBase? Mahout?...
Face of Hadoop/CDH