H2O Advancements
Arno Candel, PhD
Chief Architect, Physicist & Hacker, H2O.ai
@ArnoCandel
July 18 2016

New Enterprise Features
Mission-Critical Capabilities for Enterprise Production Use

Auth & Security: LDAP/Kerberos/HTTPS/encryption for maximum compliance, IPv6
Semi-Supervised: Pre-train Deep Learning model on unlabeled data, then fine-tune
Large POJOs: Productionize large Java models (multi-GB source code)
Hyperparameter Tuning: Automatically tunes model parameters for the desired metric, with convergence-based early stopping for models and search
Steam: Platform for Data Products - next-gen product
Sparkling Water 2.0: The killer app for the latest version of Apache Spark
Advanced Munging: Bringing R's data.table to H2O - scalable, fast and distributed
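
The hyperparameter tuning item on this slide (search with convergence-based early stopping) can be illustrated with a small, library-free sketch. This is not H2O's implementation: the function `random_search`, the `evaluate` callback, and the parameters `max_models`, `stopping_rounds` and `stopping_tolerance` are hypothetical names chosen for illustration, and the stopping rule is a simplified one (stop once the best score has not improved by more than a tolerance for a fixed number of consecutive models).

```python
import random


def random_search(evaluate, space, max_models=50, stopping_rounds=5,
                  stopping_tolerance=1e-3, seed=0):
    """Random search over `space` that stops once the best score has converged.

    evaluate: hypothetical callback, maps a parameter dict to a score
              (higher is better).
    space:    dict mapping each parameter name to a list of candidate values.
    Returns (best_params, best_score, history).
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    stale = 0          # consecutive models without a significant gain
    history = []       # every (params, score) pair that was evaluated

    for _ in range(max_models):
        # Draw one candidate configuration uniformly at random.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(params)
        history.append((params, score))

        gain = score - best_score
        if gain > 0:
            # Always keep the best model seen so far.
            best_params, best_score = params, score
        if gain > stopping_tolerance:
            stale = 0
        else:
            stale += 1
            if stale >= stopping_rounds:
                break  # the search has converged; stop early

    return best_params, best_score, history


if __name__ == "__main__":
    # Toy objective standing in for a validation metric (an assumption).
    def toy_metric(p):
        return -(p["learn_rate"] - 0.1) ** 2 - 0.01 * (p["max_depth"] - 5) ** 2

    grid = {"learn_rate": [0.01, 0.05, 0.1, 0.2], "max_depth": [3, 5, 7, 9]}
    best, score, tried = random_search(toy_metric, grid)
    print(best, score, len(tried))
```

The same rule can be applied per model (for example, to stop adding trees) as well as to the search itself, which is what "for models and search" on the slide suggests.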

New Features for H2O Tree Algorithms: GBM + DRF
Highest Accuracy for Smarter Applications

Optimal missing value handling: Missing data carries meaning and is rarely missing at random. Optimal splits are found taking missing values into account.
Quantile-based histograms: Finds optimal split points for data with outliers, e.g. -99999, 0, 1, 2, 3, 4, 5
New Algorithm: ExtraTreesClassifier (pick best among random split points)
Robust Regression (Huber loss for GBM): Higher accuracy for models on data with outliers; quadratic loss for inliers, linear loss for outliers
More tuning parameters for GBM:
col_sample_rate_change_per_level - H2O exclusive
learn_rate_annealing - faster training
min_split_improvement - avoids overfitting
max_abs_leafnode_pred - avoids overfitting
sample_rate_per_class - for imbalanced datasets
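
The Huber loss item on this slide ("quadratic loss for inliers, linear loss for outliers") can be made concrete with a short sketch. The function names and the fixed threshold `delta` are assumptions for illustration; the slide does not say how the inlier/outlier threshold is chosen in H2O's GBM.

```python
def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it.

    `delta` is a hypothetical fixed threshold separating inliers from outliers.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a               # inlier: squared-error behaviour
    return delta * (a - 0.5 * delta)     # outlier: grows only linearly


def huber_gradient(residual, delta=1.0):
    """Derivative of the Huber loss with respect to the residual.

    This is the residual clipped to [-delta, delta], so a single outlier
    cannot dominate the gradient a boosting step is fitted to.
    """
    return max(-delta, min(delta, residual))


if __name__ == "__main__":
    for r in (0.5, 1.0, 100.0):
        squared = 0.5 * r * r
        print(r, squared, huber_loss(r), huber_gradient(r))
```

For a residual of 100 with `delta = 1.0`, the squared loss is 5000 while the Huber loss is 99.5, and the gradient is capped at 1.0. That capping is the reason an outlier pulls much less on the fit.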

Deep Water: Next-Gen Deep Learning in H2O
Enterprise Deep Learning for Business Transformation

Integration with existing GPU backends: Leverage open-source tools and research for TensorFlow, Caffe, mxnet, Theano, etc.
Scalability and Ease of Use/Deployment of H2O: Distributed training, real-time model inspection; Flow, R, Python, Spark/Scala, Java, REST, POJO, Steam
Convolutional Neural Networks: Image, video, speech recognition, etc.
Recurrent Neural Networks: Sequences, time series, etc.; NLP (natural language processing)
Hybrid Neural Network Architectures: Speech-to-text translation, image captioning, scene parsing, etc.

Much more about Deep Learning tomorrow afternoon!
