Deep	Water	
Or:	Bringing	TensorFlow	et	al.	to	H2O
Arno	Candel,	PhD	
Chief	Architect,	Physicist	&	Hacker,	H2O.ai	
@ArnoCandel	
July	18	2016
Overview
Machine Learning (ML)
Artificial Intelligence (A.I.)
Computer Science (CS)
H2O.ai
Deep Learning (DL)
hot hot hot hot hot
A	Simple	Deep	Learning	Model:	Artificial	Neural	Network
heartbeat
blood
pressure
oxygen
send to regular care
send to intensive

care unit (ICU)
IN: data OUT: prediction
nodes : neuron activations (real numbers) — represent features
arrows : connecting weights (real numbers) — learned during training
: non-linearity x -> f(x) — adds model complexity
from 1970s, now rebranded as DL
• conceptually	simple	
• non	linear	
• highly	flexible	and	configurable	
• learned	features	can	be	extracted	
• can	be	fine-tuned	with	more	data	
• efficient	for	multi-class	problems	
• world-class	at	pattern	recognition
Deep	Learning	Pros	and	Cons
Pros:
• hard	to	interpret	
• theory	not	well	understood	
• slow	to	train	and	score	
• overfits,	needs	regularization	
• many	hyper-parameters	
• inefficient	for	categorical	variables	
• very	data	hungry,	learns	slowly
Cons:
Deep	Learning	got	boosted	recently	by	faster	computers
Brief	History	of	A.I.,	ML	and	DL
John McCarthy

Princeton, Bell Labs, Dartmouth, later: MIT, Stanford
1955: “A proposal for the Dartmouth summer
research project on Artificial Intelligence”
with Marvin Minsky (MIT), Claude Shannon 

(Bell Labs) and Nathaniel Rochester (IBM)
http://www.asiapacific-mathnews.com/04/0403/0015_0020.pdf
A step back: A.I. was coined over 60 years ago
1955	proposal	for	the	Dartmouth	summer	research	project	on	A.I.
“We propose that a 2-month, 10-man study of artificial
intelligence be carried out during the summer of 1956 at
Dartmouth College in Hanover, New Hampshire. The study is to
proceed on the basis of the conjecture that every aspect of
learning and any other feature of intelligence can in principle be
so precisely described that a machine can be made to simulate
it. An attempt will be made to find how to make machines use
language, form abstractions and concepts, solve kinds of problems
now reserved for humans, and improve themselves. We think that
a significant advance can be made in one or more of these
problems if a carefully selected group of scientists work on it
together for one summer.”
Step	1:	Great	Algorithms	+	Fast	Computers
http://nautil.us/issue/18/genius/why-the-chess-computer-deep-blue-played-like-a-human
1997: Playing Chess
(IBM Deep Blue beats Kasparov)
Computer	Science

30	custom	CPUs,	60	billion	moves	in	3	mins
“No computer will ever beat me at playing chess.”
Step	2:	More	Data	+	Real-Time	Processing
http://cs.stanford.edu/group/roadrunner/old/presskit.html
2005: Self-driving Cars

DARPA Grand Challenge, 132 miles
(won by Stanford A.I. lab*)
Sensors	&	Computer	Science

video,	radar,	laser,	GPS,	7	Pentium	computers
“No computer will ever drive a car!?”
*A.I. lab was established by McCarthy et al. in the early 60s
Step	3:	Big	Data	+	In-Memory	Clusters
2011: Jeopardy (IBM Watson)
In-Memory	Analytics/ML	
4	TB	of	data	(incl.	wikipedia),	90	servers,

16	TB	RAM,	Hadoop,	6	million	logic	rules
https://www.youtube.com/watch?v=P18EdAKuC1U https://en.wikipedia.org/wiki/Watson_(computer)
Note: IBM Watson received the question in electronic written form, and was often
able to press the answer button faster than the competing humans.
“No computer will ever answer random questions!?”
“No computer will ever understand my language!?”
2014: Google

(acquired Quest Visual)
Deep	Learning

Convolutional	and	Recurrent	
Neural	Networks,	
with	training	data	from	users
Step	4:	Deep	Learning
• Translate between 103 languages by typing
• Instant camera translation: Use your camera to translate text instantly in 29 languages
• Camera Mode: Take pictures of text for higher-quality translations in 37 languages
• Conversation Mode: Two-way instant speech translation in 32 languages
• Handwriting: Draw characters instead of using the keyboard in 93 languages
Step	5:	Augmented	Deep	Learning
2014: Atari Games (DeepMind)
2016: AlphaGo (Google DeepMind)
Deep	Learning

+	reinforcement	learning,	tree	search,

Monte	Carlo,	GPUs,	playing	against	itself,	…
Go board has approx.
200,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 (2E170) possible positions.
trained	from	raw	pixel	values,	no	human	rules
“No computer will ever beat the best Go master!?”
Microsoft had won the Visual Recognition challenge:
http://image-net.org/challenges/LSVRC/2015/
Step	6:	A.I.	Chatbots	have	Opinions	too!
What	Will	Change?
Today Tomorrow
Better	Data	—	Better	Models	—	Better	Results
Example:	
Fraud	
Prediction
What	about	Jobs?
Anything that can be automated will be automated.
Jobs of the past:
assembly line work, teller (ATMs), taxi-firm receptionist
Jobs being automated away now:
resume matching, driving, language translation, education
Jobs being automated away soon:
healthcare, arts & crafts, entertainment, design, decoration,
software engineering, politics, management, professional
gaming, financial planning, auditing, real estate agent
Jobs of the future:
professional sports, food & wine reviewer
Maybe	we’ll	finally	get	the	eating	machine?
https://www.youtube.com/watch?v=n_1apYo6-Ow
1936 Modern Times (Charlie Chaplin)
Live	H2O	Deep	Learning	Demo:	Predict	Airplane	Delays
10 nodes: all

320 cores busy
real-time, interactive
model inspection in Flow
116M rows, 6GB CSV file

800+ predictors (numeric + categorical)
model trained in <1 min:

2M+ samples/second
Deep Learning Model
H2O Elastic Net (GLM): 10 secs
alpha=0.5, lambda=1.379e-4 (auto)
H2O Deep Learning: 45 secs
4 hidden ReLU layers of 20 neurons, 1 epoch
Features have non-
linear impact
Chicago, Atlanta,
Dallas:

often delayed
Significant	Performance	Gains	with	Deep	Learning
Predict departure delay (Y/N) on 20 years of airline flight data
(116M rows, 12 cols, categorical + numerical data with missing values)
WATCH NOW
AUC: 0.656
AUC: 0.703
(higher is better, ranges from 0.5 to 1)
Feature importances
10 nodes: Dual E5-2650 (8 cores, 2.6GHz), 10GbE
READ MORE
Kaggle challenge
2nd place winner

Colin Priest
READ MORE
“For	my	final	competition	submission	I	used	an	ensemble	of	
models,	including	3	deep	learning	models	built	with	R	and	h2o.”
“I	did	really	like	H2O’s	deep	learning	implementation	in	
R,	though	-	the	interface	was	great,	the	back	end	
extremely	easy	to	understand,	and	it	was	scalable	and	
flexible.	Definitely	a	tool	I’ll	be	going	back	to.”
H2O	Deep	Learning	Community	Quotes
H2O	Deep	Learning	Community	Quotes
READ MORE WATCH NOW
“H2O	Deep	Learning	models	outperform	other	Gleason	
predicting	models.”
READ MORE
“…	combine	ADAM	and	Apache	Spark	with	H2O’s	deep	learning	capabilities	to	predict	
an	individual’s	population	group	based	on	his	or	her	genomic	data.	Our	results	
demonstrate	that	we	can	predict	these	very	well,	with	more	than	99%	accuracy.”
H2O	Deep	Learning	Community	Quotes
H2O	Booklets
DOWNLOAD
R Python Deep Learning GLMGBMSparkling Water
KDNuggets	Poll	about	Deep	Learning	Tools	&	Platforms
http://www.kdnuggets.com
H2O	and	TensorFlow	are	tied
usage	of	Deep	
Learning	tools	in	
past	year
TensorFlow	+	H2O	+	Apache	Spark	=	Anything	is	possible
https://github.com/h2oai/sparkling-
water/blob/master/py/examples/
notebooks/TensorFlowDeepLearning.ipynb https://www.youtube.com/watch?v=62TFK641gG8
Integration	with	existing	
GPU	backends
Leverage	open-source	tools	and	research	for	
TensorFlow,	Caffe,	mxnet,	Theano,	etc.
Scalability	and	Ease	of	
Use/Deployment	of	H2O
Distributed	training,	real-time	model	inspection	
Flow,	R,	Python,	Spark/Scala,	Java,	REST,	POJO,	Steam
Convolutional	Neural	
Networks
Image,	video,	speech	recognition,	etc.
Recurrent	Neural	
Networks
Sequences,	time	series,	etc.	
NLP:	natural	language	processing
Hybrid	Neural	Networks	
Architectures
Speech	to	text	translation,	image	captioning,	scene	
parsing,	etc.
Deep	Water:	Next-Gen	Deep	Learning	in	H2O
Enterprise	Deep	Learning	for	Business	Transformation
Deep	Water:	Next-Gen	Deep	Learning	in	H2O
Don’t	miss	our	Deep	Learning	sessions	tomorrow	afternoon!
TensorFlow mxnetCaffe H2O

Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O