H2O at Poznan R Meetup
Introduction to H2O, IoT Use Cases and Deep Water
Jo-fai (Joe) Chow
Data Scientist
joe@h2o.ai
@matlabulous
Poznan R
20th April, 2017
About Me
• Civil (Water) Engineer
• 2010 – 2015
• Consultant (UK)
• Utilities
• Asset Management
• Constrained Optimization
• Industrial PhD (UK)
• Infrastructure Design Optimization
• Machine Learning + Water Engineering
• Discovered H2O in 2014
• Data Scientist
• 2015
• Virgin Media (UK)
• Domino Data Lab (Silicon Valley)
• 2016 – Present
• H2O.ai (Silicon Valley)
Side Project #1 – Crime Data Visualization
https://github.com/woobe/rApps/tree/master/crimemap
http://insidebigdata.com/2013/11/30/visualization-week-crimemap/
Side Project #2 – Data Visualization Contest
https://github.com/woobe/rugsmaps
http://blog.revolutionanalytics.com/2014/08/winner-for-revolution-analytics-user-group-map-contest.html
Side Project #3
Developing R Packages for Fun
rPlotter (2014)
Side Project #4 – Kaggle Blog Post
R + H2O + Domino for Kaggle
Guest Blog Post for Domino & H2O (2014)
• The Long Story
• bit.ly/joe_kaggle_story
Agenda
• Introduction
• Company
• Machine Learning Platform
• IoT Use Cases
• Predictive Maintenance
• Outlier Detection
• Break (25 mins)
• Deep Water
• Motivation / Benefits
• Interfaces / Demo
About H2O.ai
Company Overview
Founded: 2011; venture-backed, debuted in 2012
Products:
• H2O Open Source In-Memory AI Prediction Engine
• Sparkling Water
• Steam
Mission: Operationalize Data Science, and provide a platform for users to build beautiful data products
Team: 70 employees
• Distributed Systems Engineers doing Machine Learning
• World-class visualization designers
Headquarters: Mountain View, CA
Our Team
Joe
Scientific Advisory Council
H2O Community Growth
Tremendous Momentum Globally
Large User Circle
• 65,000+ users globally (Sept 2016)
• 65,000+ users from ~8,000 companies in 140 countries. Top 5 from:
• ~8,000+ companies using H2O (Sept 2016)
[Charts: # H2O Users and # Companies Using H2O, 1-Jan-15 to 1-Oct-16; growth annotations: +127%, +60%]
* Data from Google Analytics embedded in the end-user product
#AroundTheWorldWithH2Oai
H2O for Kaggle Competitions
H2O for Academic Research
http://www.sciencedirect.com/science/article/pii/S0377221716308657
https://arxiv.org/abs/1509.01199
Users In Various Verticals Adore H2O
Financial, Insurance, Marketing, Telecom, Healthcare
Joe (2015)
http://www.h2o.ai/gartner-magic-quadrant/
Check out our website: h2o.ai
H2O Machine Learning Platform
High Level Architecture
• Data sources: HDFS, S3, NFS, Local, SQL
• Load Data: Distributed In-Memory, Loss-less Compression
• H2O Compute Engine: Exploratory & Descriptive Analysis; Feature Engineering & Selection; Supervised & Unsupervised Modeling; Model Evaluation & Selection; Predict
• Data & Model Storage
• Production Scoring Environment: Model Export (Plain Old Java Object); Data Prep Export (Plain Old Java Object); Your Imagination
Import Data from
Multiple Sources
Fast, Scalable & Distributed
Compute Engine Written in
Java
Algorithms Overview
Supervised Learning
Statistical Analysis
• Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Ensembles
• Distributed Random Forest: Classification or regression models
• Gradient Boosting Machine: Produces an ensemble of decision trees with increasingly refined approximations
Deep Neural Networks
• Deep learning: Creates multi-layer feed-forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
Unsupervised Learning
Clustering
• K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detects optimal k
Dimensionality Reduction
• Principal Component Analysis: Linearly transforms correlated variables to uncorrelated components
• Generalized Low Rank Models: Extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data
Anomaly Detection
• Autoencoders: Find outliers via nonlinear dimensionality reduction using deep learning
H2O Deep Learning in Action
Multiple Interfaces
H2O + R
Package ‘h2o’ from CRAN or H2O’s website
Start a local H2O (Java Virtual Machine) cluster
Simple ‘iris’ example
H2O + Python
H2O Flow (Web) Interface
Export Standalone Models
for Production
H2O Overview
docs.h2o.ai
H2O Tutorials
• Introduction to Machine Learning with H2O and Python
• Basic Extract, Transform and Load (ETL)
• Supervised Learning
• Parameters Tuning
• Stacking
• GitHub Repository
• bit.ly/joe_h2o_tutorials
• R Code Examples (available soon)
H2O IoT Use Cases
Predictive Maintenance and Outlier Detection
Predictive Maintenance – SECOM Dataset
We want to predict failures in the future.
• Dataset Summary
• Inputs:
• 591 numerical features
• Binary Outcome:
• Pass (-1)
• Fail (1)
• Size:
• 1567 samples
Basic H2O Usage – GBM with Default Settings
• Link to Jupyter Notebook
• https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_maintenance/step_01_basics.ipynb
H2O’s R package
Start a local H2O Cluster (JVM)
“nthreads = -1” means using all available virtual cores
Information about the H2O Cluster
Import data into H2O cluster
(instead of R’s memory)
Convert numerical to
categorical values
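Why the conversion matters: with a numeric outcome H2O fits a regression model, while a categorical outcome gives a classification model. The sketch below is plain Python, not the H2O API; it only illustrates mapping the SECOM outcome codes to labels. The name `to_categorical` is hypothetical.

```python
# Plain-Python illustration (not the H2O API): map numeric outcome codes to labels.
SECOM_LABELS = {-1: "Pass", 1: "Fail"}  # codes as listed in the dataset summary


def to_categorical(codes, mapping=SECOM_LABELS):
    """Replace each numeric code with its category label; unknown codes raise KeyError."""
    return [mapping[code] for code in codes]


outcome = [-1, -1, 1, -1]          # made-up sample of outcome codes
labels = to_categorical(outcome)   # one label per row
levels = sorted(set(labels))       # the distinct categories ("levels")
print(labels)
print(levels)
```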
Split data into training / test in order to evaluate out-of-sample performance later
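A minimal sketch of a random train / test split in plain Python (not the H2O API). The 80/20 ratio, the seed and the name `split_rows` are assumptions made for illustration; the notebook may use different values.

```python
import random


def split_rows(rows, train_ratio=0.8, seed=1234):
    """Shuffle row indices once, then cut them into a training and a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)      # seeded, so the split is repeatable
    cut = int(len(rows) * train_ratio)
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


rows = list(range(1567))   # SECOM has 1567 samples; integers stand in for rows
train, test = split_rows(rows)
print(len(train), len(test))
```

The test part is never shown to the model during training, so metrics computed on it estimate performance on unseen data.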
H2O automatically ignores columns with constant values
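To see what a constant column is, here is a plain-Python sketch (not H2O's internal logic). The table and column names are made up for illustration.

```python
def constant_columns(table):
    """Return the names of columns whose values are all identical."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


# Hypothetical toy table; column names are invented for this example.
table = {
    "sensor_a": [0.1, 0.4, 0.3],
    "sensor_b": [5.0, 5.0, 5.0],   # constant: carries no information for a model
    "sensor_c": [1.0, 2.0, 1.0],
}
dropped = constant_columns(table)
kept = [name for name in table if name not in dropped]
print(dropped, kept)
```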
Classification Performance – Confusion Matrix
Confusion Matrix
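A small plain-Python sketch of how a confusion matrix is counted, using the SECOM outcome codes (-1 = Pass, 1 = Fail). The actual and predicted values below are made up; they are not results from the notebook.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels=(-1, 1)):
    """Rows are actual classes, columns are predicted classes, in `labels` order."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Made-up example values
actual    = [-1, -1, -1, -1, 1,  1, -1, 1]
predicted = [-1, -1,  1, -1, 1, -1, -1, 1]

matrix = confusion_matrix(actual, predicted)
# The diagonal holds the correct predictions
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(actual)
print(matrix, accuracy)
```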
Looking at metrics based on training (in-sample) data only
They do not represent out-of-sample performance
Users could reduce the number of features based on these findings
Metrics based on test (out-of-sample) data
H2O returns the predicted class as well as the probability of each class
Advanced H2O Usage – Random Grid Search
• Link to Jupyter Notebook
• https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_maintenance/step_02_random_grid_search.ipynb
• Using Random Grid Search to fine-tune hyper-parameters
API for performing a Random Grid Search
Grid search results sorted by a specific metric. Best model on top.
Extract the best model
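The idea behind a random grid search can be sketched in plain Python (this is not the H2O grid search API): sample a limited number of hyper-parameter combinations, score each one, sort the results with the best on top, and take the first. The hyper-parameter names, their values and `toy_score` are all made up; in practice the score would come from training and evaluating a model.

```python
import itertools
import random


def random_grid_search(grid, score_fn, max_models=5, seed=1234):
    """Score a random subset of the grid; return (score, params) pairs, best first."""
    names = sorted(grid)
    combos = [dict(zip(names, values))
              for values in itertools.product(*(grid[n] for n in names))]
    sampled = random.Random(seed).sample(combos, min(max_models, len(combos)))
    results = [(score_fn(params), params) for params in sampled]
    results.sort(key=lambda pair: pair[0], reverse=True)   # higher score is better
    return results


# Hypothetical hyper-parameter grid (4 x 3 x 3 = 36 combinations)
grid = {
    "max_depth": [3, 5, 7, 9],
    "learn_rate": [0.01, 0.05, 0.1],
    "sample_rate": [0.7, 0.8, 0.9],
}


def toy_score(params):
    """Made-up stand-in for a validation metric such as AUC."""
    return 1.0 - abs(params["max_depth"] - 5) * 0.05 - abs(params["learn_rate"] - 0.05)


results = random_grid_search(grid, toy_score, max_models=10)
best_score, best_params = results[0]   # "extract the best model"
print(best_score, best_params)
```

Sampling 10 of the 36 combinations keeps the cost bounded, at the price of possibly missing the best combination.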
Performance metrics based on
5-fold cross-validation
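A plain-Python sketch of how 5-fold cross-validation partitions the rows (not H2O code). Assigning row i to fold i mod 5 is an assumption chosen for simplicity; random or stratified assignment are common alternatives.

```python
def k_fold_indices(n_rows, k=5):
    """Assign row i to fold i % k and yield (train, validation) index lists."""
    for fold in range(k):
        valid = [i for i in range(n_rows) if i % k == fold]
        train = [i for i in range(n_rows) if i % k != fold]
        yield train, valid


folds = list(k_fold_indices(1567, k=5))   # 1567 = number of SECOM samples
fold_sizes = [len(valid) for _, valid in folds]
print(fold_sizes)
```

Each row is used for validation exactly once and for training in the other four rounds; the reported metric is then based on the held-out predictions.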
Comparison – Default vs. Tuned
GBM with Default Settings vs. GBM with Random Grid Search
Still not perfect, but better than default settings
Outlier Detection
Photo credit: www.dbta.com
Outlier Detection – MNIST Dataset
Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
• 784 Inputs
• 28 x 28 = 784 pixels
• 1 Output
• 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9
• File (from Kaggle)
• Train (42k Records)
• kaggle_mnist_train.csv.gz
• Source
• https://www.kaggle.com/c/digit-recognizer/data
Photo credit: https://ml4a.github.io/ml4a/neural_networks/
Advanced H2O Usage – Random Grid Search
• Link to Data and Code
• https://github.com/woobe/h2o_tutorials/tree/master/use_cases/outlier_detection
Import data (directly from a .gz file)
Remove the label (not needed for unsupervised learning)
Input = 784 pixels
Output = 784 pixels
Nodes = 50
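To get a feel for the size of this autoencoder, the sketch below counts weights and biases in plain Python. It assumes a single fully connected hidden layer of 50 nodes with bias terms (784 → 50 → 784); that layout is an assumption based on the numbers above, not a statement about the exact model in the notebook.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


layers = [784, 50, 784]                # input pixels, hidden nodes, output pixels
n_params = dense_parameter_count(layers)
compression = layers[0] / layers[1]    # each image is squeezed through 50 numbers
print(n_params, compression)
```

Because every image has to pass through only 50 numbers, unusual digits tend to be reconstructed poorly, which is what makes them detectable as outliers.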
Quantify the average reconstruction error per sample
Avg. Error = (difference between original image and reconstruction) / no. of pixels
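A plain-Python sketch of the per-sample error (not H2O code). It uses the squared pixel difference, i.e. a mean squared error; that choice is an assumption, since the slide only shows "difference / no. of pixels". The 4-pixel "images" are made up to keep the arithmetic visible.

```python
def mean_reconstruction_error(original, reconstruction):
    """Average squared pixel difference for one sample."""
    if len(original) != len(reconstruction):
        raise ValueError("original and reconstruction must have the same length")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstruction)]
    return sum(squared) / len(original)


# Made-up 4-pixel samples instead of 784-pixel MNIST images
original = [0.0, 1.0, 1.0, 0.0]
good_reconstruction = [0.0, 1.0, 1.0, 0.0]
poor_reconstruction = [1.0, 0.0, 1.0, 0.0]

errors = [
    mean_reconstruction_error(original, good_reconstruction),
    mean_reconstruction_error(original, poor_reconstruction),
]
print(errors)
```

Sorting samples by this error puts the most unusual ones (the outlier candidates) at the top.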
Helper functions for plotting graphs only (not related to H2O algorithms)
Original Image vs. Reconstruction (three example figures)
End of First Talk
Let’s have a break ☺
H2O Deep Water
H2O’s Integration with TensorFlow, mxnet and Caffe
TensorFlow
• Open source machine learning
framework by Google
• Python / C++ API
• TensorBoard
• Data Flow Graph Visualization
• Multi CPU / GPU
• v0.8+ distributed machines support
• Multi devices support
• desktop, server and Android devices
• Image, audio and NLP applications
• HUGE Community
• Support for Spark, Windows …
https://github.com/tensorflow/tensorflow
https://github.com/dmlc/mxnet
https://www.zeolearn.com/magazine/amazon-to-use-mxnet-as-deep-learning-framework
Caffe
• Convolutional Architecture for Fast Feature Embedding (Caffe)
• Pure C++ / CUDA architecture for deep learning
• Command line, Python and MATLAB interface
• Model Zoo
• Open collection of models
https://docs.google.com/presentation/d/1UeKXVgRvvxg9OUdh_UiC5G71UMscNPlvArsWER41PsU/
H2O Deep Learning
Both TensorFlow and H2O are widely used
TensorFlow, MXNet, Caffe and H2O DL democratize the power of deep learning.
H2O platform democratizes artificial intelligence & big data science.
There are other open source deep learning libraries like Theano and Torch too.
Let’s have a party, this will be fun!
Deep Water
Next-Gen Distributed Deep Learning with H2O
H2O integrates with existing GPU backends for significant performance gains
One Interface - GPU Enabled - Significant Performance Gains
Inherits All H2O Properties in Scalability, Ease of Use and Deployment
• Recurrent Neural Networks: enabling natural language processing, sequences, time series, and more
• Convolutional Neural Networks: enabling image, video and speech recognition
• Hybrid Neural Network Architectures: enabling speech-to-text translation, image captioning, scene parsing and more
Deep Water Architecture
• Clients: R/Py/Flow/Scala client, REST API, Web server
• Each node (Node 1 … Node N): Spark (Scala); H2O Execution Engine (Java); TensorFlow/mxnet/Caffe (C++) on GPU / CPU
• Communication labels in the diagram: RPC, grpc/MPI/RDMA
Available Networks in Deep Water
• LeNet
• AlexNet
• VGGNet
• Inception (GoogLeNet)
• ResNet (Deep Residual Learning)
• Build Your Own
ResNet
Example: Deep Water + H2O Flow
Choosing different network structures
Choosing different backends (TensorFlow, MXNet, Caffe)
Unified Interface (Deep Water + R)
Choosing different network structures
Unified Interface (Deep Water + Python)
Change backend to “mxnet”, “caffe” or “auto”
Choosing different network structures
Easy Stacking with other H2O Models
Ensemble of Deep Water, Gradient Boosting Machine & Random Forest models
docs.h2o.ai
https://github.com/h2oai/h2o-3/tree/master/examples/deeplearning/notebooks
Deep Water Example notebooks
Deep Water Cat/Dog/Mouse Demo
Deep Water R Demo
• H2O + MXNet + TensorFlow
• Dataset – Cat/Dog/Mouse
• MXNet & TF as GPU backend
• Train LeNet (CNN) models
• R Demo
• Code and Data
• github.com/h2oai/deepwater
Data – Cat/Dog/Mouse Images
Data – CSV
Deep Water – Basic Usage
Live Demo if Possible
Start and Connect to H2O Deep Water Cluster
• Download Latest Nightly Build
• https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar
• In Terminal
• cd to the folder containing h2o.jar
• java -jar h2o.jar (this is the default command)
• java -Xmx16g -jar h2o.jar (this is the command to allow up to 16GB of Java heap memory)
• In R
• library(h2o) (latest stable release from h2o.ai website or CRAN)
• h2o.connect(ip = "xxx.xxx.xxx.xxx", strict_version_check = FALSE)
Import CSV
Train a CNN (LeNet) Model on GPU
Using GPU for training
Model
Deep Water – Custom Network
Saving the custom network structure as a file
Configure custom network structure (MXNet syntax)
Train a Custom Network
Point it to the custom network structure file
Model
Conclusions
Project “Deep Water”
• H2O + TF + MXNet + Caffe
• A powerful combination of widely used open source machine learning libraries.
• All Goodies from H2O
• Inherits all H2O properties in scalability, ease of use and deployment.
• Unified Interface
• Allows users to build, stack and deploy deep learning models from different libraries efficiently.
• Latest Nightly Build
• https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar
• 100% Open Source
• The party will get bigger!
Other H2O Developments
• H2O + xgboost [Link]
• Stacked Ensembles [Link]
• Automatic Machine Learning [Link]
• Time Series [Link]
• High Availability Mode in Sparkling Water [Link]
• Model Interpretation [Link]
• word2vec [Link]
• Previous Talks
• https://github.com/h2oai/h2o-meetups/blob/master/2017_04_06_Amsterdam/2017_04_06_Latest_H2O_Developments.pdf
• Organizers & Sponsors
• Poznan R Users Group (PAZUR)
• H2O.ai
Thanks!
• Code, Slides & Documents
• bit.ly/h2o_meetups
• docs.h2o.ai
• Contact
• joe@h2o.ai
• @matlabulous
• github.com/woobe
• Please search/ask questions on Stack Overflow
• Use the tag `h2o` (not H2 zero)

More Related Content

What's hot

H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
Jo-fai Chow
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
Jo-fai Chow
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
Sri Ambati
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
Jo-fai Chow
 
H2O Big Join Slides
H2O Big Join SlidesH2O Big Join Slides
H2O Big Join Slides
Sri Ambati
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LA
Sri Ambati
 
Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2O
odsc
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use Cases
Jo-fai Chow
 
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, AetnaFrom H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
Sri Ambati
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
Sri Ambati
 
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyMaking Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Jo-fai Chow
 
ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217
Sri Ambati
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
Sri Ambati
 
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIAH2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
Sri Ambati
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
Sri Ambati
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
Ian Gomez
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Sri Ambati
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
Sri Ambati
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
Jo-fai Chow
 

What's hot (19)

H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
 
H2O Big Join Slides
H2O Big Join SlidesH2O Big Join Slides
H2O Big Join Slides
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LA
 
Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2O
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use Cases
 
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, AetnaFrom H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyMaking Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
 
ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
 
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIAH2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 

Similar to H2O at Poznan R Meetup

Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
Sri Ambati
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
Jo-fai Chow
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Matt Stubbs
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
Sri Ambati
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
Sri Ambati
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
Sri Ambati
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Codemotion
 
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2OIntroduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Data Science Milan
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Sri Ambati
 
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetupH2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
PyData Piraeus
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Demi Ben-Ari
 
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Wes McKinney
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data Platform
GoDataDriven
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Codemotion
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Demi Ben-Ari
 
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdfOSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
Altinity Ltd
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville Meetup
Sri Ambati
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
Mostafa Majidpour
 
Machine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2OMachine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2O
Sri Ambati
 

Similar to H2O at Poznan R Meetup (20)

Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2OIntroduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...

H2O at Poznan R Meetup

  • 1. H2O at Poznan R Meetup Introduction to H2O, IoT Use Cases and Deep Water Jo-fai (Joe) Chow Data Scientist joe@h2o.ai @matlabulous Poznan R 20th April, 2017
  • 2. About Me • Civil (Water) Engineer • 2010 – 2015 • Consultant (UK) • Utilities • Asset Management • Constrained Optimization • Industrial PhD (UK) • Infrastructure Design Optimization • Machine Learning + Water Engineering • Discovered H2O in 2014 • Data Scientist • 2015 • Virgin Media (UK) • Domino Data Lab (Silicon Valley) • 2016 – Present • H2O.ai (Silicon Valley) 2
  • 4. Side Project #1 – Crime Data Visualization 4 https://github.com/woobe/rApps/tree/master/crimemap http://insidebigdata.com/2013/11/30/visualization-week-crimemap/
  • 5. Side Project #2 – Data Visualization Contest 5 https://github.com/woobe/rugsmaps http://blog.revolutionanalytics.com/2014/08/winner-for-revolution-analytics-user-group-map-contest.html
  • 6. Side Project #3 6 Developing R Packages for Fun rPlotter (2014)
  • 7. Side Project #4 – Kaggle Blog Post 7 R + H2O + Domino for Kaggle Guest Blog Post for Domino & H2O (2014) • The Long Story • bit.ly/joe_kaggle_story
  • 8. Agenda • Introduction • Company • Machine Learning Platform • IoT Use Cases • Predictive Maintenance • Outlier Detection • Break (25 mins) • Deep Water • Motivation / Benefits • Interfaces / Demo 8
  • 10. Company Overview • Founded: 2011; venture-backed, debuted in 2012 • Products: H2O Open Source In-Memory AI Prediction Engine, Sparkling Water, Steam • Mission: Operationalize data science and provide a platform for users to build beautiful data products • Team: 70 employees (distributed systems engineers doing machine learning; world-class visualization designers) • Headquarters: Mountain View, CA
  • 14. H2O Community Growth: Tremendous Momentum Globally • 65,000+ users from ~8,000 companies in 140 countries (Sept 2016) • [Charts: # H2O Users and # Companies Using H2O, 1-Jan-15 to 1-Oct-16, with growth annotations +127% and +60%] • Data from Google Analytics embedded in the end user product
  • 16. H2O for Kaggle Competitions 16
  • 17. H2O for Academic Research 17 http://www.sciencedirect.com/science/article/pii/S0377221716308657 https://arxiv.org/abs/1509.01199
  • 18. Users In Various Verticals Adore H2O • Financial • Insurance • Marketing • Telecom • Healthcare
  • 21. H2O Machine Learning Platform 21
  • 24. High Level Architecture • Data sources: HDFS, S3, NFS, Local, SQL • Load Data: distributed in-memory, loss-less compression • H2O Compute Engine: Exploratory & Descriptive Analysis; Feature Engineering & Selection; Supervised & Unsupervised Modeling; Model Evaluation & Selection; Predict • Data & Model Storage • Model Export: Plain Old Java Object • Data Prep Export: Plain Old Java Object • Production Scoring Environment • Your Imagination
  • 25. High Level Architecture: Import Data from Multiple Sources
  • 26. High Level Architecture: Fast, Scalable & Distributed Compute Engine Written in Java
  • 27. High Level Architecture: Fast, Scalable & Distributed Compute Engine Written in Java
  • 28. Algorithms Overview • Supervised Learning • Statistical Analysis: Generalized Linear Models (Binomial, Gaussian, Gamma, Poisson and Tweedie); Naïve Bayes • Ensembles: Distributed Random Forest (classification or regression models); Gradient Boosting Machine (produces an ensemble of decision trees with increasingly refined approximations) • Deep Neural Networks: Deep Learning (multi-layer feed-forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations) • Unsupervised Learning • Clustering: K-means (partitions observations into k clusters/groups of the same spatial size; automatically detects optimal k) • Dimensionality Reduction: Principal Component Analysis (linearly transforms correlated variables to independent components); Generalized Low Rank Models (extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data) • Anomaly Detection: Autoencoders (find outliers using nonlinear dimensionality reduction with deep learning)
  • 29. H2O Deep Learning in Action 29
  • 30. High Level Architecture: Multiple Interfaces
  • 31. H2O + R 31 Package ‘h2o’ from CRAN or H2O’s website Start a local H2O (Java Virtual Machine) cluster Simple ‘iris’ example
  • 34. 34 H2O Flow (Web) Interface
  • 35. High Level Architecture: Export Standalone Models for Production
  • 38. H2O Tutorials • Introduction to Machine Learning with H2O and Python • Basic Extract, Transform and Load (ETL) • Supervised Learning • Parameters Tuning • Stacking • GitHub Repository • bit.ly/joe_h2o_tutorials • R Code Examples (available soon) 38
  • 39. H2O IoT Use Cases Predictive Maintenance and Outlier Detection 39
  • 40. Predictive Maintenance – SECOM Dataset 40
  • 41. Predictive Maintenance – SECOM Dataset • We want to predict future failures.
  • 42. Predictive Maintenance – SECOM Dataset 42
  • 43. Predictive Maintenance – SECOM Dataset • Dataset Summary • Inputs: • 591 numerical features • Binary Outcome: • Pass (-1) • Fail (1) • Size: • 1567 samples 43
  • 44. Basic H2O Usage – GBM with Default Settings • Link to Jupyter Notebook • https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_maintenance/step_01_basics.ipynb
  • 45. H2O’s R package • Start a local H2O cluster (JVM) • “nthreads = -1” means using all available virtual cores • Information about the H2O cluster
  • 46. 46 Import data into H2O cluster (instead of R’s memory)
  • 49. Split data into training / test sets in order to evaluate out-of-sample (held-out) performance later
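The split described on this slide can be sketched in plain Python using only the standard library. This is not the H2O call used in the notebook; the function name `split_rows`, the 80/20 ratio and the seed are assumptions chosen for illustration.

```python
import random


def split_rows(rows, train_ratio=0.8, seed=1234):
    """Randomly assign each row to the training or the test partition.

    Each row is assigned independently, so the resulting sizes are
    approximately (not exactly) train_ratio and 1 - train_ratio.
    """
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, test = [], []
    for row in rows:
        if rng.random() < train_ratio:
            train.append(row)
        else:
            test.append(row)
    return train, test


# Stand-in row identifiers for the 1567 SECOM samples (illustrative only).
rows = list(range(1567))
train, test = split_rows(rows)
print(len(train), len(test))
```

Fixing the seed matters here: the model is later judged on the test partition, so the same rows must be held out every time the analysis is rerun.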
  • 50. H2O automatically ignores columns with constant values
  • 51. Classification Performance – Confusion Matrix 51
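As a rough illustration of what a confusion matrix summarizes, the sketch below counts the four outcomes for a binary pass/fail problem using the SECOM label coding (-1 = pass, 1 = fail). The labels and predictions are made-up toy values, and the helper names `confusion_counts` and `summarize` are hypothetical; none of this is H2O output.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for binary labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive accuracy, precision and recall from the four counts."""
    def ratio(num, den):
        return num / den if den else 0.0

    total = sum(counts.values())
    return {
        "accuracy": ratio(counts["tp"] + counts["tn"], total),
        "precision": ratio(counts["tp"], counts["tp"] + counts["fp"]),
        "recall": ratio(counts["tp"], counts["tp"] + counts["fn"]),
    }


# Toy data (assumed for illustration): -1 = pass, 1 = fail.
actual = [-1, -1, -1, -1, 1, 1, -1, 1]
predicted = [-1, -1, 1, -1, 1, -1, -1, 1]

counts = confusion_counts(actual, predicted)
metrics = summarize(counts)
print(counts)
print(metrics)
```

With a rare "fail" class, accuracy alone can look good while most failures are missed, which is why precision and recall for the fail class are reported alongside it.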
  • 53. These metrics are based on training (in-sample) data only; they do not represent out-of-sample performance
  • 54. 54 Users could reduce number of features based on these findings
  • 55. Metrics based on test (out-of-sample) data
  • 56. 56 H2O returns predicted class as well as probabilities of each class
  • 57. Advanced H2O Usage – Random Grid Search • Link to Jupyter Notebook • https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_maintenance/step_02_random_grid_search.ipynb • Using Random Grid Search to fine-tune hyper-parameters
  • 58. 58 API for performing a Random Grid Search
  • 59. 59 Grid search results sorted by specific metric. Best model on top.
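The idea behind a random grid search, followed by sorting the results by a metric, can be sketched in plain Python. This is a conceptual sketch and not H2O's grid search API: `random_grid_search` and `toy_score` are hypothetical names, the hyper-parameter names and values are illustrative, and the scoring function is a stand-in for a real model evaluation such as a cross-validated metric.

```python
import random


def random_grid_search(grid, score_fn, max_models=5, seed=1234):
    """Evaluate randomly sampled, distinct combinations from a grid.

    Returns (score, params) pairs sorted so the best model comes first.
    """
    rng = random.Random(seed)
    names = sorted(grid)
    n_combos = 1
    for name in names:
        n_combos *= len(grid[name])

    seen = set()
    results = []
    while len(results) < min(max_models, n_combos):
        combo = tuple(rng.choice(grid[name]) for name in names)
        if combo in seen:  # skip combinations that were already tried
            continue
        seen.add(combo)
        params = dict(zip(names, combo))
        results.append((score_fn(params), params))

    # Higher score is better; sort on the score only.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def toy_score(params):
    """Stand-in for training a model and returning its validation metric."""
    return -abs(params["max_depth"] - 5) - abs(params["learn_rate"] - 0.05)


# Illustrative search space (assumed values, 48 combinations in total).
grid = {
    "max_depth": [3, 5, 7, 9],
    "learn_rate": [0.01, 0.05, 0.1],
    "sample_rate": [0.7, 0.8, 0.9, 1.0],
}

results = random_grid_search(grid, toy_score, max_models=5)
for score, params in results:
    print(round(score, 3), params)
```

Sampling only a handful of the 48 combinations is the point of the random strategy: the budget (`max_models` here) stays fixed no matter how large the grid grows.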
  • 61. 61 Performance metrics based on 5-fold cross-validation
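For readers unfamiliar with 5-fold cross-validation, here is a minimal standard-library sketch of how rows can be partitioned into folds. It only illustrates the concept; it is not how H2O assigns folds internally, and the function name `k_fold_indices` and the seed are assumptions.

```python
import random


def k_fold_indices(n_rows, n_folds=5, seed=1234):
    """Return a list of (train_indices, valid_indices) pairs, one per fold."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    splits = []
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, valid))
    return splits


splits = k_fold_indices(20, n_folds=5)
for train_idx, valid_idx in splits:
    print(len(train_idx), sorted(valid_idx))
```

Every row is used for validation exactly once, so the averaged metric reflects performance on data each model did not see during training.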
  • 62. Comparison – Default vs. Tuned • GBM with Default Settings vs. GBM with Random Grid Search • Still not perfect, but better than default settings
  • 63. H2O Tutorials • Introduction to Machine Learning with H2O and Python • Basic Extract, Transform and Load (ETL) • Supervised Learning • Parameters Tuning • Stacking • GitHub Repository • bit.ly/joe_h2o_tutorials • R Code Examples (available soon) 63
  • 65. Outlier Detection – MNIST Dataset 65 Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
  • 66. Outlier Detection – MNIST Dataset • 784 Inputs • 28 x 28 = 784 pixels • 1 Output • 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9 • File (from Kaggle) • Train (42k Records) • kaggle_mnist_train.csv.gz • Source • https://www.kaggle.com/c/digit-recognizer/data • Photo credit: https://ml4a.github.io/ml4a/neural_networks/
  • 67. Advanced H2O Usage – Random Grid Search • Link to Data and Code • https://github.com/woobe/h2o_tutorials/tree/master/use_cases/outlier_detection
  • 69. 69 Import data (directly from a gz) Remove the label (not needed for unsupervised learning)
  • 70. 70 Input = 784 pixels Output = 784 pixels Nodes = 50
  • 71. Quantify the average reconstruction error per sample • Avg. Error = (input image − reconstructed image) / no. of pixels
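The per-sample reconstruction error can be sketched as follows. The sketch assumes the error is the mean of squared pixel differences, which the slide text does not confirm; the helper names and the tiny 4-pixel "images" are made up for illustration, and in practice the reconstructions would come from the trained autoencoder.

```python
def mean_squared_error(original, reconstructed):
    """Average squared difference per pixel for one sample."""
    if len(original) != len(reconstructed):
        raise ValueError("both samples must have the same number of pixels")
    total = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return total / len(original)


def rank_outliers(originals, reconstructions):
    """Return (ranking, errors); ranking lists sample indices, worst first."""
    errors = [
        mean_squared_error(o, r) for o, r in zip(originals, reconstructions)
    ]
    ranking = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return ranking, errors


# Toy 4-pixel samples (assumed values, not MNIST data).
originals = [[0, 0, 1, 1], [1, 0, 1, 0]]
reconstructions = [[0, 0, 1, 1], [0, 1, 0, 1]]

ranking, errors = rank_outliers(originals, reconstructions)
print(errors)
print(ranking)
```

Samples at the top of the ranking are the ones the autoencoder reconstructs worst, which makes them the outlier candidates worth inspecting.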
  • 72. 72 Helper functions for plotting graphs only (not related to H2O algorithms)
  • 78. End of First Talk Let’s have a break ☺ 78
  • 79. H2O Deep Water H2O’s Integration with TensorFlow, mxnet and Caffe 79
  • 80. TensorFlow • Open source machine learning framework by Google • Python / C++ API • TensorBoard • Data Flow Graph Visualization • Multi CPU / GPU • v0.8+ distributed machines support • Multi devices support • desktop, server and Android devices • Image, audio and NLP applications • HUGE Community • Support for Spark, Windows … 80 https://github.com/tensorflow/tensorflow
  • 82. Caffe • Convolutional Architecture for Fast Feature Embedding (Caffe) • Pure C++ / CUDA architecture for deep learning • Command line, Python and MATLAB interface • Model Zoo • Open collection of models • https://docs.google.com/presentation/d/1UeKXVgRvvxg9OUdh_UiC5G71UMscNPlvArsWER41PsU/
  • 84. Both TensorFlow and H2O are widely used 84
  • 85. TensorFlow, MXNet, Caffe and H2O DL democratize the power of deep learning. H2O platform democratizes artificial intelligence & big data science. There are other open source deep learning libraries like Theano and Torch too. Let’s have a party, this will be fun!
  • 87. Deep Water Next-Gen Distributed Deep Learning with H2O H2O integrates with existing GPU backends for significant performance gains One Interface - GPU Enabled - Significant Performance Gains Inherits All H2O Properties in Scalability, Ease of Use and Deployment Recurrent Neural Networks enabling natural language processing, sequences, time series, and more Convolutional Neural Networks enabling Image, video, speech recognition Hybrid Neural Network Architectures enabling speech to text translation, image captioning, scene parsing and more Deep Water 87
  • 88. Deep Water Architecture Node 1 Node N Scala Spark H2O Java Execution Engine TensorFlow/mxnet/Caffe C++ GPU CPU TensorFlow/mxnet/Caffe C++ GPU CPU RPC R/Py/Flow/Scala client REST API Web server H2O Java Execution Engine grpc/MPI/RDMA Scala Spark 88
  • 89. Available Networks in Deep Water • LeNet • AlexNet • VGGNet • Inception (GoogLeNet) • ResNet (Deep Residual Learning) • Build Your Own 89 ResNet
  • 90. 90 Example: Deep Water + H2O Flow Choosing different network structures
  • 92. Unified Interface (Deep Water + R) 92 Choosing different network structures
  • 93. Unified Interface (Deep Water + Python) 93 Change backend to “mxnet”, “caffe” or “auto” Choosing different network structures
  • 94. Easy Stacking with other H2O Models 94 Ensemble of Deep Water, Gradient Boosting Machine & Random Forest models
  • 98. Deep Water R Demo • H2O + MXNet + TensorFlow • Dataset – Cat/Dog/Mouse • MXNet & TF as GPU backend • Train LeNet (CNN) models • R Demo • Code and Data • github.com/h2oai/deepwater 98
  • 101. Deep Water – Basic Usage Live Demo if Possible 101
  • 102. Start and Connect to H2O Deep Water Cluster • Download Latest Nightly Build • https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar • In Terminal • cd to the folder containing h2o.jar • java -jar h2o.jar (this is the default command) • java -Xmx16g -jar h2o.jar (this is the command to allocate 16GB of memory) • In R • library(h2o) (latest stable release from h2o.ai website or CRAN) • h2o.connect(ip = "xxx.xxx.xxx.xxx", strict_version_check = FALSE)
  • 104. Train a CNN (LeNet) Model on GPU 104
  • 105. Train a CNN (LeNet) Model on GPU 105 Using GPU for training
  • 107. Deep Water – Custom Network 107
  • 108. Xxx 108 Saving the custom network structure as a file Configure custom network structure (MXNet syntax)
  • 109. Train a Custom Network 109 Point it to the custom network structure file
  • 112. Project “Deep Water” • H2O + TF + MXNet + Caffe • A powerful combination of widely used open source machine learning libraries. • All Goodies from H2O • Inherits all H2O properties in scalability, ease of use and deployment. • Unified Interface • Allows users to build, stack and deploy deep learning models from different libraries efficiently. • Latest Nightly Build • https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar • 100% Open Source • The party will get bigger!
  • 113. Other H2O Developments • H2O + xgboost [Link] • Stacked Ensembles [Link] • Automatic Machine Learning [Link] • Time Series [Link] • High Availability Mode in Sparkling Water [Link] • Model Interpretation [Link] • word2vec [Link] • Previous Talks • https://github.com/h2oai/h2o-meetups/blob/master/2017_04_06_Amsterdam/2017_04_06_Latest_H2O_Developments.pdf
  • 114. Thanks! • Organizers & Sponsors • Poznan R Users Group (PAZUR) • H2O.ai • Code, Slides & Documents • bit.ly/h2o_meetups • docs.h2o.ai • Contact • joe@h2o.ai • @matlabulous • github.com/woobe • Please search/ask questions on Stack Overflow • Use the tag `h2o` (not H2 zero)