Introduction to Machine Learning with H2O
Jo-fai (Joe) Chow
Data Scientist
joe@h2o.ai
@matlabulous
Data Science Milan
Politecnico di Milano
10th October, 2016
About Me: Civil Engineer → Data Scientist
• 2005 - 2015
• Water Engineer
o Consultant for Utilities
o Industrial PhD
• Water Engineering + Machine Learning
• Discovered H2O in 2014!
• 2015 - Present
• Data Scientist
o Virgin Media (UK)
o Domino Data Lab (US)
o H2O.ai (US)
Why? Long story – see bit.ly/joe_h2o_talk2
Agenda
• First Talk (25 mins)
o About H2O.ai
o Demo
• A Simple Classification Task
• H2O’s Web Interface
o Why H2O?
• Our Community
• Our Customers
o What’s Next?
• New H2O Features
• Second Talk (25 mins)
o H2O for IoT
• Predictive Maintenance
• Anomaly Detection
• H2O’s R Interface
• Third Talk (25 mins)
o Deep Water
o Demo
• H2O + mxnet on GPU
• H2O’s Python Interface
About H2O.ai
• H2O.ai, the Company
o Team: 80 (70 shown)
o Founded in 2012
o HQ: Mountain View, California
• H2O, the Platform
o Open Source (Apache 2.0)
o Algorithms written in Java
• Fast, distributed and scalable
o Multiple interfaces to suit different users
• Web, R, Python, Java, Scala, REST/JSON
o Works with desktop/laptop, cloud, Spark and Hadoop
Joe
Scientific Advisory Council
Current Algorithm Overview
• Joe’s Strata Hadoop London Talk: bit.ly/joe_h2o_talk4
• Today’s Demos
• Joe’s LondonR Talk: bit.ly/joe_h2o_talk3
H2O Overview
H2O’s Mission
Making Machine Learning Accessible to Everyone
Photo credit: Virgin Media
H2O Web Interface Demo
A Typical Machine Learning Task
• Demo
o Dataset – MNIST
• LeCun et al. (1999)
• Hand-written Digits
o Import & Explore Data
o Build & Evaluate Models
o Make Predictions
Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
MNIST Hand-Written Digits
• 784 Inputs
o 28 x 28 = 784 pixels
• 1 Output
o 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9
o Classification
• Files
o Train (60k Records)
o Test (10k)
• Links
o https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/train.csv.gz
o https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/test.csv.gz
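To make the 784-input layout concrete, here is a minimal Python sketch that reshapes one flattened record into a 28 x 28 grid. The row-major pixel order and the helper name to_grid are assumptions for illustration; the slide only states the counts.

```python
# Illustrative only: the row-major pixel ordering is an assumption.
SIDE = 28
N_PIXELS = SIDE * SIDE  # 784 inputs per image

def to_grid(pixels):
    """Reshape a flat list of 784 pixel values into 28 rows of 28 values."""
    if len(pixels) != N_PIXELS:
        raise ValueError(f"expected {N_PIXELS} pixels, got {len(pixels)}")
    return [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]

# A synthetic record: pixel i simply holds the value i.
flat = list(range(N_PIXELS))
grid = to_grid(flat)
print(len(grid), len(grid[0]))  # 28 28
```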
Photo credit: https://ml4a.github.io/ml4a/neural_networks/
H2O Flow (Web Interface) Demo
• Download and unzip the jar from www.h2o.ai
• In terminal:
o java -jar h2o.jar
• Web browser:
o localhost:54321
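Once the jar is running, the same instance can also be reached from Python rather than the browser. This is a minimal sketch, assuming the h2o Python package is installed and the server is listening on the default address shown above; flow_url and connect_to_local_h2o are hypothetical helper names.

```python
def flow_url(host="localhost", port=54321):
    """Build the address that the slide opens in the web browser."""
    return f"http://{host}:{port}"

def connect_to_local_h2o(host="localhost", port=54321):
    """Attach to an H2O instance already started with `java -jar h2o.jar`."""
    import h2o  # imported lazily so flow_url works without the package
    return h2o.connect(url=flow_url(host, port))

print(flow_url())  # http://localhost:54321
```

Calling connect_to_local_h2o() only makes sense after the server has started; it does not launch one itself.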
H2O Live Demo
More H2O Flow Examples
Other H2O Interfaces
• R
• Python
• docs.h2o.ai
Key Resources
More Advanced Topics
• Advanced Features
o Hyperparameter Tuning
o Model Stacking
o Saving/Loading Models
o Export Plain Old Java Object (POJO)
• Key Resources
o docs.h2o.ai
• Joe’s Previous H2O Talks
o bit.ly/joe_h2o_talk3
o bit.ly/h2o_budapest_1
o bit.ly/h2o_paris_1
Why H2O?
Szilard Pafka – Chief Data Scientist at Epoch
• Szilard’s talks / blog posts about H2O:
o ML Benchmark
o Intro to ML with H2O
o H2O Scoring
o Tweets
Szilard Pafka – Why H2O?
• Szilard’s Summary Slide
H2O for Kaggle
H2O Community Support
Google forum (h2ostream) and community.h2o.ai
Please try
#AroundTheWorldWithH2Oai
• Strata Hadoop London
• PyData Amsterdam
• useR! 2016 Stanford
• satRdays Budapest
• London Kaggle Meetup
• Chelsea FC
• Paris ML Meetup
• Big Data London
#AroundTheWorldWithH2Oai
Data Science Milan
Thank you
H2O Usage in Italy
www.h2o.ai/community
www.h2o.ai/customers
H2O in Action
Thank you
Data Science Milan – May 19, 2016
Bringing Deep Learning into production - Paolo Platter, AgileLab
http://www.slideshare.net/ds_mi/bringing-deep-learning-into-production-paolo-platter-agilelab
What’s Next?
H2O is Evolving
• H2O Open Tour NYC YouTube Playlist
o Advanced data munging
o Visual ML
o Deep Water (3rd talk)
o Sparkling Water
• PySparkling & RSparkling
o Steam
Next time?
H2O’s Mission
Making Machine Learning Accessible to Everyone
Photo credit: Virgin Media
End of First Talk – Thanks!
• Data Science Milan
• Gianmario Spacagna
• Politecnico di Milano
• Resources
o bit.ly/h2o_milan_1
o www.h2o.ai
o docs.h2o.ai
• Contact
o joe@h2o.ai
o @matlabulous
o github.com/woobe
Extra Slides
(H2O Flow Demo Screenshots – just in case)
Upload the file without decompressing it first
Change the data type of “label” from “Numeric” to “Enum” (categorical)
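The two Flow steps above (importing the compressed file, then switching "label" to a categorical type) have a rough equivalent in H2O's Python interface. This is a sketch rather than the demo's actual code: it assumes the h2o package and Java are installed, that a cluster can be started or reached, and that the target column is named "label" as on the slide. load_mnist is a hypothetical helper.

```python
TRAIN_URL = ("https://s3.amazonaws.com/h2o-public-test-data/"
             "bigdata/laptop/mnist/train.csv.gz")

def load_mnist(path=TRAIN_URL, target="label"):
    """Import the gzipped CSV as-is and mark the target column as categorical."""
    import h2o  # lazy import: requires the h2o package and Java
    h2o.init()
    frame = h2o.import_file(path)  # the .gz file is parsed without manual unzipping
    # Assumption: the column is called "label"; adjust if the file differs.
    frame[target] = frame[target].asfactor()  # Numeric -> Enum
    return frame
```

Treating the label as categorical matters because it tells H2O to build a classifier over ten digit classes instead of a regression on the numbers 0 to 9.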
Note: Size in Memory
Click on individual labels to explore data
Split the full dataset into training (80% = 48k records) and validation (20% = 12k records), a common machine learning practice
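The arithmetic behind the split can be checked with a short sketch. It uses only the Python standard library and a seeded shuffle over row indices; the seed and the helper name split_indices are assumptions, and H2O's own frame splitting may not return exactly these counts.

```python
import random

def split_indices(n_rows, train_ratio=0.8, seed=1234):
    """Shuffle row indices reproducibly and cut them into train/validation parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = round(n_rows * train_ratio)
    return indices[:cut], indices[cut:]

train_idx, valid_idx = split_indices(60_000)
print(len(train_idx), len(valid_idx))  # 48000 12000
```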
Click and select parameters for model training
Users have full access to all available parameters and can fine-tune the model training process.
For example, I am using rectifier with dropout as the activation to train the model for 20 epochs with class balancing, leaving other settings at their defaults.
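The Flow settings described above map roughly onto H2O's Python deep learning estimator. This is a hedged sketch, assuming the h2o package is installed and that train and valid are H2OFrames whose target column is "label"; DL_PARAMS and train_digit_model are hypothetical names, and every parameter not listed stays at the library default.

```python
# Settings named on the slide; everything else is left at its default.
DL_PARAMS = {
    "activation": "RectifierWithDropout",
    "epochs": 20,
    "balance_classes": True,
}

def train_digit_model(train, valid, target="label"):
    """Train a deep learning classifier on an H2OFrame with a validation frame."""
    from h2o.estimators.deeplearning import H2ODeepLearningEstimator  # lazy import
    predictors = [name for name in train.columns if name != target]
    model = H2ODeepLearningEstimator(**DL_PARAMS)
    model.train(x=predictors, y=target,
                training_frame=train, validation_frame=valid)
    return model
```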
Training the model with estimated remaining time; users can stop the process early if they want to
Performance (logloss) on validation set
Performance (logloss) on training set
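For readers unfamiliar with the metric, multi-class logloss is the average negative log of the probability the model assigned to the true class. The sketch below computes it in plain Python on made-up probabilities; the numbers are illustrative and are not taken from the demo.

```python
import math

def logloss(true_labels, predicted_probs):
    """Mean negative log-probability assigned to each record's true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        total += -math.log(probs[label])
    return total / len(true_labels)

# Two hypothetical records with three classes each.
labels = [0, 2]
probs = [
    {0: 0.5, 1: 0.25, 2: 0.25},   # true class 0 gets probability 0.5
    {0: 0.25, 1: 0.25, 2: 0.5},   # true class 2 gets probability 0.5
]
print(round(logloss(labels, probs), 4))  # 0.6931
```

Lower is better: a model that puts probability 1.0 on every true class scores 0. This sketch does not clip probabilities, so a true class with probability 0 would raise an error.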
Confusion Matrix on Training Set (48k Records)
About 2% Error
Confusion Matrix on Validation Set (12k Records)
About 4% Error
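The error figures above are the share of records that fall off the diagonal of the confusion matrix. As a rough guide, about 2% of 48k records is roughly 960 misclassified digits, and about 4% of 12k is roughly 480. The sketch below shows the calculation on a small hypothetical matrix, not the demo's numbers.

```python
def error_rate(confusion):
    """Share of records off the diagonal (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total

# Hypothetical 3-class matrix with 150 records in total.
matrix = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 3, 47],
]
print(round(error_rate(matrix), 4))  # 0.0533
```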
Using the model for prediction on test set
Confusion Matrix on Test Set (10k Records)
About 4% Error (similar to validation)
Full prediction outputs, including individual probabilities and predicted label
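The predicted label is simply the class with the highest individual probability. In H2O's Python interface a table like this would typically come from calling predict on the trained model; the plain-Python sketch below only illustrates the relationship, using a hypothetical row of probabilities.

```python
def predicted_label(class_probs):
    """Return the class with the highest predicted probability."""
    return max(class_probs, key=class_probs.get)

# Hypothetical probabilities for one image over the ten digit classes.
row = {str(d): 0.01 for d in range(10)}
row["7"] = 0.91
print(predicted_label(row))  # 7
```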
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O

More Related Content

What's hot

Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
Sri Ambati
 
Towards the Cytoscape Cyberinfrastructure
Towards the Cytoscape CyberinfrastructureTowards the Cytoscape Cyberinfrastructure
Towards the Cytoscape Cyberinfrastructure
Keiichiro Ono
 
Some "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data frontSome "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data front
Greg Landrum
 
cyREST: Cytoscape as a Service
cyREST: Cytoscape as a ServicecyREST: Cytoscape as a Service
cyREST: Cytoscape as a Service
Keiichiro Ono
 
Cloud Computing - examples
Cloud Computing - examplesCloud Computing - examples
Cloud Computing - examples
EUBrasilCloudFORUM .
 
Cytoscape: Now and Future
Cytoscape: Now and FutureCytoscape: Now and Future
Cytoscape: Now and Future
Keiichiro Ono
 
Overview of Modern Graph Analysis Tools
Overview of Modern Graph Analysis ToolsOverview of Modern Graph Analysis Tools
Overview of Modern Graph Analysis Tools
Keiichiro Ono
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2O
Sri Ambati
 
Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in R
mikaelhuss
 
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Thamme Gowda
 
Predictive churn h20_dsx
Predictive churn h20_dsxPredictive churn h20_dsx
Predictive churn h20_dsx
Ndjido Ardo BAR
 
Use of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsUse of standards and related issues in predictive analytics
Use of standards and related issues in predictive analytics
Paco Nathan
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Keiichiro Ono
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in Cytoscape
Keiichiro Ono
 
TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative models
Seldon
 
A Precise Model for Google Cloud Platform (IC2E'2018)
A Precise Model for Google Cloud Platform (IC2E'2018)A Precise Model for Google Cloud Platform (IC2E'2018)
A Precise Model for Google Cloud Platform (IC2E'2018)
Stéphanie Challita
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
Keiichiro Ono
 
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford ConsortiumSDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
Keiichiro Ono
 
Automated Reverse-Engineering of a Cloud API
Automated Reverse-Engineering of a Cloud APIAutomated Reverse-Engineering of a Cloud API
Automated Reverse-Engineering of a Cloud API
Stéphanie Challita
 
Cytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis ToolsCytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis Tools
Keiichiro Ono
 

What's hot (20)

Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
 
Towards the Cytoscape Cyberinfrastructure
Towards the Cytoscape CyberinfrastructureTowards the Cytoscape Cyberinfrastructure
Towards the Cytoscape Cyberinfrastructure
 
Some "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data frontSome "challenges" on the open-source/open-data front
Some "challenges" on the open-source/open-data front
 
cyREST: Cytoscape as a Service
cyREST: Cytoscape as a ServicecyREST: Cytoscape as a Service
cyREST: Cytoscape as a Service
 
Cloud Computing - examples
Cloud Computing - examplesCloud Computing - examples
Cloud Computing - examples
 
Cytoscape: Now and Future
Cytoscape: Now and FutureCytoscape: Now and Future
Cytoscape: Now and Future
 
Overview of Modern Graph Analysis Tools
Overview of Modern Graph Analysis ToolsOverview of Modern Graph Analysis Tools
Overview of Modern Graph Analysis Tools
 
Stacked Ensembles in H2O
Stacked Ensembles in H2OStacked Ensembles in H2O
Stacked Ensembles in H2O
 
Deep learning with Tensorflow in R
Deep learning with Tensorflow in RDeep learning with Tensorflow in R
Deep learning with Tensorflow in R
 
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
 
Predictive churn h20_dsx
Predictive churn h20_dsxPredictive churn h20_dsx
Predictive churn h20_dsx
 
Use of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsUse of standards and related issues in predictive analytics
Use of standards and related issues in predictive analytics
 
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...Introduction to Biological Network Analysis and Visualization with Cytoscape ...
Introduction to Biological Network Analysis and Visualization with Cytoscape ...
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in Cytoscape
 
TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative models
 
A Precise Model for Google Cloud Platform (IC2E'2018)
A Precise Model for Google Cloud Platform (IC2E'2018)A Precise Model for Google Cloud Platform (IC2E'2018)
A Precise Model for Google Cloud Platform (IC2E'2018)
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
 
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford ConsortiumSDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
SDCSB CYTOSCAPE AND NETWORK ANALYSIS WORKSHOP at Sanford Consortium
 
Automated Reverse-Engineering of a Cloud API
Automated Reverse-Engineering of a Cloud APIAutomated Reverse-Engineering of a Cloud API
Automated Reverse-Engineering of a Cloud API
 
Cytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis ToolsCytoscape and External Data Analysis Tools
Cytoscape and External Data Analysis Tools
 

Viewers also liked

Inaugural talk Data Science Milan - Gianmario Spacagna
Inaugural talk Data Science Milan - Gianmario SpacagnaInaugural talk Data Science Milan - Gianmario Spacagna
Inaugural talk Data Science Milan - Gianmario Spacagna
Data Science Milan
 
Data intensive applications with Apache Flink - Simone Robutti, Radicalbit
Data intensive applications with Apache Flink - Simone Robutti, RadicalbitData intensive applications with Apache Flink - Simone Robutti, Radicalbit
Data intensive applications with Apache Flink - Simone Robutti, Radicalbit
Data Science Milan
 
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
Data Science Milan
 
Instituciones administrativas del trabajo
Instituciones administrativas del trabajoInstituciones administrativas del trabajo
Instituciones administrativas del trabajo
yessihernendez
 
ANNUAL REPORT 2014-Digital
ANNUAL REPORT 2014-DigitalANNUAL REPORT 2014-Digital
ANNUAL REPORT 2014-Digital
Dan Weinbaum
 
5+5+5 Fusion 2015 final
5+5+5 Fusion 2015 final5+5+5 Fusion 2015 final
5+5+5 Fusion 2015 final
Brenda Wilson, M.ED
 
Arquitectura romana
Arquitectura  romanaArquitectura  romana
Arquitectura romana
anyel anyela
 
Railings
RailingsRailings
Railings
mnfsteel
 
EXAMEN INFORMATICA
EXAMEN INFORMATICAEXAMEN INFORMATICA
EXAMEN INFORMATICA
Anthony Grefa
 
POSDigital_References_en_small2
POSDigital_References_en_small2POSDigital_References_en_small2
POSDigital_References_en_small2
David Šauer
 
class-action-lit-study
class-action-lit-studyclass-action-lit-study
class-action-lit-study
Will McLennan
 
Contaminacion
ContaminacionContaminacion
Contaminacion
Sharleen Lugo Mata
 
Britton_NoAH World Package Design 10_Page_1-10
Britton_NoAH World Package Design 10_Page_1-10Britton_NoAH World Package Design 10_Page_1-10
Britton_NoAH World Package Design 10_Page_1-10
Patti Britton
 
UWI Vice-Chancellor's Report to University Council
UWI Vice-Chancellor's Report to University CouncilUWI Vice-Chancellor's Report to University Council
UWI Vice-Chancellor's Report to University Council
UWI_Markcomm
 
Tics sthefy
Tics sthefyTics sthefy
Tics sthefy
STEFYFONSECA
 
Zika Virus Surveillance and Reporting in the Caribbean
Zika Virus Surveillance and Reporting in the CaribbeanZika Virus Surveillance and Reporting in the Caribbean
Zika Virus Surveillance and Reporting in the Caribbean
UWI_Markcomm
 
Wayne Hellon
Wayne HellonWayne Hellon
Wayne Hellon
wayne hellon
 
Solatube international, Inc - portfolio
Solatube international, Inc - portfolioSolatube international, Inc - portfolio
Solatube international, Inc - portfolio
MilaKuci
 
505LeePosterPresentation
505LeePosterPresentation505LeePosterPresentation
505LeePosterPresentation
Anita Louise Kariniemi
 
Quimica pebd
Quimica pebdQuimica pebd
Quimica pebd
Sarahí Garcia
 

Viewers also liked (20)

Inaugural talk Data Science Milan - Gianmario Spacagna
Inaugural talk Data Science Milan - Gianmario SpacagnaInaugural talk Data Science Milan - Gianmario Spacagna
Inaugural talk Data Science Milan - Gianmario Spacagna
 
Data intensive applications with Apache Flink - Simone Robutti, Radicalbit
Data intensive applications with Apache Flink - Simone Robutti, RadicalbitData intensive applications with Apache Flink - Simone Robutti, Radicalbit
Data intensive applications with Apache Flink - Simone Robutti, Radicalbit
 
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
The Barclays Data Science Hackathon: Building Retail Recommender Systems base...
 
Instituciones administrativas del trabajo
Instituciones administrativas del trabajoInstituciones administrativas del trabajo
Instituciones administrativas del trabajo
 
ANNUAL REPORT 2014-Digital
ANNUAL REPORT 2014-DigitalANNUAL REPORT 2014-Digital
ANNUAL REPORT 2014-Digital
 
5+5+5 Fusion 2015 final
5+5+5 Fusion 2015 final5+5+5 Fusion 2015 final
5+5+5 Fusion 2015 final
 
Arquitectura romana
Arquitectura  romanaArquitectura  romana
Arquitectura romana
 
Railings
RailingsRailings
Railings
 
EXAMEN INFORMATICA
EXAMEN INFORMATICAEXAMEN INFORMATICA
EXAMEN INFORMATICA
 
POSDigital_References_en_small2
POSDigital_References_en_small2POSDigital_References_en_small2
POSDigital_References_en_small2
 
class-action-lit-study
class-action-lit-studyclass-action-lit-study
class-action-lit-study
 
Contaminacion
ContaminacionContaminacion
Contaminacion
 
Britton_NoAH World Package Design 10_Page_1-10
Britton_NoAH World Package Design 10_Page_1-10Britton_NoAH World Package Design 10_Page_1-10
Britton_NoAH World Package Design 10_Page_1-10
 
UWI Vice-Chancellor's Report to University Council
UWI Vice-Chancellor's Report to University CouncilUWI Vice-Chancellor's Report to University Council
UWI Vice-Chancellor's Report to University Council
 
Tics sthefy
Tics sthefyTics sthefy
Tics sthefy
 
Zika Virus Surveillance and Reporting in the Caribbean
Zika Virus Surveillance and Reporting in the CaribbeanZika Virus Surveillance and Reporting in the Caribbean
Zika Virus Surveillance and Reporting in the Caribbean
 
Wayne Hellon
Wayne HellonWayne Hellon
Wayne Hellon
 
Solatube international, Inc - portfolio
Solatube international, Inc - portfolioSolatube international, Inc - portfolio
Solatube international, Inc - portfolio
 
505LeePosterPresentation
505LeePosterPresentation505LeePosterPresentation
505LeePosterPresentation
 
Quimica pebd
Quimica pebdQuimica pebd
Quimica pebd
 

Similar to Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O

H2O at Poznan R Meetup
H2O at Poznan R MeetupH2O at Poznan R Meetup
H2O at Poznan R Meetup
Jo-fai Chow
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
Jo-fai Chow
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
Jo-fai Chow
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
Sri Ambati
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
Jo-fai Chow
 
Introduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use CasesIntroduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use Cases
Jo-fai Chow
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use Cases
Jo-fai Chow
 
Kaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesKaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New Opportunities
Jo-fai Chow
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
Jo-fai Chow
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
Sri Ambati
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
Sri Ambati
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
Sri Ambati
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
Jo-fai Chow
 
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
Artefactual Systems - AtoM
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
Tao Xie
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
Sri Ambati
 
Testing In Production (TiP) Advances with Big Data and the Cloud
Testing In Production (TiP) Advances with Big Data and the CloudTesting In Production (TiP) Advances with Big Data and the Cloud
Testing In Production (TiP) Advances with Big Data and the Cloud
SOASTA
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
Jo-fai Chow
 
Building Data Pipelines in Python
Building Data Pipelines in PythonBuilding Data Pipelines in Python
Building Data Pipelines in Python
C4Media
 
The Quest for an Open Source Data Science Platform
 The Quest for an Open Source Data Science Platform The Quest for an Open Source Data Science Platform
The Quest for an Open Source Data Science Platform
QAware GmbH
 

Similar to Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O (20)

H2O at Poznan R Meetup
H2O at Poznan R MeetupH2O at Poznan R Meetup
H2O at Poznan R Meetup
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
 
Introduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use CasesIntroduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use Cases
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use Cases
 
Kaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesKaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New Opportunities
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Testing In Production (TiP) Advances with Big Data and the Cloud
Testing In Production (TiP) Advances with Big Data and the CloudTesting In Production (TiP) Advances with Big Data and the Cloud
Testing In Production (TiP) Advances with Big Data and the Cloud
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Building Data Pipelines in Python
Building Data Pipelines in PythonBuilding Data Pipelines in Python
Building Data Pipelines in Python
 
The Quest for an Open Source Data Science Platform
 The Quest for an Open Source Data Science Platform The Quest for an Open Source Data Science Platform
The Quest for an Open Source Data Science Platform
 

More from Data Science Milan

ML & Graph algorithms to prevent financial crime in digital payments
ML & Graph  algorithms to prevent  financial crime in  digital paymentsML & Graph  algorithms to prevent  financial crime in  digital payments
ML & Graph algorithms to prevent financial crime in digital payments
Data Science Milan
 
How to use the Economic Complexity Index to guide innovation plans
How to use the Economic Complexity Index to guide innovation plansHow to use the Economic Complexity Index to guide innovation plans
How to use the Economic Complexity Index to guide innovation plans
Data Science Milan
 
Robustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning MethodsRobustness Metrics for ML Models based on Deep Learning Methods
Robustness Metrics for ML Models based on Deep Learning Methods
Data Science Milan
 
"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies
Data Science Milan
 
Question generation using Natural Language Processing by QuestGen.AI
Question generation using Natural Language Processing by QuestGen.AIQuestion generation using Natural Language Processing by QuestGen.AI
Question generation using Natural Language Processing by QuestGen.AI
Data Science Milan
 
Speed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWSSpeed up data preparation for ML pipelines on AWS
Speed up data preparation for ML pipelines on AWS
Data Science Milan
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
Data Science Milan
 
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureMLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
Data Science Milan
 
Reinforcement Learning Overview | Marco Del Pra
Reinforcement Learning Overview | Marco Del PraReinforcement Learning Overview | Marco Del Pra
Reinforcement Learning Overview | Marco Del Pra
Data Science Milan
 
Time Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del PraTime Series Classification with Deep Learning | Marco Del Pra
Time Series Classification with Deep Learning | Marco Del Pra
Data Science Milan
 
Ludwig: A code-free deep learning toolbox | Piero Molino, Uber AI
Ludwig: A code-free deep learning toolbox | Piero Molino, Uber AILudwig: A code-free deep learning toolbox | Piero Molino, Uber AI
Ludwig: A code-free deep learning toolbox | Piero Molino, Uber AI
Data Science Milan
 
Audience projection of target consumers over multiple domains a ner and baye...
Audience projection of target consumers over multiple domains  a ner and baye...Audience projection of target consumers over multiple domains  a ner and baye...

Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O

  • 1. Introduction to Machine Learning with H2O Jo-fai (Joe) Chow Data Scientist joe@h2o.ai @matlabulous Data Science Milan Politecnico di Milano 10th October, 2016
  • 2. About Me: Civil Engineer → Data Scientist • 2005 - 2015 • Water Engineer o Consultant for Utilities o Industrial PhD • Water Engineering + Machine Learning • Discovered H2O in 2014! • 2015 - Present • Data Scientist o Virgin Media (UK) o Domino Data Lab (US) o H2O.ai (US) • Why? Long story – see bit.ly/joe_h2o_talk2
  • 3. Agenda • First Talk (25 mins) o About H2O.ai o Demo • A Simple Classification Task • H2O’s Web Interface o Why H2O? • Our Community • Our Customers o What’s Next? • New H2O Features • Second Talk (25 mins) o H2O for IoT • Predictive Maintenance • Anomaly Detection • H2O’s R Interface • Third Talk (25 mins) o Deep Water o Demo • H2O + mxnet on GPU • H2O’s Python Interface
  • 5. About H2O.ai • H2O.ai, the Company o Team: 80 (70 shown) o Founded in 2012 o HQ: Mountain View, California • H2O, the Platform o Open Source (Apache 2.0) o Algorithms written in Java • Fast, distributed and scalable o Multiple interfaces to suit different users • Web, R, Python, Java, Scala, REST/JSON o Works with desktop/laptop, cloud, Spark and Hadoop Joe
  • 7. Current Algorithm Overview • Joe’s Strata Hadoop London Talk: bit.ly/joe_h2o_talk4 • Today’s Demos • Joe’s LondonR Talk: bit.ly/joe_h2o_talk3
  • 9. H2O’s Mission • Making Machine Learning Accessible to Everyone • Photo credit: Virgin Media
  • 11. A Typical Machine Learning Task • Demo o Dataset – MNIST • LeCun et al. (1999) • Hand-written Digits o Import & Explore Data o Build & Evaluate Models o Make Predictions • Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
  • 12. MNIST Hand-Written Digits • 784 Inputs o 28 x 28 = 784 pixels • 1 Output o 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9 o Classification • Files o Train (60k Records) o Test (10k Records) • Links o https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/train.csv.gz o https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/test.csv.gz • Photo credit: https://ml4a.github.io/ml4a/neural_networks/
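
The demo imports these files through the Flow web interface. As a rough sketch only, the same import could be done from H2O's Python interface. This assumes the `h2o` Python package and a Java runtime are installed, and that the digit label is the last of the 785 columns. The helper names `split_columns` and `load_mnist` are hypothetical, not part of H2O.

```python
TRAIN_URL = "https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/train.csv.gz"
TEST_URL = "https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/mnist/test.csv.gz"


def split_columns(columns):
    """Split column names into (predictors, response).

    Assumption: the digit label is the last column and the
    preceding 784 columns are the 28 x 28 pixel intensities.
    """
    columns = list(columns)
    return columns[:-1], columns[-1]


def load_mnist():
    """Import both files into an H2O cluster (hypothetical helper)."""
    import h2o  # imported lazily so split_columns works without H2O

    h2o.init()  # start or connect to a local H2O cluster
    train = h2o.import_file(TRAIN_URL)  # the gzipped CSV is read directly
    test = h2o.import_file(TEST_URL)
    x, y = split_columns(train.columns)
    # Treat the response as categorical so H2O performs classification
    train[y] = train[y].asfactor()
    test[y] = test[y].asfactor()
    return train, test, x, y
```

Calling `load_mnist()` needs network access and a running JVM; `split_columns` can be tried on any list of column names.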
  • 13. H2O Flow (Web Interface) Demo • Download and unzip jar from www.h2o.ai • In terminal: o java -jar h2o.jar • Web browser: o localhost:54321
  • 15. More H2O Flow Examples
  • 16. Other H2O Interfaces • R • Python • Key Resources o docs.h2o.ai
  • 17. More Advanced Topics • Advanced Features o Hyperparameter Tuning o Model Stacking o Saving/Loading Models o Export Plain Old Java Object (POJO) • Key Resources o docs.h2o.ai • Joe’s Previous H2O Talks o bit.ly/joe_h2o_talk3 o bit.ly/h2o_budapest_1 o bit.ly/h2o_paris_1
  • 20. Szilard Pafka – Chief Data Scientist at Epoch • Szilard’s talks / blog posts about H2O: o ML Benchmark o Intro to ML with H2O o H2O Scoring o Tweets
  • 21. Szilard Pafka – Why H2O? • Szilard’s Summary Slide
  • 23. H2O Community Support • Google forum – h2ostream • community.h2o.ai (please try)
  • 26. H2O Usage in Italy • www.h2o.ai/community
  • 29. H2O in Action • Thank you • Data Science Milan – May 19, 2016 • Bringing Deep Learning into production - Paolo Platter, AgileLab • http://www.slideshare.net/ds_mi/bringing-deep-learning-into-production-paolo-platter-agilelab
  • 31. H2O is Evolving • H2O Open Tour NYC YouTube Playlist o Advanced data munging o Visual ML o Deep Water (3rd talk) o Sparkling Water • PySparkling & RSparkling o Steam • Next time?
  • 32. H2O’s Mission • Making Machine Learning Accessible to Everyone • Photo credit: Virgin Media
  • 33. End of First Talk – Thanks! • Data Science Milan • Gianmario Spacagna • Politecnico di Milano • Resources o bit.ly/h2o_milan_1 o www.h2o.ai o docs.h2o.ai • Contact o joe@h2o.ai o @matlabulous o github.com/woobe
  • 34. Extra Slides (H2O Flow Demo Screenshots – just in case)
  • 35. Upload the file without decompressing it first
  • 36. Change the data type of “label” from “Numeric” to “Enum” (categorical)
  • 37. Note: Size in Memory • Click on individual labels to explore data
  • 39. Split the full dataset into training (80% = 48k records) and validation (20% = 12k records) sets – a common machine learning practice
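
The arithmetic behind this split, and one way it might be reproduced from Python, can be sketched as follows. `expected_split_sizes` and `split_train_valid` are hypothetical helper names and the seed is arbitrary; `split_frame` is the H2OFrame method assumed here. H2O's splitting is randomised per row, so the resulting frames are close to, but not necessarily exactly, 48k and 12k rows.

```python
def expected_split_sizes(n_rows, train_ratio=0.8):
    """Return the nominal (train, validation) row counts for a split."""
    n_train = round(n_rows * train_ratio)
    return n_train, n_rows - n_train


def split_train_valid(frame, train_ratio=0.8, seed=1234):
    """Split an H2OFrame into training and validation frames.

    `frame` is assumed to be an H2OFrame; the sizes returned by
    H2O are approximate because rows are assigned at random.
    """
    train, valid = frame.split_frame(ratios=[train_ratio], seed=seed)
    return train, valid
```

For the 60k-record MNIST training file, `expected_split_sizes(60000)` gives the 48k / 12k figures quoted on the slide.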
  • 40. Click and select parameters for model training
  • 41. Users have full access to all available parameters and can fine-tune the model training process. For example, I am using rectifier with dropout as the activation function to train the model for 20 epochs with class balancing, leaving other settings as default.
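
For readers who prefer code to the Flow form, the settings described on this slide might map onto H2O's Python deep learning estimator roughly as below. This is a sketch under assumptions: the parameter names are those of `H2ODeepLearningEstimator` as I understand them, everything else is left at its default, and `train_digit_model` is a hypothetical helper.

```python
# Settings taken from the slide: rectifier with dropout, 20 epochs,
# class balancing. All other parameters keep their defaults.
DL_PARAMS = {
    "activation": "RectifierWithDropout",
    "epochs": 20,
    "balance_classes": True,
}


def train_digit_model(train, valid, x, y, params=None):
    """Train a deep learning classifier on H2OFrames (hypothetical helper).

    `train` and `valid` are assumed to be H2OFrames, `x` the list of
    predictor column names and `y` the categorical response column.
    """
    from h2o.estimators.deeplearning import H2ODeepLearningEstimator

    model = H2ODeepLearningEstimator(**(params or DL_PARAMS))
    model.train(x=x, y=y, training_frame=train, validation_frame=valid)
    return model
```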
  • 42. Training the model with estimated remaining time – users can stop the process early if they want to
  • 43. Performance (logloss) on validation set • Performance (logloss) on training set
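
Logloss, the metric plotted on this slide, is the average negative log of the probability the model assigned to the true class. A minimal pure-Python sketch of the standard multi-class definition follows; the clipping constant `eps` is an assumption for numerical safety and may differ from what H2O uses internally.

```python
import math


def logloss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability of the true class.

    true_labels: list of integer class indices (0-9 for MNIST).
    predicted_probs: list of per-class probability lists, one per record.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip to avoid log(0) when a probability is exactly 0 or 1
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)
```

A model that spreads its probability evenly over the ten digits scores log(10), so lower values mean more confident, correct predictions.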
  • 44. Confusion Matrix on Training Set (48k Records): About 2% Error • Confusion Matrix on Validation Set (12k Records): About 4% Error
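
The overall error quoted with each confusion matrix is the share of records that fall off the diagonal. A small sketch of that calculation is shown below. The 2 x 2 matrix in the usage note is made-up illustrative data, not a result from the talk.

```python
def error_rate(confusion):
    """Overall misclassification rate of a square confusion matrix.

    confusion[i][j] is the number of records whose actual class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total
```

For example, `error_rate([[48, 2], [2, 48]])` counts 4 misclassified records out of 100.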
  • 45. Using the model for prediction on test set
  • 46. Confusion Matrix on Test Set (10k Records): About 4% Error (similar to validation)
  • 47. Full prediction outputs including individual probabilities and predicted label
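
The relationship between the individual probabilities and the predicted label can be sketched in a few lines: for a multi-class model, the predicted label is typically the class with the highest probability. `predicted_label` is a hypothetical helper, and the probabilities in the usage note are illustrative values only.

```python
def predicted_label(class_probs):
    """Return the index of the most probable class.

    class_probs: list of probabilities, one per digit class (0-9).
    """
    return max(range(len(class_probs)), key=lambda k: class_probs[k])
```

For instance, `predicted_label([0.01] * 9 + [0.91])` picks class 9, the last and most probable entry.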