AI to Do Good.
Democratize AI for Everyone.
H2O is the open leader in AI.
H2O.ai at a Glance
Founded: 2012 (Series C in November 2017)
Products:
• H2O Open Source Machine Learning (Enterprise Support)
• Driverless AI – Automated Machine Learning
• Sparkling Water
What we do: The Open Leader in AI
Team: ~100 employees
• Distributed systems engineers doing machine learning
• World-class visualization designers
• 5 of the world's top Kaggle Grandmasters
Global: Mountain View, London, Prague, India, Japan, New Zealand
• H2O.ai was recognized as a technology leader with the most completeness of vision.
• H2O.ai was recognized for its mindshare, partner network and status as a quasi-industry standard for machine learning and AI.
• H2O.ai customers gave it the highest overall score among all vendors for sales relationship and account management, customer support (onboarding, troubleshooting, etc.) and overall service and support.
CONFIDENTIAL
Open
• 12,622 companies using H2O.ai
• 155,000 H2O.ai users
• Financial
• Insurance
• Data Companies
• HW Vendors
• Retail
• Advisory & Accounting
• Healthcare
“H2O.ai's reference customers gave it the highest overall score for sales relationship and overall service and support” - Gartner MQ 2018
Wholesale / Commercial Banking
• Know Your Customers (KYC)
• Anti-Money Laundering (AML)
Card/Payments Business
• Transaction Fraud
• Collusion Fraud
• Real-time Targeting
• Credit Risk Scoring
• In-Context Promotion
Retail Banking
• Deposit Fraud
• Customer Churn Prediction
• Auto-Loan
IT Infrastructure
• Security Cyberlake
• DoS Detection and Protection
• Master Data Management
• Flu Season Prediction
• Personalized Drug Matching
• Medical Claim Fraud Detection
• Emergency Room and Hospital Management
• Drug Discovery
• Remote Patient Monitoring
• Early Cancer Detection / Oncology
• Medical Imaging and Diagnostics
• Product Recommendation
Predictive Maintenance
• Battery Failure
• Resilient networks
Enhanced Offerings
• Personalized program recommendations
• Intelligent Ad placements
• In-Context Promotion
Customer Service
• Avoidable Truck-roll
• Customer Churn Prediction
• Improved customer viewing experience (TV)
IT Infrastructure
• Security Cyberlake
• DoS Detection and Protection
• Master Data Management
In-Memory, Distributed Machine Learning Algorithms with H2O Flow GUI
H2O AI Open Source Engine Integration with Spark
Lightning-Fast Machine Learning on GPUs
Automatic Feature Engineering, Machine Learning and Interpretability

• 100% open source – Apache V2 licensed
• Built for data scientists – interfaces in R, Python or H2O Flow (interactive notebook interface)
• We offer Enterprise Support subscriptions

• Commercially licensed (closed source)
• Built for domain users, analysts & data scientists – GUI-based interface for end-to-end data science
• Fully automated machine learning from ingest to deployment
• We offer user licenses on a per-seat basis (annual subscription)
[Diagram: H2O architecture. Data is loaded from HDFS, S3, NFS, local files or SQL into the distributed, in-memory H2O Compute Engine with loss-less compression. Workflow stages: Exploratory & Descriptive Analysis; Feature Engineering & Selection; Supervised & Unsupervised Modeling; Model Evaluation & Selection; Predict; Data & Model Storage. Model export to the Production Scoring Environment ("Your Imagination"): Plain Old Java Object, or Model Object, Optimized.]
Supervised Learning
Statistical Analysis
• Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Ensembles
• Distributed Random Forest: Classification or regression models
• Gradient Boosting Machine: Produces an ensemble of decision trees with increasingly refined approximations
Deep Neural Networks
• Deep learning: Creates multi-layer feed-forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
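The phrase "an ensemble of decision trees with increasingly refined approximations" can be made concrete with a small sketch. The code below is a plain-Python illustration of gradient boosting for squared-error regression with one-feature decision stumps. It is not H2O's implementation; all function names (fit_stump, fit_gbm, predict_gbm, mse) and the toy data are hypothetical, and it assumes the input has at least two distinct x values.

```python
# Minimal gradient boosting sketch (squared error, one feature, decision stumps).
# Illustrative only: this is NOT H2O's GBM, and every name here is hypothetical.

def fit_stump(xs, residuals):
    """Pick the threshold split that minimizes squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # assumes at least two distinct x values
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_gbm(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict_gbm(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data: y = x^2. Training error shrinks as more trees are added.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [x * x for x in xs]
errors = [mse(fit_gbm(xs, ys, n_trees=n), xs, ys) for n in (1, 10, 100)]
```

Each new stump is fitted to what the current ensemble still gets wrong, so the entries of errors decrease as the number of trees grows. A production GBM would use deeper trees, multiple features and regularization.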
Unsupervised Learning
Clustering
• K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detects optimal k
Dimensionality Reduction
• Principal Component Analysis: Linearly transforms correlated variables to uncorrelated components
• Generalized Low Rank Models: Extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data
Anomaly Detection
• Autoencoders: Find outliers using nonlinear dimensionality reduction with deep learning
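As a rough illustration of the K-means bullet, the sketch below implements the basic assign-then-update loop (Lloyd's algorithm) in plain Python. It is not H2O's implementation and does not attempt the automatic selection of k mentioned above; the function names, the deterministic "first k points" initialization and the toy data are all assumptions made for the example.

```python
# Minimal k-means sketch (Lloyd's algorithm) in plain Python.
# Illustrative only: NOT H2O's K-means; names and initialization are hypothetical.

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, n_iterations=20):
    # Assumption: initialize centroids with the first k points so the run is
    # reproducible. Real implementations typically use randomized seeding.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(n_iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return centroids, assignments


# Toy data: two well-separated groups in 2D.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, k=2)
```

On this toy data the three points near (1, 1) end up in one cluster and the three near (9, 9) in the other. Choosing k itself is a separate problem that this sketch leaves to the caller.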
High Level Architecture
• Enable all capabilities of H2O on top of Spark
• H2O code executes directly in the Spark Executor JVM
• Spark RDDs and H2O Frames share the same memory space. Data can freely transfer between Spark and H2O without any overhead.
Model Building
Data Munging
Stream Processing
Deep Learning, GBM, GLM, DRF, K-means, GLRM, PCA, Ensembles
• To get started on H2O from the Azure marketplace, click here
• To learn more, check out our documentation site
• To learn more about Sparkling Water, check out the documentation and booklet
• Blog: Developing & Operationalizing H2O models on Azure
• Blog post on using H2O on HDInsight
ISV Showcase: End-to-end Machine Learning using H2O on Azure

Editor's Notes

  • #9 Gartner Predicts 2017: According to the report, “by 2019, startups will overtake Amazon, Google, IBM and Microsoft in driving the artificial intelligence economy with disruptive business solutions.”