Machine Learning with H2O.ai
on Google Cloud
Nicholas Png
Partnerships Software Engineer
nicholas@h2o.ai
Who is H2O.ai?
Company
● Founded in Silicon Valley
in 2012
● Funded: $75m
● Investors: Wells Fargo,
NVIDIA, Nexus Ventures,
Paxion Ventures
Products
● H2O Open Source Machine
Learning (14,000
organizations)
● H2O Driverless AI -
Automated Machine
Learning
Leadership
Leader in Gartner MQ
machine learning and
data science platform
Team
90 AI experts (5 of the
world’s top 100 data
scientists with Kaggle
Grandmasters)
Global
Mountain View
London
Prague
India
Technology leader with most
completeness of vision
Recognized for the mindshare, partner network
and status as a quasi-industry standard for
machine learning and AI
H2O.ai customers gave the highest overall
score among all the vendors for sales relationship
and account management, customer support
(onboarding, troubleshooting, etc.) and overall
service and support
Get the Gartner Magic Quadrant here
H2O.ai is a Leader in the 2018 Gartner Data Science and
Machine Learning Platforms Magic Quadrant
In-Memory,
Distributed Machine
Learning Algorithms
with H2O Flow GUI
H2O AI Open
Source Engine
Integration
with Spark
Lightning Fast
machine learning on
GPUs
100% open source – Apache V2 licensed
Built for data scientists – interface using R, Python
on H2O Flow (interactive notebook interface)
We offer Enterprise Support subscriptions
Commercial Licensed
(closed source)
Built for domain users, analysts &
data scientists – GUI based
interface for end-to-end
data science
Fully automated machine
learning from ingest to
deployment
We offer user licenses on a per
seat basis (annual subscription)
Automatic feature engineering,
machine learning and
interpretability
H2O.ai Product Suite
H2O-3
What is H2O?
Math Platform
Open source in-memory
AI engine
● Parallelized and distributed
algorithms
● GLM, Random Forest, GBM,
Deep Learning, etc.
Tech and API
Easy to use and adopt
● Written in Java - perfect for
Java programmers
● Install is lightweight
● REST API (Java) - run H2O
from R, Python, WebUI
Big data
More data or better models?
BOTH
● Use all of your data - model
without sampling
● More data + better models
= better predictions
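The "run H2O from R, Python" point above can be sketched in Python roughly as follows. This is a minimal illustration rather than anything taken from the slides: it assumes the `h2o` Python package and a Java runtime are installed, and the helper names `feature_columns` and `train_gbm` are made up for this example.

```python
def feature_columns(all_columns, target):
    """Return every column except the target, preserving order."""
    return [c for c in all_columns if c != target]


def train_gbm(path, target, mojo_dir="."):
    """Train a GBM classifier on the file at `path` and export a MOJO zip.

    `path` may be a local file or, on Google Cloud, a gs:// URI
    (assumption: the H2O cluster has credentials for that bucket).
    """
    # Imported lazily so the helper above works without h2o installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # starts a local H2O cluster or connects to a running one
    frame = h2o.import_file(path)
    frame[target] = frame[target].asfactor()  # treat the target as categorical
    train, valid = frame.split_frame(ratios=[0.8], seed=42)

    model = H2OGradientBoostingEstimator(ntrees=50, seed=42)
    model.train(
        x=feature_columns(frame.columns, target),
        y=target,
        training_frame=train,
        validation_frame=valid,
    )
    # Returns the path of the exported MOJO file.
    return model.download_mojo(path=mojo_dir)
```

A call such as `train_gbm("gs://my-bucket/train.csv", "label")` would use a hypothetical bucket and column name; substitute your own.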
Clustering
• K-Means (Auto-K)
Dimension reduction
• Principal Component Analysis
• Generalized Low Rank Models
Word embedding
• Word2Vec
Time series
• iSAX
Machine Learning tuning
• Hyperparameter Search
• Early Stopping
Algorithms on H2O
Statistical analysis
• Linear Models (GLM)
• Naïve Bayes
Ensembles
• Random Forest
• Distributed Trees
• Gradient Boosting Machine
• Stacking / Super Learner
Deep Neural Networks
• NLP
• Autoencoder
• Anomaly Detection
• Deep Features
• CNN, RNN (Deep Water)
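The "Early Stopping" item listed under Machine Learning tuning can be illustrated with a generic patience rule. This is a simplified sketch in plain Python, not H2O's exact stopping rule; the function name and the `patience` and `min_delta` parameters are assumptions chosen for illustration.

```python
def early_stopping_round(validation_losses, patience=3, min_delta=0.0):
    """Return the index of the round at which training would stop.

    Training stops once `patience` consecutive rounds fail to improve
    the best loss seen so far by more than `min_delta`. If the rule
    never fires, the index of the last round is returned.
    Expects a non-empty list of losses (lower is better).
    """
    best = float("inf")
    rounds_without_gain = 0
    for i, loss in enumerate(validation_losses):
        if best - loss > min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best = loss
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                return i
    return len(validation_losses) - 1
```

The same check is what makes a hyperparameter search cheaper: each candidate model stops as soon as its validation loss stalls instead of running a fixed number of rounds.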
Data integration
Data quality and transformation
Modeling table (Features + Target)
Model building
Model
Simplified typical machine learning pipeline
Production
Environments
Job
Fluid Vector Frame
MRTask
Distributed K/V Store
Distributed Fork/Join
Non-Blocking Hash Table
Distributed In-Memory Processing
REST / JSON
Parse
Exploratory
Analysis
Feature
Engineering
ML
Algorithms
Model
Evaluation
Scoring
Data/Model
Export
SQL
NFS
Local
GCS
HDFS
POJO
High level architecture
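The MRTask and Distributed Fork/Join boxes in the architecture refer to a map/reduce style of computation over chunks of a column. The idea can be sketched in a single Python process as follows. This is only an illustration of the pattern, not H2O's Java implementation; all function names are hypothetical, and a thread pool stands in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(values, chunk_size):
    """Cut a column into fixed-size chunks, standing in for distributed chunks."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def map_chunk(chunk):
    """Map step: summarize one chunk as a (sum, count) pair."""
    return (sum(chunk), len(chunk))


def reduce_pair(left, right):
    """Reduce step: merge two partial (sum, count) results."""
    return (left[0] + right[0], left[1] + right[1])


def column_mean(values, chunk_size=4):
    """Compute a mean by mapping over chunks in parallel, then reducing."""
    if not values:
        raise ValueError("column_mean needs at least one value")
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(reduce_pair, partials, (0, 0))
    return total / count
```

Because each chunk is summarized independently and the reduce step is associative, the partial results can be merged in any grouping, which is what allows the work to be spread across nodes.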
Driverless AI
Driverless AI delivers
“Expert data scientist in a box”
Created and supported by world renowned AI experts
Empowers companies to accomplish AI and ML
with a single platform
Performs the function of an expert data scientist
and adds more power to both novice and expert
teams
Surfaces insights and interpretability through
easy-to-understand results and visualizations
21 day free trial for Driverless AI
Driverless AI
Data integration
Data quality and transformation
Modeling table (Features + Target)
Model building
Model
Typical enterprise machine learning workflow
Data is a team sport
~100
Data science experts in the
world
Weeks to
hours
Time for a data scientist to
build a model
Black box models
Lack of AI talent
Time to insights slow
Lack of trust in AI
“US alone faces a shortage of 190,000
people with analytical expertise.”
Driverless
AI delivers
Your digital data
scientist
Automatic Feature Engineering with
GPU accelerated machine learning
Explainable and
Interpretable AI
Why Driverless AI for Enterprise AI adoption
Automatic feature engineering to
increase accuracy - AlphaGo for AI
Automatic Kaggle Grandmaster
recipes in a box for solving wide
variety of use-cases
Automatic machine learning to find
and tune the right ensemble of
models
Accuracy
Original features
Generated
features
Automatic Text Handling
Frequency Encoding
Cross Validation Target Encoding
Truncated Singular Value
Decomposition
Clustering and more
Feature transformations
Auto feature generation
Kaggle Grandmaster Out of the Box
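Two of the feature transformations named above, Frequency Encoding and Cross Validation Target Encoding, can be sketched in plain Python. This is a minimal illustration of the general techniques, not Driverless AI's implementation; the function names and the round-robin fold assignment are assumptions made for the example.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in values]


def cv_target_encode(values, targets, n_folds=2):
    """Out-of-fold target encoding.

    Each row is encoded with the mean target of its category computed
    only from rows in the other folds, so a row never sees its own
    label. Categories absent from the other folds fall back to the
    mean target of those folds.
    """
    n = len(values)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Rows are assigned to folds round-robin by index (an assumption;
        # real tools typically shuffle or stratify).
        sums, counts = Counter(), Counter()
        total, seen = 0.0, 0
        for i in range(n):
            if i % n_folds != fold:
                sums[values[i]] += targets[i]
                counts[values[i]] += 1
                total += targets[i]
                seen += 1
        prior = total / seen if seen else 0.0
        # Encode the held-out rows using statistics from the other folds.
        for i in range(n):
            if i % n_folds == fold:
                v = values[i]
                encoded[i] = sums[v] / counts[v] if counts[v] else prior
    return encoded
```

Computing the statistic out of fold is the point of the cross-validation variant: encoding a row with a mean that includes its own label leaks the target into the feature.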
Deployment Options
YARN
CPU CPU
Model Building
SQL NFS
GCS
Kubernetes / Kubeflow
H2O.ai Driverless AI
H2O Distributed
In-Memory
H2O.ai + Kubeflow
CPU
H2O Flow
H2O Cluster
(H2O can run anywhere: desktop, cloud, on-prem;
Hadoop and Spark environments supported)
Model training
Model Repository
POJO
(java file)
MOJO
(zip file)
C++
MOJO
Library
Java
MOJO
Library
Java R Py .NET ...
...
...
Apps Language bindings
Model management (Store models in H2O Steam, git, HDFS, S3, etc.)
Model deployment (Add any language with C/C++ binding support)
Save Model
Load Model
Load Model
H2O deployment options
BigQuery
NFS
Local
Cloud
Storage
HDFS
Storage Data Munging Driverless AI
Compute Engine
MOJO
(.zip)
Compute Engine
Inference
● Initial data stored on
HDFS or Google
BigQuery
● Deploy MOJO file to serve
real-time inference (millisecond
response times)
● Additional logic can be placed
before or after calling the MOJO
High Level Deployment Pipeline - Spark
Google Dataproc
● Save munged data to structured data file
● Ingest data file into Driverless AI
● Automatic feature engineering
● Automatic visualizations
● Complete model pipeline exported as MOJO
● Generate high performance model, ensemble
XGBoost + TF + RuleFit
● Ingest data into Spark running
on Google Dataproc.
● Use Sparkling Water for
preliminary modeling and data
munging.
● Current data pipeline can be
added here
BigQuery
NFS
Local
Cloud
Storage
HDFS
Storage
Google BigQuery
Data Munging Driverless AI
Compute Engine
MOJO
(.zip)
Compute Engine
Inference
● Initial data stored on
HDFS or Google
BigQuery
● Perform data cleaning and data
munging in Google BigQuery.
● Driverless AI has an integrated
connector with GBQ for direct
data ingest via SQL queries
● Automatic feature engineering
● Automatic visualizations
● Complete model pipeline exported as MOJO
● Generate high performance models, ensemble
XGBoost + TF + RuleFit
● Deploy MOJO file to serve
real-time inference (millisecond
response times)
● Additional logic can be placed
before or after calling the MOJO
High Level Deployment Pipeline - BigQuery
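The note that additional logic can be placed before or after calling the MOJO can be sketched as a thin wrapper around a scoring function. This is a hypothetical illustration: `score` stands in for a loaded MOJO and is not a real H2O API, and the `amount` field, the threshold and the decision labels are invented for the example.

```python
def make_inference_handler(score, threshold=0.5):
    """Wrap a scoring callable with pre- and post-processing.

    `score` is a hypothetical callable standing in for a loaded MOJO:
    it takes a dict of feature values and returns a probability.
    """
    def handle(request):
        # Pre-processing: a business rule evaluated before the model is called.
        if request.get("amount") is None:
            return {"decision": "reject", "reason": "missing amount"}
        features = {"amount": float(request["amount"])}

        probability = score(features)

        # Post-processing: turn the raw score into a decision.
        decision = "review" if probability >= threshold else "approve"
        return {"decision": decision, "probability": probability}

    return handle
```

Keeping the rules outside the model artifact means they can change without retraining or re-exporting the MOJO.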
Demo Time!
nicholas@h2o.ai

More Related Content

What's hot

Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Databricks
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
Sri Ambati
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on Azure
Mark Tabladillo
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
Jasjeet Thind
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel Kobran
Databricks
 
Distributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDLDistributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDL
Yulia Tell
 
Bigdata Machine Learning Platform
Bigdata Machine Learning PlatformBigdata Machine Learning Platform
Bigdata Machine Learning Platform
Mk Kim
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
Sri Ambati
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Sri Ambati
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
Sri Ambati
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI Today
DESMOND YUEN
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Databricks
 
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Databricks
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Databricks
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Rodney Joyce
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload Deployment
Databricks
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Databricks
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
VMware Tanzu
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
Jo-fai Chow
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Sri Ambati
 

What's hot (20)

Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on Azure
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel Kobran
 
Distributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDLDistributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDL
 
Bigdata Machine Learning Platform
Bigdata Machine Learning PlatformBigdata Machine Learning Platform
Bigdata Machine Learning Platform
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI Today
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload Deployment
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
 

Similar to Machine Learning on Google Cloud with H2O

Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
Sri Ambati
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
James Serra
 
Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
Sri Ambati
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
Sri Ambati
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
Trivadis
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
James Serra
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
David Talby
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
DataWorks Summit/Hadoop Summit
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
Mark Tabladillo
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session
Sri Ambati
 
Deep Learning Technical Pitch Deck
Deep Learning Technical Pitch DeckDeep Learning Technical Pitch Deck
Deep Learning Technical Pitch Deck
Nicholas Vossburg
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
David Chou
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
DataWorks Summit/Hadoop Summit
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Sri Ambati
 
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Startup Club
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJA
Nicolas Poggi
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWS
Sri Ambati
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
Mark Tabladillo
 

Similar to Machine Learning on Google Cloud with H2O (20)

Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session
 
Deep Learning Technical Pitch Deck
Deep Learning Technical Pitch DeckDeep Learning Technical Pitch Deck
Deep Learning Technical Pitch Deck
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
 
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJA
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWS
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 

More from Sri Ambati

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
Sri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
Sri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
Sri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
Sri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
Sri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
Sri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
Sri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
Sri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
Sri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
Sri Ambati
 

More from Sri Ambati (20)

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Machine Learning on Google Cloud with H2O

  • 1. Machine Learning with H2O.ai on Google Cloud Nicholas Png Partnerships Software Engineer nicholas@h2o.ai
  • 2. Who is H2O.ai? Company ● Founded in Silicon Valley in 2012 ● Funded: $75m ● Investors: Wells Fargo, NVIDIA, Nexus Ventures, Paxion Ventures Products ● H2O Open Source Machine Learning (14,000 organizations) ● H2O Driverless AI - Automated Machine Learning Leadership Leader in Gartner MQ machine learning and data science platform Team 90 AI experts (5 of the world’s top 100 data scientists with Kaggle Grandmasters) Global Mountain View London Prague India
  • 3. Technology leader with most completeness of vision Recognized for the mindshare, partner network and status as a quasi-industry standard for machine learning and AI H2O.ai customers gave the highest overall score among all the vendors for sales relationship and account management, customer support (onboarding, troubleshooting, etc.) and overall service and support Get the Gartner Magic Quadrant here H2O.ai is a Leader in the 2018 Gartner Data Science and Machine Learning Platforms Magic Quadrant
  • 4. In-Memory, Distributed Machine Learning Algorithms with H2O Flow GUI H2O AI Open Source Engine Integration with Spark Lightning Fast machine learning on GPUs 100% open source – Apache V2 licensed Built for data scientists – interface using R, Python on H2O Flow (interactive notebook interface) We offer Enterprise Support subscriptions Commercial Licensed (closed source) Built for domain users, analysts & data scientists – GUI based interface for end-to-end data science Fully automated machine learning from ingest to deployment We offer user licenses on a per seat basis (annual subscription) Automatic feature engineering, machine learning and interpretability H2O.ai Product Suite
  • 6. What is H2O? Math Platform Open source in-memory AI engine ● Parallelized and distributed algorithms ● GLM, Random Forest, GBM, Deep Learning, etc. Tech and API Easy to use and adopt ● Written in Java - perfect for Java programmers ● Install is lightweight ● REST API (Java) - run H2O from R, Python, WebUI Big data More data or better models? BOTH ● Use all of your data - model without sampling ● More data + better models = better predictions
  • 7. Clustering • K-Means (Auto-K) Dimension reduction • Principal Component Analysis • Generalized Low Rank Models Word embedding • Word2Vec Time series • iSAX Machine Learning tuning • Hyperparameter Search • Early Stopping Algorithms on H2O Statistical analysis • Linear Models (GLM) • Naïve Bayes Ensembles • Random Forest • Distributed Trees • Gradient Boosting Machine • Stacking / Super Learner Deep Neural Networks • NLP • Autoencoder • Anomaly Detection • Deep Features • CNN, RNN (Deep Water)
  • 8. + Data integration Data quality and transformation Modeling table Model building Model Features Target Simplified typical machine learning pipeline
  • 9. Production Environments Job Fluid Vector Frame MRTask Distributed K/V Store Distributed Fork/Join Non-Blocking Hash Table Distributed In-Memory Processing REST / JSON Parse Exploratory Analysis Feature Engineering ML Algorithms Model Evaluation Scoring Data/Model Export SQL NFS Local GCS HDFS POJO High level architecture
  • 11. Driverless AI delivers “Expert data scientist in a box” Created and supported by world renowned AI experts Empowers companies to accomplish AI and ML with a single platform Performs the function of an expert data scientist and adds more power to both novice and expert teams Details and highlights insights and interpretability with easy to understand results and visualizations 21 day free trial for Driverless AI
  • 12. Driverless AI + Data integration Data quality and transformation Modeling table Model building Model Features Target Typical enterprise machine learning workflow
  • 13. Data is a team sport ~100 Data science experts in the world Weeks to hours Time for a data scientist to build a model Black box models Lack of AI talent Time to insights slow Lack of trust in AI “US alone faces a shortage of 190,000 people with analytical expertise.” Driverless AI delivers Your digital data scientist Automatic Feature Engineering with GPU accelerated machine learning Explainable and Interpretable AI Why Driverless AI for Enterprise AI adoption
  • 14. Automatic feature engineering to increase accuracy - AlphaGo for AI Automatic Kaggle Grandmaster recipes in a box for solving wide variety of use-cases Automatic machine learning to find and tune the right ensemble of models Accuracy
  • 15. Original features Generated features Automatic Text Handling Frequency Encoding Cross Validation Target Encoding Truncated Singular Value Decomposition Clustering and more Feature transformations Auto feature generation Kaggle Grandmaster Out of the Box
  • 17. YARN CPU CPU Model Building SQL NFS GCS Kubernetes / Kubeflow H2O.ai Driverless AI H2O Distributed In-Memory H2O.ai + Kubeflow CPU
  • 18. H2O Flow H2O Cluster (H2O can run anywhere: desktop, cloud, on-prem; Hadoop and Spark environments supported) Model training Model Repository POJO (java file) MOJO (zip file) C++ MOJO Library Java MOJO Library Java R Py .NET ... ... ... Apps Language bindings Model management Model deployment (Store models in H2O Steam, git, HDFS, S3, etc.) (Add any language with C/C++ binding support) Save Model Load Model Load Model H2O deployment options
  • 19. BigQuery NFS Local Cloud Storage HDFS Storage Data Munging Driverless AI Compute Engine MOJO (.zip) Compute Engine Inference ● Initial data stored on HDFS or Google BigQuery ● Deploy MOJO file to serve real-time inference (millisecond response times) ● Additional logic can be placed before or after calling the MOJO High Level Deployment Pipeline - Spark Google Dataproc ● Save munged data to structured data file ● Ingest data file into Driverless AI ● Automatic feature engineering ● Automatic visualizations ● Complete model pipeline exported as MOJO ● Generate high performance models, ensemble XGBoost + TF + RuleFit ● Ingest data into Spark running on Google Dataproc. ● Use Sparkling Water for preliminary modeling and data munging. ● Current data pipeline can be added here
  • 20. BigQuery NFS Local Cloud Storage HDFS Storage Google BigQuery Data Munging Driverless AI Compute Engine MOJO (.zip) Compute Engine Inference ● Initial data stored on HDFS or Google BigQuery ● Perform data cleaning and data munging in Google BigQuery. ● Driverless AI has an integrated connector with GBQ for direct data ingest via SQL queries ● Automatic feature engineering ● Automatic visualizations ● Complete model pipeline exported as MOJO ● Generate high performance models, ensemble XGBoost + TF + RuleFit ● Deploy MOJO file to serve real-time inference (millisecond response times) ● Additional logic can be placed before or after calling the MOJO High Level Deployment Pipeline - BigQuery
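
Both deployment pipelines note that additional logic can be placed before or after calling the MOJO. The sketch below shows one way that wrapping might look in Python. It is an illustration only: `make_inference_handler`, `stand_in_score`, the `amount` field, and the 0.5 threshold are all hypothetical names and values, and the scoring function is a placeholder rather than the MOJO runtime API, which is not shown in the slides.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, float]


def make_inference_handler(
    score: Callable[[Row], float],
    required: List[str],
    threshold: float = 0.5,
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Wrap a scoring function with pre- and post-processing steps."""

    def handle(raw: Dict[str, Any]) -> Dict[str, Any]:
        # Pre-processing: reject incomplete requests and coerce values to float.
        missing = [name for name in required if name not in raw]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        row = {name: float(raw[name]) for name in required}

        # Scoring: a real deployment would call the exported MOJO here.
        probability = score(row)

        # Post-processing: turn the raw score into a decision for the caller.
        return {"probability": probability, "flagged": probability >= threshold}

    return handle


def stand_in_score(row: Row) -> float:
    # Hypothetical placeholder model, NOT a MOJO: scales amount into [0, 1].
    return min(1.0, row["amount"] / 1000.0)


handler = make_inference_handler(stand_in_score, required=["amount"])
result = handler({"amount": "750", "customer_id": "abc"})
print(result)
```

Passing the scorer in as a callable keeps the surrounding logic independent of how the model is loaded, so the placeholder could be swapped for a real MOJO call without changing the validation or thresholding code.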