SlideShare a Scribd company logo
Machine Learning for
Smarter Applications
Tom Kraljevic
January 28, 2015
Jacksonville, FL
• About H2O.ai and H2O (10 minutes)
• H2O in Big Data Environments (10 minutes)
• How H2O Processes Data (10 minutes)
• Building Smarter Apps with H2O (20 minutes)
• (Demo) Storm App w/ Streaming Predictions (10 minutes)
• Q & A (20 minutes)
Content for today’s talk can be found at:
https://github.com/h2oai/h2o-
meetups/tree/master/2015_01_28_MLForSmarterApps
Outline for today’s talk
H2O.ai
Machine Intelligence
• Founded: 2011 venture-backed, debuted in 2012
• Product: H2O open source in-memory prediction engine
• Team: 30
• HQ: Mountain View, CA
• SriSatish Ambati – CEO & Co-founder (Founder Platfora, DataStax; Azul)
• Cliff Click – CTO & Co-founder (Creator Hotspot, Azul, Sun, Motorola, HP)
• Tom Kraljevic – VP of Engineering (CTO & Founder Luminix, Azul, Chromatic)
H2O.ai Overview
H2O.ai
Machine Intelligence
TBD.
Customer
Support
TBD
Head of Sales
Distributed
Systems
Engineers
Making
ML Scale!
H2O.ai
Machine Intelligence
H2O.ai
Machine Intelligence
cientific Advisory Council
Stephen Boyd
Professor of EE Engineering
Stanford University
Rob Tibshirani
Professor of Health Research
and Policy, and Statistics
Stanford University
Trevor Hastie
Professor of Statistics
Stanford University
H2O.ai
Machine Intelligence
What is H2O?
Open source in-memory prediction engineMath Platform
• Parallelized and distributed algorithms making the most use out of
multithreaded systems
• GLM, Random Forest, GBM, PCA, etc.
Easy to use and adoptAPI
• Written in Java – perfect for Java Programmers
• REST API (JSON) – drives H2O from R, Python, Excel, Tableau
More data? Or better models? BOTHBig Data
• Use all of your data – model without down sampling
• Run a simple GLM or a more complex GBM to find the best fit for the data
• More Data + Better Models = Better Predictions
Ensembles
Deep Neural Networks
Algorithms on H2O
• Generalized Linear Models: Binomial,
Gaussian, Gamma, Poisson and Tweedie
• Cox Proportional Hazards Models
• Naïve Bayes
• Distributed Random Forest: Classification or
regression models
• Gradient Boosting Machine: Produces an
ensemble of decision trees with increasing
refined approximations
• Deep learning: Create multi-layer feed
forward neural networks starting with an input
layer followed by multiple layers of nonlinear
transformations
Supervised Learning
Statistical
Analysis
H2O.ai
Machine Intelligence
Dimensionality Reduction
Anomaly Detection
Algorithms on H2O
• K-means: Partitions observations into k
clusters/groups of the same spatial size
• Principal Component Analysis: Linearly
transforms correlated variables to independent
components
• Autoencoders: Find outliers using a nonlinear
dimensionality reduction using deep learning
Unsupervised Learning
Clustering
H2O.ai
Machine Intelligence
Per Node
2M Row ingest/sec
50M Row Regression/sec
750M Row Aggregates / sec
On Premise
On Hadoop & Spark
On EC2
Tableau
R
JSON
Scala
Java
H2O Prediction Engine
Nano Fast Scoring Engine
Memory Manager
Columnar Compression
Rapids Query R-engine
In-Mem Map Reduce
Distributed fork/join
Python
HDFS S3 SQL NoSQL
SDK / API
Excel
H2O.ai
Machine Intelligence
Deep Learning
Ensembles
H2O in Big Data
Environments
H2O on YARN Deployment
Hadoop Gateway Node
hadoop jar h2odriver.jar …
YARN RM
HDFSHDFS
Data Nodes
Hadoop Cluster
H2
O
H2O Mappers
(YARN Containers)
YARN Worker Nodes
NM NM NM NM NM NM
YARN Node
Managers (NM)
H2
O
H2
O
Now You Have an H2O Cluster
Hadoop Gateway Node
hadoop jar h2odriver.jar …
HDFSHDFS
Data Nodes
Hadoop Cluster
H2
O
H2O Mappers
(YARN Containers)
YARN Worker Nodes
NM NM NM
YARN Node
Managers (NM)
H2
O
H2
O
YARN RM
Read Data from HDFS Once
Hadoop Gateway Node
hadoop jar h2odriver.jar …
HDFSHDFS
Data Nodes
Hadoop Cluster
H2
O
H2O Mappers
(YARN Containers)
YARN Worker Nodes
NM NM NM
YARN Node
Managers (NM)
H2
O
H2
O
YARN RM
Build Models in-Memory
Hadoop Gateway Node
hadoop jar h2odriver.jar …
Hadoop Cluster
H2
O
H2O Mappers
(YARN Containers)
YARN Worker Nodes
NM NM NM
YARN Node
Managers (NM)
H2
O
H2
O
YARN RM
How H2O Processes Data
Distributed Data Taxonomy
Vector
H2O.ai
Machine Intelligence
Distributed Data Taxonomy
Vector
The vector may be very large
(billions of rows)
- Stored as a compressed column (often 4x)
- Access as Java primitives with
on-the-fly decompression
- Support fast Random access
- Modifiable with Java memory semantics
H2O.ai
Machine Intelligence
Distributed Data Taxonomy
Vector Large vectors must be distributed
over multiple JVMs
- Vector is split into chunks
- Chunk is a unit of parallel access
- Each chunk ~ 1000 elements
- Per-chunk compression
- Homed to a single node
- Can be spilled to disk
- GC very cheap
H2O.ai
Machine Intelligence
Distributed Data Taxonomy
age gender zip_code ID
A row of data is always
stored in a single JVM
Distributed data frame
- Similar to R data frame
- Adding and removing
columns is cheap
H2O.ai
Machine Intelligence
Distributed Fork/Join
JVM
task
JVM
task
JVM
task
JVM
task
JVM
task
Task is distributed in a
tree pattern
- Results are reduced
at each inner node
- Returns with a single
result when all
subtasks done
H2O.ai
Machine Intelligence
Distributed Fork/Join
JVM
task
JVM
task
task task
tasktaskchunk
chunk chunk
On each node the task is parallelized using Fork/Join
H2O.ai
Machine Intelligence
Propensity to Buy modeling factory
A Propensity To Buy model (P2B) is a statistical model that tries to predict whether
a certain company will buy a certain product in a given time frame in the future.
What is a Propensity To Buy model?
Build
Statistical
Model
Historical
• Firmographics
• Past Purchase Behavior
• Contacts (# and types)
• Marketing Interactions
• Cust Sat surveys
• Macroeconomic indicators
• Purchase / Non Purchase
• SCORE: probability that a company
will buy a specific technology in the
next quarter
• VALUE: bookings amount that
Cisco will likely see IF the company
in fact buys the technology
t
…
Q1 Q2Q4
(most recent
closed quarter)
Q3Q2Q1Q4
past purchase behavior
and marketing interactions
firmographics
and contact data
…
predicted
purchase
window
Scoring
happens
here
Latest
• Firmographics
• Past Purchase Behavior
• Contacts (# and types)
• Marketing Interactions
• Customer satis. surveys
• Macroeconomic indicators
H2O.ai
Machine Intelligence
H2O Business Benefit with P2B
Q1 Q2
P2B Training
Scoring
models
Data
Refresh Q2
Data
Refresh Q1
Prepare,
execute
Mktg & Sales
activities
Before, without H2O
Q1 Q2
Train
&
score
Data
Refresh
Prepare, execute
Mktg & Sales
activities
Train
&
score
Data
Refresh
Prepare, execute
Mktg & Sales
activities
Now, with H2O
H2O.ai
Machine Intelligence
Without H2O:
• Models needed to be
prepared in advance, not to
delay scoring
• More time preparing
models, less time left for
using the scores in the
sales activities
With H2O:
• Newer buying patterns
incorporated immediately
into models
• Scores are published
sooner
• More time for planning
and executing
activities
Results & Lessons Learned
Improvements
• P2B factory is 15x faster with H2O
• Quicker techniques for simpler problems, deeper for harder ones
(grid searches!)
• Ensembles improved accuracy and stability of models significantly
Lessons Learned
• Even with few nodes speed improvement over traditional data
mining tools is substantial
• H2O becomes really powerful and robust when combined with R
• Rely on H2O’s extremely responsive support
H2O.ai
Machine Intelligence
Fraud prevention using Deep Learning
Fraud Prevention at PayPal
H2O.ai
Machine Intelligence
• Employs state of the art machine learning and
statistical models to flag fraudulent behavior
upfront
• More sophisticated algorithms after
transaction complete
Transaction Level
• Monitor activity to identify abusive behavior
• Frequent payments, suspicious profile
changes
Account Level
• Monitor account to account interaction
• Frequent transfer of money from sever
accounts to one central account suspicious
Network Level
Fraud Prevention at PayPal
H2O.ai
Machine Intelligence
• Unearth low-level complex
abstractions
• Learn complex, highly
varying functions not
present in the training
examples
• Widely employed for
image/video processing
and object recognition
Why Deep
Learning?
• Highly scalable
• Superior performance
• Flexible deployment
• Works seamlessly with
other big data frameworks
• Simple interface
Why H2O?
Experiment
• Dataset
− 160 million records
− 1500 features (150 categorical)
− 0.6TB compressed in HDFS
• Infrastructure
− 800 node Hadoop (CDH3)
cluster
• Decision
− Fraud/not-fraud
Conclusions using H2O
Deep Learning with H2O is beneficial for payment fraud prevention
• Network architecture- 6 layers with 600 neurons each performed the
best
• Activation function
− RectifierWithDropout performed the best
• Improved performance with limited feature set & a deep network
− 11% improvement with a third of the original feature set, 6 hidden layers, 600
neurons each
• Robust to temporal variations
H2O.ai
Machine Intelligence
Bordeaux Wines
Unsupervised Learning & Bordeaux Wine
• A personal and expen$$$ive obsession
• Alex Tellez - h2o.ai
BORDEAUX WINE
Largest wine-growing region in France
+ 700 Million bottles of wine produced / year !
Some years better than others: Great ($$$) vs. Typical ($)
Last Great years: 2010, 2009, 2005, 2000
‘EN PRIMEUR’
While wine is still barreled, purchasers can ‘invest’ in the wine
before bottling and official public release.
Advantage: Wines may be considerably cheaper during
‘en primeur’ period than @ bottling.
Great Years: 2009, ’05, ‘00
GREAT VS. TYPICAL VINTAGE?
Question:
Can we study weather patterns in Bordeaux
leading up to harvest to identify ‘anomalous’ weather years >>
correlates to Great ($$$) vs. Typical ($) Vintage?
The Bordeaux Dataset (1952 - 2014 Yearly)
Amount of Winter Rain (Oct > Apr of harvest year)
Average Summer Temp (Apr > Sept of harvest year)
Rain during Harvest (Aug > Sept)
Years since last Great Vintage
AUTOENCODER + ANOMALY
DETECTION
In Steps:
1) Train deep autoencoder to learn ‘typical’ vintage weather pattern
2) Append ‘great’ vintage year weather data to original dataset
3) IF great vintage year weather data does NOT match learned
weather pattern, autoencoder will produce high reconstruction
error (MSE)
‘en primeur of en primeur’
- Can we use weather patterns to identify anomalous years
>> indicates great vintage quality?
Goal:
RESULTS (MSE > 0.10)
MeanSquareError
1961 V 2009 V
2005 V
2000 V
1990 V
1989 V
1982 V
2010 V
2014 BORDEAUX??
MeanSquareError
2014 ??
ROBERT PARKER JR.
“The world’s most prized palette” - Financial Times
DESCRIBING WINES
=
…wet forest floor
…decaying animal skin
…cat pee
…grandmother’s closet
…barnyardy (AKA sh*t)
…asian plum sauce
…chewy tannins
Can we use Parker reviews to recommend wines we would like??
WINE RECOMMENDATIONS
I like: 2000 Cheval Blanc (Red)
Robert Parker Tasting Notes:
“…sweet nose of menthol, melted licorice, boysenberry, blueberry,
creme de cassis…sweet tannins w/ hints of coffee & earth”
So I’m at BevMo…now what?
APPROACH
In Steps:
1. Collect hundreds of Robert Parker reviews of wine
2. Create bag-of-words dataset & scale word counts between 0 & 1
(~ probability a particular word occurs in a wine review)
3. Build deep autoencoder from sparse bag-of-words matrix
>> reduce to 10 ‘deep features’
4. Take Euclidian distance of wine I like (2000 Cheval Blanc) to
other wines >> Information Retrieval
5. ‘Nearby’ wine review-vectors = recommendations!
(HELLA) DEEP AUTOENCODER
Wine
Review
Decoder
Encoder
EUCLIDIAN DISTANCE
Input: Output
Deep Feature Space
RESULTS
I like: 2000 Cheval Blanc
“…sweet nose of menthol, melted licorice, boysenberry, blueberry,
creme de cassis…sweet tannins w/ hints of coffee & earth”
Similar Wines:
“… aromas of licorice, damp forest floor, blueberries & currant”
“…loads of cassis, coffee, earth & chocolatey notes”
“…massive blackberry, mulberry fruit intermixed with
earth, crushed rocks”
“…ethereal bouquet of menthol, coffee, wet stones,
black cherries & blackberries.”
Real-time Streaming
Predictions
with H2O and Storm
H2O on Storm
(Spout)
(Bolt)
H2O
Scoring
POJO
Real-time data
Predictions
Data
Source
(e.g. HDFS)
H2O
Emit POJO Model
Real-time StreamModeling workflow
H2O.ai
Machine Intelligence
Q & A
H2O.ai
Machine Intelligence
Thanks for attending!
Content for today’s talk can be found at:
https://github.com/h2oai/h2o-
meetups/tree/master/2015_01_28_MLForSmarterApps

More Related Content

What's hot

Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
Anqi Fu
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
MLconf
 
Building Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling WaterBuilding Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling Water
Sri Ambati
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
Sri Ambati
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
Albert Bifet
 
Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
Sri Ambati
 
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Sri Ambati
 
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
Herman Wu
 
Deep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno CandelDeep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno Candel
Sri Ambati
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2O
Sri Ambati
 
PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning" PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning"
Joshua Bloom
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Sri Ambati
 
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
台灣資料科學年會
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
Jo-fai Chow
 
AI Development with H2O.ai
AI Development with H2O.aiAI Development with H2O.ai
AI Development with H2O.ai
Yalçın Yenigün
 
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochgArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
Sri Ambati
 
Productive Use of the Apache Spark Prompt with Sam Penrose
Productive Use of the Apache Spark Prompt with Sam PenroseProductive Use of the Apache Spark Prompt with Sam Penrose
Productive Use of the Apache Spark Prompt with Sam Penrose
Databricks
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
Yalçın Yenigün
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
Ian Gomez
 
Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013
MLconf
 

What's hot (20)

Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
Building Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling WaterBuilding Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling Water
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
 
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
 
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
 
Deep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno CandelDeep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno Candel
 
High Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2OHigh Performance Machine Learning in R with H2O
High Performance Machine Learning in R with H2O
 
PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning" PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning"
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
 
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
雲端影音與物聯網平台的軟體工程挑戰:以 Skywatch 為例-陳維超
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
 
AI Development with H2O.ai
AI Development with H2O.aiAI Development with H2O.ai
AI Development with H2O.ai
 
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochgArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
 
Productive Use of the Apache Spark Prompt with Sam Penrose
Productive Use of the Apache Spark Prompt with Sam PenroseProductive Use of the Apache Spark Prompt with Sam Penrose
Productive Use of the Apache Spark Prompt with Sam Penrose
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
 
Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013Ted Willke, Intel Labs MLconf 2013
Ted Willke, Intel Labs MLconf 2013
 

Viewers also liked

Pdf world s_most_admired_wine_brands_pdf_march_2014
Pdf world s_most_admired_wine_brands_pdf_march_2014Pdf world s_most_admired_wine_brands_pdf_march_2014
Pdf world s_most_admired_wine_brands_pdf_march_2014
Jenna Boodram
 
Wine As An Asset or Alternative Strategy
Wine As An Asset or Alternative StrategyWine As An Asset or Alternative Strategy
Wine As An Asset or Alternative Strategy
Jennifer Williams-Bulkeley DWS
 
H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215
Sri Ambati
 
H2O 3 REST API Overview
H2O 3 REST API OverviewH2O 3 REST API Overview
H2O 3 REST API Overview
Sri Ambati
 
H2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellH2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDell
Sri Ambati
 
H2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
H2O World - Collaborative, Reproducible Research with H2O - Nick ElprinH2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
H2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
Sri Ambati
 
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OStrata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Sri Ambati
 
Sparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal MalohlavaSparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal Malohlava
Sri Ambati
 
33insights: Investing in WineTech, Global Insights
33insights: Investing in WineTech, Global Insights33insights: Investing in WineTech, Global Insights
33insights: Investing in WineTech, Global Insights
Vincent PRETET
 
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
Livex
 

Viewers also liked (11)

Pdf world s_most_admired_wine_brands_pdf_march_2014
Pdf world s_most_admired_wine_brands_pdf_march_2014Pdf world s_most_admired_wine_brands_pdf_march_2014
Pdf world s_most_admired_wine_brands_pdf_march_2014
 
Fine winelist
Fine winelistFine winelist
Fine winelist
 
Wine As An Asset or Alternative Strategy
Wine As An Asset or Alternative StrategyWine As An Asset or Alternative Strategy
Wine As An Asset or Alternative Strategy
 
H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215H2ODeepLearningThroughExamples021215
H2ODeepLearningThroughExamples021215
 
H2O 3 REST API Overview
H2O 3 REST API OverviewH2O 3 REST API Overview
H2O 3 REST API Overview
 
H2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDellH2O World - Ensembles with Erin LeDell
H2O World - Ensembles with Erin LeDell
 
H2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
H2O World - Collaborative, Reproducible Research with H2O - Nick ElprinH2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
H2O World - Collaborative, Reproducible Research with H2O - Nick Elprin
 
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OStrata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2O
 
Sparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal MalohlavaSparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal Malohlava
 
33insights: Investing in WineTech, Global Insights
33insights: Investing in WineTech, Global Insights33insights: Investing in WineTech, Global Insights
33insights: Investing in WineTech, Global Insights
 
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
HKIWSF Conference - Liv-ex on Fine Wine Investment Seminar 2009
 

Similar to Machine Learning for Smarter Apps - Jacksonville Meetup

ISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on AzureISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on Azure
Microsoft Tech Community
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
Inside Analysis
 
Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
Sri Ambati
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Precisely
 
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
HostedbyConfluent
 
Eric Proegler Oredev Performance Testing in New Contexts
Eric Proegler Oredev Performance Testing in New ContextsEric Proegler Oredev Performance Testing in New Contexts
Eric Proegler Oredev Performance Testing in New Contexts
Eric Proegler
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
Abhishek Roy
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
Michael Hiskey
 
Proud to be polyglot
Proud to be polyglotProud to be polyglot
Proud to be polyglot
Tugdual Grall
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
Sri Ambati
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataStylight
 
How to Choose a Host for a Big Data Project
How to Choose a Host for a Big Data ProjectHow to Choose a Host for a Big Data Project
How to Choose a Host for a Big Data ProjectPeak Hosting
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
MSAdvAnalytics
 
The New Model
The New ModelThe New Model
The New Model
David Kaiser
 
Skilwise Big data
Skilwise Big dataSkilwise Big data
Skilwise Big data
Skillwise Group
 
There is more to Big Data than data
There is more to Big Data than dataThere is more to Big Data than data
There is more to Big Data than dataCapgemini
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2
Skillwise Group
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Retail & CPG
Retail & CPGRetail & CPG
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
Igor Roiter
 

Similar to Machine Learning for Smarter Apps - Jacksonville Meetup (20)

ISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on AzureISV Showcase: End-to-end Machine Learning using H2O on Azure
ISV Showcase: End-to-end Machine Learning using H2O on Azure
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
 
Eric Proegler Oredev Performance Testing in New Contexts
Eric Proegler Oredev Performance Testing in New ContextsEric Proegler Oredev Performance Testing in New Contexts
Eric Proegler Oredev Performance Testing in New Contexts
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
Proud to be polyglot
Proud to be polyglotProud to be polyglot
Proud to be polyglot
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Lean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big DataLean Enterprise, Microservices and Big Data
Lean Enterprise, Microservices and Big Data
 
How to Choose a Host for a Big Data Project
How to Choose a Host for a Big Data ProjectHow to Choose a Host for a Big Data Project
How to Choose a Host for a Big Data Project
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
 
The New Model
The New ModelThe New Model
The New Model
 
Skilwise Big data
Skilwise Big dataSkilwise Big data
Skilwise Big data
 
There is more to Big Data than data
There is more to Big Data than dataThere is more to Big Data than data
There is more to Big Data than data
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Retail & CPG
Retail & CPGRetail & CPG
Retail & CPG
 
Gcp dataflow
Gcp dataflowGcp dataflow
Gcp dataflow
 

More from Sri Ambati

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
Sri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
Sri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
Sri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
Sri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
Sri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
Sri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
Sri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
Sri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
Sri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
Sri Ambati
 

More from Sri Ambati (20)

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 

Recently uploaded

Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Jay Das
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 

Recently uploaded (20)

Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 

Machine Learning for Smarter Apps - Jacksonville Meetup

  • 1. Machine Learning for Smarter Applications Tom Kraljevic January 28, 2015 Jacksonville, FL
  • 2. • About H2O.ai and H2O (10 minutes) • H2O in Big Data Environments (10 minutes) • How H2O Processes Data (10 minutes) • Building Smarter Apps with H2O (20 minutes) • (Demo) Storm App w/ Streaming Predictions (10 minutes) • Q & A (20 minutes) Content for today’s talk can be found at: https://github.com/h2oai/h2o- meetups/tree/master/2015_01_28_MLForSmarterApps Outline for today’s talk H2O.ai Machine Intelligence
  • 3. • Founded: 2011 venture-backed, debuted in 2012 • Product: H2O open source in-memory prediction engine • Team: 30 • HQ: Mountain View, CA • SriSatish Ambati – CEO & Co-founder (Founder Platfora, DataStax; Azul) • Cliff Click – CTO & Co-founder (Creator Hotspot, Azul, Sun, Motorola, HP) • Tom Kraljevic – VP of Engineering (CTO & Founder Luminix, Azul, Chromatic) H2O.ai Overview H2O.ai Machine Intelligence
  • 5. H2O.ai Machine Intelligence cientific Advisory Council Stephen Boyd Professor of EE Engineering Stanford University Rob Tibshirani Professor of Health Research and Policy, and Statistics Stanford University Trevor Hastie Professor of Statistics Stanford University
  • 6. H2O.ai Machine Intelligence What is H2O? Open source in-memory prediction engineMath Platform • Parallelized and distributed algorithms making the most use out of multithreaded systems • GLM, Random Forest, GBM, PCA, etc. Easy to use and adoptAPI • Written in Java – perfect for Java Programmers • REST API (JSON) – drives H2O from R, Python, Excel, Tableau More data? Or better models? BOTHBig Data • Use all of your data – model without down sampling • Run a simple GLM or a more complex GBM to find the best fit for the data • More Data + Better Models = Better Predictions
  • 7. Ensembles Deep Neural Networks Algorithms on H2O • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie • Cox Proportional Hazards Models • Naïve Bayes • Distributed Random Forest: Classification or regression models • Gradient Boosting Machine: Produces an ensemble of decision trees with increasing refined approximations • Deep learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations Supervised Learning Statistical Analysis H2O.ai Machine Intelligence
  • 8. Dimensionality Reduction Anomaly Detection Algorithms on H2O • K-means: Partitions observations into k clusters/groups of the same spatial size • Principal Component Analysis: Linearly transforms correlated variables to independent components • Autoencoders: Find outliers using a nonlinear dimensionality reduction using deep learning Unsupervised Learning Clustering H2O.ai Machine Intelligence
  • 9. Per Node 2M Row ingest/sec 50M Row Regression/sec 750M Row Aggregates / sec On Premise On Hadoop & Spark On EC2 Tableau R JSON Scala Java H2O Prediction Engine Nano Fast Scoring Engine Memory Manager Columnar Compression Rapids Query R-engine In-Mem Map Reduce Distributed fork/join Python HDFS S3 SQL NoSQL SDK / API Excel H2O.ai Machine Intelligence Deep Learning Ensembles
  • 10. H2O in Big Data Environments
  • 11. H2O on YARN Deployment Hadoop Gateway Node hadoop jar h2odriver.jar … YARN RM HDFSHDFS Data Nodes Hadoop Cluster H2 O H2O Mappers (YARN Containers) YARN Worker Nodes NM NM NM NM NM NM YARN Node Managers (NM) H2 O H2 O
  • 12. Now You Have an H2O Cluster Hadoop Gateway Node hadoop jar h2odriver.jar … HDFSHDFS Data Nodes Hadoop Cluster H2 O H2O Mappers (YARN Containers) YARN Worker Nodes NM NM NM YARN Node Managers (NM) H2 O H2 O YARN RM
  • 13. Read Data from HDFS Once Hadoop Gateway Node hadoop jar h2odriver.jar … HDFSHDFS Data Nodes Hadoop Cluster H2 O H2O Mappers (YARN Containers) YARN Worker Nodes NM NM NM YARN Node Managers (NM) H2 O H2 O YARN RM
  • 14. Build Models in-Memory Hadoop Gateway Node hadoop jar h2odriver.jar … Hadoop Cluster H2 O H2O Mappers (YARN Containers) YARN Worker Nodes NM NM NM YARN Node Managers (NM) H2 O H2 O YARN RM
  • 17. Distributed Data Taxonomy Vector The vector may be very large (billions of rows) - Stored as a compressed column (often 4x) - Access as Java primitives with on-the-fly decompression - Support fast Random access - Modifiable with Java memory semantics H2O.ai Machine Intelligence
  • 18. Distributed Data Taxonomy Vector Large vectors must be distributed over multiple JVMs - Vector is split into chunks - Chunk is a unit of parallel access - Each chunk ~ 1000 elements - Per-chunk compression - Homed to a single node - Can be spilled to disk - GC very cheap H2O.ai Machine Intelligence
  • 19. Distributed Data Taxonomy age gender zip_code ID A row of data is always stored in a single JVM Distributed data frame - Similar to R data frame - Adding and removing columns is cheap H2O.ai Machine Intelligence
  • 20. Distributed Fork/Join JVM task JVM task JVM task JVM task JVM task Task is distributed in a tree pattern - Results are reduced at each inner node - Returns with a single result when all subtasks done H2O.ai Machine Intelligence
  • 21. Distributed Fork/Join JVM task JVM task task task tasktaskchunk chunk chunk On each node the task is parallelized using Fork/Join H2O.ai Machine Intelligence
  • 22.
  • 23.
  • 24. Propensity to Buy modeling factory
  • 25. A Propensity To Buy model (P2B) is a statistical model that tries to predict whether a certain company will buy a certain product in a given time frame in the future. What is a Propensity To Buy model? Build Statistical Model Historical • Firmographics • Past Purchase Behavior • Contacts (# and types) • Marketing Interactions • Cust Sat surveys • Macroeconomic indicators • Purchase / Non Purchase • SCORE: probability that a company will buy a specific technology in the next quarter • VALUE: bookings amount that Cisco will likely see IF the company in fact buys the technology t … Q1 Q2Q4 (most recent closed quarter) Q3Q2Q1Q4 past purchase behavior and marketing interactions firmographics and contact data … predicted purchase window Scoring happens here Latest • Firmographics • Past Purchase Behavior • Contacts (# and types) • Marketing Interactions • Customer satis. surveys • Macroeconomic indicators H2O.ai Machine Intelligence
  • 26. H2O Business Benefit with P2B Q1 Q2 P2B Training Scoring models Data Refresh Q2 Data Refresh Q1 Prepare, execute Mktg & Sales activities Before, without H2O Q1 Q2 Train & score Data Refresh Prepare, execute Mktg & Sales activities Train & score Data Refresh Prepare, execute Mktg & Sales activities Now, with H2O H2O.ai Machine Intelligence Without H2O: • Models needed to be prepared in advance, not to delay scoring • More time preparing models, less time left for using the scores in the sales activities With H2O: • Newer buying patterns incorporated immediately into models • Scores are published sooner • More time for planning and executing activities
  • 27. Results & Lessons Learned Improvements • P2B factory is 15x faster with H2O • Quicker techniques for simpler problems, deeper for harder ones (grid searches!) • Ensembles improved accuracy and stability of models significantly Lessons Learned • Even with few nodes speed improvement over traditional data mining tools is substantial • H2O becomes really powerful and robust when combined with R • Rely on H2O’s extremely responsive support H2O.ai Machine Intelligence
  • 28. Fraud prevention using Deep Learning
  • 29. Fraud Prevention at PayPal H2O.ai Machine Intelligence • Employs state of the art machine learning and statistical models to flag fraudulent behavior upfront • More sophisticated algorithms after transaction complete Transaction Level • Monitor activity to identify abusive behavior • Frequent payments, suspicious profile changes Account Level • Monitor account to account interaction • Frequent transfer of money from sever accounts to one central account suspicious Network Level
  • 30. Fraud Prevention at PayPal H2O.ai Machine Intelligence • Unearth low-level complex abstractions • Learn complex, highly varying functions not present in the training examples • Widely employed for image/video processing and object recognition Why Deep Learning? • Highly scalable • Superior performance • Flexible deployment • Works seamlessly with other big data frameworks • Simple interface Why H2O? Experiment • Dataset − 160 million records − 1500 features (150 categorical) − 0.6TB compressed in HDFS • Infrastructure − 800 node Hadoop (CDH3) cluster • Decision − Fraud/not-fraud
  • 31. Conclusions using H2O Deep Learning with H2O is beneficial for payment fraud prevention • Network architecture- 6 layers with 600 neurons each performed the best • Activation function − RectifierWithDropout performed the best • Improved performance with limited feature set & a deep network − 11% improvement with a third of the original feature set, 6 hidden layers, 600 neurons each • Robust to temporal variations H2O.ai Machine Intelligence
  • 33. Unsupervised Learning & Bordeaux Wine • A personal and expen$$$ive obsession • Alex Tellez - h2o.ai
  • 34. BORDEAUX WINE Largest wine-growing region in France + 700 Million bottles of wine produced / year ! Some years better than others: Great ($$$) vs. Typical ($) Last Great years: 2010, 2009, 2005, 2000
  • 35. ‘EN PRIMEUR’ While wine is still barreled, purchasers can ‘invest’ in the wine before bottling and official public release. Advantage: Wines may be considerably cheaper during ‘en primeur’ period than @ bottling. Great Years: 2009, ’05, ‘00
  • 36. GREAT VS. TYPICAL VINTAGE? Question: Can we study weather patterns in Bordeaux leading up to harvest to identify ‘anomalous’ weather years >> correlates to Great ($$$) vs. Typical ($) Vintage? The Bordeaux Dataset (1952 - 2014 Yearly) Amount of Winter Rain (Oct > Apr of harvest year) Average Summer Temp (Apr > Sept of harvest year) Rain during Harvest (Aug > Sept) Years since last Great Vintage
  • 37. AUTOENCODER + ANOMALY DETECTION In Steps: 1) Train deep autoencoder to learn ‘typical’ vintage weather pattern 2) Append ‘great’ vintage year weather data to original dataset 3) IF great vintage year weather data does NOT match learned weather pattern, autoencoder will produce high reconstruction error (MSE) ‘en primeur of en primeur’ - Can we use weather patterns to identify anomalous years >> indicates great vintage quality? Goal:
  • 38. RESULTS (MSE > 0.10) MeanSquareError 1961 V 2009 V 2005 V 2000 V 1990 V 1989 V 1982 V 2010 V
  • 40. ROBERT PARKER JR. “The world’s most prized palette” - Financial Times
  • 41. DESCRIBING WINES = …wet forest floor …decaying animal skin …cat pee …grandmother’s closet …barnyardy (AKA sh*t) …asian plum sauce …chewy tannins Can we use Parker reviews to recommend wines we would like??
  • 42. WINE RECOMMENDATIONS I like: 2000 Cheval Blanc (Red) Robert Parker Tasting Notes: “…sweet nose of menthol, melted licorice, boysenberry, blueberry, creme de cassis…sweet tannins w/ hints of coffee & earth” So I’m at BevMo…now what?
  • 43. APPROACH In Steps: 1. Collect hundreds of Robert Parker reviews of wine 2. Create bag-of-words dataset & scale word counts between 0 & 1 (~ probability a particular word occurs in a wine review) 3. Build deep autoencoder from sparse bag-of-words matrix >> reduce to 10 ‘deep features’ 4. Take Euclidian distance of wine I like (2000 Cheval Blanc) to other wines >> Information Retrieval 5. ‘Nearby’ wine review-vectors = recommendations!
  • 46. RESULTS I like: 2000 Cheval Blanc “…sweet nose of menthol, melted licorice, boysenberry, blueberry, creme de cassis…sweet tannins w/ hints of coffee & earth” Similar Wines: “… aromas of licorice, damp forest floor, blueberries & currant” “…loads of cassis, coffee, earth & chocolatey notes” “…massive blackberry, mulberry fruit intermixed with earth, crushed rocks” “…ethereal bouquet of menthol, coffee, wet stones, black cherries & blackberries.”
  • 48. H2O on Storm (Spout) (Bolt) H2O Scoring POJO Real-time data Predictions Data Source (e.g. HDFS) H2O Emit POJO Model Real-time StreamModeling workflow H2O.ai Machine Intelligence
  • 49. Q & A H2O.ai Machine Intelligence Thanks for attending! Content for today’s talk can be found at: https://github.com/h2oai/h2o- meetups/tree/master/2015_01_28_MLForSmarterApps

Editor's Notes

  1. H2O MapReduce is not Hadoop MapReduce For more details, go visit cliff in his Hacker corner.