Pragmatic Machine Learning
@louisdorard
#MLSpain - 18 Jan 2016
I’M LAZY
“Programming is for lazy people
who want to automate things
— AI is for lazier people who want
to automate programming”
• Consider manual classification task
• Automate with ML model?
• Build PoC
• Deploy in production
• Maintain
• Monitor performance
• Update with new data
6
The Lazy MLer
Phrase problem as ML task
Engineer features
Prepare data (csv)
Learn model
Make predictions
Deploy model & integrate
Evaluate model
Measure impact
• (Florian Douetteau at PAPIs Connect in May 2015)
• Top companies invested more than $5M in their ML production
platform (Facebook, Amazon, LinkedIn, Spotify…)
7
Cost of ML projects
• Real-world ML is/was complicated and costly (especially at
web scale)
• Do I really need ML?
• How about Human API? (e.g. Amazon Mechanical Turk)
• → Back to Square 1 (but someone else’s problem!)
• → Baseline! (performance, time, cost)
8
The Lazy MLer
Performance evaluation
How do you evaluate the
performance of an ML system?
Accuracy
Latency
Throughput
11
Performance measures
• Go beyond accuracy… example: recommendations
• Get clicks!
• → Simulate how many you’d get with your model
• → Need to learn accurately what people like — not what they dislike
• Better decisions with ML
• Revenue increase (A/B test)
• Decisions can have a cost (e.g. give special offer/pricing to customer)… ROI?
13
Domain-specific evaluation
Decisions from predictions
1. Descriptive
2. Predictive
3. Prescriptive
15
Types of analytics
1. Show churn rate against time
2. Predict which customers will churn next
3. Suggest what to do about each customer
(e.g. propose to switch plan, send promotional offer, etc.)
17
Churn analysis
• Who: SaaS company selling monthly subscription
• Question asked: “Is this customer going to leave within 1 month?”
• Input: customer
• Output: no-churn or churn
• Data collection: history up until 1 month ago
18
Churn prediction
• #TP (we predict the customer churns and they do)
• #FP (we predict the customer churns but they don’t)
• #FN (we predict the customer doesn’t churn but they do)
19
Churn prediction accuracy
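The three counts above can be tallied directly from predictions and observed outcomes. The following is a minimal sketch; the function names and the toy labels are illustrative assumptions, and precision and recall are the standard ratios built from these counts rather than something the slide spells out.

```python
def confusion_counts(predicted, actual, positive="churn"):
    """Count true positives, false positives and false negatives."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (made up for illustration)
predicted = ["churn", "churn", "no-churn", "no-churn", "churn"]
actual = ["churn", "no-churn", "churn", "no-churn", "churn"]

tp, fp, fn = confusion_counts(predicted, actual)
precision, recall = precision_recall(tp, fp, fn)
```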
Assume we know who’s going to churn. What do we do?
• Contact them (in which order?)
• Switch to different plan
• Give special offer
• No action?
20
Churn prevention
“3. Suggest what to do about each customer”
→ prioritised list of actions, based on…
• Customer representation + context (e.g. competition)
• Churn prediction (& action prediction?)
• Uncertainty in predictions
• Revenue brought by customer & cost of action
• Constraints on frequency of solicitations
21
Churn prevention
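One possible way to turn these ingredients into a prioritised list is to rank customers by the expected net value of acting on them. This is only a sketch under simplifying assumptions: a single action with a fixed success rate and cost, one month of revenue at stake, and no handling of prediction uncertainty or solicitation constraints. The function name and the customer tuples are hypothetical.

```python
def prioritise(customers, success_rate, cost_of_action):
    """Rank customers by expected net value of a retention action.

    customers: list of (customer_id, churn_probability, monthly_revenue).
    Customers whose expected gain does not cover the cost are left out.
    """
    scored = []
    for customer_id, churn_probability, monthly_revenue in customers:
        expected_gain = churn_probability * success_rate * monthly_revenue
        net_value = expected_gain - cost_of_action
        if net_value > 0:
            scored.append((net_value, customer_id))
    scored.sort(reverse=True)  # highest expected net value first
    return [customer_id for _, customer_id in scored]


# Made-up customers: (id, predicted churn probability, revenue per month in EUR)
customers = [("a", 0.9, 50.0), ("b", 0.5, 10.0), ("c", 0.2, 200.0)]
contact_order = prioritise(customers, success_rate=0.2, cost_of_action=2.0)
```

Here customer "b" drops out because the expected gain (about 1€) is below the 2€ cost of the action.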
• Taking action for each TP (and FP) has a cost
• For each TP we “gain”: (success rate of action) * (revenue / cust. / month)
• Imagine…
• perfect predictions
• revenue /cust. /month = 10€
• success rate of action = 20%
• cost of action = 2€
• Which ROI?
22
Churn prevention ROI
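The numbers on this slide can be worked through in a few lines. The sketch below assumes ROI is defined as (gain - cost) / cost and that only one month of revenue is counted per retained customer; both are assumptions, as is the figure of 100 churners used for the example.

```python
def churn_prevention_roi(true_positives, false_positives, revenue_per_month,
                         success_rate, cost_of_action):
    """ROI of acting on every predicted churner, over one month of revenue."""
    # Every predicted churner (TP or FP) triggers an action and its cost.
    total_cost = (true_positives + false_positives) * cost_of_action
    # Only true churners can be "saved", and only when the action succeeds.
    total_gain = true_positives * success_rate * revenue_per_month
    return (total_gain - total_cost) / total_cost


# Perfect predictions (no FP), 10 EUR revenue, 20% success rate, 2 EUR per action
roi = churn_prevention_roi(true_positives=100, false_positives=0,
                           revenue_per_month=10.0, success_rate=0.20,
                           cost_of_action=2.0)
```

Per true positive the expected gain is 0.20 × 10€ = 2€, which equals the 2€ cost of the action, so even with perfect predictions this scenario only breaks even (ROI of 0) over the first month. Any false positive pushes the ROI below zero.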
• We predicted customer would churn but they didn’t…
• That’s actually good! Prevention worked!
• Need to store which actions were taken
• Is ML really helping?
• Compare to baseline,
e.g. if no usage for more than 15 days then predict churn
• Is fancy model really improving bottom line?
23
Churn prevention evaluation
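The 15-day rule is simple enough to write down and score next to a model. The sketch below uses made-up data and a hypothetical list of model predictions; accuracy is used only for brevity, and any of the measures discussed earlier could be swapped in.

```python
def baseline_predict(days_since_last_use, threshold_days=15):
    """Rule-of-thumb baseline: long inactivity means churn."""
    return "churn" if days_since_last_use > threshold_days else "no-churn"


def accuracy(predictions, outcomes):
    """Fraction of predictions that match the observed outcomes."""
    return sum(p == o for p, o in zip(predictions, outcomes)) / len(outcomes)


# Made-up evaluation data
days_inactive = [1, 20, 16, 3, 40]
outcomes = ["no-churn", "churn", "no-churn", "no-churn", "churn"]
model_predictions = ["no-churn", "churn", "no-churn", "churn", "churn"]  # hypothetical model

baseline_predictions = [baseline_predict(d) for d in days_inactive]
baseline_accuracy = accuracy(baseline_predictions, outcomes)
model_accuracy = accuracy(model_predictions, outcomes)
```

In this toy data the model and the baseline each get four out of five right, which is exactly the situation where the fancier model is hard to justify.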
1. Show past demand against calendar
2. Predict demand for [product] at [store] in next 2 days
3. Suggest how much to ship
• Trade-off: cost of storage vs risk of lost sales
• Constraints on order size, truck volume, capacity of people
putting stuff into shelves
24
Replenishment
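The storage-versus-lost-sales trade-off can be sketched as picking the shipment size with the lowest expected cost under a predicted demand distribution. The cost figures, the discrete demand distribution and the `max_quantity` cap (standing in for constraints such as truck volume) are all illustrative assumptions.

```python
def expected_cost(quantity, demand_distribution, storage_cost, lost_sale_cost):
    """Expected cost of shipping `quantity` units.

    demand_distribution: {demand: probability}, e.g. from a predictive model.
    """
    cost = 0.0
    for demand, probability in demand_distribution.items():
        leftover = max(quantity - demand, 0)   # units that must be stored
        shortage = max(demand - quantity, 0)   # sales that are lost
        cost += probability * (leftover * storage_cost + shortage * lost_sale_cost)
    return cost


def best_quantity(demand_distribution, storage_cost, lost_sale_cost, max_quantity):
    """Try every feasible quantity and keep the cheapest one."""
    return min(
        range(max_quantity + 1),
        key=lambda q: expected_cost(q, demand_distribution, storage_cost, lost_sale_cost),
    )


# Hypothetical prediction: demand is 8, 10 or 12 units over the next 2 days
demand = {8: 0.25, 10: 0.5, 12: 0.25}
quantity = best_quantity(demand, storage_cost=1.0, lost_sale_cost=4.0, max_quantity=20)
capped = best_quantity(demand, storage_cost=1.0, lost_sale_cost=4.0, max_quantity=11)
```

Because a lost sale is assumed to cost four times as much as storing a unit, the cheapest choice covers the highest predicted demand (12 units), unless the cap forces a smaller shipment.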
• Context
• Predictions
• Uncertainty in predictions
• Constraints
• Costs / benefits
• Competing objectives (trade-offs to make)
• Business rules
25
Decisions are based on…
APIs are key
Software components for automated decisions:
• Create training dataset from historical data (merge sources, aggregate…)
• Provide predictive model from given training set (i.e. learn)
• Provide prediction against model for given context
• Provide optimal decision from given contextual data, predictions,
uncertainties, constraints, objectives, costs
• Apply given decision
27
Separation of concerns
Software components for automated decisions:
• Create training dataset from historical data (merge sources, aggregate…)
• Provide predictive model from given training set (i.e. learn)
• Provide prediction against model for given context
• Provide optimal decision from given contextual data, predictions,
uncertainties, constraints, objectives, costs
• Apply given decision
28
Operations Research component
Software components for automated decisions:
• Create training dataset from historical data (merge sources, aggregate…)
• Provide predictive model from given training set (i.e. learn)
• Provide prediction against model for given context
• Provide optimal decision from given contextual data, predictions,
uncertainties, constraints, objectives, costs
• Apply given decision
29
Machine Learning components
Software components for automated decisions:
• Create training dataset from historical data (merge sources, aggregate…)
• Provide predictive model from given training set (i.e. learn)
• Provide prediction against model for given context
• Provide optimal decision from given contextual data, predictions,
uncertainties, constraints, objectives, costs
• Apply given decision
30
Predictive APIs
The two methods of predictive APIs:
• model = create_model('training.csv')
• predicted_output = create_prediction(model, new_input)
31
Predictive APIs
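To make the two-method interface concrete, here is a minimal local sketch. It is not any provider's API: the "model" is a plain 1-nearest-neighbour lookup chosen as a stand-in, the CSV layout (numeric inputs, output in the last column) is an assumption, and the column names are made up.

```python
import csv
import math
import os
import tempfile


def create_model(csv_path):
    """'Learn' a model from a CSV whose last column is the output.

    Stand-in learner: simply memorise the examples (1-nearest-neighbour).
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    examples = rows[1:]  # skip the header row
    return [([float(v) for v in row[:-1]], row[-1]) for row in examples]


def create_prediction(model, new_input):
    """Return the output of the training example closest to new_input."""
    _, output = min(model, key=lambda example: math.dist(example[0], new_input))
    return output


# Usage with a small made-up training set written to a temporary file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([
            ["days_since_last_use", "tickets_opened", "label"],
            [1, 0, "no-churn"],
            [2, 1, "no-churn"],
            [20, 3, "churn"],
            [30, 5, "churn"],
        ])
    model = create_model(path)

predicted_output = create_prediction(model, [25, 4])
```

A hosted predictive API would expose the same two calls over HTTP, with the model living on the provider's side instead of in local memory.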
Amazon ML
BigML
Google Prediction
PredicSis
… or your own company!
32
Providers of REST/HTTP Predictive APIs
?
Experiment on “ScienceCluster”
• Distributed jobs
• Collaborative workspace
• Serialize chosen model
Deploy model as API on “ScienceOps”
• Load balancing
• Auto scaling
• Monitoring (API calls, accuracy)
• “Open source prediction server” in Scala
• Based on Spark, MLlib, Spray
• DASE framework: Data preparation, Algorithm, Serving,
Evaluation
• Amazon CloudFormation template → cluster
• Manual up/down scaling
40
→ PAPI+
Interesting research problems
45
Concurrency for high-throughput ML APIs
Brian Gawalt (Senior Data Scientist at Upwork)
Talk at PAPIs ’15
upwork.com use case:
• predict freelancer availability
• huge web platform (millions of users)
→ need very high throughput and low latency
• things change quickly → need freshest data & predictions
46
Concurrency for high-throughput ML APIs
• event: invitation sent to freelancer
• steps to prediction:
• gather raw data from all sources
• featurize event
• make prediction
Concurrency for high-throughput ML APIs
• An actor…
• gets & sends messages
• makes computations
• Actors we need:
• “Historians”: one per data source
• “Featurizer”
• “Scorer”
48
Concurrency with Actor framework
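The actor layout above can be approximated with message queues. The sketch below uses Python's asyncio rather than a dedicated actor framework, so it only illustrates the message flow (historians → featurizer → scorer), not the talk's actual implementation. The data sources, the event fields and the averaging "model" are placeholders.

```python
import asyncio

STOP = None  # sentinel message telling an actor to shut down


async def historian(source_name, records, inbox, featurizer_inbox):
    """One historian per data source: fetches raw data for each event."""
    while (event := await inbox.get()) is not STOP:
        raw = records.get(event["freelancer_id"], 0.0)
        await featurizer_inbox.put((event["id"], source_name, raw))


async def featurizer(n_sources, inbox, scorer_inbox):
    """Waits until every source has answered, then emits a feature vector."""
    pending = {}
    while (message := await inbox.get()) is not STOP:
        event_id, source_name, raw = message
        pending.setdefault(event_id, {})[source_name] = raw
        if len(pending[event_id]) == n_sources:
            by_source = pending.pop(event_id)
            features = [by_source[name] for name in sorted(by_source)]
            await scorer_inbox.put((event_id, features))


async def scorer(inbox, results):
    """Turns a feature vector into a prediction (placeholder: the mean)."""
    while (message := await inbox.get()) is not STOP:
        event_id, features = message
        results[event_id] = sum(features) / len(features)


async def main(events, sources):
    featurizer_inbox, scorer_inbox = asyncio.Queue(), asyncio.Queue()
    inboxes = {name: asyncio.Queue() for name in sources}
    results = {}
    historians = [
        asyncio.create_task(historian(name, sources[name], inboxes[name], featurizer_inbox))
        for name in sources
    ]
    featurizer_task = asyncio.create_task(
        featurizer(len(sources), featurizer_inbox, scorer_inbox))
    scorer_task = asyncio.create_task(scorer(scorer_inbox, results))

    for event in events:  # broadcast each event to every historian
        for inbox in inboxes.values():
            await inbox.put(event)

    # Shut down stage by stage so that no in-flight message is dropped.
    for inbox in inboxes.values():
        await inbox.put(STOP)
    await asyncio.gather(*historians)
    await featurizer_inbox.put(STOP)
    await featurizer_task
    await scorer_inbox.put(STOP)
    await scorer_task
    return results


# Made-up data sources and invitation events
sources = {
    "recent_activity": {"f1": 0.8, "f2": 0.1},
    "acceptance_rate": {"f1": 0.6, "f2": 0.3},
}
events = [{"id": 1, "freelancer_id": "f1"}, {"id": 2, "freelancer_id": "f2"}]
results = asyncio.run(main(events, sources))
```

Each actor only reads from its own inbox and writes to the next one, so slow data sources do not block the others while an event is being assembled.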
49
Concurrency for high-throughput ML APIs
before
50
Concurrency for high-throughput ML APIs
after
• Python de facto standard: scikit-learn
• “Sparkit-learn aims to provide scikit-learn functionality and API on
PySpark. The main goal of the library is to create an API that stays
close to sklearn’s.”
• REST standard: PSI (Protocols & Structures for Inference)
• Pretty similar to BigML API!
• Implementation for scikit available
• Easier benchmarking! Ensembles!
51
API standards?
• “AzureML: Anatomy of a machine learning service”
• “Deploying high throughput predictive models with the actor
framework”
• “Protocols and Structures for Inference: A RESTful API for Machine
Learning”
• Coming soon… JMLR W&CP Volume 50
• Get updates: @papisdotio or papis.io/updates
52
PAPIs ’15 Proceedings
53
Simple MLaaS comparison
            Amazon   Google   PredicSis   BigML
Accuracy    0.862    0.743    0.858       0.790
Training    135 s    76 s     17 s        5 s
Test time   188 s    369 s    5 s         1 s
louisdorard.com/blog/machine-learning-apis-comparison
• With SKLL (SciKit Learn Laboratory)
• Wrap each service in a scikit estimator
• Specify evaluations to perform in a config file (datasets,
metrics, eval procedure)
• Need to also measure time…
• See papiseval on Github
54
Automated Benchmark?
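A rough sketch of the wrapping idea follows: each service is hidden behind scikit-learn's fit/predict convention, and a small harness records accuracy along with training and test time. It does not use SKLL itself, the service class is a hypothetical stand-in for a remote client, and a real scikit-learn estimator would normally also subclass `sklearn.base.BaseEstimator`.

```python
import time


class MajorityClassService:
    """Hypothetical stand-in for a remote predictive API client."""

    def create_model(self, inputs, outputs):
        # "Model" = the most frequent output in the training set.
        return max(set(outputs), key=outputs.count)

    def create_prediction(self, model, new_input):
        return model


class ServiceEstimator:
    """Exposes a service through the fit/predict convention."""

    def __init__(self, service):
        self.service = service

    def fit(self, X, y):
        self.model_ = self.service.create_model(X, y)
        return self

    def predict(self, X):
        return [self.service.create_prediction(self.model_, x) for x in X]


def benchmark(estimator, X_train, y_train, X_test, y_test):
    """Measure accuracy, training time and test time for one estimator."""
    start = time.perf_counter()
    estimator.fit(X_train, y_train)
    training_s = time.perf_counter() - start

    start = time.perf_counter()
    predictions = estimator.predict(X_test)
    test_s = time.perf_counter() - start

    correct = sum(p == t for p, t in zip(predictions, y_test))
    return {"accuracy": correct / len(y_test),
            "training_s": training_s, "test_s": test_s}


# Made-up data, only to exercise the harness
report = benchmark(ServiceEstimator(MajorityClassService()),
                   X_train=[[0], [1], [2]], y_train=["a", "a", "b"],
                   X_test=[[3], [4]], y_test=["a", "b"])
```

With one wrapper per provider, the same `benchmark` call yields comparable rows for a table like the one in the simple MLaaS comparison.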
• Return of the Lazy MLer!
• Model selection
• Find optimal values for n (hyper-)parameters
→ optimisation problem (function in n dimensions)
• Search space of parameters, efficiently → explore vs exploit
• Bayesian optimization?
55
AutoML
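To illustrate the "function in n dimensions" framing, the sketch below treats model selection as maximising a score over a search space. It uses plain random search, which only explores; it is a simple reference point, not Bayesian optimization. The `validation_score` function is a hypothetical stand-in for "train a model with these parameters and measure its accuracy", and the parameter names and ranges are made up.

```python
import random


def validation_score(params):
    """Hypothetical objective; a real one would train and evaluate a model."""
    return 1.0 - (params["gamma"] - 0.3) ** 2 - (params["C"] - 0.7) ** 2


def random_search(score, search_space, n_trials, seed=0):
    """Sample parameters uniformly at random and keep the best-scoring set."""
    rng = random.Random(seed)  # seeded for reproducibility
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in search_space.items()}
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


search_space = {"gamma": (0.0, 1.0), "C": (0.0, 1.0)}
best_params, best_score = random_search(validation_score, search_space, n_trials=200)
```

A Bayesian optimizer would replace the uniform sampling with a model of the objective that decides where to evaluate next, balancing exploration and exploitation.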
56
Bayesian Optimization in 1 dimension
From CODE517E
57
Bayesian Optimization in 1 dimension
From CODE517E
• Building ensembles
• Decide to continue training existing model, or to train new one
• Explore vs exploit again!
• Reward is accuracy. Let’s estimate reward for all options.
• Choose option with highest expected reward + uncertainty? (i.e.
upper confidence bound)
• Limited computational budget…
58
AutoML
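The "expected reward + uncertainty" rule can be sketched with the UCB1 formula, one common upper confidence bound among several. The option names and the accuracy figures below are made up, and a real AutoML system would also have to account for the computational cost of each option.

```python
import math


def ucb_choice(stats, total_trials, exploration=2.0):
    """Pick the option with the highest upper confidence bound (UCB1).

    stats: {option: (times_tried, mean_reward)}, reward being e.g. accuracy.
    """
    def upper_bound(option):
        times_tried, mean_reward = stats[option]
        if times_tried == 0:
            return math.inf  # untried options are explored first
        bonus = math.sqrt(exploration * math.log(total_trials) / times_tried)
        return mean_reward + bonus

    return max(stats, key=upper_bound)


# Made-up history: one option tried often, the other barely explored
stats = {
    "continue_training_model_a": (10, 0.80),
    "train_new_model": (2, 0.75),
}
choice = ucb_choice(stats, total_trials=12)
```

Although the existing model has the higher mean accuracy, the new-model option has been tried so rarely that its uncertainty bonus wins, so it is selected next.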
• Zoubin Ghahramani & James Lloyd @ University of Cambridge
• Gaussian Processes: find (mixture of) kernel(s) that maximises
data likelihood
• Also Bayesian!
59
Automatic Statistician
• Spearmint: “Bayesian optimization” for tuning parameters →
Whetlab → Twitter
• Auto-sklearn: “automated machine learning toolkit and drop-in
replacement for a scikit-learn estimator”
• See automl.org and challenge
60
Open Source AutoML libraries
61
Scikit
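from sklearn import datasets, svm

# Load the handwritten digits dataset bundled with scikit-learn
digits = datasets.load_digits()
# Support vector classifier with hand-picked hyper-parameters
model = svm.SVC(gamma=0.001, C=100.)
# Train on every example except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# predict() expects a 2D array, so slice to keep the last example as one row
model.predict(digits.data[-1:])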
63
AutoML Scikit
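import autosklearn.classification
from sklearn import datasets

digits = datasets.load_digits()
# Same fit/predict interface; model and hyper-parameters are searched automatically
model = autosklearn.classification.AutoSklearnClassifier()
model.fit(digits.data[:-1], digits.target[:-1])
# predict() expects a 2D array, so slice to keep the last example as one row
model.predict(digits.data[-1:])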
• Before learning:
• Automatic feature extraction from text?
• After learning:
• Monitor new predictions and automatically retrain models
when necessary?
• See panel discussion at PAPIs ’15
64
More automation ideas…
• Same as Azure ML?
• Scaling up? down?
65
Open Source Auto Scaling?
Tech talks:
• Intro to Spark
• Using ML to build an autonomous drone
• Demystifying Deep Learning (speaker needed!)
• Distributed Deep Learning with Spark on AWS
67
PAPIs Connect (14-15 March, Valencia)
Topics:
• Managing technology
• FinTech
• Enterprise, Retail, Operations
• AI for Society (Nuria Oliver, Scientific Director at Telefonica R&D)
• Future of AI (Ramon Lopez de Mantaras, Director AI Research at
Spanish Research Council)
68
PAPIs Connect (14-15 March, Valencia)
• Dev? Bring your manager!
• Manager? Bring your devs!
• Discount code: MLSVLC20
• papis.io/connect
69
PAPIs Connect (14-15 March, Valencia)

More Related Content

What's hot

A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
PAPIs.io
 
DutchMLSchool. ML Automation
DutchMLSchool. ML AutomationDutchMLSchool. ML Automation
DutchMLSchool. ML Automation
BigML, Inc
 
VSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised LearningVSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised Learning
BigML, Inc
 
MLSEV Virtual. ML Platformization and AutoML in the Enterprise
MLSEV Virtual. ML Platformization and AutoML in the EnterpriseMLSEV Virtual. ML Platformization and AutoML in the Enterprise
MLSEV Virtual. ML Platformization and AutoML in the Enterprise
BigML, Inc
 
DutchMLSchool. ML for Logistics
DutchMLSchool. ML for LogisticsDutchMLSchool. ML for Logistics
DutchMLSchool. ML for Logistics
BigML, Inc
 
MLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
MLSEV Virtual. Optimization of Passengers Waiting Time in ElevatorsMLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
MLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
BigML, Inc
 
The Mathematics of Neural Networks
The Mathematics of Neural NetworksThe Mathematics of Neural Networks
The Mathematics of Neural Networks
m.a.kirn
 
Data Science for Business Managers - An intro to ROI for predictive analytics
Data Science for Business Managers - An intro to ROI for predictive analyticsData Science for Business Managers - An intro to ROI for predictive analytics
Data Science for Business Managers - An intro to ROI for predictive analytics
Akin Osman Kazakci
 

What's hot (8)

A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP...
 
DutchMLSchool. ML Automation
DutchMLSchool. ML AutomationDutchMLSchool. ML Automation
DutchMLSchool. ML Automation
 
VSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised LearningVSSML18 Introduction to Supervised Learning
VSSML18 Introduction to Supervised Learning
 
MLSEV Virtual. ML Platformization and AutoML in the Enterprise
MLSEV Virtual. ML Platformization and AutoML in the EnterpriseMLSEV Virtual. ML Platformization and AutoML in the Enterprise
MLSEV Virtual. ML Platformization and AutoML in the Enterprise
 
DutchMLSchool. ML for Logistics
DutchMLSchool. ML for LogisticsDutchMLSchool. ML for Logistics
DutchMLSchool. ML for Logistics
 
MLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
MLSEV Virtual. Optimization of Passengers Waiting Time in ElevatorsMLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
MLSEV Virtual. Optimization of Passengers Waiting Time in Elevators
 
The Mathematics of Neural Networks
The Mathematics of Neural NetworksThe Mathematics of Neural Networks
The Mathematics of Neural Networks
 
Data Science for Business Managers - An intro to ROI for predictive analytics
Data Science for Business Managers - An intro to ROI for predictive analyticsData Science for Business Managers - An intro to ROI for predictive analytics
Data Science for Business Managers - An intro to ROI for predictive analytics
 

Viewers also liked

Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
Tarun Gangwani
 
Techexpo bigdata ml_ai_hanoi
Techexpo bigdata ml_ai_hanoiTechexpo bigdata ml_ai_hanoi
Techexpo bigdata ml_ai_hanoi
Lam Pham
 
Demystify big data data science
Demystify big data  data scienceDemystify big data  data science
Demystify big data data science
Mahesh Kumar CV
 
From Data to AI with the Machine Learning Canvas
From Data to AI with the Machine Learning CanvasFrom Data to AI with the Machine Learning Canvas
From Data to AI with the Machine Learning Canvas
Louis Dorard
 
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
IT Arena
 
Analytics in business
Analytics in businessAnalytics in business
Analytics in business
Niko Vuokko
 
Predicting YOU! The Future of Artificial Intelligence
Predicting YOU! The Future of Artificial Intelligence Predicting YOU! The Future of Artificial Intelligence
Predicting YOU! The Future of Artificial Intelligence
Stephenie Rodriguez
 
Business Analytics for the Airline MRO Industry: An Analytics Master class
Business Analytics for the Airline MRO Industry: An Analytics Master classBusiness Analytics for the Airline MRO Industry: An Analytics Master class
Business Analytics for the Airline MRO Industry: An Analytics Master class
Hazelknight Media & Entertainment Pvt Ltd
 
Business Analytics to solve your Business Problems
Business Analytics to solve your Business ProblemsBusiness Analytics to solve your Business Problems
Business Analytics to solve your Business Problems
Vishal Pawar
 
Predire il futuro con Machine Learning & Big Data
Predire il futuro con Machine Learning & Big DataPredire il futuro con Machine Learning & Big Data
Predire il futuro con Machine Learning & Big Data
Data Driven Innovation
 
Business analytics
Business analyticsBusiness analytics
Business analytics
Silla Rupesh
 
Business Analytics and Optimization Introduction
Business Analytics and Optimization IntroductionBusiness Analytics and Optimization Introduction
Business Analytics and Optimization Introduction
Raul Chong
 
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolution
Deloitte United States
 
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science
Booz Allen Hamilton
 

Viewers also liked (14)

Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
Being Practical About Artificial Intelligence (Forbes U30 Summit 2016)
 
Techexpo bigdata ml_ai_hanoi
Techexpo bigdata ml_ai_hanoiTechexpo bigdata ml_ai_hanoi
Techexpo bigdata ml_ai_hanoi
 
Demystify big data data science
Demystify big data  data scienceDemystify big data  data science
Demystify big data data science
 
From Data to AI with the Machine Learning Canvas
From Data to AI with the Machine Learning CanvasFrom Data to AI with the Machine Learning Canvas
From Data to AI with the Machine Learning Canvas
 
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
IoT, AI, ML Mix or How to Deal with New Technologies (Borys Pratsiuk Technolo...
 
Analytics in business
Analytics in businessAnalytics in business
Analytics in business
 
Predicting YOU! The Future of Artificial Intelligence
Predicting YOU! The Future of Artificial Intelligence Predicting YOU! The Future of Artificial Intelligence
Predicting YOU! The Future of Artificial Intelligence
 
Business Analytics for the Airline MRO Industry: An Analytics Master class
Business Analytics for the Airline MRO Industry: An Analytics Master classBusiness Analytics for the Airline MRO Industry: An Analytics Master class
Business Analytics for the Airline MRO Industry: An Analytics Master class
 
Business Analytics to solve your Business Problems
Business Analytics to solve your Business ProblemsBusiness Analytics to solve your Business Problems
Business Analytics to solve your Business Problems
 
Predire il futuro con Machine Learning & Big Data
Predire il futuro con Machine Learning & Big DataPredire il futuro con Machine Learning & Big Data
Predire il futuro con Machine Learning & Big Data
 
Business analytics
Business analyticsBusiness analytics
Business analytics
 
Business Analytics and Optimization Introduction
Business Analytics and Optimization IntroductionBusiness Analytics and Optimization Introduction
Business Analytics and Optimization Introduction
 
Analytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolutionAnalytics Trends 2016: The next evolution
Analytics Trends 2016: The next evolution
 
Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science Booz Allen Field Guide to Data Science
Booz Allen Field Guide to Data Science
 

Similar to Pragmatic Machine Learning @ ML Spain

Machine learning
Machine learningMachine learning
Machine learning
Saravanan Subburayal
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Sri Ambati
 
Ai in finance
Ai in financeAi in finance
Ai in finance
QuantUniversity
 
Practical model management in the age of Data science and ML
Practical model management in the age of Data science and MLPractical model management in the age of Data science and ML
Practical model management in the age of Data science and ML
QuantUniversity
 
The Fine Art of Combining Capacity Management with Machine Learning
The Fine Art of Combining Capacity Management with Machine LearningThe Fine Art of Combining Capacity Management with Machine Learning
The Fine Art of Combining Capacity Management with Machine Learning
Precisely
 
Business intelligence prof nikhat fatma mumtaz husain shaikh
Business intelligence  prof nikhat fatma mumtaz husain shaikhBusiness intelligence  prof nikhat fatma mumtaz husain shaikh
Business intelligence prof nikhat fatma mumtaz husain shaikh
Nikhat Fatma Mumtaz Husain Shaikh
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
Splunk
 
Machine Learning With ML.NET
Machine Learning With ML.NETMachine Learning With ML.NET
Machine Learning With ML.NET
Dev Raj Gautam
 
BMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist DeckBMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist Deck
Sasha Lazarevic
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
Yalçın Yenigün
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
Splunk
 
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
Sri Ambati
 
L7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIsL7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIs
Machine Learning Valencia
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
Databricks
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
BigML, Inc
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
Andrew Musselman
 
C19013010 the tutorial to build shared ai services session 1
C19013010  the tutorial to build shared ai services session 1C19013010  the tutorial to build shared ai services session 1
C19013010 the tutorial to build shared ai services session 1
Bill Liu
 
Estimation
EstimationEstimation
Estimation
Dev9Com
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
DATAVERSITY
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender system
Pierre Gutierrez
 

Similar to Pragmatic Machine Learning @ ML Spain (20)

Machine learning
Machine learningMachine learning
Machine learning
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Ai in finance
Ai in financeAi in finance
Ai in finance
 
Practical model management in the age of Data science and ML
Practical model management in the age of Data science and MLPractical model management in the age of Data science and ML
Practical model management in the age of Data science and ML
 
The Fine Art of Combining Capacity Management with Machine Learning
The Fine Art of Combining Capacity Management with Machine LearningThe Fine Art of Combining Capacity Management with Machine Learning
The Fine Art of Combining Capacity Management with Machine Learning
 
Business intelligence prof nikhat fatma mumtaz husain shaikh
Business intelligence  prof nikhat fatma mumtaz husain shaikhBusiness intelligence  prof nikhat fatma mumtaz husain shaikh
Business intelligence prof nikhat fatma mumtaz husain shaikh
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Machine Learning With ML.NET
Machine Learning With ML.NETMachine Learning With ML.NET
Machine Learning With ML.NET
 
BMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist DeckBMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist Deck
 
Building High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning ApplicationsBuilding High Available and Scalable Machine Learning Applications
Building High Available and Scalable Machine Learning Applications
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
 
L7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIsL7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIs
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
 
C19013010 the tutorial to build shared ai services session 1
C19013010  the tutorial to build shared ai services session 1C19013010  the tutorial to build shared ai services session 1
C19013010 the tutorial to build shared ai services session 1
 
Estimation
EstimationEstimation
Estimation
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
From Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender systemFrom Labelling Open data images to building a private recommender system
From Labelling Open data images to building a private recommender system
 

More from Louis Dorard

Machine Learning: je m'y mets demain!
Machine Learning: je m'y mets demain!Machine Learning: je m'y mets demain!
Machine Learning: je m'y mets demain!
Louis Dorard
 
Trusting AI with important decisions
Trusting AI with important decisionsTrusting AI with important decisions
Trusting AI with important decisions
Louis Dorard
 
Predictive APIs at APIdays Berlin
Predictive APIs at APIdays BerlinPredictive APIs at APIdays Berlin
Predictive APIs at APIdays Berlin
Louis Dorard
 
Pragmatic machine learning for the real world
Pragmatic machine learning for the real worldPragmatic machine learning for the real world
Pragmatic machine learning for the real world
Louis Dorard
 
Data Summit Brussels: Introduction
Data Summit Brussels: IntroductionData Summit Brussels: Introduction
Data Summit Brussels: IntroductionLouis Dorard
 
Bootstrapping Machine Learning
Bootstrapping Machine LearningBootstrapping Machine Learning
Bootstrapping Machine Learning
Louis Dorard
 
Big Data 2.0
Big Data 2.0Big Data 2.0
Big Data 2.0
Louis Dorard
 
Exploration & Exploitation Challenge 2011
Exploration & Exploitation Challenge 2011Exploration & Exploitation Challenge 2011
Exploration & Exploitation Challenge 2011
Louis Dorard
 

More from Louis Dorard (8)

Machine Learning: je m'y mets demain!
Machine Learning: je m'y mets demain!Machine Learning: je m'y mets demain!
Machine Learning: je m'y mets demain!
 
Trusting AI with important decisions
Trusting AI with important decisionsTrusting AI with important decisions
Trusting AI with important decisions
 
Predictive APIs at APIdays Berlin
Predictive APIs at APIdays BerlinPredictive APIs at APIdays Berlin
Predictive APIs at APIdays Berlin
 
Pragmatic machine learning for the real world
Pragmatic machine learning for the real worldPragmatic machine learning for the real world
Pragmatic machine learning for the real world
 
Data Summit Brussels: Introduction
Data Summit Brussels: IntroductionData Summit Brussels: Introduction
Data Summit Brussels: Introduction
 
Bootstrapping Machine Learning
Bootstrapping Machine LearningBootstrapping Machine Learning
Bootstrapping Machine Learning
 
Big Data 2.0
Big Data 2.0Big Data 2.0
Big Data 2.0
 
Exploration & Exploitation Challenge 2011
Exploration & Exploitation Challenge 2011Exploration & Exploitation Challenge 2011
Exploration & Exploitation Challenge 2011
 



Pragmatic Machine Learning @ ML Spain

  • 5. “Programming is for lazy people who want to automate things — AI is for lazier people who want to automate programming”
  • 6. The Lazy MLer
      • Consider manual classification task
      • Automate with ML model?
      • Build PoC
      • Deploy in production
      • Maintain
      • Monitor performance
      • Update with new data
      Workflow: Phrase problem as ML task; Engineer features; Prepare data (csv); Learn model; Make predictions; Deploy model & integrate; Evaluate model; Measure impact
  • 7. Cost of ML projects
      • Florian Douetteau at PAPIs Connect in May 2015)
      • Top companies invested more than $5M in their ML production platform (Facebook, Amazon, LinkedIn, Spotify…)
  • 8. The Lazy MLer
      • Real-world ML is/was complicated and costly (especially at web scale)
      • Do I really need ML?
      • How about Human API? (e.g. Amazon Mechanical Turk)
      • → Back to Square 1 (but someone else’s problem!)
      • → Baseline! (performance, time, cost)
  • 10. How do you evaluate the performance of an ML system?
  • 13. Domain-specific evaluation
      • Go beyond accuracy… example: recommendations
      • Get clicks!
      • → Simulate how many you’d get with your model
      • → Need to learn accurately what people like, not what they dislike
      • Better decisions with ML
      • Revenue increase (A/B test)
      • Decisions can have a cost (e.g. give special offer/pricing to customer)… ROI?
  • 15. Types of analytics
      1. Descriptive
      2. Predictive
      3. Prescriptive
  • 17. Churn analysis
      1. Show churn rate against time
      2. Predict which customers will churn next
      3. Suggest what to do about each customer (e.g. propose to switch plan, send promotional offer, etc.)
  • 18. Churn prediction
      • Who: SaaS company selling monthly subscription
      • Question asked: “Is this customer going to leave within 1 month?”
      • Input: customer
      • Output: no-churn or churn
      • Data collection: history up until 1 month ago
  • 19. Churn prediction accuracy
      • #TP (we predict customer churns and he does)
      • #FP (we predict customer churns but he doesn’t)
      • #FN (we predict customer doesn’t churn but he does)
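The three counts on this slide are usually combined into precision and recall for the "churn" class. A minimal sketch, assuming those two measures are what is wanted (the slide does not name them) and using made-up counts:

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for the 'churn' class from raw counts."""
    # Precision: of the customers we flagged as churners, how many really churned?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the customers who really churned, how many did we flag?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(tp=30, fp=10, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

True negatives do not appear in either measure, which suits churn data where most customers do not churn.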
  • 20. Churn prevention
      Assume we know who’s going to churn. What do we do?
      • Contact them (in which order?)
      • Switch to different plan
      • Give special offer
      • No action?
  • 21. Churn prevention
      “3. Suggest what to do about each customer” → prioritised list of actions, based on…
      • Customer representation + context (e.g. competition)
      • Churn prediction (& action prediction?)
      • Uncertainty in predictions
      • Revenue brought by customer & cost of action
      • Constraints on frequency of solicitations
  • 22. Churn prevention ROI
      • Taking action for each TP (and FP) has a cost
      • For each TP we “gain”: (success rate of action) * (revenue /cust. /month)
      • Imagine…
          • perfect predictions
          • revenue /cust. /month = 10€
          • success rate of action = 20%
          • cost of action = 2€
      • Which ROI?
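The ROI question can be worked through numerically. The sketch below assumes ROI is defined as (gain − cost) / cost over a single month, which the slide does not spell out; the number of customers is a made-up value and cancels out when predictions are perfect.

```python
def prevention_roi(n_tp, n_fp, success_rate, revenue_per_month, cost_of_action):
    """ROI of churn-prevention actions over one month (assumed definition)."""
    # An action is taken for every predicted churner, right or wrong.
    total_cost = (n_tp + n_fp) * cost_of_action
    if total_cost == 0:
        raise ValueError("no actions taken, ROI is undefined")
    # Only true positives can be 'saved', and only when the action succeeds.
    total_gain = n_tp * success_rate * revenue_per_month
    return (total_gain - total_cost) / total_cost


# Slide numbers: perfect predictions (no FP), 10 EUR revenue, 20% success, 2 EUR cost.
roi = prevention_roi(n_tp=100, n_fp=0, success_rate=0.20,
                     revenue_per_month=10.0, cost_of_action=2.0)
print(f"ROI: {roi:.0%}")
```

Under these assumptions the expected gain per true positive (0.20 × 10€ = 2€) only just covers the 2€ action cost, so even perfect predictions break even; any false positive pushes the ROI below zero.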
  • 23. Churn prevention evaluation
      • We predicted customer would churn but they didn’t…
      • That’s actually good! Prevention worked!
      • Need to store which actions were taken
      • Is ML really helping?
      • Compare to baseline, e.g. if no usage for more than 15 days then predict churn
      • Is fancy model really improving bottom line?
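The baseline rule on this slide fits in one line of code, which makes it a cheap yardstick for any learned model. A minimal sketch; the customer records below are invented for illustration:

```python
def baseline_predict_churn(days_since_last_use, threshold_days=15):
    """Rule from the slide: no usage for more than 15 days means 'churn'."""
    return days_since_last_use > threshold_days


# Hypothetical history: (days since last use, did the customer actually churn?)
customers = [(2, False), (20, True), (16, False), (40, True), (10, True), (1, False)]

predictions = [baseline_predict_churn(days) for days, _ in customers]
actuals = [churned for _, churned in customers]

# Same counts as on the "Churn prediction accuracy" slide.
tp = sum(p and a for p, a in zip(predictions, actuals))
fp = sum(p and not a for p, a in zip(predictions, actuals))
fn = sum(not p and a for p, a in zip(predictions, actuals))
print(f"baseline: TP={tp} FP={fp} FN={fn}")
```

Running a candidate model over the same records and comparing the counts (or the resulting ROI) shows whether it actually beats the rule.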
  • 24. Replenishment
      1. Show past demand against calendar
      2. Predict demand for [product] at [store] in next 2 days
      3. Suggest how much to ship
      • Trade-off: cost of storage vs risk of lost sales
      • Constraints on order size, truck volume, capacity of people putting stuff into shelves
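One way to picture step 3 is to pick the shipment size that minimises expected cost under the predicted demand distribution. This is only a sketch: the demand probabilities, the unit costs and the single `max_order` constraint are all assumptions made for illustration, not figures from the talk.

```python
def expected_cost(quantity, demand_probs, storage_cost, lost_sale_cost):
    """Expected cost of shipping `quantity` units given a demand distribution."""
    cost = 0.0
    for demand, prob in demand_probs.items():
        leftover = max(quantity - demand, 0)   # units that must be stored
        shortfall = max(demand - quantity, 0)  # sales lost to an empty shelf
        cost += prob * (leftover * storage_cost + shortfall * lost_sale_cost)
    return cost


def best_quantity(demand_probs, storage_cost, lost_sale_cost, max_order):
    """Cheapest quantity among those allowed by the order-size constraint."""
    return min(range(max_order + 1),
               key=lambda q: expected_cost(q, demand_probs, storage_cost, lost_sale_cost))


# Hypothetical prediction with uncertainty: P(demand = 8), P(10), P(12).
demand_probs = {8: 0.2, 10: 0.5, 12: 0.3}
quantity = best_quantity(demand_probs, storage_cost=1.0, lost_sale_cost=3.0, max_order=20)
print(quantity)
```

Because a lost sale is assumed to cost more than storing a unit, the suggested quantity sits above the most likely demand; tightening `max_order` caps it.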
  • 25. Decisions are based on…
      • Context
      • Predictions
      • Uncertainty in predictions
      • Constraints
      • Costs / benefits
      • Competing objectives (trade-offs to make)
      • Business rules
  • 27. Separation of concerns
      Software components for automated decisions:
      • Create training dataset from historical data (merge sources, aggregate…)
      • Provide predictive model from given training set (i.e. learn)
      • Provide prediction against model for given context
      • Provide optimal decision from given contextual data, predictions, uncertainties, constraints, objectives, costs
      • Apply given decision
  • 28. Operations Research component (same list of software components as slide 27)
  • 29. Machine Learning components (same list of software components as slide 27)
  • 30. Predictive APIs (same list of software components as slide 27)
  • 31. Predictive APIs
      The two methods of predictive APIs:
      • model = create_model('training.csv')
      • predicted_output = create_prediction(model, new_input)
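To make the two-method interface concrete, here is a toy version that runs locally. It is a sketch under stated assumptions: the "model" is a 1-nearest-neighbour lookup standing in for whatever a real provider learns, the last CSV column is assumed to be the output, and the column names and rows are invented.

```python
import csv
import math
import os
import tempfile


def create_model(csv_path):
    """'Learn' from a CSV file; here we simply memorise the examples."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    _header, data = rows[0], rows[1:]
    # Assumption: numeric inputs in every column except the last (the output).
    return [([float(v) for v in row[:-1]], row[-1]) for row in data]


def create_prediction(model, new_input):
    """Predict the output of the closest memorised example."""
    _inputs, output = min(model, key=lambda example: math.dist(example[0], new_input))
    return output


# Write a tiny, hypothetical training.csv so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([
            ["days_since_last_use", "support_tickets", "label"],
            [1, 0, "no-churn"], [3, 1, "no-churn"],
            [25, 4, "churn"], [40, 2, "churn"],
        ])
    model = create_model(path)

predicted_output = create_prediction(model, [30, 3])
print(predicted_output)
```

With a hosted predictive API the two functions would wrap HTTP calls instead, but the calling code stays the same shape.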
  • 32. Providers of REST http Predictive APIs
      • Amazon ML
      • BigML
      • Google Prediction
      • PredicSis
      • … or your own company!
  • 36. ?
  • 38. Experiment on “ScienceCluster”
      • Distributed jobs
      • Collaborative workspace
      • Serialize chosen model
      Deploy model as API on “ScienceOps”
      • Load balancing
      • Auto scaling
      • Monitoring (API calls, accuracy)
  • 40.
      • “Open source prediction server” in Scala
      • Based on Spark, MLlib, Spray
      • DASE framework: Data preparation, Algorithm, Serving, Evaluation
      • Amazon CloudFormation template → cluster
      • Manual up/down scaling
  • 45. Concurrency for high-throughput ML APIs
      Brian Gawalt (Senior Data Scientist at Upwork), talk at PAPIs ’15
  • 46. Concurrency for high-throughput ML APIs
      upwork.com use case:
      • predict freelancer availability
      • huge web platform (millions of users) → need very high throughput and low latency
      • things change quickly → need freshest data & predictions
  • 47. Concurrency for high-throughput ML APIs
      • event: invitation sent to freelancer
      • steps to prediction:
          • gather raw data from all sources
          • featurize event
          • make prediction
  • 48. Concurrency with Actor framework
      • An actor…
          • gets & sends messages
          • makes computations
      • Actors we need:
          • “Historians”: one per data source
          • “Featurizer”
          • “Scorer”
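The Historian / Featurizer / Scorer pipeline can be imitated with message queues. This sketch uses Python's `asyncio` queues as a stand-in for an actor framework (it is not the framework used in the talk); the data sources, field names and the scoring rule are all hypothetical.

```python
import asyncio

# Hypothetical raw data, one dict per data source, keyed by freelancer id.
SOURCES = {
    "profile": {"f1": {"hourly_rate": 40}},
    "activity": {"f1": {"days_since_login": 2}},
}


async def historian(name, inbox, featurizer_inbox):
    """One per data source: looks up raw data for each incoming event."""
    while True:
        event = await inbox.get()
        raw = SOURCES[name].get(event["freelancer_id"], {})
        await featurizer_inbox.put((event["id"], name, raw))


async def featurizer(inbox, scorer_inbox, n_sources):
    """Waits until every historian has answered, then builds the features."""
    pending = {}
    while True:
        event_id, source, raw = await inbox.get()
        pending.setdefault(event_id, {})[source] = raw
        if len(pending[event_id]) == n_sources:
            parts = pending.pop(event_id)
            features = {k: v for part in parts.values() for k, v in part.items()}
            await scorer_inbox.put((event_id, features))


async def scorer(inbox, results):
    """Turns features into a prediction (toy rule instead of a real model)."""
    while True:
        event_id, features = await inbox.get()
        available = features.get("days_since_login", 999) <= 7
        await results.put((event_id, available))


async def main():
    historian_inboxes = {name: asyncio.Queue() for name in SOURCES}
    featurizer_inbox, scorer_inbox, results = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    tasks = [asyncio.create_task(historian(name, inbox, featurizer_inbox))
             for name, inbox in historian_inboxes.items()]
    tasks.append(asyncio.create_task(featurizer(featurizer_inbox, scorer_inbox, len(SOURCES))))
    tasks.append(asyncio.create_task(scorer(scorer_inbox, results)))

    # Event: an invitation was sent to freelancer f1.
    event = {"id": "e1", "freelancer_id": "f1"}
    for inbox in historian_inboxes.values():
        await inbox.put(event)
    result = await results.get()

    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return result


result = asyncio.run(main())
print(result)
```

Each actor only reads its own inbox and writes to the next one, so a slow data source does not block the others.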
  • 51. API standards?
      • Python de facto standard: scikit-learn
      • “Sparkit-learn aims to provide scikit-learn functionality and API on PySpark. The main goal of the library is to create an API that stays close to sklearn’s.”
      • REST standard: PSI (Protocols & Structures for Inference)
      • Pretty similar to BigML API!
      • Implementation for scikit available
      • Easier benchmarking! Ensembles!
  • 52. PAPIs ’15 Proceedings
      • “AzureML: Anatomy of a machine learning service”
      • “Deploying high throughput predictive models with the actor framework”
      • “Protocols and Structures for Inference: A RESTful API for Machine Learning”
      • Coming soon… JMLR W&CP Volume 50
      • Get updates: @papisdotio or papis.io/updates
  • 53. Simple MLaaS comparison
      Provider     Accuracy   Training   Test time
      Amazon       0.862      135s       188s
      Google       0.743      76s        369s
      PredicSis    0.858      17s        5s
      BigML        0.790      5s         1s
      Source: louisdorard.com/blog/machine-learning-apis-comparison
  • 54. Automated Benchmark?
      • With SKLL (SciKit Learn Laboratory)
      • Wrap each service in a scikit estimator
      • Specify evaluations to perform in a config file (datasets, metrics, eval procedure)
      • Need to also measure time…
      • See papiseval on Github
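The "wrap each service in a scikit estimator" idea amounts to giving every provider the same `fit`/`predict` surface, then timing both calls. A minimal sketch, not the papiseval code: `FakeServiceClient` is a hypothetical in-memory stand-in for a provider's client, and a real wrapper would normally subclass scikit-learn's `BaseEstimator` rather than rely on duck typing.

```python
import time


class FakeServiceClient:
    """Hypothetical provider client; always predicts the majority class."""

    def create_model(self, X, y):
        return max(set(y), key=y.count)

    def create_prediction(self, model, x):
        return model


class ServiceEstimator:
    """Exposes a predictive API through scikit-learn style fit/predict."""

    def __init__(self, client):
        self.client = client

    def fit(self, X, y):
        self.model_ = self.client.create_model(X, y)
        return self

    def predict(self, X):
        return [self.client.create_prediction(self.model_, x) for x in X]


def benchmark(estimator, X_train, y_train, X_test, y_test):
    """Measure accuracy plus training and test time for one estimator."""
    start = time.perf_counter()
    estimator.fit(X_train, y_train)
    train_time = time.perf_counter() - start

    start = time.perf_counter()
    predictions = estimator.predict(X_test)
    test_time = time.perf_counter() - start

    accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
    return {"accuracy": accuracy, "train_time": train_time, "test_time": test_time}


# Tiny made-up dataset, for illustration only.
report = benchmark(ServiceEstimator(FakeServiceClient()),
                   X_train=[[0], [1], [2]], y_train=["a", "a", "b"],
                   X_test=[[3], [4]], y_test=["a", "b"])
print(report)
```

Swapping in another client with the same two methods yields a comparable report for each provider.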
  • 55. AutoML
      • Return of the Lazy MLer!
      • Model selection
      • Find optimal values for n (hyper-)parameters → optimisation problem (function in n dimensions)
      • Search space of parameters, efficiently → explore vs exploit
      • Bayesian optimization?
  • 56. Bayesian Optimization in 1 dimension (from CODE517E)
  • 57. Bayesian Optimization in 1 dimension (from CODE517E)
  • 58. AutoML
      • Building ensembles
      • Decide to continue training existing model, or to train new one
      • Explore vs exploit again!
      • Reward is accuracy. Let’s estimate reward for all options.
      • Choose option with highest expected reward + uncertainty? (i.e. upper confidence bound)
      • Limited computational budget…
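The "expected reward + uncertainty" rule can be sketched with the classic UCB1 formula, which is one common choice of upper confidence bound and not necessarily the one the speaker had in mind. The three options, their true accuracies and the noise level are invented; the budget is the number of training runs we can afford.

```python
import math
import random


def ucb_choose(counts, means, t):
    """Pick the option with the highest mean reward plus uncertainty bonus."""
    # Try every option once before trusting any estimate.
    for option, n in enumerate(counts):
        if n == 0:
            return option
    return max(range(len(counts)),
               key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))


def run(true_accuracy, budget, seed=0):
    """Spend `budget` evaluations across options using UCB1."""
    rng = random.Random(seed)
    counts = [0] * len(true_accuracy)
    means = [0.0] * len(true_accuracy)
    for t in range(1, budget + 1):
        option = ucb_choose(counts, means, t)
        # Simulated noisy accuracy measurement (hypothetical noise level).
        reward = true_accuracy[option] + rng.uniform(-0.02, 0.02)
        counts[option] += 1
        # Incremental update of the running mean reward.
        means[option] += (reward - means[option]) / counts[option]
    return counts, means


counts, means = run(true_accuracy=[0.60, 0.70, 0.90], budget=100)
print(counts)
```

Options that look weak still get an occasional try because their uncertainty bonus grows while they are ignored, which is the explore side of the trade-off.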
  • 59. Automatic Statistician
      • Zoubin Ghahramani & James Lloyd @ Uni Cambridge
      • Gaussian Processes: find (mixture of) kernel(s) that maximises data likelihood
      • Also Bayesian!
  • 60. Open Source AutoML libraries
      • Spearmint: “Bayesian optimization” for tuning parameters → Whetlab → Twitter
      • Auto-sklearn: “automated machine learning toolkit and drop-in replacement for a scikit-learn estimator”
      • See automl.org and challenge
  • 61. Scikit
      from sklearn import datasets, svm

      model = svm.SVC(gamma=0.001, C=100.)
      digits = datasets.load_digits()
      # Train on every image except the last one
      model.fit(digits.data[:-1], digits.target[:-1])
      # predict() expects a 2D array, so slice the last row instead of indexing it
      model.predict(digits.data[-1:])
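  • 63. AutoML Scikit
      import autosklearn.classification
      from sklearn import datasets

      # Drop-in replacement for the scikit-learn estimator
      model = autosklearn.classification.AutoSklearnClassifier()
      digits = datasets.load_digits()
      model.fit(digits.data[:-1], digits.target[:-1])
      # predict() expects a 2D array, so slice the last row instead of indexing it
      model.predict(digits.data[-1:])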
  • 64. More automation ideas…
      • Before learning:
          • Automatic feature extraction from text?
      • After learning:
          • Monitor new predictions and automatically retrain models when necessary?
      • See panel discussion at PAPIs ’15
  • 65. Open Source Auto Scaling?
      • Same as Azure ML?
      • Scaling up? down?
  • 67. PAPIs Connect (14-15 March, Valencia)
      Tech talks:
      • Intro to Spark
      • Using ML to build an autonomous drone
      • Demystifying Deep Learning (speaker needed!)
      • Distributed Deep Learning with Spark on AWS
  • 68. PAPIs Connect (14-15 March, Valencia)
      Topics:
      • Managing technology
      • FinTech
      • Enterprise, Retail, Operations
      • AI for Society (Nuria Oliver, Scientific Director at Telefonica R&D)
      • Future of AI (Ramon Lopez de Mantaras, Director AI Research at Spanish Research Council)
  • 69. PAPIs Connect (14-15 March, Valencia)
      • Dev? Bring your manager!
      • Manager? Bring your devs!
      • Discount code: MLSVLC20
      • papis.io/connect