Machine Learning: I'm getting started tomorrow!
@louisdorard
#TTFX - March 31, 2016
AI is everywhere
Amazon for David Jones (@d_jones, see source)
Lars Trieloff
@trieloff
(see source)
@louisdorard
ChurnSpotter.io
AI Startup Battle at PAPIs.io
• Startups pitch
• AI asks questions live to each startup
• AI assigns score
• Startup with highest score wins €100,000
Preseries
How does it work?
Data + Machine Learning
Bedrooms Bathrooms Surface (foot²) Year built Type Price ($)
3 1 860 1950 house 565,000
3 1 1012 1951 house
2 1.5 968 1976 townhouse 447,000
4 1315 1950 house 648,000
3 2 1599 1964 house
3 2 987 1951 townhouse 790,000
1 1 530 2007 condo 122,000
4 2 1574 1964 house 835,000
4 2001 house 855,000
3 2.5 1472 2005 house
4 3.5 1714 2005 townhouse
2 2 1113 1999 condo
1 769 1999 condo 315,000
ML is a set of AI techniques where “intelligence” is built from examples
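To make "built from examples" concrete, here is a minimal sketch that predicts a price for an unpriced home by copying the price of the most similar priced home. It only uses the rows of the housing table that are complete. The nearest-neighbour rule, the scaling by column range and the penalty of 1.0 for a different property type are illustrative assumptions, not the method behind any product mentioned in this deck.

```python
import math

# Complete rows copied from the housing table in this deck:
# (bedrooms, bathrooms, surface_sqft, year_built, type, price_usd)
EXAMPLES = [
    (3, 1.0, 860, 1950, "house", 565_000),
    (2, 1.5, 968, 1976, "townhouse", 447_000),
    (3, 2.0, 987, 1951, "townhouse", 790_000),
    (1, 1.0, 530, 2007, "condo", 122_000),
    (4, 2.0, 1574, 1964, "house", 835_000),
]


def column_spans(rows):
    """Range (max - min) of each numeric column, so features share a scale."""
    columns = list(zip(*(row[:4] for row in rows)))
    return [(max(col) - min(col)) or 1 for col in columns]


def distance(a, b, spans):
    """Scaled Euclidean distance, plus an assumed penalty for a different type."""
    numeric = sum(((a[i] - b[i]) / spans[i]) ** 2 for i in range(4))
    type_penalty = 0.0 if a[4] == b[4] else 1.0
    return math.sqrt(numeric + type_penalty)


def predict_price(home, examples=EXAMPLES):
    """Return the price of the most similar example (1-nearest neighbour)."""
    spans = column_spans(examples)
    nearest = min(examples, key=lambda ex: distance(home, ex, spans))
    return nearest[5]


# The second row of the table has no price yet
unpriced_home = (3, 1.0, 1012, 1951, "house")
predicted = predict_price(unpriced_home)
print(predicted)
```

No pricing rule is written by hand here: the answer comes entirely from the examples, so adding more priced homes changes the predictions without changing the code.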
Use cases
• Real-estate: property → price (Zillow)
• Spam filtering: email → spam indicator (Gmail)
• City bikes: location, context → #bikes (V3 predict)
• Startup competition: startup → success indicator (Preseries)
• Reduce churn: customer → churn indicator (ChurnSpotter)
• Optimize pricing: product, price → #sales (Amazon)
• Anticipate demand: product, store, date → #sales (Blue Yonder)
RULES
Making Machine Learning accessible
with cloud platforms
HTML / CSS / JavaScript
squarespace.com
The two phases of ML

• TRAIN a model
• PREDICT with a model

Machine Learning APIs
The two methods of ML Application Programming Interfaces
(here in Python)
• model = create_model('training.csv')
• predicted_output, confidence = create_prediction(model, new_input)
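The two calls above are pseudocode for what a cloud ML API offers. As a rough local sketch of the same shape, the block below implements `create_model` and `create_prediction` with a tiny k-nearest-neighbours classifier. The CSV layout (numeric input columns followed by a label column), the parameter `k` and the toy data are all assumptions made for illustration; real services such as BigML do far more behind these two calls.

```python
import csv
import math
import os
import tempfile
from collections import Counter


def create_model(csv_path):
    """TRAIN: read the examples. Assumes a header row, numeric inputs, label last."""
    with open(csv_path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    header, data = rows[0], rows[1:]
    examples = [([float(v) for v in row[:-1]], row[-1]) for row in data]
    return {"inputs": header[:-1], "examples": examples}


def create_prediction(model, new_input, k=3):
    """PREDICT: vote among the k closest examples; confidence is the vote share."""
    ranked = sorted(model["examples"], key=lambda ex: math.dist(ex[0], new_input))
    neighbours = ranked[:k]
    votes = Counter(label for _, label in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)


# Made-up training data, written to a temporary file for the demo
toy_csv = "x1,x2,label\n1,1,a\n1,2,a\n2,1,a\n8,8,b\n9,8,b\n8,9,b\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write(toy_csv)
    training_path = f.name

model = create_model(training_path)
predicted_output, confidence = create_prediction(model, [1.5, 1.5])
os.remove(training_path)
print(predicted_output, confidence)
```

The point of the two-method shape is that the caller never sees the learning algorithm: swapping k-nearest neighbours for something else would not change the calling code.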
Example request to BigML API
$ curl "https://bigml.io/dev/model?$BIGML_AUTH" \
    -X POST \
    -H "content-type: application/json" \
    -d '{"dataset": "dataset/50ca447b3b56356ae0000029"}'
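For readers who prefer Python to curl, the same request could be prepared with the standard library roughly as follows. The URL, the JSON body and the `BIGML_AUTH` environment variable come from the curl example; the helper name `build_model_request` is hypothetical, and the sketch only builds the request without sending it.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above
BIGML_MODEL_URL = "https://bigml.io/dev/model"


def build_model_request(dataset_id, auth):
    """Prepare (but do not send) the POST request that asks for a model."""
    body = json.dumps({"dataset": dataset_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BIGML_MODEL_URL}?{auth}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# BIGML_AUTH is expected to hold the credentials query string, as in the curl call
auth = os.environ.get("BIGML_AUTH", "")
request = build_model_request("dataset/50ca447b3b56356ae0000029", auth)

# Sending is left commented out because it needs valid credentials:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```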
Priority detection
• Classification problem
• Features:
  • Text of email
  • Sender in address book?
  • How often do I reply?
  • How quickly do I reply?
• Demo
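The features listed for priority detection can be pictured as one row per email. The sketch below shows one possible way to compute such a row; the `Email` class, the `extract_features` helper and the shape of the statistics dictionaries are hypothetical and were not part of the demo.

```python
from dataclasses import dataclass


@dataclass
class Email:
    sender: str
    text: str


def extract_features(email, address_book, received_from, replied_to, avg_reply_hours):
    """Turn one email into the feature row described in the list above."""
    received = received_from.get(email.sender, 0)
    replied = replied_to.get(email.sender, 0)
    return {
        "text": email.text,
        "sender_in_address_book": email.sender in address_book,
        # How often do I reply? Share of this sender's emails that got a reply.
        "reply_rate": replied / received if received else 0.0,
        # How quickly do I reply? None when there is no reply history.
        "avg_reply_hours": avg_reply_hours.get(email.sender),
    }


# Made-up mailbox statistics for illustration
features = extract_features(
    Email("alice@example.com", "Can we meet tomorrow?"),
    address_book={"alice@example.com"},
    received_from={"alice@example.com": 10},
    replied_to={"alice@example.com": 8},
    avg_reply_hours={"alice@example.com": 1.5},
)
print(features)
```

Rows like this one, each paired with an "important / not important" label, would form the training data for the classifier.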
Getting started
• VM with Jupyter notebooks (Python & Bash)
• API wrappers preinstalled: BigML & Google Prediction API
• Notebook for easy setup of credentials
• Scikit-learn and Pandas preinstalled
• Open source VM provisioning script & notebooks
• Search public Snaps on terminal.com: “machine learning”
Making Machine Learning easier
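How was it before?
from sklearn import datasets, svm

# Load the handwritten digits dataset bundled with scikit-learn
digits = datasets.load_digits()
# Support vector classifier with hand-picked hyperparameters
model = svm.SVC(gamma=0.001, C=100.)
# TRAIN on every example except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# PREDICT the last example; predict expects a 2D array, hence the [-1:] slice
model.predict(digits.data[-1:])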
WAT?
http://oscar.sensout.com
Open Source AutoML libraries
• Spearmint: “Bayesian optimization” for tuning parameters → Whetlab → Twitter
• Auto-sklearn: “automated machine learning toolkit and drop-in replacement for a scikit-learn estimator”
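What "tuning parameters" automates can be illustrated with a much simpler stand-in. The sketch below tries a few candidate values of `k` for a small nearest-neighbour classifier and keeps the one with the best leave-one-out accuracy. This is plain exhaustive search, not the Bayesian optimization that Spearmint performs, and the data, the candidate values and the function names are made up for illustration.

```python
import math
from collections import Counter

# Made-up toy data: (features, label), two well-separated groups
DATA = [
    ([1.0, 1.0], "a"), ([1.2, 0.8], "a"), ([0.9, 1.3], "a"), ([1.1, 1.1], "a"),
    ([3.0, 3.1], "b"), ([3.2, 2.9], "b"), ([2.8, 3.0], "b"), ([3.1, 3.3], "b"),
]


def knn_predict(train, x, k):
    """Majority label among the k training examples closest to x."""
    ranked = sorted(train, key=lambda ex: math.dist(ex[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each example in turn and check whether it is predicted correctly."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, x, k) == y
    return hits / len(data)


def tune_k(data, candidates):
    """Score every candidate and return the best one with all the scores."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


best_k, scores = tune_k(DATA, candidates=(1, 3, 7))
print(best_k, scores)
```

With `k=7` every held-out point is outvoted by the other group, which is the kind of bad setting an automated search is meant to rule out. Bayesian optimization pursues the same goal while trying to evaluate far fewer candidates.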
Scikit
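from sklearn import datasets, svm

digits = datasets.load_digits()
# Algorithm (SVC) and hyperparameters (gamma, C) are chosen by hand
model = svm.SVC(gamma=0.001, C=100.)
model.fit(digits.data[:-1], digits.target[:-1])
# predict expects a 2D array, hence the [-1:] slice
model.predict(digits.data[-1:])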
AutoML Scikit
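import autosklearn.classification
from sklearn import datasets

digits = datasets.load_digits()
# Same fit/predict interface, but algorithm and hyperparameters are searched automatically
model = autosklearn.classification.AutoSklearnClassifier()
model.fit(digits.data[:-1], digits.target[:-1])
# predict expects a 2D array, hence the [-1:] slice
model.predict(digits.data[-1:])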
Automation
• Algorithm selection… AutoML
• Scaling… Azure ML or Yhat (Greg at PAPIs Connect)
• “Automating ML workflows: a report from the trenches” (Jose A. Ortega Ruiz)
Making Deep Learning accessible
Image categorization
• Classification problem
• Input is an image = pixel values

pixel1  pixel2  pixel3  animal?
102     0       255     Yes
35      41      209     No
…       …       …       …
Deep Learning
• Neural network:
  • Layers
  • Neurons of one layer connected to neurons of next layer
  • Each neuron receives signals from previous layer and sends new signal to next layer
  • New signal based on a linear combination of the signals received, typically passed through a non-linear function
• “Deep” -> more than 3 layers
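The description of a neural network above can be sketched in a few lines: each neuron takes a weighted sum of the previous layer's signals and squashes it. The weights below are made up, the sigmoid is just one common choice of non-linear function, and a real image network would be vastly larger and have its weights learned from examples.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(signals, weights, biases):
    """One new signal per neuron: a linear combination of inputs, then sigmoid."""
    return [
        sigmoid(sum(w * s for w, s in zip(neuron_weights, signals)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(pixels, layers):
    """Pass pixel values through every layer in turn and return the last signals."""
    signals = [p / 255.0 for p in pixels]  # scale pixels to [0, 1]
    for weights, biases in layers:
        signals = layer_output(signals, weights, biases)
    return signals


# Made-up weights: 3 pixels -> 3 hidden neurons -> 2 outputs (cat, dog)
LAYERS = [
    ([[0.5, -1.0, 0.8], [-0.3, 0.9, 0.1], [1.2, 0.4, -0.7]], [0.0, 0.1, -0.2]),
    ([[1.0, -1.5, 0.5], [-0.8, 0.6, 1.1]], [0.0, 0.0]),
]

cat_score, dog_score = forward((102, 0, 255), LAYERS)
print(cat_score, dog_score)
```

Training consists of adjusting the numbers in `LAYERS` until the outputs match the labels of the example images; only the prediction pass is shown here.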
Deep Learning for animal detection
Network inputs: pixel1, pixel2, pixel3; outputs: cat, dog
1st layer value = (102, 0, 255)
Last layer value = (0.1, 0.7, 0.4)
Output value = (0.8, 0.3) => there's probably a cat!
Network inputs: pixel1, pixel2, pixel3; outputs: cat, dog
1st layer value = (4, 166, 23)
Last layer value = (0.1, 0.7, 0.4)
Output value = (0.1, 0.2) => probably no animal here
pixel1  pixel2  pixel3  animal?
102     0       255     Yes
35      41      209     No
…       …       …       …

• Replace images with “smart” representation given by last layer

neuron1  neuron2  neuron3  animal?
0.1      0.2      0.5      Yes
0.8      0.3      0.8      No
…        …        …        …
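Once images are replaced by last-layer values, any simple learner can be trained on the new table. As a minimal sketch, the block below uses the two labelled rows of the neuron table and answers "animal?" for a new representation by picking the closest row. The nearest-neighbour rule is an assumption made for illustration, and the query reuses the last-layer value (0.1, 0.7, 0.4) shown earlier in the deck.

```python
import math

# The two labelled rows of the neuron table: (last-layer values, animal?)
LABELLED = [
    ((0.1, 0.2, 0.5), "Yes"),
    ((0.8, 0.3, 0.8), "No"),
]


def classify(representation, labelled=LABELLED):
    """Return the label of the closest labelled representation."""
    nearest = min(labelled, key=lambda row: math.dist(row[0], representation))
    return nearest[1]


answer = classify((0.1, 0.7, 0.4))
print(answer)
```

The benefit of this approach is that the expensive network is reused as is, and only a small model on top of its three output numbers has to be trained.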
Upcoming ML events in Bordeaux
• Next meetup:
  • Developing a predictive application (special edition for beginners)
  • Tuesday, April 12 at 7 pm - Le Node
• Workshop:
  • Operational Machine Learning with open source and cloud platforms
  • Saturday, April 23 - to be announced on the Meetup!
Machine Learning: I'm getting started on April 12 and 23!
meetup.com/Bordeaux-Machine-Learning-Meetup/
@louisdorard
