Predictive Apps for Startups
@louisdorard
#MLVLC - 11 March 2016
AI is everywhere
Lars Trieloff
@trieloff
How does it work?
Data + Machine Learning
Bedrooms Bathrooms Surface (foot²) Year built Type Price ($)
3 1 860 1950 house 565,000
3 1 1012 1951 house
2 1.5 968 1976 townhouse 447,000
4 1315 1950 house 648,000
3 2 1599 1964 house
3 2 987 1951 townhouse 790,000
1 1 530 2007 condo 122,000
4 2 1574 1964 house 835,000
4 2001 house 855,000
3 2.5 1472 2005 house
4 3.5 1714 2005 townhouse
2 2 1113 1999 condo
1 769 1999 condo 315,000
ML is a set of AI techniques where “intelligence” is built from examples
“Weak AI” vs. “Strong AI”
Everyday use cases
• Real-estate: property price (Zillow)
• Spam: email spam indicator (Gmail)
• Priority inbox: email importance indicator (Gmail)
• Crowd prediction: location & context, #people (Tranquilien)
Making Machine Learning accessible
with cloud platforms
HTML / CSS / JavaScript
squarespace.com
The two phases of ML

• TRAIN a model
• PREDICT with a model

Machine Learning APIs
The two methods of ML Application Programming Interfaces
(here in Python)
• model = create_model('training.csv')
• predicted_output, confidence = create_prediction(model, new_input)
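To make the two calls above concrete, here is a minimal, self-contained sketch. It is not any vendor's API: `create_model` and `create_prediction` are implemented here as a toy 1-nearest-neighbour regressor over in-memory rows (instead of a CSV file), and the confidence value is a simple heuristic chosen for illustration. The training rows are the fully filled, priced houses from the table earlier in the deck.

```python
import math

# Priced, fully filled rows from the housing table:
# (bedrooms, bathrooms, surface_ft2, year_built) -> price in $
TRAINING_ROWS = [
    ((3, 1.0, 860, 1950), 565_000),
    ((2, 1.5, 968, 1976), 447_000),
    ((3, 2.0, 987, 1951), 790_000),
    ((1, 1.0, 530, 2007), 122_000),
    ((4, 2.0, 1574, 1964), 835_000),
]


def create_model(rows):
    """TRAIN: a 1-nearest-neighbour 'model' stores the rows plus per-feature ranges."""
    n_features = len(rows[0][0])
    ranges = []
    for i in range(n_features):
        values = [features[i] for features, _ in rows]
        ranges.append((max(values) - min(values)) or 1.0)  # avoid division by zero
    return {"rows": rows, "ranges": ranges}


def _distance(a, b, ranges):
    # Euclidean distance after scaling each feature by its range
    return math.sqrt(sum(((x - y) / r) ** 2 for x, y, r in zip(a, b, ranges)))


def create_prediction(model, new_input):
    """PREDICT: return the price of the closest training row and a rough confidence."""
    distance, price = min(
        (_distance(features, new_input, model["ranges"]), target)
        for features, target in model["rows"]
    )
    confidence = 1.0 / (1.0 + distance)  # heuristic in (0, 1], not a calibrated probability
    return price, confidence


model = create_model(TRAINING_ROWS)
# One of the unpriced rows from the table: 3 bedrooms, 2 bathrooms, 1599 ft², built 1964
predicted_output, confidence = create_prediction(model, (3, 2.0, 1599, 1964))
print(predicted_output, round(confidence, 2))
```

A hosted ML API would replace the nearest-neighbour logic with a properly trained model, but the shape of the interaction stays the same: one call to train, one call to predict.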
Example request to BigML API
$ curl "https://bigml.io/dev/model?$BIGML_AUTH" \
    -X POST \
    -H "content-type: application/json" \
    -d '{"dataset": "dataset/50ca447b3b56356ae0000029"}'
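The same request can be sketched in Python with only the standard library. This is an illustration rather than the official BigML bindings: `build_model_request` is a hypothetical helper, the fallback value of `BIGML_AUTH` is a placeholder and not a working credential, and the request is built but deliberately not sent.

```python
import json
import os
import urllib.request

# BIGML_AUTH is assumed to hold the same query string as in the curl example;
# the fallback below is a placeholder, not a working credential.
BIGML_AUTH = os.environ.get("BIGML_AUTH", "username=USER;api_key=KEY")


def build_model_request(dataset_id, auth=BIGML_AUTH):
    """Build (but do not send) the POST request that asks for a model on a dataset."""
    body = json.dumps({"dataset": dataset_id}).encode("utf-8")
    return urllib.request.Request(
        url="https://bigml.io/dev/model?" + auth,
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


request = build_model_request("dataset/50ca447b3b56356ae0000029")
print(request.get_method(), request.full_url.split("?")[0])

# To actually send it (needs valid credentials and network access):
# with urllib.request.urlopen(request) as response:
#     model = json.load(response)
```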
Priority detection
• Classification problem
• Features:
  • Text of email
  • Sender in address book?
  • How often do I reply?
  • How quickly do I reply?
• Demo
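As a rough illustration of the features listed above, the sketch below turns one email into a feature dictionary that a classifier could consume. Everything in it is hypothetical: the `Email` class, the `extract_features` helper, the layout of `reply_history`, and the example addresses and numbers are made up for the example and do not describe how Gmail computes priority.

```python
from dataclasses import dataclass


@dataclass
class Email:
    text: str
    sender: str


def extract_features(email, address_book, reply_history):
    """Turn one email into the four kinds of features listed on the slide.

    reply_history maps a sender to (emails_received, replies_sent,
    mean_reply_delay_hours); these names are illustrative only.
    """
    received, replied, delay_hours = reply_history.get(email.sender, (0, 0, None))
    return {
        "text": email.text,  # raw text, to be vectorized by the learning tool
        "sender_in_address_book": email.sender in address_book,
        "reply_rate": replied / received if received else 0.0,
        "mean_reply_delay_hours": delay_hours,  # None when there is no reply on record
    }


# Made-up data for the example
address_book = {"alice@example.com"}
reply_history = {"alice@example.com": (20, 15, 1.5)}

features = extract_features(
    Email(text="Lunch tomorrow?", sender="alice@example.com"),
    address_book,
    reply_history,
)
print(features)
```

Each such dictionary, paired with an "important" or "not important" label, would become one row of the training data for the classification problem.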
Getting started
• VM with Jupyter notebooks (Python & Bash)
• API wrappers preinstalled: BigML & Google Pred
• Notebook for easy setup of credentials
• Scikit-learn and Pandas preinstalled
• Open source VM provisioning script & notebooks
• Search public Snaps on terminal.com: “machine learning”
Making Machine Learning easier
How was it before?
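from sklearn import datasets, svm

# Support vector classifier with hand-picked hyperparameters
model = svm.SVC(gamma=0.001, C=100.)
digits = datasets.load_digits()
# Train on every image except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# Predict the held-out image; [-1:] keeps the 2D shape predict() expects
model.predict(digits.data[-1:])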
WAT?
http://oscar.sensout.com
Open Source AutoML libraries
• Spearmint: “Bayesian optimization” for tuning parameters → Whetlab → Twitter
• Auto-sklearn: “automated machine learning toolkit and drop-in replacement for a scikit-learn estimator”
Scikit
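from sklearn import datasets, svm

# Support vector classifier with hand-picked hyperparameters
model = svm.SVC(gamma=0.001, C=100.)
digits = datasets.load_digits()
# Train on every image except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# Predict the held-out image; [-1:] keeps the 2D shape predict() expects
model.predict(digits.data[-1:])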
AutoML Scikit
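import autosklearn.classification
from sklearn import datasets

# Let auto-sklearn search over algorithms and hyperparameters
model = autosklearn.classification.AutoSklearnClassifier()
digits = datasets.load_digits()
model.fit(digits.data[:-1], digits.target[:-1])
# [-1:] keeps the 2D shape predict() expects
model.predict(digits.data[-1:])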
Automatization
• Algorithm selection… AutoML
• Scaling… Azure ML or Yhat (Greg at PAPIs Connect)
• “Automating ML workflows: a report from the trenches” by Jose A. Ortega Ruiz
Making Deep Learning accessible
Image categorization
• Classification problem
• Input is an image = pixel values
• Deep Learning! (with Vincent)
Machine Learning for person detection
pixel1 pixel2 pixel3 person?
102 0 255 Yes
35 41 209 No
… … … …
Deep Learning
• Neural network:
  • Layers
  • Neurons of one layer connected to neurons of next layer
  • Each neuron receives signals from previous layer and sends new signal to next layer
  • New signal based on linear combination of signals received
• “Deep” -> more than 3 layers
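The signal-passing described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the weights and biases are made up, the sigmoid applied after each linear combination is an assumption (the slide only mentions the linear combination), and the network has just two layers of weights, so it is not "deep" by the definition above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(signals, weights, biases):
    """Each neuron sends sigmoid(linear combination of the signals it receives)."""
    return [
        sigmoid(sum(w * s for w, s in zip(neuron_weights, signals)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the signals through every layer in turn; return all intermediate values."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(layer_forward(activations[-1], weights, biases))
    return activations


# Made-up weights: 3 inputs -> 3 hidden neurons -> 2 outputs
LAYERS = [
    ([[0.2, -0.4, 0.1], [-0.3, 0.1, 0.5], [0.6, 0.2, -0.7]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5], [-0.5, 0.8, 0.3]], [0.0, 0.0]),
]

pixels = [102 / 255, 0 / 255, 255 / 255]  # pixel values rescaled to [0, 1]
*_, hidden, output = network_forward(pixels, LAYERS)
print(hidden, output)
```

Training a real network means adjusting the weights so that the outputs match labelled examples; the sketch only shows how a signal flows once the weights are fixed.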
Deep Learning for animal detection
Inputs: pixel1, pixel2, pixel3
Outputs: cat, dog
1st layer value = (102, 0, 255)
Last layer value = (0.1, 0.7, 0.4)
Output value = (0.8, 0.3) => there’s probably a cat!
1st layer value = (4, 166, 23)
Last layer value = (0.1, 0.7, 0.4)
Output value = (0.1, 0.2) => probably no animal here
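One simple way to read output values like the ones above is to pick the highest score and report an animal only if that score is high enough. The sketch below assumes a threshold of 0.5; neither the threshold nor the `interpret` helper comes from the slides.

```python
def interpret(output, labels=("cat", "dog"), threshold=0.5):
    """Return the label with the highest score, or None if no score reaches the threshold."""
    best_score, best_label = max(zip(output, labels))
    return best_label if best_score >= threshold else None


print(interpret((0.8, 0.3)))  # cat
print(interpret((0.1, 0.2)))  # None
```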
Machine Learning for person detection
pixel1 pixel2 pixel3 person?
102 0 255 Yes
35 41 209 No
… … … …
• Use network for animal detection until last layer
• Replace images with “smart” representation given by last layer
neuron1 neuron2 neuron3 person?
0.1 0.2 0.5 Yes
0.8 0.3 0.8 No
… … … …
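The two steps above can be sketched as follows. This is a toy under loud assumptions: a single sigmoid layer with made-up weights stands in for the pretrained animal-detection network, and a nearest-neighbour rule stands in for whatever person classifier would be trained on the new representation. The pixel values and Yes/No labels are the two rows from the person-detection table.

```python
import math


def last_layer_representation(pixels, weights, biases):
    """Run the (frozen) animal network up to its last hidden layer.

    A single hidden layer with sigmoid units stands in for the real network.
    """
    scaled = [p / 255 for p in pixels]
    return [
        1.0 / (1.0 + math.exp(-(sum(w * s for w, s in zip(row, scaled)) + b)))
        for row, b in zip(weights, biases)
    ]


def nearest_label(representation, labelled_representations):
    """Person detector on top: copy the label of the closest known representation."""
    return min(
        labelled_representations,
        key=lambda item: math.dist(item[0], representation),
    )[1]


# Made-up weights for the pretrained animal network (3 pixels -> 3 neurons)
WEIGHTS = [[0.2, -0.4, 0.1], [-0.3, 0.1, 0.5], [0.6, 0.2, -0.7]]
BIASES = [0.0, 0.1, -0.1]

# Step 1: replace each training image with its last-layer representation
images = [((102, 0, 255), "Yes"), ((35, 41, 209), "No")]
training = [(last_layer_representation(p, WEIGHTS, BIASES), label) for p, label in images]

# Step 2: classify a new (made-up) image in representation space
new_image = (100, 5, 250)
print(nearest_label(last_layer_representation(new_image, WEIGHTS, BIASES), training))
```

The point of the design is that the network's weights are reused as-is; only the small classifier on top needs labelled person examples.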
PAPIs Connect
• Artificial Intelligence for Business and Society
• Next Monday & Tuesday
• papis.io/connect
• Discount for 24 hours only!
