Predictive APIs make it easier to integrate machine learning into your apps and to add predictive features to them. Starting with some basics, we'll look at the different types of APIs and give some examples of proprietary predictive APIs. We'll go over ways of exposing your own predictive models as APIs served by third-party platforms, and open source frameworks for creating and serving your own APIs on your infrastructure of choice. We'll comment on recent (and missing) tools that make it easier to use and compare all these APIs. Finally, we'll give some pointers to a virtual machine to help you get started with these technologies.
Slides from my talk at the Valencian Summer School on Machine Learning (#VSMML15)
19.
METRIC         AMAZON  GOOGLE  PREDICSIS  BIGML
ACCURACY       0.862   0.743   0.858      0.790
TRAINING TIME  135s    76s     17s        5s
TEST TIME      188s    369s    5s         1s
louisdorard.com/blog/machine-learning-apis-comparison
42. Experiment on “ScienceCluster”
• Distributed jobs
• Collaborative workspace
• Serialize chosen model
Deploy model as API on “ScienceOps”
• Load balancing
• Auto scaling
• Monitoring (API calls, accuracy)
43. Your API endpoints
• 1 for serving predictions
• 1 for running an ML experiment (i.e. train and evaluate models on given data)?
• 1 for deploying ML models?
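The three endpoints above could be laid out roughly as follows. This is only a sketch using the Python standard library (WSGI): the paths `/predictions`, `/experiments` and `/models`, the JSON field names, and the 1-nearest-neighbour stand-in model are all assumptions made for illustration, not something taken from the talk or from any particular platform.

```python
import json
from io import BytesIO
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory model store: model name -> (training rows, training labels)
DEPLOYED = {}


def predict_1nn(rows, labels, x):
    """Stand-in model: return the label of the closest training row."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(rows)), key=lambda i: dist(rows[i], x))
    return labels[best]


def route(path, body):
    """Map a (path, JSON body) pair to a (status, JSON response) pair."""
    if path == "/experiments":
        # Train on one split, evaluate on the other, report accuracy.
        hits = sum(
            predict_1nn(body["train_x"], body["train_y"], x) == y
            for x, y in zip(body["test_x"], body["test_y"])
        )
        return "200 OK", {"accuracy": hits / len(body["test_y"])}
    if path == "/models":
        # "Deploying" here just means remembering the training data under a name.
        DEPLOYED[body["name"]] = (body["train_x"], body["train_y"])
        return "201 Created", {"deployed": body["name"]}
    if path == "/predictions":
        if body.get("model") not in DEPLOYED:
            return "404 Not Found", {"error": "unknown model"}
        rows, labels = DEPLOYED[body["model"]]
        preds = [predict_1nn(rows, labels, x) for x in body["inputs"]]
        return "200 OK", {"predictions": preds}
    return "404 Not Found", {"error": "unknown endpoint"}


def app(environ, start_response):
    """WSGI entry point: accepts POST requests with a JSON body."""
    if environ["REQUEST_METHOD"] != "POST":
        status, payload = "405 Method Not Allowed", {"error": "use POST"}
    else:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = json.loads(environ["wsgi.input"].read(length) or b"{}")
        status, payload = route(environ.get("PATH_INFO", ""), body)
    data = json.dumps(payload).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(data)))]
    start_response(status, headers)
    return [data]


def call(path, body):
    """Helper for trying the app in-process, without starting a server."""
    raw = json.dumps(body).encode("utf-8")
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = "POST"
    environ["PATH_INFO"] = path
    environ["CONTENT_LENGTH"] = str(len(raw))
    environ["wsgi.input"] = BytesIO(raw)
    seen = {}
    chunks = app(environ, lambda status, headers: seen.update(status=status))
    return seen["status"], json.loads(b"".join(chunks))
```

For example, `call("/models", {"name": "m", "train_x": [[0, 0], [1, 1]], "train_y": ["a", "b"]})` deploys a model, after which `call("/predictions", {"model": "m", "inputs": [[0.1, 0.0]]})` serves predictions from it. The same `app` can be handed to any WSGI server.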
57. Open Source AutoML?!
• Spearmint: “Bayesian optimization” for tuning parameters → Whetlab → Twitter
• Auto-sklearn: “automated machine learning toolkit and drop-in replacement for a scikit-learn estimator”
• See automl.org and the AutoML challenge
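58. Scikit Python
from sklearn import svm
# Support vector classifier with hand-picked hyperparameters
model = svm.SVC(gamma=0.001, C=100.)
from sklearn import datasets
digits = datasets.load_digits()
# Train on every sample except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# predict() expects a 2D array, so keep the last sample as a one-row slice
model.predict(digits.data[-1:])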
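60. AutoML Scikit
import autosklearn.classification
# Drop-in replacement for the scikit-learn estimator: no hyperparameters to pick by hand
model = autosklearn.classification.AutoSklearnClassifier()
from sklearn import datasets
digits = datasets.load_digits()
# Train on every sample except the last one
model.fit(digits.data[:-1], digits.target[:-1])
# predict() expects a 2D array, so keep the last sample as a one-row slice
model.predict(digits.data[-1:])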
62.
METRIC         AMAZON  GOOGLE  PREDICSIS  BIGML
ACCURACY       0.862   0.743   0.858      0.790
TRAINING TIME  135s    76s     17s        5s
TEST TIME      188s    369s    5s         1s
louisdorard.com/blog/machine-learning-apis-comparison
63. Automated Benchmark?!
• Requirements:
  • train/test splits on local machine
  • compute evaluation on local machine
• Solutions:
  • adapt bigmler and use local evaluations?
  • use scikit-learn framework?
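A minimal sketch of what such a benchmark could look like, with the split and the evaluation both done locally. It reports the same three metrics as the comparison table (accuracy, training time, test time). Everything here is an assumption for illustration: the `train`/`predict` adapter interface is hypothetical, and the two "services" are local stand-ins rather than wrappers around Amazon, Google, PredicSis or BigML. Neither bigmler nor scikit-learn is used, so the sketch runs on the standard library alone.

```python
import random
import time
from collections import Counter


class MajorityClass:
    """Stand-in "service": always predicts the most frequent training label."""

    def train(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]

    def predict(self, rows):
        return [self.label for _ in rows]


class NearestNeighbour:
    """Stand-in "service": 1-nearest-neighbour on numeric feature lists."""

    def train(self, rows, labels):
        self.rows, self.labels = rows, labels

    def predict(self, rows):
        def closest(x):
            return min(
                range(len(self.rows)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.rows[i], x)),
            )
        return [self.labels[closest(x)] for x in rows]


def split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle indices with a fixed seed and cut them into train/test parts."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * (1 - test_fraction))
    train, test = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def benchmark(services, rows, labels, seed=0):
    """Run every service on the same local split and evaluate locally."""
    train_x, train_y, test_x, test_y = split(rows, labels, seed=seed)
    report = {}
    for name, service in services.items():
        t0 = time.perf_counter()
        service.train(train_x, train_y)
        t1 = time.perf_counter()
        predicted = service.predict(test_x)
        t2 = time.perf_counter()
        hits = sum(p == y for p, y in zip(predicted, test_y))
        report[name] = {
            "accuracy": hits / len(test_y),
            "training_time": t1 - t0,
            "test_time": t2 - t1,
        }
    return report
```

Because every service sees exactly the same split and is scored by the same local code, the numbers are comparable across services. A real adapter would make its API calls inside `train` and `predict`, so the timings would then include network latency.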
64. API standards?!
• Python de facto standard: scikit-learn
  • “Sparkit-learn aims to provide scikit-learn functionality and API on PySpark. The main goal of the library is to create an API that stays close to sklearn’s.”
• REST standard: PSI (Protocols & Structures for Inference)
  • Pretty similar to BigML API!
  • Implementation for scikit available
• Easy benchmarking! Ensembles!
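To illustrate why a shared interface makes ensembles easy: once every model answers the same `predict(rows)` call, combining them takes only a few lines. The sketch below assumes such a common interface. It does not implement PSI or any real API, and the `Threshold` class is a made-up stand-in model.

```python
from collections import Counter


class Threshold:
    """Hypothetical stand-in model: labels a row by comparing its first feature to a cutoff."""

    def __init__(self, cutoff):
        self.cutoff = cutoff

    def predict(self, rows):
        return ["high" if row[0] > self.cutoff else "low" for row in rows]


def ensemble_predict(models, rows):
    """Majority vote per input row across models that share predict(rows)."""
    # zip(*...) regroups the per-model prediction lists into per-row vote tuples
    votes = zip(*(model.predict(rows) for model in models))
    return [Counter(row_votes).most_common(1)[0][0] for row_votes in votes]


models = [Threshold(1), Threshold(5), Threshold(9)]
combined = ensemble_predict(models, [[0], [3], [7], [10]])
```

With an odd number of models and two labels there are no ties. In other cases `Counter.most_common` breaks ties in favour of the label it saw first, which may or may not be the behaviour you want.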
65. Getting started
• VM with Jupyter notebooks (Python & Bash)
• API wrappers preinstalled: BigML & Google Prediction API
• Notebook for easy setup of credentials
• Scikit-learn and Pandas preinstalled
• Open source VM provisioning script & notebooks
• Search public Snaps on terminal.com: “machine learning”