BigML, Inc. ML Crash Course - API/WhizzML/Predictive Apps
BigML Architecture
• Tools
• REST API
• Distributed Machine Learning Backend
  • Source Server
  • Dataset Server
  • Model Server
  • Prediction Server
  • Sample Server
  • WhizzML Server
  • Evaluation Server
• Web-based Frontend
• Visualizations
• Smart Infrastructure (auto-deployable, auto-scalable)
The Need for an ML API
• Workflow Automation - reduce drudgery
• Abstraction - reuse code
• Composability - powerful combinations of APIs
• Integration - Dashboard or UI component
• Automate deployment
• Repeatable results
Predictive Applications
[Flowchart. Exploration loop: Define ML Problem; Collect & Format Data; ETL; Explore; Model & Evaluate; Goal Met? (yes / no). On "no": feature engineer, tune algorithm, or Not Possible. Automation path: Collect & Format Data; Model; Automate; Consume & Monitor (Predict, Score, Label, Drift & Anomaly).]
BigML API Endpoint
https://bigml.io/{path}/{resource}/{id}?{auth}
• {resource}: source, dataset, model, ensemble, prediction, batchprediction, evaluation, …
• {path}: andromeda, dev, or dev/andromeda
• Path elements:
• /andromeda specifies the API version (optional)
• /dev specifies development mode
• if not specified, then latest API in production mode
• {id} is required for PUT and DELETE
• {auth} contains url parameters username and api_key
• api_key can be an alternative key
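The path elements above can be assembled mechanically. The following is a minimal sketch, not the official client: the function name build_url and its parameters are hypothetical, and joining the two auth parameters with "&" is an assumption (the slide only says that {auth} carries username and api_key).

```python
from urllib.parse import urlencode

BASE_URL = "https://bigml.io"


def build_url(resource, username, api_key, resource_id=None,
              version=None, dev=False):
    """Assemble an endpoint URL from the path elements on the slide.

    All parameter names are illustrative; only the path layout comes
    from the slide.
    """
    parts = [BASE_URL]
    if dev:
        parts.append("dev")          # /dev selects development mode
    if version:
        parts.append(version)        # e.g. "andromeda"; optional
    parts.append(resource)           # source, dataset, model, ...
    if resource_id:
        parts.append(resource_id)    # required for PUT and DELETE
    # Assumption: standard "&"-separated query string for the credentials.
    auth = urlencode({"username": username, "api_key": api_key})
    return "/".join(parts) + "?" + auth


# Latest API in production mode (no version, no /dev).
print(build_url("source", "alice", "KEY"))
# Development mode with an explicit API version and a resource id.
print(build_url("model", "alice", "KEY", resource_id="ID",
                version="andromeda", dev=True))
```

The username "alice", the key "KEY", and the id "ID" are placeholders.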
BigML API Endpoint
Request {JSON} → https://bigml.io/... → Response {JSON}
Operation | HTTP Method | Semantics
CREATE | POST | Creates a new resource. Returns a JSON document including a unique identifier.
RETRIEVE | GET | Retrieves either a specific resource or a list of resources.
UPDATE | PUT | Updates a resource. Only certain fields are putable.
DELETE | DELETE | Deletes a resource.
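To make the mapping from operations to HTTP methods concrete, here is a small sketch using only the Python standard library. It builds request objects without sending them. The helper name make_request, the example URL, and the payload are illustrative assumptions, not part of the slides.

```python
import json
from urllib.request import Request

# Operation -> HTTP method, as listed on the slide.
HTTP_METHODS = {
    "create": "POST",
    "retrieve": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def make_request(operation, url, payload=None):
    """Build (but do not send) the HTTP request for one operation."""
    method = HTTP_METHODS[operation]
    data = None
    headers = {}
    if payload is not None:
        # Request bodies are JSON documents.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical URL and payload, for illustration only.
req = make_request(
    "create",
    "https://bigml.io/dataset?username=alice&api_key=KEY",
    {"source": "source/ID"},
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without network access or credentials.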
Python Binding Overview
Operation | HTTP Method | Binding Method
CREATE | POST | api.create_<resource>(from, {opts})
RETRIEVE | GET | api.get_<resource>(id, {opts}), api.list_<resource>({opts})
UPDATE | PUT | api.update_<resource>(id, {opts})
DELETE | DELETE | api.delete_<resource>(id)
• Where <resource> is one of: source, dataset, model, ensemble, evaluation, etc
• id is a resource identifier or resource dict
• from is a resource identifier, dict, or string depending on context
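As a rough illustration of how the create_<resource> calls chain together, the sketch below takes an already constructed api object and walks source → dataset → model → prediction. The function name train_and_predict, the file name, and the input field are hypothetical; the api object is passed in so the sketch can be exercised with a stub instead of a live account.

```python
def train_and_predict(api, csv_path, input_data):
    """Chain create_<resource> calls: source -> dataset -> model -> prediction."""
    source = api.create_source(csv_path)    # here `from` is a file path string
    api.ok(source)                          # wait until the resource is finished
    dataset = api.create_dataset(source)    # here `from` is a resource dict
    api.ok(dataset)
    model = api.create_model(dataset)
    api.ok(model)
    return api.create_prediction(model, input_data)


# Typical use with the Python bindings (requires the bigml package and
# credentials, so it is shown as comments only):
#
#   from bigml.api import BigML
#   api = BigML()  # reads BIGML_USERNAME and BIGML_API_KEY from the environment
#   prediction = train_and_predict(api, "diabetes.csv", {"plasma glucose": 130})
```

Each call returns a resource dict, which is why the result of one step can be handed directly to the next as its `from` argument.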
Diabetes Anomalies
[Workflow diagram nodes: DIABETES SOURCE; DIABETES DATASET; TRAIN SET; TEST SET; ANOMALY DETECTOR; FILTER; CLEAN DATASET; ALL MODEL (shown twice); ALL EVALUATION; CLEAN EVALUATION; COMPARE EVALUATIONS]
WhizzML
• Complete programming language
• Machine Learning operations are first-class citizens
• Server-side execution abstracts infrastructure
• API First! - Everything is composable
• Shareable
A Domain-Specific Language (DSL) for automating Machine Learning workflows.
WhizzML vs API
WhizzML | API / Bindings
Executes server-side | Requires local execution
Zero latency | Every API call has latency
Parallelization built-in | Manual parallelization
Sharing built-in | Manual sharing
Code-agnostic workflows | Code-specific workflows
Workflows can be UI integrated | Workflows external to UI
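To show what "manual parallelization" means on the API/bindings side, here is a minimal sketch that issues several independent creation calls concurrently from the client. The helper name create_in_parallel is hypothetical, and `create` stands for any callable, for example a bound binding method such as api.create_model.

```python
from concurrent.futures import ThreadPoolExecutor


def create_in_parallel(create, inputs, max_workers=4):
    """Call `create` once per input concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `inputs`.
        return list(pool.map(create, inputs))


# Illustrative use with the bindings (not executed here):
#   models = create_in_parallel(api.create_model, [dataset_a, dataset_b])
```

Threads suit this case because each call spends most of its time waiting on the network. Every call still pays the per-request latency noted in the comparison; the threads only overlap the waiting.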
WhizzML vs Flatline
WhizzML | Flatline
Concerned with resources | Concerned with datasets
Turing complete | More specific to features
Optimized for parallelization | Optimized for speed
Redfin Workflow
Sold Homes → Model Predicts Sale Price → Compare List to Prediction
Redfin Workflow
[Workflow diagram: DATASET → FILTER → SOLD HOMES → NEW FEATURES → MODEL; DATASET → FILTER → FORSALE HOMES → NEW FEATURES → BATCH PREDICTION → DEALS DATASET]
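The last step of the Redfin workflow, comparing the list price to the predicted sale price, can be sketched in a few lines. This is an illustration only: the field names list_price and predicted_price, the min_margin parameter, and the sample rows are made up for the example and do not come from the slides.

```python
def find_deals(homes, min_margin=0.0):
    """Return for-sale homes whose predicted price exceeds the list price."""
    deals = []
    for home in homes:
        margin = home["predicted_price"] - home["list_price"]
        if margin > min_margin:
            deals.append({**home, "margin": margin})
    # Largest gap between prediction and list price first.
    return sorted(deals, key=lambda h: h["margin"], reverse=True)


# Made-up rows standing in for batch prediction output.
homes = [
    {"id": "h1", "list_price": 300000, "predicted_price": 340000},
    {"id": "h2", "list_price": 500000, "predicted_price": 450000},
    {"id": "h3", "list_price": 250000, "predicted_price": 260000},
]
print([h["id"] for h in find_deals(homes)])
```

In the workflow on the slide this comparison would run on the batch prediction output for the for-sale homes, producing the deals dataset.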
WhizzML Resources
[Diagram: LIBRARY and SCRIPT; EXECUTION with inputs CITY 1 SOLD HOMES and CITY 1 FORSALE HOMES; output CITY 1 DEALS DATASET]
WhizzML Resources
[Diagram: LIBRARY and SCRIPT; EXECUTION with inputs CITY 2 SOLD HOMES and CITY 2 FORSALE HOMES; output CITY 2 DEALS DATASET]
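The two "WhizzML Resources" slides show one script being executed once per city with different inputs. A hedged sketch of that pattern with the Python bindings follows. It assumes the bindings' create_execution call accepts a script and an "inputs" list of [name, value] pairs; the input names "sold-homes" and "forsale-homes" and the function run_script_per_city are hypothetical.

```python
def run_script_per_city(api, script, cities):
    """Create one execution of the same script for each city's datasets.

    `cities` maps a city name to a (sold_dataset_id, forsale_dataset_id) pair.
    """
    executions = {}
    for city, (sold_id, forsale_id) in cities.items():
        # Assumption: the script declares two inputs with these names.
        inputs = [["sold-homes", sold_id], ["forsale-homes", forsale_id]]
        execution = api.create_execution(script, {"inputs": inputs})
        api.ok(execution)  # wait for the server-side run to finish
        executions[city] = execution
    return executions
```

The script is written once; only the inputs change between executions, so adding a third city means adding one more entry to `cities`.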
Scriptify
• "Reifies" a resource into a WhizzML script.
• Rapid prototyping meets automation.
WhizzML FE
Worth More
Worth Less
Best-First Features
Round 1: candidates {F1} {F2} {F3} {F4} … {Fn}
  CHOOSE BEST → S = {Fa}
Round 2: candidates S+{F1} S+{F2} S+{F3} S+{F4} … S+{Fn-1}
  CHOOSE BEST → S = {Fa, Fb}
Round 3: candidates S+{F1} S+{F2} S+{F3} S+{F4} … S+{Fn-1}
  CHOOSE BEST → S = {Fa, Fb, Fc}
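The rounds above amount to greedy forward selection: in each round every remaining feature is tried together with the current set S, and the best-scoring one is added. A minimal sketch follows. The function name, the `score` callable, and the stopping rule (stop when no candidate improves the score, or when max_features is reached) are assumptions; the slide only shows the repeated "choose best" step.

```python
def best_first_features(features, score, max_features=None):
    """Greedy forward selection: grow S one best feature at a time.

    `score(subset)` must return a number where higher is better, for example
    an evaluation metric of a model trained on that subset of features.
    """
    selected = []
    best = float("-inf")
    remaining = list(features)
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Score S + {F} for every feature F not yet in S.
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # assumption: stop when no candidate improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


# Toy scoring function: each feature contributes a fixed, made-up amount.
weights = {"a": 3, "b": 2, "c": -1, "d": 0.5}
print(best_first_features(list(weights), lambda fs: sum(weights[f] for f in fs)))
```

With n features, round k evaluates n - k + 1 candidate sets, so a real `score` that trains and evaluates a model makes the number of rounds the main cost driver.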
Model Selection
[Workflow diagram: SOURCE → DATASET → TRAINING / TEST; candidates MODEL, ENSEMBLE, and LOGISTIC REGRESSION, each with an EVALUATION; CHOOSE]
Model Tuning
[Workflow diagram: SOURCE → DATASET → TRAINING / TEST; candidates ENSEMBLE N=10, ENSEMBLE N=20, and ENSEMBLE N=1000, each with an EVALUATION; CHOOSE]
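Both the model selection and the model tuning diagrams end in the same CHOOSE step: evaluate each candidate on the test set and keep the best. A minimal sketch of that step is below. The name choose_best and the `evaluate` callable are hypothetical; `evaluate` stands in for "train on TRAINING, evaluate on TEST, return one metric where higher is better".

```python
def choose_best(candidates, evaluate):
    """Evaluate every candidate and return (best_candidate, best_score)."""
    scored = [(candidate, evaluate(candidate)) for candidate in candidates]
    # Keep the pair with the highest evaluation score.
    return max(scored, key=lambda pair: pair[1])


# Illustrative use: candidates are the ensemble sizes from the tuning slide.
# A real `evaluate` would build an ensemble of that size and evaluate it.
#   best_n, best_metric = choose_best([10, 20, 1000], evaluate)
```

For model selection the candidates would be algorithm choices (model, ensemble, logistic regression) instead of sizes; the CHOOSE step itself does not change.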
SMACdown
• How many models?
• How many nodes?
• Missing splits or not?
• Number of random candidates?
• Balance the objective?
SMACdown can tell you!
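To give a feel for what automatic parameter search over these questions looks like, here is a sketch that uses plain random search. This is not SMACdown's algorithm; random search is swapped in as a simple stand-in. The parameter names and value lists are illustrative assumptions that mirror the questions above, and `evaluate` is a hypothetical callable returning a score where higher is better.

```python
import random

# Hypothetical search space mirroring the questions on the slide.
SEARCH_SPACE = {
    "number_of_models": [10, 20, 50, 100],   # how many models?
    "node_threshold": [64, 256, 512, 1024],  # how many nodes?
    "missing_splits": [True, False],         # missing splits or not?
    "random_candidates": [2, 4, 8],          # number of random candidates?
    "balance_objective": [True, False],      # balance the objective?
}


def random_search(evaluate, space=SEARCH_SPACE, trials=20, seed=0):
    """Sample `trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice `evaluate` would train an ensemble with the sampled parameters and return an evaluation metric, so the number of trials directly bounds the cost of the search.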
Path to Automatic ML
[Timeline: Automation increasing over time, 2011 to Spring 2016]
A. Programmable Infrastructure: Sauron
  • Automatic deployment and auto-scaling
B. REST API: Wintermute
  • Distributed Machine Learning Framework
C. Data Generation and Filtering: Flatline
  • DSL for transformation and new field generation
D. Workflow Automation: WhizzML
  • DSL for programmable workflows
E. Automatic Model Selection: SMACdown
  • Automatic parameter optimization
Why WhizzML
• Automation is critical to fulfilling the promise of ML
• WhizzML can create workflows that:
  • Automate repetitive tasks.
  • Automate model tuning and feature selection.
  • Combine ML models into more powerful algorithms.
  • Create shareable and re-usable executions.