SlideShare a Scribd company logo
Automating Machine Learning
Advanced Workflows and WhizzML
#BSSML16
December 2016
#BSSML16 Automating Machine Learning December 2016 1 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 2 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 3 / 46
Client-side Machine Learning Automation
Problems of client-side solutions
Complexity Lots of details outside the problem domain
Reuse No inter-language compatibility
Scalability Client-side workflows hard to optimize
Extensibility Bigmler hides complexity at the cost of flexibility
Not enough abstraction
#BSSML16 Automating Machine Learning December 2016 4 / 46
Machine Learning Automation for real
Solution (complexity, reuse): Domain-specific languages
#BSSML16 Automating Machine Learning December 2016 5 / 46
Machine Learning Automation for real
Solution (scalability, reuse): Back to the server
#BSSML16 Automating Machine Learning December 2016 6 / 46
Machine Learning Automation for real
Solution (scalability, reuse): Back to the server
#BSSML16 Automating Machine Learning December 2016 6 / 46
WhizzML in a Nutshell
• Domain-specific language for ML workflow automation
High-level problem and solution specification
• Framework for scalable, remote execution of ML workflows
Sophisticated server-side optimization
Out-of-the-box scalability
Client-server brittleness removed
Infrastructure for creating and sharing ML scripts and libraries
#BSSML16 Automating Machine Learning December 2016 7 / 46
WhizzML REST Resources
Library Reusable building-block: a collection of
WhizzML definitions that can be imported by
other libraries or scripts.
Script Executable code that describes an actual
workflow.
• Imports List of libraries with code used by
the script.
• Inputs List of input values that
parameterize the workflow.
• Outputs List of values computed by the
script and returned to the user.
Execution Given a script and a complete set of inputs,
the workflow can be executed and its outputs
generated.
#BSSML16 Automating Machine Learning December 2016 8 / 46
Different ways to create WhizzML Scripts/Libraries
Github
Script editor
Gallery
Other scripts
Scriptify
−→
#BSSML16 Automating Machine Learning December 2016 9 / 46
Basic workflow in WhizzML
(let (dataset (create-dataset source)
cluster (create-cluster dataset))
(create-batchcentroid dataset
cluster
{"output_dataset" true
"all_fields" true}))
#BSSML16 Automating Machine Learning December 2016 10 / 46
Basic workflow in WhizzML: Usable by any binding
from bigml.api import BigML
api = BigML()
# choose workflow
script = 'script/567b4b5be3f2a123a690ff56'
# define parameters
inputs = {'source': 'source/5643d345f43a234ff2310a3e'}
# execute
api.ok(api.create_execution(script, inputs))
#BSSML16 Automating Machine Learning December 2016 11 / 46
Basic workflow in WhizzML: Trivial parallelization
;; Workflow for 1 resource
(let (dataset (create-dataset source)
cluster (create-cluster dataset))
(create-batchcentroid dataset
cluster
{"output_dataset" true
"all_fields" true}))
#BSSML16 Automating Machine Learning December 2016 12 / 46
Basic workflow in WhizzML: Trivial parallelization
;; Workflow for any number of resources
(let (datasets (map create-dataset sources)
clusters (map create-cluster datasets)
params {"output_dataset" true "all_fields" true})
(map (lambda (d c) (create-batchcentroid d c params))
datasets
clusters))
#BSSML16 Automating Machine Learning December 2016 13 / 46
Basic workflows in WhizzML: automatic generation
#BSSML16 Automating Machine Learning December 2016 14 / 46
Standard functions
• Numeric and relational operators (+, *, <, =, ...)
• Mathematical functions (cos, sinh, floor ...)
• Strings and regular expressions (str, matches?, replace, ...)
• Flatline generation
• Collections: list traversal, sorting, map manipulation
• BigML resources manipulation
Creation create-source, create-and-wait-dataset, etc.
Retrieval fetch, list-anomalies, etc.
Update update
Deletion delete
• Machine Learning Algorithms (SMACDown, Boosting, etc.)
#BSSML16 Automating Machine Learning December 2016 15 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 16 / 46
Model or Ensemble?
• Split a dataset in test and training parts
• Create a model and an ensemble with the training dataset
• Evaluate both with the test dataset
• Choose the one with better evaluation (f-measure)
https://github.com/whizzml/examples/tree/master/model-or-ensemble
#BSSML16 Automating Machine Learning December 2016 17 / 46
Model or Ensemble?
;; Functions for creating the two dataset parts
;; Sample a dataset taking a fraction of its rows (rate) and
;; keeping either that fraction (out-of-bag? false) or its
;; complement (out-of-bag? true)
(define (sample-dataset origin-id rate out-of-bag?)
(create-dataset {"origin_dataset" origin-id
"sample_rate" rate
"out_of_bag" out-of-bag?
"seed" "example-seed-0001"})))
;; Create in parallel two halves of a dataset using
;; the sample function twice. Return a list of the two
;; new dataset ids.
(define (split-dataset origin-id rate)
(list (sample-dataset origin-id rate false)
(sample-dataset origin-id rate true)))
#BSSML16 Automating Machine Learning December 2016 18 / 46
Model or Ensemble?
;; Functions to create an ensemble and extract the f-measure from
;; evaluation, given its id.
(define (make-ensemble ds-id size)
(create-ensemble ds-id {"number_of_models" size}))
(define (f-measure ev-id)
(let (ev-id (wait ev-id) ;; because fetch doesn't wait
evaluation (fetch ev-id))
(evaluation ["result" "model" "average_f_measure"]))
#BSSML16 Automating Machine Learning December 2016 19 / 46
Model or Ensemble?
;; Function encapsulating the full workflow
(define (model-or-ensemble src-id)
(let (ds-id (create-dataset {"source" src-id})
[train-id test-id] (split-dataset ds-id 0.8)
m-id (create-model train-id)
e-id (make-ensemble train-id 15)
m-f (f-measure (create-evaluation m-id test-id))
e-f (f-measure (create-evaluation e-id test-id)))
(log-info "model f " m-f " / ensemble f " e-f)
(if (> m-f e-f) m-id e-id)))
;; Compute the result of the script execution
;; - Inputs: [{"name": "input-source-id", "type": "source-id"}]
;; - Outputs: [{"name": "result", "type": "resource-id"}]
(define result (model-or-ensemble input-source-id))
#BSSML16 Automating Machine Learning December 2016 20 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 21 / 46
Transforming item counts to features
basket milk eggs flour salt chocolate caviar
milk,eggs Y Y N N N N
milk,flour Y N Y N N N
milk,flour,eggs Y Y Y N N N
chocolate N N N N Y N
#BSSML16 Automating Machine Learning December 2016 22 / 46
Item counts to features with Flatline
(if (contains-items? "basket" "milk") "Y" "N")
(if (contains-items? "basket" "eggs") "Y" "N")
(if (contains-items? "basket" "flour") "Y" "N")
(if (contains-items? "basket" "salt") "Y" "N")
(if (contains-items? "basket" "chocolate") "Y" "N")
(if (contains-items? "basket" "caviar") "Y" "N")
Parameterized code generation
Field name
Item values
Y/N category names
#BSSML16 Automating Machine Learning December 2016 23 / 46
Flatline code generation with WhizzML
"(if (contains-items? "basket" "milk") "Y" "N")"
#BSSML16 Automating Machine Learning December 2016 24 / 46
Flatline code generation with WhizzML
"(if (contains-items? "basket" "milk") "Y" "N")"
(let (field "basket"
item "milk"
yes "Y"
no "N")
(flatline "(if (contains-items? {{field}} {{item}})"
"{{yes}}"
"{{no}})"))
#BSSML16 Automating Machine Learning December 2016 24 / 46
Flatline code generation with WhizzML
"(if (contains-items? "basket" "milk") "Y" "N")"
(let (field "basket"
item "milk"
yes "Y"
no "N")
(flatline "(if (contains-items? {{field}} {{item}})"
"{{yes}}"
"{{no}})"))
(define (field-flatline field item yes no)
(flatline "(if (contains-items? {{field}} {{item}})"
"{{yes}}"
"{{no}})"))
#BSSML16 Automating Machine Learning December 2016 24 / 46
Flatline code generation with WhizzML
(define (field-flatline field item yes no)
(flatline "(if (contains-items? {{field}} {{item}})"
"{{yes}}"
"{{no}})"))
(define (item-fields field items yes no)
(for (item items)
{"field" (field-flatline field item yes no)}))
(define (dataset-item-fields ds-id field)
(let (ds (fetch ds-id)
item-dist (ds ["fields" field "summary" "items"])
items (map head item-dist))
(item-fields field items "Y" "N")))
#BSSML16 Automating Machine Learning December 2016 25 / 46
Flatline code generation with WhizzML
(define output-dataset
(let (fs {"new_fields" (dataset-item-fields input-dataset
field)})
(create-dataset input-dataset fs)))
{"inputs": [{"name": "input-dataset",
"type": "dataset-id",
"description": "The input dataset"},
{"name": "field",
"type": "string",
"description": "Id of the items field"}],
"outputs": [{"name": "output-dataset",
"type": "dataset-id",
"description": "The id of the generated dataset"}]}
#BSSML16 Automating Machine Learning December 2016 26 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 27 / 46
What Do We Know About WhizzML?
• It’s a complete programming language
• Machine learning “operations” are first-class
• Those operations are performed in BigML’s backend
One-line of code to perform API requests
We get scale “for free”
• Everything is Composable
Functions
Libraries
The Web Interface
#BSSML16 Automating Machine Learning December 2016 28 / 46
What Can We Do With It?
• Non-trivial Model Selection
n-fold cross validation
Comparison of model types (tree, ensemble, logistic)
• Automation of Drudgery
One-click retraining/validation
Standarized dataset transformations / cleaning
• Sure, but what else?
#BSSML16 Automating Machine Learning December 2016 29 / 46
Algorithms as Workflows
• Many ML algorithms can be thought of as workflows
• In these algorithms, machine learning operations are the
primitives
Make a model
Make a prediction
Evaluate a model
• Many such algorithms can be implemented in WhizzML
Reap the advantages of BigML’s infrastructure
Once implemented, it is language-agnostic
#BSSML16 Automating Machine Learning December 2016 30 / 46
Examples: Stacked Generalization
#BSSML16 Automating Machine Learning December 2016 31 / 46
Examples: Randomized Parameter Optimization
#BSSML16 Automating Machine Learning December 2016 32 / 46
Examples: SMACdown
#BSSML16 Automating Machine Learning December 2016 33 / 46
Examples: SMACdown
Objective: Find the best set of parameters even more quickly!
• Do:
Generate several random sets of parameters for an ML algorithm
Do 10-fold cross-validation with those parameters
Learn a predictive model to predict performance from parameter
values
Use the model to help you select the next set of parameters to
evaluate
• Until you get a set of parameters that performs “well” or you get
bored
#BSSML16 Automating Machine Learning December 2016 34 / 46
Examples: Boosting
• General idea: Iteratively model the dataset
Each iteration is trained on the mistakes of previous iterations
Said another way, the objective changes each iteration
The final model is a summation of all iterations
• Lots of variations on this theme
Adaboost
Logitboost
Martingale Boosting
Gradient Boosting
• Let’s take a look at a WhizzML implementation of the latter
#BSSML16 Automating Machine Learning December 2016 35 / 46
Examples: Boosting
#BSSML16 Automating Machine Learning December 2016 36 / 46
Outline
1 Server-side workflows: WhizzML
2 Basic Workflow: Model or ensemble?
3 Case study: Using Flatline in Whizzml
4 Advanced Workflows
5 Case Study: Stacked Generalization in WhizzML
#BSSML16 Automating Machine Learning December 2016 37 / 46
Examples: Stacked Generalization
#BSSML16 Automating Machine Learning December 2016 38 / 46
Stacked generalization
Objective: Improve predictions by modeling the output scores of
multiple trained models.
• Create a training and a holdout set
• Create n different models on the training set (with some difference
among them; e.g., single-tree vs. ensemble vs. logistic regression)
• Make predictions from those models on the holdout set
• Train a model to predict the class based on the other models’
predictions
#BSSML16 Automating Machine Learning December 2016 39 / 46
A Stacked generalization library: creating the stack
;; Splits the given dataset, using half of it to create
;; an heterogeneous collection of models and the other
;; half to train a tree that predicts based on those other
;; models predictions. Returns a map with the collection
;; of models (under the key "models") and the meta-prediction
;; as the value of the key "metamodel". The key "result"
;; has as value a boolean flag indicating whether the
;; process was successful.
(define (make-stack dataset-id)
(let ([train-id hold-id] (create-random-dataset-split dataset-id 0.5)
models (create-stack-models train-id)
id (create-stack-predictions models hold-id)
orig-fields (model-inputs (head models))
obj-id (dataset-get-objective-id train-id)
meta-id (create-model {"dataset" id
"excluded_fields" orig-fields
"objective_field" obj-id})
success? (resource-done? (fetch (wait meta-id))))
{"models" models "metamodel" meta-id "result" success?}))
#BSSML16 Automating Machine Learning December 2016 40 / 46
A Stacked generalization library: using the stack
;; Use the models and metamodels computed by make-stack
;; to make a prediction on the input-data map. Returns
;; the identifier of the prediction object.
(define (make-stack-prediction models meta-model input-data)
(let (preds (map (lambda (m) (create-prediction {"model" m
"input_data" input-data}))
models)
preds (map (lambda (p)
(head (values ((fetch p) "prediction"))))
preds)
meta-input (make-map (model-inputs meta-model) preds))
(create-prediction {"model" meta-model "input_data" meta-input})))
#BSSML16 Automating Machine Learning December 2016 41 / 46
A Stacked generalization library: auxiliary functions
;; Extract for a batchpredction its associated dataset of results
(define (batch-dataset id)
(wait ((fetch id) "output_dataset_resource")))
;; Create a batchprediction for the given model and datasets,
;; with a map of additional options and using defaults appropriate
;; for model stacking
(define (make-batch ds-id mod-id)
(let (name (resource-type mod-id))
(create-batchprediction ds-id mod-id {"all_fields" true
"output_dataset" true
"prediction_name" name})))
;; Auxiliary function extracting the model_inputs of a model
(define (model-inputs mod-id)
(fetch mod-id) "input_fields"))
#BSSML16 Automating Machine Learning December 2016 42 / 46
A Stacked generalization library: auxiliary functions
;; Auxiliary function to create the set of stack models
(define (create-stack-models train-id)
[(create-model {"dataset" train-id})
(create-ensemble {"dataset" train-id
"number_of_models" 20
"randomize" false})
(create-ensemble {"dataset" train-id
"number_of_models" 20
"randomize" true})
(create-logisticregression {"dataset" train-id})])
;; Auxiliary funtion to successively create batchpredictions using the
;; given models over the initial dataset ds-id. Returns the final
;; dataset id.
(define (create-stack-predictions models ds-id)
(reduce (lambda (did mid)
(batch-dataset (make-batch did mid)))
ds-id
models))
#BSSML16 Automating Machine Learning December 2016 43 / 46
A Stacked generalization library: creating the stack
;; Splits the given dataset, using half of it to create
;; an heterogeneous collection of models and the other
;; half to train a tree that predicts based on those other
;; models predictions. Returns a map with the collection
;; of models (under the key "models") and the meta-prediction
;; as the value of the key "metamodel". The key "result"
;; has as value a boolean flag indicating whether the
;; process was successful.
(define (make-stack dataset-id)
(let ([train-id hold-id] (create-random-dataset-split dataset-id 0.5)
models (create-stack-models train-id)
id (create-stack-predictions models hold-id)
orig-fields (model-inputs (head models))
obj-id (dataset-get-objective-id train-id)
meta-id (create-model {"dataset" id
"excluded_fields" orig-fields
"objective_field" obj-id})
success? (resource-done? (fetch (wait meta-id))))
{"models" models "metamodel" meta-id "result" success?}))
#BSSML16 Automating Machine Learning December 2016 44 / 46
Library-based scripts
Script for creating the models
(define stack (make-stack dataset-id))
Script for predictions using the stack
(define (make-prediction exec-id input-data)
(let (exec (fetch exec-id)
stack (nth (head (get-in exec ["execution" "outputs"])) 1)
models (get stack "models")
metamodel (get stack "metamodel"))
(when (get stack "result")
(try (make-stack-prediction models metamodel {})
(catch e (log-info "Error: " e) false)))))
(define prediction-id (make-prediction exec-id input-data))
(define prediction (when prediction-id (fetch prediction-id)))
https://github.com/whizzml/examples/tree/master/stacked-generalizati
#BSSML16 Automating Machine Learning December 2016 45 / 46
Questions?
#BSSML16 Automating Machine Learning December 2016 46 / 46

More Related Content

What's hot

VSSML16 L7. REST API, Bindings, and Basic Workflows
VSSML16 L7. REST API, Bindings, and Basic WorkflowsVSSML16 L7. REST API, Bindings, and Basic Workflows
VSSML16 L7. REST API, Bindings, and Basic Workflows
BigML, Inc
 
A developer's overview of the world of predictive APIs
A developer's overview of the world of predictive APIsA developer's overview of the world of predictive APIs
A developer's overview of the world of predictive APIs
Louis Dorard
 
BSSML17 - API and WhizzML
BSSML17 - API and WhizzMLBSSML17 - API and WhizzML
BSSML17 - API and WhizzML
BigML, Inc
 
BSSML17 - Feature Engineering
BSSML17 - Feature EngineeringBSSML17 - Feature Engineering
BSSML17 - Feature Engineering
BigML, Inc
 
BigML Fall 2016 Release
BigML Fall 2016 ReleaseBigML Fall 2016 Release
BigML Fall 2016 Release
BigML, Inc
 
VSSML17 L5. Basic Data Transformations and Feature Engineering
VSSML17 L5. Basic Data Transformations and Feature EngineeringVSSML17 L5. Basic Data Transformations and Feature Engineering
VSSML17 L5. Basic Data Transformations and Feature Engineering
BigML, Inc
 
VSSML17 Review. Summary Day 2 Sessions
VSSML17 Review. Summary Day 2 SessionsVSSML17 Review. Summary Day 2 Sessions
VSSML17 Review. Summary Day 2 Sessions
BigML, Inc
 
VSSML18. Feature Engineering
VSSML18. Feature EngineeringVSSML18. Feature Engineering
VSSML18. Feature Engineering
BigML, Inc
 
MLSD18. Feature Engineering
MLSD18. Feature EngineeringMLSD18. Feature Engineering
MLSD18. Feature Engineering
BigML, Inc
 
BigML Summer 2017 Release
BigML Summer 2017 ReleaseBigML Summer 2017 Release
BigML Summer 2017 Release
BigML, Inc
 
VSSML18. Data Transformations
VSSML18. Data TransformationsVSSML18. Data Transformations
VSSML18. Data Transformations
BigML, Inc
 
BSSML17 - Ensembles
BSSML17 - EnsemblesBSSML17 - Ensembles
BSSML17 - Ensembles
BigML, Inc
 
Big Data, Bigger Analytics
Big Data, Bigger AnalyticsBig Data, Bigger Analytics
Big Data, Bigger Analytics
Itzhak Kameli
 
VSSML18. Introduction to WhizzML
VSSML18. Introduction to WhizzMLVSSML18. Introduction to WhizzML
VSSML18. Introduction to WhizzML
BigML, Inc
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
Turi, Inc.
 
Graph Analytics for big data
Graph Analytics for big dataGraph Analytics for big data
Graph Analytics for big data
Sigmoid
 
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Turi, Inc.
 
DutchMLSchool. ML Automation
DutchMLSchool. ML AutomationDutchMLSchool. ML Automation
DutchMLSchool. ML Automation
BigML, Inc
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
Marko Rodriguez
 
MLSD18. Ensembles, Logistic Regression, Deepnets
MLSD18. Ensembles, Logistic Regression, DeepnetsMLSD18. Ensembles, Logistic Regression, Deepnets
MLSD18. Ensembles, Logistic Regression, Deepnets
BigML, Inc
 

What's hot (20)

VSSML16 L7. REST API, Bindings, and Basic Workflows
VSSML16 L7. REST API, Bindings, and Basic WorkflowsVSSML16 L7. REST API, Bindings, and Basic Workflows
VSSML16 L7. REST API, Bindings, and Basic Workflows
 
A developer's overview of the world of predictive APIs
A developer's overview of the world of predictive APIsA developer's overview of the world of predictive APIs
A developer's overview of the world of predictive APIs
 
BSSML17 - API and WhizzML
BSSML17 - API and WhizzMLBSSML17 - API and WhizzML
BSSML17 - API and WhizzML
 
BSSML17 - Feature Engineering
BSSML17 - Feature EngineeringBSSML17 - Feature Engineering
BSSML17 - Feature Engineering
 
BigML Fall 2016 Release
BigML Fall 2016 ReleaseBigML Fall 2016 Release
BigML Fall 2016 Release
 
VSSML17 L5. Basic Data Transformations and Feature Engineering
VSSML17 L5. Basic Data Transformations and Feature EngineeringVSSML17 L5. Basic Data Transformations and Feature Engineering
VSSML17 L5. Basic Data Transformations and Feature Engineering
 
VSSML17 Review. Summary Day 2 Sessions
VSSML17 Review. Summary Day 2 SessionsVSSML17 Review. Summary Day 2 Sessions
VSSML17 Review. Summary Day 2 Sessions
 
VSSML18. Feature Engineering
VSSML18. Feature EngineeringVSSML18. Feature Engineering
VSSML18. Feature Engineering
 
MLSD18. Feature Engineering
MLSD18. Feature EngineeringMLSD18. Feature Engineering
MLSD18. Feature Engineering
 
BigML Summer 2017 Release
BigML Summer 2017 ReleaseBigML Summer 2017 Release
BigML Summer 2017 Release
 
VSSML18. Data Transformations
VSSML18. Data TransformationsVSSML18. Data Transformations
VSSML18. Data Transformations
 
BSSML17 - Ensembles
BSSML17 - EnsemblesBSSML17 - Ensembles
BSSML17 - Ensembles
 
Big Data, Bigger Analytics
Big Data, Bigger AnalyticsBig Data, Bigger Analytics
Big Data, Bigger Analytics
 
VSSML18. Introduction to WhizzML
VSSML18. Introduction to WhizzMLVSSML18. Introduction to WhizzML
VSSML18. Introduction to WhizzML
 
GraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos GuestrinGraphLab Conference 2014 Keynote - Carlos Guestrin
GraphLab Conference 2014 Keynote - Carlos Guestrin
 
Graph Analytics for big data
Graph Analytics for big dataGraph Analytics for big data
Graph Analytics for big data
 
Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create Conference 2014: Rajat Arya - Deployment with GraphLab Create
Conference 2014: Rajat Arya - Deployment with GraphLab Create
 
DutchMLSchool. ML Automation
DutchMLSchool. ML AutomationDutchMLSchool. ML Automation
DutchMLSchool. ML Automation
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
MLSD18. Ensembles, Logistic Regression, Deepnets
MLSD18. Ensembles, Logistic Regression, DeepnetsMLSD18. Ensembles, Logistic Regression, Deepnets
MLSD18. Ensembles, Logistic Regression, Deepnets
 

Viewers also liked

VSSML16 L5. Basic Data Transformations
VSSML16 L5. Basic Data TransformationsVSSML16 L5. Basic Data Transformations
VSSML16 L5. Basic Data Transformations
BigML, Inc
 
VSSML16 LR1. Summary Day 1
VSSML16 LR1. Summary Day 1VSSML16 LR1. Summary Day 1
VSSML16 LR1. Summary Day 1
BigML, Inc
 
VSSML16 LR2. Summary Day 2
VSSML16 LR2. Summary Day 2VSSML16 LR2. Summary Day 2
VSSML16 LR2. Summary Day 2
BigML, Inc
 
BSSML16 L4. Association Discovery and Topic Modeling
BSSML16 L4. Association Discovery and Topic ModelingBSSML16 L4. Association Discovery and Topic Modeling
BSSML16 L4. Association Discovery and Topic Modeling
BigML, Inc
 
Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...
Machine Learning for Big Data Analytics:  Scaling In with Containers while Sc...Machine Learning for Big Data Analytics:  Scaling In with Containers while Sc...
Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...
Ian Lumb
 
BSSML16 L6. Basic Data Transformations
BSSML16 L6. Basic Data TransformationsBSSML16 L6. Basic Data Transformations
BSSML16 L6. Basic Data Transformations
BigML, Inc
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
BigML, Inc
 
VSSML16 L2. Ensembles and Logistic Regression
VSSML16 L2. Ensembles and Logistic RegressionVSSML16 L2. Ensembles and Logistic Regression
VSSML16 L2. Ensembles and Logistic Regression
BigML, Inc
 
Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service
Drilling Deep with Machine Learning as an Enterprise Enabled Micro ServiceDrilling Deep with Machine Learning as an Enterprise Enabled Micro Service
Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service
Ian Lumb
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
台灣資料科學年會
 

Viewers also liked (10)

VSSML16 L5. Basic Data Transformations
VSSML16 L5. Basic Data TransformationsVSSML16 L5. Basic Data Transformations
VSSML16 L5. Basic Data Transformations
 
VSSML16 LR1. Summary Day 1
VSSML16 LR1. Summary Day 1VSSML16 LR1. Summary Day 1
VSSML16 LR1. Summary Day 1
 
VSSML16 LR2. Summary Day 2
VSSML16 LR2. Summary Day 2VSSML16 LR2. Summary Day 2
VSSML16 LR2. Summary Day 2
 
BSSML16 L4. Association Discovery and Topic Modeling
BSSML16 L4. Association Discovery and Topic ModelingBSSML16 L4. Association Discovery and Topic Modeling
BSSML16 L4. Association Discovery and Topic Modeling
 
Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...
Machine Learning for Big Data Analytics:  Scaling In with Containers while Sc...Machine Learning for Big Data Analytics:  Scaling In with Containers while Sc...
Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...
 
BSSML16 L6. Basic Data Transformations
BSSML16 L6. Basic Data TransformationsBSSML16 L6. Basic Data Transformations
BSSML16 L6. Basic Data Transformations
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
 
VSSML16 L2. Ensembles and Logistic Regression
VSSML16 L2. Ensembles and Logistic RegressionVSSML16 L2. Ensembles and Logistic Regression
VSSML16 L2. Ensembles and Logistic Regression
 
Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service
Drilling Deep with Machine Learning as an Enterprise Enabled Micro ServiceDrilling Deep with Machine Learning as an Enterprise Enabled Micro Service
Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
 

Similar to BSSML16 L9. Advanced Workflows: Feature Selection, Boosting, Gradient Descent, and Stacking

Basic WhizzML Workflows
Basic WhizzML WorkflowsBasic WhizzML Workflows
Basic WhizzML Workflows
BigML, Inc
 
VSSML17 L7. REST API, Bindings, and Basic Workflows
VSSML17 L7. REST API, Bindings, and Basic WorkflowsVSSML17 L7. REST API, Bindings, and Basic Workflows
VSSML17 L7. REST API, Bindings, and Basic Workflows
BigML, Inc
 
Advanced WhizzML Workflows
Advanced WhizzML WorkflowsAdvanced WhizzML Workflows
Advanced WhizzML Workflows
BigML, Inc
 
Intermediate WhizzML Workflows
Intermediate WhizzML WorkflowsIntermediate WhizzML Workflows
Intermediate WhizzML Workflows
BigML, Inc
 
Desing pattern prototype-Factory Method, Prototype and Builder
Desing pattern prototype-Factory Method, Prototype and Builder Desing pattern prototype-Factory Method, Prototype and Builder
Desing pattern prototype-Factory Method, Prototype and Builder
paramisoft
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor
Soham Kulkarni
 
Ch01 basic-java-programs
Ch01 basic-java-programsCh01 basic-java-programs
Ch01 basic-java-programs
James Brotsos
 
Intake 37 linq2
Intake 37 linq2Intake 37 linq2
Intake 37 linq2
Mahmoud Ouf
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured Streaming
Knoldus Inc.
 
Augustus Overview Open Source Analytics
Augustus Overview  Open Source AnalyticsAugustus Overview  Open Source Analytics
Augustus Overview Open Source Analytics
jtrussell
 
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
DataBench
 
Exploratory Analysis of Spark Structured Streaming
Exploratory Analysis of Spark Structured StreamingExploratory Analysis of Spark Structured Streaming
Exploratory Analysis of Spark Structured Streaming
t_ivanov
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Martin Loetzsch
 
AI Library - An Open Source Machine Learning Framework
AI Library - An Open Source Machine Learning FrameworkAI Library - An Open Source Machine Learning Framework
AI Library - An Open Source Machine Learning Framework
MLconf
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
Sebastiano Panichella
 
maxbox starter60 machine learning
maxbox starter60 machine learningmaxbox starter60 machine learning
maxbox starter60 machine learning
Max Kleiner
 
Headache from using mathematical software
Headache from using mathematical softwareHeadache from using mathematical software
Headache from using mathematical software
PVS-Studio
 
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
Chetan Khatri
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
Andrey Karpov
 
Scikit-Learn: Machine Learning in Python
Scikit-Learn: Machine Learning in PythonScikit-Learn: Machine Learning in Python
Scikit-Learn: Machine Learning in Python
Microsoft
 

Similar to BSSML16 L9. Advanced Workflows: Feature Selection, Boosting, Gradient Descent, and Stacking (20)

Basic WhizzML Workflows
Basic WhizzML WorkflowsBasic WhizzML Workflows
Basic WhizzML Workflows
 
VSSML17 L7. REST API, Bindings, and Basic Workflows
VSSML17 L7. REST API, Bindings, and Basic WorkflowsVSSML17 L7. REST API, Bindings, and Basic Workflows
VSSML17 L7. REST API, Bindings, and Basic Workflows
 
Advanced WhizzML Workflows
Advanced WhizzML WorkflowsAdvanced WhizzML Workflows
Advanced WhizzML Workflows
 
Intermediate WhizzML Workflows
Intermediate WhizzML WorkflowsIntermediate WhizzML Workflows
Intermediate WhizzML Workflows
 
Desing pattern prototype-Factory Method, Prototype and Builder
Desing pattern prototype-Factory Method, Prototype and Builder Desing pattern prototype-Factory Method, Prototype and Builder
Desing pattern prototype-Factory Method, Prototype and Builder
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor
 
Ch01 basic-java-programs
Ch01 basic-java-programsCh01 basic-java-programs
Ch01 basic-java-programs
 
Intake 37 linq2
Intake 37 linq2Intake 37 linq2
Intake 37 linq2
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured Streaming
 
Augustus Overview Open Source Analytics
Augustus Overview  Open Source AnalyticsAugustus Overview  Open Source Analytics
Augustus Overview Open Source Analytics
 
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
 
Exploratory Analysis of Spark Structured Streaming
Exploratory Analysis of Spark Structured StreamingExploratory Analysis of Spark Structured Streaming
Exploratory Analysis of Spark Structured Streaming
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
 
AI Library - An Open Source Machine Learning Framework
AI Library - An Open Source Machine Learning FrameworkAI Library - An Open Source Machine Learning Framework
AI Library - An Open Source Machine Learning Framework
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
maxbox starter60 machine learning
maxbox starter60 machine learningmaxbox starter60 machine learning
maxbox starter60 machine learning
 
Headache from using mathematical software
Headache from using mathematical softwareHeadache from using mathematical software
Headache from using mathematical software
 
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
 
Scikit-Learn: Machine Learning in Python
Scikit-Learn: Machine Learning in PythonScikit-Learn: Machine Learning in Python
Scikit-Learn: Machine Learning in Python
 

More from BigML, Inc

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in Manufacturing
BigML, Inc
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - Automation
BigML, Inc
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML Compliance
BigML, Inc
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective Anomalies
BigML, Inc
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector
BigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly Detection
BigML, Inc
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in ML
BigML, Inc
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
BigML, Inc
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven Company
BigML, Inc
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal Sector
BigML, Inc
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe Stadiums
BigML, Inc
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
BigML, Inc
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at Scale
BigML, Inc
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AI
BigML, Inc
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object Detection
BigML, Inc
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image Processing
BigML, Inc
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
BigML, Inc
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail Sector
BigML, Inc
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
BigML, Inc
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
BigML, Inc
 

More from BigML, Inc (20)

Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in Manufacturing
 
DutchMLSchool 2022 - Automation
DutchMLSchool 2022 - AutomationDutchMLSchool 2022 - Automation
DutchMLSchool 2022 - Automation
 
DutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML ComplianceDutchMLSchool 2022 - ML for AML Compliance
DutchMLSchool 2022 - ML for AML Compliance
 
DutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective AnomaliesDutchMLSchool 2022 - Multi Perspective Anomalies
DutchMLSchool 2022 - Multi Perspective Anomalies
 
DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector DutchMLSchool 2022 - My First Anomaly Detector
DutchMLSchool 2022 - My First Anomaly Detector
 
DutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly DetectionDutchMLSchool 2022 - Anomaly Detection
DutchMLSchool 2022 - Anomaly Detection
 
DutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in MLDutchMLSchool 2022 - History and Developments in ML
DutchMLSchool 2022 - History and Developments in ML
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
 
DutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven CompanyDutchMLSchool 2022 - A Data-Driven Company
DutchMLSchool 2022 - A Data-Driven Company
 
DutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal SectorDutchMLSchool 2022 - ML in the Legal Sector
DutchMLSchool 2022 - ML in the Legal Sector
 
DutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe StadiumsDutchMLSchool 2022 - Smart Safe Stadiums
DutchMLSchool 2022 - Smart Safe Stadiums
 
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing PlantsDutchMLSchool 2022 - Process Optimization in Manufacturing Plants
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants
 
DutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at ScaleDutchMLSchool 2022 - Anomaly Detection at Scale
DutchMLSchool 2022 - Anomaly Detection at Scale
 
DutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AIDutchMLSchool 2022 - Citizen Development in AI
DutchMLSchool 2022 - Citizen Development in AI
 
Democratizing Object Detection
Democratizing Object DetectionDemocratizing Object Detection
Democratizing Object Detection
 
BigML Release: Image Processing
BigML Release: Image ProcessingBigML Release: Image Processing
BigML Release: Image Processing
 
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your FutureMachine Learning in Retail: Know Your Customers' Customer. See Your Future
Machine Learning in Retail: Know Your Customers' Customer. See Your Future
 
Machine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail SectorMachine Learning in Retail: ML in the Retail Sector
Machine Learning in Retail: ML in the Retail Sector
 
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a LawyerbotML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
 

Recently uploaded

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 

Recently uploaded (20)

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 

BSSML16 L9. Advanced Workflows: Feature Selection, Boosting, Gradient Descent, and Stacking

  • 1. Automating Machine Learning Advanced Workflows and WhizzML #BSSML16 December 2016 #BSSML16 Automating Machine Learning December 2016 1 / 46
  • 2. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 2 / 46
  • 3. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 3 / 46
  • 4. Client-side Machine Learning Automation Problems of client-side solutions Complexity Lots of details outside the problem domain Reuse No inter-language compatibility Scalability Client-side workflows hard to optimize Extensibility Bigmler hides complexity at the cost of flexibility Not enough abstraction #BSSML16 Automating Machine Learning December 2016 4 / 46
  • 5. Machine Learning Automation for real Solution (complexity, reuse): Domain-specific languages #BSSML16 Automating Machine Learning December 2016 5 / 46
  • 6. Machine Learning Automation for real Solution (scalability, reuse): Back to the server #BSSML16 Automating Machine Learning December 2016 6 / 46
  • 7. Machine Learning Automation for real Solution (scalability, reuse): Back to the server #BSSML16 Automating Machine Learning December 2016 6 / 46
  • 8. WhizzML in a Nutshell • Domain-specific language for ML workflow automation High-level problem and solution specification • Framework for scalable, remote execution of ML workflows Sophisticated server-side optimization Out-of-the-box scalability Client-server brittleness removed Infrastructure for creating and sharing ML scripts and libraries #BSSML16 Automating Machine Learning December 2016 7 / 46
  • 9. WhizzML REST Resources Library Reusable building-block: a collection of WhizzML definitions that can be imported by other libraries or scripts. Script Executable code that describes an actual workflow. • Imports List of libraries with code used by the script. • Inputs List of input values that parameterize the workflow. • Outputs List of values computed by the script and returned to the user. Execution Given a script and a complete set of inputs, the workflow can be executed and its outputs generated. #BSSML16 Automating Machine Learning December 2016 8 / 46
  • 10. Different ways to create WhizzML Scripts/Libraries Github Script editor Gallery Other scripts Scriptify −→ #BSSML16 Automating Machine Learning December 2016 9 / 46
  • 11. Basic workflow in WhizzML (let (dataset (create-dataset source) cluster (create-cluster dataset)) (create-batchcentroid dataset cluster {"output_dataset" true "all_fields" true})) #BSSML16 Automating Machine Learning December 2016 10 / 46
  • 12. Basic workflow in WhizzML: Usable by any binding from bigml.api import BigML api = BigML() # choose workflow script = 'script/567b4b5be3f2a123a690ff56' # define parameters inputs = {'source': 'source/5643d345f43a234ff2310a3e'} # execute api.ok(api.create_execution(script, inputs)) #BSSML16 Automating Machine Learning December 2016 11 / 46
  • 13. Basic workflow in WhizzML: Trivial parallelization ;; Workflow for 1 resource (let (dataset (create-dataset source) cluster (create-cluster dataset)) (create-batchcentroid dataset cluster {"output_dataset" true "all_fields" true})) #BSSML16 Automating Machine Learning December 2016 12 / 46
  • 14. Basic workflow in WhizzML: Trivial parallelization ;; Workflow for any number of resources (let (datasets (map create-dataset sources) clusters (map create-cluster datasets) params {"output_dataset" true "all_fields" true}) (map (lambda (d c) (create-batchcentroid d c params)) datasets clusters)) #BSSML16 Automating Machine Learning December 2016 13 / 46
  • 15. Basic workflows in WhizzML: automatic generation #BSSML16 Automating Machine Learning December 2016 14 / 46
  • 16. Standard functions • Numeric and relational operators (+, *, <, =, ...) • Mathematical functions (cos, sinh, floor ...) • Strings and regular expressions (str, matches?, replace, ...) • Flatline generation • Collections: list traversal, sorting, map manipulation • BigML resources manipulation Creation create-source, create-and-wait-dataset, etc. Retrieval fetch, list-anomalies, etc. Update update Deletion delete • Machine Learning Algorithms (SMACDown, Boosting, etc.) #BSSML16 Automating Machine Learning December 2016 15 / 46
  • 17. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 16 / 46
  • 18. Model or Ensemble? • Split a dataset in test and training parts • Create a model and an ensemble with the training dataset • Evaluate both with the test dataset • Choose the one with better evaluation (f-measure) https://github.com/whizzml/examples/tree/master/model-or-ensemble #BSSML16 Automating Machine Learning December 2016 17 / 46
  • 19. Model or Ensemble? ;; Functions for creating the two dataset parts ;; Sample a dataset taking a fraction of its rows (rate) and ;; keeping either that fraction (out-of-bag? false) or its ;; complement (out-of-bag? true) (define (sample-dataset origin-id rate out-of-bag?) (create-dataset {"origin_dataset" origin-id "sample_rate" rate "out_of_bag" out-of-bag? "seed" "example-seed-0001"}))) ;; Create in parallel two halves of a dataset using ;; the sample function twice. Return a list of the two ;; new dataset ids. (define (split-dataset origin-id rate) (list (sample-dataset origin-id rate false) (sample-dataset origin-id rate true))) #BSSML16 Automating Machine Learning December 2016 18 / 46
  • 20. Model or Ensemble? ;; Functions to create an ensemble and extract the f-measure from ;; evaluation, given its id. (define (make-ensemble ds-id size) (create-ensemble ds-id {"number_of_models" size})) (define (f-measure ev-id) (let (ev-id (wait ev-id) ;; because fetch doesn't wait evaluation (fetch ev-id)) (evaluation ["result" "model" "average_f_measure"])) #BSSML16 Automating Machine Learning December 2016 19 / 46
  • 21. Model or Ensemble? ;; Function encapsulating the full workflow (define (model-or-ensemble src-id) (let (ds-id (create-dataset {"source" src-id}) [train-id test-id] (split-dataset ds-id 0.8) m-id (create-model train-id) e-id (make-ensemble train-id 15) m-f (f-measure (create-evaluation m-id test-id)) e-f (f-measure (create-evaluation e-id test-id))) (log-info "model f " m-f " / ensemble f " e-f) (if (> m-f e-f) m-id e-id))) ;; Compute the result of the script execution ;; - Inputs: [{"name": "input-source-id", "type": "source-id"}] ;; - Outputs: [{"name": "result", "type": "resource-id"}] (define result (model-or-ensemble input-source-id)) #BSSML16 Automating Machine Learning December 2016 20 / 46
  • 22. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 21 / 46
  • 23. Transforming item counts to features basket milk eggs flour salt chocolate caviar milk,eggs Y Y N N N N milk,flour Y N Y N N N milk,flour,eggs Y Y Y N N N chocolate N N N N Y N #BSSML16 Automating Machine Learning December 2016 22 / 46
  • 24. Item counts to features with Flatline (if (contains-items? "basket" "milk") "Y" "N") (if (contains-items? "basket" "eggs") "Y" "N") (if (contains-items? "basket" "flour") "Y" "N") (if (contains-items? "basket" "salt") "Y" "N") (if (contains-items? "basket" "chocolate") "Y" "N") (if (contains-items? "basket" "caviar") "Y" "N") Parameterized code generation Field name Item values Y/N category names #BSSML16 Automating Machine Learning December 2016 23 / 46
  • 25. Flatline code generation with WhizzML "(if (contains-items? "basket" "milk") "Y" "N")" #BSSML16 Automating Machine Learning December 2016 24 / 46
  • 26. Flatline code generation with WhizzML "(if (contains-items? "basket" "milk") "Y" "N")" (let (field "basket" item "milk" yes "Y" no "N") (flatline "(if (contains-items? {{field}} {{item}})" "{{yes}}" "{{no}})")) #BSSML16 Automating Machine Learning December 2016 24 / 46
  • 27. Flatline code generation with WhizzML "(if (contains-items? "basket" "milk") "Y" "N")" (let (field "basket" item "milk" yes "Y" no "N") (flatline "(if (contains-items? {{field}} {{item}})" "{{yes}}" "{{no}})")) (define (field-flatline field item yes no) (flatline "(if (contains-items? {{field}} {{item}})" "{{yes}}" "{{no}})")) #BSSML16 Automating Machine Learning December 2016 24 / 46
  • 28. Flatline code generation with WhizzML (define (field-flatline field item yes no) (flatline "(if (contains-items? {{field}} {{item}})" "{{yes}}" "{{no}})")) (define (item-fields field items yes no) (for (item items) {"field" (field-flatline field item yes no)})) (define (dataset-item-fields ds-id field) (let (ds (fetch ds-id) item-dist (ds ["fields" field "summary" "items"]) items (map head item-dist)) (item-fields field items "Y" "N"))) #BSSML16 Automating Machine Learning December 2016 25 / 46
  • 29. Flatline code generation with WhizzML (define output-dataset (let (fs {"new_fields" (dataset-item-fields input-dataset field)}) (create-dataset input-dataset fs))) {"inputs": [{"name": "input-dataset", "type": "dataset-id", "description": "The input dataset"}, {"name": "field", "type": "string", "description": "Id of the items field"}], "outputs": [{"name": "output-dataset", "type": "dataset-id", "description": "The id of the generated dataset"}]} #BSSML16 Automating Machine Learning December 2016 26 / 46
  • 30. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 27 / 46
  • 31. What Do We Know About WhizzML? • It’s a complete programming language • Machine learning “operations” are first-class • Those operations are performed in BigML’s backend One-line of code to perform API requests We get scale “for free” • Everything is Composable Functions Libraries The Web Interface #BSSML16 Automating Machine Learning December 2016 28 / 46
  • 32. What Can We Do With It? • Non-trivial Model Selection n-fold cross validation Comparison of model types (tree, ensemble, logistic) • Automation of Drudgery One-click retraining/validation Standarized dataset transformations / cleaning • Sure, but what else? #BSSML16 Automating Machine Learning December 2016 29 / 46
  • 33. Algorithms as Workflows • Many ML algorithms can be thought of as workflows • In these algorithms, machine learning operations are the primitives Make a model Make a prediction Evaluate a model • Many such algorithms can be implemented in WhizzML Reap the advantages of BigML’s infrastructure Once implemented, it is language-agnostic #BSSML16 Automating Machine Learning December 2016 30 / 46
  • 34. Examples: Stacked Generalization #BSSML16 Automating Machine Learning December 2016 31 / 46
  • 35. Examples: Randomized Parameter Optimization #BSSML16 Automating Machine Learning December 2016 32 / 46
  • 36. Examples: SMACdown #BSSML16 Automating Machine Learning December 2016 33 / 46
  • 37. Examples: SMACdown Objective: Find the best set of parameters even more quickly! • Do: Generate several random sets of parameters for an ML algorithm Do 10-fold cross-validation with those parameters Learn a predictive model to predict performance from parameter values Use the model to help you select the next set of parameters to evaluate • Until you get a set of parameters that performs “well” or you get bored #BSSML16 Automating Machine Learning December 2016 34 / 46
  • 38. Examples: Boosting • General idea: Iteratively model the dataset Each iteration is trained on the mistakes of previous iterations Said another way, the objective changes each iteration The final model is a summation of all iterations • Lots of variations on this theme Adaboost Logitboost Martingale Boosting Gradient Boosting • Let’s take a look at a WhizzML implementation of the latter #BSSML16 Automating Machine Learning December 2016 35 / 46
  • 39. Examples: Boosting #BSSML16 Automating Machine Learning December 2016 36 / 46
  • 40. Outline 1 Server-side workflows: WhizzML 2 Basic Workflow: Model or ensemble? 3 Case study: Using Flatline in Whizzml 4 Advanced Workflows 5 Case Study: Stacked Generalization in WhizzML #BSSML16 Automating Machine Learning December 2016 37 / 46
  • 41. Examples: Stacked Generalization #BSSML16 Automating Machine Learning December 2016 38 / 46
  • 42. Stacked generalization Objective: Improve predictions by modeling the output scores of multiple trained models. • Create a training and a holdout set • Create n different models on the training set (with some difference among them; e.g., single-tree vs. ensemble vs. logistic regression) • Make predictions from those models on the holdout set • Train a model to predict the class based on the other models’ predictions #BSSML16 Automating Machine Learning December 2016 39 / 46
  • 43. A Stacked generalization library: creating the stack ;; Splits the given dataset, using half of it to create ;; an heterogeneous collection of models and the other ;; half to train a tree that predicts based on those other ;; models predictions. Returns a map with the collection ;; of models (under the key "models") and the meta-prediction ;; as the value of the key "metamodel". The key "result" ;; has as value a boolean flag indicating whether the ;; process was successful. (define (make-stack dataset-id) (let ([train-id hold-id] (create-random-dataset-split dataset-id 0.5) models (create-stack-models train-id) id (create-stack-predictions models hold-id) orig-fields (model-inputs (head models)) obj-id (dataset-get-objective-id train-id) meta-id (create-model {"dataset" id "excluded_fields" orig-fields "objective_field" obj-id}) success? (resource-done? (fetch (wait meta-id)))) {"models" models "metamodel" meta-id "result" success?})) #BSSML16 Automating Machine Learning December 2016 40 / 46
  • 44. A Stacked generalization library: using the stack ;; Use the models and metamodels computed by make-stack ;; to make a prediction on the input-data map. Returns ;; the identifier of the prediction object. (define (make-stack-prediction models meta-model input-data) (let (preds (map (lambda (m) (create-prediction {"model" m "input_data" input-data})) models) preds (map (lambda (p) (head (values ((fetch p) "prediction")))) preds) meta-input (make-map (model-inputs meta-model) preds)) (create-prediction {"model" meta-model "input_data" meta-input}))) #BSSML16 Automating Machine Learning December 2016 41 / 46
  • 45. A Stacked generalization library: auxiliary functions ;; Extract for a batchpredction its associated dataset of results (define (batch-dataset id) (wait ((fetch id) "output_dataset_resource"))) ;; Create a batchprediction for the given model and datasets, ;; with a map of additional options and using defaults appropriate ;; for model stacking (define (make-batch ds-id mod-id) (let (name (resource-type mod-id)) (create-batchprediction ds-id mod-id {"all_fields" true "output_dataset" true "prediction_name" name}))) ;; Auxiliary function extracting the model_inputs of a model (define (model-inputs mod-id) (fetch mod-id) "input_fields")) #BSSML16 Automating Machine Learning December 2016 42 / 46
  • 46. A Stacked generalization library: auxiliary functions ;; Auxiliary function to create the set of stack models (define (create-stack-models train-id) [(create-model {"dataset" train-id}) (create-ensemble {"dataset" train-id "number_of_models" 20 "randomize" false}) (create-ensemble {"dataset" train-id "number_of_models" 20 "randomize" true}) (create-logisticregression {"dataset" train-id})]) ;; Auxiliary funtion to successively create batchpredictions using the ;; given models over the initial dataset ds-id. Returns the final ;; dataset id. (define (create-stack-predictions models ds-id) (reduce (lambda (did mid) (batch-dataset (make-batch did mid))) ds-id models)) #BSSML16 Automating Machine Learning December 2016 43 / 46
  • 47. A Stacked generalization library: creating the stack ;; Splits the given dataset, using half of it to create ;; an heterogeneous collection of models and the other ;; half to train a tree that predicts based on those other ;; models predictions. Returns a map with the collection ;; of models (under the key "models") and the meta-prediction ;; as the value of the key "metamodel". The key "result" ;; has as value a boolean flag indicating whether the ;; process was successful. (define (make-stack dataset-id) (let ([train-id hold-id] (create-random-dataset-split dataset-id 0.5) models (create-stack-models train-id) id (create-stack-predictions models hold-id) orig-fields (model-inputs (head models)) obj-id (dataset-get-objective-id train-id) meta-id (create-model {"dataset" id "excluded_fields" orig-fields "objective_field" obj-id}) success? (resource-done? (fetch (wait meta-id)))) {"models" models "metamodel" meta-id "result" success?})) #BSSML16 Automating Machine Learning December 2016 44 / 46
  • 48. Library-based scripts Script for creating the models (define stack (make-stack dataset-id)) Script for predictions using the stack (define (make-prediction exec-id input-data) (let (exec (fetch exec-id) stack (nth (head (get-in exec ["execution" "outputs"])) 1) models (get stack "models") metamodel (get stack "metamodel")) (when (get stack "result") (try (make-stack-prediction models metamodel {}) (catch e (log-info "Error: " e) false))))) (define prediction-id (make-prediction exec-id input-data)) (define prediction (when prediction-id (fetch prediction-id))) https://github.com/whizzml/examples/tree/master/stacked-generalizati #BSSML16 Automating Machine Learning December 2016 45 / 46
  • 49. Questions? #BSSML16 Automating Machine Learning December 2016 46 / 46