Model Automation in R
Using MASS, randomForest, forecast,
and caret
Who is Will Johnson?
● Database Manager at Uline (Pleasant Prairie)
● MS Predictive Analytics (2015)
● Operating www.LearnByMarketing.com
○ R tutorials, thoughts on analysis.
Agenda
1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}
What is Model Automation?
Hypothesis Space
vs
Hyperparameter Space
Pros and Cons of Model Automation
PROS:
● You Don’t Have to Think!
● “Faster” Iterations.
● See what’s “Important”
CONS:
● You Don’t Have to Think!
● Jellybeans
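Assuming the "Jellybeans" bullet refers to the multiple-comparisons problem (test enough unrelated variables and a few will look significant by chance), the effect can be sketched with a small base R simulation. The data below are pure noise, and every name is illustrative rather than taken from the talk.

```r
# Simulated data: the response is unrelated to every predictor
set.seed(42)
n_obs  <- 100
n_vars <- 20
x <- matrix(rnorm(n_obs * n_vars), nrow = n_obs,
            dimnames = list(NULL, paste0("bean", seq_len(n_vars))))
y <- rnorm(n_obs)

# Fit one single-variable regression per predictor and keep the slope p-value
p_values <- apply(x, 2, function(col) {
  summary(lm(y ~ col))$coefficients[2, "Pr(>|t|)"]
})

# With 20 tests at the 5% level, about one false positive is expected on average
false_positives <- names(p_values)[p_values < 0.05]
print(false_positives)
```

An automated search that tries many variables runs this kind of experiment implicitly, which is one reason to check "important" variables against held-out data.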
Decision Trees
● Gini Index + Entropy
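Both impurity measures can be computed from the class proportions in a node. A minimal base R sketch, using the standard textbook definitions (Gini = 1 - sum(p^2), entropy = -sum(p * log2(p))); the function names and toy labels are illustrative.

```r
# Impurity of a node given the class labels of the rows that fall in it
gini_index <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]               # skip empty classes: 0 * log2(0) is treated as 0
  -sum(p * log2(p))
}

pure  <- c("a", "a", "a", "a")
mixed <- c("a", "a", "b", "b")

gini_index(pure)   # 0: a pure node
gini_index(mixed)  # 0.5: a 50/50 split of two classes
entropy(mixed)     # 1 bit
```

A tree picks the split that reduces the chosen impurity the most, so both measures favor child nodes dominated by a single class.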
randomForest
● Mean Decrease in Gini Index
library(randomForest)
rf <- randomForest(y~., data = dat)
rf$importance #Var Name + Importance
varImpPlot(rf) #Visualization
Stepwise Regression
● AIC
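AIC trades goodness of fit against model size: AIC = 2k - 2 log(L), where k is the number of estimated parameters and L is the maximized likelihood. A small sketch on the built-in mtcars data; the helper name aic_by_hand and the two formulas are illustrative choices, not from the slides.

```r
fit_small <- lm(hp ~ wt, data = mtcars)
fit_large <- lm(hp ~ wt + cyl + disp, data = mtcars)

aic_by_hand <- function(fit) {
  ll <- logLik(fit)
  k  <- attr(ll, "df")        # coefficients plus the error variance
  2 * k - 2 * as.numeric(ll)
}

aic_by_hand(fit_small)
AIC(fit_small)                # same value as the manual calculation
AIC(fit_small, fit_large)     # lower AIC is preferred
```

stepAIC() reports AIC through extractAIC(), which for lm models drops an additive constant, so its printed numbers differ from AIC() while the ranking of models fit to the same data is unchanged.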
library(MASS)
#mt is assumed to be the built-in mtcars data set
mt <- mtcars
mod <- lm(hp~.,data=mt)
#Step Backward and remove one variable at a time
stepAIC(mod,direction = "backward",trace = TRUE)
#Get the Independent Variables
#(and exclude hp dependent variable)
indep_vars <- paste(setdiff(names(mt),"hp"),
collapse="+")
#Turn those variable names into a formula
upper_form <- formula(paste("~",indep_vars))
#~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb
#Create a model using only the intercept
mod_lower <- lm(hp~1,data=mt)
#Step Forward and add one variable at a time
stepAIC(mod_lower,direction = "forward",
scope=list(upper=upper_form,lower=~1))
#Step Forward or Backward at each step, starting with an intercept-only model
stepAIC(mod_lower,direction = "both",
scope=list(upper=upper_form,lower=~1))
Auto.Arima
● Time Series models.
● AutoRegressive…
● Moving Averages…
● With Differencing!
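The three ingredients map onto the (p, d, q) order of an ARIMA model: p autoregressive terms, d rounds of differencing, q moving-average terms. A base R sketch on simulated data, with the order fixed by hand rather than searched for; the series and the 0.7 coefficient are made up for illustration.

```r
set.seed(1)
# Simulate a series whose first difference follows an AR(1) process
series <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.7), n = 200)

# Differencing removes the wandering level; one observation is lost
steps <- diff(series)
length(series) - length(steps)

# Fit ARIMA(1, 1, 0): one AR term on the once-differenced series
fit <- arima(series, order = c(1, 1, 0))
fit$coef                      # estimated ar1 coefficient
```

auto.arima() automates the choice of p, d, and q instead of requiring them up front.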
library(forecast)
library(fpp)
#Load the elecequip series from the fpp package
data("elecequip")
#Keep the first 180 observations for fitting; the rest are held out
ee <- elecequip[1:180]
model <- auto.arima(ee,stationary = TRUE)
#        ar1      ma1      ma2     ma3  intercept
#     0.8428  -0.6571  -0.1753  0.6353    95.7265
#s.e. 0.0431   0.0537   0.0573  0.0561     3.2223
plot(forecast(model,h=10))
#Overlay the 10 held-out actual values on the 10-step forecast
lines(x = 181:190, y= elecequip[181:190],
type = 'l', col = 'red')
train {caret}
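library(caret)
#dat and train_log (a logical vector marking training rows) are assumed to exist
#10-fold cross-validation, repeated 10 times
tctrl <- trainControl(method = "repeatedcv",number=10,
repeats=10)
#Candidate values for rpart's complexity parameter (cp)
rpart_opts <- expand.grid(cp = seq(0.0,0.01, by = 0.001))
#Fit a tree for each cp value and keep the one with the best Kappa
rpart_model <- train(y~., data = dat, method="rpart",
metric = "Kappa", trControl = tctrl,
tuneGrid = rpart_opts, subset = train_log)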
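What a grid search with cross-validation does can be sketched by hand. This is not caret's implementation, only a simplified illustration: it uses the built-in iris data, 5 unrepeated folds, and plain accuracy instead of Kappa, and all names (cv_accuracy, cp_grid, fold_id) are illustrative.

```r
library(rpart)

set.seed(123)
k_folds <- 5
# Assign every row of iris to one of k folds at random
fold_id <- sample(rep(seq_len(k_folds), length.out = nrow(iris)))
cp_grid <- seq(0, 0.01, by = 0.002)

# Mean held-out accuracy for one value of cp
cv_accuracy <- function(cp) {
  fold_acc <- sapply(seq_len(k_folds), function(k) {
    train_rows <- iris[fold_id != k, ]
    test_rows  <- iris[fold_id == k, ]
    fit  <- rpart(Species ~ ., data = train_rows, method = "class",
                  control = rpart.control(cp = cp))
    pred <- predict(fit, newdata = test_rows, type = "class")
    mean(pred == test_rows$Species)
  })
  mean(fold_acc)
}

# Score every candidate and keep the best one
results <- data.frame(cp = cp_grid, accuracy = sapply(cp_grid, cv_accuracy))
best_cp <- results$cp[which.max(results$accuracy)]
print(results)
```

train() wraps this loop, adds repeats and alternative metrics, and refits the final model on all training rows with the winning value.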
Recap
library(randomForest) varImpPlot()
library(MASS) stepAIC()
library(forecast) auto.arima()
library(caret) train()
Questions?
