SlideShare a Scribd company logo
1 of 17
Download to read offline
Model Automation in R
Using MASS, randomForest, forecast,
and caret
Who is Will Johnson?
● Database Manager at Uline (Pleasant Prairie)
● MS Predictive Analytics (2015)
● Operating www.LearnByMarketing.com
○ R tutorials, thoughts on analysis.
Learn By
Marketing.com
Agenda
1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}
What is Model Automation?
Hypothesis Space
vs
Hyperparameter Space
Pros and Cons of Model Automation
PROS:
● You Don’t Have to Think!
● “Faster” Iterations.
● See what’s “Important”
CONS:
● You Don’t Have to Think!
● Jellybeans
Agenda
1. What is Model Automation
2. Pros and Cons of Model Automation
3. Decision Trees and Random Forests {randomForest}
4. Stepwise Regression {MASS}
5. Auto.Arima for time series {forecast}
6. Hyperparameter Search {caret}
Decision Trees
● Gini Index +
Entropy
randomForest
● Mean Decrease
in Gini Index
library(randomForest)
rf <- randomForest(y~., data = dat)
rf$importance #Var Name + Importance
varImpPlot(rf) #Visualization
Stepwise
Regression
● AIC
Stepwise
Regression
library(MASS)
mod <- lm(hp~.,data=mt)
#Step Backward and remove one variable at a time
stepAIC(mod,direction = "backward",trace = T)
#Create a model using only the intercept
mod_lower = lm(hp~1,data=mt)
#Step Forward and add one variable at a time
stepAIC(mod_lower,direction = "forward",
scope=list(upper=upper_form,lower=~1))
#Step Forward or Backward each step starting with a intercept model
stepAIC(mod_lower,direction = "both",
scope=list(upper=upper_form,lower=~1))
#Get the Independent Variables
#(and exclude hp dependent variable)
indep_vars <-paste(names(mt)[-which(names(mt)=="hp")],
collapse="+")
#Turn those variable names into a formula
upper_form = formula(paste("~",indep_vars,collapse=""))
#~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb
Auto.Arima
● Time Series models.
● AutoRegressive…
● Moving Averages…
● With Differencing!
library(forecast)
library(fpp)
#Step Backward and remove one variable at a time
data("elecequip")
ee <- elecequip[1:180]
model <- auto.arima(ee,stationary = T)
# ar1 ma1 ma2 ma3 intercept
#0.8428 -0.6571 -0.1753 0.6353 95.7265
#s.e. 0.0431 0.0537 0.0573 0.0561 3.2223
plot(forecast(model,h=10))
lines(x = 181:191, y= elecequip[181:191],
type = 'l', col = 'red')
Auto.Arima
train {caret}
library(caret)
#Step Backward and remove one variable at a time
tctrl <- trainControl(method = "cv",number=10,
repeats=10)
rpart_opts <- expand.grid(cp = seq(0.0,0.01, by = 0.001))
rpart_model <- train(y~. data, method="rpart",
metric = "Kappa", trControl = tctrl,
tuneGrid = rpart_opts, subset = train_log)
train {caret}
Recap
Learn By
Marketing.com
library(randomForest) varImpPlot()
library(MASS) stepAIC()
library(forecast) auto.arima()
library(caret) train()
Questions?
Learn By
Marketing.com

More Related Content

Viewers also liked

Recommender Systems with Apache Spark's ALS Function
Recommender Systems with Apache Spark's ALS FunctionRecommender Systems with Apache Spark's ALS Function
Recommender Systems with Apache Spark's ALS FunctionWill Johnson
 
The caret package is a unified interface to a large number of predictive mode...
The caret package is a unified interface to a large number of predictive mode...The caret package is a unified interface to a large number of predictive mode...
The caret package is a unified interface to a large number of predictive mode...odsc
 
Random Forests: The Vanilla of Machine Learning - Anna Quach
Random Forests: The Vanilla of Machine Learning - Anna QuachRandom Forests: The Vanilla of Machine Learning - Anna Quach
Random Forests: The Vanilla of Machine Learning - Anna QuachWithTheBest
 
Hivemall dbtechshowcase 20160713 #dbts2016
Hivemall dbtechshowcase 20160713 #dbts2016Hivemall dbtechshowcase 20160713 #dbts2016
Hivemall dbtechshowcase 20160713 #dbts2016Makoto Yui
 
Error analysis randomforest
Error analysis randomforestError analysis randomforest
Error analysis randomforestriswan_zen
 
Machine Learning with R
Machine Learning with RMachine Learning with R
Machine Learning with Rbutest
 
Visualization and Machine Learning - for exploratory data ...
Visualization and Machine Learning - for exploratory data ...Visualization and Machine Learning - for exploratory data ...
Visualization and Machine Learning - for exploratory data ...butest
 
Big Data in Stock Exchange( HFT, Forex, Flash Crashes)
Big Data in Stock Exchange( HFT, Forex, Flash Crashes) Big Data in Stock Exchange( HFT, Forex, Flash Crashes)
Big Data in Stock Exchange( HFT, Forex, Flash Crashes) Dmytro Melnychuk
 
Larry tabb hft - part 1
Larry tabb   hft - part 1Larry tabb   hft - part 1
Larry tabb hft - part 1Smith Kim
 
Meeting the data management challenges of MiFID II
Meeting the data management challenges of MiFID IIMeeting the data management challenges of MiFID II
Meeting the data management challenges of MiFID IILeigh Hill
 
MiFID II: Data for best execution
MiFID II: Data for best executionMiFID II: Data for best execution
MiFID II: Data for best executionLeigh Hill
 
Getting Ready for MiFID II
Getting Ready for MiFID II Getting Ready for MiFID II
Getting Ready for MiFID II corfinancial
 
MiFID II: Data for transparency
MiFID II: Data for transparencyMiFID II: Data for transparency
MiFID II: Data for transparencyLeigh Hill
 
The impact of MiFID II on your OTC derivatives trading business
The impact of MiFID II on your OTC derivatives trading businessThe impact of MiFID II on your OTC derivatives trading business
The impact of MiFID II on your OTC derivatives trading businessTom White
 
MiFID II- Client issues presentation Leeds
MiFID II- Client issues presentation LeedsMiFID II- Client issues presentation Leeds
MiFID II- Client issues presentation LeedsBovill
 
Naive Bayes Example using R
Naive Bayes Example using  R Naive Bayes Example using  R
Naive Bayes Example using R Dr. Volkan OBAN
 
MiFID II - investor protection - Bovill briefing feb 15
MiFID II - investor protection - Bovill briefing feb 15MiFID II - investor protection - Bovill briefing feb 15
MiFID II - investor protection - Bovill briefing feb 15Bovill
 
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFT
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFTExtent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFT
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFTextentconf Tsoy
 

Viewers also liked (19)

Recommender Systems with Apache Spark's ALS Function
Recommender Systems with Apache Spark's ALS FunctionRecommender Systems with Apache Spark's ALS Function
Recommender Systems with Apache Spark's ALS Function
 
The caret package is a unified interface to a large number of predictive mode...
The caret package is a unified interface to a large number of predictive mode...The caret package is a unified interface to a large number of predictive mode...
The caret package is a unified interface to a large number of predictive mode...
 
Random Forests: The Vanilla of Machine Learning - Anna Quach
Random Forests: The Vanilla of Machine Learning - Anna QuachRandom Forests: The Vanilla of Machine Learning - Anna Quach
Random Forests: The Vanilla of Machine Learning - Anna Quach
 
Hivemall dbtechshowcase 20160713 #dbts2016
Hivemall dbtechshowcase 20160713 #dbts2016Hivemall dbtechshowcase 20160713 #dbts2016
Hivemall dbtechshowcase 20160713 #dbts2016
 
Error analysis randomforest
Error analysis randomforestError analysis randomforest
Error analysis randomforest
 
Machine Learning with R
Machine Learning with RMachine Learning with R
Machine Learning with R
 
Visualization and Machine Learning - for exploratory data ...
Visualization and Machine Learning - for exploratory data ...Visualization and Machine Learning - for exploratory data ...
Visualization and Machine Learning - for exploratory data ...
 
Access any data anywhere
Access any data anywhereAccess any data anywhere
Access any data anywhere
 
Big Data in Stock Exchange( HFT, Forex, Flash Crashes)
Big Data in Stock Exchange( HFT, Forex, Flash Crashes) Big Data in Stock Exchange( HFT, Forex, Flash Crashes)
Big Data in Stock Exchange( HFT, Forex, Flash Crashes)
 
Larry tabb hft - part 1
Larry tabb   hft - part 1Larry tabb   hft - part 1
Larry tabb hft - part 1
 
Meeting the data management challenges of MiFID II
Meeting the data management challenges of MiFID IIMeeting the data management challenges of MiFID II
Meeting the data management challenges of MiFID II
 
MiFID II: Data for best execution
MiFID II: Data for best executionMiFID II: Data for best execution
MiFID II: Data for best execution
 
Getting Ready for MiFID II
Getting Ready for MiFID II Getting Ready for MiFID II
Getting Ready for MiFID II
 
MiFID II: Data for transparency
MiFID II: Data for transparencyMiFID II: Data for transparency
MiFID II: Data for transparency
 
The impact of MiFID II on your OTC derivatives trading business
The impact of MiFID II on your OTC derivatives trading businessThe impact of MiFID II on your OTC derivatives trading business
The impact of MiFID II on your OTC derivatives trading business
 
MiFID II- Client issues presentation Leeds
MiFID II- Client issues presentation LeedsMiFID II- Client issues presentation Leeds
MiFID II- Client issues presentation Leeds
 
Naive Bayes Example using R
Naive Bayes Example using  R Naive Bayes Example using  R
Naive Bayes Example using R
 
MiFID II - investor protection - Bovill briefing feb 15
MiFID II - investor protection - Bovill briefing feb 15MiFID II - investor protection - Bovill briefing feb 15
MiFID II - investor protection - Bovill briefing feb 15
 
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFT
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFTExtent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFT
Extent 2013 Obninsk Cross-Asset Portfolio Margin Risk Calculation for HFT
 

Similar to Model Automation in R Using MASS, randomForest, forecast, and caret

TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...Chetan Khatri
 
Methods of Optimization in Machine Learning
Methods of Optimization in Machine LearningMethods of Optimization in Machine Learning
Methods of Optimization in Machine LearningKnoldus Inc.
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaChetan Khatri
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? HackerEarth
 
Linear regression in R
Linear regression in R Linear regression in R
Linear regression in R Leon Kim
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Spencer Fox
 
Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...Ilab Metis: we optimize power systems and we are not afraid of direct policy ...
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...Olivier Teytaud
 
Different Models Used In Time Series - InsideAIML
Different Models Used In Time Series - InsideAIMLDifferent Models Used In Time Series - InsideAIML
Different Models Used In Time Series - InsideAIMLVijaySharma802
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Greg Makowski
 
Cutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneCutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneXiaoweiJiang7
 
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERING
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERINGA GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERING
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERINGLubna_Alhenaki
 
Time Series Analysis: Challenge Kaggle with TensorFlow
Time Series Analysis: Challenge Kaggle with TensorFlowTime Series Analysis: Challenge Kaggle with TensorFlow
Time Series Analysis: Challenge Kaggle with TensorFlowSeungHyun Jeon
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce AlgorithmsAmund Tveit
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...PATHALAMRAJESH
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEHONGJOO LEE
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systemsOlivier Teytaud
 
Dynamic Optimization without Markov Assumptions: application to power systems
Dynamic Optimization without Markov Assumptions: application to power systemsDynamic Optimization without Markov Assumptions: application to power systems
Dynamic Optimization without Markov Assumptions: application to power systemsOlivier Teytaud
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 

Similar to Model Automation in R Using MASS, randomForest, forecast, and caret (20)

TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
 
Methods of Optimization in Machine Learning
Methods of Optimization in Machine LearningMethods of Optimization in Machine Learning
Methods of Optimization in Machine Learning
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ?
 
Linear regression in R
Linear regression in R Linear regression in R
Linear regression in R
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016
 
Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4
 
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...Ilab Metis: we optimize power systems and we are not afraid of direct policy ...
Ilab Metis: we optimize power systems and we are not afraid of direct policy ...
 
Different Models Used In Time Series - InsideAIML
Different Models Used In Time Series - InsideAIMLDifferent Models Used In Time Series - InsideAIML
Different Models Used In Time Series - InsideAIML
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
 
Cutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneCutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tune
 
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERING
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERINGA GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERING
A GENETIC-FROG LEAPING ALGORITHM FOR TEXT DOCUMENT CLUSTERING
 
Time Series Analysis: Challenge Kaggle with TensorFlow
Time Series Analysis: Challenge Kaggle with TensorFlowTime Series Analysis: Challenge Kaggle with TensorFlow
Time Series Analysis: Challenge Kaggle with TensorFlow
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-making
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
 
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC                           ...
Copy of CRICKET MATCH WIN PREDICTOR USING LOGISTIC ...
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
 
Dynamic Optimization without Markov Assumptions: application to power systems
Dynamic Optimization without Markov Assumptions: application to power systemsDynamic Optimization without Markov Assumptions: application to power systems
Dynamic Optimization without Markov Assumptions: application to power systems
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 

Recently uploaded

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 

Recently uploaded (20)

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 

Model Automation in R Using MASS, randomForest, forecast, and caret

  • 1. Model Automation in R Using MASS, randomForest, forecast, and caret
  • 2. Who is Will Johnson? ● Database Manager at Uline (Pleasant Prairie) ● MS Predictive Analytics (2015) ● Operating www.LearnByMarketing.com ○ R tutorials, thoughts on analysis. Learn By Marketing.com
  • 3. Agenda 1. What is Model Automation 2. Pros and Cons of Model Automation 3. Decision Trees and Random Forests {randomForest} 4. Stepwise Regression {MASS} 5. Auto.Arima for time series {forecast} 6. Hyperparameter Search {caret}
  • 4. What is Model Automation? Hypothesis Space vs Hyperparameter Space
  • 5. Pros and Cons of Model Automation PROS: ● You Don’t Have to Think! ● “Faster” Iterations. ● See what’s “Important” CONS: ● You Don’t Have to Think! ● Jellybeans
  • 6.
  • 7. Agenda 1. What is Model Automation 2. Pros and Cons of Model Automation 3. Decision Trees and Random Forests {randomForest} 4. Stepwise Regression {MASS} 5. Auto.Arima for time series {forecast} 6. Hyperparameter Search {caret}
  • 8. Decision Trees ● Gini Index + Entropy
  • 9. randomForest ● Mean Decrease in Gini Index library(randomForest) rf <- randomForest(y~., data = dat) rf$importance #Var Name + Importance varImpPlot(rf) #Visualization
  • 11. Stepwise Regression library(MASS) mod <- lm(hp~.,data=mt) #Step Backward and remove one variable at a time stepAIC(mod,direction = "backward",trace = T) #Create a model using only the intercept mod_lower = lm(hp~1,data=mt) #Step Forward and add one variable at a time stepAIC(mod_lower,direction = "forward", scope=list(upper=upper_form,lower=~1)) #Step Forward or Backward each step starting with a intercept model stepAIC(mod_lower,direction = "both", scope=list(upper=upper_form,lower=~1)) #Get the Independent Variables #(and exclude hp dependent variable) indep_vars <-paste(names(mt)[-which(names(mt)=="hp")], collapse="+") #Turn those variable names into a formula upper_form = formula(paste("~",indep_vars,collapse="")) #~mpg + cyl + disp + drat + wt + qsec + vs + am + gear + carb
  • 12. Auto.Arima ● Time Series models. ● AutoRegressive… ● Moving Averages… ● With Differencing! library(forecast) library(fpp) #Step Backward and remove one variable at a time data("elecequip") ee <- elecequip[1:180] model <- auto.arima(ee,stationary = T) # ar1 ma1 ma2 ma3 intercept #0.8428 -0.6571 -0.1753 0.6353 95.7265 #s.e. 0.0431 0.0537 0.0573 0.0561 3.2223 plot(forecast(model,h=10)) lines(x = 181:191, y= elecequip[181:191], type = 'l', col = 'red')
  • 14. train {caret} library(caret) #Step Backward and remove one variable at a time tctrl <- trainControl(method = "cv",number=10, repeats=10) rpart_opts <- expand.grid(cp = seq(0.0,0.01, by = 0.001)) rpart_model <- train(y~. data, method="rpart", metric = "Kappa", trControl = tctrl, tuneGrid = rpart_opts, subset = train_log)
  • 16. Recap Learn By Marketing.com library(randomForest) varImpPlot() library(MASS) stepAIC() library(forecast) auto.arima() library(caret) train()