Machine Learning - II
Methods for feature/variable selection in Regression Analysis
Algorithms to choose the variables
Types:
1) Forward: this method begins by calculating and examining the
univariate chi-square, or individual predictive power, of each
variable. It looks for the predictive variable that has the most
variation, or the greatest differences between its levels, when
compared to the different levels of the target variable.
2) Stepwise: this method is very similar to forward selection. The main
difference is that if any variable, whether newly entered or already in
the model, becomes insignificant after it or another variable enters, it
is removed. This method offers some additional power over forward
selection in finding the best set of predictors. Its main disadvantage
is slower processing time, because each step considers every
variable for entry or removal.
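As an illustration of the forward method described above, here is a minimal sketch in Python. A hypothetical `score` function stands in for the chi-square / predictive-power statistic (higher is better); the variable names and numbers are invented for the example.

```python
def forward_select(candidates, score, threshold=0.0):
    """Greedily add the variable whose entry improves the score most,
    stopping when no remaining variable improves it beyond threshold."""
    selected = []
    remaining = list(candidates)
    current = score(selected)
    while remaining:
        # Evaluate every remaining variable added to the current set
        gains = {v: score(selected + [v]) - current for v in remaining}
        best = max(gains, key=gains.get)
        if gains[best] <= threshold:  # no variable improves the model
            break
        selected.append(best)
        remaining.remove(best)
        current = score(selected)
    return selected

# Toy score (hypothetical numbers): each variable has a fixed
# "predictive power", with a penalty for model size.
power = {"age": 3.0, "income": 2.0, "region": 0.5, "noise": -0.1}
def toy_score(subset):
    return sum(power[v] for v in subset) - 0.4 * len(subset)

print(forward_select(power, toy_score))  # ['age', 'income', 'region']
```

Note how "noise" never enters: adding it would lower the score, which is exactly the stopping condition the slides describe.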
Rupak Roy
3) Backward:
This method begins with all the variables in the model. Each variable
begins the process with a multivariate chi-square or a measure of
predictive power when considered in conjunction with all other
variables. It then removes any variable whose predictive power is
insignificant, beginning with the most insignificant variable. After each
variable is removed, the multivariate chi-square for all variables still in
the model is recalculated with one less variable.
This continues until all remaining variables have multivariate
significance. This method has one distinct benefit over forward and
stepwise selection: it allows variables of lower significance to be
considered in combinations that might never enter the model under the
forward and stepwise methods.
Therefore, the resulting model may depend on more equal
contributions from many variables instead of the dominance of one or
two very powerful variables.
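The backward procedure above can be sketched the same way: start with everything in the model and repeatedly drop the least significant variable. Again, a hypothetical `score` function stands in for the multivariate chi-square, and the data are invented.

```python
def backward_eliminate(candidates, score, threshold=0.0):
    """Start with all variables; repeatedly remove the one whose removal
    costs the least, while that cost stays at or below threshold."""
    selected = list(candidates)
    while len(selected) > 1:
        current = score(selected)
        # Loss in score from removing each variable in turn
        losses = {v: current - score([w for w in selected if w != v])
                  for v in selected}
        weakest = min(losses, key=losses.get)
        if losses[weakest] > threshold:  # every variable is significant
            break
        selected.remove(weakest)         # drop the most insignificant one
    return selected

# Toy score (hypothetical numbers), penalized for model size
power = {"x1": 2.5, "x2": 1.2, "x3": 0.05}
def toy_score(subset):
    return sum(power[v] for v in subset) - 0.3 * len(subset)

print(backward_eliminate(power, toy_score))  # ['x1', 'x2']
```

The weak variable "x3" survives the first scoring round in combination with the others and is only removed once its marginal contribution is re-measured, mirroring the recalculation step described above.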
4) Score:
This method constructs models using all possible subsets of variables
within the list of candidate variables, ranking them by the likelihood
score (chi-square) statistic. It does not derive the model coefficients;
it simply lists the best variables for each model along with the overall
chi-square.
To apply these algorithms and choose the best variables in R, use:
> step(m, direction = "both")
where
m = the regression model
direction = "both" indicates that both backward and forward selection
(i.e., stepwise selection) should be used
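A `direction = "both"` search alternates the two moves: try to add the best candidate, then re-check whether any variable already in the model has become removable. (Note that R's `step()` compares models by AIC by default rather than by a chi-square statistic.) Here is a minimal sketch of that combined loop, with a hypothetical `score` function and invented data:

```python
def stepwise_select(candidates, score, threshold=0.0):
    """Stepwise ("both" directions): after each forward step, drop any
    variable whose marginal contribution has fallen to threshold or below."""
    selected, remaining = [], list(candidates)
    improved = True
    while improved:
        improved = False
        current = score(selected)
        # Forward step: try to add the best remaining variable
        if remaining:
            gains = {v: score(selected + [v]) - current for v in remaining}
            best = max(gains, key=gains.get)
            if gains[best] > threshold:
                selected.append(best)
                remaining.remove(best)
                improved = True
        # Backward step: remove variables that no longer pull their weight
        current = score(selected)
        for v in list(selected):
            if current - score([w for w in selected if w != v]) <= threshold:
                selected.remove(v)
                remaining.append(v)
                current = score(selected)
                improved = True
    return selected

# Toy score (hypothetical numbers), penalized for model size
power = {"age": 3.0, "income": 2.0, "region": 0.5, "noise": -0.1}
def toy_score(subset):
    return sum(power[v] for v in subset) - 0.4 * len(subset)

print(stepwise_select(power, toy_score))  # ['age', 'income', 'region']
```

This mirrors the slides' description of stepwise selection: the extra backward pass is what lets a variable that entered early be removed later, at the cost of re-evaluating every variable on each step.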
Thank you