Standardization…
Does it make a difference whether you standardize your
variables before running your model or standardize the
regression coefficients after you run your model?
Gaetan Lion, December 29, 2021
Leveraging the original OLS Regression model of December 9, 2021 to
estimate quarterly Real GDP growth
We are revisiting an OLS Regression model that we used to estimate Real GDP growth in the related presentation
“Regularization Models… Why you should avoid them”, which I developed on December 9, 2021. This model fit Real
GDP quarterly growth since 1959 using the following 7 variables:
Labor force Lag 1 quarter (laborL1)
Velocity of money (M2/GDP)
M2 Lag 1 quarter
S&P 500 level Lag 1 quarter
Fed Funds rate Lag 3 quarters and Lag 2 quarters
10 Year Treasury bill Lag 1 quarter.
We ran this model 3 times using 3 different versions of the data set.
The first version included variables that were fully detrended using either the quarterly % change or the quarterly First Difference
expressed as a percentage (if Fed Funds increased from 0.75% to 1.00%, the First Difference was recorded as 0.25%).
The second version included the same variables as the first version, except that any variable on a First Difference basis
was expressed in percentage points. So, the same First Difference shown above would be 0.25 (instead of 0.25%).
The third version used standardized variables, as we did in our earlier work dated December 9, 2021.
Three Questions
1. Does it make a difference whether you standardize your variables before running
your model or standardize the regression coefficients after you run your model?
2. Does the scale of the respective original non-standardized variables affect the
resulting standardized coefficients?
3. Does using non-standardized variables vs. standardized variables have an impact
when conducting regularization (Ridge Regression, LASSO)?
The 3 different data versions
These variables are fully detrended, as mentioned on the previous slide.
These variables use the percentage point transformation for the rates variables.
These variables are standardized. They all have an average of 0 and a standard deviation of 1.0.
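The relationship between the three data versions can be illustrated with a minimal sketch. The Fed Funds levels below are hypothetical, and storing the "0.25%" First Difference as the fraction 0.0025 is an assumption about the first data version, not something the slides confirm. The helper name `standardize` is also just for illustration.

```python
import numpy as np

# Hypothetical Fed Funds levels, in percent (0.75 means 0.75%).
fed_funds_pct = np.array([0.75, 1.00, 1.25, 1.00, 1.50, 1.75])

# Version 2: First Difference in percentage points (0.25 for a 0.25-point rise).
diff_points = np.diff(fed_funds_pct)

# Version 1 (assumption): the same change stored as a percentage, i.e. 0.0025.
diff_percent = diff_points / 100.0

def standardize(x):
    """Return z-scores: subtract the mean, divide by the sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)

# Version 3: standardized variables, computed from either scale.
z_points = standardize(diff_points)
z_percent = standardize(diff_percent)

print(np.allclose(z_points, z_percent))
print(z_points.mean(), z_points.std(ddof=1))
```

Because standardization divides out the unit of measurement, both scales should lead to the same z-scores, with an average of 0 and a standard deviation of 1.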
Regression coefficients of the 2 models using non-standardized variables
Using regular variables
Using percentage points variables
As shown, the different scale of the variables (the regular detrended variables vs. the ones in percentage points) did not have
any impact on the ultimate standardized coefficients computed after the respective models were run.
This answers our second question. The scale of the variables does not impact the resulting standardized coefficients.
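A small simulation can illustrate why this holds. It is only a sketch on synthetic data, not the GDP model: the variable scales, true coefficients, and helper names (`ols_slopes`, `standardized_coefs`) are all assumptions made for the example. It uses the usual conversion, standardized coefficient = raw coefficient × sd(x) / sd(y).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors; the third one mimics a rate stored as a fraction.
X = rng.normal(size=(n, 3))
X[:, 2] *= 0.01
y = 0.5 + X @ np.array([1.0, -2.0, 30.0]) + rng.normal(scale=0.5, size=n)

def ols_slopes(X, y):
    """Fit OLS with an intercept and return only the slope coefficients."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1:]

def standardized_coefs(X, y):
    """Standardize coefficients after the fit: b * sd(x) / sd(y)."""
    b = ols_slopes(X, y)
    return b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Same data, but the third variable is now in percentage points.
X_points = X.copy()
X_points[:, 2] *= 100.0

beta_regular = standardized_coefs(X, y)
beta_points = standardized_coefs(X_points, y)

print(ols_slopes(X, y)[2], ols_slopes(X_points, y)[2])  # raw slopes differ
print(np.allclose(beta_regular, beta_points))
```

Multiplying a variable by 100 divides its raw coefficient by 100 and multiplies its standard deviation by 100, so the two effects cancel in the standardized coefficient.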
The model using standardized variables generated the exact
same standardized coefficients
These results answer the first question. Whether we first standardize the variables and then run
the model, or first run the model (with non-standardized variables) and standardize the
regression coefficients afterwards, makes no difference to the resulting standardized
coefficients.
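The equivalence can be checked on synthetic data with a sketch like the following. The data, scales, and helper names (`ols`, `zscore`) are assumptions for illustration and are not the variables of the GDP model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic predictors on very different scales.
X = rng.normal(size=(n, 3)) * np.array([1.0, 5.0, 0.01])
y = 2.0 + X @ np.array([0.8, -0.3, 40.0]) + rng.normal(size=n)

def ols(X, y):
    """Fit OLS with an intercept; the intercept is the first returned value."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def zscore(a):
    """Standardize to mean 0 and sample standard deviation 1."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Route 1: run the model first, then standardize the coefficients.
raw_fit = ols(X, y)
beta_after = raw_fit[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Route 2: standardize the variables first, then run the model.
std_fit = ols(zscore(X), zscore(y))
beta_before = std_fit[1:]

print(np.allclose(beta_after, beta_before))
```

In the second route the intercept comes out at essentially zero, since every variable has been centered.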
Regularization just does not work with non-standardized variables
You can’t use non-standardized variables to run Regularization models. As shown, the Ridge Regression on
the left does not make much sense: all regression coefficients are zeroed out the minute Lambda is a
bit greater than zero. The model on the right, using standardized variables, is more coherent. That’s the
answer to our third question.
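One way to see why scale matters for the penalty is a closed-form Ridge sketch on synthetic data. This is not the model behind the charts: the data, the value of Lambda, and the penalty convention (Lambda added directly to X'X, with no scaling by sample size) are all assumptions, and software packages may define Lambda differently.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Synthetic predictors; the third has a tiny scale, like a rate stored as a fraction.
X = rng.normal(size=(n, 3)) * np.array([1.0, 5.0, 0.01])
y = X @ np.array([0.8, -0.3, 40.0]) + rng.normal(size=n)

def zscore(a):
    """Standardize to mean 0 and sample standard deviation 1."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def ridge(X, y, lam):
    """Closed-form Ridge slopes on centered data: (X'X + lam*I)^-1 X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

lam = 1.0  # hypothetical penalty value

# Share of each OLS coefficient (lam = 0) that survives the penalty.
keep_raw = ridge(X, y, lam) / ridge(X, y, 0.0)
keep_std = ridge(zscore(X), zscore(y), lam) / ridge(zscore(X), zscore(y), 0.0)

print(keep_raw)  # the small-scale variable loses most of its coefficient
print(keep_std)  # all variables are shrunk by a similar, modest amount
```

With non-standardized variables, the penalty falls hardest on whichever variable needs a large coefficient only because of its units. With standardized variables, the same Lambda treats all variables on an equal footing.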