MULTIPLE LINEAR REGRESSION
MODEL ANALYSIS
Submitted by
Md. Sadiqur Rahman(2019-2-60-057)
Miskath Jahan Simu(2019-2-60-056)
Insan Ara Milu(2019-2-60-029)
Meftahul Zannat(2018-2-60-049)
Sharmin Akther Rima(2018-2-60-112)
Submitted to
Md Al-Imran
Lecturer,
Department of Computer Science
and Engineering,
East West University
Multiple Regression
• A regression that contains more than one explanatory variable (regressor) is called
multivariate (multiple) regression.
• The multivariate regression model is a linear function of the unknown parameters
β0, β1, β2, β3, …, βk.
• Example:
Yi = β0 + β1xi1 + β2xi2 + β3xi3 + … + βkxik + ϵi
• Y is called the response or dependent variable, and x1, x2, x3, …, xk are called the
predictor or independent variables (a small simulated instance of this model is sketched below).
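The following is a minimal sketch, not taken from the slides, that simulates data from a two-regressor version of the model above; the coefficient values b0, b1, b2 and the noise scale are illustrative assumptions.

```python
# Hypothetical example: simulate Y = b0 + b1*x1 + b2*x2 + e for n observations.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
b0, b1, b2 = 2.0, 0.5, -1.3           # assumed "true" parameters (illustrative)
e = rng.normal(scale=0.5, size=n)     # error term
y = b0 + b1 * x1 + b2 * x2 + e        # dependent variable
```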
Estimation Example: Average Test Scores and
Per-Student Spending
avgscore = β0 + β1 expend + β2 avginc + u
avgscore = average standardized test score of the school, expend = per-student
spending at the school, avginc = average family income of students at the
school, u = other factors
• Per-student spending is likely to be correlated with average family income at a
given high school because of how schools are financed.
• Omitting average family income from the regression would lead to a biased
estimate of the effect of spending on average test scores.
• In a simple regression model, the estimated effect of per-student spending would
partly pick up the effect of family income on test scores (as sketched below).
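A hedged simulation of the omitted-variable problem described above, with entirely made-up numbers: expend is generated so that it is correlated with avginc, and the short regression that drops avginc overstates the spending effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
avginc = rng.normal(50, 10, size=n)
expend = 0.1 * avginc + rng.normal(0, 1, size=n)     # spending tracks family income
u = rng.normal(0, 2, size=n)
avgscore = 60 + 2.0 * expend + 0.5 * avginc + u      # assumed "true" model

X_full = np.column_stack([np.ones(n), expend, avginc])
X_short = np.column_stack([np.ones(n), expend])      # avginc omitted
b_full, *_ = np.linalg.lstsq(X_full, avgscore, rcond=None)
b_short, *_ = np.linalg.lstsq(X_short, avgscore, rcond=None)
print(b_full[1])    # close to the true spending effect of 2.0
print(b_short[1])   # noticeably larger: it absorbs part of the income effect
```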
Steps of Multivariate Regression Analysis
• Feature selection –
The selection of features is an important step in multivariate regression.
Feature selection is also known as variable selection; it is important to pick
the significant variables in order to build a better model.
• Normalizing features –
We need to scale the features so that they lie on comparable ranges while the
general distribution and ratios in the data are preserved; this leads to a more
efficient analysis. Scaling changes the numeric value of each feature but not
the relationships between observations (see the standardization sketch below).
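A minimal standardization sketch; the slides do not commit to a particular scaler, so z-score scaling is assumed here.

```python
import numpy as np

def standardize(X):
    """Column-wise z-score scaling: (x - mean) / std for each feature."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# Hypothetical feature matrix: two features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])
X_scaled, mu, sigma = standardize(X)   # both columns now have mean 0 and unit variance
```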
Steps of Multivariate Regression Analysis
• Select loss function and hypothesis –
The loss function measures the error, that is, how far the hypothesis's
predictions deviate from the actual values. Here, the hypothesis is the value
predicted from the features/variables.
• Set hypothesis parameters –
The hypothesis parameters need to be set in such a way that the loss function
is reduced and the model predicts well (see the sketch below).
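A short sketch of the common choices implied above: a linear hypothesis and mean squared error as the loss. These specific choices are assumptions here; the slides only pin down MSE later.

```python
import numpy as np

def hypothesis(X, b):
    """Predicted values for a design matrix X (with an intercept column) and parameters b."""
    return X @ b

def mse_loss(X, y, b):
    """Average squared deviation between predicted and actual values."""
    residuals = y - hypothesis(X, b)
    return np.mean(residuals ** 2)
```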
Steps of Multivariate Regression Analysis
• Minimize the loss function –
The loss function is minimized by running a loss-minimization algorithm on the
dataset, which adjusts the hypothesis parameters. Once the loss is minimized,
the fitted model can be used for further work. Gradient descent is one of the
algorithms commonly used for loss minimization (a sketch follows this list).
• Test the hypothesis function –
The hypothesis function also needs to be checked on the values it predicts.
Once this is done, it has to be evaluated on held-out test data.
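A minimal gradient-descent sketch for the MSE loss above; the learning rate and iteration count are illustrative assumptions, not values from the slides.

```python
import numpy as np

def gradient_descent(X, y, lr=0.01, n_iter=5000):
    """Iteratively adjust parameters b to reduce the MSE loss."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = -2.0 / n * X.T @ (y - X @ b)   # gradient of the MSE loss w.r.t. b
        b -= lr * grad
    return b

# Usage: X must include a column of ones for the intercept, e.g.
# X = np.column_stack([np.ones(n), x1, x2]); b_hat = gradient_descent(X, y)
```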
Multivariate Regression Analysis
• A multiple regression analysis involves estimation, testing, and diagnostic
procedures designed to fit the multiple regression model
y = β0 + β1x1 + β2x2 + β3x3 + … + βkxk + ϵ,  so that  E(y) = β0 + β1x1 + … + βkxk,
to a set of data.
Multivariate Regression Model in Matrix Form
• In matrix notation the multiple regression model is: Y = Xβ + ϵ
• where Y = (y1, y2, …, yn)ᵀ, ϵ = (ϵ1, ϵ2, …, ϵn)ᵀ, β = (β0, β1, …, βk)ᵀ, and X is the
design matrix whose i-th row is (1, xi1, xi2, …, xik).
• Y and ϵ are n×1 vectors, β is a (k+1)×1 vector, and X is an n×(k+1) matrix.
• The Gauss–Markov assumptions are: E(ϵ) = 0, Var(ϵ) = σ²I.
• The least-squares estimate of β is b = (XᵀX)⁻¹XᵀY.
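A minimal sketch of the least-squares estimate from this slide. Solving the normal equations with np.linalg.solve rather than forming the explicit inverse is a standard numerical choice, not something the slides prescribe.

```python
import numpy as np

def ols_estimate(X, y):
    """Least-squares estimate b = (X^T X)^(-1) X^T y, computed via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Usage (illustrative): the column of ones plays the role of the intercept.
# X = np.column_stack([np.ones(n), x1, x2]); b = ols_estimate(X, y)
```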
Mean Squared Error (MSE)
• The Mean squared error (MSE) represents the error of the estimator or predictive
model created based on the given set of observations in the sample.
• The MSE is used to represent the penalty of the model for each of its predictions;
in other words, it can be used to represent the cost associated with a prediction.
Squared penalties are advantageous because they exaggerate the difference
between the true value and the predicted value.
We can find MSE by: MSE = (1/n) Σ (yi − ŷi)², the average squared difference
between the actual and predicted values.
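A direct translation of the MSE formula above (dividing by n; some texts divide by n − k − 1 instead, which the slides do not specify).

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
```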
Accuracy
• Accuracy, or R-squared, is the ratio of the sum of squares due to regression (SSR)
to the total sum of squares (SST).
• The R-squared value is used to measure the goodness of fit of the best-fit line. The
greater the value of R-squared, the better the regression model, because more of
the variation of the actual values around the mean is explained by the
regression model.
We can find accuracy by: R² = SSR / SST = 1 − SSE / SST.
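A short sketch of R-squared as defined above, computed from the residual (SSE) and total (SST) sums of squares.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE/SST (equivalently SSR/SST)."""
    y_true = np.asarray(y_true)
    sse = np.sum((y_true - np.asarray(y_pred)) ** 2)   # unexplained variation
    sst = np.sum((y_true - y_true.mean()) ** 2)        # total variation
    return 1.0 - sse / sst
```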
Advantages of Multivariate Regression
• The most important advantage of multivariate regression is that it helps us
understand the relationships among the variables present in the dataset.
• This, in turn, helps in understanding the correlation between the dependent
and independent variables. Multivariate linear regression is a widely used
machine learning algorithm.
Disadvantages of Multivariate Regression
Multivariate techniques are somewhat complex and require a high level of
mathematical calculation.
The multivariate regression model's output is sometimes not easy to interpret,
because its loss and error outputs are not identical and must be read together.
This model does not have much scope on smaller datasets, so it cannot be applied
to them effectively; the results are better for larger datasets.
Conclusion
The main reason to use multivariate regression is that more than one explanatory
variable is available, in which case simple linear regression will not work.
Real-world problems usually involve multiple variables or features, and when
multiple variables/features come into play, multivariate regression is used.
