MULTIPLE REGRESSION
www.themegallery.com
Multiple Regression
Multiple regression analysis is used:
• To estimate the effect of several independent variables, X1, X2, ..., Xk, on a dependent variable Y.
• To predict the value of the dependent variable based on the values of the independent variables X1, X2, ..., Xk.
Model
General Model:
Yi = β0 + β1X1i + β2X2i + β3X3i + ... + βkXki + εi
i = observation index (i = 1, ..., n, the number of observations)
k = number of independent variables
β0, β1, β2, ..., βk = parameters/regression coefficients
X1, X2, ..., Xk = independent variables
Y = dependent variable
ε = error term
3
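As a concrete sketch of the general model, ordinary least squares can recover the coefficients from data. This NumPy example uses made-up data; the true parameters 5, 2 and 3 are assumptions for illustration:

```python
import numpy as np

# Simulate n observations from Yi = b0 + b1*X1i + b2*X2i + ei
rng = np.random.default_rng(0)
n = 100
X1 = rng.uniform(20, 60, n)
X2 = rng.uniform(0, 30, n)
e = rng.normal(0.0, 1.0, n)
Y = 5.0 + 2.0 * X1 + 3.0 * X2 + e   # assumed true parameters: 5, 2, 3

# Design matrix with a column of ones for the intercept b0
X = np.column_stack([np.ones(n), X1, X2])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta)   # estimates close to [5, 2, 3]
```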
www.themegallery.com
Examination of Regression
• Coefficient of determination (R²)
• Hypothesis test:
  – F-Test → regression model
  – t-Test → coefficient of regression
• Classic assumption test:
  – Normality test
  – Multicollinearity test
  – Homoskedasticity test
  – Autocorrelation test
Coefficient of Determination (R²)
• R² measures the strength of association between the dependent and independent variables.
• R² is interpreted as the proportion of variance in the dependent variable that is explained by the independent variables.
• The property of R²: 0 ≤ R² ≤ 1.
• R² → the larger, the better.
• Use adjusted R² in multiple regression.
• Example:
  – In a wine price study (the independent variable is growing-season temperature), R² was found to be 0.80. This means eighty percent of the variance in price may be explained by growing-season temperature.
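R² and adjusted R² can be computed by hand from the residual and total sums of squares. A sketch on synthetic data (coefficients and noise level are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
ss_res = resid @ resid                    # residual sum of squares
ss_tot = ((y - y.mean()) ** 2).sum()      # total sum of squares
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
print(r2, adj_r2)
```

Adjusted R² is never larger than R², which is why it is preferred for comparing models with different numbers of predictors.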
F-Test
• The F-test is used to test whether the independent variables in the model are jointly significant.
• Hypothesis test:
H0 : β1 = β2 = β3 = ... = βk = 0
H1 : at least one βj ≠ 0
where k = number of independent variables
• Output of the F-test → see the ANOVA (Analysis of Variance) table:
  – Sig. ≥ 0.05 → do not reject H0 → the independent variables jointly have no significant effect on the dependent variable.
  – Sig. < 0.05 → reject H0 → the independent variables jointly have a significant effect on the dependent variable → the model is good.
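The F statistic in the ANOVA table can be reproduced by hand as F = (SSR/k) / (SSE/(n − k − 1)). A minimal NumPy sketch on synthetic data (the coefficients and sample size are illustrative assumptions); the result would be compared with the F(k, n − k − 1) critical value from a table:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.5, 1.5, -2.0]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
sse = ((y - yhat) ** 2).sum()         # residual (error) sum of squares
ssr = ((yhat - y.mean()) ** 2).sum()  # regression sum of squares
F = (ssr / k) / (sse / (n - k - 1))
print(F)   # a large F -> reject H0 that all slopes are zero
```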
t-Test
• The t-test is used to check the significance of individual (partial) regression coefficients in the multiple linear regression model.
• Hypothesis test:
H0 : βi = 0
H1 : βi ≠ 0
where i = 1, 2, 3, ..., k (one test per independent variable)
• From the Coefficients table (in the regression output) → find the Sig. (significance) value for each independent variable:
  – If Sig. ≥ 0.05 → do not reject H0 → the independent variable has no significant effect on the dependent variable.
  – If Sig. < 0.05 → reject H0 → the independent variable has a significant effect on the dependent variable.
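The t statistics behind the Coefficients table come from t_j = b_j / SE(b_j), with SE(b_j) taken from the diagonal of σ̂²(XᵀX)⁻¹. A hedged sketch on synthetic data (one slope is deliberately set to zero):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 3.0, 0.0]) + rng.normal(size=n)  # second slope truly 0

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - k - 1)               # unbiased error variance
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t = beta / se                                      # one t statistic per coefficient
print(t)   # |t| is compared with the t(n-k-1) critical value
```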
Basic Assumptions for Regression
There are 4 main classic/basic assumptions:
• Normality
• No multicollinearity
• Homoskedasticity
• No autocorrelation
Classical Assumptions Test in SPSS
No. | Assumption | Detector | Notes
1. | Normality | Normal P-P Plot of Regression Standardized Residual | The data points form a linear pattern, spreading approximately along the diagonal line.
2. | Homoskedasticity | Scatter Plot | Scatter plot of standardized residual *ZRESID against standardized predicted value *ZPRED → no specific pattern → the residual variance is constant → homoskedastic. If the scatter plot forms any pattern → heteroskedastic → the residual variance is not constant.
3. | Multicollinearity | VIF (Variance Inflation Factor) | The value of VIF ranges from 1 to ∞: VIF ≤ 10 → no multicollinearity; VIF > 10 → multicollinearity. TOL (Tolerance): the value of TOL ranges from 0 to 1; TOL ≈ 0 → multicollinearity; TOL ≈ 1 → no multicollinearity.
Classical Assumptions Test in PASW/SPSS (continued)
No. | Assumption | Detector | Notes
3. | Multicollinearity | Eigenvalues | Eigenvalues ≈ 0 → multicollinearity; Eigenvalues ≈ 1 → no multicollinearity. Condition Index (CI): CI > 15 → multicollinearity; CI ≤ 15 → no multicollinearity.
4. | Autocorrelation | Durbin-Watson (DW) | The value of DW ranges from 0 to 4. Compare the DW from the output with the dL and dU values from the Durbin-Watson table:
• DW < dL → positive autocorrelation
• dL ≤ DW ≤ dU → inconclusive
• dU < DW < 4 − dU → no autocorrelation
• 4 − dU ≤ DW ≤ 4 − dL → inconclusive
• DW > 4 − dL → negative autocorrelation
© 2007 Prentice Hall
Assumptions
• The error term is normally distributed. For each fixed value of X, the distribution of Y (the dependent variable) is normal.
• The means of all these normal distributions of Y, given X, lie on a straight line with slope β.
• The mean of the error term is 0.
• The variance of the error term is constant. This variance does not depend on the values assumed by X → homoscedasticity.
• The error terms are uncorrelated. In other words, the observations have been drawn independently → no autocorrelation.
Multicollinearity
• Multicollinearity → there is a linear relationship among the independent variables.
• For instance:
Yi = β0 + β1X1 + β2X2 + β3X3 + ui
Y : Consumption
X1 : Total income
X2 : Income from salary
X3 : Income from non-salary sources
Total income (X1) = Income from salary (X2) + Non-salary income (X3)
→ multicollinearity exists
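The income example can be checked numerically: when X1 = X2 + X3 holds exactly, the design matrix loses a rank and XᵀX is singular, so the coefficients are not identifiable. A sketch with made-up income figures:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
x2 = rng.uniform(1000, 3000, n)   # income from salary
x3 = rng.uniform(0, 500, n)       # income from non-salary sources
x1 = x2 + x3                      # total income: exact linear combination

X = np.column_stack([np.ones(n), x1, x2, x3])
print(np.linalg.matrix_rank(X))   # 3, not 4: perfect multicollinearity
```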
Consequences of Multicollinearity:
• Variances of the coefficient estimates are high.
• Confidence intervals are wide (high variance → high standard error → wide confidence interval).
• R² is high but many variables are not significant.
Multicollinearity
Multicollinearity Detection
1. Eigenvalues and Condition Index (CI)
• Multicollinearity exists in the regression equation if the eigenvalues are approximately zero (0).
• Relationship between the eigenvalues and the Condition Index (CI):
CI = √(eigenvalue_max / eigenvalue_min)
• If CI > 15 → there is multicollinearity
• If CI ≤ 15 → there is no multicollinearity
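A sketch of the Condition Index computed from the eigenvalues of the column-scaled XᵀX matrix (the data and the near-collinear construction are illustrative assumptions):

```python
import numpy as np

def condition_index(X):
    # CI = sqrt(largest eigenvalue / smallest eigenvalue) of the scaled X'X
    Xs = X / np.linalg.norm(X, axis=0)        # columns scaled to unit length
    eig = np.linalg.eigvalsh(Xs.T @ Xs)
    return np.sqrt(eig.max() / eig.min())

rng = np.random.default_rng(8)
a = rng.normal(size=100)
X_bad = np.column_stack([a, a + rng.normal(scale=0.01, size=100)])  # near-collinear
X_ok = np.column_stack([a, rng.normal(size=100)])
print(condition_index(X_bad), condition_index(X_ok))
```

The near-collinear pair pushes CI far above the 15 cutoff, while the independent pair stays close to 1.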
Multicollinearity Detection
2. Variance Inflation Factor (VIF)
VIF_j = 1 / (1 − R_j²) ; j = 1, 2, ..., k
k = number of independent variables
R_j² is the coefficient of determination from regressing independent variable j on the other independent variables.
If VIF > 10 → there is multicollinearity
If VIF ≤ 10 → there is no multicollinearity
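The definition above can be sketched directly: regress each predictor on the others and take 1/(1 − R_j²). Synthetic data; x2 is built to be nearly collinear with x1 (an assumption for illustration):

```python
import numpy as np

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2): regress column j on the remaining columns."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    beta, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ beta
    r2 = 1 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
print([vif(X, j) for j in range(3)])   # first two far above 10, third near 1
```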
Multicollinearity Detection
3. Tolerance (TOL)
• VIF is related to TOL as follows:
TOL_j = 1 / VIF_j = 1 − R_j²
• If TOL is approximately 1 → there is no multicollinearity
• If TOL is approximately 0 → there is multicollinearity
Handling Multicollinearity
• Delete the variables that have a strong relationship with other variables.
  – Commonly used.
  – Be careful when deleting a variable.
• Transform the variables.
• Add more samples/observations.
Homoskedasticity
• A homoskedastic error is one that has constant variance → basic assumption.
• A heteroskedastic error is one that has non-constant variance.
• Heteroskedasticity occurs when the variance of the error is not constant.
• Heteroskedasticity is more commonly a problem for cross-sectional data sets, although a time-series model can also have non-constant variance.
Examination of Homoskedasticity
Graph Method
• Principle: check the pattern of the squared residuals (ui²) against the predicted values of Yi.
• The steps:
  – Run the regression model.
  – Make a scatter plot of ui² against the predicted Yi.
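The graph method can also be approximated numerically: fit the model, then compare the residual spread across the range of the predictor. A sketch with synthetic data whose error variance grows with x (an assumed setup for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
x = rng.uniform(1, 100, n)
y = 5 + 2 * x + x * rng.normal(scale=0.1, size=n)  # error sd proportional to x

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
lo, hi = resid[x < 50], resid[x >= 50]
print(hi.std() / lo.std())   # well above 1 -> heteroskedastic
```

For homoskedastic data this ratio would be close to 1.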
Homoskedasticity
[Scatter plot of ui² against observation i]
Observation:
1. There is no systematic pattern.
2. Variance is constant → homoskedastic data
Homoskedasticity
[Scatter plots of ui² against observation i]
Observation:
1. There is a systematic pattern.
2. Variance is not constant → heteroskedastic data
Homoskedasticity
Handling Heteroskedasticity
• Transform the data using logarithms. The objective of this transformation is to reduce the scale differences between the independent variables, so that the error variance becomes small and does not differ much between observation groups.
• The model is:
Ln Yj = β0 + β1 Ln Xj + uj
Autocorrelation
• Autocorrelation → correlation of a variable with itself across observations at different times or for different individuals.
• Commonly found in time-series data: current values are influenced by values from previous periods. Examples: data on weight, salary/wage, etc.
• One detector → examine the relationship pattern between the residuals (ui) and an independent variable or time (X).
Autocorrelation Pattern
[Two sketches of residuals ui plotted against time/X: (1) a cyclical pattern; (2) a linear trend]
• Diagram (1) shows a cycle, whereas diagram (2) shows a linear line. Both indicate autocorrelation.
Autocorrelation
Autocorrelation Detection
• Use the Durbin-Watson (DW) statistic.
• Compare the DW from the SPSS output with the dL and dU values in the DW table.
• The rules:
  • DW < dL → positive autocorrelation
  • dL ≤ DW ≤ dU → inconclusive
  • dU < DW < 4 − dU → no autocorrelation
  • 4 − dU ≤ DW ≤ 4 − dL → inconclusive
  • DW > 4 − dL → negative autocorrelation
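The DW statistic itself is easy to compute from the residuals, DW = Σ(e_t − e_{t−1})² / Σ e_t². A sketch comparing independent errors (DW near 2) with strongly positively autocorrelated errors (DW near 0); the AR(1) construction is an assumption for illustration:

```python
import numpy as np

def durbin_watson(resid):
    # DW = sum of squared successive differences / sum of squared residuals
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(6)
n = 500
e_indep = rng.normal(size=n)        # independent errors -> DW near 2
e_ar = np.empty(n)                  # AR(1) errors with rho = 0.8 -> DW well below 2
e_ar[0] = rng.normal()
for t in range(1, n):
    e_ar[t] = 0.8 * e_ar[t - 1] + rng.normal()
print(durbin_watson(e_indep), durbin_watson(e_ar))
```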
Autocorrelation Detection
[Number line of DW from 0 to 4: 0–dL positive correlation; dL–dU inconclusive; dU–(4−dU) no autocorrelation; (4−dU)–(4−dL) inconclusive; (4−dL)–4 negative correlation]
Application
• A company has salesperson data consisting of age, income, and work experience. The director wants to know whether there is any relationship between age and work experience and the income of a salesperson. The company also wants to build a multiple regression model to predict income based on age and work experience.
• Regression model:
Yi = b0 + b1X1 + b2X2 + ui
Y : Income
X1 : Age
X2 : Work experience
b0, b1, b2 : Parameters
ui : residual
Based on the data in file multiple_regression1.sav, we will find the multiple regression equation, y = b0 + b1x1 + b2x2, and conduct hypothesis tests to determine whether the regression coefficients are significant.
The steps:
1. Open file multiple_regression1.sav.
2. Click Analyze → Regression → Linear.
3. In the Linear Regression dialog, move variable Income to the Dependent box, then variables Age and Experience to the Independent(s) box.
4. In the Method section: select Enter.
5. Click the Statistics button, then check Estimates, Model fit, Collinearity Diagnostics and Durbin-Watson.
6. Click Continue.
7. Click Plots.
8. In the Standardized Residual Plots section, select Normal probability plot. Then move *ZRESID (standardized residual) to the Y box and *ZPRED (standardized predicted value) to the X box.
9. Click Continue, then OK.
Application
1. Coefficient of Determination (R²)
Adjusted R² shows that 92.7% of the variance of Income can be explained by changes in Experience and Age.
2. Autocorrelation Test
• Durbin-Watson (DW) value = 1.497
• From the DW table, with k = 2 (independent variables) and α = 0.05, we find dL = 0.6972; dU = 1.6413; 4 − dU = 2.3587; 4 − dL = 3.3028.
• Since dL (0.6972) ≤ DW (1.497) ≤ dU (1.6413) → the test is inconclusive
Output Interpretation
3. F-Test and t-Test:
[ANOVA and Coefficients tables from the SPSS output; predictors: Age, Experience]
Output Interpretation
• F-Test:
From the ANOVA table:
Sig. (p value) = 0.000 < α = 0.05 → the independent variables jointly influence the dependent variable significantly → the model is good
Hypothesis test:
H0 : b1 = b2 = 0
H1 : at least one of b1, b2 ≠ 0
Sig. (p value) = 0.000 < α = 0.05 → reject H0 → b1 and b2 are not both equal to zero
Output Interpretation
• t-Test
– To test whether each regression coefficient is significant or not → see the Coefficients table.
From the Coefficients table:
– Variable Age (X1):
Hypothesis test:
H0 : b1 = 0
H1 : b1 ≠ 0
Sig. : 0.000 < 0.05 → significant → the variable Age affects income significantly
Output Interpretation
• t-Test (Cont.)
– Variable Experience (X2):
Hypothesis test:
H0 : b2 = 0
H1 : b2 ≠ 0
Sig. : 0.013 < 0.05 → significant → the variable Experience affects income significantly
Output Interpretation
• Regression equation:
From the Coefficients table:
Y = -10360.5 + 1201.098 X1 + 1663.516 X2
where: Y = Income
X1 = Age
X2 = Work experience
Interpretation of the regression parameters:
-10360.5 → intercept; the value of Y if X1 and X2 are zero.
+1201.098 → for every one-unit increase in X1, Y increases by 1201.098 units (holding X2 constant).
+1663.516 → for every one-unit increase in X2, Y increases by 1663.516 units (holding X1 constant).
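The fitted equation can be used directly for the prediction goal stated in the Application slide. A sketch using the coefficients above; the inputs (age 30, 5 years of experience) are hypothetical:

```python
def predict_income(age, experience):
    # Y = -10360.5 + 1201.098*X1 + 1663.516*X2 from the Coefficients table
    return -10360.5 + 1201.098 * age + 1663.516 * experience

# Hypothetical salesperson: 30 years old, 5 years of work experience
print(predict_income(30, 5))   # 33990.02
```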
Output Interpretation
4. Normality Test:
• See the Normal P-P Plot of Regression → the data points spread approximately along the diagonal line, forming a linear pattern → the data distribution is normal.
Output Interpretation
The data points spread approximately along the diagonal line and form a linear pattern → the data distribution is normal.
5. Homoskedasticity Test
• See the Scatterplot output below:
• The dispersion of the data does not form a specific pattern → its variance is constant → homoskedastic.
Output Interpretation
6. Multicollinearity Test
See the Coefficients table, Collinearity Statistics column:
• The value of VIF = 1.377 < 10 → no multicollinearity
• The value of TOL = 0.726 → approximately 1 → no multicollinearity
Multiple Regression Output Interpretation
Age
Experience
LOGO
www.themegallery.com

More Related Content

Similar to Analyze Multiple Regression Model to Predict Income

Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spssDr Nisha Arora
 
Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Stephen Ong
 
Data Analysison Regression
Data Analysison RegressionData Analysison Regression
Data Analysison Regressionjamuga gitulho
 
linear regression PDF.pdf
linear regression PDF.pdflinear regression PDF.pdf
linear regression PDF.pdfJoshuaLau29
 
Regression analysis
Regression analysisRegression analysis
Regression analysissaba khan
 
Diagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelDiagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelMehdi Shayegani
 
simple-linear-regression (1).pptx
simple-linear-regression (1).pptxsimple-linear-regression (1).pptx
simple-linear-regression (1).pptxShrutiGupta3922
 
Multicolinearity
MulticolinearityMulticolinearity
MulticolinearityPawan Kawan
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.pptTanyaWadhwani4
 
08 Inference for Networks – DYAD Model Overview (2017)
08 Inference for Networks – DYAD Model Overview (2017)08 Inference for Networks – DYAD Model Overview (2017)
08 Inference for Networks – DYAD Model Overview (2017)Duke Network Analysis Center
 
Intro to econometrics
Intro to econometricsIntro to econometrics
Intro to econometricsGaetan Lion
 
Presentation on Regression Analysis
Presentation on Regression AnalysisPresentation on Regression Analysis
Presentation on Regression AnalysisJ P Verma
 
regression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdfregression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdfMuhammadAftab89
 
SURE Model_Panel data.pptx
SURE Model_Panel data.pptxSURE Model_Panel data.pptx
SURE Model_Panel data.pptxGeetaShreeprabha
 

Similar to Analyze Multiple Regression Model to Predict Income (20)

Chapter 14
Chapter 14 Chapter 14
Chapter 14
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Math(2)
Math(2)Math(2)
Math(2)
 
7. logistics regression using spss
7. logistics regression using spss7. logistics regression using spss
7. logistics regression using spss
 
LINEAR REGRESSION.pptx
LINEAR REGRESSION.pptxLINEAR REGRESSION.pptx
LINEAR REGRESSION.pptx
 
Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02
 
Data Analysison Regression
Data Analysison RegressionData Analysison Regression
Data Analysison Regression
 
linear regression PDF.pdf
linear regression PDF.pdflinear regression PDF.pdf
linear regression PDF.pdf
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Diagnostic methods for Building the regression model
Diagnostic methods for Building the regression modelDiagnostic methods for Building the regression model
Diagnostic methods for Building the regression model
 
simple-linear-regression (1).pptx
simple-linear-regression (1).pptxsimple-linear-regression (1).pptx
simple-linear-regression (1).pptx
 
Corrleation and regression
Corrleation and regressionCorrleation and regression
Corrleation and regression
 
Multicolinearity
MulticolinearityMulticolinearity
Multicolinearity
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.ppt
 
08 Inference for Networks – DYAD Model Overview (2017)
08 Inference for Networks – DYAD Model Overview (2017)08 Inference for Networks – DYAD Model Overview (2017)
08 Inference for Networks – DYAD Model Overview (2017)
 
Regression
RegressionRegression
Regression
 
Intro to econometrics
Intro to econometricsIntro to econometrics
Intro to econometrics
 
Presentation on Regression Analysis
Presentation on Regression AnalysisPresentation on Regression Analysis
Presentation on Regression Analysis
 
regression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdfregression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdf
 
SURE Model_Panel data.pptx
SURE Model_Panel data.pptxSURE Model_Panel data.pptx
SURE Model_Panel data.pptx
 

Recently uploaded

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 

Analyze Multiple Regression Model to Predict Income

  • 2. www.themegallery.com Multiple Regression Multiple regression analysis is used:  To know the effect of some independent variables, X1, X2, ...,Xk to dependent variable Y.  To predict a value of dependent variable based on the values of independent variables, X1, X2, ...,Xk . 2
  • 3. www.themegallery.com Model General Model: Yi = 0+1X1i+ 2X2i+3X3i+ ...+kXki+ i = Number of observations k = Number of independent variables 0, 1, 2 , ..., k = Parameter/regression coefficient X1 , X2 ......Xk = Independent variables Y = Dependent variable  = Error 3
  • 4. www.themegallery.com Examination of Regression  Coefficient of determination (R2)  Hypothesis test:  F-Test  regression model  t-Test  coefficient of regression  Classic assumption test:  Normality test  Multicolinearity test  Homoskedasticity test  Autocorrelation test 4
  • 5. www.themegallery.com Coefficent of Determination (R2)  R2 is used to measure the strength of association between dependent and independent variables.  R2 is interpreted as the proportion of variance in the dependent variable that is explained by dependent variables.  The property of R2 : 0 ≤ R2 ≤ 1.  R2  the larger, the better.  Use adjusted R2 in multiple regression.  Example:  From a wine price study (the independent variable is growing-season temperature), R2 was found = 0.80. It means eighty percent of the variance in price may be explained by growing-season temperature. 5
  • 6. www.themegallery.com F-Test  F-test is used to test whether a group of variables (independent variables) in the model are jointly significant.  Hypothesis test: H0 : 1 = 2 = 3 = 4 =............= k = 0 H1 : At least there is a  ≠ 0 where k = number of independent variable  Output of F-test  see ANOVA (Analysis of Variance) table. If:  Sig. ≥ 0.05  accept H0  the independent variables jointly have no significant effect to the dependent variable.  Sig. < 0.05  reject H0  the independent variables jointly have significant effect to the dependent variable  the model is good. 6
  • 7. www.themegallery.com t-Test  t-Test is used to check the significance of individual (partial) regression coefficients in the multiple linear regression model.  Hypothesis test: H0 : i = 0 H1 : i ≠ 0 where i = 1, 2, 3,..., k number of independent variable  From the Coefficient table (in the regression output)  see/find the Sig. (Significance) value for each independent variable:  If Sig. ≥ 0.05  accept H0, it means the independent variable has no significant effect to the dependent variable.  If Sig. < 0.05  reject H0, it means the independent variable has significant effect to the dependent variable. 7
  • 8. www.themegallery.com Basic Assumption for Regression There are 4 main classic/basic assumptions:  Normality  No multicollinearity  Homoskedasticity  No autocorrelation 8
  • 9. www.themegallery.com Classical Assumptions Test in SPSS No. Assumption Detector Notes 1. Normality Normal P-P Plot of Regression Standardized Residual Normal P-P Plot of Regression Standardized Residual shows that points of data form a linear pattern or spread approtimate to linear line. 2. Homoskedasticity Scatter Plot  Scatter plot of standardized residual *ZRESID and standardized predicted value *ZPRED  not form a specific pattern  variance of its residual is constant  homoskedastic  If its scatter plot form any pattern  heteroskedastic  variance of its residual is different 3. Multicolliniearity VIF (Variance Inflating Factor)  The value of VIF : 1 –  VIF ≤ 10  there is no multicollinearity  VIF > 10  there is multicollinearity TOL (Tolerance)  The value of TOL = 0 – 1;  TOL  0 : there is multicollinearity  TOL  1 : there is no multicollinearity 9 ∞
  • 10. www.themegallery.com Uji Asumsi Klasik pada PASW/SPSS No. Assumption Detector Notes 3. Multicollinearity Eigenvalues  Eigenvalues  0 : there is multicolinearity  Eigenvalues  1 : there is no multicollinearity Conditional Index (CI)  CI > 15  there is multicollinearity  CI ≤ 15  there is no multicollinearity 4. Autocorrelation Durbin-Watson (DW)  The value of DW = 0 – 4 Compare DW from output and value of d from the table (Durbin-Watson table) by condition: • If : DW < dL  positive correlation • If : dL ≤ DW ≤ dU  no conclusion/don’t know • If : dU < DW < 4 – dU  no autocorrelation • If : 4 - dU ≤ DW ≤ 4 - dL  no conclusion/don’t know • If : DW > 4 – dL  negative correlation Continuation: 10
  • 11. www.themegallery.com © 2007 Prentice Hall 17-11 Assumptions  The error term is normally distributed. For each fixed value of X, the distribution of Y (dependent variable) is normal.  The means of all these normal distributions of Y, given X, lie on a straight line with slope b.  The mean of the error term is 0.  The variance of the error term is constant. This variance does not depend on the values assumed by X  homoscedasticity  The error terms are uncorrelated. In other words, the observations have been drawn independently  no multicollinearity
  • 12. Multicollinearity
 Multicollinearity → there is a linear relationship between independent variables.
 For instance: Yi = β0 + β1X1i + β2X2i + β3X3i + ui
Y : Consumption
X1 : Total income
X2 : Income from salary
X3 : Income from non-salary sources
Total income (X1) = income from salary (X2) + non-salary income (X3) → multicollinearity exists 12
  • 13. Multicollinearity
Consequences of multicollinearity:
 Variances are high → confidence intervals are wide (high variance → high standard error → wide confidence interval).
 R2 is high but many variables are not significant. 13
  • 14. Multicollinearity Detection
1. Eigenvalues and Condition Index (CI)
 Multicollinearity exists in the regression equation if the eigenvalues are approximately zero (0).
 Relationship between the eigenvalues and the Condition Index (CI): CI = √(eigenvalue_max / eigenvalue_min)
• If CI > 15 → there is multicollinearity
• If CI ≤ 15 → there is no multicollinearity 14
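The CI formula above can be sketched in plain Python. The 2×2 matrix below is a hypothetical scaled X'X for two highly correlated predictors; the values are illustrative, not from the slides' data.

```python
import math

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    d = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean + d, mean - d

def condition_index(eig_max, eig_min):
    """CI = sqrt(eigenvalue_max / eigenvalue_min)."""
    return math.sqrt(eig_max / eig_min)

# Hypothetical scaled X'X matrix for two predictors with correlation 0.95
# (illustrative values only).
eig_max, eig_min = eigenvalues_2x2(1.0, 0.95, 1.0)   # 1.95 and 0.05
ci = condition_index(eig_max, eig_min)
print(round(ci, 2))  # sqrt(1.95 / 0.05) = sqrt(39), about 6.24
```

By the rule above, CI ≈ 6.24 ≤ 15, so the CI rule alone would not flag this pair; the smallest eigenvalue sitting close to zero is the complementary warning sign.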
  • 15. Multicollinearity Detection
2. Variance Inflation Factor (VIF)
VIF_j = 1 / (1 − R_j²) ; j = 1, 2, ……, k
k = number of independent variables
R_j² is the coefficient of determination from regressing the j-th independent variable on the other independent variables.
If VIF > 10 → there is multicollinearity
If VIF ≤ 10 → there is no multicollinearity 15
  • 16. Multicollinearity Detection
3. Tolerance (TOL)
 VIF is related to TOL as follows: TOL_j = 1 / VIF_j = 1 − R_j²
 If TOL is approximately 1 → no multicollinearity
 If TOL is approximately 0 → multicollinearity 16
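The VIF and TOL formulas can be sketched for the two-predictor case, where R_j² is simply the R² from regressing one predictor on the other. The data below are made up for illustration; they are not the slides' data set.

```python
def r_squared(x, y):
    """R-squared of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical, strongly related predictors (illustrative values only).
x1 = [25, 30, 35, 40, 45, 50]
x2 = [2, 5, 9, 14, 18, 24]

r2_j = r_squared(x1, x2)        # R_j^2 from regressing x2 on x1
vif = 1 / (1 - r2_j)            # VIF_j = 1 / (1 - R_j^2)
tol = 1 - r2_j                  # TOL_j = 1 / VIF_j = 1 - R_j^2
print(vif > 10, round(tol, 3))  # high VIF and TOL near 0 -> multicollinearity
```

Here VIF far exceeds 10 and TOL is close to 0, so both rules flag multicollinearity for these two predictors.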
  • 17. Handling Multicollinearity
 Delete one of the variables that has a strong relationship with another variable.
– Commonly used.
– Be careful when deleting a variable.
 Transform the variables.
 Add more observations (increase the sample size). 17
  • 18. Homoskedasticity
 A homoskedastic error is one that has constant variance → a basic assumption.
 A heteroskedastic error is one that has a non-constant variance.
 Heteroskedasticity occurs when the variance of the error is not constant.
 Heteroskedasticity is more commonly a problem for cross-section data sets, although a time-series model can also have a non-constant variance. 18
  • 19. Homoskedasticity
Examination of Homoskedasticity: Graph Method
 Principle: check the pattern of the squared residuals (ui²) against the predicted values of Yi.
 The steps:
 Run the regression model.
 Make a scatter plot of ui² against the predicted Yi. 19
  • 20. Homoskedasticity
[Scatter plot of ui² against the predicted Yi]
Observation:
1. There is no systematic pattern.
2. The variance is constant → homoskedastic data. 20
  • 21. Homoskedasticity
[Two scatter plots of ui² against the predicted Yi]
Observation:
1. There is a systematic pattern.
2. The variance is not constant → heteroskedastic data. 21
  • 22. Handling Heteroskedasticity
 Transform the data using logarithms. The objective of this transformation is to reduce the scale differences between variables, so that the error variance becomes small and does not differ too much across groups of observations.
 The model is: Ln Yj = β0 + β1 Ln Xj + uj 22
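A minimal sketch of why the Ln transform helps, using made-up income values (not the slides' data): logging compresses a wide range of values into a much narrower one, which tends to even out the residual variance.

```python
import math

# Hypothetical incomes spanning a wide range (illustrative values only).
y = [1200, 5400, 23000, 98000]
ln_y = [math.log(v) for v in y]

raw_spread = max(y) / min(y)        # about 82x on the original scale
log_spread = max(ln_y) / min(ln_y)  # about 1.6x after the Ln transform
print(round(raw_spread, 1), round(log_spread, 1))
```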
  • 23. Autocorrelation
 Autocorrelation → correlation of a variable with itself across observations at different times or for different individuals.
 Commonly found in time-series data: current data are influenced by data from previous periods. For example: data on weight, salary/wage, etc.
 One detector → examine the relationship pattern between the residuals (ui) and time or an independent variable (X). 23
  • 24. Autocorrelation: Autocorrelation Patterns
[Two plots of the residuals ui against time/X: (1) a cyclical pattern; (2) a linear trend]
Diagram (1) shows a cycle, whereas diagram (2) shows a linear trend. Both indicate autocorrelation. 24
  • 25. Autocorrelation Detection
 Use the Durbin-Watson (DW) statistic.
 Compare the DW from the SPSS output with the dL and dU values from the Durbin-Watson table.
 The rules:
• DW < dL → positive correlation
• dL ≤ DW ≤ dU → no conclusion/don't know
• dU < DW < 4 − dU → no autocorrelation
• 4 − dU ≤ DW ≤ 4 − dL → no conclusion/don't know
• DW > 4 − dL → negative correlation 25
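The decision rules above can be sketched directly in code. The helper below also computes the DW statistic itself from a residual series, using the standard definition DW = Σ(e_t − e_{t−1})² / Σe_t² (the definition is not spelled out on the slides).

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); values range from 0 to 4."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def dw_decision(dw, dl, du):
    """Apply the Durbin-Watson decision rules from the slide."""
    if dw < dl:
        return "positive correlation"
    if dw <= du:
        return "no conclusion/don't know"
    if dw < 4 - du:
        return "no autocorrelation"
    if dw <= 4 - dl:
        return "no conclusion/don't know"
    return "negative correlation"

# The worked example later in the slides: DW = 1.497, dL = 0.6972, dU = 1.6413.
print(dw_decision(1.497, 0.6972, 1.6413))  # -> no conclusion/don't know
```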
  • 26. Autocorrelation Detection
[Durbin-Watson number line from 0 to 4: positive correlation (0 to dL); no conclusion/don't know (dL to dU); no autocorrelation (dU to 4−dU); no conclusion/don't know (4−dU to 4−dL); negative correlation (4−dL to 4)] 26
  • 27. Application
 A company has sales-person data consisting of age, income and work experience. The director wants to know whether there is any relationship between age and work experience and the income of sales persons. Besides that, the company also wants to build a multiple regression model to predict income based on age and work experience.
 Regression model: Yi = b0 + b1X1 + b2X2 + ui
Y : Income
X1 : Age
X2 : Work experience
b0, b1, b2 : Parameters
ui : Residual 27
  • 28. Application
Based on the data in file multiple_regression1.sav, we will find the multiple regression equation, y = b0 + b1x1 + b2x2, and conduct hypothesis tests to determine whether the regression coefficients are significant or not. The steps:
1. Open file multiple_regression1.sav.
2. Click Analyze → Regression → Linear.
3. In the Linear Regression dialog, move variable Income to the Dependent box, then variables Age and Experience to the Independent(s) box.
4. In the Method section: select Enter.
5. Click the Statistics button, then check Estimates, Model fit, Collinearity diagnostics and Durbin-Watson.
6. Click Continue.
7. Click Plots.
8. In the Linear Regression: Plots dialog, under Standardized Residual Plots, select Normal probability plot. Then move *ZRESID (standardized residual) to the Y box and *ZPRED (standardized predicted value) to the X box.
9. Click Continue, then OK. 28
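For readers without SPSS, the same fit can be sketched in plain Python by solving the normal equations (X'X)b = X'y. The data below are synthetic and built so the true coefficients are known in advance; the file multiple_regression1.sav is not reproduced here.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on [X'X | X'y].
    m = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

# Synthetic data: income constructed exactly as 1000 + 500*age + 800*experience,
# so the fit should recover these coefficients (hypothetical values).
age = [25, 30, 35, 40, 45]
exp_yrs = [2, 5, 8, 12, 15]
income = [1000 + 500 * a + 800 * e for a, e in zip(age, exp_yrs)]

b0, b1, b2 = fit_two_predictor_ols(age, exp_yrs, income)
print(round(b0), round(b1), round(b2))  # -> 1000 500 800
```

SPSS performs the same least-squares computation internally; the point of this sketch is only to show what "Enter" method estimation produces.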
  • 29. Output Interpretation
1. Coefficient of Determination (R2)
The adjusted R2 shows that 92.7% of the variance in Income can be explained by the changes in Experience and Age.
2. Autocorrelation Test
 Durbin-Watson (DW) value = 1.497.
 From the DW table, with k = 2 (independent variables) and α = 0.05, we find dL = 0.6972; dU = 1.6413; 4 − dU = 2.3587; 4 − dL = 3.3028.
 Since DW = 1.497, we have dL (0.6972) ≤ DW (1.497) ≤ dU (1.6413) → no conclusion. 29
[Model Summary output; predictors: Experience, Age]
  • 30. Output Interpretation
3. F-Test and t-Test:
[ANOVA and Coefficients tables from the SPSS output; predictors: Experience, Age] 30
  • 31. Output Interpretation
 F-Test: From the ANOVA table: Sig. (p-value) = 0.000 < α = 0.05 → the independent variables jointly influence the dependent variable significantly → the model is good.
Hypothesis test:
H0 : b1 = b2 = 0
H1 : at least one of b1, b2 ≠ 0
Sig. (p-value) = 0.000 < α = 0.05 → reject H0 → at least one of b1, b2 is not equal to zero. 31
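As a sketch of the F statistic behind that p-value: it can be computed from R² as F = (R²/k) / ((1 − R²)/(n − k − 1)). The R² and sample size below are hypothetical, since the slides report only the p-value.

```python
def f_statistic(r2, n, k):
    """Overall F-test statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical values: R^2 = 0.93 with n = 20 observations, k = 2 predictors.
f = f_statistic(0.93, 20, 2)
print(round(f, 1))  # a large F gives a tiny p-value, so reject H0
```

An F this large lies far beyond the 5% critical value of the F(2, 17) distribution (roughly 3.6), which is why the reported Sig. is 0.000.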
  • 32. Output Interpretation
 t-Test
– To test whether each regression coefficient is significant or not → see the Coefficients table.
From the Coefficients table:
– Variable Age (X1):
Hypothesis test:
H0 : b1 = 0
H1 : b1 ≠ 0
Sig. : 0.000 < 0.05 → significant → variable Age affects Income significantly. 32
  • 33. Output Interpretation
 t-Test (cont.)
– Variable Experience (X2):
Hypothesis test:
H0 : b2 = 0
H1 : b2 ≠ 0
Sig. : 0.013 < 0.05 → significant → variable Experience affects Income significantly. 33
  • 34. Output Interpretation
 Regression equation: From the Coefficients table:
Y = -10360.5 + 1201.098 X1 + 1663.516 X2
where: Y = Income, X1 = Age, X2 = Work experience
Interpretation of the regression parameters:
-10360.5 → intercept; the value of Y if X1 and X2 are zero.
+1201.098 → for every one-unit increase in X1, Y increases by 1201.098 units.
+1663.516 → for every one-unit increase in X2, Y increases by 1663.516 units. 34
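Using the fitted equation, a prediction is just a plug-in calculation. The coefficients come from the slides; the input values (age 30, five years of experience) are hypothetical.

```python
# Coefficients taken from the slides' fitted model.
b0, b1, b2 = -10360.5, 1201.098, 1663.516

def predict_income(age, experience):
    """Y = b0 + b1*X1 + b2*X2 from the fitted regression equation."""
    return b0 + b1 * age + b2 * experience

# Hypothetical sales person: age 30, 5 years of work experience.
print(round(predict_income(30, 5), 2))  # -> 33990.02

# One extra year of experience raises the prediction by b2 = 1663.516 units,
# which is exactly the slope interpretation given above.
print(round(predict_income(30, 6) - predict_income(30, 5), 3))  # -> 1663.516
```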
  • 35. Output Interpretation
4. Normality Test:
 See the Normal P-P Plot of Regression → the data points spread approximately along the diagonal line and form a linear pattern → the data distribution is normal.
  • 36. Output Interpretation
The data points spread approximately along the diagonal line and form a linear pattern → the data distribution is normal. 36
  • 37. Output Interpretation
5. Homoskedasticity Test
 See the Scatterplot output:
 The dispersion of the data does not form a specific pattern → its variance is constant → homoskedastic. 37
  • 38. Multiple Regression Output Interpretation
6. Multicollinearity Test
See the Coefficients table, Collinearity Statistics column (variables Age, Experience):
 The value of VIF = 1.377 < 10 → no multicollinearity
 The value of TOL = 0.726 → approximately 1 → no multicollinearity 38