PREDICTIVE ANALYTICS
1. Data Analytics for Beginners
Predictive Analytics
September 5-9, 2022
Dr. Roma Mitra Debnath
Associate Professor (Applied Statistics)
Indian Institute of Public Administration
Email: roma.mitra@gmail.com
9/9/2022 1
Roma Mitra IIPA
2. Regression and Correlation
• Regression analysis is the process of
constructing a mathematical model that can
be used to predict or determine one variable
from another variable.
• Correlation is a measure of the degree of
relatedness of two variables.
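As a rough illustration of "degree of relatedness", the sketch below computes a correlation coefficient for the house price sample data that appears later in these notes. It assumes the Pearson product-moment coefficient is the intended measure, which the slide does not state; the function name `pearson_r` is made up for this example.

```python
from math import sqrt

# Sample data copied from the house price slide later in these notes.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_k = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s


def pearson_r(x, y):
    """Pearson correlation (assumed measure): co-variation scaled by both spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(square_feet, price_k)
print(f"r = {r:.4f}")
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.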
3. Regression Analysis
• Only one independent variable, X
• Relationship between X and Y is described
by a linear function
• Changes in Y are assumed to be caused by
changes in X
4. Regression Analysis
• Forecasting tool
• Assumes a causal relation between the dependent variable (DV) and the
independent variable (IV)
• Linear or non-linear relation
• Select a model based on the relation
Causal examples:
• Education (IV) reduces poverty (DV)
• Government expenditure on health (IV) and infant mortality
rate (DV)
• Efficiency of government depends on governance
• Corporate social responsibility and financial
performance of the organization
• …
5. Simple Regression Analysis
• Bivariate (two-variable) linear regression:
the most elementary regression model
– Dependent variable: the variable to be
predicted, usually called Y
– Independent variable: the predictor or
explanatory variable, usually called X
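6. Simple Linear Regression Model
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
• $Y_i$: dependent variable
• $\beta_0$: population Y intercept
• $\beta_1$: population slope coefficient
• $X_i$: independent variable
• $\varepsilon_i$: random error term
• Linear component: $\beta_0 + \beta_1 X_i$
• Random error component: $\varepsilon_i$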
7. Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the
population regression line:
$$\hat{Y}_i = b_0 + b_1 X_i$$
• $\hat{Y}_i$: estimated (or predicted) Y value for observation i
• $b_0$: estimate of the regression intercept
• $b_1$: estimate of the regression slope
• $X_i$: value of X for observation i
The individual random error terms $e_i$ have a mean of zero.
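The slide does not say how the intercept and slope estimates are obtained. Assuming the usual ordinary least squares criterion (an assumption, since the estimation method is not stated here), the estimates minimize the sum of squared residuals and have the following closed form, where $\bar{X}$ and $\bar{Y}$ are the sample means:

```latex
\begin{aligned}
b_1 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \\
b_0 &= \bar{Y} - b_1 \bar{X}, \\
e_i &= Y_i - \hat{Y}_i = Y_i - (b_0 + b_1 X_i).
\end{aligned}
```

Under this criterion, with an intercept in the model, the residuals $e_i$ sum to zero, which is consistent with the zero-mean statement above.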
11. Assumptions of Regression
• Linearity
– The underlying relationship between X and Y is linear
• Independence of Errors
– Error values are statistically independent
• Normality of Error
– Error values (ε) are normally distributed for any given value of X
• Equal Variance (Homoscedasticity)
– The probability distribution of the errors has constant variance
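These assumptions are usually checked by looking at the residuals of a fitted line. The sketch below is one simple way to produce them, assuming an ordinary least squares fit and using the house price sample data that appears later in these notes; the helper name `fit_line` is made up for this example, and the check is an informal eyeball test rather than a formal diagnostic.

```python
# Sample data copied from the house price slide later in these notes.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_k = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor (assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(square_feet, price_k)

# Residual = observed Y minus the value predicted by the fitted line.
residuals = [y - (b0 + b1 * x) for x, y in zip(square_feet, price_k)]
mean_residual = sum(residuals) / len(residuals)

# List residuals in order of X so a curve (non-linearity) or a widening
# spread (unequal variance) would be visible at a glance.
for x, e in sorted(zip(square_feet, residuals)):
    print(f"{x:>5} {e:>8.2f}")
print(f"mean residual = {mean_residual:.6f}")
```

If the residuals scatter around zero with no visible pattern and a roughly even spread across X, the linearity and equal-variance assumptions look plausible for this sample.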
13. Excel Output
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.947662558
R Square            0.898064324
Adjusted R Square   0.881075045
Standard Error      108.7575267
Observations        8

ANOVA
             df   SS            MS         F          Significance F
Regression   1    625246.3024   625246.3   52.86065   0.000344486
Residual     6    70969.19765   11828.2
Total        7    696215.5

              Coefficients    Standard Error   t Stat     P-value    Lower 95%       Upper 95%
Intercept     -46.29180548    64.89096049      -0.71338   0.502402   -205.0742654    112.4906545
Advertising   15.23977165     2.096101053      7.270533   0.000344   10.11079715     20.36874615
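To show how the figures in this output relate to each other, the sketch below recomputes the derived statistics from the sums of squares, degrees of freedom, and slope reported above. It uses the standard textbook relationships for a regression with one predictor; the variable names are chosen for this example and are not Excel's.

```python
from math import sqrt

# Values copied from the Excel SUMMARY OUTPUT above.
ss_regression = 625246.3024
ss_residual = 70969.19765
df_regression = 1
df_residual = 6
n = 8  # observations
slope = 15.23977165
slope_se = 2.096101053

ss_total = ss_regression + ss_residual

# Share of the variation in Y explained by the regression.
r_square = ss_regression / ss_total
multiple_r = sqrt(r_square)

# Adjusted R Square penalizes for the number of predictors (one here).
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# "Standard Error" in the top block is the square root of the residual MS.
standard_error = sqrt(ms_residual)

f_stat = ms_regression / ms_residual
t_stat = slope / slope_se  # with one predictor, t squared equals F

print(f"R Square          = {r_square:.6f}")
print(f"Multiple R        = {multiple_r:.6f}")
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
print(f"Standard Error    = {standard_error:.4f}")
print(f"F                 = {f_stat:.4f}")
print(f"t Stat (slope)    = {t_stat:.4f}")
```

Read this way, the fitted prediction line in the output is roughly Ŷ = -46.29 + 15.24 × Advertising, and about 90% of the variation in the dependent variable is explained by advertising.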
14. Sample Data for House Price Model: Quiz
House Price in $1000s (Y)   Square Feet (X)
245                         1400
312                         1600
279                         1700
308                         1875
199                         1100
219                         1550
405                         2350
324                         2450
319                         1425
255                         1700
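One possible way to work through this data is sketched below: it fits a simple linear regression by least squares, computes R Square, and predicts the price for a 2000 square foot house. The least-squares method and the 2000 square foot query are assumptions made for illustration; the quiz itself may ask for something different.

```python
# Sample data from the table above.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_k = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s

n = len(square_feet)
mean_x = sum(square_feet) / n
mean_y = sum(price_k) / n

# Sums of cross-products and squares around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(square_feet, price_k))
sxx = sum((x - mean_x) ** 2 for x in square_feet)

# Least-squares estimates of slope and intercept.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# R Square: one minus unexplained variation over total variation.
fitted = [b0 + b1 * x for x in square_feet]
ss_residual = sum((y - f) ** 2 for y, f in zip(price_k, fitted))
ss_total = sum((y - mean_y) ** 2 for y in price_k)
r_square = 1 - ss_residual / ss_total

# Hypothetical query: a 2000 square foot house (not part of the sample).
predicted_2000 = b0 + b1 * 2000

print(f"intercept b0 = {b0:.4f}")
print(f"slope b1     = {b1:.5f}")
print(f"R Square     = {r_square:.4f}")
print(f"predicted price for 2000 sq ft = {predicted_2000:.2f} ($1000s)")
```

The slope is the estimated change in price (in $1000s) for each additional square foot. The intercept has no practical meaning here because a house of zero square feet lies far outside the observed range of X.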
15. • The duration of the quiz is 20 minutes. There are 20
questions.
• Filling in the feedback form is mandatory.
• Please use the link (shared by email, WhatsApp,
and the chat box) and fill it in only once.
• Certificates will be uploaded to the IIPA Portal on
Tuesday (September 13, 2022). Please check in
the evening (Indian Standard Time).