REGRESSION ANALYSIS
PROF. DR. MUHAMMAD AZAM
Introduction
• The term regression was introduced by Sir Francis Galton in connection with the heights of parents and their children. For this purpose he collected height data on 1000 parents and their children. He concluded that tall parents tend to have tall children and short parents tend to have short children, but the children were not as tall or as short as their parents, i.e. their heights tended towards the average height. Galton called this tendency regression.
• Today the term regression has a quite different meaning: “It investigates the dependence of one variable (the dependent variable) upon one or more other variables (called independent variables) and provides an equation for estimating or predicting the average value of the dependent variable.”
Independent and Dependent Variable
• A variable whose values are fixed or determined by the experimenter is called an independent variable, e.g. the amount of fertilizer applied to different plots, as decided by the farmer. It is also called a regressor or predictor.
• On the other hand, a variable whose values are influenced or affected by the values of an independent variable is called a dependent variable, e.g. the wheat yield obtained from different plots using the specified amounts of fertilizer.
Simple Linear Regression
• Simple Linear Regression (SLR) is the study of the dependence of one variable (called the dependent variable) upon a single independent variable.
• For population data the SLR model is $Y = \alpha + \beta X + \varepsilon$
• For sample data the SLR model is $Y = a + bX + e$
• The estimated SLR model is $\hat{Y} = a + bX$
• Therefore $Y = \hat{Y} + e$
• Hence $e = Y - \hat{Y}$ is the error (residual).
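The link between the population model and the fitted sample model can be illustrated with simulated data. The sketch below is only an illustration: the values alpha = 2 and beta = 0.5, the sample size, the error standard deviation and the seed are hypothetical choices, not values from these slides.

```r
# Simulate data from the population model Y = alpha + beta*X + epsilon.
# alpha, beta, n and the seed are hypothetical values chosen for illustration.
set.seed(1)
alpha <- 2
beta  <- 0.5
n     <- 50
x   <- runif(n, min = 0, max = 10)
eps <- rnorm(n, mean = 0, sd = 1)
y   <- alpha + beta * x + eps

# Fit the sample model Y = a + bX + e by least squares.
fit   <- lm(y ~ x)
y_hat <- fitted(fit)     # estimated values Y-hat = a + bX
e     <- residuals(fit)  # errors e = Y - Y-hat

# Each observation decomposes as Y = Y-hat + e (up to floating-point rounding).
decomposition_ok <- isTRUE(all.equal(unname(y_hat + e), y))
print(coef(fit))         # a and b, the sample estimates of alpha and beta
print(decomposition_ok)
```

The estimates a and b will generally differ from alpha and beta because they are computed from one random sample.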
Method of Least Squares
• Method of Least Squares: According to the method of least squares, we choose the estimates of the unknown parameters ($\alpha$, $\beta$, etc.) that minimize the error sum of squares, i.e. the method gives the minimum value of $\sum e^2 = \sum (Y - \hat{Y})^2$.
• Estimation of Parameters: The least squares estimates of $\alpha$ and $\beta$ are:
• $b = \dfrac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$ and $a = \bar{y} - b\bar{x}$
• $R^2 = 1 - \dfrac{\sum e^2}{\sum (y - \bar{y})^2}$ where $\sum (y - \bar{y})^2 = \sum y^2 - \dfrac{(\sum y)^2}{n}$
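The estimation formulas can be sketched in R as follows. The helper names `ls_estimates` and `sse` and the small toy data set are hypothetical, introduced only to illustrate the formulas; they are not part of the slides.

```r
# Least squares estimates for simple linear regression from the sum formulas.
# ls_estimates() and sse() are hypothetical helper names; the toy data are made up.
ls_estimates <- function(x, y) {
  n <- length(x)
  b <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
  a <- mean(y) - b * mean(x)
  c(a = a, b = b)
}

# Error sum of squares for any candidate line y = a + b*x.
sse <- function(a, b, x, y) sum((y - (a + b * x))^2)

x_toy <- c(1, 2, 3, 4, 5)
y_toy <- c(2, 4, 5, 4, 5)

est <- ls_estimates(x_toy, y_toy)
print(est)

# The least squares line has a smaller error sum of squares than a nearby line.
sse_min   <- sse(est["a"], est["b"], x_toy, y_toy)
sse_other <- sse(est["a"] + 0.1, est["b"] - 0.05, x_toy, y_toy)
print(c(sse_min = sse_min, sse_other = sse_other))
```

Comparing against one perturbed line is only a spot check, not a proof of minimality; the formulas themselves come from setting the partial derivatives of $\sum e^2$ with respect to $a$ and $b$ to zero.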
Definitions
• Intercept: It is the estimated average value of the dependent variable when the independent variable is zero. It is denoted by $a$, which is an estimate of $\alpha$.
• Regression Coefficient: It is the average change in the value of the dependent variable (Y) due to a unit change in the value of the independent variable. It is denoted by $b$, which is an estimate of $\beta$.
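Both interpretations follow directly from the estimated model $\hat{Y} = a + bX$. The short derivation below only restates them algebraically, writing $\hat{Y}|_{X=x}$ for the estimated value at $X = x$ (a notation introduced here for convenience).

```latex
\begin{align*}
\hat{Y}\big|_{X=0} &= a + b(0) = a \\
\hat{Y}\big|_{X=x+1} - \hat{Y}\big|_{X=x} &= \bigl(a + b(x+1)\bigr) - (a + bx) = b
\end{align*}
```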
Application
• Example: The marketing manager of a large supermarket chain would like to use shelf space to predict the sales of pet food. A random sample of 8 equal-sized stores is selected, with the following results:
Shelf Space (feet), x:     5     5    10    10    15    15    20    20
Weekly Sales ($), y:     160   220   190   240   230   280   290   310
(1) Construct a scatter plot and interpret it.
(2) Fit a regression model of weekly sales on shelf space and show that the sum of errors is zero.
(3) Compute $R^2$ and interpret it.
Scatter Plot
X Y
5 160
5 220
10 190
10 240
15 230
15 280
20 290
20 310
x = c(5, 5, 10, 10, 15, 15, 20, 20)
y = c(160, 220, 190, 240, 230, 280, 290, 310)
plot(x, y, col = 2, main = "Scatter Plot", cex = 1.5, pch = 11)
# cex: character expansion
# pch: plot character
Fitting of Regression Model
Estimated Regression Model is given by:
$\hat{Y} = a + bx$
where
$b = \dfrac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$, $a = \bar{y} - b\bar{x}$, $\bar{y} = \dfrac{\sum y}{n}$ and $\bar{x} = \dfrac{\sum x}{n}$
# Using R
x = c(5, 5, 10, 10, 15, 15, 20, 20)
y = c(160, 220, 190, 240, 230, 280, 290, 310)
fit = lm(y ~ x)
fit
summary(fit)
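As a small follow-up sketch, the estimates can be pulled out of the fitted object with `coef()` and used for prediction with `predict()`. The store with 12 feet of shelf space is a hypothetical value chosen inside the observed range of 5 to 20 feet; it does not appear in the example data.

```r
# Shelf space (feet) and weekly sales ($) from the example
x <- c(5, 5, 10, 10, 15, 15, 20, 20)
y <- c(160, 220, 190, 240, 230, 280, 290, 310)
fit <- lm(y ~ x)

# Extract the intercept a and the regression coefficient b
a <- unname(coef(fit)["(Intercept)"])
b <- unname(coef(fit)["x"])
print(c(a = a, b = b))

# Predicted average weekly sales for a hypothetical store with 12 feet of shelf space
new_store <- data.frame(x = 12)
pred <- unname(predict(fit, newdata = new_store))
print(pred)
```

The column name in `new_store` must match the predictor name used in the model formula, here `x`.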
Fitting of Regression Model
     x      y      xy     x²       y²
     5    160     800     25    25600
     5    220    1100     25    48400
    10    190    1900    100    36100
    10    240    2400    100    57600
    15    230    3450    225    52900
    15    280    4200    225    78400
    20    290    5800    400    84100
    20    310    6200    400    96100
Total
   100   1920   25850   1500   479200
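The computation table and its column totals can be reproduced in R. This is a minimal sketch; the column names `xy`, `x2` and `y2` are labels chosen here for the products and squares.

```r
# Shelf space (feet) and weekly sales ($) from the example
x <- c(5, 5, 10, 10, 15, 15, 20, 20)
y <- c(160, 220, 190, 240, 230, 280, 290, 310)

# One row per store: x, y, x*y, x^2 and y^2
tab <- data.frame(x = x, y = y, xy = x * y, x2 = x^2, y2 = y^2)
totals <- colSums(tab)   # column totals used in the formulas for b and a

print(tab)
print(totals)
```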
Fitting of Regression Model
Estimated Regression Model is given by:
$\hat{Y} = a + bx$
where
$b = \dfrac{8(25850) - (100)(1920)}{8(1500) - (100)^2} = \dfrac{14800}{2000} = 7.4$
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{1920}{8} = 240$, $\bar{x} = \dfrac{\sum x}{n} = \dfrac{100}{8} = 12.5$
$a = \bar{y} - b\bar{x} = 240 - 7.4 \times 12.5 = 147.5$
$\hat{Y} = 147.5 + 7.4x$
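The same hand calculation can be checked in R directly from the column totals. The variable names below (`sum_x`, `sum_xy`, and so on) are chosen here for readability.

```r
# Totals taken from the computation table (n = 8 stores)
n      <- 8
sum_x  <- 100
sum_y  <- 1920
sum_xy <- 25850
sum_x2 <- 1500

# Regression coefficient and intercept from the least squares formulas
b <- (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x^2)
a <- sum_y / n - b * sum_x / n

print(c(a = a, b = b))
```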
Fitting of Regression Model
ŷ = 147.5 + 7.4x    e = y - ŷ        e²
      184.5           -24.5       600.25
      184.5            35.5      1260.25
      221.5           -31.5       992.25
      221.5            18.5       342.25
      258.5           -28.5       812.25
      258.5            21.5       462.25
      295.5            -5.5        30.25
      295.5            14.5       210.25
Total
       1920               0         4710
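The table of estimated values and errors can be recomputed in R to confirm that the errors sum to zero. This sketch plugs in the fitted line 147.5 + 7.4x from the example.

```r
# Shelf space (feet) and weekly sales ($) from the example
x <- c(5, 5, 10, 10, 15, 15, 20, 20)
y <- c(160, 220, 190, 240, 230, 280, 290, 310)

y_hat <- 147.5 + 7.4 * x   # estimated weekly sales
e     <- y - y_hat         # errors (residuals)

res_tab <- data.frame(y_hat = y_hat, e = e, e2 = e^2)
print(res_tab)
print(colSums(res_tab))    # totals of y_hat, e and e^2
```

Because of floating-point rounding, the printed sum of errors may be a very small number rather than an exact zero.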
Coefficient of Determination ($R^2$)
$R^2 = 1 - \dfrac{\sum e^2}{\sum (y - \bar{y})^2}$ where $\sum (y - \bar{y})^2 = \sum y^2 - \dfrac{(\sum y)^2}{n}$
$\sum (y - \bar{y})^2 = 479200 - \dfrac{(1920)^2}{8} = 18400$
$R^2 = 1 - \dfrac{4710}{18400} = 0.7440$ or 74.40%
This means that 74.40% of the variation in weekly sales (in $) of pet food is explained by shelf space (in feet).
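As a cross-check, $R^2$ can be computed in R from the same two sums of squares and compared with the value reported by `summary()`. The names `sse`, `sst` and `r2` are chosen here.

```r
# Shelf space (feet) and weekly sales ($) from the example
x <- c(5, 5, 10, 10, 15, 15, 20, 20)
y <- c(160, 220, 190, 240, 230, 280, 290, 310)
fit <- lm(y ~ x)

sse <- sum(residuals(fit)^2)             # error sum of squares
sst <- sum(y^2) - sum(y)^2 / length(y)   # total sum of squares
r2  <- 1 - sse / sst

print(round(r2, 4))
print(summary(fit)$r.squared)            # R-squared as reported by lm()
```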
Coefficient of Determination ($R^2$)
It is the ratio of the “Explained Variation” to the “Total Variation”. It tells us about the contribution of the independent variable to the variation in the dependent variable. Here
Total Variation = Explained Variation + Unexplained Variation
Explained Variation = Total Variation - Unexplained Variation
$R^2 = \dfrac{\text{Explained Variation}}{\text{Total Variation}} = \dfrac{\text{Total Variation} - \text{Unexplained Variation}}{\text{Total Variation}}$
$R^2 = 1 - \dfrac{\text{Unexplained Variation}}{\text{Total Variation}} = 1 - \dfrac{\sum e^2}{\sum (y - \bar{y})^2}$
where $\sum e^2 = \sum y^2 - a \sum y - b \sum xy$ and $\sum (y - \bar{y})^2 = \sum y^2 - \dfrac{(\sum y)^2}{n}$
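The decomposition of the total variation and the shortcut formula for $\sum e^2$ can be verified numerically on the shelf-space example. This is a sketch using the fitted values a = 147.5 and b = 7.4; the names `sst`, `ssr` and `sse` are labels chosen here for the total, explained and unexplained variation.

```r
# Shelf space (feet) and weekly sales ($) from the example
x <- c(5, 5, 10, 10, 15, 15, 20, 20)
y <- c(160, 220, 190, 240, 230, 280, 290, 310)
a <- 147.5
b <- 7.4
y_hat <- a + b * x

sst <- sum((y - mean(y))^2)       # total variation
ssr <- sum((y_hat - mean(y))^2)   # explained variation
sse <- sum((y - y_hat)^2)         # unexplained variation

# Shortcut formula for the error sum of squares
sse_shortcut <- sum(y^2) - a * sum(y) - b * sum(x * y)

print(c(sst = sst, ssr = ssr, sse = sse, sse_shortcut = sse_shortcut))
print(ssr / sst)                  # R-squared as explained / total
```

The identity Total = Explained + Unexplained holds for a least squares fit with an intercept; it need not hold for an arbitrary line.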
Coefficient of Determination ($R^2$)
Here $0 \le R^2 \le 1$, and $R^2$ is usually expressed as a percentage. For example, $R^2 = 0.85$ or 85% means that 85% of the total variation in the dependent variable is explained by the independent variable.
Application
• Example: The following data are the rates of oxygen consumption of birds, measured at different environmental temperatures:
Temperature (°C):               -18   -15   -10    -5     0     5    10    19
Oxygen Consumption (ml/g/hr):   5.2   4.7   4.5   3.6   3.4   3.1   2.7   1.8
(1) Construct a scatter plot and interpret it.
(2) Fit a regression model of oxygen consumption on temperature and show that the sum of errors is zero.
(3) Compute $R^2$ and interpret it.
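One possible way to start on parts (2) and (3) in R is sketched below, following the same steps as the shelf-space example. The variable names `temp` and `oxygen` are chosen here, and no numerical results are quoted because the slides do not give a worked solution for this exercise.

```r
# Temperature (deg C) and oxygen consumption (ml/g/hr) from the exercise
temp   <- c(-18, -15, -10, -5, 0, 5, 10, 19)
oxygen <- c(5.2, 4.7, 4.5, 3.6, 3.4, 3.1, 2.7, 1.8)

# (2) Fit the regression of oxygen consumption on temperature
fit <- lm(oxygen ~ temp)
print(coef(fit))              # intercept a and regression coefficient b

# Sum of errors; it should be zero up to floating-point rounding
sum_e <- sum(residuals(fit))
print(sum_e)

# (3) Coefficient of determination
r2 <- summary(fit)$r.squared
print(r2)
```

The sign of the regression coefficient shows whether oxygen consumption rises or falls as temperature increases, which is the starting point for the interpretation.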
Application
• Example: Given the following data on yield of rice and amount of water:
Amount of Water:     13     19     25     30     33     42     56
Yield of Rice:     2.30   2.90   3.05   3.20   3.45   3.85   4.25
(1) Construct a scatter plot and interpret it.
(2) Fit a regression model of yield of rice on amount of water and show that the sum of errors is zero.
(3) Compute $R^2$ and interpret it.
Application
• Example: One task assigned to foresters is to estimate the potential lumber harvest of a forest. The variables are HT, the height in feet, and VOL, the volume of lumber (a measure of the yield) in cubic feet.
• HT: 89.00, 90.07, 95.08, 98.03, 99.00, 91.05, 105.60, 100.80, 94.00, 93.09
• VOL: 25.93, 45.87, 56.20, 58.60, 63.36, 46.35, 68.99, 62.91, 58.13, 59.79
• Estimate the relationship between VOL and HT for
