Correlation and
Regression
BY UNSA SHAKIR
Correlation and Regression
Correlation describes the strength of a
linear relationship between two variables
Regression tells us how to draw the straight
line described by the correlation
Correlation and Regression
• For example:
A sociologist may be interested in the relationship between education and self-esteem, or between family income and the number of children in a family.
Independent Variables: Education, Family Income
Dependent Variables: Self-Esteem, Number of Children
Correlation and Regression
• For example:
• May expect: As education increases, self-esteem
increases (positive relationship).
• May expect: As family income increases, the number
of children in families declines (negative relationship).
Independent Variable → Dependent Variable
Education → Self-Esteem (+)
Family Income → Number of Children (−)
Correlation
• Correlation is a statistical technique used to
determine the degree to which two variables
are related
• A correlation is a relationship between two
variables. The data can be represented by the
ordered pairs (x, y) where x is the independent
(or explanatory) variable, and y is the
dependent (or response) variable.
Correlation
A scatter plot can be used to determine whether a linear (straight-line) correlation exists between two variables.
Example:
x    1   2   3   4   5
y   −4  −2  −1   0   2
[Figure: scatter plot of the five (x, y) pairs]
Linear Correlation
[Figure: four scatter plots, one per case]
• Negative Linear Correlation: as x increases, y tends to decrease.
• Positive Linear Correlation: as x increases, y tends to increase.
• No Correlation
• Nonlinear Correlation
Correlation Coefficient
• It is also called Pearson's correlation or the product-moment correlation coefficient.
• The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$
The sign of r denotes the nature of the association, while the value of r denotes the strength of the association.
If the sign is positive, the relation is direct (an increase in one variable is associated with an increase in the other variable, and a decrease in one variable is associated with a decrease in the other variable).
If the sign is negative, the relationship is inverse or indirect (an increase in one variable is associated with a decrease in the other).
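The value of r ranges between −1 and +1.
The value of r denotes the strength of the association, as illustrated by the following diagram.
[Diagram: scale of r from −1 to +1]
r = −1: perfect correlation (indirect)
−1 to −0.75: strong (indirect)
−0.75 to −0.25: intermediate (indirect)
−0.25 to 0: weak (indirect)
r = 0: no relation
0 to 0.25: weak (direct)
0.25 to 0.75: intermediate (direct)
0.75 to 1: strong (direct)
r = +1: perfect correlation (direct)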
If r = 0, there is no association or correlation between the two variables.
If 0 < |r| < 0.25: weak correlation.
If 0.25 ≤ |r| < 0.75: intermediate correlation.
If 0.75 ≤ |r| < 1: strong correlation.
If |r| = 1: perfect correlation.
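The cut-offs above can be sketched as a small lookup. This is only an illustration: the function name `describe_correlation` is hypothetical, and applying the same cut-offs to negative values through the absolute value of r is an assumption based on the symmetric scale in the diagram.

```python
def describe_correlation(r):
    """Label r using the cut-offs listed on the slide (applied to |r|)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"
    elif size < 1:
        strength = "strong"
    else:
        strength = "perfect"
    # The sign gives the nature of the association, the size its strength.
    direction = "direct" if r > 0 else "indirect"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.886))  # strong direct correlation
print(describe_correlation(-0.94))  # strong indirect correlation
```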
Linear Correlation
[Figure: four scatter plots with their correlation coefficients]
• Strong negative correlation: r = −0.91
• Strong positive correlation: r = 0.88
• Weak positive correlation: r = 0.42
• Nonlinear correlation: r = 0.07
Calculating a Correlation Coefficient
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$
In Words (In Symbols)
1. Find the sum of the x-values. (Σx)
2. Find the sum of the y-values. (Σy)
3. Multiply each x-value by its corresponding y-value and find the sum. (Σxy)
4. Square each x-value and find the sum. (Σx²)
5. Square each y-value and find the sum. (Σy²)
6. Use these five sums to calculate the correlation coefficient.
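The six steps can be sketched in Python roughly as follows. The function name `correlation_steps` is hypothetical, and the sample values are the five pairs from the first worked example in this deck.

```python
from math import sqrt


def correlation_steps(xs, ys):
    """Follow the six steps: five sums, then the formula for r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    sum_x = sum(xs)                                # step 1
    sum_y = sum(ys)                                # step 2
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # step 3
    sum_x2 = sum(x * x for x in xs)                # step 4
    sum_y2 = sum(y * y for y in ys)                # step 5
    # Step 6: combine the five sums.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


xs = [1, 2, 3, 4, 5]
ys = [-3, -1, 0, 1, 2]
r = correlation_steps(xs, ys)
print(round(r, 3))  # 0.986
```

If every x-value (or every y-value) is identical, the denominator is zero and r is undefined; this sketch does not handle that case.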
Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.
x    y    xy   x²   y²
1   −3   −3    1    9
2   −1   −2    4    1
3    0    0    9    0
4    1    4   16    1
5    2   10   25    4
Σx = 15, Σy = −1, Σxy = 9, Σx² = 55, Σy² = 15
Correlation Coefficient
Example continued:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{5(9) - (15)(-1)}{\sqrt{5(55) - 15^2}\,\sqrt{5(15) - (-1)^2}} = \frac{60}{\sqrt{50}\,\sqrt{74}} \approx 0.986$$
There is a strong positive linear correlation between x and y.
Correlation Coefficient
Example:
The following data represent the number of hours 12 different students watched television during the weekend and the score each student earned on a test the following Monday.
Hours, x         0   1   2   3   3   5   5   5   6   7   7  10
Test score, y   96  85  82  74  95  68  76  84  58  65  75  50
a) Display the scatter plot.
b) Calculate the correlation coefficient r.
Correlation Coefficient
[Figure: scatter plot of test score (y) against hours watching TV (x) for the 12 students]
Correlation Coefficient
Example continued:
Hours, x           0     1     2     3     3     5     5     5     6     7     7    10
Test score, y     96    85    82    74    95    68    76    84    58    65    75    50
xy                 0    85   164   222   285   340   380   420   348   455   525   500
x²                 0     1     4     9     9    25    25    25    36    49    49   100
y²              9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500
Σx = 54, Σy = 908, Σxy = 3724, Σx² = 332, Σy² = 70836
Correlation Coefficient
Example continued:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{12(3724) - (54)(908)}{\sqrt{12(332) - 54^2}\,\sqrt{12(70836) - 908^2}} \approx -0.831$$
• There is a strong negative linear correlation.
• As the number of hours spent watching TV increases, the test scores tend to decrease.
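As a cross-check of the totals and the result for the TV example, the sketch below recomputes the five sums from the raw data and then applies the formula to those sums. The helper name `r_from_sums` is hypothetical.

```python
from math import sqrt


def r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Correlation coefficient computed from the five column totals."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

totals = (
    sum(hours),
    sum(scores),
    sum(h * s for h, s in zip(hours, scores)),
    sum(h * h for h in hours),
    sum(s * s for s in scores),
)
r_tv = r_from_sums(len(hours), *totals)

print(totals)           # (54, 908, 3724, 332, 70836)
print(round(r_tv, 3))   # -0.831
```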
Example:
A sample of 6 children was selected, and their ages in years and weights in kilograms were recorded as shown in the following table. Find the correlation between age and weight.
Serial no.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
Serial no.   Age in years (x)   Weight in kg (y)   xy    x²    y²
1            7                  12                 84    49    144
2            6                  8                  48    36    64
3            8                  12                 96    64    144
4            5                  10                 50    25    100
5            6                  11                 66    36    121
6            9                  13                 117   81    169
Total: Σx = 41, Σy = 66, Σxy = 461, Σx² = 291, Σy² = 742
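The correlation for this table is worked out with a form of the formula in which each sum is divided by n. As a sketch of why that form agrees with the computational formula given earlier, factor n out of the numerator and n² out of the product under the square root:

```latex
\begin{aligned}
r &= \frac{n\sum xy - \sum x \sum y}
          {\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}} \\
  &= \frac{n\left(\sum xy - \frac{\sum x \sum y}{n}\right)}
          {\sqrt{n^2\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} \\
  &= \frac{\sum xy - \frac{\sum x \sum y}{n}}
          {\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}
\end{aligned}
```

With the totals above, the numerator is 461 − 451 = 10 and the two bracketed terms are about 10.83 and 16, which gives r ≈ 0.76.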
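$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}}$$
r ≈ 0.76
Strong direct correlation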
EXAMPLE: Relationship between Anxiety and Test Scores
Anxiety (X)   Test score (Y)   X²    Y²   XY
10            2                100   4    20
8             3                64    9    24
2             9                4     81   18
1             7                1     49   7
5             6                25    36   30
6             5                36    25   30
ΣX = 32, ΣY = 32, ΣX² = 230, ΣY² = 204, ΣXY = 129
Calculating Correlation Coefficient
$$r = \frac{(6)(129) - (32)(32)}{\sqrt{\left(6(230) - 32^2\right)\left(6(204) - 32^2\right)}} = \frac{774 - 1024}{\sqrt{(356)(200)}} \approx -0.94$$
r = −0.94
Strong indirect correlation
Example
Tree Height, y   Trunk Diameter, x   xy     y²      x²
35               8                   280    1225    64
49               9                   441    2401    81
27               7                   189    729     49
33               6                   198    1089    36
60               13                  780    3600    169
21               7                   147    441     49
45               11                  495    2025    121
51               12                  612    2601    144
Σy = 321, Σx = 73, Σxy = 3142, Σy² = 14111, Σx² = 713
Example continued:
[Figure: scatter plot of tree height (y) against trunk diameter (x)]
• r = 0.886 → relatively strong positive linear association between x and y
Regression
Regression Analyses
• Regression technique is concerned with
predicting some variables by knowing others
• The process of predicting variable Y using
variable X
Types of Regression Models
[Figure: four scatter plots, one per case]
• Positive Linear Relationship
• Negative Linear Relationship
• Relationship NOT Linear
• No Relationship
Regression
Uses a variable (x) to predict some outcome
variable (y)
Tells you how values in y change as a
function of changes in values of x
Regression minimizes residuals
The regression line makes the sum of the squares of the residuals smaller than for any other line.
[Figure: scatter plot of SBP (mmHg) against Wt (kg)]
By using the least squares method (a procedure that minimizes the sum of the squared vertical deviations of the plotted points from a straight line), we are able to construct a best-fitting straight line through the scatter diagram points and then formulate a regression equation in the form of:
$$\hat{y} = a + bX$$
$$\hat{y} = \bar{y} + b(x - \bar{x})$$
$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$
The regression equation describes the regression line mathematically by showing its intercept and slope.
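A minimal sketch of the least squares calculation, under stated assumptions: the function names `fit_line` and `sum_squared_residuals` are hypothetical, and the five data points are made up purely for illustration. The last lines check numerically that moving the intercept or the slope away from the fitted values makes the sum of squared residuals larger.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)  # slope
    a = sum_y / n - b * sum_x / n                                 # intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
worse_a = sum_squared_residuals(xs, ys, a + 0.5, b)   # shifted intercept
worse_b = sum_squared_residuals(xs, ys, a, b + 0.1)   # tilted slope

print(round(a, 2), round(b, 2))             # 2.2 0.6
print(best < worse_a and best < worse_b)    # True
```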
Correlation and Regression
• The statistical equation for a line:
Ŷ = a + bX
Where:
Ŷ = the line's position on the vertical axis at any point (estimated value of the dependent variable)
X = the line's position on the horizontal axis at any point (value of the independent variable for which you want an estimate of Y)
b = the slope of the line (called the coefficient)
a = the intercept with the Y axis, where X equals zero
Linear Equations
[Figure: the line Y = bX + a, where a is the Y-intercept and b is the slope, b = (change in Y) / (change in X)]
Exercise
A sample of 6 persons was selected; their ages (x variable) and weights (y variable) are shown in the following table. Find the regression equation and the predicted weight when age is 8.5 years.
Serial no.   Age (x)   Weight (y)
1            7         12
2            6         8
3            8         12
4            5         10
5            6         11
6            9         13
Answer
Serial no.   Age (x)   Weight (y)   xy    x²    y²
1            7         12           84    49    144
2            6         8            48    36    64
3            8         12           96    64    144
4            5         10           50    25    100
5            6         11           66    36    121
6            9         13           117   81    169
Total        41        66           461   291   742
$$\bar{x} = \frac{41}{6} = 6.83, \qquad \bar{y} = \frac{66}{6} = 11$$
$$b = \frac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = \frac{10}{10.83} \approx 0.92$$
Regression equation
$$\hat{y}(x) = 11 + 0.92(x - 6.83)$$
$$\hat{y}(x) = 4.72 + 0.92x$$
$$\hat{y}(8.5) = 4.72 + 0.92 \times 8.5 \approx 12.54 \text{ kg}$$
$$\hat{y}(7.5) = 4.72 + 0.92 \times 7.5 \approx 11.62 \text{ kg}$$
We create the regression line by plotting two estimated values of y against their x values, then extending the line to the right and left.
Regression Line
Example:
a) Find the equation of the regression line.
b) Use the equation to find the expected value of y when x is 2.3.
x    y    xy   x²   y²
1   −3   −3    1    9
2   −1   −2    4    1
3    0    0    9    0
4    1    4   16    1
5    2   10   25    4
Σx = 15, Σy = −1, Σxy = 9, Σx² = 55, Σy² = 15
Regression Line
[Figure: scatter plot of the five data points with the regression line]
$$m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{5(9) - (15)(-1)}{5(55) - (15)^2} = \frac{60}{50} = 1.2$$
Regression Line
Example:
The following data represents the number of hours 12
different students watched television during the
weekend and the scores of each student who took a
test the following Monday.
a) Find the equation of the regression line.
b) Use the equation to find the expected test score
for a student who watches 9 hours of TV.
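One way to work through parts (a) and (b) is sketched below. The function name `regression_line` is hypothetical; the slope uses the same sums-based formula as the previous example, and the intercept is taken as ȳ − m·x̄.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

slope, intercept = regression_line(hours, scores)
predicted = intercept + slope * 9   # expected score after 9 hours of TV

print(f"y_hat = {slope:.2f}x + {intercept:.2f}")  # y_hat = -4.07x + 93.97
print(round(predicted, 1))                        # 57.4
```

The negative slope matches the negative correlation found for the same data: each extra hour of TV is associated with a drop of roughly 4 points in the expected test score.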
Regression Line
Hours, x           0     1     2     3     3     5     5     5     6     7     7    10
Test score, y     96    85    82    74    95    68    76    84    58    65    75    50
xy                 0    85   164   222   285   340   380   420   348   455   525   500
x²                 0     1     4     9     9    25    25    25    36    49    49   100
y²              9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500
Σx = 54, Σy = 908, Σxy = 3724, Σx² = 332, Σy² = 70836
Exercise
• Find the correlation between age and blood pressure using the simple (Pearson) and Spearman's correlation coefficients, and comment.
• Find the regression equation.
• What is the predicted blood pressure for a man aged 25 years?
Serial   x     y      xy       x²
1        20    120    2400     400
2        43    128    5504     1849
3        63    141    8883     3969
4        26    126    3276     676
5        53    134    7102     2809
6        31    128    3968     961
7        58    136    7888     3364
8        46    132    6072     2116
9        58    140    8120     3364
10       70    144    10080    4900
11       46    128    5888     2116
12       53    136    7208     2809
13       60    146    8760     3600
14       20    124    2480     400
15       63    143    9009     3969
16       43    130    5590     1849
17       26    124    3224     676
18       19    121    2299     361
19       31    126    3906     961
20       23    123    2829     529
Total    852   2630   114486   41678
$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{114486 - \frac{852 \times 2630}{20}}{41678 - \frac{852^2}{20}} \approx 0.4548$$
ŷ = 112.13 + 0.4548x
For age 25:
B.P. = 112.13 + 0.4548 × 25 ≈ 123.5 mmHg
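The exercise can be checked with a short script. This is a sketch only: it rebuilds the column totals from the 20 (age, blood pressure) pairs in the table, then computes the slope, the intercept, the prediction at age 25 and the simple (Pearson) correlation. Σy² is not listed in the table, so it is computed here from the raw data; Spearman's coefficient is not covered. Values rounded at a different stage may differ in the last digit.

```python
from math import sqrt

ages = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70,
        46, 53, 60, 20, 63, 43, 26, 19, 31, 23]
pressures = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144,
             128, 136, 146, 124, 143, 130, 124, 121, 126, 123]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(pressures)
sum_xy = sum(x * y for x, y in zip(ages, pressures))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in pressures)   # not in the table, computed here

# Slope and intercept of the regression line y_hat = a + b * x.
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = sum_y / n - b * sum_x / n
bp_at_25 = a + b * 25

# Simple (Pearson) correlation from the same sums.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_xy, sum_x2)                   # 852 2630 114486 41678
print(round(b, 4), round(a, 2), round(bp_at_25, 1))   # 0.4548 112.13 123.5
print(round(r, 2))                                    # 0.95
```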
More Related Content

Similar to Regression and correlation in statistics

stats_ch12.pdf
stats_ch12.pdfstats_ch12.pdf
stats_ch12.pdfshermanullah
 
Regression
RegressionRegression
RegressionLavanyaK75
 
correlation_regression.ppt
correlation_regression.pptcorrelation_regression.ppt
correlation_regression.pptRicha Joshi
 
Lesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing dataLesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing datamjlobetos
 
Linear regression.ppt
Linear regression.pptLinear regression.ppt
Linear regression.ppthabtamu biazin
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and RegressionShubham Mehta
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression Dr. Tushar J Bhatt
 
Math4 presentation.ppsx
Math4 presentation.ppsxMath4 presentation.ppsx
Math4 presentation.ppsxRaviPal876687
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relationnuwan udugampala
 
Regression Analysis by Muthama JM
Regression Analysis by Muthama JM Regression Analysis by Muthama JM
Regression Analysis by Muthama JM Japheth Muthama
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JMJapheth Muthama
 
Correlation
CorrelationCorrelation
Correlationshaminggg
 
Unit 5 Correlation
Unit 5 CorrelationUnit 5 Correlation
Unit 5 CorrelationRai University
 
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...RekhaChoudhary24
 
Chap5 correlation
Chap5 correlationChap5 correlation
Chap5 correlationSemurt Ensem
 
Correlation and Regression ppt
Correlation and Regression pptCorrelation and Regression ppt
Correlation and Regression pptSantosh Bhaskar
 
correlationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfcorrelationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfDrAmanSaxena
 

Similar to Regression and correlation in statistics (20)

stats_ch12.pdf
stats_ch12.pdfstats_ch12.pdf
stats_ch12.pdf
 
Regression
RegressionRegression
Regression
 
correlation_regression.ppt
correlation_regression.pptcorrelation_regression.ppt
correlation_regression.ppt
 
Lesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing dataLesson 27 using statistical techniques in analyzing data
Lesson 27 using statistical techniques in analyzing data
 
Linear regression.ppt
Linear regression.pptLinear regression.ppt
Linear regression.ppt
 
Regression.pptx
Regression.pptxRegression.pptx
Regression.pptx
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
 
Math4 presentation.ppsx
Math4 presentation.ppsxMath4 presentation.ppsx
Math4 presentation.ppsx
 
Correlations
CorrelationsCorrelations
Correlations
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relation
 
Regression Analysis by Muthama JM
Regression Analysis by Muthama JM Regression Analysis by Muthama JM
Regression Analysis by Muthama JM
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JM
 
Correlation
CorrelationCorrelation
Correlation
 
Unit 5 Correlation
Unit 5 CorrelationUnit 5 Correlation
Unit 5 Correlation
 
Co re
Co reCo re
Co re
 
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
 
Chap5 correlation
Chap5 correlationChap5 correlation
Chap5 correlation
 
Correlation and Regression ppt
Correlation and Regression pptCorrelation and Regression ppt
Correlation and Regression ppt
 
correlationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfcorrelationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdf
 

Recently uploaded

the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEslot gacor bisa pakai pulsa
 

Recently uploaded (20)

the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 

Regression and correlation in statistics

  • 2. Correlation and Regression 2 Correlation describes the strength of a linear relationship between two variables Regression tells us how to draw the straight line described by the correlation
  • 3. Correlation and Regression • For example: A sociologist may be interested in the relationship between education and self-esteem or Income and Number of Children in a family. Independent Variables Education Family Income Dependent Variables Self-Esteem Number of Children 3
  • 4. Correlation and Regression • For example: • May expect: As education increases, self-esteem increases (positive relationship). • May expect: As family income increases, the number of children in families declines (negative relationship). Family Income Dependent Variables Self-Esteem Number of Children Independent Variables Education + 4 -
  • 6. Correlation 6 • Correlation is a statistical technique used to determine the degree to which two variables are related • A correlation is a relationship between two variables. The data can be represented by the ordered pairs (x, y) where x is the independent (or explanatory) variable, and y is the dependent (or response) variable.
  • 7. Correlation x 1 2 3 4 5 y – 4 – 2 – 1 0 2 A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables. x 2 4 –2 – 4 y 7 2 6 Example:
  • 8. Linear Correlation y x Negative Linear Correlation y x No Correlation y x Positive Linear Correlation y x Nonlinear Correlation As x increases, y tends to decrease. 8 As x increases, y tends to increase.
  • 9. Correlation Coefficient • It is also called Pearson's correlation or product moment correlation coefficient • The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is . 9 r  nxy xy nx2 x2 ny2 y2
  • 10. of r denotes the nature of The sign association while the value of r denotes the strength of association. 10
  • 11. If the sign is +ve this means the relation is direct (an increase in one variable is 11 other variable and a decrease in variable is associated with associated with an increase in the one a decrease in the other variable). While if the sign is -ve this means an inverse or indirect relationship (which means an increase in one variable is associated with a decrease in the other).
  • 12. -1 -0.75 -0.25 0 0.25 0.75 1 The value of r ranges between ( -1) and ( +1) The value of r denotes the strength of the association as illustrated by the following diagram. strong intermediate weak weak intermediate strong no relation perfect correlation perfect correlation Direct 12 indirect
  • 13. If r = Zero this means no association or correlation between the two variables. If 0 < r < 0.25 = weak correlation. If 0.25 ≤ r < 0.75 = intermediate correlation. If 0.75 ≤ r < 1 = strong correlation. If r = l = perfect correlation. 13
  • 14. Linear Correlation x Weak positive correlation y x Nonlinear Correlation y r = 0.91 x Strong negative correlation y r = 0.42 14 r = 0.88 x Strong positive correlation y r = 0.07
  • 15. Calculatinga CorrelationCoefficient . 15 r  nxy  xy nx2  x2 n y2  y2 Calculating a Correlation Coefficient In Words 1. Find the sum of the x-values. 2. Find the sum of the y-values. In Symbols x  y xy 3. Multiply each x-value by its corresponding y-value and find the sum.
  • 16. Calculatinga CorrelationCoefficient 16 Calculating a Correlation Coefficient x2  y 2 In Words In Symbols 4. Square each x-value and find the sum. 5. Square each y-value and find the sum. 6. Use these five sums to calculate the correlation coefficient.
  • 17. Correlation Coefficient 17 Example: Calculate the correlation coefficient r for the following data. x y xy x2 y2 1 – 3 – 3 1 9 2 – 1 – 2 4 1 3 0 0 9 0 4 1 4 16 1 5 2 10 25 4 x 15  y  1 xy  9 x2  55  y2  15
  • 18. Correlation Coefficient nxy  xy r  Example: Calculate the correlation coefficient r for the following data. nx2  x2 n y2  y2 5(9)  151  5(55) 152 5(15)  12  6 0 18 ï‚»0.986 5 0 7 4 There is a strong positive linear correlation between x and y.
  • 19. Correlation Coefficient 19 Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 Example: The following data represents the number of hours, 12 different students watched television during the weekend and the scores of each student who took a test the following Monday. a) Display the scatter plot. b) Calculate the correlation coefficient r.
  • 20. Correlation Coefficient Test score y 100 80 60 40 20 x 2 4 6 8 10 Hours watching TV Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 20
  • 21. Correlation Coefficient 21 Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 xy 0 85 16 4 222 28 5 34 0 38 0 420 348 45 5 52 5 50 0 x2 0 1 4 9 9 25 25 25 36 49 49 10 0 y2 921 6 722 5 67 24 547 6 90 25 46 24 57 76 705 6 336 4 42 25 56 25 25 00 Example continued: x  54  y  908 xy  3724 x2  332  y2  70836
  • 22. Correlation Coefficient Example continued: r  nxy xy nx2 x2 ny2 y2 12(3724)  54908 22  12(332)  542 12(70836)  9082 ï‚» 0.831 • There is a strong negative linear correlation. • As the number of hours spent watching TV increases, the test scores tend to decrease.
  • 23. Example: 23 A sample of 6 children was selected, data about their age in years and weight in kilograms was recorded as shown in the following table . It is required to find the correlation between age and weight. serial No Age (years) Weight (Kg) 1 7 12 2 6 8 3 8 12 4 5 10 5 6 11 6 9 13
  • 24. Serial n. Age (year) (x) Weight (Kg) (y) xy X2 Y2 1 7 12 84 49 144 2 6 8 48 36 64 3 8 12 96 64 144 4 5 10 50 25 100 5 6 11 66 36 121 6 9 13 117 81 169 Total ∑x= 41 ∑y= 66 ∑xy= 461 ∑x2= 291 ∑y2= 742 24
  • 25. r = 0.759 strong direct correlation  25      6 .742  6 291 461 41ï‚´ 66 r  6 (41)2   (66)2 
  • 26. EXAMPLE: Relationship betweenAnxiety and Test Scores 26 Anxiety (X) Test score (Y) X2 Y2 XY 10 2 100 4 20 8 3 64 9 24 2 9 4 81 18 1 7 1 49 7 5 6 25 36 30 6 5 36 25 30 ∑X = 32 ∑Y = 32 ∑X2 = 230 ∑Y2 = 204 ∑XY=129
  • 27. Calculating Correlation Coefficient (356)(200) 27  .94 774 1024  (6)(129)  (32)(32) 6(230)  322 6(204) 322  r  r = - 0.94 Indirect strong correlation
  • 28. Example
      Tree height, y   Trunk diameter, x     xy      y²     x²
      35                8                    280    1225     64
      49                9                    441    2401     81
      27                7                    189     729     49
      33                6                    198    1089     36
      60               13                    780    3600    169
      21                7                    147     441     49
      45               11                    495    2025    121
      51               12                    612    2601    144
      Σ = 321          Σ = 73               Σ = 3142   Σ = 14111   Σ = 713
  • 29. Example [Scatter plot: trunk diameter, x (axis from 0 to 14), against tree height, y (axis from 0 to 70).] • r = 0.886 → relatively strong positive linear association between x and y
  • 32. Regression Analysis • Regression is concerned with predicting some variables from the known values of others. • It is the process of predicting variable Y using variable X.
  • 33. Types of Regression Models [Four scatter-plot panels: Positive Linear Relationship; Negative Linear Relationship; Relationship NOT Linear; No Relationship.]
  • 34. Regression Uses a variable (x) to predict some outcome variable (y). Tells you how values of y change as a function of changes in values of x.
  • 35. Regression minimizes residuals The regression line makes the sum of the squares of the residuals smaller than for any other line. [Scatter plot with fitted regression line: Wt (kg) on the horizontal axis (60 to 120) against SBP (mmHg) on the vertical axis (80 to 220).]
  • 36. By using the least squares method (a procedure that minimizes the vertical deviations of the plotted points surrounding a straight line), we are able to construct a best-fitting straight line through the scatter diagram points and then formulate a regression equation in the form $\hat{y} = a + bX$, or equivalently $\hat{y} = \bar{y} + b(x - \bar{x})$, with slope $b = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$ and intercept $a = \bar{y} - b\bar{x}$. The regression equation describes the regression line mathematically by showing its intercept and slope.
  • 37. Correlation and Regression • The statistics equation for a line: $\hat{Y} = a + bX$, where: $\hat{Y}$ = the line's position on the vertical axis at any point (estimated value of the dependent variable); X = the line's position on the horizontal axis at any point (value of the independent variable for which you want an estimate of Y); b = the slope of the line (called the coefficient); a = the intercept with the Y axis, where X equals zero.
  • 39. Exercise A sample of 6 persons was selected; their age (x variable) and weight (y variable) are shown in the following table. Find the regression equation. What is the predicted weight when age is 8.5 years?
      Serial no.   Age (x)   Weight (y)
      1            7         12
      2            6          8
      3            8         12
      4            5         10
      5            6         11
      6            9         13
  • 40. Answer
      Serial no.   Age (x)   Weight (y)    xy    x²     y²
      1            7         12            84    49    144
      2            6          8            48    36     64
      3            8         12            96    64    144
      4            5         10            50    25    100
      5            6         11            66    36    121
      6            9         13           117    81    169
      Total        41        66           461   291    742
  • 41. $\bar{x} = \dfrac{41}{6} = 6.83$, $\bar{y} = \dfrac{66}{6} = 11$, $b = \dfrac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = \dfrac{10}{10.83} = 0.92$. Regression equation: $\hat{y}(x) = 11 + 0.92(x - 6.83)$
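  • 42. $\hat{y}(x) = 4.69 + 0.92x$ (intercept $a = \bar{y} - b\bar{x} = 11 - 0.923 \times 6.833 \approx 4.69$); $\hat{y}(8.5) = 4.69 + 0.92 \times 8.5 \approx 12.5$ kg; $\hat{y}(7.5) = 4.69 + 0.92 \times 7.5 \approx 11.6$ kg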
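The age and weight exercise can be reproduced from the raw data as a check. The sketch below is illustrative only: `fit_line` is a hypothetical helper, not something from the deck, and it uses the deviation form of the least-squares slope, which is algebraically the same as the totals formula on the slides. Because it works with unrounded values, its results can differ from hand-rounded slide figures in the last digit.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b

# Data from the exercise: age (years) and weight (kg) of 6 persons.
ages = [7, 6, 8, 5, 6, 9]
weights = [12, 8, 12, 10, 11, 13]

a, b = fit_line(ages, weights)
predicted = a + b * 8.5  # predicted weight at age 8.5 years
print(f"y-hat = {a:.2f} + {b:.2f}x, y-hat(8.5) = {predicted:.1f} kg")
```

Returning the intercept and slope separately keeps the prediction step explicit, mirroring how the slides first state the equation and then substitute the age.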
  • 43. We create a regression line by plotting two estimated values of y against their x components, then extending the line right and left.
  • 44. Regression Line Example: a) Find the equation of the regression line. b) Use the equation to find the expected value of y when x is 2.3.
      x     y    xy    x²    y²
      1    -3    -3     1     9
      2    -1    -2     4     1
      3     0     0     9     0
      4     1     4    16     1
      5     2    10    25     4
      $\sum x = 15$, $\sum y = -1$, $\sum xy = 9$, $\sum x^2 = 55$, $\sum y^2 = 15$
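  • 45. Regression Line [Scatter plot of the five data points, x from 1 to 5.] $m = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \dfrac{5(9) - (15)(-1)}{5(55) - (15)^2} = \dfrac{60}{50} = 1.2$. The intercept is $\bar{y} - m\bar{x} = -0.2 - 1.2(3) = -3.8$, so the regression line is $\hat{y} = 1.2x - 3.8$ and the expected value at x = 2.3 is $\hat{y} = 1.2(2.3) - 3.8 = -1.04$.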
  • 46. Regression Line Example: The following data represent the number of hours that 12 different students watched television during the weekend and the score of each student on a test taken the following Monday. a) Find the equation of the regression line. b) Use the equation to find the expected test score for a student who watches 9 hours of TV.
  • 47. Regression Line
      Hours, x          0     1     2     3     3     5     5     5     6     7     7    10
      Test score, y    96    85    82    74    95    68    76    84    58    65    75    50
      xy                0    85   164   222   285   340   380   420   348   455   525   500
      x²                0     1     4     9     9    25    25    25    36    49    49   100
      y²             9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500
      $\sum x = 54$, $\sum y = 908$, $\sum xy = 3724$, $\sum x^2 = 332$, $\sum y^2 = 70836$
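The transcript stops at the totals for this example, so the following is a hedged sketch of how the two questions could be answered from those totals alone. The helper `line_from_totals` and its parameter names are assumptions made for illustration; it applies the slope formula $m = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$ and takes the intercept as $\bar{y} - m\bar{x}$.

```python
def line_from_totals(n, sum_x, sum_y, sum_xy, sum_x2):
    """Slope and intercept of the least-squares line from column totals."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)
    return slope, intercept

# Totals taken from the table: n = 12 students.
slope, intercept = line_from_totals(n=12, sum_x=54, sum_y=908,
                                    sum_xy=3724, sum_x2=332)

# Expected test score for a student who watches 9 hours of TV.
expected_score = slope * 9 + intercept
print(f"y-hat = {slope:.3f}x + {intercept:.2f}; y-hat(9) = {expected_score:.1f}")
```

Working from the totals rather than the raw pairs matches the hand method on the slides, where only the column sums enter the formulas.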
  • 48. Exercise • Find the correlation between age and blood pressure using the simple (Pearson's) and Spearman's correlation coefficients, and comment. • Find the regression equation. • What is the predicted blood pressure for a man aged 25 years?
  • 49.
      Serial    x     y       xy     x²
      1        20   120     2400    400
      2        43   128     5504   1849
      3        63   141     8883   3969
      4        26   126     3276    676
      5        53   134     7102   2809
      6        31   128     3968    961
      7        58   136     7888   3364
      8        46   132     6072   2116
      9        58   140     8120   3364
      10       70   144    10080   4900
  • 50.
      Serial    x      y       xy      x²
      11       46    128     5888    2116
      12       53    136     7208    2809
      13       60    146     8760    3600
      14       20    124     2480     400
      15       63    143     9009    3969
      16       43    130     5590    1849
      17       26    124     3224     676
      18       19    121     2299     361
      19       31    126     3906     961
      20       23    123     2829     529
      Total   852   2630   114486   41678
  • 51. $b = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \dfrac{114486 - \frac{852 \times 2630}{20}}{41678 - \frac{852^2}{20}} \approx 0.4548$, so $\hat{y} = 112.13 + 0.4548x$. For age 25: B.P. = 112.13 + 0.4548 × 25 ≈ 123.5 mmHg
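The exercise also asks for Spearman's coefficient, which the slides do not work out. One common approach, assumed here rather than taken from the deck, is to replace each value by its rank (giving tied values the average of their ranks) and then apply Pearson's formula to the ranks. The function names below are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(xs, ys):
    """Sample correlation coefficient via the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

def spearman_rho(xs, ys):
    """Spearman's coefficient as Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Age (x) and blood pressure (y) for the 20 subjects in the exercise tables.
ages = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70,
        46, 53, 60, 20, 63, 43, 26, 19, 31, 23]
pressures = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144,
             128, 136, 146, 124, 143, 130, 124, 121, 126, 123]

r = pearson_r(ages, pressures)
rho = spearman_rho(ages, pressures)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Using average ranks matters for this data set because several ages and several pressures repeat; the shortcut formula based on squared rank differences is exact only when there are no ties.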