CHAPTER-14
INTRODUCTION TO CORRELATION &
REGRESSION ANALYSIS
By
DR. PRASANT SARANGI
Research Methodology-Chapter 14

  1. CHAPTER-14: INTRODUCTION TO CORRELATION & REGRESSION ANALYSIS, by DR. PRASANT SARANGI
  2. Key concepts:
     • Introduction to Correlation Analysis
     • Rank Correlation
     • Linear Regression Analysis
     • Multiple Regression Analysis
  3. CORRELATION ANALYSIS
     • Positive Correlation
     • Negative Correlation
     • Linear Correlation
     • Non-linear Correlation
  4. Positive Correlation
     • Two variables are said to be positively correlated when a movement in one variable is accompanied by a movement in the other variable in the same direction.
     • There is a direct relationship between the two variables.
  5. Negative Correlation
     • Correlation between two variables is said to be negative when a movement in one variable is accompanied by a movement in the other variable in the opposite direction.
     • Here there is an inverse relationship between the two variables.
  6. Linear Correlation
     • The correlation between two variables is said to be linear when the points, plotted on a graph, lie along a straight line.
     Non-linear Correlation
     • A relationship between two variables is said to be non-linear if a unit change in one variable does not produce a constant change in the other: when X changes, the corresponding values of Y do not change in the same proportion.
  7. Methods of Measuring Correlation
     • The Graphical Method
       Correlation can be shown graphically with a scatter diagram, which reveals two useful pieces of information. First, the pattern of points indicates whether there is any association between the two variables. Second, if an association is found, the diagram shows the nature of the relationship (whether the two variables are linearly or non-linearly related).
  8. • Karl Pearson’s Coefficient of Correlation
       Karl Pearson’s coefficient of correlation (developed in 1896) measures the linear relationship between the two variables under study. Because the relationship is assumed to be linear, the two variables change in a fixed proportion. The measure expresses the degree of relationship as a real number, independent of the units in which the variables are expressed, and also indicates the direction of the correlation.
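  9. • Direct method:
       $$r_{XY} = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2 \, \sum y_i^2}}$$
       where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviations from the means.
     • Assumed Mean Method:
       $$r_{XY} = \frac{n \sum d_X d_Y - \left(\sum d_X\right)\left(\sum d_Y\right)}{\sqrt{n \sum d_X^2 - \left(\sum d_X\right)^2} \, \sqrt{n \sum d_Y^2 - \left(\sum d_Y\right)^2}}$$
       where $d_X$ and $d_Y$ are deviations of X and Y from their assumed means.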
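As a rough illustration of this slide, the sketch below computes Pearson's r two ways: from deviations about the actual means (direct method) and from deviations about arbitrary assumed means. The data values and the function names `pearson_direct` and `pearson_assumed_mean` are hypothetical, chosen only for this example.

```python
import math

def pearson_direct(xs, ys):
    """Direct method: r = sum(x*y) / sqrt(sum(x^2) * sum(y^2)),
    where x and y are deviations from the arithmetic means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)

def pearson_assumed_mean(xs, ys, a_x, a_y):
    """Assumed mean method: deviations are taken from the assumed
    means a_x and a_y instead of the true means."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    num = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
    den_x = math.sqrt(n * sum(a * a for a in dx) - sum(dx) ** 2)
    den_y = math.sqrt(n * sum(b * b for b in dy) - sum(dy) ** 2)
    return num / (den_x * den_y)

# Hypothetical illustrative data (not from the slides).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

r_direct = pearson_direct(X, Y)
r_assumed = pearson_assumed_mean(X, Y, a_x=2, a_y=5)
print(round(r_direct, 4), round(r_assumed, 4))
```

Both calls should agree up to rounding, whatever assumed means are chosen, which is what makes the assumed mean method a convenient shortcut for hand calculation.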
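  10. • Grouped Data:
        $$r_{XY} = \frac{n \sum f d_X d_Y - \left(\sum f d_X\right)\left(\sum f d_Y\right)}{\sqrt{n \sum f d_X^2 - \left(\sum f d_X\right)^2} \, \sqrt{n \sum f d_Y^2 - \left(\sum f d_Y\right)^2}}$$
        where $f$ is the frequency of each class and $n = \sum f$ is the total frequency.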
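A minimal sketch of the grouped-data formula, assuming a two-way frequency table in which `freq[i][j]` is the number of observations falling in X-class `i` and Y-class `j`. The table, the class deviations `d_x` and `d_y`, and the function name `pearson_grouped` are all hypothetical.

```python
import math

def pearson_grouped(freq, d_x, d_y):
    """Correlation from a bivariate frequency table.
    freq[i][j]: frequency of the cell with X-deviation d_x[i]
    and Y-deviation d_y[j]; n is the total frequency."""
    n = sum(sum(row) for row in freq)
    s_fdx = s_fdy = s_fdx2 = s_fdy2 = s_fdxdy = 0
    for i, row in enumerate(freq):
        for j, f in enumerate(row):
            s_fdx += f * d_x[i]
            s_fdy += f * d_y[j]
            s_fdx2 += f * d_x[i] ** 2
            s_fdy2 += f * d_y[j] ** 2
            s_fdxdy += f * d_x[i] * d_y[j]
    num = n * s_fdxdy - s_fdx * s_fdy
    den = math.sqrt(n * s_fdx2 - s_fdx ** 2) * math.sqrt(n * s_fdy2 - s_fdy ** 2)
    return num / den

# Hypothetical 3x3 frequency table with step deviations -1, 0, +1.
d_x = [-1, 0, 1]
d_y = [-1, 0, 1]
freq = [
    [3, 1, 0],
    [1, 4, 1],
    [0, 2, 3],
]

r_grouped = pearson_grouped(freq, d_x, d_y)
print(round(r_grouped, 4))
```

Here the frequencies cluster along the main diagonal of the table, so the coefficient comes out positive.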
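  11. Assumptions of Coefficient of Correlation
      1. The value of the coefficient of correlation lies between -1 (minus one) and +1 (plus one).
      2. The value of the coefficient of correlation is independent of a change of origin and a change of scale of measurement:
         $$r_{XY} = \frac{n \sum h_i k_i - \left(\sum h_i\right)\left(\sum k_i\right)}{\sqrt{n \sum h_i^2 - \left(\sum h_i\right)^2} \, \sqrt{n \sum k_i^2 - \left(\sum k_i\right)^2}}$$
         where $h_i$ and $k_i$ denote the values of X and Y after the change of origin and scale.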
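The second property can be checked numerically. The sketch below, with hypothetical data and hypothetical choices of origin and scale, applies the same formula to the raw values and to shifted and rescaled values. It assumes both scale factors are positive; a negative factor on one variable would flip the sign of r.

```python
import math

def pearson_r(h, k):
    """r = (n*sum(hk) - sum(h)*sum(k)) /
           (sqrt(n*sum(h^2) - sum(h)^2) * sqrt(n*sum(k^2) - sum(k)^2))"""
    n = len(h)
    s_h, s_k = sum(h), sum(k)
    s_hk = sum(a * b for a, b in zip(h, k))
    s_hh = sum(a * a for a in h)
    s_kk = sum(b * b for b in k)
    return (n * s_hk - s_h * s_k) / (
        math.sqrt(n * s_hh - s_h ** 2) * math.sqrt(n * s_kk - s_k ** 2)
    )

# Hypothetical illustrative data.
X = [10, 20, 30, 40, 50]
Y = [12, 25, 33, 48, 51]

# Change of origin (subtract a constant) and scale (divide by a positive constant).
h = [(x - 30) / 10 for x in X]
k = [(y - 33) / 3 for y in Y]

r_original = pearson_r(X, Y)
r_transformed = pearson_r(h, k)
print(round(r_original, 6), round(r_transformed, 6))
```

The two printed values should match up to floating-point rounding, and both fall inside the interval from -1 to +1.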
  12. Rank Correlation Coefficient
      There are three different situations in which Spearman’s rank correlation coefficient is applied:
      • When ranks of both the variables are given
      • When ranks of both the variables are not given
      • When ranks between two or more observations in a series are equal
  13. • When Ranks of Both the Variables are Given
        $$R_{XY} = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \quad \text{or} \quad R_{XY} = 1 - \frac{6 \sum d^2}{n^3 - n}$$
        where $d$ is the difference between the two ranks of an observation and $n$ is the number of observations.
      • When Ranks of Both the Variables are not Given
        • In such cases, each observation in the series must be ranked first.
        • Whether the highest or the lowest value is ranked 1 (one) is the researcher’s decision.
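A short sketch of both situations on this slide: ranking raw values first, then applying the rank formula. The scores and function names are hypothetical, and the ranking helper assumes there are no tied values.

```python
def assign_ranks(values, highest_first=True):
    """Rank each value; rank 1 goes to the highest value by default,
    or to the lowest if highest_first is False. Assumes no ties."""
    order = sorted(values, reverse=highest_first)
    return [order.index(v) + 1 for v in values]

def spearman_from_ranks(rank_x, rank_y):
    """R = 1 - 6*sum(d^2) / (n*(n^2 - 1)), d = difference in ranks."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical scores of five candidates on two tests.
scores_x = [85, 60, 73, 40, 90]
scores_y = [93, 75, 65, 50, 80]

R_high = spearman_from_ranks(assign_ranks(scores_x), assign_ranks(scores_y))
R_low = spearman_from_ranks(
    assign_ranks(scores_x, highest_first=False),
    assign_ranks(scores_y, highest_first=False),
)
print(R_high, R_low)
```

Ranking from the top or from the bottom gives the same coefficient, provided the same convention is used for both variables.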
  14. • When Ranks between Two or More Observations in a Series are Equal
        • Each tied observation is assigned the average of the ranks the tied observations would have received had they differed from each other.
        $$R_{XY} = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}\left(m_1^3 - m_1\right) + \frac{1}{12}\left(m_2^3 - m_2\right) + \frac{1}{12}\left(m_3^3 - m_3\right) + \dots\right]}{n(n^2 - 1)}$$
        where $m_1, m_2, m_3, \dots$ are the numbers of observations sharing each tied rank.
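The tie-corrected formula can be sketched as follows. The data and function names are hypothetical; rank 1 is given to the highest value, and tied values share the average of the ranks they would otherwise occupy.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = highest value; tied values get the average of the
    ranks they would have received had they differed."""
    order = sorted(values, reverse=True)
    count = Counter(values)
    first_pos = {}
    for pos, v in enumerate(order, start=1):
        first_pos.setdefault(v, pos)  # first rank position of each value
    return [first_pos[v] + (count[v] - 1) / 2 for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over groups of m equal values
    (a group of size 1 contributes zero)."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values())

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n ** 2 - 1))

# Hypothetical data with one tie in each series.
xs = [50, 60, 60, 70, 80]
ys = [10, 20, 30, 30, 50]

R_ties = spearman_with_ties(xs, ys)
print(average_ranks(xs), average_ranks(ys), R_ties)
```

In this example each series has one pair of tied values (m = 2), so each contributes (2³ − 2)/12 = 0.5 to the bracketed sum.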
  15. Simple Linear Regression Model
  16. What do we use regression models for?
      1. To estimate a relationship among economic variables, such as y = f(x).
      2. To test hypotheses.
      3. To forecast or predict the value of one variable, y, based on the value of another variable, x.
  17. Dependent and Independent Variables
      • Dependent variable: the variable we are trying to explain.
      • Independent (or explanatory) variables: variables that we think cause movements in the dependent variable.
  18. Simple Regression Model
      Y = dependent variable
      X = independent variable
      Model is: Y = α + β X
      α is the intercept or constant
      β is the slope coefficient
  19. Linearity
      Models that are linear in the variables and in the coefficients:
        Y = α + β X
      Models that are nonlinear in the variables but linear in the coefficients:
        Y = α + β X²
  20. Models that are nonlinear in the variables and in the coefficients:
        Y = α + X^β
      Some models that are nonlinear can be made linear in the coefficients:
        Y = e^α X^β
      Take logs:
        ln Y = α + β ln X
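A small numerical check of the log transformation, assuming noise-free data generated from Y = e^α X^β with hypothetical parameter values. Fitting a straight line to (ln X, ln Y) is done here with the usual least-squares slope and intercept formulas, which is an assumption of this sketch and not something the slide specifies.

```python
import math

# Hypothetical "true" parameters, used only to generate example data.
alpha, beta = 0.5, 1.5
X = [1.0, 2.0, 4.0, 8.0]
Y = [math.exp(alpha) * x ** beta for x in X]

# After taking logs the model is linear: ln Y = alpha + beta * ln X.
ln_x = [math.log(x) for x in X]
ln_y = [math.log(y) for y in Y]

# Least-squares line through the transformed points (assumed fitting method).
n = len(ln_x)
mean_lx = sum(ln_x) / n
mean_ly = sum(ln_y) / n
slope = sum((a - mean_lx) * (b - mean_ly) for a, b in zip(ln_x, ln_y)) / sum(
    (a - mean_lx) ** 2 for a in ln_x
)
intercept = mean_ly - slope * mean_lx

print(round(intercept, 6), round(slope, 6))
```

Because the data contain no error term, the fitted intercept and slope recover α and β up to floating-point rounding.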
  21. [Figure: An example showing income and average expenditure. Horizontal axis: X (income). Vertical axis: E(Y|X), average expenditure. The line E(Y|X) = α + βX has intercept α and slope β = ∆E(Y|X) / ∆X.]
  22. Error Term
      Y is a random variable composed of two parts:
      I. Systematic component: E(Y|X) = α + βX. This is the mean of Y for a given X.
      II. Random component: u = Y - E(Y|X) = Y - α - βX. u is called the stochastic or random error.
      Together, E(Y|X) and u form the model: Y = α + βX + u
  23. Sources of the error term
      • Dependent variable measured with error
      • Relevant variables left out of the model
      • Wrong functional form
      • Inherent randomness of behaviour
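  24. True Relationship
      [Figure: Observed points Y1, Y2, Y3, Y4 plotted against X1, X2, X3, X4 around the line E(Y) = α + β X. The errors u1, u2, u3, u4 are the vertical distances between each observed point and the line.]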
  25. The Estimated Model
      We use the data on Y and X to come up with guesses for α and β. These estimated parameters or coefficients are written $\hat{\alpha}$ and $\hat{\beta}$.
  26. Our estimated, or “fitted”, model gives the predicted value of Y for any given X:
        $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$
      The residual is the difference between the actual or observed value of Y and the predicted value:
        $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\alpha} - \hat{\beta} X_i$
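The slides do not say how the guesses for α and β are obtained. One common choice, assumed here, is ordinary least squares. The sketch below uses hypothetical income and expenditure figures to compute the estimates, the fitted values, and the residuals; the function name `fit_ols` is also hypothetical.

```python
def fit_ols(xs, ys):
    """Least-squares estimates for Y = alpha + beta * X (assumed method):
    beta_hat = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    alpha_hat = mean_y - beta_hat * mean_x"""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    beta_hat = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    alpha_hat = mean_y - beta_hat * mean_x
    return alpha_hat, beta_hat

# Hypothetical data: income (X) and expenditure (Y).
income = [10, 20, 30, 40, 50]
expenditure = [9, 16, 22, 30, 33]

alpha_hat, beta_hat = fit_ols(income, expenditure)

# Fitted values Y_hat_i and residuals u_hat_i = Y_i - Y_hat_i.
fitted = [alpha_hat + beta_hat * x for x in income]
residuals = [y - y_hat for y, y_hat in zip(expenditure, fitted)]

print(alpha_hat, beta_hat)
print([round(u, 2) for u in residuals])
```

With an intercept in the model, least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.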
