P. MURUGAN M.Com [CA]., M.Phil., SET., [P.hD]
Assistant Professor in B.Com (CA)
Vivekananda College
Tiruvedakam West
Madurai.
Correlation Definition
Meaning of correlation
• Correlation is a statistical measure that
indicates the extent to which two or more
variables fluctuate together. A
positive correlation indicates the extent to
which those variables increase or decrease in
parallel; a negative correlation indicates the
extent to which one variable increases as the
other decreases.
Types of correlation
• Positive Correlation
• Negative Correlation
• Zero Correlation
• Linear Correlation
• Curvilinear Correlation
Methods of Correlation
• (A) Graphical Methods:
• Scatter diagram or scattergram.
• Simple graph or graphic Method.
• (B) Mathematical Methods:
• Karl Pearson’s Coefficient of Correlation
Method or Karl Pearson’s method.
• Spearman’s coefficient of correlation
• Concurrent Deviation Method
Scatter Diagram Method
• The scatter diagram method is the simplest way to study the correlation between two variables.
• The values of one variable are plotted along the X axis and the values of the other variable along the Y axis.
• For each pair of values a dot is plotted; the pattern of the dots indicates the direction of the relationship.
• The scatter of points on the graph shows whether the variables are related or not.
• The more widely the dots are scattered, the weaker the relationship between the two variables.
• The closer the dots lie to a straight line, the stronger the association between the variables.
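The plotting procedure described above can be sketched in a few lines of Python. This is only an illustrative text-mode sketch: the function name `ascii_scatter` and the grid size parameters are assumptions made for the example, and a real analysis would normally use a plotting library.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Return a text scatter diagram: X runs left to right, Y bottom to top."""
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1 when all values are equal, to avoid dividing by zero.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # flip rows so larger Y is drawn higher
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: Y rises with X, so the dots run from bottom-left to top-right.
print(ascii_scatter([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], width=5, height=5))
```

Dots rising from left to right suggest a positive correlation, dots falling from left to right suggest a negative one, and a shapeless cloud suggests little or no linear relationship.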
Scatter Diagram
Karl Pearson’s Coefficient Method
• Karl Pearson’s coefficient of correlation is also known as the Pearsonian coefficient of correlation or the product-moment correlation coefficient.
• Let X and Y be two random variables. The coefficient of correlation between X and Y is denoted by r(X, Y), or simply r_XY, and is defined as the covariance of X and Y divided by the product of their standard deviations:
$$ r(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y} $$
Steps to calculate the value of ‘r’
• The correlation coefficient is denoted by the letter ‘r’ and is also called Karl Pearson’s Coefficient of Correlation. It is calculated by the following steps:
• (a) Arrange the data as two series, x and y.
• (b) Calculate the mean of each series.
• (c) Calculate the deviation of each observation from the mean of its series, dx and dy.
• (d) Square the deviations to obtain dx² and dy².
• (e) Multiply the deviations of the two variables to obtain dx·dy.
• (f) Sum each column and substitute the totals into the formula to calculate ‘r’.
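Steps (a) to (f) can be sketched in Python as follows. The function name `pearson_r` and the sample figures are illustrative assumptions; the final line uses the actual-mean form r = Σdx·dy / √(Σdx² · Σdy²).

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r with deviations taken from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n                      # (b) mean of each series
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]             # (c) deviations from the means
    dy = [y - mean_y for y in ys]
    sum_dx2 = sum(d * d for d in dx)          # (d) squared deviations, summed
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))  # (e) products of deviations
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)  # (f) substitute the totals


# Hypothetical data, chosen only to illustrate the arithmetic.
print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 4))
```

For this data Σdx·dy = 6, Σdx² = 10 and Σdy² = 6, so r = 6 / √60, roughly 0.77, a fairly strong positive correlation.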
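Karl Pearson's Coefficient of Correlation
• When deviations are taken from the actual means (x and y denote the deviations of each observation from the mean of its own series):
$$ r(x, y) = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} $$
• When deviations are taken from assumed means (dx and dy denote the deviations from the assumed means, and N is the number of pairs of observations):
$$ r = \frac{N \sum dx\,dy - \sum dx \sum dy}{\sqrt{N \sum dx^2 - \left(\sum dx\right)^2} \; \sqrt{N \sum dy^2 - \left(\sum dy\right)^2}} $$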
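The assumed-mean formula gives the same value of r whatever assumed means are chosen, which is why it is convenient when the actual means are fractional. A minimal sketch, in which the function name and the assumed means 2 and 4 are arbitrary choices made for illustration:

```python
import math


def pearson_r_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Karl Pearson's r with deviations taken from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


# Hypothetical data; different assumed means lead to the same r.
xs, ys = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
print(round(pearson_r_assumed_mean(xs, ys, 2, 4), 4))
print(round(pearson_r_assumed_mean(xs, ys, 0, 0), 4))
```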
Spearman’s Rank Correlation
Coefficient
• Spearman’s rank correlation coefficient is a non-parametric statistical measure of the strength of association between two ranked variables.
• The method applies to ordinal data, that is, values that can be arranged in order, one after the other, so that a rank can be given to each.
• In the rank correlation method, each individual is ranked on the basis of its quality or quantity, starting from the 1st position and continuing to the Nth position for the individual ranked last in the group.
Spearman’s Rank Correlation formula
When ranks are given:
$$ R = 1 - \frac{6 \sum D^2}{N (N^2 - 1)} $$
• R = Rank correlation coefficient
• D = Difference of ranks between the paired items in the two series
• N = Total number of observations
Steps to Rank Correlation Coefficient
• These steps apply to problems where the actual ranks are given.
• Calculate the difference ‘D’ between the two ranks of each item, i.e. (R1 – R2).
• Square each difference and sum the squares, i.e. ΣD².
• Substitute the values obtained into the formula.
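The steps above can be sketched in Python, assuming the usual no-ties formula R = 1 − 6ΣD² / (N(N² − 1)). The function name and the two sets of ranks are hypothetical.

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """Spearman's rank correlation when the ranks are given and there are no ties."""
    n = len(ranks_1)
    # D = R1 - R2 for each item, then square and sum.
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_1, ranks_2))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks given to five candidates by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(spearman_from_ranks(judge_a, judge_b))
```

Here ΣD² = 4 and N = 5, so R = 1 − 24/120 = 0.8, indicating close agreement between the two rankings.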
Equal Ranks or Tie in Ranks
• When the same rank would be assigned to two or more entities, each of them receives the average of the ranks they would otherwise occupy. For example, if two individuals are tied at the third position, they jointly occupy positions 3 and 4, and each is given the rank:
• (3+4)/2 = 3.5
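The averaging rule can be sketched as a small Python function. The name `average_ranks` is illustrative, and ranking the largest value as 1st is an assumption made for the example; the source does not say which direction the ranking runs.

```python
def average_ranks(values):
    """Rank values with 1 for the largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # average of the positions occupied
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Hypothetical marks: the two scores of 70 tie for positions 3 and 4.
print(average_ranks([90, 85, 70, 70, 60]))
```

The two tied scores each receive (3+4)/2 = 3.5, and the next score takes rank 5.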
Equal Ranks or Tie in Ranks formula
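A form of the tie-corrected formula commonly given in textbooks is shown below. It is offered as an assumption about the formula intended here, with m denoting the number of items sharing a rank and the correction term added once for every group of tied items in either series:

```latex
R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}\left(m_1^3 - m_1\right) + \frac{1}{12}\left(m_2^3 - m_2\right) + \cdots\right]}{N\left(N^2 - 1\right)}
```

For a single pair of tied items (m = 2), the correction term is (2³ − 2)/12 = 0.5.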
Regression Analysis
• Regression analysis is a powerful statistical tool for predicting the value of one variable from the value of another, when the two variables are related to each other.
• Regression analysis is a mathematical measure of the average relationship between two or more variables.
• Regression analysis is used to predict the value of an unknown variable from a known variable.
Assumptions in Regression Analysis
• An actual linear relationship exists between the variables.
• The regression equation is used to estimate values only within the range for which it is valid.
• The relationship between the dependent and independent variables remains the same until the regression equation is calculated.
• The dependent variable takes any random value, but the values of the independent variables are fixed.
• In regression, there is only one dependent variable in the estimating equation.
• However, more than one independent variable can be used.
Correlation analysis vs. Regression analysis
• Regression is the average relationship between two variables.
• Correlation need not imply a cause-and-effect relationship between the variables under study; regression analysis clearly indicates the cause-and-effect relationship between the variables.
• There may be nonsense correlation between two variables; there is no such thing as nonsense regression.
Regression Analysis – Simple linear
regression
• Simple linear regression is a model that assesses
the relationship between a dependent variable
and an independent variable. The simple linear
model is expressed using the following equation:
• Y = a + bX + ϵ
• Where:
– Y – Dependent variable
– X – Independent (explanatory) variable
– a – Intercept
– b – Slope
– ϵ – Residual (error)
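One common way to estimate a and b from data is the method of least squares, which the source does not name and which is used here as an assumption: b = Σdx·dy / Σdx² and a = ȳ − b·x̄, where dx and dy are deviations from the means. The function name and the data are illustrative.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + bX + e."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    b = sum_dxdy / sum_dx2          # slope
    a = mean_y - b * mean_x         # intercept
    return a, b


# Hypothetical data, e.g. advertising spend (X) and sales (Y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_simple_regression(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # the error term for each point
print(round(a, 2), round(b, 2))
print(round(a + b * 6, 2))  # predicted Y for X = 6
```

With least-squares estimates the residuals sum to zero (up to rounding), which is a quick check on the arithmetic.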
Regression Analysis – Multiple linear
regression
• Multiple linear regression analysis is essentially similar
to the simple linear model, with the exception that
multiple independent variables are used in the model.
The mathematical representation of multiple linear
regression is:
• Y = a + bX1 + cX2 + dX3 + ϵ
• Where:
– Y – Dependent variable
– X1, X2, X3 – Independent (explanatory) variables
– a – Intercept
– b, c, d – Slopes
– ϵ – Residual (error)
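The intercept and slopes of a multiple linear regression can also be estimated by least squares. The sketch below is one possible approach, not a method stated in the source: it builds the normal equations and solves them by Gauss-Jordan elimination in plain Python. The function name and data are hypothetical, and the code assumes the independent variables are not perfectly collinear.

```python
def fit_multiple_regression(rows, ys):
    """Least-squares coefficients [a, b, c, ...] for Y = a + b*X1 + c*X2 + ...

    rows holds one list [x1, x2, ...] per observation.
    """
    design = [[1.0] + list(r) for r in rows]   # leading 1 carries the intercept
    k = len(design[0])
    # Augmented matrix of the normal equations (X'X | X'y).
    aug = [
        [sum(row[i] * row[j] for row in design) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(design, ys))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]                      # zero here means collinear inputs
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
ys = [1, 3, 4, 6, 8]
print([round(c, 6) for c in fit_multiple_regression(rows, ys)])
```

Because the sample data lie exactly on the plane Y = 1 + 2X1 + 3X2, the fitted coefficients recover the intercept 1 and the slopes 2 and 3, with all residuals equal to zero.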
