RNB GLOBAL UNIVERSITY
BCA 3rd SEMESTER
Mathematics-III
SUBMITTED TO : -
Ms. Purnima
SUBMITTED BY : -
Tanishq Soni
Correlation & Regression
CORRELATION
DEFINITION:- WHENEVER TWO VARIABLES X AND Y ARE SO RELATED THAT
A CHANGE IN ONE IS ACCOMPANIED BY A CHANGE IN THE OTHER, IN SUCH
A WAY THAT AN INCREASE (OR DECREASE) IN ONE IS ACCOMPANIED BY
AN INCREASE OR A DECREASE IN THE OTHER, THE VARIABLES ARE SAID TO BE
CORRELATED.
THE FOLLOWING CASES ARISE:-
1) IF Y INCREASES WHEN X INCREASES, THERE IS POSITIVE
CORRELATION.
2) IF Y DECREASES WHEN X INCREASES, THERE IS NEGATIVE
CORRELATION.
3) IF A CHANGE IN ONE VARIABLE DOES NOT AFFECT THE OTHER
VARIABLE, THERE IS NO CORRELATION, I.E. THE VARIABLES ARE SAID
TO BE UNCORRELATED.
ASSUMPTION OF CORRELATION ANALYSIS
The only real assumption of correlation analysis is that the
variables are measured at the interval level. However, the correlation
coefficient examines only the linear relationship between X and Y. So,
while correlation assumes nothing else about the variables, it can
miss a relationship that is not linear.
It is similar to the case with the mean: the arithmetic mean
assumes nothing about the variables except, again, that they
are interval level.
However, it is not necessarily the case that the mean or the
correlation is a poor choice, even with oddly distributed data.
It depends on what you are trying to measure.
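COEFFICIENT OF CORRELATION
(KARL PEARSON COEFFICIENT)
Karl Pearson defined the coefficient of correlation r
(or $r_{xy}$) between two variables x and y by the relation
$$r = r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y}$$
where $\sigma_x$ and $\sigma_y$ are the standard deviations (SD) of x and y, and
$$\operatorname{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2}$$
$$\sigma_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2}$$
Another formula for the correlation coefficient is
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}}$$
where $X = x - \bar{x}$ and $Y = y - \bar{y}$ are the deviations from the means.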
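The covariance and SD formulas above can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function name `pearson_r` and the sample data are assumptions made for the example, and the 1/n divisors follow the formulas as written above.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r as Cov(x, y) / (SD of x * SD of y), using 1/n divisors."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(x, y) = (1/n) * sum(x*y) - mean_x * mean_y
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    # SD = sqrt((1/n) * sum(x^2) - mean^2)
    sd_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sd_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (sd_x * sd_y)

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

For this sample the covariance is 1.2 and the variances are 2 and 1.2, so r comes out close to 0.77.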
COEFFICIENT OF DETERMINATION ($R^2$)
The coefficient of determination is used to analyze how much of the variation
in one variable can be explained by variation in a second variable.
STEP-1 = Find r (the correlation coefficient).
STEP-2 = Square r.
STEP-3 = Convert the squared value to a percent (%)
by multiplying it by 100.
The coefficient of determination can be read as a percentage: it is the
share of the variation in one variable that is accounted for by the line
formed by the regression equation.
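The three steps can be sketched in Python as below. The function name and the value r = 0.8 are hypothetical and used only for illustration. Step 1 (finding r) is assumed to have been done already.

```python
def determination_percent(r):
    """Square a correlation coefficient and express it as a percentage."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    r_squared = r ** 2        # STEP-2: square r
    return r_squared * 100.0  # STEP-3: convert to a percent

r = 0.8  # STEP-1: hypothetical correlation coefficient
pct = determination_percent(r)
print(f"{pct:.0f}%")
```

With r = 0.8 the result is 64%, meaning that under a linear fit about 64% of the variation in one variable is explained by the other.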
MEASUREMENT OF CORRELATION - KARL PEARSON’S COEFFICIENT
The Pearson correlation coefficient, often referred to as the Pearson R
test, is a statistical measure of the strength and direction of the linear
relationship between two variables. To determine how strong the relationship
is, you need to find the value of the COEFFICIENT, which
ranges from -1 to 1.
The Karl Pearson coefficient equation is
$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
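As a small worked illustration of the deviation form, take five made-up pairs of values (not from the original slides):

```latex
\begin{aligned}
x &: 1, 2, 3, 4, 5 \qquad \bar{x} = 3 \\
y &: 2, 4, 5, 4, 5 \qquad \bar{y} = 4 \\
\sum (x_i - \bar{x})(y_i - \bar{y}) &= (-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1) = 6 \\
\sum (x_i - \bar{x})^2 &= 4 + 1 + 0 + 1 + 4 = 10 \\
\sum (y_i - \bar{y})^2 &= 4 + 0 + 1 + 0 + 1 = 6 \\
r_{xy} &= \frac{6}{\sqrt{10 \times 6}} = \frac{6}{\sqrt{60}} \approx 0.775
\end{aligned}
```

The value is positive and fairly close to 1, which indicates a reasonably strong positive linear relationship for this sample.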
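MEASUREMENT OF CORRELATION - SPEARMAN’S RANK CORRELATION
The coefficient of correlation between the X’s and Y’s can be
found by the same method as Pearson’s coefficient of
correlation. However, only the ranks of the individuals are used,
so the result is called the RANK CORRELATION COEFFICIENT of
the characteristics X and Y for that group of individuals.
Assuming that no two individuals are ranked equal, the ranks
take the values 1, 2, ..., n.
Formula for unrepeated ranks:-
$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
Formula for repeated ranks:-
$$\rho = 1 - \frac{6\left(\sum d^2 + \sum \frac{t^3 - t}{12}\right)}{n(n^2 - 1)}$$
where d is the difference between the two ranks of an individual, n is the
number of individuals, and t is the number of individuals sharing a repeated
rank (one term for each group of tied ranks).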
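Both formulas can be sketched with one Python function, since the tie correction is zero when no rank is repeated (t = 1 gives t³ − t = 0). The function names, the choice to give rank 1 to the largest value, and the sample data are assumptions made for this illustration. Tied values share the average of the ranks they would occupy.

```python
from collections import Counter

def average_ranks(values):
    """Rank values 1..n (1 = largest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho using the repeated-rank formula from the text."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two series of equal length, at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # One (t^3 - t)/12 term per group of t tied values, in either series.
    correction = sum((t ** 3 - t) / 12
                     for series in (xs, ys)
                     for t in Counter(series).values())
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Hypothetical scores; the value 2 is repeated in the first series.
rho_tied = spearman_rho([1, 2, 2, 3], [1, 2, 3, 4])
rho_perfect = spearman_rho([10, 20, 30, 40, 50], [1, 2, 3, 4, 5])
print(rho_tied, rho_perfect)
```

In the tied example, Σd² = 0.5 and the correction term is 0.5, so ρ = 1 − 6(1.0)/(4 × 15) = 0.9.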
CORRELATION COEFFICIENT BY CONCURRENT DEVIATIONS
• This is another simple method of obtaining a quick
but crude idea of the correlation between two variables.
• Here only the direction of change in each variable
is noted, by comparing each value with its
preceding value.
• If a value is greater than its preceding value, it is
marked with a positive (+) sign; if less, with
a negative (-) sign; and if equal, with an
equal (=) sign.
• All pairs having the same sign, i.e. both deviations
positive, both negative, or both equal, are known as CONCURRENT
DEVIATIONS and are marked with a positive (+) sign in a separate column
designated “CONCURRENCES”.
• The number of such concurrences is denoted by C.
• Similarly, the remaining pairs, which have different signs, are marked with a
negative (-) sign in another column designated “DISAGREEMENTS”.
• The coefficient of correlation, denoted by $r_c$, is given by
the following formula:
$$r_c = \pm\sqrt{\pm\left(\frac{2C - D}{D}\right)}$$
Where:
C denotes the number of concurrences,
D is the number of pairs of deviations compared,
i.e. D = number of observations - 1 = (n - 1).
Both ± signs take the sign of (2C - D), so the quantity under the root is never negative.
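The procedure above can be sketched in Python as follows. The function names and the sample data are hypothetical. Directions of change are coded as +1, -1 and 0 (for "="), and a pair counts as a concurrence when both codes match.

```python
import math

def direction(prev, curr):
    """Return +1 if curr > prev, -1 if curr < prev, and 0 if equal."""
    return (curr > prev) - (curr < prev)

def concurrent_deviation_r(xs, ys):
    """Coefficient of concurrent deviations r_c = ±sqrt(±(2C - D) / D)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 items")
    dx = [direction(a, b) for a, b in zip(xs, xs[1:])]
    dy = [direction(a, b) for a, b in zip(ys, ys[1:])]
    d = len(dx)  # D = n - 1 pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a == b)  # C = concurrences
    ratio = (2 * c - d) / d
    # Both ± signs follow the sign of (2C - D).
    sign = 1 if ratio >= 0 else -1
    return sign * math.sqrt(abs(ratio))

# Hypothetical series: the directions agree in 3 of the 4 steps.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
rc = concurrent_deviation_r(xs, ys)
print(round(rc, 4))
```

Here C = 3 and D = 4, so r_c = +√((6 − 4)/4) = √0.5, roughly 0.71.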
REGRESSION
‘Regression’ stands for some sort of functional relationship between two or
more related variables.
Suppose that the scatter diagram indicates some relationship
between the two variates x and y; the dots of the scatter diagram will be
more or less concentrated round a curve.
This curve is called the curve of regression.
LINES OF REGRESSION :- When the curve is a straight line, it is called a line of
regression and the regression is said to be linear. A line of regression is the
straight line which gives the best fit, in the least-squares sense, to the given
data.
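EQUATION OF THE LINES OF REGRESSION
The regression line of Y on X is :-
$$Y - \bar{y} = r \, \frac{\sigma_y}{\sigma_x} \, (X - \bar{x})$$
The regression line of X on Y is :-
$$X - \bar{x} = r \, \frac{\sigma_x}{\sigma_y} \, (Y - \bar{y})$$
where $\sigma_x$ and $\sigma_y$ are the standard deviations (SD) of x and y, and $\bar{x}$ and $\bar{y}$ are their means.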
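The two lines can be sketched in Python as below. This is an illustration under stated assumptions: the function name `regression_lines`, the returned tuple layout, and the sample data are hypothetical, and the SDs use 1/n divisors.

```python
import math

def regression_lines(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for the two regression lines.

    Y on X: Y - mean_y = b_yx * (X - mean_x), with b_yx = r * SD_y / SD_x
    X on Y: X - mean_x = b_xy * (Y - mean_y), with b_xy = r * SD_x / SD_y
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two series of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("regression is undefined for a constant variable")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    return mean_x, mean_y, r * sd_y / sd_x, r * sd_x / sd_y

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_lines(xs, ys)

# Estimate Y at X = 6 from the line of Y on X.
y_at_6 = mean_y + b_yx * (6 - mean_x)
print(round(b_yx, 3), round(b_xy, 3), round(y_at_6, 3))
```

For this sample the slope of Y on X is 0.6 and the slope of X on Y is 1.0. The two lines coincide only when r = ±1, which is why it matters which variable is being estimated.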
