In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense "correlation" may indicate any type of association, in statistics it normally refers to the degree to which a pair of variables are linearly related. Familiar examples of dependent phenomena include the correlation between the height of parents and their offspring, and the correlation between the price of a good and the quantity consumers are willing to purchase, as depicted in the demand curve.
Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example, there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling. However, in general, the presence of a correlation is not sufficient to infer the presence of a causal relationship (i.e., correlation does not imply causation).
Formally, random variables are dependent if they do not satisfy a mathematical property of probabilistic independence. In informal parlance, correlation is synonymous with dependence. However, when used in a technical sense, correlation refers to any of several specific types of mathematical operations between the tested variables and their respective expected values. Essentially, correlation is the measure of how two or more variables are related to one another. There are several correlation coefficients, often denoted ρ or r, measuring the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may be present even when one variable is a nonlinear function of the other). Other correlation coefficients, such as Spearman's rank correlation, have been developed to be more robust than Pearson's, that is, more sensitive to nonlinear relationships.[1][2][3] Mutual information can also be applied to measure dependence between two variables.
2. Correlation
• The concept of correlation was first proposed by Sir Francis Galton in 1888 and was further described mathematically by Karl Pearson in 1896.
• Correlation is a method for finding the relationship between two variables in which a change in the value of one variable is accompanied by a change in the value of the other variable.
• For example, an increase in rainfall is, to some extent, accompanied by an increase in the volume of production; a decrease in the price of a commodity is accompanied by an increase in the quantity demanded; and an increase in advertising expenditure is accompanied by an increase in sales.
• The value of a correlation coefficient lies between -1 and 1.
Fig.: Karl Pearson
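As a minimal sketch of how a coefficient between -1 and 1 can be computed, the following uses the standard sample formula for the Pearson correlation coefficient. The function name `pearson_r` and the rainfall and production figures are made up for illustration and do not come from the slides.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross products and sums of squares of the deviations.
    s_xy = sum(a * b for a, b in zip(dx, dy))
    s_xx = sum(a * a for a in dx)
    s_yy = sum(b * b for b in dy)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative, made-up values (not real measurements).
rainfall = [10, 20, 30, 40, 50]
production = [12, 25, 31, 38, 55]
r = pearson_r(rainfall, production)
print(f"r = {r:.3f}")
```

A value close to +1 means the two series rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear association.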
3. Types of Correlation
1. Positive and negative correlation.
• If two variables vary in the same direction, i.e. an increase or decrease in the value of one variable is accompanied by an increase or decrease in the value of the other, the two variables are said to have a positive correlation.
• On the other hand, two variables are said to have a negative correlation if they move in opposite directions, i.e. when one variable increases the other decreases, and vice versa.
2. Linear and non-linear correlation.
• The correlation between two variables is said to be linear when a unit change in one variable results in a constant change in the other variable over the entire range of values.
• If a unit change in one variable does not correspond to a constant change in the other variable, the correlation is said to be non-linear.
3. Simple, multiple and partial correlation.
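The first two distinctions above can be sketched directly from their definitions. In this sketch the helper names `direction` and `is_exactly_linear` are hypothetical, the data are made up, and the linearity check assumes noise-free data with distinct x values; real data would need a tolerance-based or statistical check instead.

```python
def direction(xs, ys):
    """Classify co-movement as 'positive', 'negative' or 'none'
    from the sign of the summed products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "none"


def is_exactly_linear(xs, ys, tol=1e-9):
    """True when a unit change in x always gives the same change in y.
    Assumes distinct x values and noise-free data."""
    points = sorted(zip(xs, ys))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
    return all(abs(s - slopes[0]) <= tol for s in slopes)


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
rising_line = [3, 5, 7, 9, 11]     # +2 per unit of x
falling_line = [10, 8, 6, 4, 2]    # -2 per unit of x
curve = [1, 4, 9, 16, 25]          # change in y grows with x

for name, ys in [("rising", rising_line), ("falling", falling_line), ("curve", curve)]:
    print(name, direction(xs, ys), "linear" if is_exactly_linear(xs, ys) else "non-linear")
```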
4. Regression
• Regression analysis is a statistical process for estimating the relationships among variables.
• It includes many techniques for modeling and analyzing several variables,
when the focus is on the relationship between a dependent variable and one
or more independent variables (or ‘predictors’).
• Furthermore, regression analysis helps one understand how the typical value
of the dependent variable (or 'criterion variable') changes when any one of the
independent variables is varied, while the other independent variables are
held fixed.
• In regression analysis, it is also of interest to characterize the variation of the
dependent variable around the regression function, which can be described by
a probability distribution.
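To make the points above concrete, here is a minimal sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The slides do not name a specific technique, so the choice of least squares, the function name `fit_line`, and the advertising and sales numbers are assumptions for illustration only.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("x must not be constant")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data: independent variable x, dependent variable y.
advertising = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(advertising, sales)

# Variation of the dependent variable around the regression function.
residuals = [y - (intercept + slope * x) for x, y in zip(advertising, sales)]
residual_sd = math.sqrt(sum(e * e for e in residuals) / (len(sales) - 2))

print(f"sales ~ {intercept:.2f} + {slope:.2f} * advertising")
print(f"residual standard deviation: {residual_sd:.3f}")
```

The slope estimates how much the typical value of the dependent variable changes per unit change in the independent variable, and the residuals describe the scatter around the fitted line.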
5. Differences between Correlation and Regression
SN. | Correlation | Regression
1 | Correlation, as the name suggests, determines the interconnection or co-relationship between the variables. | Regression explains how an independent variable is numerically associated with the dependent variable.
2 | In correlation, no distinction is made between independent and dependent variables. | In regression, the dependent and independent variables play different roles.
3 | The primary objective of correlation is to find a quantitative/numerical value expressing the association between the variables. | The primary objective of regression is to estimate the values of a random variable based on the values of a fixed variable.
4 | Correlation indicates the degree to which the two variables move together. | Regression specifies the effect of a unit change in the known variable (p) on the estimated variable (q).
5 | Correlation helps to establish the connection between the two variables. | Regression helps in estimating a variable's value based on another given value.
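Rows 2 and 4 of the table can be checked numerically. In this sketch, swapping the two variables leaves the correlation unchanged but changes the regression slope. The names p and q follow the table; the helper `deviation_sums` is hypothetical and the data values are made up.

```python
import math


def deviation_sums(xs, ys):
    """Return (s_xx, s_yy, s_xy): sums of squared deviations and cross products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xx, s_yy, s_xy


# Made-up illustrative data.
p = [1, 2, 3, 4, 5]
q = [2, 4, 5, 4, 5]

s_pp, s_qq, s_pq = deviation_sums(p, q)

r_pq = s_pq / math.sqrt(s_pp * s_qq)   # correlation of p with q
r_qp = s_pq / math.sqrt(s_qq * s_pp)   # correlation of q with p (same value)

slope_q_on_p = s_pq / s_pp   # change in q per unit change in p
slope_p_on_q = s_pq / s_qq   # change in p per unit change in q

print(f"r(p, q) = {r_pq:.4f}, r(q, p) = {r_qp:.4f}")
print(f"slope of q on p = {slope_q_on_p:.2f}, slope of p on q = {slope_p_on_q:.2f}")
```

For least squares fits the two slopes multiply to the squared correlation, which is one way to see that correlation summarizes the strength of the association while each regression describes a directed, unit-dependent effect.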
6. Application of Correlation coefficient in different fields.
1. Correlation Analysis in Biological Data
In biological research, correlation analysis is used to understand the relationship between independent variables (or risk factors) and a dependent variable (or the disease outcome). The selected variables may be continuous or ordinal. For example, to assess the relationship between systolic blood pressure (SBP) (continuous, dependent) and risk factors/independent variables such as age (continuous) and weight (continuous), Pearson's correlation analysis would be used. In contrast, to assess the relationship between maternal age (continuous) and parity (ordinal), or between number of hospitalizations (ordinal) and history of stroke (ordinal), Spearman's correlation analysis would be used.
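A minimal sketch of Spearman's rank correlation for the ordinal case: each variable is replaced by its ranks (tied values share the average rank) and the Pearson formula is applied to the ranks. The function names are hypothetical and the maternal age and parity values are made up for illustration; they are not clinical data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    s_xx = sum((a - mean_x) ** 2 for a in rx)
    s_yy = sum((b - mean_y) ** 2 for b in ry)
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up illustrative values.
maternal_age = [19, 23, 27, 31, 35, 40]   # continuous
parity = [0, 1, 1, 2, 3, 3]               # ordinal, with ties

rho = spearman_rho(maternal_age, parity)
print(f"rho = {rho:.3f}")
```

Because only the ordering of the values matters, any monotonic relationship, linear or not, gives a rho of exactly 1 or -1.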
2. Correlation in the finance and investment industries.
Correlation, in the finance and investment industries, is a statistic that measures the degree to which two securities move in relation to each other. Correlations are used in advanced portfolio management and are computed as the correlation coefficient, which must fall between -1.0 and +1.0.
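One common way the coefficient enters portfolio calculations is through the textbook formula for the volatility of a two-asset portfolio. The slides do not say which calculation they have in mind, so the sketch below is an assumed illustration; the function name, weights and volatilities are hypothetical.

```python
import math


def portfolio_volatility(weight_a, sigma_a, sigma_b, rho):
    """Standard deviation of a two-asset portfolio.

    weight_a: fraction invested in asset A (the rest goes to asset B)
    sigma_a, sigma_b: standard deviations of the two assets' returns
    rho: correlation coefficient between the two assets' returns
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie between -1.0 and +1.0")
    weight_b = 1.0 - weight_a
    variance = (
        (weight_a * sigma_a) ** 2
        + (weight_b * sigma_b) ** 2
        + 2 * weight_a * weight_b * rho * sigma_a * sigma_b
    )
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding


# Hypothetical 50/50 portfolio of two assets, each with 20% volatility.
for rho in (1.0, 0.5, 0.0, -1.0):
    vol = portfolio_volatility(0.5, 0.20, 0.20, rho)
    print(f"rho = {rho:+.1f} -> portfolio volatility = {vol:.3f}")
```

The lower the correlation between the two securities, the lower the combined volatility for the same weights, which is why the coefficient matters for diversification.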
3. Correlation in Weather forecast.
According to a Report…..
To determine a possible correlation between the clinical variables (overall patient volume and incidence) and weather data, a Spearman's rank correlation test was used. The advantages of this method are that no assumption about the statistical distribution of the data is required and that it is robust to outliers. The analysis yields the Spearman's rank correlation coefficient "rho" (ρ), which lies between −1 and 1 and describes the strength and direction of a correlation. In addition, a P-value is calculated. It is the probability of observing a correlation at least as strong as the one measured if the null hypothesis ("the weather has no influence on clinical variables") were true.
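The report does not say how its P-value was computed. One generic way to obtain one is a permutation test, sketched below under stated assumptions: the data contain no ties (so the shortcut formula for rho applies), the function names are hypothetical, and the temperature and patient counts are made up for illustration.

```python
import random


def ranks_no_ties(values):
    """Ranks 1..n, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_no_ties(xs, ys):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); valid without ties."""
    n = len(xs)
    rx, ry = ranks_no_ties(xs), ranks_no_ties(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def permutation_p_value(xs, ys, n_permutations=5000, seed=0):
    """Two-sided P-value: share of shuffles with |rho| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(spearman_no_ties(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if abs(spearman_no_ties(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_permutations + 1)


# Made-up illustrative data: daily temperature and patient volume.
temperature = [2, 5, 9, 14, 18, 21, 25, 30]
patients = [41, 39, 44, 50, 48, 55, 60, 58]

rho = spearman_no_ties(temperature, patients)
p_value = permutation_p_value(temperature, patients)
print(f"rho = {rho:.3f}, permutation P-value = {p_value:.4f}")
```

A small P-value says that a correlation this strong would rarely arise by chance if the weather had no influence; it does not measure the probability that the null hypothesis itself is true.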
4. Correlation in Business.
Correlation is another method of sales forecasting. Correlation looks at the strength of a relationship between two variables. For marketing, it might be useful to know that there is a predictable relationship between sales and factors such as advertising, weather, consumer income, etc. Correlation is usually visualized using a scatter diagram, on which data points are plotted. For example, a data point might measure the number of customer enquiries generated per week (x-axis) against the amount spent on advertising (y-axis). This is illustrated below:
5. Correlation in Education.
• Correlation now plays a vital role in the education sector. It helps to predict the pass ratio of students, average marks and the number of admissions.
• Correlation is also used to predict income and expenditure in schools and colleges.
Example :-