13. Correlation coefficient (r test)
CORRELATION
[Q:
• Define correlation. (BSMMU, MD Radiology, January 2010,
July 2009)
• Short note: Correlation & regression (BSMMU, MD
Radiology, January, 2009)]
In statistics, the word correlation refers to the relationship between
two variables. If a change in one variable is accompanied by a change
in the other variable, the variables are said to be correlated.
Sometimes two continuous characters are measured in the same
person, such as weight and cholesterol, or weight and height. At
other times, the same character is measured in two related groups,
such as tallness in parents and tallness in children, or intelligence
quotient (IQ) in brothers and in their corresponding sisters
(siblings). The relationship or association between two
quantitatively measured or continuous variables is called
correlation.
Remember, correlation does not imply causation.
The relationship between two random variables is known as a
bivariate relationship. The known variable (or variables) is called
the independent variable(s). The variable we are trying to predict
is the dependent variable.
Example: A medical researcher may be interested in the bivariate
relationship between a patient’s blood pressure x and heart rate y.
Here x is the independent variable and y is the dependent variable.
Types of correlation
[Q:
Biostatistics-126
• Discuss different types of correlation with figures.
(BSMMU, MD Radiology, January, 2010)
• Classify correlation with figures of each. (BSMMU, MD
Radiology, July, 2009)]
1. Positive correlation:
➢ If the variables move in the same direction, the correlation
is called positive correlation.
➢ In positive correlation, the two variables react in the same
way, increasing or decreasing together.
➢ Example:
a. Height and weight of a group of people are positively
correlated.
b. Temperatures in Celsius and Fahrenheit have a perfect
positive correlation.
➢ In perfect positive correlation, the coefficient of correlation
(r) = +1, and in partial (less than perfect) positive correlation
0 < r < 1.
2. Negative correlation:
➢ If the variables move in opposite directions, the correlation
is called negative correlation.
➢ In negative correlation, as one variable increases, the other
decreases.
➢ Example: One variable might be the number of hunters in a
region and the other variable could be the deer population.
Perhaps as the number of hunters increases, the deer
population decreases. This is an example of a negative
correlation.
➢ In perfect negative correlation, the coefficient of correlation
(r) = -1, and in partial (less than perfect) negative correlation
-1 < r < 0.
3. Zero correlation:
➢ If the movement of one variable is not accompanied by any
consistent movement of the other variable, the variables are
not correlated; this is called zero correlation.
➢ In zero correlation, the coefficient of correlation (r) = 0.
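The three types above can be checked numerically. The following is a minimal Python sketch rather than part of the original notes: the helper name `pearson_r`, the deer counts, and the four-point "flat" data set are made up for illustration, and r is computed from deviations about the means.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_x2 = sum(dx * dx for dx in dev_x)
    sum_y2 = sum(dy * dy for dy in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)


# Perfect positive correlation: Fahrenheit is a straight-line function of Celsius.
celsius = [0, 10, 20, 30, 40]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# Negative correlation: hypothetical hunter and deer counts (illustrative only).
hunters = [1, 2, 3, 4, 5]
deer = [52, 41, 33, 18, 12]

# Zero correlation: hypothetical data with no linear trend.
flat_x = [1, 2, 3, 4]
flat_y = [2, 1, 1, 2]

r_pos = pearson_r(celsius, fahrenheit)
r_neg = pearson_r(hunters, deer)
r_zero = pearson_r(flat_x, flat_y)

print(f"positive: {r_pos:+.3f}  negative: {r_neg:+.3f}  zero: {r_zero:+.3f}")
```

The sign of r gives the direction of the relationship, and its size shows how close the points lie to a straight line.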
Correlation in brief
➢ When the value of one variable is related to the value of
another, they are said to be correlated.
➢ The coefficient of correlation (r) measures such a relationship.
➢ The value of r ranges from -1 (perfectly correlated in the
negative direction) to +1 (perfectly correlated in the positive
direction).
➢ When r = 0, the two variables are not linearly correlated.
How can you tell if there is a correlation?
By observing the graph, a person can tell if there is a correlation
by how closely the data resemble a line. If the points are scattered
about, then there may be no correlation. If the points closely fit
a quadratic or exponential equation, etc., then they have
a nonlinear correlation.
How can you tell by inspection the type of correlation?
If the graph of the variables resembles a line with positive slope,
then there is a positive correlation (y increases as x increases). If
the slope of the line is negative, then there is a negative
correlation (as x increases, y decreases).
Correlation coefficient
[Q: Write a short note on: Coefficient of correlation. (BSMMU, MD
Radiology, January, 2010)]
An important aspect of correlation is how strong it is.
The extent or degree of relationship between two sets of figures is
measured in terms of a parameter called the correlation coefficient.
It is denoted by the letter 'r'.
Another name for r is the Pearson product moment correlation
coefficient, in honor of Karl Pearson, who developed it about 1900.
When two variable characters in the same series or individuals are
measurable in quantitative units, such as height and weight;
temperature and pulse rate; age and vital capacity; circulating
proteins in grams and surface area in square meters; or systolic and
diastolic blood pressure in mm Hg, it is often necessary and
possible to know not only whether there is any association or
relationship between them but also the degree or extent of
such relationship.
Correlation coefficient (r) test
Measures the relationship between two variables when one is
dependent on the other.
Formula

\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} \]

or

\[ r = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}} \]

where \(X = x - \bar{x}\) and \(Y = y - \bar{y}\)
d.f. = n - 2, where n = number of pairs of observations
Here x = one variable
y = the other variable
Example
Problem: Find the correlation coefficient between the
following variables.
x variable: 5, 8, 12, 15.
y variable: 20, 25, 28, 30.
Solution:
The following table shows the relationship between the above
variables.
x     y     X = x - x̄    Y = y - ȳ    X²     Y²        XY
5     20    -5           -5.75        25     33.0625   28.75
8     25    -2           -0.75        4      0.5625    1.50
12    28     2            2.25        4      5.0625    4.50
15    30     5            4.25        25     18.0625   21.25
x̄ = 10, ȳ = 25.75, Sum X² = 58, Sum Y² = 56.75, Sum XY = 56.00

\[ r = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}} = \frac{56}{\sqrt{58 \times 56.75}} = \frac{56}{\sqrt{3291.5}} = \frac{56}{57.37} = 0.976 \]

d.f. = n - 2 = 4 - 2 = 2
r = 0.976 means strong correlation
p value at 2 d.f. < 0.05
Null hypothesis rejected at the 5% level.
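The worked example can be recomputed step by step. The sketch below is a hedged illustration, not the author's own procedure: it rebuilds the X, Y, X², Y² and XY columns from the raw data, which gives Sum XY = 56 and Sum Y² = 56.75 and hence r of about 0.976. The significance step uses the standard t form for a correlation coefficient, t = r√(n - 2)/√(1 - r²) on n - 2 degrees of freedom, which is an assumption here because the notes only quote a table look-up.

```python
from math import sqrt

x = [5, 8, 12, 15]
y = [20, 25, 28, 30]
n = len(x)

mean_x = sum(x) / n                        # 10.0
mean_y = sum(y) / n                        # 25.75

# Deviations from the means (the X and Y columns of the table)
X = [xi - mean_x for xi in x]
Y = [yi - mean_y for yi in y]

sum_X2 = sum(d * d for d in X)             # 58.0
sum_Y2 = sum(d * d for d in Y)             # 56.75
sum_XY = sum(a * b for a, b in zip(X, Y))  # 56.0

r = sum_XY / sqrt(sum_X2 * sum_Y2)

# Assumed significance test (not spelled out in the notes):
# t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom.
df = n - 2
t = r * sqrt(df) / sqrt(1 - r * r)

print(f"r = {r:.3f}, t = {t:.2f} on {df} d.f.")
```

With 2 degrees of freedom the two-sided 5% critical value of t is 4.303 and the 1% value is 9.925, so a t of about 6.35 is significant at the 5% level but not at the 1% level.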
Strength of correlation:
Correlation coefficient     Degree of association
0.8 to 1                    Strong
0.5 to 0.79                 Moderate
0.2 to 0.49                 Weak
0 to 0.19                   Negligible
Another grading:
1. Strong positive correlation: when r = 0.80 to 0.99, i.e., r ≥ 0.8
2. Moderate positive correlation: when r = 0.70 to 0.79
3. Limited degree of correlation: when r = 0.50 to 0.69
4. No correlation or zero correlation: when r < 0.5
5. Negative correlation: when r is negative (r < 0)
N.B.: The extent of correlation varies between minus one and plus one,
i.e., -1 ≤ r ≤ +1.
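The four-band table (strong, moderate, weak, negligible) can be turned into a small look-up. This is only a sketch under stated assumptions: the function name `strength_of_correlation` is hypothetical, the bands are applied to the absolute value of r so that negative coefficients are graded the same way, and values that fall between two printed bands (for example 0.795) are assigned to the lower band.

```python
def strength_of_correlation(r):
    """Return (direction, degree of association) for a coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    size = abs(r)
    if size >= 0.8:
        degree = "strong"
    elif size >= 0.5:
        degree = "moderate"
    elif size >= 0.2:
        degree = "weak"
    else:
        degree = "negligible"
    return direction, degree


for value in (0.9633, -0.6, 0.1):
    print(value, strength_of_correlation(value))
```

Such labels are conventions rather than tests: whether a given r is statistically significant still depends on the sample size.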
Problem for practice
During a laboratory experiment, muscular contractions of a frog
muscle were measured against different doses of a given drug. The
height of the curve was considered as the response to the drug.
The observations were as below.
Serial number of experiment   1      2      3      4      5
Dose of drug                  0.3    0.4    0.6    0.8    0.9
Response to drug              54.0   59.0   60.0   65.0   70.0
From the above data calculate the correlation coefficient and its
significance.
[Answer: r = 0.9633, p < 0.01]
[Q:
• Calculate Pearson's correlation coefficient between the X
and Y variables given below:
X = 5, 7, 10, 12    Y = 4, 6, 9, 11
(BSMMU, MD Radiology, January, 2010)
• Find the correlation coefficient between the following
2 variables. (BSMMU, MD Radiology, January, 2009)
Variable-I (X-variable): 10, 15, 20, 25 (n = 4)
Variable-II (Y-variable): 30, 35, 40, 45 (n = 4)
• The length and weight of 7 mice are given below.
Compute 'r' and test for its significance.
Length = 2, 5, 8, 12, 14, 19, 22.
Weight = 1, 4, 3, 4, 8, 9, 8.
(BSMMU, MD Radiology, January, 2010)]
What Is Rank Correlation?
Consider a situation where the data do not contain precise
sample values, so that precise measurement is unattainable. In
this situation, the data may be ranked (as in the GPA system, where
different ranges of marks are ranked as different grades) in order of
size, importance, etc., using the numbers 1, 2, ..., n. Correlations
computed from such ranks are called rank-order statistics or rank
correlations. Rank correlation is used when the data are not
presented as precise sample values.
What is the coefficient of rank correlation?
Suppose the data values x and y are each ranked in order of size.
The correlation coefficient can then be computed from the ranks
instead of the original values. This coefficient of rank correlation
is denoted by r_rank, or briefly r, and is calculated by the equation

\[ r_{rank} = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]

where
d = difference between the ranks of corresponding x and y
n = number of pairs of values (x, y) in the data
The above equation is called SPEARMAN'S FORMULA FOR
RANK CORRELATION.
Example: A group of 5 Army officers participated in both a SWIMMING
and a RUNNING competition. The following table gives their ranks
according to their achievements in the two events. The table also
shows the difference between the ranks (d) and the square of each
difference (d²).
OFFICER   RUNNING (x)   SWIMMING (y)   d = x - y   d²
Selim     5             3               2          4
Habib     2             1               1          1
Ismail    4             5              -1          1
Tauhid    1             2              -1          1
Mesbah    3             4              -1          1
From the above table we have Sum d² = 4 + 1 + 1 + 1 + 1 = 8 and n = 5,
so by Spearman's formula
r_rank = 1 - (6 × 8) / (5 × (5² - 1)) = 1 - 48/120 = 0.6
• Spearman's rank correlation is a technique used to test
the direction and strength of the relationship between two
variables. In other words, it is a device to show whether
one set of numbers is associated with another set of
numbers.
• It uses the statistic r_rank (Rs), which falls between -1 and +1.
• The null hypothesis is that there is no association (Rs = 0). It
is rejected when the calculated Rs differs from 0 by more than the
critical value for the sample size; otherwise it is accepted.
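The officers' example can be reproduced in a few lines. This is a minimal sketch using Spearman's formula in its usual form, r_rank = 1 - 6·Sum d² / (n(n² - 1)); the function name `spearman_from_ranks` is hypothetical, and the formula in this form assumes there are no tied ranks.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation from two lists of ranks (no ties)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


officers = ["Selim", "Habib", "Ismail", "Tauhid", "Mesbah"]
running = [5, 2, 4, 1, 3]     # ranks in running (x)
swimming = [3, 1, 5, 2, 4]    # ranks in swimming (y)

sum_d2 = sum((a - b) ** 2 for a, b in zip(running, swimming))
r_rank = spearman_from_ranks(running, swimming)

print(f"Sum d^2 = {sum_d2}, r_rank = {r_rank:.2f}")
```

The squared differences make the sign of each d irrelevant, so the order of subtraction (running minus swimming or the reverse) does not change the result.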
The rank correlation method can be used when
1. The values of the variables are available in rank order form.
2. The data are qualitative in nature and can be ranked in some
order.
3. The data were originally quantitative in nature but because of
smallness of sample size were converted into ranks.
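Point 3, converting quantitative data to ranks, can be sketched as follows. The example reuses the dose and response figures from the practice problem earlier in these notes; the helper names `to_ranks` and `spearman_r` are hypothetical, rank 1 is given to the smallest value, and tied values are rejected because handling them (average ranks) is outside this sketch.

```python
def to_ranks(values):
    """Replace each value by its rank (1 = smallest). Ties are not handled."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks, which this sketch omits")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(xs, ys):
    """Rank both variables, then apply Spearman's formula."""
    rank_x, rank_y = to_ranks(xs), to_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


dose = [0.3, 0.4, 0.6, 0.8, 0.9]
response = [54.0, 59.0, 60.0, 65.0, 70.0]

print(to_ranks(dose), to_ranks(response))
r_rank = spearman_r(dose, response)
print(f"r_rank = {r_rank:.2f}")
```

Because the response rises every time the dose rises, the two sets of ranks agree exactly and r_rank is 1, even though the product moment coefficient quoted for the same data is 0.9633. Ranking keeps only the order of the values, not their spacing.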
What are the types of correlation coefficient? Discuss with
figure. (BSMMU, MD Radiology, January, 2009)
Pearson's correlation coefficient or Spearman's rank
correlation coefficient
When the associated variables are normally distributed, such as
height and weight, Pearson's correlation coefficient is used. When
two variables are correlated but not normally distributed, Spearman's
formula for the rank correlation coefficient is used.
When calculating a correlation coefficient for ordinal data, select
Spearman's technique. For interval or ratio-type data, use
Pearson's technique.
REGRESSION
In experimental sciences, after the correlation between two variables
has been understood, there are situations when it is necessary to
estimate or predict the value of one character (variable, say Y) from
the knowledge of the other character (variable, say X), such as
estimating height when weight is known. This is possible when the
two are linearly correlated. The former variable (Y, i.e., height),
which is to be estimated, is called the dependent variable and the
latter (X, i.e., weight), which is known, is called the independent
variable. This is done by finding another constant called the
regression coefficient (b).
People use regression on an intuitive level every day. In business, a
well-dressed man is thought to be financially successful. A mother
knows that more sugar in her children's diet results in higher
energy levels. The ease of waking up in the morning often
depends on how late you went to bed the night before.
Quantitative regression adds precision by developing a
mathematical formula that can be used for predictive purposes.
For example, a medical researcher might want to use body weight
(independent variable) to predict the most appropriate dose for a
new drug (dependent variable).
Regression means change in the measurements of a variable
character, on the positive or negative side, beyond the mean.
The regression coefficient is a measure of the change in the
dependent character (Y) for one unit of change in the independent
character (X). It is denoted by the letter 'b', which indicates the
change (Yc) in one variable (Y) from its mean (Ȳ) for one unit of
deviation or change (x) in another variable (X) from its
mean (X̄) when both are correlated. This helps to calculate or
predict the expected value of Y, i.e., Yc, corresponding to any X.
When the corresponding values Yc1, Yc2, ..., Ycn are plotted on a
graph, a straight line called the regression line or the mean
correlation line (Y on X) is obtained. This is the imaginary line
referred to while explaining the various types of correlation.
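As an illustration of the regression coefficient and the regression line of Y on X, the sketch below reuses the x and y values from the correlation example earlier in these notes. The text describes b without giving a formula, so the usual least-squares form b = Sum XY / Sum X² (with X and Y as deviations from their means) is an assumption here, as are the names `b` and `predict_y`.

```python
x = [5, 8, 12, 15]      # independent variable
y = [20, 25, 28, 30]    # dependent variable
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of products and squares of the deviations from the means
sum_XY = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sum_X2 = sum((xi - mean_x) ** 2 for xi in x)

# Regression coefficient: change in Y for one unit of change in X
b = sum_XY / sum_X2


def predict_y(new_x):
    """Expected value Yc on the regression line of Y on X."""
    return mean_y + b * (new_x - mean_x)


print(f"b = {b:.4f}")
for value in (5, 10, 15):
    print(value, round(predict_y(value), 2))
```

The line always passes through the point of the two means, so `predict_y(mean_x)` returns the mean of y, and each extra unit of x moves the predicted value by b.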
The regression technique is primarily used to
1. Estimate the relationship that exists, on the average, between
the dependent variable and the explanatory (independent)
variable.
2. Determine the effect of each of the explanatory variables on
the dependent variable, controlling for the effect of all the
other explanatory variables.
3. Predict the value of the dependent variable for a given value
of the explanatory variable.
Types
Three types of regression models are fundamental to
epidemiological research:
1. linear regression
2. logistic regression
3. Cox proportional hazards regression, a type of survival
analysis.
Linear regression: the dependent variable is a continuous
measure (such as body weight) whose frequency distribution
is the normal distribution, and the independent variables may
be both continuous and categorical.
Logistic regression: the dependent variable is derived from the
presence or absence of a characteristic.
Cox proportional hazards: the dependent variable represents the
time from a baseline of some type to the occurrence of an event of
interest.
[Reference: Bonita R, Beaglehole R, Kjellström T. 2006. Basic
epidemiology, 2nd edition, WHO.]
Difference between correlation and regression analysis
There are two important points of differences between correlation
and regression analysis.
1. Whereas the correlation coefficient is a measure of the degree of
relationship between x and y, the objective of regression
analysis is to study the nature of the relationship between the
variables.
2. The cause and effect relation is more clearly indicated through
regression analysis than by correlation. Correlation is merely a
tool for ascertaining the degree of relationship between two
variables and, therefore, we cannot say that one variable is the
cause and the other the effect.
Scatter diagram
The graphical representation of bivariate data is called a scatter
diagram. Plotting the values of the variables x and y along the
x-axis and y-axis respectively in the x-y plane gives the scatter
diagram.
From the scatter diagram it can be readily ascertained whether
any correlation exists between the variables. If correlation
exists, the type of correlation can also be ascertained.
Utilities of scatter diagram
1. It is a simple and non-mathematical method of studying
correlation between the variables. As such it can be easily
understood.
2. It is not influenced by the size of the extreme values whereas
most of the mathematical methods of finding correlation are
influenced by extreme values.
3. Making a scatter diagram usually is the first step in
investigating the relationship between the variables.
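A scatter diagram can be drawn even without plotting software. The following is a rough text-only sketch, with a hypothetical function name `ascii_scatter` and arbitrary default grid sizes; it places one `*` per (x, y) pair, with x increasing to the right and y increasing upwards, and uses the x and y values from the correlation example earlier in these notes.

```python
def ascii_scatter(xs, ys, width=30, height=10):
    """Return a text scatter diagram of the (x, y) pairs."""
    grid = [[" "] * width for _ in range(height)]
    x_min, y_min = min(xs), min(ys)
    span_x = (max(xs) - x_min) or 1   # avoid dividing by zero
    span_y = (max(ys) - y_min) or 1
    for x, y in zip(xs, ys):
        col = round((x - x_min) / span_x * (width - 1))
        row = round((y - y_min) / span_y * (height - 1))
        grid[height - 1 - row][col] = "*"   # row 0 is the top of the plot
    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)
    return "\n".join(lines)


plot = ascii_scatter([5, 8, 12, 15], [20, 25, 28, 30])
print(plot)
```

The points climb from the lower left to the upper right, which is the pattern of a positive correlation.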
More Related Content

Similar to ch 13 Correlation and regression.doc

Correlation and regression impt
Correlation and regression imptCorrelation and regression impt
Correlation and regression imptfreelancer
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionANCYBS
 
Correlation engineering mathematics
Correlation  engineering mathematicsCorrelation  engineering mathematics
Correlation engineering mathematicsErSaurabh2
 
Power point presentationCORRELATION.pptx
Power point presentationCORRELATION.pptxPower point presentationCORRELATION.pptx
Power point presentationCORRELATION.pptxSimran Kaur
 
Correlation analysis notes
Correlation analysis notesCorrelation analysis notes
Correlation analysis notesJapheth Muthama
 
Research Methodology Module-06
Research Methodology Module-06Research Methodology Module-06
Research Methodology Module-06Kishor Ade
 
Assessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxAssessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxfestockton
 
Assessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxAssessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxgalerussel59292
 
Correlation and regresion-Mathematics
Correlation and regresion-MathematicsCorrelation and regresion-Mathematics
Correlation and regresion-MathematicsTanishq Soni
 

Similar to ch 13 Correlation and regression.doc (20)

Correlation and regression impt
Correlation and regression imptCorrelation and regression impt
Correlation and regression impt
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Measure of Association
Measure of AssociationMeasure of Association
Measure of Association
 
Linear Correlation
Linear Correlation Linear Correlation
Linear Correlation
 
Correlation engineering mathematics
Correlation  engineering mathematicsCorrelation  engineering mathematics
Correlation engineering mathematics
 
Power point presentationCORRELATION.pptx
Power point presentationCORRELATION.pptxPower point presentationCORRELATION.pptx
Power point presentationCORRELATION.pptx
 
Correlation analysis notes
Correlation analysis notesCorrelation analysis notes
Correlation analysis notes
 
Simple correlation
Simple correlationSimple correlation
Simple correlation
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Research Methodology Module-06
Research Methodology Module-06Research Methodology Module-06
Research Methodology Module-06
 
RMBS - CORRELATION.pptx
RMBS - CORRELATION.pptxRMBS - CORRELATION.pptx
RMBS - CORRELATION.pptx
 
Assessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxAssessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docx
 
Assessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docxAssessment 2 ContextIn many data analyses, it is desirable.docx
Assessment 2 ContextIn many data analyses, it is desirable.docx
 
S1 pb
S1 pbS1 pb
S1 pb
 
correlation ;.pptx
correlation ;.pptxcorrelation ;.pptx
correlation ;.pptx
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptx
 
Correlation mp
Correlation mpCorrelation mp
Correlation mp
 
Correlation and regresion-Mathematics
Correlation and regresion-MathematicsCorrelation and regresion-Mathematics
Correlation and regresion-Mathematics
 
correlation.ppt
correlation.pptcorrelation.ppt
correlation.ppt
 

More from AbedurRahman5

Ch 17 Risk Ratio.doc
Ch 17 Risk Ratio.docCh 17 Risk Ratio.doc
Ch 17 Risk Ratio.docAbedurRahman5
 
Ch 15 Measures of morbidity..doc
Ch 15 Measures of morbidity..docCh 15 Measures of morbidity..doc
Ch 15 Measures of morbidity..docAbedurRahman5
 
Ch 14 diagn test.doc
Ch 14 diagn test.docCh 14 diagn test.doc
Ch 14 diagn test.docAbedurRahman5
 
Ch 12 SIGNIFICANT TESTrr.doc
Ch 12 SIGNIFICANT TESTrr.docCh 12 SIGNIFICANT TESTrr.doc
Ch 12 SIGNIFICANT TESTrr.docAbedurRahman5
 
ch 9 Confidence interval.doc
ch 9 Confidence interval.docch 9 Confidence interval.doc
ch 9 Confidence interval.docAbedurRahman5
 
Ch 8 NORMAL DIST..doc
Ch 8 NORMAL DIST..docCh 8 NORMAL DIST..doc
Ch 8 NORMAL DIST..docAbedurRahman5
 
ch 7 Tertile, Quartile, and Percentile.doc
ch 7 Tertile, Quartile, and Percentile.docch 7 Tertile, Quartile, and Percentile.doc
ch 7 Tertile, Quartile, and Percentile.docAbedurRahman5
 
Ch 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.docCh 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.docAbedurRahman5
 
Ch 1 Introduction..doc
Ch 1 Introduction..docCh 1 Introduction..doc
Ch 1 Introduction..docAbedurRahman5
 
Questionnaire Design
Questionnaire Design Questionnaire Design
Questionnaire Design AbedurRahman5
 
Biotechnology and Genetic Engineering
Biotechnology and Genetic EngineeringBiotechnology and Genetic Engineering
Biotechnology and Genetic EngineeringAbedurRahman5
 

More from AbedurRahman5 (19)

Ch 17 Risk Ratio.doc
Ch 17 Risk Ratio.docCh 17 Risk Ratio.doc
Ch 17 Risk Ratio.doc
 
Ch 15 Measures of morbidity..doc
Ch 15 Measures of morbidity..docCh 15 Measures of morbidity..doc
Ch 15 Measures of morbidity..doc
 
Ch 14 diagn test.doc
Ch 14 diagn test.docCh 14 diagn test.doc
Ch 14 diagn test.doc
 
Ch 12 SIGNIFICANT TESTrr.doc
Ch 12 SIGNIFICANT TESTrr.docCh 12 SIGNIFICANT TESTrr.doc
Ch 12 SIGNIFICANT TESTrr.doc
 
ch 9 Confidence interval.doc
ch 9 Confidence interval.docch 9 Confidence interval.doc
ch 9 Confidence interval.doc
 
Ch 8 NORMAL DIST..doc
Ch 8 NORMAL DIST..docCh 8 NORMAL DIST..doc
Ch 8 NORMAL DIST..doc
 
ch 7 Tertile, Quartile, and Percentile.doc
ch 7 Tertile, Quartile, and Percentile.docch 7 Tertile, Quartile, and Percentile.doc
ch 7 Tertile, Quartile, and Percentile.doc
 
Ch 6 DISPERSION.doc
Ch 6 DISPERSION.docCh 6 DISPERSION.doc
Ch 6 DISPERSION.doc
 
Ch 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.docCh 5 CENTRAL TENDENCY.doc
Ch 5 CENTRAL TENDENCY.doc
 
Ch 4 SAMPLE..doc
Ch 4 SAMPLE..docCh 4 SAMPLE..doc
Ch 4 SAMPLE..doc
 
Ch 3 DATA.doc
Ch 3 DATA.docCh 3 DATA.doc
Ch 3 DATA.doc
 
Ch 2 Variables.doc
Ch 2 Variables.docCh 2 Variables.doc
Ch 2 Variables.doc
 
Ch 1 Introduction..doc
Ch 1 Introduction..docCh 1 Introduction..doc
Ch 1 Introduction..doc
 
ch 18 roc.doc
ch 18  roc.docch 18  roc.doc
ch 18 roc.doc
 
ch 2 hypothesis
ch 2 hypothesisch 2 hypothesis
ch 2 hypothesis
 
Questionnaire Design
Questionnaire Design Questionnaire Design
Questionnaire Design
 
Research
 Research  Research
Research
 
Study design
Study design Study design
Study design
 
Biotechnology and Genetic Engineering
Biotechnology and Genetic EngineeringBiotechnology and Genetic Engineering
Biotechnology and Genetic Engineering
 

Recently uploaded

Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Data Warehouse , Data Cube Computation
Data Warehouse   , Data Cube ComputationData Warehouse   , Data Cube Computation
Data Warehouse , Data Cube Computationsit20ad004
 

Recently uploaded (20)

Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...

ch 13 Correlation and regression.doc

  • 1. 13 Correlation co-efficient (r test)

    CORRELATION

    [Q:
    - Define correlation. (BSMMU, MD Radiology, January 2010, July 2009)
    - Short note: Correlation & regression (BSMMU, MD Radiology, January, 2009)]

    In statistics, the word correlation refers to the relationship between two variables. If a change in one variable is accompanied by a change in the other variable, the variables are said to be correlated.

    Sometimes two continuous characters are measured in the same person, such as weight and cholesterol, or weight and height. At other times, the same character is measured in two related groups, such as tallness in parents and tallness in children, or intelligence quotient (IQ) in brothers and in their corresponding sisters (siblings). The relationship or association between two quantitatively measured or continuous variables is called correlation.

    Remember, correlation does not imply causation.

    The relationship between two random variables is known as a bivariate relationship. The known variable (or variables) is called the independent variable(s). The variable we are trying to predict is the dependent variable.

    Example: A medical researcher may be interested in the bivariate relationship between a patient's blood pressure x and heart rate y. Here x is the independent variable and y is the dependent variable.

    Type of correlation

    [Q:
  • 2. Biostatistics-126
    - Discuss different types of correlation with figures. (BSMMU, MD Radiology, January, 2010)
    - Classify correlation with figures of each. (BSMMU, MD Radiology, July, 2009)]

    1. Positive correlation:
    - If the variables move in the same direction, the correlation is called positive correlation.
    - In positive correlation, the two variables react in the same way, increasing or decreasing together.
    - Example:
      a. Height and weight of a group of people are positively correlated.
      b. Temperatures in Celsius and Fahrenheit have a positive correlation.
    - In perfect positive correlation, the coefficient of correlation (r) = +1, and in moderately positive correlation 0 < r < 1.

    2. Negative correlation:
    - If the variables move in opposite directions, the correlation is called negative correlation.
    - In negative correlation, as one variable increases, the other decreases.
    - Example: One variable might be the number of hunters in a region and the other the deer population. Perhaps as the number of hunters increases, the deer population decreases. This is an example of a negative correlation.
    - In perfect negative correlation, the coefficient of correlation (r) = -1, and in moderately negative correlation -1 < r < 0.

    3. Zero correlation:
    - If the movements of one variable do not affect the movements of the other variable, the variables are not correlated, and this is defined as zero correlation.
    - In zero correlation, the coefficient of correlation (r) = 0.
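The three types above can be checked numerically. The following is a minimal Python sketch, assuming the usual product-moment definition of r computed from deviations about the means. The function name `pearson_r` and the `falling` and `symmetric` value lists are illustrative assumptions; only the Celsius and Fahrenheit pairing comes from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

celsius = [0, 10, 20, 30, 40]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # exact linear conversion
falling = [40, 30, 20, 10, 0]                   # assumed values that fall as celsius rises
symmetric = [4, 1, 0, 1, 4]                     # assumed values with no linear trend

r_positive = pearson_r(celsius, fahrenheit)  # perfect positive correlation
r_negative = pearson_r(celsius, falling)     # perfect negative correlation
r_zero = pearson_r(celsius, symmetric)       # zero (linear) correlation

print(round(r_positive, 3), round(r_negative, 3), round(r_zero, 3))
```

The `symmetric` series rises and falls evenly around the middle value, so its deviations cancel and r comes out as zero even though the values clearly follow a pattern. Here r is only measuring straight-line association.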
  • 3. Correlation in brief
    - When the value of one variable is related to the value of another, they are said to be correlated.
    - The coefficient of correlation (r) measures such a relationship.
    - The value of r ranges from -1 (perfectly correlated in the negative direction) to +1 (perfectly correlated in the positive direction).
  • 4. - When r = 0, the 2 variables are not correlated.

    How can you tell if there is a correlation?
    By observing the graph, a person can tell if there is a correlation by how closely the data resemble a line. If the points are scattered about, then there may be no correlation. If the points would closely fit a quadratic or exponential equation, etc., then they have a nonlinear correlation.

    How can you tell by inspection the type of correlation?
    If the graph of the variables resembles a line with positive slope, then there is a positive correlation (y increases as x increases). If the slope of the line is negative, then there is a negative correlation (as x increases, y decreases).

    Correlation coefficient

    [Q: Write short note on: Correlation coefficient. (BSMMU, MD Radiology, January, 2010)]

    An important aspect of correlation is how strong it is. The extent or degree of relationship between two sets of figures is measured in terms of a parameter called the correlation coefficient. It is denoted by the letter 'r'. Another name for r is the Pearson product moment correlation coefficient, in honor of Karl Pearson, who developed it about 1900.

    When two variable characters in the same series or individuals are measurable in quantitative units, such as height and weight; temperature and pulse rate; age and vital capacity; circulating proteins in grams and surface area in square meters; or systolic and diastolic blood pressure in mm of Hg, it is often necessary and possible to know not only whether there is any association or relationship between them, but also the degree or extent of such relationship.
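As a hedged illustration of how r and its significance might be computed, here is a short Python sketch. It assumes the standard product-moment formula and a t statistic with n - 2 degrees of freedom, which is the usual convention for testing r against zero. The function names `pearson_r` and `t_statistic` are hypothetical, and the data are the dose and response values from the practice problem in this chapter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

def t_statistic(r, n):
    """t value for testing r against zero, assuming n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Dose and response values from the frog-muscle practice problem.
dose = [0.3, 0.4, 0.6, 0.8, 0.9]
response = [54.0, 59.0, 60.0, 65.0, 70.0]

r = pearson_r(dose, response)
t = t_statistic(r, len(dose))
print(round(r, 4), round(t, 2))
```

For these data the sketch gives r of about 0.963, which agrees with the answer quoted with the practice problem. The t value would then be compared with a tabulated critical value at 3 degrees of freedom.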
  • 5. Correlation co-efficient (r) test

    Measures the relationship between two variables when one is dependent on the other.

    Formula

    $$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$

    or

    $$ r = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}} $$

    where $X = x - \bar{x}$ and $Y = y - \bar{y}$; x = one variable, y = the other variable.

    d.f. = n - 2, where n is the number of pairs of observations.

    Example
    Problem: Find out the correlation co-efficient between the following variables.
    x variable: 5, 8, 12, 15.
    y variable: 20, 25, 28, 30.

    Solution: The following table shows the relationship between the above variables.

    x     y     X = x - x̄    Y = y - ȳ    X²     Y²         XY
    5     20    -5           -5.75        25     33.0625    28.75
    8     25    -2           -0.75         4      0.5625     1.50
    12    28     2            2.25         4      5.0625     4.50
    15    30     5            4.25        25     18.0625    21.25

    x̄ = 10, ȳ = 25.75; Sum X² = 58, Sum Y² = 56.75, Sum XY = 56

  • 6. $$ r = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}} = \frac{56}{\sqrt{58 \times 56.75}} = \frac{56}{\sqrt{3291.5}} = \frac{56}{57.37} = 0.976 $$

    d.f. = n - 2 = 4 - 2 = 2

    r = 0.976 means strong correlation. At 2 d.f. the critical value of r at the 5% level is 0.950, so p < 0.05 and the null hypothesis of no correlation is rejected.

    Strength of Correlation:

    Correlation coefficient    Degree of association
    0.8 to 1                   Strong
    0.5 to 0.79                Moderate
    0.2 to 0.49                Weak
    0 to 0.19                  Negligible

    A second grading is also given:
    1. Strong positive correlation: when r = 0.80 to 0.99, i.e. > 0.8
    2. Moderate positive correlation: when r = 0.70 to 0.79
    3. Limited degree of correlation: when r = 0.50 to 0.69
    4. No correlation or zero correlation: when r < 0.5
    5. Negative correlation: when r is negative

    N.B.: The extent of correlation varies between minus one and plus one, i.e. -1 ≤ r ≤ +1.

    Problem for practice
    During a laboratory experiment, muscular contractions of a frog muscle were measured against different doses of a given drug. The height of the curve was considered as the response to the drug. The observations were as below.
  • 7.
    Serial number of experiment    1       2       3       4       5
    Dose of drug                   0.3     0.4     0.6     0.8     0.9
    Response to drug               54.0    59.0    60.0    65.0    70.0

    From the above data calculate the correlation coefficient and its significance.
    [Answer: r = 0.9633, p < 0.01]

    [Q:
    - Calculate Pearson's correlation coefficient between the X and Y variables given below: X = 5, 7, 10, 12; Y = 4, 6, 9, 11. (BSMMU, MD Radiology, January, 2010)
    - Find out the correlation coefficient between the following 2 variables. (BSMMU, MD Radiology, January, 2009)
      Variable-I (X variable): 10, 15, 20, 25 (n = 4)
      Variable-II (Y variable): 30, 35, 40, 45 (n = 4)
    - The length and weight of 7 mice are given below. Compute 'r' and test for its significance. Length = 2, 5, 8, 12, 14, 19, 22. Weight = 1, 4, 3, 4, 8, 9, 8. (BSMMU, MD Radiology, January, 2010)]

    What Is Rank Correlation?
  • 8. Consider a situation where the data do not contain precise sample values, so that a measure of precision is unattainable. In this situation, the data may be ranked in order of size, importance, etc., using the numbers 1, 2, ..., n (as in the GPA system, where different ranges of marks are ranked as different grades). These are called rank-order statistics, and the correlation computed from them is called rank correlation.

    Rank correlation is used when the data are not presented as precise sample values.

    What is the coefficient of rank correlation?
    Suppose the data values x and y are arranged in order of size. The correlation coefficient can then be computed from the ranks instead of the original values. This coefficient of rank correlation is denoted by r_rank, or briefly r, and is calculated by the equation

    $$ r_{rank} = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

    where
    d = difference between the ranks of corresponding x and y
    n = number of pairs of values (x, y) in the data

    The above equation is called SPEARMAN'S FORMULA FOR RANK CORRELATION.

    Example: A group of 5 Army officers participated in both a SWIMMING and a RUNNING competition. The following table gives their ranks in the two tests, together with the differences between the ranks (shown without sign) and the squares of those differences.

    OFFICER    RUNNING (x)    SWIMMING (y)    d    d²
    Selim      5              3               2    4
  • 9. Habib      2              1               1    1
    Ismail     4              5               1    1
    Tauhid     1              2               1    1
    Mesbah     3              4               1    1

    From the above table we have Sum d² = 4 + 1 + 1 + 1 + 1 = 8 and n = 5, so

    $$ r_{rank} = 1 - \frac{6 \times 8}{5(5^2 - 1)} = 1 - \frac{48}{120} = 0.6 $$

    - Spearman's rank correlation is a technique used to test the direction and strength of the relationship between two variables. In other words, it is a device to show whether one set of numbers is associated with another set of numbers.
    - It uses the statistic r_rank (Rs), which falls between -1 and +1.
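The officers example can be worked through in a few lines of Python. This is a sketch only, assuming the standard Spearman formula r_rank = 1 - 6 Σd² / (n(n² - 1)), which holds when there are no tied ranks. The function name `spearman_from_ranks` is illustrative; the ranks are taken from the table above.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation from two lists of ranks (no ties assumed)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks of the five officers in the two competitions.
running = {"Selim": 5, "Habib": 2, "Ismail": 4, "Tauhid": 1, "Mesbah": 3}
swimming = {"Selim": 3, "Habib": 1, "Ismail": 5, "Tauhid": 2, "Mesbah": 4}

officers = list(running)
r_rank = spearman_from_ranks(
    [running[name] for name in officers],
    [swimming[name] for name in officers],
)
print(round(r_rank, 2))
```

With the squared differences summing to 8 for n = 5, the result is 1 - 48/120 = 0.6, a moderate positive agreement between the two rankings.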
  • 10. The null hypothesis is that the two variables are not associated, i.e. r_rank (Rs) = 0. If the calculated Rs does not differ significantly from 0, the null hypothesis is accepted. Otherwise, it is rejected.

    The rank correlation method can be used when
    1. The values of the variables are available in rank order form.
    2. The data are qualitative in nature and can be ranked in some order.
    3. The data were originally quantitative in nature but, because of the small sample size, were converted into ranks.

    [Q: What are the types of correlation coefficient? Discuss with figure. (BSMMU, MD Radiology, January, 2009)]

    Pearson's correlation coefficient or Spearman's rank correlation coefficient
    When the associated variables are normally distributed, such as height and weight, Pearson's correlation coefficient is used. When two variables are correlated but not normally distributed, Spearman's formula for the rank correlation coefficient is used.
    When calculating a correlation coefficient for ordinal data, select Spearman's technique. For interval or ratio-type data, use Pearson's technique.

    REGRESSION

    In experimental sciences, after having understood the correlation between two variables, there are situations when it is necessary to estimate or predict the value of one character (variable, say Y) from the knowledge of the other character (variable, say X), such as to estimate height when weight is known. This is possible when the two are linearly correlated. The former variable (Y, i.e., height), which is to be estimated, is called the dependent variable, and the latter (X, i.e., weight), which is known, is called the independent variable. This is done by finding another constant called the regression coefficient (b).

    People use regression on an intuitive level every day. In business, a well-dressed man is thought to be financially successful. A mother knows that more sugar in her children's diet results in higher energy levels. The ease of waking up in the morning often
  • 11. depends on how late you went to bed the night before. Quantitative regression adds precision by developing a mathematical formula that can be used for predictive purposes.

    For example, a medical researcher might want to use body weight (independent variable) to predict the most appropriate dose for a new drug (dependent variable).

    Regression means change in the measurements of a variable character, on the positive or negative side, beyond the mean. The regression coefficient is a measure of the change in the dependent character (Y) with one unit change in the independent character (X). It is denoted by the letter 'b', which indicates the relative change (Yc) in one variable (Y) from its mean (Ȳ) for one unit of move, deviation or change (x) in another variable (X) from its mean (X̄) when both are correlated. This helps to calculate or predict any expected value of Y, i.e., Yc, corresponding to X. When the corresponding values Yc1, Yc2, ..., Ycn are plotted on a graph, a straight line called the regression line or the mean correlation line (Y on X) is obtained. The same was referred to as an imaginary line while explaining the various types of correlation.

    The regression technique is primarily used to
    1. Estimate the relationship that exists, on the average, between the dependent variable and the explanatory (independent) variable.
    2. Determine the effect of each of the explanatory variables on the dependent variable, controlling for the effects of all the other explanatory variables.
    3. Predict the value of the dependent variable for a given value of the explanatory variable.

    Types
    Three types of regression models are fundamental to epidemiological research:
    1. linear regression
    2. logistic regression
  • 12. 3. Cox proportional hazards regression, a type of survival analysis.

    Linear regression: the dependent variable is a continuous measure (such as body weight) whose frequency distribution is the normal distribution, and the independent variables may be both continuous and categorical.
    Logistic regression: the dependent variable is derived from the presence or absence of a characteristic.
    Cox proportional hazards: the dependent variable represents the time from a baseline of some type to the occurrence of an event of interest.
    [Reference: Bonita R, Beaglehole R, Kjellström T 2006. Basic epidemiology, 2nd edition, WHO.]

    Difference between correlation and regression analysis
    There are two important points of difference between correlation and regression analysis.
    1. Whereas the correlation coefficient is a measure of the degree of relationship between x and y, the objective of regression analysis is to study the nature of the relationship between the variables.
    2. The cause and effect relation is more clearly indicated through regression analysis than by correlation. Correlation is merely a tool for ascertaining the degree of relationship between two variables and, therefore, we cannot say that one variable is the cause and the other the effect.

    Scatter diagram
    The graphical representation of bivariate data is called a scatter diagram. It is obtained by plotting the values of the variables x and y along the x-axis and y-axis respectively in the x-y plane.
  • 13. From the scatter diagram it can be readily ascertained whether there is any correlation between the variables or not. If there is correlation, the type of correlation can also be ascertained.

    Utilities of scatter diagram
    1. It is a simple and non-mathematical method of studying correlation between the variables. As such, it can be easily understood.
    2. It is not influenced by the size of the extreme values, whereas most of the mathematical methods of finding correlation are influenced by extreme values.
    3. Making a scatter diagram is usually the first step in investigating the relationship between the variables.
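The regression coefficient b and the regression line of Y on X described in the REGRESSION section can be sketched in Python as follows. The chapter does not state a formula for b, so this example assumes the ordinary least-squares form b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², with intercept a = ȳ - b·x̄. The function name `fit_line` and the prediction point 0.5 are illustrative assumptions; the data are the dose and response values from the practice problem.

```python
def fit_line(xs, ys):
    """Least-squares line of Y on X: returns (a, b) for Yc = a + b * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx        # regression coefficient: change in Y per unit change in X
    a = mean_y - b * mean_x    # intercept, so the line passes through (mean_x, mean_y)
    return a, b

dose = [0.3, 0.4, 0.6, 0.8, 0.9]             # independent variable X
response = [54.0, 59.0, 60.0, 65.0, 70.0]    # dependent variable Y

a, b = fit_line(dose, response)
predicted = a + b * 0.5   # expected response Yc at an assumed dose of 0.5
print(round(a, 2), round(b, 2), round(predicted, 2))
```

Plotting the predicted values Yc against the doses would give the regression line, and plotting the raw pairs alongside it would give the scatter diagram that the line summarises.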