Unit-5
Correlation :-
Suppose we have a set of 30 students in a class and we want to measure the heights and weights of all
the students. We observe that each individual (unit) of the set assumes two values, one relating to the
height and the other to the weight. Such a distribution, in which each individual or unit of the set is made
up of two values, is called a bivariate distribution. Some examples of bivariate distributions are:
(i) In a class of 60 students, the series of marks obtained in two subjects by all of them.
(ii) The series of sales revenue and advertising expenditure of two companies in a particular
year.
(iii) The series of ages of husbands and wives in a sample of selected married couples.
Thus in a bivariate distribution, we are given a set of pairs of observations, wherein each pair represents
the values of two variables.
In a bivariate distribution, we are interested in finding a relationship (if it exists) between the two
variables under study. The concept of ‘correlation’ is a statistical tool which studies the relationship
between two variables, and Correlation Analysis involves various methods and techniques used for
studying and measuring the extent of the relationship between the two variables.
Definition:- Two variables are said to be in correlation if a change in one of the variables results in a
change in the other variable.
Types of Correlation:-
The various types of correlation are positive, negative, no correlation, perfect, strong and weak correlation.
Positive Correlation
Positive correlation occurs when an increase in one variable increases the value of the other.
The line corresponding to the scatter plot is an increasing line.
Negative Correlation
Negative correlation occurs when an increase in one variable decreases the value of the other.
The line corresponding to the scatter plot is a decreasing line.
No Correlation
No correlation occurs when there is no linear dependency between the variables.
Perfect Correlation
Perfect correlation occurs when there is a functional dependency between the variables.
In this case all the points lie on a straight line.
Strong Correlation
A correlation is stronger the closer the points of the scatter plot lie to the line.
Weak Correlation
A correlation is weaker the farther the points of the scatter plot lie from the line.
Some examples of series with positive correlation are:
(i) Heights and weights;
(ii) Household income and expenditure;
(iii) Price and supply of commodities;
(iv) Amount of rainfall and yield of crops.
Correlation between two variables is said to be negative or inverse if the variables deviate in opposite
directions. That is, if an increase (or decrease) in the values of one variable results, on average, in a
corresponding decrease (or increase) in the values of the other variable.
Some examples of series with negative correlation are:
(i) Volume and pressure of a perfect gas;
(ii) Current and resistance [keeping the voltage constant] ($R = V/I$);
(iii) Price and demand of goods.
Note:
(i) If the points are very close to each other, a fairly good amount of correlation can be
expected between the two variables. On the other hand, if they are widely scattered, a poor
correlation can be expected between them.
(ii) If the points are scattered and they reveal no upward or downward trend, as in the case of
(d), then we say the variables are uncorrelated.
(iv) If there is an upward trend rising from the lower left-hand corner and going upward to the
upper right-hand corner, the correlation obtained from the graph is said to be positive. Also,
if there is a downward trend from the upper left-hand corner, the correlation obtained is said
to be negative.
(v) The graphs shown above are generally termed scatter diagrams.
The Coefficient of Correlation (Karl Pearson’s method)
Karl Pearson’s method is popularly known as Pearson’s coefficient of correlation.
One of the most widely used statistics is the coefficient of correlation ‘r’, which measures the degree of
association between the two values of related variables given in the data set. The coefficient of
correlation ‘r’ is given by the formula
\[
r = \frac{\sum XY}{n\,\sigma_x \sigma_y} = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}
\qquad \left[\because\ \sigma_x^2 = \frac{\sum X^2}{n};\ \ \sigma_y^2 = \frac{\sum Y^2}{n}\right]
\]
Here $X = (x - \bar{x})$; $Y = (y - \bar{y})$
$\sigma_x$ = Standard deviation of series $x$
$\sigma_y$ = Standard deviation of series $y$
$n$ = Number of pairs of observations
$r$ = The (product moment) correlation coefficient
This method is to be applied only where deviations of items are taken from the actual mean and not from
an assumed mean.
The value of the coefficient of correlation ‘r’ obtained from the above formula always lies between -1 and +1.
When r = +1 there is a perfect positive correlation between the variables. When r = -1
there is a perfect negative correlation between the variables. However, if r = 0 there is no linear relationship
between the variables.
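The deviation formula above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `pearson_r` and the two small data series are assumptions made for the example, chosen so that the points lie exactly on an increasing and on a decreasing line.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the actual means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    X = [x - x_bar for x in xs]          # X = x - mean of x
    Y = [y - y_bar for y in ys]          # Y = y - mean of y
    sum_XY = sum(a * b for a, b in zip(X, Y))
    sum_X2 = sum(a * a for a in X)
    sum_Y2 = sum(b * b for b in Y)
    if sum_X2 == 0 or sum_Y2 == 0:
        raise ValueError("r is undefined when one series is constant")
    return sum_XY / sqrt(sum_X2 * sum_Y2)

# Hypothetical data: points on an increasing line, then on a decreasing line.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The two calls illustrate the limiting cases described above: perfectly aligned points give a value at the positive or negative end of the range.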
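Direct method:-
Substituting the values of $\sigma_x$ and $\sigma_y$ in the above formula, we get
\[
r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}},
\]
or, in terms of the original values of $x$ and $y$,
\[
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]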
Example:- Making use of the data summarized below, calculate the coefficient of correlation.
Case A B C D E F G H
x 10 9 6 10 12 13 11 9
y 9 4 6 9 11 13 8 4
Solution:-
Case    x    X = x - 10    X²    y    Y = y - 8    Y²    XY
A      10        0          0    9       +1         1     0
B       9       -1          1    4       -4        16     4
C       6       -4         16    6       -2         4     8
D      10        0          0    9       +1         1     0
E      12       +2          4   11       +3         9     6
F      13       +3          9   13       +5        25    15
G      11       +1          1    8        0         0     0
H       9       -1          1    4       -4        16     4
n = 8   ∑x = 80   ∑X = 0   ∑X² = 32   ∑y = 64   ∑Y = 0   ∑Y² = 72   ∑XY = 37
\[
\bar{x} = \frac{\sum x}{n} = \frac{80}{8} = 10, \qquad \bar{y} = \frac{\sum y}{n} = \frac{64}{8} = 8
\]
\[
r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{37}{\sqrt{32 \times 72}} = \frac{37}{\sqrt{2304}} = \frac{37}{48} \approx +0.771
\]
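As a cross-check on the hand calculation, the short script below recomputes r for the eight cases A to H in two ways: from deviations about the means, and from the raw-sum (direct) form $r = (n\sum xy - \sum x\sum y)/\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}$. It is only a sketch; the variable names are illustrative. The two forms are algebraically identical, so they should agree up to floating-point rounding.

```python
from math import sqrt

# Data from the example table (cases A to H).
x = [10, 9, 6, 10, 12, 13, 11, 9]
y = [9, 4, 6, 9, 11, 13, 8, 4]
n = len(x)

# Deviation form: X = x - mean(x), Y = y - mean(y).
x_bar, y_bar = sum(x) / n, sum(y) / n
sum_XY = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sum_X2 = sum((a - x_bar) ** 2 for a in x)
sum_Y2 = sum((b - y_bar) ** 2 for b in y)
r_dev = sum_XY / sqrt(sum_X2 * sum_Y2)

# Raw-sum (direct) form: no deviations needed.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
r_raw = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print("sum XY =", sum_XY, " sum X^2 =", sum_X2, " sum Y^2 =", sum_Y2)
print("r (deviation form) =", round(r_dev, 3))
print("r (raw-sum form)   =", round(r_raw, 3))
```

The raw-sum form is convenient when the means are not whole numbers, because it avoids fractional deviations.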
Regression
If two variables are significantly correlated, and if there is some theoretical basis for doing so, it is
possible to predict (estimate) values of one variable from the other. This observation leads to a very
important concept known as ‘Regression Analysis’.
For example, if we know that advertising and sales are correlated, we may find out the expected amount of
sales for a given advertising expenditure, or the advertising expenditure required for attaining a given
amount of sales. Similarly, if we know that the yield of rice and rainfall are closely related, we may find
out the amount of rain required to achieve a certain production figure.
In general, regression analysis means the estimation or prediction of the unknown value of one variable
from the known value of the other variable. It is one of the most important statistical tools and is
extensively used in almost all sciences: natural, social and physical. It is especially used in business and
economics to study the relationship between two or more variables that are related causally, and for the
estimation of demand and supply graphs, cost functions, production and consumption functions and so
on.
Prediction or estimation is one of the major problems in almost all spheres of human activity. The
estimation or prediction of future production, consumption, prices, investments, sales, profits, income
etc. is of very great importance to business professionals. Similarly, population estimates and
population projections, GNP, revenue and expenditure etc. are indispensable for economists and for
efficient planning of an economy.
The dictionary meaning of ‘regression’ is returning or going back. The term ‘regression’ was first used by
Sir Francis Galton (1822-1911) in 1877 while studying the relationship between the heights of fathers and
sons. He introduced the term in the paper “Regression towards Mediocrity in Hereditary
Stature”. Regression analysis was explained by M.M. Blair as follows:
“Regression analysis is a mathematical measure of the average relationship between two or more
variables in terms of the original units of the data”.
Line of Regression
If the dots of the scatter diagram generally tend to cluster along a well-defined direction which
suggests a linear relationship between the variables x and y, the line of best fit for the given distribution of
dots is called the ‘line of regression’.
There are two such lines, one giving the best possible mean values of y for each specified value of x and
the other giving the best possible mean values of x for given values of y. The former is called the line of
regression of y on x and the latter is called the line of regression of x on y.
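First consider the line of regression of y on x.
Let the straight line satisfying the general trend of the n dots in a scatter diagram be
\[ y = a + bx \qquad \cdots (i) \]
We have to determine the constants a and b so that equation (i) gives, for each value of x, the best
estimate of the average value of y. By the method of least squares, the normal equations for a and b are
\[ \sum y = na + b\sum x \qquad \cdots (ii) \]
\[ \sum xy = a\sum x + b\sum x^2 \qquad \cdots (iii) \]
Equation (ii) gives
\[ \frac{1}{n}\sum y = a + b\cdot\frac{1}{n}\sum x, \qquad \text{i.e.} \quad \bar{y} = a + b\bar{x} \]
This shows that $(\bar{x}, \bar{y})$, i.e. the point formed by the means of x and y, lies on (i).
Shifting the origin to $(\bar{x}, \bar{y})$, equation (iii) takes the form
\[ \sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2 \]
But $\sum (x - \bar{x}) = 0$
\[
\therefore\ b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum XY}{\sum X^2} = \frac{\sum XY}{n\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x}
\qquad \left(\because\ r = \frac{\sum XY}{n\,\sigma_x \sigma_y}\right)
\]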
Thus the line of best fit becomes
\[ (y - \bar{y}) = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}) \]
which is the equation of the line of regression of y on x. Its slope is called the regression coefficient of y on x.
Interchanging x and y, the line of regression of x on y is
\[ (x - \bar{x}) = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}) \]
Thus the regression coefficient of y on x = $r\,\sigma_y/\sigma_x$
and the regression coefficient of x on y = $r\,\sigma_x/\sigma_y$.
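The two regression lines can be computed directly from the deviation sums, since the derivation gives $r\,\sigma_y/\sigma_x = \sum XY/\sum X^2$ and, by symmetry, $r\,\sigma_x/\sigma_y = \sum XY/\sum Y^2$. The sketch below is illustrative only: the function name `regression_lines` and the five data pairs are assumptions made for the example.

```python
def regression_lines(xs, ys):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) for the lines
    y = a_yx + b_yx * x   (regression of y on x) and
    x = a_xy + b_xy * y   (regression of x on y)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_XY = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_X2 = sum((x - x_bar) ** 2 for x in xs)
    sum_Y2 = sum((y - y_bar) ** 2 for y in ys)
    if sum_X2 == 0 or sum_Y2 == 0:
        raise ValueError("a constant series has no regression line")
    b_yx = sum_XY / sum_X2            # regression coefficient of y on x
    b_xy = sum_XY / sum_Y2            # regression coefficient of x on y
    a_yx = y_bar - b_yx * x_bar       # both lines pass through (x_bar, y_bar)
    a_xy = x_bar - b_xy * y_bar
    return (a_yx, b_yx), (a_xy, b_xy)

# Hypothetical data, not taken from the text.
(y_intercept, y_slope), (x_intercept, x_slope) = regression_lines(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
)
print("y on x: y =", round(y_intercept, 2), "+", round(y_slope, 2), "* x")
print("x on y: x =", round(x_intercept, 2), "+", round(x_slope, 2), "* y")
```

The intercepts are obtained from the fact shown above that both lines pass through the point of means.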
Corollary:-
The correlation coefficient is the geometric mean of the two regression coefficients, since
\[ r\,\frac{\sigma_y}{\sigma_x} \times r\,\frac{\sigma_x}{\sigma_y} = r^2. \]
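Written out, the corollary gives a way to recover r from the two regression coefficients. The symbols $b_{yx}$ and $b_{xy}$ are shorthand introduced here for the regression coefficients of y on x and of x on y; they are not notation used elsewhere in this unit.

```latex
\[
b_{yx} = r\,\frac{\sigma_y}{\sigma_x}, \qquad
b_{xy} = r\,\frac{\sigma_x}{\sigma_y}
\quad\Longrightarrow\quad
b_{yx}\, b_{xy} = r^2, \qquad
r = \pm\sqrt{b_{yx}\, b_{xy}}
\]
```

Since $\sigma_x$ and $\sigma_y$ are positive, both coefficients carry the sign of r, so the sign of the square root is taken to match their common sign. Because $r^2 \le 1$, the product of the two regression coefficients cannot exceed 1.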
Example:-
From the following data obtain the two regression equations, first from the normal equations and then by
taking deviations of items from the means of the x and y series.
x 6 2 10 4 8
y 9 11 5 8 7
Solution:-
OBTAINING REGRESSION EQUATIONS
x        y        xy       x²       y²
6        9        54       36       81
2        11       22       4        121
10       5        50       100      25
4        8        32       16       64
8        7        56       64       49
∑x = 30  ∑y = 40  ∑xy = 214  ∑x² = 220  ∑y² = 340
Regression equation of y on x: y = a + bx
The normal equations are
∑y = na + b∑x
∑xy = a∑x + b∑x²
Substituting the values,
40 = 5a + 30b ⋯(i)
214 = 30a + 220b ⋯(ii)
Multiplying equation (i) by 6: 240 = 30a + 180b ⋯(iii)
214 = 30a + 220b ⋯(iv)
Subtracting equation (iv) from (iii): -40b = 26, or b = -0.65
Substituting the value of b in equation (i):
40 = 5a + 30(-0.65), or 5a = 40 + 19.5 = 59.5, or a = 11.9
Putting the values of a and b in the equation, the regression line of y on x is y = 11.9 - 0.65x.
Regression equation of x on y: x = a + by
The normal equations are
∑x = na + b∑y
∑xy = a∑y + b∑y²
Substituting the values,
30 = 5a + 40b ⋯(i)
214 = 40a + 340b ⋯(ii)
Multiplying equation (i) by 8: 240 = 40a + 320b ⋯(iii)
214 = 40a + 340b ⋯(iv)
Subtracting equation (iv) from (iii): -20b = 26, or b = -1.3
Substituting the value of b in equation (i):
30 = 5a + 40(-1.3), or 5a = 30 + 52 = 82, or a = 16.4
Putting the values of a and b in the equation, the regression line of x on y is x = 16.4 - 1.3y.
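The elimination carried out by hand above can be sketched as a small solver for the pair of normal equations. The helper name `solve_normal_equations` and its argument names are assumptions made for this illustration; it solves the two equations in closed form instead of repeating the multiply-and-subtract steps.

```python
def solve_normal_equations(n, sum_u, sum_v, sum_uv, sum_u2):
    """Solve  sum_v  = n*a + b*sum_u
              sum_uv = a*sum_u + b*sum_u2
    for (a, b), where v is regressed on u."""
    det = n * sum_u2 - sum_u ** 2
    if det == 0:
        raise ValueError("normal equations are singular (u is constant)")
    b = (n * sum_uv - sum_u * sum_v) / det
    a = (sum_v - b * sum_u) / n
    return a, b

# Data from the example.
x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]
n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)
sum_y2 = sum(q * q for q in y)

# Regression of y on x: y = a + b*x
a_yx, b_yx = solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2)
# Regression of x on y: x = a + b*y (roles of x and y interchanged)
a_xy, b_xy = solve_normal_equations(n, sum_y, sum_x, sum_xy, sum_y2)

print(round(a_yx, 2), round(b_yx, 2))   # 11.9 -0.65
print(round(a_xy, 2), round(b_xy, 2))   # 16.4 -1.3
```

Passing the sums with the roles of x and y interchanged is what turns the same solver into the regression of x on y.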
CALCULATION OF REGRESSION EQUATIONS
x       X = x - x̄    X²     y       Y = y - ȳ    Y²     XY
6          0          0     9         +1          1      0
2         -4         16     11        +3          9    -12
10        +4         16     5         -3          9    -12
4         -2          4     8          0          0      0
8         +2          4     7         -1          1     -2
∑x = 30   ∑X = 0   ∑X² = 40   ∑y = 40   ∑Y = 0   ∑Y² = 20   ∑XY = -26
\[ \bar{x} = \frac{30}{5} = 6; \qquad \bar{y} = \frac{40}{5} = 8 \]
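The line of regression of x on y is
\[ (x - \bar{x}) = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}) \]
\[ r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = \frac{-26}{20} = -1.3 \]
\[ x - 6 = -1.3\,(y - 8) = -1.3y + 10.4 \]
\[ x = -1.3y + 10.4 + 6 = 16.4 - 1.3y \]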
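The line of regression of y on x is
\[ (y - \bar{y}) = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}) \]
\[ r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = \frac{-26}{40} = -0.65 \]
\[ y - 8 = -0.65\,(x - 6) = -0.65x + 3.9 \]
\[ y = -0.65x + 3.9 + 8 = 11.9 - 0.65x \]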
Thus we obtain the same answers as before. However, the calculations are much
simpler without the use of the normal equations.
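The deviation shortcut also lends itself to a short script. The sketch below recomputes the two regression coefficients for the example data, uses the fitted lines to estimate one variable from the other, and recovers r as the geometric mean of the two coefficients with their common sign. The helper names `predict_y` and `predict_x` and the query values are assumptions made for illustration.

```python
from math import sqrt, copysign

# Data from the example.
x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sums of products and squares of deviations from the means.
sum_XY = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sum_X2 = sum((a - x_bar) ** 2 for a in x)
sum_Y2 = sum((b - y_bar) ** 2 for b in y)

b_yx = sum_XY / sum_X2   # regression coefficient of y on x
b_xy = sum_XY / sum_Y2   # regression coefficient of x on y

def predict_y(x_value):
    """Estimate y from x using the line of regression of y on x."""
    return y_bar + b_yx * (x_value - x_bar)

def predict_x(y_value):
    """Estimate x from y using the line of regression of x on y."""
    return x_bar + b_xy * (y_value - y_bar)

# r is the geometric mean of the coefficients, signed like them.
r = copysign(sqrt(b_yx * b_xy), b_yx)

print("b_yx =", round(b_yx, 2), " b_xy =", round(b_xy, 2))
print("estimated y at x = 5:", round(predict_y(5), 2))
print("estimated x at y = 10:", round(predict_x(10), 2))
print("r =", round(r, 4))
```

Note that the two lines are not inverses of each other: estimating y from x and then x from that y does not in general return the starting value unless the correlation is perfect.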
Experiment:-
An experiment is a treatment applied to a group of objects or subjects in the interest of observing the response.
Treatment:-
In experiments, a treatment is something that researchers administer to experimental units.
For example, a corn field is divided into four, and each part is 'treated' with a different fertilizer to see which
produces the most corn; a teacher practices different teaching methods on different groups in her class
to see which yields the best results; a doctor treats a patient who has a skin condition with different creams
to see which is most effective. Treatments are administered to experimental units by 'level', where level
implies amount or magnitude. For example, if the experimental units were given 5 mg, 10 mg and 15 mg of a
medication, those amounts would be three levels of the treatment.
(Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
Factor:-
A factor of an experiment is a controlled independent variable; a variable whose levels are set by the
experimenter.
A factor is a general type or category of treatments. Different treatments constitute different levels of a
factor.
For example, three different groups of runners are subjected to different training methods. The runners
are the experimental units and the training methods are the treatments, where the three types of training
methods constitute three levels of the factor 'type of training'.
(Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
Experimental Design
Experimental design is concerned with the analysis of data generated from an experiment. It takes time to
organize the experiment properly to ensure that the right type of data, and enough of it, is available to
answer the questions of interest as clearly and efficiently as possible. This process is called experimental design.
There are six concepts of experimental design:
(i) Independent Variable
(ii) Dependent Variable
(iii) Constant
(iv) Control group
(v) Experimental Group
(vi) Repeated trials
Variable:- A variable is anything that changes during the experiment.
Independent Variable:- The independent variable is the one that is changed on purpose by the experimenter. It is also
known as the cause, stimulus, reason or manipulated variable. It is the “if” part of the hypothesis.
Dependent Variable:- The variable that responds to the independent variable is called the dependent
variable. It is also known as the effect, result or responding variable. It is the “then” part of the hypothesis.
Constant:- All factors which are not allowed to change during the experiment are called constants.
Control Group:- The control group is the group or the standard to which everything is compared.
Experimental Group:- The experimental group is the group which is tested with the independent
variable. Each test group has only one factor different from the other groups, namely the independent
variable.
Repeated trials:- Repeated trials is the number of times the experiment is repeated. The more times we
repeat the experiment, the more reliable the result.
The IVCDV (Independent Variable, Constant, Dependent Variable) chart is used to design the experiment.
IV               Constant             DV
Fertilizer       Amount of water      Plant growth
  0 drops        Type of soil
  2 drops        Amount of soil
  4 drops        Type of plant
  6 drops        Type of planter
                 Size of planter
                 Type of light
                 Location
The variable is whatever changes during the experiment. Here the number of drops of fertilizer (0, 2, 4 or 6)
is varied by the experimenter, so it is the independent variable. Plant growth depends on the number of
drops of fertilizer, so it is the dependent variable. The others are constants.
Amount of water, type of soil, amount of soil, type of plant, type of planter, size of planter, type of
light and location are constants.
If we want to test the soil instead of the fertilizer, then the fertilizer becomes a constant and the type of soil
becomes the independent variable.
The plant growth that we can observe here is called
(i) the result (of adding fertilizer)
(ii) the response (of adding fertilizer)
(iii) the effect (of adding fertilizer)
Completely Randomized Designs:-
Completely randomized designs are the simplest designs, in which the treatments are assigned to the
experimental units completely at random. This allows every experimental unit, i.e., plot, animal,
soil sample, etc., to have an equal probability of receiving any given treatment.
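A completely randomized assignment can be sketched with Python's standard `random` module. This is one simple way to do it, under stated assumptions: the function name `completely_randomized_design` is hypothetical, the plot labels are invented for the example, and equal replication (the same number of units per treatment) is an added assumption, since the text only requires that assignment be random.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Assign treatments to units completely at random,
    with the same number of units per treatment (assumed here)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)   # a seed makes the layout reproducible
    rng.shuffle(labels)         # every ordering of labels is equally likely
    return dict(zip(units, labels))

# Hypothetical example: 12 plots and the four fertilizer levels from the IVCDV chart.
plots = [f"plot {i}" for i in range(1, 13)]
levels = ["0 drops", "2 drops", "4 drops", "6 drops"]
layout = completely_randomized_design(plots, levels, seed=1)
for plot, level in layout.items():
    print(plot, "->", level)
```

Shuffling the full list of labels, instead of drawing a treatment independently for each unit, keeps the group sizes equal while still giving every unit the same chance of receiving each treatment.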
REFERENCES
1. Statistical Method by S.C. Gupta
2. en.wikipedia.org/wiki/Statistics
3. www.mathsisfun.com/data/probability.html
4. www.stats.gla.ac.uk/steps/glossary/sampling.html
5. A First Course in Statistics with Application by A K P C Swain
6. A Textbook of Agricultural Statistics by R. Rangaswamy
7. Fundamentals of Statistics, Vol. I and II by A.M. Goon, M.K. Gupta and
B. Dasgupta
8. https://www.youtube.com/watch?feature=player_detailpage&v=UN206cSaF0k#t=7
9. Statistics Glossary v1.1 by Valerie J. Easton and John H. McColl
Unit 5 Correlation

More Related Content

Similar to Unit 5 Correlation

Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
MoinPasha12
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Neeraj Bhandari
 
Course pack unit 5
Course pack unit 5Course pack unit 5
Course pack unit 5
Rai University
 
Regression.pptx
Regression.pptxRegression.pptx
Regression.pptx
Shivakumar B N
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
Dr. Tushar J Bhatt
 
Regression and correlation in statistics
Regression and correlation in statisticsRegression and correlation in statistics
Regression and correlation in statistics
iphone4s4
 
Lecture 4
Lecture 4Lecture 4
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
Farzad Javidanrad
 
correlation.final.ppt (1).pptx
correlation.final.ppt (1).pptxcorrelation.final.ppt (1).pptx
correlation.final.ppt (1).pptx
ChieWoo1
 
Simple Linear Regression
Simple Linear RegressionSimple Linear Regression
Simple Linear Regression
Sindhu Rumesh Kumar
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptx
KrishnaVamsiMuthinen
 
Regression
RegressionRegression
REGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HEREREGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HERE
ShriramKargaonkar
 
Math n Statistic
Math n StatisticMath n Statistic
Math n Statistic
Jazmidah Rosle
 
Regression
RegressionRegression
Regression
LavanyaK75
 
Statistics
Statistics Statistics
Statistics
KafiPati
 
Correlation engineering mathematics
Correlation  engineering mathematicsCorrelation  engineering mathematics
Correlation engineering mathematics
ErSaurabh2
 

Similar to Unit 5 Correlation (20)

Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
 
Corr And Regress
Corr And RegressCorr And Regress
Corr And Regress
 
Course pack unit 5
Course pack unit 5Course pack unit 5
Course pack unit 5
 
Regression.pptx
Regression.pptxRegression.pptx
Regression.pptx
 
Correlation and Regression
Correlation and Regression Correlation and Regression
Correlation and Regression
 
Regression and correlation in statistics
Regression and correlation in statisticsRegression and correlation in statistics
Regression and correlation in statistics
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
correlation.final.ppt (1).pptx
correlation.final.ppt (1).pptxcorrelation.final.ppt (1).pptx
correlation.final.ppt (1).pptx
 
Simple Linear Regression
Simple Linear RegressionSimple Linear Regression
Simple Linear Regression
 
UNIT 4.pptx
UNIT 4.pptxUNIT 4.pptx
UNIT 4.pptx
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptx
 
Regression
RegressionRegression
Regression
 
REGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HEREREGRESSION ANALYSIS THEORY EXPLAINED HERE
REGRESSION ANALYSIS THEORY EXPLAINED HERE
 
Math n Statistic
Math n StatisticMath n Statistic
Math n Statistic
 
Chap04 01
Chap04 01Chap04 01
Chap04 01
 
Regression
RegressionRegression
Regression
 
Statistics
Statistics Statistics
Statistics
 
Correlation engineering mathematics
Correlation  engineering mathematicsCorrelation  engineering mathematics
Correlation engineering mathematics
 

More from Rai University

Brochure Rai University
Brochure Rai University Brochure Rai University
Brochure Rai University
Rai University
 
Mm unit 4point2
Mm unit 4point2Mm unit 4point2
Mm unit 4point2
Rai University
 
Mm unit 4point1
Mm unit 4point1Mm unit 4point1
Mm unit 4point1
Rai University
 
Mm unit 4point3
Mm unit 4point3Mm unit 4point3
Mm unit 4point3
Rai University
 
Mm unit 3point2
Mm unit 3point2Mm unit 3point2
Mm unit 3point2
Rai University
 
Mm unit 3point1
Mm unit 3point1Mm unit 3point1
Mm unit 3point1
Rai University
 
Mm unit 2point2
Mm unit 2point2Mm unit 2point2
Mm unit 2point2
Rai University
 
Mm unit 2 point 1
Mm unit 2 point 1Mm unit 2 point 1
Mm unit 2 point 1
Rai University
 
Mm unit 1point3
Mm unit 1point3Mm unit 1point3
Mm unit 1point3
Rai University
 
Mm unit 1point2
Mm unit 1point2Mm unit 1point2
Mm unit 1point2
Rai University
 
Mm unit 1point1
Mm unit 1point1Mm unit 1point1
Mm unit 1point1
Rai University
 
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
Rai University
 
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Rai University
 
Bsc agri 2 pae u-4.3 public expenditure
Bsc agri  2 pae  u-4.3 public expenditureBsc agri  2 pae  u-4.3 public expenditure
Bsc agri 2 pae u-4.3 public expenditure
Rai University
 
Bsc agri 2 pae u-4.2 public finance
Bsc agri  2 pae  u-4.2 public financeBsc agri  2 pae  u-4.2 public finance
Bsc agri 2 pae u-4.2 public finance
Rai University
 
Bsc agri 2 pae u-4.1 introduction
Bsc agri  2 pae  u-4.1 introductionBsc agri  2 pae  u-4.1 introduction
Bsc agri 2 pae u-4.1 introduction
Rai University
 
Bsc agri 2 pae u-3.3 inflation
Bsc agri  2 pae  u-3.3  inflationBsc agri  2 pae  u-3.3  inflation
Bsc agri 2 pae u-3.3 inflation
Rai University
 
Bsc agri 2 pae u-3.2 introduction to macro economics
Bsc agri  2 pae  u-3.2 introduction to macro economicsBsc agri  2 pae  u-3.2 introduction to macro economics
Bsc agri 2 pae u-3.2 introduction to macro economics
Rai University
 
Bsc agri 2 pae u-3.1 marketstructure
Bsc agri  2 pae  u-3.1 marketstructureBsc agri  2 pae  u-3.1 marketstructure
Bsc agri 2 pae u-3.1 marketstructure
Rai University
 
Bsc agri 2 pae u-3 perfect-competition
Bsc agri  2 pae  u-3 perfect-competitionBsc agri  2 pae  u-3 perfect-competition
Bsc agri 2 pae u-3 perfect-competition
Rai University
 

More from Rai University (20)

Brochure Rai University
Brochure Rai University Brochure Rai University
Brochure Rai University
 
Mm unit 4point2
Mm unit 4point2Mm unit 4point2
Mm unit 4point2
 
Mm unit 4point1
Mm unit 4point1Mm unit 4point1
Mm unit 4point1
 
Mm unit 4point3
Mm unit 4point3Mm unit 4point3
Mm unit 4point3
 
Mm unit 3point2
Mm unit 3point2Mm unit 3point2
Mm unit 3point2
 
Mm unit 3point1
Mm unit 3point1Mm unit 3point1
Mm unit 3point1
 
Mm unit 2point2
Mm unit 2point2Mm unit 2point2
Mm unit 2point2
 
Mm unit 2 point 1
Mm unit 2 point 1Mm unit 2 point 1
Mm unit 2 point 1
 
Mm unit 1point3
Mm unit 1point3Mm unit 1point3
Mm unit 1point3
 
Mm unit 1point2
Mm unit 1point2Mm unit 1point2
Mm unit 1point2
 
Mm unit 1point1
Mm unit 1point1Mm unit 1point1
Mm unit 1point1
 
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,Bdft ii, tmt, unit-iii,  dyeing & types of dyeing,
Bdft ii, tmt, unit-iii, dyeing & types of dyeing,
 
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02Bsc agri  2 pae  u-4.4 publicrevenue-presentation-130208082149-phpapp02
Bsc agri 2 pae u-4.4 publicrevenue-presentation-130208082149-phpapp02
 
Bsc agri 2 pae u-4.3 public expenditure
Bsc agri  2 pae  u-4.3 public expenditureBsc agri  2 pae  u-4.3 public expenditure
Bsc agri 2 pae u-4.3 public expenditure
 
Bsc agri 2 pae u-4.2 public finance
Bsc agri  2 pae  u-4.2 public financeBsc agri  2 pae  u-4.2 public finance
Bsc agri 2 pae u-4.2 public finance
 
Bsc agri 2 pae u-4.1 introduction
Bsc agri  2 pae  u-4.1 introductionBsc agri  2 pae  u-4.1 introduction
Bsc agri 2 pae u-4.1 introduction
 
Bsc agri 2 pae u-3.3 inflation
Bsc agri  2 pae  u-3.3  inflationBsc agri  2 pae  u-3.3  inflation
Bsc agri 2 pae u-3.3 inflation
 
Bsc agri 2 pae u-3.2 introduction to macro economics
Bsc agri  2 pae  u-3.2 introduction to macro economicsBsc agri  2 pae  u-3.2 introduction to macro economics
Bsc agri 2 pae u-3.2 introduction to macro economics
 
Bsc agri 2 pae u-3.1 marketstructure
Bsc agri  2 pae  u-3.1 marketstructureBsc agri  2 pae  u-3.1 marketstructure
Bsc agri 2 pae u-3.1 marketstructure
 
Bsc agri 2 pae u-3 perfect-competition
Bsc agri  2 pae  u-3 perfect-competitionBsc agri  2 pae  u-3 perfect-competition
Bsc agri 2 pae u-3 perfect-competition
 

Recently uploaded

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 

Recently uploaded (20)

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Language Across the Curriculm LAC B.Ed.

Unit 5 Correlation

Unit-5
Correlation :-
Suppose we have a set of 30 students in a class and we want to measure the heights and weights of all the students. We observe that each individual (unit) of the set assumes two values, one relating to the height and the other to the weight. Such a distribution, in which each individual or unit of the set is made up of two values, is called a bivariate distribution. Some examples of bivariate distributions are:
(i) In a class of 60 students, the series of marks obtained in two subjects by all of them.
(ii) The series of sales revenue and advertising expenditure of two companies in a particular year.
(iii) The series of ages of husbands and wives in a sample of selected married couples.
Thus in a bivariate distribution, we are given a set of pairs of observations, wherein each pair represents the values of two variables.
In a bivariate distribution, we are interested in finding a relationship (if it exists) between the two variables under study. The concept of 'correlation' is a statistical tool which studies the relationship between two variables, and Correlation Analysis involves various methods and techniques used for studying and measuring the extent of the relationship between the two variables.
Definition:- Two variables are said to be in correlation if the change in one of the variables results in a change in the other variable.
Types of Correlation:-
The various types of correlation are positive, negative, no correlation, perfect, strong and weak correlation.
Positive Correlation
Positive correlation occurs when an increase in one variable increases the value in another.
The line corresponding to the scatter plot is an increasing line.
Negative Correlation
Negative correlation occurs when an increase in one variable decreases the value of another.
The line corresponding to the scatter plot is a decreasing line.
No Correlation
No correlation occurs when there is no linear dependency between the variables.
Perfect Correlation
Perfect correlation occurs when there is a functional dependency between the variables. In this case all the points lie on a straight line.
Strong Correlation
A correlation is stronger the closer the points lie to the line.
Weak Correlation
A correlation is weaker the farther the points lie from the line.
Some examples of series with positive correlation are:
(i) Heights and weights;
(ii) Household income and expenditure;
(iii) Price and supply of commodities;
(iv) Amount of rainfall and yield of crops.
Correlation between two variables is said to be negative or inverse if the variables deviate in opposite directions, that is, if an increase (or decrease) in the values of one variable results, on average, in a corresponding decrease (or increase) in the values of the other variable.
Some examples of series with negative correlation are:
(i) Volume and pressure of a perfect gas;
(ii) Current and resistance, keeping the voltage constant ($R = V/I$);
(iii) Price and demand of goods.
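The direction of a correlation can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper name `direction`, the 12 V supply and the resistance values are made up for the example, and the sign of the summed products of deviations from the means is used as a simple indicator of direction (it says nothing about strength).

```python
def direction(xs, ys):
    """Classify direction from the sign of sum((x - mean_x) * (y - mean_y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "none"


VOLTAGE = 12.0                                   # assumed constant voltage
resistance = [1.0, 2.0, 3.0, 4.0, 6.0]           # hypothetical values
current = [VOLTAGE / r for r in resistance]      # I = V / R

# Current falls as resistance rises, so the direction is negative.
print(direction(resistance, current))
```

Swapping in a pair of series that rise together, such as heights and weights, would give "positive" under the same check.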
Note:
(i) If the points are very close to each other, a fairly good amount of correlation can be expected between the two variables. On the other hand, if they are widely scattered, a poor correlation can be expected between them.
(ii) If the points are scattered and reveal no upward or downward trend, as in the case of (d), then we say the variables are uncorrelated.
(iv) If there is an upward trend rising from the lower left-hand corner and going upward to the upper right-hand corner, the correlation obtained from the graph is said to be positive. Also, if there is a downward trend from the upper left-hand corner, the correlation obtained is said to be negative.
(v) The graphs shown above are generally termed scatter diagrams.
The Coefficient of Correlation (Karl Pearson's method)
Karl Pearson's method is popularly known as Pearson's coefficient of correlation. One of the most widely used statistics is the coefficient of correlation $r$, which measures the degree of association between the two values of related variables given in the data set. The coefficient of correlation $r$ is given by the formula
$$r = \frac{\sum XY}{n\,\sigma_x \sigma_y} = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} \qquad \left[\because\ \sigma_x^2 = \frac{\sum X^2}{n};\ \sigma_y^2 = \frac{\sum Y^2}{n}\right]$$
Here
$X = x - \bar{x}$; $Y = y - \bar{y}$
$\sigma_x$ = Standard deviation of series $x$
$\sigma_y$ = Standard deviation of series $y$
$n$ = Number of pairs of observations
$r$ = The (product moment) correlation coefficient
This method is to be applied only where deviations of items are taken from the actual mean and not from an assumed mean.
The value of the coefficient of correlation $r$ obtained from the above formula always lies between $-1$ and $+1$. When $r = +1$ there is a perfect positive correlation between the variables. When $r = -1$ there is a perfect negative correlation between the variables. However, if $r = 0$ there is no linear relationship between the variables.
Direct method:-
Substituting the values of $\sigma_x$ and $\sigma_y$ in the above formula, we get
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}, \quad\text{or}\quad r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Example:-
Making use of the data summarized below, calculate the coefficient of correlation.
Case   A   B   C   D   E   F   G   H
x     10   6   9  10  12  13  11   9
y      9   4   6   9  11  13   8   4
Solution:-
Case     x   X = x - 10   X^2     y   Y = y - 8   Y^2    XY
A       10        0         0     9       +1        1     0
B        6       -4        16     4       -4       16    16
C        9       -1         1     6       -2        4     2
D       10        0         0     9       +1        1     0
E       12       +2         4    11       +3        9     6
F       13       +3         9    13       +5       25    15
G       11       +1         1     8        0        0     0
H        9       -1         1     4       -4       16     4
Total   80        0        32    64        0       72    43
Here $n = 8$, $\sum x = 80$, $\sum X = 0$, $\sum X^2 = 32$, $\sum y = 64$, $\sum Y = 0$, $\sum Y^2 = 72$ and $\sum XY = 43$.
$$\bar{x} = \frac{\sum x}{n} = \frac{80}{8} = 10, \qquad \bar{y} = \frac{\sum y}{n} = \frac{64}{8} = 8$$
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{43}{\sqrt{32 \times 72}} = \frac{43}{\sqrt{2304}} = \frac{43}{48} = +0.896$$
Regression
If two variables are significantly correlated, and if there is some theoretical basis for doing so, it is possible to predict (estimate) values of one variable from the other. This observation leads to a very important concept known as 'Regression Analysis'.
For example, if we know that advertising and sales are correlated, we can find the expected amount of sales for a given advertising expenditure, or the expenditure required for attaining a given amount of sales. Similarly, if we know that the yield of rice and rainfall are closely related, we can find the amount of rain required to achieve a certain production figure.
In general, regression analysis means the estimation or prediction of the unknown value of one variable from the known value of the other variable. It is one of the most important statistical tools and is extensively used in almost all sciences: natural, social and physical. It is specially used in business and economics to study the relationship between two or more variables that are related causally, and for the estimation of demand and supply graphs, cost functions, production and consumption functions and so on.
Prediction or estimation is one of the major problems in almost all spheres of human activity. The estimation or prediction of future production, consumption, prices, investments, sales, profits, income etc. is of very great importance to business professionals. Similarly, population estimates and
population projections, GNP, revenue and expenditure etc. are indispensable for economists and for efficient planning of an economy.
The dictionary meaning of 'regression' is returning or going back. The term 'regression' was first used by Sir Francis Galton (1822-1911) in 1877 while studying the relationship between the heights of fathers and sons. The term was introduced by him in the paper "Regression towards Mediocrity in Hereditary Stature". Regression analysis was explained by M.M. Blair as follows:
"Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data".
Line of Regression
If the dots of the scatter diagram generally tend to cluster along a well-defined direction, suggesting a linear relationship between the variables $x$ and $y$, the line of best fit for the given distribution of dots is called the 'line of regression'.
There are two such lines, one giving the best possible mean values of $y$ for each specified value of $x$ and the other giving the best possible mean values of $x$ for given values of $y$. The former is called the line of regression of $y$ on $x$ and the latter is called the line of regression of $x$ on $y$.
First consider the line of regression of $y$ on $x$. Let the straight line satisfying the general trend of $n$ dots in a scatter diagram be
$$y = a + bx \qquad \cdots (i)$$
We have to determine the constants $a$ and $b$ so that equation $(i)$ gives, for each value of $x$, the best estimate of the average value of $y$. By the method of least squares, the normal equations for $a$ and $b$ are
$$\sum y = na + b\sum x \qquad \cdots (ii)$$
$$\sum xy = a\sum x + b\sum x^2 \qquad \cdots (iii)$$
Equation $(ii)$ gives
$$\frac{1}{n}\sum y = a + b \cdot \frac{1}{n}\sum x, \quad\text{i.e.}\quad \bar{y} = a + b\bar{x}$$
This shows that $(\bar{x}, \bar{y})$, i.e. the point formed by the means of $x$ and $y$, lies on $(i)$. Shifting the origin to $(\bar{x}, \bar{y})$, equation $(iii)$ takes the form
$$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2$$
But $\sum (x - \bar{x}) = 0$,
$$\therefore\ b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum XY}{\sum X^2} = \frac{\sum XY}{n\sigma_x^2} = r\frac{\sigma_y}{\sigma_x} \qquad \left(\because\ r = \frac{\sum XY}{n\sigma_x \sigma_y}\right)$$
Thus the line of best fit becomes
$$(y - \bar{y}) = r\frac{\sigma_y}{\sigma_x}(x - \bar{x})$$
which is the equation of the line of regression of $y$ on $x$. Its slope is called the regression coefficient of $y$ on $x$.
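The derivation above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function name `regression_y_on_x` is hypothetical, and the five $(x, y)$ pairs are the ones used in the regression example of this unit. It computes $b = \sum XY / \sum X^2$, recovers $a$ from $\bar{y} = a + b\bar{x}$, and compares $b$ with $r\,\sigma_y/\sigma_x$.

```python
import math


def regression_y_on_x(xs, ys):
    """Return (a, b, r, b_alt) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    X = [x - mean_x for x in xs]            # deviations from the actual mean
    Y = [y - mean_y for y in ys]
    sum_xy = sum(p * q for p, q in zip(X, Y))
    sum_xx = sum(p * p for p in X)
    sum_yy = sum(q * q for q in Y)
    b = sum_xy / sum_xx                     # b = sum(XY) / sum(X^2)
    a = mean_y - b * mean_x                 # (mean_x, mean_y) lies on the line
    r = sum_xy / math.sqrt(sum_xx * sum_yy)  # Pearson's coefficient
    sigma_x = math.sqrt(sum_xx / n)
    sigma_y = math.sqrt(sum_yy / n)
    b_alt = r * sigma_y / sigma_x           # the same slope written as r*sigma_y/sigma_x
    return a, b, r, b_alt


xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
a, b, r, b_alt = regression_y_on_x(xs, ys)
print(f"y = {a:.2f} + ({b:.2f}) x, r = {r:.3f}, r*sigma_y/sigma_x = {b_alt:.2f}")
```

Both expressions for the slope agree up to floating-point rounding, which is the identity $b = r\,\sigma_y/\sigma_x$ obtained in the derivation.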
Interchanging $x$ and $y$, the line of regression of $x$ on $y$ is
$$(x - \bar{x}) = r\frac{\sigma_x}{\sigma_y}(y - \bar{y})$$
Thus the regression coefficient of $y$ on $x$ is $r\dfrac{\sigma_y}{\sigma_x}$ and the regression coefficient of $x$ on $y$ is $r\dfrac{\sigma_x}{\sigma_y}$.
Corollary:- The correlation coefficient is the geometric mean between the two regression coefficients, since
$$r\frac{\sigma_y}{\sigma_x} \times r\frac{\sigma_x}{\sigma_y} = r^2$$
Example:-
From the following data obtain the two regression equations, and also calculate the regression equations taking deviations of items from the means of the $x$ and $y$ series.
x    6   2  10   4   8
y    9  11   5   8   7
Solution:-
OBTAINING REGRESSION EQUATIONS
          x       y      xy     x^2     y^2
          6       9      54      36      81
          2      11      22       4     121
         10       5      50     100      25
          4       8      32      16      64
          8       7      56      64      49
Total    30      40     214     220     340
Regression equation of $y$ on $x$: $y = a + bx$
$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^2$$
Substituting the values,
$$40 = 5a + 30b \qquad \cdots (i)$$
$$214 = 30a + 220b \qquad \cdots (ii)$$
Multiplying equation $(i)$ by 6,
$$240 = 30a + 180b \qquad \cdots (iii)$$
$$214 = 30a + 220b \qquad \cdots (iv)$$
Subtracting equation $(iv)$ from $(iii)$: $-40b = 26$ or $b = -0.65$.
Substituting the value of $b$ in equation $(i)$: $40 = 5a + 30(-0.65)$ or $5a = 40 + 19.5 = 59.5$ or $a = 11.9$.
Putting the values of $a$ and $b$ in the equation, the regression of $y$ on $x$ is $y = 11.9 - 0.65x$.
Regression equation of $x$ on $y$: $x = a + by$
$$\sum x = na + b\sum y$$
$$\sum xy = a\sum y + b\sum y^2$$
$$30 = 5a + 40b \qquad \cdots (i)$$
$$214 = 40a + 340b \qquad \cdots (ii)$$
Multiplying equation $(i)$ by 8:
$$240 = 40a + 320b \qquad \cdots (iii)$$
$$214 = 40a + 340b \qquad \cdots (iv)$$
From equations $(iii)$ and $(iv)$: $-20b = 26$ or $b = -1.3$.
Substituting the value of $b$ in equation $(i)$: $30 = 5a + 40(-1.3)$ or $5a = 30 + 52 = 82$, so $a = 16.4$.
Putting the values of $a$ and $b$ in the equation, the regression line of $x$ on $y$ is $x = 16.4 - 1.3y$.
CALCULATION OF REGRESSION EQUATIONS
          x   X = x - x̄   X^2     y   Y = y - ȳ   Y^2    XY
          6        0        0     9      +1        1      0
          2       -4       16    11      +3        9    -12
         10       +4       16     5      -3        9    -12
          4       -2        4     8       0        0      0
          8       +2        4     7      -1        1     -2
Total    30        0       40    40       0       20    -26
$$\bar{x} = \frac{30}{5} = 6; \qquad \bar{y} = \frac{40}{5} = 8$$
The line of regression of $x$ on $y$ is
$$(x - \bar{x}) = r\frac{\sigma_x}{\sigma_y}(y - \bar{y}), \qquad r\frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = \frac{-26}{20} = -1.3$$
$$x - 6 = -1.3(y - 8) = -1.3y + 10.4$$
$$x = -1.3y + 10.4 + 6 = 16.4 - 1.3y$$
The line of regression of $y$ on $x$ is
$$(y - \bar{y}) = r\frac{\sigma_y}{\sigma_x}(x - \bar{x}), \qquad r\frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = \frac{-26}{40} = -0.65$$
$$y - 8 = -0.65(x - 6) = -0.65x + 3.9$$
$$y = -0.65x + 3.9 + 8 = 11.9 - 0.65x$$
Thus we find the same answers as obtained earlier. However, the calculations are very much simplified without the use of the normal equations.
Experiment:-
An experiment is a treatment on a group of objects or subjects in the interest of observing the response.
Treatment:-
In experiments, a treatment is something that researchers administer to experimental units. For example, a corn field is divided into four, and each part is 'treated' with a different fertilizer to see which produces the most corn; a teacher practices different teaching methods on different groups in her class to see which yields the best results; a doctor treats a patient with a skin condition with different creams to see which is most effective. Treatments are administered to experimental units by 'level', where level implies amount or magnitude. For example, if the experimental units were given 5 mg, 10 mg, 15 mg of a
medication, those amounts would be three levels of the treatment.
(Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
Factor:-
A factor of an experiment is a controlled independent variable; a variable whose levels are set by the experimenter.
A factor is a general type or category of treatments. Different treatments constitute different levels of a factor. For example, three different groups of runners are subjected to different training methods. The runners are the experimental units and the training methods are the treatments, where the three types of training methods constitute three levels of the factor 'type of training'.
(Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
Experimental Design
Experimental design concerns the analysis of data generated from an experiment. It takes time to organize the experiment properly to ensure that the right type of data, and enough of it, is available to answer the questions of interest as clearly and efficiently as possible. This process is called experimental design.
There are six concepts of experimental design:
(i) Independent Variable
(ii) Dependent Variable
(iii) Constant
(iv) Control group
(v) Experimental Group
(vi) Repeated trials
Variable:- A variable is anything that changes during the experiment.
Independent Variable:- The independent variable is the one that is changed on purpose by the experimenter. It is also known as the cause, stimulus, reason or manipulated variable. It is the "if" part of the hypothesis.
Dependent Variable:- The variable that responds to the independent variable is called the dependent variable. It is also known as the effect, result or responding variable. It is the "then" part of the hypothesis.
Constant:- All factors which are not allowed to change during the experiment are called constants.
Control Group:- The control group is the group or the standard to which everything is compared.
Experimental Group:- The experimental group is the group which is tested with the independent variable. Each test group has only one factor different from the other group, namely the independent variable.
Repeated trials:- Repeated trials is the number of times the experiment is repeated. The more times we repeat the experiment, the more reliable the result becomes.
The IVCDV (Independent Variable, Constant, Dependent Variable) chart is used to design the experiment.
IV                 Constant            DV
Fertilizer         Amount of water     Plant growth
  0 drops          Type of soil
  2 drops          Amount of soil
  4 drops          Type of plant
  6 drops          Type of planter
                   Size of planter
                   Type of light
                   Location
A variable is anything that changes during the experiment. Here the number of drops of fertilizer (0, 2, 4 or 6) is varied by the experimenter, so it is the independent variable. Plant growth depends on the number of drops of fertilizer, so it is the dependent variable. The others (amount of water, type of soil, amount of soil, type of plant, type of planter, size of planter, type of light and location) are constants.
If we want to test the soil instead of the fertilizer, then the fertilizer becomes a constant and the type of soil becomes the independent variable.
The plant growth that we can observe here is called
(i) the result (of adding fertilizer)
(ii) the response (of adding fertilizer)
(iii) the effect (of adding fertilizer)
Completely Randomized Designs:-
Completely randomized designs are the simplest, in which the treatments are assigned to the experimental units completely at random. This allows every experimental unit, i.e., plot, animal, soil sample, etc., to have an equal probability of receiving a treatment.
REFERENCES
1. Statistical Method by S.C. Gupta
2. en.wikipedia.org/wiki/Statistics
3. www.mathsisfun.com/data/probability.html
4. www.stats.gla.ac.uk/steps/glossary/sampling.html
5. A First Course in Statistics with Application by A K P C Swain
6. A Textbook of Agricultural Statistics by R. Rangaswamy
7. Fundamentals of Statistics, Vol. I and II by A.M. Goon, M.K. Gupta and B. Dasgupta
8. https://www.youtube.com/watch?feature=player_detailpage&v=UN206cSaF0k#t=7
9. Statistics Glossary v1.1
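As a supplement to the completely randomized design described above, the following sketch shows one way the random assignment could be carried out. It rests on assumptions that are not in the text: the function name `completely_randomized_design`, the twelve plot labels and the equal number of plots per treatment are all chosen for illustration; the treatment labels reuse the fertilizer levels from the IVCDV chart.

```python
import random


def completely_randomized_design(units, treatments, seed=None):
    """Assign treatments to units at random, each treatment equally often."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times (assumed equal replication).
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)      # a seed makes the layout reproducible
    rng.shuffle(labels)            # every ordering of labels is equally likely
    return dict(zip(units, labels))


plots = [f"plot{i}" for i in range(1, 13)]               # hypothetical units
levels = ["0 drops", "2 drops", "4 drops", "6 drops"]    # fertilizer levels
layout = completely_randomized_design(plots, levels, seed=1)
for plot, treatment in layout.items():
    print(plot, treatment)
```

Because the labels are shuffled as a whole, each plot has the same chance of receiving any given treatment, which is the defining property of this design.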