# Correlation and Regression; ANOVA

## on Feb 18, 2014

## Correlation and Regression; ANOVA: Presentation Transcript

• BIOSTATISTICS: CORRELATION AND REGRESSION, ANOVA. SHRIVARDHAN DHEEMAN, GURUKUL KANGRI UNIVERSITY, HARIDWAR
• CORRELATION / CORRELATION ANALYSIS Correlation: what we look for when trying to find a relationship (if one exists) between the two variables (bivariate) under study. Correlation analysis (the tool we use): the methods and techniques used for studying and measuring the extent of the relationship between two variables.
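• FIRST, TO UNDERSTAND THE TERM BIVARIATE Examples of bivariate distributions will make the concept clear. In a class of 60 students: the marks obtained in two subjects by all of them. In a field of 10 plants: the height of the plant and the number of flowers on the plant.

| S. No. | Height of plant | Flowers on plant |
|---|---|---|
| 1 | 4 | 12 |
| 2 | 3 | 10 |
| 3 | 4 | 13 |
| 4 | 5 | 15 |
| 5 | 5 | 16 |
| 6 | 4 | 11 |
| 7 | 6 | 18 |
| 8 | 3 | 9 |
| 9 | 5 | 14 |
| 10 | 4 | 12 |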
• [Figure: plot of flowers on plant against height of plant]
• TYPES OF CORRELATION Analytical: positive and negative. Graphical: linear and nonlinear.
• POSITIVE CORRELATION Both variables move in the same direction, e.g. turbidity of a culture and OD; concentration of antibiotic and zone of clearance. NEGATIVE CORRELATION The variables move in opposite directions, e.g. volume and pressure of a gas; demand for grain and price.
• LINEAR CORRELATION This correlation is categorized on the basis of its graphical representation: a correlation whose graph is a straight line is called a linear correlation. A unit change in one variable results in a corresponding constant change in the other variable over the entire range of values, e.g.

| X | 2 | 4 | 6 | 8 | 10 |
|---|---|---|---|---|---|
| Y | 7 | 13 | 19 | 25 | 31 |
• For a unit change in the value of X there is a constant change in the corresponding value of Y, and the above data can be expressed by the relation $Y = 3X + 1$. In general, two variables X and Y are said to be linearly related if there exists a relationship of the form $Y = a + bX$, where a and b are real numbers.
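The constant-rate property can be checked numerically. The sketch below uses the X and Y values from the slide; the helper name `rates_of_change` is illustrative and not part of the presentation.

```python
# X and Y values from the slide, said to follow Y = 3X + 1.
xs = [2, 4, 6, 8, 10]
ys = [7, 13, 19, 25, 31]

def rates_of_change(xs, ys):
    """Change in Y per unit change in X between consecutive points."""
    points = list(zip(xs, ys))
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

rates = rates_of_change(xs, ys)
is_linear = all(rate == rates[0] for rate in rates)  # constant rate => linear

b = rates[0]            # slope of Y = a + bX
a = ys[0] - b * xs[0]   # intercept of Y = a + bX

print(rates, is_linear, a, b)  # [3.0, 3.0, 3.0, 3.0] True 1.0 3.0
```

If the rates differed from one interval to the next, the relationship would not be linear in the sense defined here.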
• [Figure: Linear Correlation Graph, series X and Y]
• NON-LINEAR CORRELATION The relationship between two variables is non-linear if, for a unit change in one variable, the other variable does not change at a constant rate but at a fluctuating rate, so the graph is not a straight line.
• [Figure: Non-Linear Correlation Graph, series X and Y]
• COEFFICIENT OF CORRELATION The measure of the degree of association between two variables is called the coefficient of correlation ($r$). If the two sets of data have $r = +1$, the correlation is perfectly positive; if $r = -1$, it is perfectly negative; if $r = 0$, there is no correlation.

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
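As a minimal sketch, the raw-sum formula for $r$ can be written as a small Python function. The function name `pearson_r` and the two four-point demo samples are illustrative assumptions, not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation computed from raw sums of X, Y, XY, X^2, Y^2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical samples: one rising together, one moving in opposite directions.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)  # 1.0 -1.0
```

The guard on a zero denominator reflects that $r$ cannot be computed when one variable never changes.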
• SOLVED EXAMPLE Problem: Find whether the number of flowers on a plant is correlated with the height of the plant.

| S. No. | Height of plant | Flowers on plant |
|---|---|---|
| 1 | 4 | 12 |
| 2 | 3 | 10 |
| 3 | 4 | 13 |
| 4 | 5 | 15 |
| 5 | 5 | 16 |
| 6 | 4 | 11 |
| 7 | 6 | 18 |
| 8 | 3 | 9 |
| 9 | 5 | 14 |
| 10 | 4 | 12 |
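• SOLUTION

| S. No. | Height of plant (x) | Flowers on plant (y) | $x^2$ | $y^2$ | $xy$ |
|---|---|---|---|---|---|
| 1 | 4 | 12 | 16 | 144 | 48 |
| 2 | 3 | 10 | 9 | 100 | 30 |
| 3 | 4 | 13 | 16 | 169 | 52 |
| 4 | 5 | 15 | 25 | 225 | 75 |
| 5 | 5 | 16 | 25 | 256 | 80 |
| 6 | 4 | 11 | 16 | 121 | 44 |
| 7 | 6 | 18 | 36 | 324 | 108 |
| 8 | 3 | 9 | 9 | 81 | 27 |
| 9 | 5 | 14 | 25 | 196 | 70 |
| 10 | 4 | 12 | 16 | 144 | 48 |
| Total | 43 | 130 | 193 | 1760 | 582 |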
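• $$\begin{aligned}
r &= \frac{10(582) - (43)(130)}{\sqrt{\left[10(193) - 43^2\right]\left[10(1760) - 130^2\right]}} \\
  &= \frac{5820 - 5590}{\sqrt{\left[1930 - 1849\right]\left[17600 - 16900\right]}} \\
  &= \frac{230}{\sqrt{(81)(700)}} = \frac{230}{\sqrt{56700}} \approx \frac{230}{238.12} \\
r &\approx 0.9659
\end{aligned}$$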
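The arithmetic above can be reproduced step by step. The sketch below recomputes each column total from the ten plants in the solved example and then substitutes them into the formula for $r$; the variable names are illustrative.

```python
import math

# Data from the solved example: height of plant (x) and flowers on plant (y).
heights = [4, 3, 4, 5, 5, 4, 6, 3, 5, 4]
flowers = [12, 10, 13, 15, 16, 11, 18, 9, 14, 12]

n = len(heights)
sum_x = sum(heights)                                   # 43
sum_y = sum(flowers)                                   # 130
sum_x2 = sum(x * x for x in heights)                   # 193
sum_y2 = sum(y * y for y in flowers)                   # 1760
sum_xy = sum(x * y for x, y in zip(heights, flowers))  # 582

numerator = n * sum_xy - sum_x * sum_y   # 5820 - 5590 = 230
x_term = n * sum_x2 - sum_x ** 2         # 1930 - 1849 = 81
y_term = n * sum_y2 - sum_y ** 2         # 17600 - 16900 = 700
r = numerator / math.sqrt(x_term * y_term)

print(round(r, 4))  # 0.9659
```

A value this close to +1 indicates a strong positive correlation between plant height and flower count in this sample.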
• REGRESSION If the two variables are significantly correlated, and if there is some theoretical basis for doing so, it is possible to predict the value of one variable from the other. The method of analysis used for this is called regression analysis: "estimation or prediction of the unknown value of one variable from the known value of the other variable." M. M. Blair stated that "regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data."
• REGRESSION EQUATION Let the size of the sample be $n$ and let the two sets of measurements be denoted by $X$ and $Y$. We can predict the value of $Y$ for any desired value of $X$, denoted $X'$, using the regression equation $Y = a + bX'$, where $a$ and $b$ are coefficients:

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}, \qquad a = \frac{\sum Y - b\sum X}{n}$$
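A minimal sketch of these two coefficient formulas in Python follows. The names `fit_line` and `predict` are illustrative, and the demo data are the X and Y values from the linear-correlation slide, for which the fitted line should recover $Y = 1 + 3X$.

```python
def fit_line(xs, ys):
    """Return (a, b) for the regression equation Y = a + bX from raw sums."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("b is undefined when X is constant")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y - b * sum_x) / n
    return a, b

def predict(a, b, x_new):
    """Predicted Y for a new value X' using Y = a + bX'."""
    return a + b * x_new

a, b = fit_line([2, 4, 6, 8, 10], [7, 13, 19, 25, 31])
print(a, b, predict(a, b, 12))  # 1.0 3.0 37.0
```

Note that $b$ has to be computed first, because the formula for $a$ depends on it.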
• EXAMPLE Problem: Nitrogen produced by the treatment plant at mid term and final. Develop a regression equation that may be used to predict the final yield from the mid-term score.

| Treatment plant | Mid term | Final |
|---|---|---|
| 1 | 98 | 90 |
| 2 | 66 | 74 |
| 3 | 100 | 98 |
| 4 | 96 | 88 |
| 5 | 88 | 80 |
| 6 | 45 | 62 |
| 7 | 76 | 78 |
| 8 | 60 | 74 |
| 9 | 74 | 86 |
| 10 | 82 | 80 |
• SOLUTION

| Treatment plant | Mid term (x) | Final (y) | $x^2$ | $xy$ |
|---|---|---|---|---|
| 1 | 98 | 90 | 9604 | 8820 |
| 2 | 66 | 74 | 4356 | 4884 |
| 3 | 100 | 98 | 10000 | 9800 |
| 4 | 96 | 88 | 9216 | 8448 |
| 5 | 88 | 80 | 7744 | 7040 |
| 6 | 45 | 62 | 2025 | 2790 |
| 7 | 76 | 78 | 5776 | 5928 |
| 8 | 60 | 74 | 3600 | 4440 |
| 9 | 74 | 86 | 5476 | 6364 |
| 10 | 82 | 80 | 6724 | 6560 |
| Total | 785 | 810 | 64521 | 65074 |

• Numerator of b = 10 × 65074 − 785 × 810 = 650740 − 635850 = 14890. Denominator of b = 10 × 64521 − (785)² = 645210 − 616225 = 28985. Therefore b = 14890/28985 ≈ 0.5137 (more precisely, 0.513714). Numerator of a = 810 − 785 × 0.513714 = 810 − 403.2655 = 406.7345. Denominator of a = 10.
• Thus, value of a = numerator of a / denominator of a = 406.7345/10 = 40.6735. Considering the formula of the regression equation, Y = a + b(X'), where Y = the predicted value, a = the value obtained, b = the value obtained, and X' = the value of X for which the prediction is desired. Thus, for X' = 50, Y = 40.6735 + (0.513714)(50) = 40.6735 + 25.6857 = 66.3592.
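The regression example can be cross-checked with a short script. This sketch recomputes every column total directly from the ten mid-term and final values in the example table rather than copying the totals, so its output depends only on the row data; X' = 50 is the prediction point used on the slide, and the variable names are illustrative.

```python
# Row data from the example table: mid-term (x) and final (y) values.
mid_term = [98, 66, 100, 96, 88, 45, 76, 60, 74, 82]
final = [90, 74, 98, 88, 80, 62, 78, 74, 86, 80]

n = len(mid_term)
sum_x = sum(mid_term)
sum_y = sum(final)
sum_x2 = sum(x * x for x in mid_term)
sum_xy = sum(x * y for x, y in zip(mid_term, final))

# Slope: b = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
b_numerator = n * sum_xy - sum_x * sum_y
b_denominator = n * sum_x2 - sum_x ** 2
b = b_numerator / b_denominator

# Intercept: a = (Sy - b*Sx) / n
a = (sum_y - b * sum_x) / n

# Prediction for X' = 50 using Y = a + bX'
x_new = 50
predicted = a + b * x_new

print(sum_x, sum_y, sum_x2, sum_xy)
print(round(b, 4), round(a, 4), round(predicted, 4))
```

Because `b` is carried at full precision here, the last digits can differ slightly from a hand calculation that rounds `b` before computing `a`.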
• ANOVA
• ANOVA: ANALYSIS OF VARIANCE • Statistical hypothesis: analysis of experimental data • Method: making decisions by using data • Calculated: by the null hypothesis and the sample data. "Assuming the truth of the null hypothesis, the statistical result is used to justify rejecting or accepting it, and so to draw an inference about the variance of the data. If the null hypothesis is accepted, the variation is not significant, and vice versa."
• When the result of the ANOVA is represented graphically, the graph is divided into two regions: an acceptance region, where the data support the hypothesis, and a rejection region, where the data do not support the hypothesis. The null hypothesis is denoted by $H_0$.
• HISTORY OF ANOVA In 1827, Laplace addressed an ANOVA problem regarding measurements of atmospheric tides. In 1918, Sir Ronald Fisher introduced the term "variance" in his article published that year under the title "The Correlation Between Relatives on the Supposition of Mendelian Inheritance". Fisher introduced the method of analysis in his book "Statistical Methods for Research Workers", published in 1925.
• COMPONENT OF MEASURE OF ANOVA: F-TEST The F-test is used for the comparison of variances from a mixed population. It is recommended for ANOVA, where two estimates of the variance of the same sample are compared. While the F-test is not generally robust against departures from normality, it has been found to be robust in the special case of ANOVA. Moore and McCabe (2003) note that ANOVA uses F statistics, but these are not the same as the F statistic for comparing the standard deviations of two populations.
• The F-test is used for comparisons of the components of the total deviation. For example, in one-way, or single-factor, ANOVA, statistical significance is tested for by comparing the F test statistic
• WHAT IS ANOVA Under the null hypothesis, all groups are simply random samples of a single population, which implies that every treatment has the same effect. ANOVA as a statistical design of experiments: the experimenter adjusts the factors and measures the responses in an attempt to determine an effect. ANOVA is the synthesis of several ideas and is used for multiple purposes. As a consequence, it is difficult to define concisely and precisely.
• CHARACTERISTICS & LOGIC Characteristics: • Used in the analysis of comparative experiments • Determined by the ratio of two variances. Logic: • The calculation of ANOVA can be characterized as computing a number of means and variances, dividing one variance by another, and comparing the ratio to determine statistical significance. • The effect of any treatment is estimated by taking the difference between the mean of the observations that receive the treatment and the general mean.
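The logic described above can be sketched for the single-factor case as follows. The function name `one_way_anova` and the three small treatment groups are hypothetical and chosen only so the arithmetic is easy to follow; the between-group and within-group mean squares are the "two variances" whose ratio gives F.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (treatment effects, F ratio, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more observations than groups")

    grand_mean = mean([v for g in groups for v in g])
    group_means = [mean(g) for g in groups]

    # Treatment effect: group mean minus the general (grand) mean.
    effects = [m - grand_mean for m in group_means]

    # Variation between group means and variation inside each group.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    if ms_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    return effects, ms_between / ms_within, df_between, df_within

# Hypothetical responses under three treatments.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
effects, f_ratio, df_between, df_within = one_way_anova(groups)
print(effects, f_ratio, df_between, df_within)
```

For these hypothetical groups the effects are -2, 0 and +2 and the F ratio is 12 on (2, 6) degrees of freedom; significance would then be judged by comparing this ratio with the critical value of the F distribution for those degrees of freedom.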
• TYPE OF ANOVA One-way ANOVA: This ANOVA tests a single hypothesis from the obtained data; that hypothesis is the null hypothesis. The single hypothesis concerns the effect of one factor on the variance among groups of random data. The F-test then gives the limit of acceptance and rejection: the F value obtained from the ANOVA is plotted against, and compared with, the critical F value. Example: Problem: nitrogen produced by plants treated with fertilizer. $H_0$: nitrogen is produced due to the fertilizer vs. by the plant itself.
• TYPE OF ANOVA Two-way ANOVA: This ANOVA differs significantly from the one-way ANOVA in that it lets us test two hypotheses simultaneously under the null hypothesis. Of the two hypotheses, one is rejected and the other is accepted for the data. Example: Problem: bacterial growth observed in CFU on 28 solid media plates, where temperature and pH are the factors of growth. To test the factors, we have to test two hypotheses: $H_0$: bacterial growth is inhibited due to temperature vs. pH. $H_0'$: bacterial growth is enhanced due to temperature vs. pH.
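A two-factor analysis can be sketched in the same spirit. The code below assumes the simplest layout, one observation per temperature and pH combination with no interaction term, which is an assumption made for illustration and not something the slides specify. The CFU counts and the name `two_way_anova` are hypothetical.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the single observation for row level i and column level j.
    Returns (F for rows, F for columns, (df_rows, df_cols, df_error)).
    """
    r = len(table)
    c = len(table[0])
    if r < 2 or c < 2 or any(len(row) != c for row in table):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")

    grand_mean = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    # Split the total variation into row, column and residual parts.
    ss_total = sum((v - grand_mean) ** 2 for row in table for v in row)
    ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    if ms_error <= 0:
        raise ValueError("F is undefined when there is no residual variation")

    f_rows = (ss_rows / df_rows) / ms_error
    f_cols = (ss_cols / df_cols) / ms_error
    return f_rows, f_cols, (df_rows, df_cols, df_error)

# Hypothetical CFU counts: rows are temperature levels, columns are pH levels.
cfu = [
    [20, 24, 22],
    [30, 35, 31],
    [25, 28, 28],
]
f_temp, f_ph, dfs = two_way_anova(cfu)
print(f_temp, f_ph, dfs)
```

Each F value is then compared separately with the critical F value for its own degrees of freedom, giving one decision for the temperature hypothesis and one for the pH hypothesis.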