- 1. BIOSTATISTICS: HYPOTHESIS, TYPES OF ERRORS, t-TEST, Z-TEST. By Mr Payaam Vohra NIPER AIR 11 Gold Medalist in MU MET AIR 07 ICT MTECH SCORE RANK 01 CUET-PG AIR 01 IIT BHU AIR 08 GPAT AIR 43
- 2. IV B.PHARMACY (BIO STATISTICS)

  TEST OF HYPOTHESIS

  HYPOTHESIS: Any statement about the population is called a hypothesis. It may be TRUE or FALSE.

  STATISTICAL HYPOTHESIS: Any statement about a population parameter is called a statistical hypothesis. It is denoted by H.
  Eg 1: The average height of competitors in a game is 165 cm, i.e., $H: \mu = 165$.
  Eg 2: The average lifetime of an electric bulb manufactured by a company is 1000 hours, i.e., $H: \mu = 1000$.

  Two kinds of hypothesis are essential in conducting the test procedure: (1) NULL HYPOTHESIS (2) ALTERNATIVE HYPOTHESIS

  (1) NULL HYPOTHESIS: A statistical hypothesis of no difference (a "null attitude") is called the null hypothesis. It is denoted by $H_0$.
  Eg 1: The average height of competitors in a game is 165 cm, i.e., $H_0: \mu = 165$.
  Eg 2: The average lifetime of an electric bulb manufactured by a company is 1000 hours, i.e., $H_0: \mu = 1000$.

  (2) ALTERNATIVE HYPOTHESIS: A statistical hypothesis that is complementary to the null hypothesis is called the alternative hypothesis. It is denoted by $H_1$.
  Eg: If the null hypothesis is that the average height of the competitors in the game is 165 cm, i.e., $H_0: \mu = 165$, then the alternative hypothesis may be formulated as
  i) $H_1: \mu > 165$ [One-tailed test (right-tailed test)]
  ii) $H_1: \mu < 165$ [One-tailed test (left-tailed test)]
  iii) $H_1: \mu \neq 165$ [Two-tailed test]
- 3. CRITICAL REGION: Let $x_1, x_2, \ldots, x_n$ be sample observations in the sample space S. Divide S into two disjoint parts $W$ and $\bar{W}$. The region $W$ consists of the sample points for which the null hypothesis is rejected.
  (or)
  A region in the sample space S which amounts to rejection of $H_0$ is called the CRITICAL REGION (CR) or REJECTION REGION (RR).
  In any test, the critical region is represented by an area under the probability curve.

  CRITICAL VALUE (CV): The value of the test statistic which separates the critical region from the acceptance region is called the CRITICAL VALUE or SIGNIFICANT VALUE.

  The significant value depends on
  1. Level of Significance (LOS)
  2. Alternative Hypothesis ($H_1$)

  The alternative hypothesis ($H_1$) is of three types
  1. Right Tailed Test (RTT) ⇒ $H_1: \mu > \mu_0$
  2. Left Tailed Test (LTT) ⇒ $H_1: \mu < \mu_0$
  3. Two Tailed Test (TTT) ⇒ $H_1: \mu \neq \mu_0$

  Right Tailed Test (RTT): If the critical region falls on the right-hand side of the probability curve, the test is known as a Right Tailed Test (RTT).
- 4. Left Tailed Test (LTT): If the critical region falls on the left-hand side of the probability curve, the test is known as a Left Tailed Test (LTT).

  Two Tailed Test (TTT): If the critical region is equally distributed on both sides of the probability curve, the test is known as a Two Tailed Test (TTT).

  TYPES OF ERRORS: There are two types of errors.
  1. Type-I Error
  2. Type-II Error

  The Type-I and Type-II Errors are explained in the following table:

  | Nature / Statement | Decision: ACCEPT H0 | Decision: REJECT H0 |
  |---|---|---|
  | NULL HYPOTHESIS IS TRUE | Correct decision | Type-I Error |
  | NULL HYPOTHESIS IS FALSE (H1 is TRUE) | Type-II Error | Correct decision |
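The Type-I cell of the table can be illustrated by simulation. The sketch below is only an illustration under stated assumptions, not part of the original notes: it assumes a two-tailed Z test with a known population SD, so that the critical value can be read from the standard normal distribution via Python's `statistics.NormalDist`. The values `SIGMA`, `N` and `TRIALS` are made-up; `MU0 = 165` is borrowed from the height example. Samples are drawn with H0 true, so every rejection is a Type-I error, and the rejection rate should come out close to the chosen level of significance.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05      # level of significance (5%)
MU0 = 165.0       # H0: mu = 165 (height example from the notes)
SIGMA = 10.0      # hypothetical known population SD (assumption)
N = 10            # hypothetical sample size (assumption)
TRIALS = 20000    # number of simulated samples (assumption)

# Two-tailed test: alpha is split equally between the two tails.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

random.seed(0)    # fixed seed so the run is repeatable
rejections = 0
for _ in range(TRIALS):
    # H0 is TRUE here: each sample comes from a normal population with mean MU0.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    z_cal = abs(sample_mean - MU0) / (SIGMA / math.sqrt(N))
    if z_cal > z_crit:       # statistic falls in the critical region
        rejections += 1      # rejecting a true H0 is a Type-I error

type1_rate = rejections / TRIALS
print(f"two-tailed critical value = {z_crit:.3f}")
print(f"observed Type-I error rate = {type1_rate:.4f}")
```

Changing `ALPHA`, or using `inv_cdf(1 - ALPHA)` for a one-tailed test, shows how the critical value depends on both the level of significance and the alternative hypothesis.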
- 5. (1) Type-I Error: The error of rejecting $H_0$ (null hypothesis) when $H_0$ is TRUE is called Type-I Error. The probability of Type-I Error is denoted by $\alpha$:
  $\alpha$ = P(Type-I Error) = P(Rejecting $H_0$ when $H_0$ is true)

  (2) Type-II Error: The error of accepting $H_0$ (null hypothesis) when $H_0$ is FALSE is called Type-II Error. The probability of Type-II Error is denoted by $\beta$:
  $\beta$ = P(Type-II Error) = P(Accepting $H_0$ when $H_0$ is false) = P(Accepting $H_0$ when $H_1$ is true)

  Level of significance (l.o.s): The probability of Type-I Error is known as the level of significance. It is also called the size of the critical region.

  Working rule for Testing of Hypothesis: The test of hypothesis follows a five-step procedure.

  Step 1: Null Hypothesis ($H_0$): Set up a null hypothesis $H_0$, taking into consideration the nature of the problem and the data involved.

  Step 2: Alternative Hypothesis ($H_1$): Set up the alternative hypothesis $H_1$, so that we can decide whether to use a one-tailed test (RTT or LTT) or a two-tailed test (TTT). The alternative hypothesis may be one of the following:
  1. Right Tailed Test (RTT): $H_1: \mu > \mu_0$
  2. Left Tailed Test (LTT): $H_1: \mu < \mu_0$
  3. Two Tailed Test (TTT): $H_1: \mu \neq \mu_0$

  Step 3: Level of Significance (LOS): Select the appropriate level of significance depending on the reliability of the estimates and the permissible risk. If it is not given in the problem, we usually choose the 5% level. Then $t_{tab}$ is the table value of t at the $\alpha$% level for the chosen type of test (RTT, LTT or TTT).

  Step 4: Test Statistic: Compute the test statistic under $H_0$:
- 6. $t_{cal} = \dfrac{|t - E(t)|}{S.E.(t)}$, where t is a sample statistic and S.E.(t) is the standard error of t.

  Step 5: Conclusion (or) Inference: If $t_{tab} \geq t_{cal}$, accept the null hypothesis $H_0$; otherwise reject the null hypothesis (accept $H_1$).

  SMALL SAMPLE TESTS: A sample of size n < 30 is called a small sample. To test a small sample, we have to use the exact sampling distributions, i.e., the t, F and Chi-square ($\chi^2$) tests.

  Degrees of freedom (d.f): The number of independent variates which make up the statistic is known as the degrees of freedom (d.f), and it is denoted by the Greek letter $\nu$. For example, in a set of data of n observations, if k is the number of independent constraints, then d.f ($\nu$) = n − k.

  (1) t-Test: Let $x_1, x_2, x_3, \ldots, x_n$ be a random sample of size n from a Normal population with mean $\mu$ and variance $\sigma^2$. Then Student's t-statistic is defined as

  $t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} \sim t_{n-1}$ (or) $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n-1}} \sim t_{n-1}$

  where $S^2 = \dfrac{1}{n-1}\sum (x_i - \bar{x})^2$ and $s^2 = \dfrac{1}{n}\sum (x_i - \bar{x})^2$.
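The two forms of the t-statistic give the same value. A short derivation, assuming the usual convention that $S^2$ is the sample variance with divisor $n-1$ and $s^2$ the sample variance with divisor $n$:

```latex
\begin{align*}
S^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad
s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 \\
(n-1)\,S^2 &= n\,s^2
\;\Longrightarrow\; \frac{S^2}{n} = \frac{s^2}{n-1}
\;\Longrightarrow\; \frac{S}{\sqrt{n}} = \frac{s}{\sqrt{n-1}} \\
t &= \frac{\bar{x}-\mu}{S/\sqrt{n}} = \frac{\bar{x}-\mu}{s/\sqrt{n-1}} \sim t_{n-1}
\end{align*}
```

Which form is convenient depends only on whether the sample SD at hand was computed with divisor $n-1$ or $n$.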
- 7. Properties of t-distribution:
  1. The t-distribution is bell-shaped, similar to the Normal distribution, and is symmetrical about the mean.
  2. It is symmetrical about the line t = 0.
  3. The mean of the t-distribution is zero.
  4. It is unimodal with mean = median = mode.
  5. The variance of the t-distribution depends upon the parameter $\nu$, which is called the degrees of freedom (d.f).
  6. The t-distribution ranges from −∞ to +∞.

  Applications of t-test:
  1. To test the significance of the sample mean when the population variance is not given (t-test for single mean).
  2. t-test for difference of means (two sample means).
  3. Paired t-test.

  PROBLEMS ON t-TEST: SINGLE MEAN

  Problem-1: A random sample of 10 boys had the following IQs: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean IQ of 100?

  Solution: Here the SD and mean of the sample are not given directly.

  Mean: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{70+120+110+101+88+83+95+98+107+100}{10} = \dfrac{972}{10} = 97.2$

  | $x$ | $x - \bar{x}$ | $(x - \bar{x})^2$ |
  |---|---|---|
  | 70 | -27.2 | 739.84 |
  | 120 | 22.8 | 519.84 |
  | 110 | 12.8 | 163.84 |
  | 101 | 3.8 | 14.44 |
  | 88 | -9.2 | 84.64 |
  | 83 | -14.2 | 201.64 |
  | 95 | -2.2 | 4.84 |
  | 98 | 0.8 | 0.64 |
  | 107 | 9.8 | 96.04 |
  | 100 | 2.8 | 7.84 |
  | $\sum x = 972$ | | $\sum (x - \bar{x})^2 = 1833.60$ |
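The table totals for Problem-1, and the t-statistic they lead to, can be cross-checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the tabulated value 2.26 (9 d.f., 5% level, two-tailed) is hard-coded from a t-table rather than computed.

```python
import math

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]  # sample from Problem-1
mu0 = 100                 # hypothesised population mean under H0
n = len(iq)

x_bar = sum(iq) / n                               # sample mean
ss = sum((x - x_bar) ** 2 for x in iq)            # sum of squared deviations
S = math.sqrt(ss / (n - 1))                       # sample SD with divisor n - 1
t_cal = abs(x_bar - mu0) / (S / math.sqrt(n))     # one-sample t statistic

T_TAB = 2.26              # table value for 9 d.f. at 5% (two-tailed)
decision = "accept H0" if T_TAB >= t_cal else "reject H0"

print(f"mean = {x_bar:.1f}, sum of squares = {ss:.2f}")
print(f"S = {S:.4f}, t_cal = {t_cal:.4f}, decision: {decision}")
```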
- 8. We know that $S = \sqrt{\dfrac{1}{n-1}\sum (x - \bar{x})^2} = \sqrt{\dfrac{1}{10-1} \times 1833.60} = 14.2735$

  1. Null hypothesis: The data support the assumption of a population mean I.Q of 100, i.e., $H_0: \mu = 100$
  2. Alternative hypothesis: $H_1: \mu \neq 100$
  3. LOS: The tabulated value of t for 9 d.f at the 5% level of significance is 2.26.
  4. Test statistic: Compute the test statistic under $H_0$:

  $t_{cal} = \dfrac{|\bar{x} - \mu|}{S/\sqrt{n}} \sim t_{(n-1)\ d.f}$

  $t_{cal} = \dfrac{|97.2 - 100|}{14.2735/\sqrt{10}} = 0.6203 \sim t_{9\ d.f}$

  5. Inference: Since $t_{tab} > t_{cal}$, i.e., 2.26 > 0.6203, we accept the null hypothesis. The data support the assumption of a population mean I.Q of 100.

  PROBLEMS ON t-TEST: TWO MEANS

  Problem-2: Samples of two types of electric light bulbs were tested for length of life and the following data were obtained.

  | | TYPE-I | TYPE-II |
  |---|---|---|
  | Sample mean | 1234 hrs | 1036 hrs |
  | Sample SD | 36 hrs | 40 hrs |
  | Sample size | 8 | 7 |

  Is the difference in the means sufficient to warrant that Type-I is superior to Type-II regarding length of life?
- 9. Solution: Given that $n_1 = 8$, $n_2 = 7$, $s_1 = 36$, $s_2 = 40$, $\bar{x} = 1234$, $\bar{y} = 1036$.

  Now we have to find

  $S = \sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{8(36)^2 + 7(40)^2}{8 + 7 - 2}} = 40.73$

  1. Null hypothesis: The two types (Type-I and Type-II) of electric bulbs are identical, i.e., $H_0: \mu_1 = \mu_2$
  2. Alternative hypothesis: The two types (Type-I and Type-II) of electric bulbs are not identical, i.e., $H_1: \mu_1 \neq \mu_2$
  3. LOS: d.f = $n_1 + n_2 - 2$ = 8 + 7 − 2 = 13. The tabulated value of t for 13 d.f at the 5% level is 2.16.
  4. Test statistic: Compute the test statistic under $H_0$:

  $t_{cal} = \dfrac{|\bar{x} - \bar{y}|}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$

  $t_{cal} = \dfrac{1234 - 1036}{40.73\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.39 \sim t_{13\ d.f}$

  5. Inference: Since $t_{tab} < t_{cal}$, i.e., 2.16 < 9.39, we reject the null hypothesis (accept the alternative hypothesis), i.e., the two types of electric bulbs are not identical.
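The Problem-2 arithmetic can be reproduced from the summary statistics alone. The helper below is a minimal sketch (the name `pooled_t` is illustrative, not from the notes); it follows the pooled-SD formula used in the solution, with $n_i s_i^2$ in the numerator, and takes the tabulated value 2.16 as given.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t from summary statistics, using the pooled SD
    S = sqrt((n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)) as in the worked solution."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt((n1 * sd1 ** 2 + n2 * sd2 ** 2) / df)
    t = abs(mean1 - mean2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return pooled_sd, t, df

# Type-I: mean 1234, SD 36, n = 8; Type-II: mean 1036, SD 40, n = 7
S, t_cal, df = pooled_t(1234, 36, 8, 1036, 40, 7)

T_TAB = 2.16   # table value for 13 d.f. at the 5% level, as quoted in the solution
decision = "accept H0" if T_TAB >= t_cal else "reject H0"
print(f"S = {S:.2f}, t_cal = {t_cal:.2f}, d.f. = {df}, decision: {decision}")
```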
- 10. Problem-3: The students of two schools were measured for their heights; one school was on the east coast and the other on the west coast, where there is a slight difference in weather. The sampling results are as follows.

  | East coast | 43 | 45 | 48 | 49 | 51 | 52 | | | |
  |---|---|---|---|---|---|---|---|---|---|
  | West coast | 47 | 49 | 51 | 53 | 54 | 55 | 55 | 56 | 57 |

  Solution:
  1. Null hypothesis: Weather has no impact on the height of the students.
  2. Alternative hypothesis: Weather has some impact on the height of the students.
  3. LOS: d.f = $n_1 + n_2 - 2$ = 6 + 9 − 2 = 13. The tabulated value of t for 13 d.f at the 5% level is 2.16.
  4. Calculation:

  | East coast ($x_i$) | West coast ($y_i$) | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ | $y_i - \bar{y}$ | $(y_i - \bar{y})^2$ |
  |---|---|---|---|---|---|
  | 43 | 47 | -5 | 25 | -6 | 36 |
  | 45 | 49 | -3 | 9 | -4 | 16 |
  | 48 | 51 | 0 | 0 | -2 | 4 |
  | 49 | 53 | 1 | 1 | 0 | 0 |
  | 51 | 54 | 3 | 9 | 1 | 1 |
  | 52 | 55 | 4 | 16 | 2 | 4 |
  | | 55 | | | 2 | 4 |
  | | 56 | | | 3 | 9 |
  | | 57 | | | 4 | 16 |
  | $\sum x_i = 288$ | $\sum y_i = 477$ | | $\sum (x_i - \bar{x})^2 = 60$ | | $\sum (y_i - \bar{y})^2 = 90$ |
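The column totals for Problem-3 can be checked directly from the raw heights. The sketch below (helper name `mean_and_ss` is illustrative) recomputes the two means and sums of squared deviations, then the pooled SD and t-statistic that the solution goes on to derive. Note that the West coast observations sum to 477, giving a mean of 53.

```python
import math

east = [43, 45, 48, 49, 51, 52]                  # East coast heights
west = [47, 49, 51, 53, 54, 55, 55, 56, 57]      # West coast heights

def mean_and_ss(values):
    """Return the mean and the sum of squared deviations about it."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values)

x_bar, ss_x = mean_and_ss(east)
y_bar, ss_y = mean_and_ss(west)

n1, n2 = len(east), len(west)
df = n1 + n2 - 2
S = math.sqrt((ss_x + ss_y) / df)                # pooled SD from raw data
t_cal = abs(x_bar - y_bar) / (S * math.sqrt(1 / n1 + 1 / n2))

print(f"sums: {sum(east)}, {sum(west)}; means: {x_bar}, {y_bar}")
print(f"SS: {ss_x}, {ss_y}; S = {S:.4f}; t_cal = {t_cal:.4f}; d.f. = {df}")
```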
- 11. Mean of x: $\bar{x} = \dfrac{43 + 45 + 48 + 49 + 51 + 52}{6} = 48$

  Mean of y: $\bar{y} = \dfrac{47 + 49 + 51 + 53 + 54 + 55 + 55 + 56 + 57}{9} = 53$

  We know that

  $S = \sqrt{\dfrac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{60 + 90}{6 + 9 - 2}} = 3.3968$

  4. Test statistic: Compute the test statistic under $H_0$:

  $t_{cal} = \dfrac{|\bar{x} - \bar{y}|}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$

  $t_{cal} = \dfrac{|48 - 53|}{3.3968\sqrt{\dfrac{1}{6} + \dfrac{1}{9}}} = 2.7929 \sim t_{13\ d.f}$

  5. Inference: Since $t_{tab} < t_{cal}$, i.e., 2.16 < 2.7929, we reject the null hypothesis (accept the alternative hypothesis), i.e., weather has some impact on the height of the students.

  Problem-4: Scores obtained in a shooting competition by 10 soldiers before and after intensive coaching are given below:

  | Before | 67 | 24 | 57 | 55 | 63 | 54 | 56 | 68 | 33 | 43 |
  |---|---|---|---|---|---|---|---|---|---|---|
  | After | 70 | 38 | 58 | 58 | 56 | 67 | 68 | 75 | 42 | 38 |

  Test whether the intensive training is useful at the 0.05 level of significance.

  1. Null hypothesis: There is no significant effect of the training.
  2. Alternative hypothesis: The intensive training is useful.
  3. LOS: d.f = n − 1 = 10 − 1 = 9. The tabulated value of t for 9 d.f at the 0.05 level is 1.83.
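- 12. 4. Computation (d = Before − After):

  | BEFORE | AFTER | $d$ | $d - \bar{d}$ | $(d - \bar{d})^2$ |
  |---|---|---|---|---|
  | 67 | 70 | -3 | 2 | 4 |
  | 24 | 38 | -14 | -9 | 81 |
  | 57 | 58 | -1 | 4 | 16 |
  | 55 | 58 | -3 | 2 | 4 |
  | 63 | 56 | 7 | 12 | 144 |
  | 54 | 67 | -13 | -8 | 64 |
  | 56 | 68 | -12 | -7 | 49 |
  | 68 | 75 | -7 | -2 | 4 |
  | 33 | 42 | -9 | -4 | 16 |
  | 43 | 38 | 5 | 10 | 100 |
  | | | $\sum d = -50$ | | $\sum (d - \bar{d})^2 = 482$ |

  $\bar{d} = \dfrac{\sum d}{n} = \dfrac{-50}{10} = -5$, so $|\bar{d}| = 5$

  $S = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{482}{10 - 1}} = 7.32$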
- 13. 4. Test statistic:

  $t_{cal} = \dfrac{|\bar{d}|}{S/\sqrt{n}} \sim t_{(n-1)\ d.f}$

  $t_{cal} = \dfrac{5}{7.32/\sqrt{10}} = 2.16 \sim t_{9\ d.f}$

  5. Inference: Since $t_{tab} < t_{cal}$, i.e., 1.83 < 2.16, we reject the null hypothesis (accept the alternative hypothesis), i.e., the intensive training is useful.
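The paired t-test of Problem-4 can be traced step by step in Python. This is a sketch with illustrative variable names; the tabulated value 1.83 (9 d.f., 0.05 level, one-tailed) is taken as given. With d = Before − After the differences sum to −50, so the mean difference is −5 and the statistic uses its absolute value.

```python
import math

before = [67, 24, 57, 55, 63, 54, 56, 68, 33, 43]
after = [70, 38, 58, 58, 56, 67, 68, 75, 42, 38]

d = [b - a for b, a in zip(before, after)]   # paired differences, before - after
n = len(d)
d_bar = sum(d) / n                           # mean difference
ss_d = sum((x - d_bar) ** 2 for x in d)      # sum of squared deviations of d
S = math.sqrt(ss_d / (n - 1))                # SD of the differences
t_cal = abs(d_bar) / (S / math.sqrt(n))      # paired t statistic

T_TAB = 1.83   # table value for 9 d.f. at the 0.05 level (one-tailed)
decision = "accept H0" if T_TAB >= t_cal else "reject H0"
print(f"sum d = {sum(d)}, d_bar = {d_bar}, SS = {ss_d}")
print(f"S = {S:.2f}, t_cal = {t_cal:.2f}, decision: {decision}")
```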
- 15. t-test table values
- 16. ANALYSIS OF VARIANCE (ANOVA)

  INTRODUCTION: Analysis of variance is an important method for testing the significance of the difference between sample means. Previously, we used the t-test for testing the significance of the difference between two sample means, but when we have more than two samples we have to use the technique of ANOVA.

  Example: Five fertilizers are applied to four plots each, and the yield of rice from each plot is recorded; we then have to test whether the effects of the fertilizers on the yields are significantly different or not.

  The ANOVA technique was introduced by R.A. Fisher (Father of statistics) in the year 1920.

  Assumptions of ANOVA: To apply the ANOVA technique the following assumptions must be made.
  1. All the observations are independent.
  2. All group variances are equal.
  3. The parent population is Normal.
  4. Environmental effects are additive in nature.

  To test the equality of several means by the ANOVA technique, we have the following methods.

  (1) ANOVA One-way classification: The technique of testing several means by analyzing the total variance in one direction is called ANOVA one-way classification.

  STATISTICAL ANALYSIS OF ONE-WAY CLASSIFICATION:
  Total Sum of Squares (T.S.S) = Group Sum of Squares (G.S.S) + Error Sum of Squares (E.S.S)
  i.e., T.S.S = G.S.S + E.S.S, or $S_T^2 = S_G^2 + S_E^2$
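The identity T.S.S = G.S.S + E.S.S can be checked numerically. The sketch below is an illustration only. The data in `groups` are made-up numbers, not from the notes, and the sums of squares use the standard one-way definitions, which are an assumption here since this slide does not spell them out: T.S.S about the grand mean, G.S.S from group means about the grand mean (weighted by group size), and E.S.S from observations about their own group mean.

```python
# Hypothetical yields for three groups (made-up numbers, illustration only).
groups = [[20, 21, 23], [18, 20, 22], [25, 27, 29]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Total sum of squares: every observation about the grand mean.
tss = sum((x - grand_mean) ** 2 for x in all_obs)

# Group (between-group) sum of squares: group means about the grand mean.
gss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Error (within-group) sum of squares: observations about their own group mean.
ess = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print(f"TSS = {tss:.4f}, GSS = {gss:.4f}, ESS = {ess:.4f}")
print("identity holds:", abs(tss - (gss + ess)) < 1e-9)
```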