
# Presentation1group b

Published in: Education, Technology
• page 79 of text
• Some students have difficulty understanding the idea of ‘within one standard deviation of the mean’. Emphasize that this means the interval from one standard deviation below the mean to one standard deviation above the mean.
• These percentages will be verified by the concepts learned in Chapter 5. Emphasize that the Empirical Rule is appropriate for data with a BELL-SHAPED distribution.
• page 19 of text
• Students most often confuse stratified sampling with cluster sampling. Both break the population into strata or sections. With stratified sampling, a few members are selected from each stratum. With cluster sampling, choose a few of the sections and then choose all the members from the chosen sections.
• page 23 of text
### Presentation1group b

1. 1. <ul><li>Biological variation in large groups is common, e.g., blood pressure (BP) and weight (wt)</li></ul><ul><li>What is normal variation, and how is it measured?</li></ul><ul><li>A measure of dispersion shows how individual observations are dispersed around the central tendency of a large series</li></ul><ul><li>Deviation = Observation - Mean</li></ul>10/01/11 STATISTICS
2. 2. <ul><li>Range</li></ul><ul><li>Quartile deviation</li></ul><ul><li>Mean deviation</li></ul><ul><li>Standard deviation</li></ul><ul><li>Variance</li></ul><ul><li>Coefficient of variation: indicates relative variability, (SD / Mean) × 100</li></ul>
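As a rough illustration of the coefficient of variation, (SD / Mean) × 100, here is a minimal Python sketch. The function name is made up for this example, the data are the 12 observations from the standard deviation example later in the deck, and `statistics.stdev` divides by n - 1, which matches the deck's rule for samples smaller than 30.

```python
import statistics

def coefficient_of_variation(values):
    """Relative variability: (SD / mean) x 100, with SD computed using n - 1."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Observations borrowed from the standard deviation example in the deck.
observations = [58, 66, 70, 74, 80, 86, 90, 100, 79, 96, 88, 97]
cv = coefficient_of_variation(observations)
print(f"Coefficient of variation: {cv:.1f}%")
```

Because the result is a percentage of the mean, it can be used to compare the spread of variables measured in different units.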
3. 3. <ul><li>Range: difference between the highest and the lowest value</li></ul><ul><li>Problem:</li></ul><ul><li>The systolic and diastolic pressures of 10 medical students are as follows: 140/70, 120/88, 160/90, 140/80, 110/70, 90/60, 124/64, 100/62, 110/70 and 154/90. Find the range of systolic and diastolic blood pressure.</li></ul><ul><li>Solution:</li></ul><ul><li>Range of systolic blood pressure of medical students: 90 to 160, i.e., 70</li></ul><ul><li>Range of diastolic blood pressure of medical students: 60 to 90, i.e., 30</li></ul><ul><li>Mean deviation: the average of the absolute deviations of the observations from the mean value</li></ul><ul><li>Mean deviation (M.D.) = $\frac{\sum \lvert X - \bar{X} \rvert}{n}$, where $X$ = observation, $\bar{X}$ = mean, and $n$ = number of observations</li></ul>
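The range problem on this slide can be checked with a short Python sketch. The helper name `value_range` is an assumption made for illustration; the readings are the ten systolic/diastolic pairs given in the problem.

```python
# Blood pressure readings (systolic, diastolic) from the slide's problem.
readings = [(140, 70), (120, 88), (160, 90), (140, 80), (110, 70),
            (90, 60), (124, 64), (100, 62), (110, 70), (154, 90)]

def value_range(values):
    """Return (lowest, highest, highest - lowest)."""
    lowest, highest = min(values), max(values)
    return lowest, highest, highest - lowest

systolic_range = value_range([s for s, _ in readings])
diastolic_range = value_range([d for _, d in readings])
print(systolic_range)   # (90, 160, 70)
print(diastolic_range)  # (60, 90, 30)
```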
4. 4. <ul><li>Problem: Find the mean deviation of the incubation period of measles in 7 children, which are as follows: 10, 9, 11, 7, 8, 9, 9.</li></ul><ul><li>Solution: Mean $\bar{X} = \sum X / n = 63 / 7 = 9$</li></ul>

    | Observation ($X$) | Deviation ($X - \bar{X}$) |
    |---|---|
    | 10 | 1 |
    | 9 | 0 |
    | 11 | 2 |
    | 7 | -2 |
    | 8 | -1 |
    | 9 | 0 |
    | 9 | 0 |
    | $\sum X = 63$ | $\sum \lvert X - \bar{X} \rvert = 6$, ignoring + or - signs |

    Mean deviation (M.D.) = $\frac{\sum \lvert X - \bar{X} \rvert}{n} = 6 / 7 = 0.86$
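The mean deviation calculation above can be reproduced with a few lines of Python. This is only a sketch; the function name `mean_deviation` is chosen for this example.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Incubation periods (days) of measles in 7 children, from the slide.
incubation_days = [10, 9, 11, 7, 8, 9, 9]
md = mean_deviation(incubation_days)
print(round(md, 2))  # 0.86
```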
5. 5. <ul><li>It is the most frequently used measure of dispersion</li></ul><ul><li>S.D. is the root-mean-square deviation</li></ul><ul><li>S.D. is denoted by σ or S.D.</li></ul><ul><li>S.D. (σ) = $\sqrt{\frac{\sum (X - \bar{X})^2}{n}}$</li></ul>
6. 6. <ul><li>Calculate the mean</li></ul><ul><li>↓</li></ul><ul><li>Calculate the difference between each observation and the mean</li></ul><ul><li>↓</li></ul><ul><li>Square the differences</li></ul><ul><li>↓</li></ul><ul><li>Sum the squared values</li></ul><ul><li>↓</li></ul><ul><li>Divide the sum of squares by the number of observations (n) to get the ‘mean square deviation’ or variance (σ²). [For a sample size less than 30, divide by (n - 1) instead]</li></ul><ul><li>↓</li></ul><ul><li>Find the square root of the variance to get the root-mean-square deviation or S.D. (σ)</li></ul>
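7. 7. Mean $\bar{X} = \sum X / n = 984 / 12 = 82$

    | Observation ($X$) | Deviation ($X - \bar{X}$) | $(X - \bar{X})^2$ |
    |---|---|---|
    | 58 | -24 | 576 |
    | 66 | -16 | 256 |
    | 70 | -12 | 144 |
    | 74 | -8 | 64 |
    | 80 | -2 | 4 |
    | 86 | 4 | 16 |
    | 90 | 8 | 64 |
    | 100 | 18 | 324 |
    | 79 | -3 | 9 |
    | 96 | 14 | 196 |
    | 88 | 6 | 36 |
    | 97 | 15 | 225 |
    | $\sum X = 984$ | | $\sum (X - \bar{X})^2 = 1914$ |

    S.D. (σ) = $\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{1914}{12 - 1}} = \sqrt{174} = 13.2$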
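The standard deviation steps can be traced in Python for the 12 observations on this slide. This is a minimal sketch; the function name is an assumption, and it divides by n - 1 because the sample has fewer than 30 observations.

```python
import math

def sample_sd(values):
    """Return (mean, sum of squared deviations, SD using n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return mean, sum_sq, math.sqrt(sum_sq / (n - 1))

observations = [58, 66, 70, 74, 80, 86, 90, 100, 79, 96, 88, 97]
mean, sum_sq, sd = sample_sd(observations)
print(mean, sum_sq, round(sd, 1))  # 82.0 1914.0 13.2
```

Recomputing the sum of squares this way is a quick check on a hand-built deviation table.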
8. 8. [Figure 2-15: The Empirical Rule (applies to bell-shaped distributions); bell curve centred on the mean $\bar{x}$]
9. 9. [Figure 2-15: The Empirical Rule (applies to bell-shaped distributions); 68% of data are within 1 standard deviation of the mean, from $\bar{x} - s$ to $\bar{x} + s$ (34% on each side)]
10. 10. [Figure 2-15: The Empirical Rule (applies to bell-shaped distributions); 68% within 1 standard deviation (34% on each side) and 95% within 2 standard deviations, from $\bar{x} - 2s$ to $\bar{x} + 2s$ (a further 13.5% on each side)]
11. 11. [Figure 2-15: The Empirical Rule (applies to bell-shaped distributions); 68% within 1 standard deviation, 95% within 2 standard deviations, and 99.7% of data within 3 standard deviations of the mean, from $\bar{x} - 3s$ to $\bar{x} + 3s$; per side: 34%, 13.5%, 2.4%, and 0.1% beyond 3 standard deviations]
12. 12. <ul><li>Other names: frequency distribution curve, normal curve, Gaussian curve, etc.</li></ul><ul><li>Most continuous biological variables follow a normal distribution</li></ul><ul><li>Applicable to quantitative data (when there is a large number of observations)</li></ul><ul><li>Quantitative data are represented by a histogram; joining the midpoints of the tops of the rectangles in the histogram gives a frequency polygon</li></ul><ul><li>When the number of observations becomes very large and the class interval is very much reduced, the frequency polygon loses its angulations and gives rise to a smooth curve known as the frequency curve</li></ul>
13. 13. <ul><li>Mean ± 1 SD limit includes 68.27% of all observations</li></ul><ul><li>Mean ± 1.96 SD limit includes 95% of all observations</li></ul><ul><li>Mean ± 2 SD limit includes 95.45% of all observations</li></ul><ul><li>Mean ± 2.58 SD limit includes 99% of all observations</li></ul><ul><li>Mean ± 3 SD limit includes 99.73% of all observations</li></ul>
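These percentages follow from the normal distribution and can be checked numerically. The sketch below relies on the standard identity that the proportion within mean ± k SD equals erf(k / √2); the function name `coverage` is an assumption made for this example.

```python
import math

def coverage(k):
    """Proportion of a normal distribution lying within mean +/- k SD."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 2.58, 3):
    print(f"mean +/- {k} SD: {coverage(k) * 100:.2f}%")
```

The multipliers 1.96 and 2.58 are rounded, so the computed values for them are only approximately 95% and 99%.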
14. 14. <ul><li>Observations of a continuous variable that is normally distributed in a population, when plotted as a frequency curve, give rise to the Normal Curve</li></ul><ul><li>The characteristics of the Normal Curve:</li></ul><ul><li>- A smooth, bell-shaped, symmetrical curve</li></ul><ul><li>- The area under the curve is 1 or 100%</li></ul><ul><li>- Mean, median and mode are identical (at the same point)</li></ul><ul><li>- The curve never touches the base line</li></ul><ul><li>- The limit on either side is called the ‘confidence limit’</li></ul><ul><li>- The curve tells the probability of occurrence by chance (sample variability), or how many times an observation can occur normally in the population</li></ul><ul><li>- The distribution of observations under the normal curve follows the same pattern as the Normal Distribution</li></ul>
15. 15. <ul><li>Each observation under a normal curve has a ‘Z’ value</li></ul><ul><li>Z (standard normal variate, relative deviate or critical ratio) is the distance of the observation from the mean, measured in standard deviations</li></ul><ul><li>Z = (Observation - Mean) / S.D. = $(X - \bar{X}) / \text{S.D.}$</li></ul><ul><li>So, if the ‘Z’ score is -2, the observation is 2 S.D. away from the mean on the left-hand side. Similarly, if Z is +2, the observation is 2 S.D. away from the mean on the right-hand side.</li></ul><ul><li>When the ‘Z’ score is expressed as an absolute value, say 2, the observation is 2 S.D. away from the mean irrespective of direction.</li></ul><ul><li>If all observations of normal curves are replaced by their ‘Z’ scores, virtually all curves become the same. This standardized curve is known as the STANDARD NORMAL CURVE</li></ul>
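A short Python sketch of the Z formula follows. The mean of 36 and S.D. of 8 are taken from one of the example curves in the deck; the observations 20 and 52 are illustrative values chosen here, not data from the slides.

```python
def z_score(observation, mean, sd):
    """Distance of an observation from the mean, in standard deviations."""
    return (observation - mean) / sd

# Illustrative observations for a curve with mean 36 and S.D. 8.
z_low = z_score(20, mean=36, sd=8)
z_high = z_score(52, mean=36, sd=8)
print(z_low, z_high)  # -2.0 2.0
```

The sign carries the direction (left or right of the mean), and `abs(z_low)` gives the distance irrespective of direction.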
16. 16. <ul><li>Properties: - All properties of the Normal Curve</li></ul><ul><li>- Area under the curve is 1</li></ul><ul><li>- Mean, median and mode coincide and they are 0</li></ul><ul><li>- Standard deviation is 1</li></ul>[Figure: The Standard Normal Curve and areas within 1, 2 and 3 SDs of the mean]
17. 17. [Figure: Areas within 1 and 2 S.D.s of the mean, for (mean = 36, SD = 8) and (mean = 70, SD = 3)]
18. 18. <ul><li>The confidence level or reliability is the expected percentage of times that the actual values will fall within the stated precision limits.</li></ul><ul><li>Thus a 95% confidence interval (CI) means that there are 95 chances in 100 (or 0.95 in 1) that the sample results represent the true condition of the population within a specified precision range, against 5 chances in 100 (0.05 in 1) that they do not.</li></ul><ul><li>Precision is the range within which the answer may vary and still be accepted</li></ul><ul><li>The CI indicates the chance that the answer will fall within that range, and the significance level indicates the likelihood that the answer will fall outside that range</li></ul><ul><li>Remember that if the confidence level is 95%, the significance level is (100 - 95), i.e., 5%; if the confidence level is 99%, the significance level is (100 - 99), i.e., 1%</li></ul><ul><li>The area of the normal curve within the precision limits for the specified CI constitutes the acceptance zone, and the area of the curve outside these limits in either direction constitutes the rejection zone.</li></ul>
19. 19. <ul><li>CI = Mean ± Z × SE(Mean) = $\bar{X} \pm Z \cdot SE(\bar{X})$</li></ul><ul><li>95% CI = $\bar{X} \pm 1.96 \cdot SE(\bar{X})$</li></ul><ul><li>99% CI = $\bar{X} \pm 2.58 \cdot SE(\bar{X})$</li></ul>
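The confidence interval formulas can be sketched in Python as below. The slides do not define the standard error, so this example assumes the usual SE(mean) = SD / √n; the sample values (mean 82, SD 13.2, n = 100) are hypothetical and chosen only for illustration.

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of mean +/- z * SE, with SE = sd / sqrt(n) (assumed)."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se

# Hypothetical sample: mean 82, SD 13.2, 100 observations.
ci_95 = confidence_interval(82, 13.2, 100, z=1.96)
ci_99 = confidence_interval(82, 13.2, 100, z=2.58)
print("95% CI:", tuple(round(v, 2) for v in ci_95))
print("99% CI:", tuple(round(v, 2) for v in ci_99))
```

The 99% interval is wider than the 95% interval because a larger Z multiplier is needed to cover more of the curve.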
20. 20. <ul><li>Large sample: sample size > 30</li></ul><ul><li>Small sample: sample size < 30</li></ul><ul><li>Hypothesis:</li></ul><ul><li>a. Null (H0): assumes that there is no difference between two values such as population means or proportions</li></ul><ul><li>H0: Mean of population A = Mean of population B</li></ul><ul><li>µ1 = µ2 or P1 = P2</li></ul><ul><li>b. Alternative (H1): a hypothesis that differs from H0</li></ul><ul><li>H1: µ1 ≠ µ2 or µ1 > µ2 or µ1 < µ2</li></ul><ul><li>6. Sampling errors: a. Type 1 error, b. Type 2 error</li></ul>
21. 21. <ul><li>State the null hypothesis</li></ul><ul><li>State the alternative hypothesis</li></ul><ul><li>Decide whether to use a 1-tailed or 2-tailed test</li></ul><ul><li>Specify the level of significance (5% or 1%)</li></ul><ul><li>Select the appropriate test and follow the calculation for that type of test</li></ul><ul><li>Compare the calculated value with the theoretical value</li></ul><ul><li>If the calculated value is greater than the theoretical value, reject the null hypothesis; if it is smaller, accept it</li></ul><ul><li>Draw a conclusion on the basis of the above</li></ul>
22. 22. Tests of Significance, by type of data: <ul><li>Discrete (qualitative) data: non-parametric tests (chi-square, Fisher's exact, sign, Mann-Whitney)</li></ul><ul><li>Continuous data: parametric tests (Z-test, t-test, ANOVA)</li></ul>
23. 23. <ul><li>Conditions to apply the χ² test:</li></ul><ul><li>- Applicable to qualitative data obtained from a random sample</li></ul><ul><li>- Based on frequencies, not on parameters such as percentages, rates, ratios, mean or S.D.</li></ul><ul><li>- Expected frequency in each cell not less than 5</li></ul><ul><li>Applications of the χ² test:</li></ul><ul><li>- Comparison of proportions of two or more samples</li></ul><ul><li>- Comparison of an observed proportion with a hypothesized one (goodness of fit)</li></ul><ul><li>- Comparison of paired observations (McNemar χ² test)</li></ul><ul><li>- Trend χ² test</li></ul><ul><li>N.B.: Yates’ correction: when the expected frequency in any cell of the 2 × 2 table is less than 5, Yates’ correction (correction for continuity) is applied</li></ul>
24. 24. <ul><li>Step 1: Write down the null hypothesis</li></ul><ul><li>Step 2: Make a contingency table and calculate the expected frequencies</li></ul><ul><li>Expected frequency = (Row total × Column total) / Grand total</li></ul><ul><li>Step 3: Compute the value of χ²</li></ul><ul><li>χ² = sum of (observed value - expected value)² / expected value, i.e., $\chi^2 = \sum \frac{(O - E)^2}{E}$</li></ul><ul><li>Step 4: Find the degrees of freedom, d.f. = (r - 1)(c - 1), where r and c are the numbers of rows and columns</li></ul><ul><li>Step 5: Obtain the tabulated value under the column p = 0.05 or p = 0.01 of the χ² table</li></ul><ul><li>Step 6: Compare the calculated χ² with the table value. If the calculated value of χ² is greater than the table value, reject the null hypothesis; otherwise accept it.</li></ul><ul><li>Step 7: Write down the conclusion</li></ul>
25. 25. <ul><li>Problem: The cure rates of treatments A and B are 90% out of 100 patients and 70% out of 150 patients. Are treatments A and B equally effective?</li></ul><ul><li>H0: no difference in cure rate between treatments A and B</li></ul><ul><li>2 × 2 contingency table</li></ul><ul><li>Computation of the value of χ²</li></ul>

    Observed values:

    | Treatment | Cured | Not cured | Total |
    |---|---|---|---|
    | A | 90 | 10 | 100 |
    | B | 105 | 45 | 150 |
    | Total | 195 | 55 | 250 |
26. 26. Expected values:

    | Treatment | Cured | Not cured | Total |
    |---|---|---|---|
    | A | 78 | 22 | 100 |
    | B | 117 | 33 | 150 |
    | Total | 195 | 55 | 250 |

    $\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(90 - 78)^2}{78} + \frac{(10 - 22)^2}{22} + \frac{(105 - 117)^2}{117} + \frac{(45 - 33)^2}{33} = 13.99$

    Calculated value 13.99 > tabulated value 3.84, so the null hypothesis is rejected.

    Conclusion: Treatment A is more effective than treatment B.
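The expected frequencies and the χ² value for the treatment A versus B problem can be reproduced with a small Python sketch. The function name `chi_square` is an assumption for this example; the tabulated value 3.84 (p = 0.05, 1 degree of freedom) is the one quoted on the slide.

```python
def chi_square(observed):
    """Return (expected table, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected frequency = (row total x column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df

# Rows: treatment A, treatment B. Columns: cured, not cured.
observed = [[90, 10], [105, 45]]
expected, statistic, df = chi_square(observed)
print(expected)                 # [[78.0, 22.0], [117.0, 33.0]]
print(round(statistic, 2), df)  # 13.99 1
print("reject H0" if statistic > 3.84 else "accept H0")
```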
27. 27. <ul><li>A pharmaceutical company claimed that its new product can cure 80% of patients. On trial, 56 out of 80 patients (70%) were cured. Do you agree with the company that the cure rate is 80%?</li></ul>

    | Outcome with new drug | Cured | Not cured | Total |
    |---|---|---|---|
    | Observed value | 56 | 24 | 80 |
    | Hypothetical value | 64 | 16 | 80 |
    | Total | 120 | 40 | 160 |

    H0: efficacy is 80%. χ² = 5, which is > 3.84, so reject H0.
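The goodness-of-fit value of 5 can be verified in a few lines of Python. This is a sketch only; the function name is made up for the example, and the expected counts 64 and 16 are the hypothetical values from the slide's table (80% and 20% of 80 patients).

```python
def goodness_of_fit(observed, expected):
    """Chi-square statistic comparing observed counts with hypothesized counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed_counts = [56, 24]   # cured, not cured on trial
expected_counts = [64, 16]   # cured, not cured if the claimed 80% cure rate held
chi2 = goodness_of_fit(observed_counts, expected_counts)
print(chi2)  # 5.0
print("reject H0" if chi2 > 3.84 else "accept H0")
```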
28. 28. <ul><li>Comparison of:</li></ul><ul><li>Proportions of two or more samples</li></ul><ul><li>An observed proportion with a hypothesized one (goodness of fit)</li></ul><ul><li>Paired observations (McNemar test)</li></ul><ul><li>LIMITATIONS:</li></ul><ul><li>A. Yates’ correction is required if the expected value in any cell is less than 5: $\chi^2 = \sum \frac{(\lvert O - E \rvert - \frac{1}{2})^2}{E}$, or, for a 2 × 2 table with cells $a$, $b$, $c$, $d$ and total $N$, $\chi^2 = \frac{N (\lvert ad - bc \rvert - N/2)^2}{(a+b)(c+d)(a+c)(b+d)}$</li></ul><ul><li>B. In tables larger than 2 × 2, Yates’ correction is not applicable</li></ul><ul><li>C. It does not measure the strength of an association, only its presence or absence</li></ul><ul><li>D. A statistical finding of a relation does not indicate cause and effect</li></ul>
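The 2 × 2 form of Yates' correction can be sketched in Python as follows. The function name is an assumption for this example. It is applied here to the treatment A versus B table from the deck purely to show the arithmetic; that table has no expected value below 5, so the correction would not actually be required there.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for a 2 x 2 table with cells a, b / c, d."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Cells: A cured, A not cured, B cured, B not cured (illustrative use only).
corrected = yates_chi_square(90, 10, 105, 45)
print(round(corrected, 2))
```

The corrected value is smaller than the uncorrected 13.99, which is the expected direction: the correction makes the test more conservative.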
29. 29. <ul><li>Identify your objective</li></ul><ul><li>Collect sample data</li></ul><ul><li>Use a random procedure that avoids bias</li></ul><ul><li>Analyze the data and form conclusions</li></ul>
30. 30. Convenience Sampling - use results that are readily available
31. 31. Random Sampling - selection so that each member has an equal chance of being selected
32. 32. Systematic Sampling - select some starting point and then select every kth element in the population
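Systematic sampling is easy to express with list slicing. The sketch below is illustrative: the function name, the population of 20 numbered members, and the choice of k = 5 are all assumptions, not values from the slides.

```python
import random

def systematic_sample(population, k, start=None):
    """Pick a starting index below k (random if not given), then every kth element."""
    if start is None:
        start = random.randrange(k)
    return population[start::k]

# Hypothetical population of 20 numbered members.
population = list(range(1, 21))
sample = systematic_sample(population, k=5, start=2)
print(sample)  # [3, 8, 13, 18]
```

Leaving `start` unset picks the starting point at random, which is how the method is normally applied.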
33. 33. Stratified Sampling - subdivide the population into subgroups (strata) that share the same characteristic, then draw a sample from each stratum
34. 34. Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from the selected clusters
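Because stratified and cluster sampling are easily confused, a side-by-side Python sketch may help. The group names and members below are hypothetical, and both function names are assumptions made for this example.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw a few members from EVERY stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick a few clusters at random and keep ALL members of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population split into four groups of five members each.
groups = {
    "ward_1": ["a1", "a2", "a3", "a4", "a5"],
    "ward_2": ["b1", "b2", "b3", "b4", "b5"],
    "ward_3": ["c1", "c2", "c3", "c4", "c5"],
    "ward_4": ["d1", "d2", "d3", "d4", "d5"],
}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
stratified = stratified_sample(groups, per_stratum=2, rng=rng)
clustered = cluster_sample(groups, n_clusters=2, rng=rng)
print(stratified)
print(clustered)
```

The stratified result contains every group but only some members of each, while the cluster result contains only some groups but every member of those groups.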
35. 35. Definitions <ul><li>Sampling error: the difference between a sample result and the true population result; such an error results from chance sample fluctuations.</li></ul><ul><li>Nonsampling error: sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).</li></ul>
37. 37. <ul><li>When the null hypothesis is true but is still rejected, it is a Type 1 (α) error</li></ul><ul><li>When the null hypothesis is false but is still accepted, it is a Type 2 (β) error</li></ul><ul><li>Level of significance: the probability of committing a Type 1 error</li></ul><ul><li>Power of a test: the ability of the test to correctly reject H0 in favour of H1 when H0 is false. It equals 1 - β, where β is the probability of committing a Type 2 error.</li></ul>
38. 38. SAMPLING ERRORS

    | Population | Conclusion based on sample: null hypothesis rejected | Conclusion based on sample: null hypothesis accepted |
    |---|---|---|
    | Null hypothesis true | Type 1 error | Correct decision |
    | Null hypothesis false | Correct decision | Type 2 error |
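The link between the level of significance and the Type 1 error can be illustrated by simulation. The sketch below is not from the slides: it assumes a two-tailed Z-test with known S.D., draws repeated samples from a population where the null hypothesis is true, and counts how often the test rejects it anyway. All parameter values are hypothetical.

```python
import math
import random

def type1_error_rate(trials=4000, n=30, mu0=0.0, sigma=1.0, z_crit=1.96, seed=1):
    """Estimate how often a two-tailed Z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from a population whose true mean really is mu0.
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type1_error_rate()
print(rate)  # expected to land near 0.05, the 5% level of significance
```

With the critical value 1.96, roughly 5 in every 100 true null hypotheses are rejected by chance alone, which is what a 5% level of significance means.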