Parametric tests seminar



  3. INTRODUCTION
     • Statistics: the science of data and the study of uncertainty.
     • Biostatistics: statistics applied to data from medicine and the biological sciences (statistics is likewise applied in business, education, psychology, agriculture, economics, ...).
     • Types: descriptive statistics and inferential statistics.
  4. 1. Descriptive statistics: give an overview of the attributes of a data set. These include measures of central tendency (mean, median and mode, often shown with frequency histograms) and dispersion (range, variance and standard deviation).
     2. Inferential statistics: provide measures of how well the data support a hypothesis and whether the results can be generalized beyond what was tested (significance tests).
  5. Data: observations recorded during research.
     Types of data:
     1. Nominal data: synonymous with categorical data; observations are assigned names or categories based on their characteristics, without any ranking between categories. Ex. male/female, yes/no, death/survival.
  6. 2. Ordinal data: ordered or graded data, expressed as scores or ranks. Ex. pain graded as mild, moderate and severe.
     3. Interval data: there is an equal and definite interval between two measurements; the data can be continuous or discrete. Ex. weight expressed as 20, 21, 22, 23, 24: the interval between 20 and 21 is the same as that between 23 and 24.
  7. Measures of Central Tendency:
     • In a normal distribution, the mean and median are the same.
     • If the median and mean differ appreciably, this indicates that the data are not normally distributed (the distribution is skewed).
     • The mode is of little if any practical use.
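To illustrate the first two points, here is a minimal sketch using Python's standard `statistics` module. The data set is made up for illustration (it is not from the seminar) and is deliberately skewed by one extreme value, so the mean is pulled away from the median.

```python
import statistics

# Hypothetical right-skewed sample (one extreme value); not from the slides.
values = [2, 3, 3, 4, 18]

mean_value = statistics.mean(values)      # arithmetic mean
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
```

Here the mean is twice the median, which is the kind of gap that suggests the data are not normally distributed.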
  8. MEASURES OF VARIABILITY
     Range: the interval between the highest and lowest observations.
     • Ex. The diastolic BP of 5 individuals is 90, 80, 78, 84, 98. The highest observation is 98 and the lowest is 78, so the range is 98 − 78 = 20.
  9. Standard deviation (SD): defined as the positive square root of the arithmetic mean of the squares of the deviations taken from the arithmetic mean.
     • It describes the variability of the observations about the mean.
     Variance: the average squared deviation around the mean.
     $\text{variance} = \dfrac{\sum (X - \bar{X})^2}{n}$ or $\dfrac{\sum (X - \bar{X})^2}{n - 1}$
     $\text{SD} = \sqrt{\dfrac{\text{Sum of }(\text{Individual value} - \text{Mean value})^2}{\text{Number of values}}}$
  10. Coefficient of Variation (CV): the standard deviation (SD) expressed as a percentage of the mean.
      CV = (SD / mean) × 100
      • It is dimensionless (independent of any unit of measurement).
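The measures of variability above can be worked through on the diastolic BP values from the range example. This is a small sketch with Python's standard `statistics` module. Both variance forms (divisor n and divisor n − 1) are shown, and the choice of the n − 1 form for the SD and CV is an assumption made here, not something the slides specify.

```python
import statistics

# Diastolic BP of 5 individuals, taken from the range example above.
diastolic_bp = [90, 80, 78, 84, 98]

data_range = max(diastolic_bp) - min(diastolic_bp)   # highest minus lowest
mean_bp = statistics.mean(diastolic_bp)

# Variance with divisor n (population form) and n - 1 (sample form).
variance_n = statistics.pvariance(diastolic_bp)
variance_n_minus_1 = statistics.variance(diastolic_bp)

# ASSUMPTION: the sample (n - 1) form is used for the SD and the CV.
sd = statistics.stdev(diastolic_bp)
cv_percent = sd / mean_bp * 100

print("range:", data_range)
print("mean:", mean_bp)
print("variance (n):", variance_n, " variance (n - 1):", variance_n_minus_1)
print("SD:", round(sd, 3), " CV %:", round(cv_percent, 2))
```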
  11. Correlation coefficient: measures the relationship between two variables.
      • Denoted by ‘r’; it is a unitless quantity, a pure number.
      • Its values lie between −1 and +1.
      • If the variables are not correlated, r will be zero.
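As an illustration, Pearson's r can be computed directly from its definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and the data below are hypothetical, chosen only to show the two extremes and the zero case.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect positive, a perfect negative and a zero correlation.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])
r_negative = pearson_r(x, [10, 8, 6, 4, 2])
r_zero = pearson_r([1, 2, 3], [1, 0, 1])

print(r_positive, r_negative, r_zero)
```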
  12. PROBABILITY DISTRIBUTIONS
      1. Binomial Distribution: the conditions to be fulfilled are:
      i. There is a fixed number (n) of trials;
      ii. Only two outcomes, ‘success’ and ‘failure’, are possible at each trial;
      iii. The trials are independent;
      iv. There is a constant probability (π) of success at each trial;
      v. The variable is the total number of successes in n trials.
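Under these conditions the probability of exactly k successes in n trials is C(n, k) π^k (1 − π)^(n − k). A minimal sketch follows. The function name `binomial_pmf` and the example values (4 trials, π = 0.5) are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 4 trials with success probability 0.5.
n_trials, p_success = 4, 0.5
distribution = [binomial_pmf(k, n_trials, p_success) for k in range(n_trials + 1)]

for k, prob in enumerate(distribution):
    print(f"P(successes = {k}) = {prob}")
```

The probabilities over k = 0, ..., n sum to 1, since the number of successes must take one of those values.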
  13. 2. Poisson Distribution:
      • It applies to situations in which the number of times an event occurs is meaningful and can be counted, but the number of times the event did not occur is meaningless or cannot be counted.
      • It is discrete and has an infinite number of possible values.
      • It has a single parameter, λ.
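The Poisson probability of observing exactly k events is λ^k e^(−λ) / k!. The sketch below uses a hypothetical λ = 2 and truncates the infinite range of k at 50 terms, which is an arbitrary cut-off chosen here because the remaining terms are negligible for this λ.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 2.0  # hypothetical mean number of events
# k has no upper limit; 50 terms is an arbitrary truncation for illustration.
probs = [poisson_pmf(k, lam) for k in range(50)]

total_probability = sum(probs)
mean_count = sum(k * p for k, p in enumerate(probs))

print("sum of probabilities:", total_probability)
print("mean of distribution:", mean_count)
```

The mean of the distribution recovers λ, the single parameter.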
  14. 3. Gaussian or Normal Distribution: its important characteristics are:
      i. The shape of the distribution resembles a bell and is symmetric around the midpoint;
      ii. At the centre of the distribution, where it peaks, the mean, median and mode coincide;
  15. iii. The area under the curve between any two points, which corresponds to the proportion of observations between any two values of the variate, can be found from the relationship between the mean and the standard deviation.
      iv. Its parameters are the mean (μ) and the SD (σ).
  16. • Standard Error of the Mean (SEM): the square root of the variance of the sample means.
      SE of sample mean = $SD / \sqrt{n}$
      SE of sample proportion = $\sqrt{pq / n}$
      • Applications of SEM:
      i. To determine whether or not a sample is drawn from the same population when its mean is known.
      ii. To work out the limits of desired confidence within which the population mean should lie.
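Both standard errors can be sketched in a few lines. The mean example reuses the five diastolic BP values from the range example. The proportion (p = 0.4 in a sample of 100) is hypothetical, and q is taken as 1 − p, the usual convention.

```python
import math
import statistics

# Diastolic BP values from the range example above.
diastolic_bp = [90, 80, 78, 84, 98]
n = len(diastolic_bp)

# ASSUMPTION: SD is the sample SD (divisor n - 1).
sd = statistics.stdev(diastolic_bp)
se_mean = sd / math.sqrt(n)

# Hypothetical proportion: 40 of 100 subjects show the characteristic.
p = 0.4
q = 1 - p          # q is the complement of p
n_prop = 100
se_proportion = math.sqrt(p * q / n_prop)

print("SE of mean:", round(se_mean, 3))
print("SE of proportion:", round(se_proportion, 4))
```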
  17. Confidence Interval or Fiducial Limits:
      • Confidence limits are the two extremes of a range within which the population mean is expected to lie with a stated confidence (commonly 95%).
      Lower confidence limit = mean − (t0.05 × SEM)
      Upper confidence limit = mean + (t0.05 × SEM)
      • An important difference between the ‘p’ value and the confidence interval: the confidence interval shows the size and precision of the estimated effect, which helps in judging clinical significance, whereas the ‘p’ value indicates statistical significance.
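As a worked sketch of the two limits, the code below again uses the five diastolic BP values from the range example. The critical value t0.05 = 2.776 (two-tailed, 4 degrees of freedom) is taken from a standard t table and hard-coded, since Python's standard library has no t distribution. The constant name is an assumption of this example.

```python
import math
import statistics

# Diastolic BP values from the range example above.
diastolic_bp = [90, 80, 78, 84, 98]
n = len(diastolic_bp)

mean_bp = statistics.mean(diastolic_bp)
sem = statistics.stdev(diastolic_bp) / math.sqrt(n)  # SD / sqrt(n)

# ASSUMPTION: two-tailed 5% critical t value for n - 1 = 4 degrees of
# freedom, read from a standard t table.
T_CRIT_DF4 = 2.776

lower_limit = mean_bp - T_CRIT_DF4 * sem
upper_limit = mean_bp + T_CRIT_DF4 * sem

print(f"95% confidence limits: {lower_limit:.2f} to {upper_limit:.2f}")
```

With only five observations the interval is wide. A larger sample reduces both the SEM and the critical t value.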
  18. Standard Normal Distribution
      Mean ± 1 SD → encompasses 68% of observations
      Mean ± 2 SD → encompasses 95% of observations
      Mean ± 3 SD → encompasses 99.7% of observations
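These percentages are rounded areas under the standard normal curve, and they can be checked with `statistics.NormalDist` from the Python standard library. The helper name `area_within` is an assumption of this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Proportion of observations within k SD on either side of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

coverage = {k: area_within(k) for k in (1, 2, 3)}

for k, area in coverage.items():
    print(f"mean ± {k} SD covers {area * 100:.2f}% of observations")
```

The same `cdf` difference gives the area between any two values of a normal variate once its mean and SD are known.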
  19. Statistical Hypothesis:
      • Statistical hypotheses are hypotheses stated in such a way that they may be evaluated by appropriate statistical techniques.
      • There are two types of hypotheses in hypothesis testing:
      • Null hypothesis H0: the hypothesis which assumes that there is no difference between two values. H0: μ1 = μ2
      • Alternative hypothesis HA: the hypothesis that differs from the null hypothesis. HA: μ1 ≠ μ2, or μ1 > μ2, or μ1 < μ2
  20. Hypothesis Errors:
      Type-I Error:
      • Finding a difference when no such difference actually exists, i.e. rejecting a null hypothesis that is true.
      • Ex. acceptance of an inactive compound as active.
      • Also known as α error or false positive.
  21. Type-II Error:
      • Failing to detect a difference when such a difference actually exists, i.e. not rejecting a null hypothesis that is false; this results in rejection of an active compound as inactive.
      • Also called β error or false negative.
  22. Level of significance (l.o.s.):
      • The probability of committing a type I error.
      • Denoted by α.
      • An l.o.s. of 0.05 (5%) means that the risk of making a wrong decision is only 5 out of 100 cases, i.e. we are 95% confident.
      Power of the test:
      • The probability of committing a type II error is denoted by β, and 1 − β is the power of the test.
      • Power is the probability of rejecting H0 when H0 is false, i.e. of making the correct decision.
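To make the relation between α, β and power concrete, here is a sketch for one simple case: a one-sided z-test of H0: μ = μ0 against HA: μ > μ0 with known σ. The function name and all numbers (μ0 = 50, true mean 52, σ = 10, n = 100) are hypothetical. The slides do not give a power formula, so this is one standard textbook form rather than the seminar's own.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha):
    """Power of a one-sided z-test of H0: mu = mu0 against HA: mu > mu0
    when the true mean is mu1 and sigma is known."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)              # cut-off fixed by the l.o.s.
    shift = (mu1 - mu0) / (sigma / sqrt(n))    # true difference in SE units
    return 1 - z.cdf(z_crit - shift)           # P(reject H0 | H0 is false)

alpha = 0.05
power = power_one_sided_z(mu0=50, mu1=52, sigma=10, n=100, alpha=alpha)
beta = 1 - power   # probability of a type II error

print(f"alpha = {alpha}, beta = {beta:.3f}, power = {power:.3f}")
```

Raising n or accepting a larger α increases the power, so β falls.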
  23. • The p-value is defined as the smallest value of α for which the null hypothesis can be rejected.
      • If the p-value is less than α, we reject the null hypothesis (p < α).
      • If the p-value is greater than or equal to α, we do not reject the null hypothesis (p ≥ α).
  24. Critical Region
      One Tailed Test:
      • The rejection region lies in one or the other tail of the distribution.
      • The difference could only be there in one direction.
      • Ex. Englishmen are taller than Indian men.
  25. Two Tailed Test:
      • The rejection region is split between the two sides or tails of the distribution.
      • The difference could be in either direction.
      • Ex. Comparative study of drug ‘X’ with atenolol for antihypertensive property.
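The practical effect of choosing one or two tails can be seen by computing both p-values for the same test statistic and applying the rule "reject H0 if p < α". The sketch assumes a z statistic and a hypothetical observed value of 1.8. The one-tailed value further assumes the observed difference lies in the hypothesized direction.

```python
from statistics import NormalDist

def p_values(z_stat):
    """Return (one-tailed, two-tailed) p-values for a z statistic.
    The one-tailed value assumes the difference is in the expected direction."""
    tail_area = 1 - NormalDist().cdf(abs(z_stat))
    return tail_area, 2 * tail_area

alpha = 0.05
z_observed = 1.8   # hypothetical test statistic

p_one_tailed, p_two_tailed = p_values(z_observed)
reject_one_tailed = p_one_tailed < alpha
reject_two_tailed = p_two_tailed < alpha

print(f"one-tailed p = {p_one_tailed:.4f}, reject H0: {reject_one_tailed}")
print(f"two-tailed p = {p_two_tailed:.4f}, reject H0: {reject_two_tailed}")
```

The same data lead to rejection under the one-tailed test but not under the two-tailed test, which is why the choice of tails should be made before the data are examined.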
  27. SAMPLE SIZE:
      • Large sample: the sample size is more than 30.
      • Small sample: the sample size is less than or equal to 30.
      • Many statistical tests are based upon the assumption that the data are sampled from a Gaussian distribution.
      • Procedures for testing hypotheses about parameters in a population described by a specified distributional form (such as the normal distribution) are called parametric tests.
  28. Types of Parametric tests
      1. Large sample tests
         → Z-test
      2. Small sample tests
         → t-test
            * Independent/unpaired t-test
            * Paired t-test
         ANOVA (Analysis of variance)
            * One way ANOVA
            * Two way ANOVA
  29. Z-Test:
      • A z-test is used for testing the mean of a population against a standard, or for comparing the means of two populations, with large (n > 30) samples, whether or not the population standard deviation is known.
  30. • It is also used for testing the proportion of some characteristic against a standard proportion, or for comparing the proportions of two populations.
      Ex. Comparing the average engineering salaries of men versus women.
      Ex. Comparing the fraction of defectives from two production lines.
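The two comparisons in the examples can be sketched as two-sample z statistics. The formulas used are the usual large-sample ones (unpooled standard error for means, pooled proportion for proportions). All function names and figures below are hypothetical, and the salary and defect numbers are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_means(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two large-sample means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

def two_sample_z_proportions(x1, n1, x2, n2):
    """z statistic for the difference between two proportions,
    using the pooled proportion under H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

def two_tailed_p(z_stat):
    """Two-tailed p-value for a z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Hypothetical salaries (in thousands): mean, SD and size of each sample.
z_salary = two_sample_z_means(75, 10, 50, 70, 12, 60)
# Hypothetical defect counts: 30 of 200 on line A, 15 of 200 on line B.
z_defects = two_sample_z_proportions(30, 200, 15, 200)

print(f"means:       z = {z_salary:.3f}, p = {two_tailed_p(z_salary):.4f}")
print(f"proportions: z = {z_defects:.3f}, p = {two_tailed_p(z_defects):.4f}")
```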
  31. t-Test: derived by W. S. Gosset in 1908.
      • Properties of the t distribution:
      i. It has mean 0
      ii. It has variance greater than one
      iii. It is a bell-shaped distribution, symmetrical about the mean
      • Assumptions for the t test:
      i. The sample must be random and the observations independent
      ii. The population standard deviation is not known
      iii. The population is normally distributed
  32. Uses of the t test: to test the significance of
      i. The mean of a sample
      ii. The difference between means, i.e. to compare two samples
      iii. A correlation coefficient
      Types of t test:
      a. Paired t test
      b. Unpaired t test
  33. Paired t test:
      • Used when the data consist of a sample of matched pairs of similar units, or one group of units that has been tested twice (a "repeated measures" t-test).
      • Ex. Subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure-lowering medication.
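For the blood pressure example, the paired t statistic is the mean of the within-subject differences divided by its standard error, with n − 1 degrees of freedom. The readings below are invented for illustration, and the function name `paired_t` is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched observations."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)   # SD of differences / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical systolic BP of the same five subjects before and after treatment.
bp_before = [150, 160, 155, 165, 170]
bp_after = [140, 152, 150, 155, 165]

t_paired, df_paired = paired_t(bp_before, bp_after)
print(f"t = {t_paired:.3f} with {df_paired} degrees of freedom")
```

The statistic is then compared with the tabulated t value for the chosen level of significance and these degrees of freedom.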
  34. Unpaired t test:
      • Used when two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared.
      • Ex:
      1. Compare the heights of girls and boys.
      2. Compare two stress-reduction interventions, where one group practised mindfulness meditation while the other learned progressive muscle relaxation.
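A sketch of the unpaired t statistic for the height example follows. It uses the pooled-variance form, which assumes the two populations have equal variances. The slides do not state which form is intended, so treat that as an assumption. The heights and the function name `unpaired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two
    independent samples (ASSUMPTION: equal population variances)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * variance(sample1)
                       + (n2 - 1) * variance(sample2)) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / standard_error, df

# Hypothetical heights (cm) of two independent groups of children.
boys = [150, 152, 148, 155, 150]
girls = [145, 147, 150, 146, 147]

t_unpaired, df_unpaired = unpaired_t(boys, girls)
print(f"t = {t_unpaired:.3f} with {df_unpaired} degrees of freedom")
```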
  35. ANALYSIS OF VARIANCE (ANOVA):
      • Analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, used to analyze the differences between group means (such as the "variation" among and between groups).
      • Compares multiple groups at one time.
      • Developed by R. A. Fisher.
      • Two types:
      i. One way ANOVA
      ii. Two way ANOVA
  36. One Way ANOVA:
      It compares three or more unmatched groups when the data are categorized in one way.
      Ex.
      1. Compare a control group with three different doses of aspirin in rats.
      2. Effect of vitamin C supplementation in each subject before, during and after the treatment. (Strictly, repeated measurements on the same subjects are matched data and call for a repeated-measures design.)
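The one-way ANOVA F ratio is the between-group mean square divided by the within-group mean square. The sketch below applies it to a layout like the first example (a control group and three dose groups). The response values are invented, and the function name `one_way_anova_f` is an assumption.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic with its (between, within) degrees of freedom
    for a list of independent groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical responses: control and three doses, three rats per group.
control, low_dose, mid_dose, high_dose = [5, 6, 7], [4, 5, 6], [3, 4, 5], [1, 2, 3]

f_stat, df_between, df_within = one_way_anova_f(
    [control, low_dose, mid_dose, high_dose])
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

The F value is then compared with the tabulated F for these degrees of freedom at the chosen level of significance.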
  37. Two Way ANOVA:
      • Used to determine the effect of two nominal predictor variables on a continuous outcome variable.
      • A two-way ANOVA analyzes the effect of each independent variable on the outcome, along with the interaction between them.
  38. Difference between one & two way ANOVA
      • A one-way ANOVA could be used if, for example, we want to determine whether there is a difference in the mean height of stalks grown from three different types of seeds. Since there are more than two means to compare and only one factor (type of seed) that could be making the heights different, we can use a one-way ANOVA.
  39. • Now, if we take these three different types of seeds and add the possibility that three different types of fertilizer are used, then we would want to use a two-way ANOVA.
      • The mean height of the stalks could be different for a combination of several reasons:
  40. • The type of seed could cause the change, the type of fertilizer could cause the change, and/or there could be an interaction between the type of seed and the type of fertilizer.
      • There are two factors here (type of seed and type of fertilizer), so, if the assumptions hold, we can use a two-way ANOVA.
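The seed and fertilizer example can be sketched as a balanced two-way ANOVA with replication, which yields one F ratio per factor and one for the interaction. The code assumes every seed and fertilizer combination has the same number of plants (at least two). To keep the arithmetic easy to follow it uses a hypothetical 2 × 2 layout with invented heights rather than the 3 × 3 design described above.

```python
from statistics import mean

def two_way_anova(cells):
    """F ratios for factor A, factor B and their interaction.
    cells[i][j] lists the replicate observations for level i of A and
    level j of B (ASSUMPTION: balanced design, at least 2 replicates)."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * r * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * r * sum((m - grand) ** 2 for m in b_means)
    ss_ab = r * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(a) for j in range(b) for x in cells[i][j])

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "F_A": (ss_a / (a - 1)) / ms_error,
        "F_B": (ss_b / (b - 1)) / ms_error,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }

# Hypothetical stalk heights: rows are seed types, columns are fertilizers,
# and each cell holds two plants.
heights = [
    [[1, 3], [4, 6]],    # seed type 1 with fertilizer 1, fertilizer 2
    [[5, 7], [8, 10]],   # seed type 2 with fertilizer 1, fertilizer 2
]

f_ratios = two_way_anova(heights)
print(f_ratios)
```

In this invented data set the two factors act additively, so the interaction F ratio is zero while both main-effect ratios are large.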
  41. Summary of parametric tests applied for different types of data

      Sl no   Type of group                                   Parametric test
      1.      Comparison of two paired groups                 Paired ‘t’ test
      2.      Comparison of two unpaired groups               Unpaired ‘t’ test
      3.      Comparison of three or more matched groups      Two way ANOVA
      4.      Comparison of three or more unmatched groups    One way ANOVA
      5.      Correlation between two variables               Pearson correlation