Biostatistics iii


  1. 1. Descriptive Methods for Categorical Data Mahmoud Alhussmi, D.Sc., PhD
  2. 2. Topics
     • Proportions
     • Rates
       – Change rates
       – Measures of morbidity and mortality
       – Standardization of rates
     • Ratios
       – Relative risk
       – Odds ratio
     March 28, 2013
  3. 3. Categorical Data
     Data that can be classified as belonging to one of a distinct number of categories.
       – Binary: data can be classified into one of 2 possible categories (yes/no, positive/negative)
       – Ordinal: data can be classified into categories that have a natural ordering (e.g., levels of pain: none, moderate, intense)
       – Nominal: data can be classified into more than 2 categories with no natural ordering (e.g., race: Arab, African, and other)
  4. 4. Data examples
     • What type of data would result from these questions?
       – How old are you? ____________
       – How old are you?
         A. under 18
         B. 19-35
         C. 36-49
         D. over 50
  5. 5. Proportions
     • Numbers by themselves may be misleading: they are on different scales and need to be reduced to a standard basis in order to compare them.
     • We most frequently use proportions: that is, the fraction of items that satisfy some property, such as having a disease or being exposed to a dangerous chemical.
     • "Proportions" are the same thing as fractions or percentages. In every case you need to know what you are taking a proportion of: that is, what is the DENOMINATOR in the proportion.
     $p = \frac{x}{n}$, $\text{percent} = \frac{x}{n}(100)$
  6. 6. Proportions and Probabilities
     • We often interpret proportions as probabilities. If the proportion with a disease is 1/10 then we also say that the probability of getting the disease is 1/10, or 1 in 10.
     • Proportions are usually quoted for samples; probabilities are almost always quoted for populations.
  7. 7. Workers Example
     Smoking   Workers   Cases   Controls
     No        Yes          11         35
     No        No           50        203
     Yes       Yes          84         45
     Yes       No          313        270
     • For the cases (among smokers; exposure = Workers: Yes):
       – Proportion of exposure = 84/397 = 0.212 or 21.2%
     • For the controls (among smokers):
       – Proportion of exposure = 45/315 = 0.143 or 14.3%
  8. 8. Prevalence
     Disease prevalence = the proportion of people with a given disease at a given time.
     $\text{disease prevalence} = \frac{\text{Number of diseased persons at a given time}}{\text{Total number of persons examined at that time}}$
     Prevalence is usually quoted per 100,000 people, so the above proportion should be multiplied by 100,000.
  9. 9. Interpretation
     $\text{Prevalence} = \frac{\text{Cases (old + new) at time } t}{\text{Total}}$
     Problem of exposure; consequently, not a comparable measurement.
     Old cases reflect the duration of the disease.
     New cases reflect the speed of the disease.
  10. 10. Screening Tests
     • Through screening tests people are classified as healthy or as falling into one or more disease categories.
     • These tests are not 100% accurate and therefore misclassification is unavoidable.
     • There are 2 proportions that are used to evaluate these types of diagnostic procedures.
  11. 11. Screening Tests
     [Figure: the general population, showing the diseased group and the group with positive test results]
  12. 12. Sensitivity and Specificity
     • Sensitivity and specificity are terms used to describe the effectiveness of screening tests. They describe how good a test is in two ways: correctly identifying people who have the disease (avoiding false negatives) and correctly identifying people who do not (avoiding false positives).
     • Sensitivity is the proportion of diseased who screen positive for the disease
     • Specificity is the proportion of healthy who screen healthy
  13. 13. Sensitivity and Specificity
                      Condition Present      Condition Absent
     Test Positive    True Positive (TP)     False Positive (FP)
     Test Negative    False Negative (FN)    True Negative (TN)
     • Test sensitivity (Sn) is defined as the probability that the test is positive when given to a group of patients who have the disease.
       – Sn = (TP/(TP+FN)) × 100
       – It can be viewed as 1 minus the false negative rate.
     • The specificity (Sp) of a screening test is defined as the probability that the test will be negative among patients who do not have the disease.
       – Sp = (TN/(TN+FP)) × 100
       – It can be understood as 1 minus the false positive rate.
  14. 14. Positive & Negative Predictive Values
     • The positive predictive value (PPV) of a test is the probability that a patient who tested positive for the disease actually has the disease. PPV = (TP/(TP+FP)) × 100.
     • The negative predictive value (NPV) of a test is the probability that a patient who tested negative for a disease does not have the disease. NPV = (TN/(TN+FN)) × 100.
  15. 15. The Efficiency• The efficiency (EFF) of a test is the probability that the test result and the diagnosis agree.• It is calculated as: EFF = ((TP+TN)/(TP+TN+FP+FN)) X 100
  16. 16. Example
     • A cytological test was undertaken to screen women for cervical cancer.
                      Actually Positive    Actually Negative    Total
     Test Positive    TP = 154             FP = 225                379
     Test Negative    FN = 362             TN = 23,362          23,724
     Total            TP+FN = 516          FP+TN = 23,587
     • Sensitivity = ?
     • Specificity = ?
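The screening measures defined on the preceding slides can be checked on this example with a short sketch. It takes the four cell counts as labelled on the slide (TP = 154, FP = 225, FN = 362, TN = 23,362); the function name `screening_metrics` is a hypothetical helper, not something from the deck.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return the screening-test proportions defined on the slides (as fractions)."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # diseased who test positive
        "specificity": tn / (tn + fp),   # healthy who test negative
        "ppv": tp / (tp + fp),           # positives who are truly diseased
        "npv": tn / (tn + fn),           # negatives who are truly healthy
        "efficiency": (tp + tn) / total, # test result and diagnosis agree
    }


# Cell counts as labelled in the cervical cancer example.
metrics = screening_metrics(tp=154, fp=225, fn=362, tn=23362)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts the test misses most true cases while almost never flagging a healthy woman, which is why sensitivity and specificity need to be read together.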
  17. 17. Displaying Proportions
     • Types of charts that can be used:
       – Histogram
       – Pie Chart
       – Line Graph
     • BEWARE of the type of display you use: some charts are better at displaying certain types of data than others.
  18. 18. Displaying Proportions
     [Figure: chart "Distribution of Race in Sudan" (African 70%, Arab 18%, Mixed 8%, Other 4%)]
     [Figure: bar chart "Percent of Children Living in Crack/cocaine Households" by group (Black, White, American Indian, Other)]
  19. 19. Displaying Proportions
     [Figure: two charts titled "Distribution of Race in Sudan" showing the same categories (African, Arab, foreigners, others), one of them a bar chart with an axis from 0 to 80]
  20. 20. Displaying Proportions
     Cause of Death              # of Deaths    Proportion of Deaths
     Heart Disease                    12,278    0.38
     Cancer                            6,448    0.20
     Cerebrovascular Disease           3,958    0.12
     Accidents                         1,814    0.06
     Other                             8,088    0.25
     [Figure: pie chart "Causes of Death" (Heart Disease, Cancer, Cerebrovascular Disease, Accidents, Other)]
  21. 21. Rates
     • The term rate is often used interchangeably with the term proportion, although sometimes it refers to a quantity of a very different nature.
     • Types of rates we will cover:
       – Incidence rate
       – Change rates
       – Death rate
       – Follow-up death rate
  22. 22. Calculation of Incidence Rate
  23. 23. Definition
     The incidence rate describes the production of new cases in a population. It measures the number of cases per unit of time, i.e., the average speed at which new cases appear in a given population.
     There are three measures of incidence:
     1. Incidence rate (= new cases / time of participation)
     2. Instantaneous incidence
     3. Cumulative incidence
  24. 24. In an incidence study we are interested in the occurrence of an event (disease) over time. Such a study follows up each subject and records the moment (ti) at which each event (disease) occurs.
     The problem with incidence data is that some observations are incomplete: some subjects are still not affected at the moment of analysis.
  25. 25. To study the incidence rate we should know the following information for each subject:
     Date of origin: the date on which the individual enters the study.
     Date of last information: the most recent date on which we received information about the status of the subject. If the subject is affected, the date of last information is the date of getting the disease.
  26. 26. Duration of follow-up: the time elapsed between the date of origin and the date of last information.
     Date of point (cutoff date): the date on which we decide to stop collecting information about subjects.
     Loss of view (lost to follow-up): a subject whose status we do not know at the date of point is called a loss of view.
  27. 27. Time of participation (ti)
     [Figure: timeline from time 0 showing the time of participation for an affected subject (up to the date of disease, t1, t2, t3, …), the time of participation for a loss of view (up to the date of last information), and the time not considered between the date of last information and the date of point]
  28. 28. Example
     The following example follows 30 individuals between 1982 and 1988 for the disease D.
     Individual #   Date of origin   Date of disease   Date of last information   Time of participation (months)
      1             11-82            1-84              12-88                      14
      2             11-82            10-87             12-88                      59
      3             6-82             2-86              6-87                       44
      4             11-82            11-86             12-88                      36
      5             11-82            3-85              12-88                      28
      6             11-82            1-88              12-88                      62
      7             6-83             3-84              6-85                        9
      8             1-83             12-86             12-87                      47
      9             1-84             1-87              12-88                      36
     10             11-82            10-85             12-88                      35
     11             12-82            8-86              12-88                      44
     12             1-83             6-83              7-85                        5
     13             6-83             11-87             7-88                       53
     14             11-82            8-84              2-87                       21
  29. 29. (Example, continued)
     Individual #   Date of origin   Date of disease   Date of last information   Time of participation (months)
     15             1-83             3-86              12-88                      38
     16             6-83             5-87              12-88                      47
     17             5-83             1-84              12-88                       8
     18             11-83            6-88              12-88                      55
     19             6-83             5-86              12-87                      35
     20             3-83             2-88              12-88                      59
     21             4-83                               12-88                      68
     22             11-82                              11-85                      36
     23             1-83                               12-88                      71
     24             6-83                               12-88                      66
     25             11-82                              11-86                      48
     26             6-82                               12-88                      78
     27             12-83                              12-88                      60
     28             11-82                              12-88                      73
     29             6-84                               12-88                      54
     30             1-83                               1-84                       12
     Time of participation = 1301 months
     Number of cases = 20
     Rate of incidence = 20/1301 = 0.015, or 1.5 cases per 100 person-months.
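As a check on the example, the incidence rate can be recomputed from the listed participation times. This is a minimal sketch: the two lists simply transcribe the "Time of participation" column (cases are individuals 1-20, non-cases 21-30), and the variable names are illustrative. The listed times sum to 1301 months, which reproduces the stated rate of 0.015.

```python
# Participation times in months, transcribed from the example table.
case_months = [14, 59, 44, 36, 28, 62, 9, 47, 36, 35,
               44, 5, 53, 21, 38, 47, 8, 55, 35, 59]      # individuals 1-20 (diseased)
non_case_months = [68, 36, 71, 66, 48, 78, 60, 73, 54, 12]  # individuals 21-30

total_months = sum(case_months) + sum(non_case_months)
n_cases = len(case_months)

# Incidence rate = new cases / total time of participation.
incidence_rate = n_cases / total_months

print(f"Total time of participation: {total_months} person-months")
print(f"Incidence rate: {incidence_rate:.3f} cases per person-month")
print(f"              = {100 * incidence_rate:.1f} cases per 100 person-months")
```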
  30. 30. Sometimes it is difficult to know the exact date of origin of each case, or even the duration of follow-up. This always happens when the population under study is open. In this case, what we do know is:
     1. The number of new cases, m, but not their exact time of participation
     2. The total population size from the beginning to the end of the study, N
     The previous method of calculation is impossible because too much information is missing.
     We therefore assume that the subjects who enter the study, leave it, or become cases are distributed uniformly over the follow-up period, so that each of these subjects contributes on average half of the study's follow-up period, Δt/2.
     The calculation is as follows:
     Item                        Number   Time of participation
     Not diseased                N        N·Δt
     Entered during the study    Ne       Ne·Δt/2
     Left during the study       Ns       Ns·Δt/2
     Diseased                    m        m·Δt/2
     $IR = \frac{m}{\frac{\Delta t}{2}\,(2N + N_e + N_s + m)}$
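The open-population approximation can be sketched as a small function. The function name and the numbers in the usage example are hypothetical and are only there to show the arithmetic; the formula is the one on the slide, with total participation time N·Δt + (Ne + Ns + m)·Δt/2.

```python
def open_population_incidence(m, n, n_enter, n_leave, dt):
    """Approximate incidence rate for an open population.

    m        new cases during the study
    n        subjects followed for the whole period without disease
    n_enter  subjects who entered during the study
    n_leave  subjects who left during the study
    dt       length of the study period (any time unit)
    """
    # Full-period subjects contribute dt each; everyone else is assumed
    # to contribute half the period on average (uniform distribution).
    person_time = n * dt + (n_enter + n_leave + m) * dt / 2
    return m / person_time


# Hypothetical numbers, for illustration only.
rate = open_population_incidence(m=20, n=1000, n_enter=100, n_leave=80, dt=5)
print(f"Incidence rate: {rate:.5f} cases per person-year")
```

Writing the denominator as a sum of contributions makes it easy to see that it equals (Δt/2)(2N + Ne + Ns + m).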
  31. 31. Change Rates
     • These types of rates are used to describe changes after a certain period of time.
       $\text{change rate (\%)} = \frac{\text{new value} - \text{old value}}{\text{old value}} \times 100$
     • Example: A total of 35,238 new AIDS cases were reported in 1989 compared to 32,196 reported during 1988.
       – The change rate for new AIDS cases: $\frac{35{,}238 - 32{,}196}{32{,}196} \times 100 = 9.4\%$
  32. 32. Measures of Morbidity and Mortality
     $\text{crude death rate} = \frac{\text{\# deaths in a calendar year}}{\text{the population in that year}}$
     $\text{incidence rate} = \frac{\text{\# of people who developed the disease over a defined period of time (e.g., a year)}}{\text{\# of people at risk who were followed for the defined period of time (e.g., a year)}}$
     $\text{follow-up death rate} = \frac{\text{number of deaths}}{\text{total person-years}}$
  33. 33. Crude Death Rate
     • Example: The 1980 population in California was 23,000,000 (as estimated on 1 July) and there were 190,237 deaths during that year.
       – Crude death rate = (190,237/23,000,000) × 1,000 = 8.3 deaths per 1,000 per year
  34. 34. Displaying Proportions over Time
     Female Death Rates (1984-1987)
     Year    Death Rate per 100,000
     1984    793
     1985    807
     1986    809
     1987    813
     [Figure: line graph "Female Death Rates (1984-1987)"; x-axis: year; y-axis: Death Rate per 100,000]
  35. 35. Standardization of Rates
     • Crude rates are used to describe a population, but comparisons of crude rates are often invalid because the populations may differ with respect to important characteristics (e.g., age, gender, race).
     • To account for these differences, adjusted rates are used in the comparison.
       $\text{adjusted rate} = \frac{\text{\# deaths expected}}{\text{\# in standard population}} \times 100{,}000$
  36. 36.
                  Group A                                    Group B
     Age group    deaths    persons    deaths/100,000        deaths    persons      deaths/100,000
     0-4             162     40,000     405.0                 2,049      546,000     375.3
     5-19            107    128,000      83.6                 1,195    1,982,000      60.3
     20-44           449    172,000     261.0                 5,097    2,676,000     190.5
     45-64           451     58,000     777.6                19,904    1,807,000    1101.5
     65+             444      9,000    4933.3                63,505    1,444,000    4397.9
     Totals        1,613    407,000     396.3                91,750    8,455,000    1085.2
  37. 37. Using the X population for 1970 as a standard we get:
                              Group A                          Group B
     Age group    Standard    Age spec. rate   Exp. deaths    Age spec. rate   Exp. deaths
     0-4            84,416     405.0             342            375.3             317
     5-19          294,353      83.6             246             60.3             177
     20-44         316,744     261.0             827            190.5             603
     45-64         205,745     777.6            1600           1101.5            2266
     65+            98,742    4933.3            4871           4397.9            4343
     Totals      1,000,000                      7886                             7706
     Expected deaths for Group A for age group 65+ = (98,742)(4933.3)/100,000 = 4871
     Age adjusted rate for Group A = 100,000 × (7886/1,000,000) = 788.6
     Age adjusted rate for Group B = 100,000 × (7706/1,000,000) = 770.6
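The direct standardization above can be reproduced with a short sketch. The deaths, persons, and standard population counts are transcribed from the two tables; the function names `crude_rate` and `adjusted_rate` are illustrative. Because the sketch uses unrounded age-specific rates, the expected deaths can differ from the slide's rounded values in the last digit.

```python
# Standard population by age group (0-4, 5-19, 20-44, 45-64, 65+).
STANDARD = [84_416, 294_353, 316_744, 205_745, 98_742]

# (deaths, persons) by age group, same order as STANDARD.
GROUP_A = [(162, 40_000), (107, 128_000), (449, 172_000),
           (451, 58_000), (444, 9_000)]
GROUP_B = [(2_049, 546_000), (1_195, 1_982_000), (5_097, 2_676_000),
           (19_904, 1_807_000), (63_505, 1_444_000)]


def crude_rate(group, per=100_000):
    """Total deaths over total persons, scaled to `per` persons."""
    deaths = sum(d for d, _ in group)
    persons = sum(p for _, p in group)
    return per * deaths / persons


def adjusted_rate(group, standard, per=100_000):
    """Directly standardized rate: apply each age-specific rate to the
    standard population, sum the expected deaths, divide by its size."""
    expected = sum(std * d / p for (d, p), std in zip(group, standard))
    return per * expected / sum(standard)


for name, group in (("A", GROUP_A), ("B", GROUP_B)):
    print(f"Group {name}: crude {crude_rate(group):.1f}, "
          f"age-adjusted {adjusted_rate(group, STANDARD):.1f} per 100,000")
```

The crude rates differ by a factor of almost three, while the age-adjusted rates are close, which is the point of the standardization slides: most of the crude difference comes from the groups' age structures.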
  38. 38. Relative Risk
     • Relative risks are the ratio of risks for two different populations (ratio = a/b).
       $\text{Relative Risk} = \frac{\text{disease incidence in group 1}}{\text{disease incidence in group 2}}$
     • If the risk (or proportion) of having the outcome is 1/10 in one population and 2/10 in a second population, then the relative risk is: (2/10) / (1/10) = 2.0
     • A relative risk >1 indicates increased risk for the group in the numerator and a relative risk <1 indicates decreased risk for the group in the numerator.
  39. 39. Odds Ratio and Relative Risk
     • Odds ratios are better to use in case-control studies (cases and controls are selected and the level of exposure is determined retrospectively)
     • Relative risks are better for cohort studies (exposed and unexposed subjects are chosen and are followed prospectively to determine disease status)
  40. 40. Odds Ratio and Relative Risk
     • When we have a two-way classification of exposure and disease we can approximate the relative risk by the odds ratio
                       Disease
                       Yes    No     Total
     Exposure   Yes    A      B      A+B
                No     C      D      C+D
     • Relative Risk = A/(A+B) divided by C/(C+D)
     • Odds Ratio = A/B divided by C/D = AD/BC
  41. 41. Relationship Between the Two Measures
     $RR = \frac{A}{A+B} \div \frac{C}{C+D} = \frac{A(C+D)}{C(A+B)}$
     If the number of subjects classified as disease positive is small compared to those classified as disease negative, then:
     $C + D \approx D$ and $A + B \approx B$
     Therefore the relative risk can be approximated by:
     $RR \approx \frac{A/B}{C/D} = \frac{A \cdot D}{B \cdot C}$
  42. 42. Case Control Study Example
     • Disease: Pancreatic Cancer
     • Exposure: Cigarette Smoking
                       Disease
                       Yes    No     Total
     Exposure   Yes    38     81     119
                No      2     56      58
  43. 43. Example Continued
     • Relative risk for exposed vs. non-exposed
       – Numerator: proportion of exposed people that have the disease
       – Denominator: proportion of non-exposed people that have the disease
       – Relative Risk = (38/119)/(2/58) = 9.26
  44. 44. Example Continued
     • Odds Ratio for exposed vs. non-exposed
       – Numerator: ratio of diseased vs. non-diseased in the exposed group
       – Denominator: ratio of diseased vs. non-diseased in the non-exposed group
       – Odds Ratio = (38/81)/(2/56) = (38*56)/(2*81) = 13.14
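Both measures from the pancreatic cancer example can be computed from the 2x2 table in a few lines. This is a sketch using the slide's cell labels A, B, C, D; the function names are illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = [A/(A+B)] / [C/(C+D)] for a 2x2 exposure-by-disease table."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (A/B) / (C/D) = AD/BC."""
    return (a * d) / (b * c)


# Exposed: 38 diseased, 81 not; unexposed: 2 diseased, 56 not.
a, b, c, d = 38, 81, 2, 56
rr = relative_risk(a, b, c, d)
oratio = odds_ratio(a, b, c, d)
print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio:    {oratio:.2f}")
```

Here the two values differ noticeably because the disease is not rare in this table (40 of 177 subjects), so the approximations C+D ≈ D and A+B ≈ B are rough.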
  45. 45. Relative Risk
     • Relative risk: the chance that a member of a group receiving some exposure will develop a disease relative to the chance that a member of an unexposed group will develop the same disease.
       $RR = \frac{P(\text{disease} \mid \text{exposed})}{P(\text{disease} \mid \text{unexposed})}$
     • Recall: an RR of 1.0 indicates that the probabilities of disease in the exposed and unexposed groups are identical, i.e., an association between exposure and disease does not exist.
  46. 46. Relative Risk
     • When we have a two-way classification of exposure and disease we can calculate the relative risk
                       Disease
                       Yes    No     Total
     Exposure   Yes    A      B      A+B
                No     C      D      C+D
  47. 47. Case Control Study Example
     • Disease: Pancreatic Cancer
     • Exposure: Cigarette Smoking
                       Disease
                       Yes    No     Total
     Exposure   Yes    38     81     119
                No      2     56      58
  48. 48. Data Interpretation
     • Considerations:
     1. Accuracy
        1. critical view of the data
        2. investigating evidence of the results
        3. consider other studies' results
        4. peripheral data analysis
        5. conduct power analysis: Type I & Type II
                                  Null hypothesis is actually
                                  True        False
     Null judged   True           Correct     Type II
                   False          Type I      Correct
  49. 49. Types of Errors
     If You…                       When the Null Hypothesis is…              Then You Have…
     Reject the null hypothesis    True (there really is no difference)      Made a Type I Error
     Reject the null hypothesis    False (there really is a difference)      ☻
     Accept the null hypothesis    False (there really is a difference)      Made a Type II Error
     Accept the null hypothesis    True (there really is no difference)      ☻
  50. 50.
     • alpha: the level of significance, i.e., the probability of a Type I error
     • β: the probability of a Type II error
     • 1 – β: the probability of obtaining significant results (power)
     • Effect size: how much of a difference we can say the intervention made
  51. 51.
     2. Meaning of the results: translate the results and make them understandable
     3. Importance: translate the significant findings into practical findings
     4. Generalizability: how can we make the findings useful for the whole population?
     5. Implication: what have we learned in relation to what was used during the study?
  52. 52. POWER--Uses and Misuses
     • Sources
       – Cohen, Statistical Power Analysis for the Behavioral Sciences (gold standard for power)
       – Kraemer & Thiemann, How Many Subjects? (also a good review)
  53. 53. Needed Parameters• Alpha--chance of a Type I error• Beta--chance of a Type II error• Power = 1 - beta• Effect size--difference between groups or amount of variance explained or how much relationship there is between the DV and the IVs
  54. 54. Remember this in English?
     • Type I error is when you say there is a difference or relationship and there is not
     • Type II error is when you say there is no difference or relationship and there really is
  55. 55. What Affects Power?
     • Size of the difference in means or amount of variance explained (ES)
     • alpha
     • Unexplained variance
     • N
  56. 56. Which is more important?
     • Type I error is more important if there is a possibility of harm or a lethal effect
     • Type II error is more important in relatively unexplored areas of research
     • In some studies, Type I and Type II errors may be equally important
  57. 57. How to Increase Power
     1. Increase the n
     2. Decrease the unexplained variance--control by design or statistics (e.g. ANCOVA)
     3. Increase alpha (controversial)
     4. Use a one tailed test (directional hypothesis)--puts the zone of rejection all in one tail; same effect as increasing alpha
     5. Use parametric statistics as long as you meet the assumptions. If you do not, parametric statistics are LESS powerful
     6. Decrease measurement error (decrease unexplained variance)--use more reliable instruments, standardize measurement protocol, frequent calibration of physiologic instruments, improve inter-rater reliability
  58. 58. What is good power?
     By tradition, “good” power is 80%.
     The correct answer is that it depends on the nature of the phenomenon and which kind of error is most important in your study. This is a theoretical argument that you have to make.
     Using convention (alpha = .05 and power = .80, beta = .20) you are saying that Type I error is _________ as serious as a Type II error
  59. 59. Effect Size
     How large an effect do I expect exists in the population if the null is false?
     OR
     How much of a difference do I want to be able to detect?
     The larger the effect, the fewer the cases needed to see it. (The difference is so big you can trip on it.)
  60. 60. The World According to Power (Kraemer & Thiemann)
     • The more stringent the significance level, the greater the necessary sample size. More subjects are needed for a 1% level than a 5% level
     • Two tailed tests require larger sample sizes than one tailed tests. Assessing two directions at the same time requires a greater investment.
     • The smaller the effect size, the larger the necessary sample size. Subtle effects require greater efforts.
     • The larger the power required, the larger the necessary sample size. Greater protection from failure requires greater effort.
     • The smaller the sample size, the smaller the power, i.e., the greater the chance of failure
  61. 61. The World According to Power (Kraemer & Thiemann)
     • If you propose to go with a sample size of 20 or fewer, you have to be willing to have a high risk of failure or a huge effect size
     • To achieve 99% power for an effect size of .01, you need > 150,000 subjects
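The relationships listed above can be illustrated with a rough sketch. It assumes a setting the slides do not specify: a comparison of two group means with a standardized effect size d, using the common normal-approximation formula n per group = 2·((z_alpha + z_power)/d)². The function name and default values are assumptions, and exact methods (such as the tables in the cited sources) will give somewhat different numbers.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate subjects per group for comparing two means.

    d is the standardized effect size (difference in means / SD).
    Normal approximation only; an illustrative sketch, not an exact method.
    """
    z = NormalDist()  # standard normal distribution
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail_area)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


print("baseline (d=0.5, alpha=.05, power=.80):", n_per_group(0.5))
print("smaller effect (d=0.2):               ", n_per_group(0.2))
print("stricter alpha (.01):                 ", n_per_group(0.5, alpha=0.01))
print("higher power (.90):                   ", n_per_group(0.5, power=0.90))
print("one-tailed test:                      ", n_per_group(0.5, two_tailed=False))
```

Each change moves the required n in the direction the slide describes: smaller effects, stricter alpha, higher power, and two-tailed tests all need more subjects.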
  62. 62. Test YourselfKeeping the other parameters the same:• As ES decreases, needed n ____• As alpha decreases, needed n ____• Higher power requires _____ n
  63. 63. Power for each test• You do a power analysis for each statistic you are going to use.• Choose the sample size based on the highest number of subjects from the power analysis.• Use the most conservative power analysis--guarantees you the most subjects
  64. 64. What about multiple time points?
     • More time points require fewer subjects, since more is known about the subjects from prior time points as compared to a cross sectional study
     • In other words, less variance is unexplained since you have baseline information
     • How many fewer? It depends
  65. 65. Power analysis and secondary analysis
     If you have a set sample size, your power analysis then works backward. You set the n, alpha and ES and determine the power given the first three parameters.
  66. 66. Determining ESIf you want to determine effect size from a completed study, you have the n, alpha and power and can work backwards to determine the ES.Especially important in relatively unexplored areas
  67. 67. Power and MR• ES is the amount of explained variance expected since there may not be group differences, based on past research• Increasing the number of independent variables _______ sample size needed to achieve adequate power.
  68. 68. Sampling Distribution
     • A sample statistic is often unequal to the value of the corresponding population parameter because of sampling error.
     • Sampling error reflects the tendency for statistics to fluctuate from one sample to another.
     • The amount of sampling error is the difference between the obtained sample value and the population parameter.
     • Inferential statistics allow researchers to estimate how close to the population value the calculated statistic is likely to be.
     • The concept of sampling distributions, which are actually probability distributions, is central to estimates of sampling error.
  69. 69. Characteristics of Sampling Distribution
     • Sampling error = sample mean - population mean.
     • Every sample size has a different sampling distribution of the mean.
     • Sampling distributions are theoretical because, in practice, no one draws an infinite number of samples from a population.
     • Their characteristics can be modeled mathematically and are determined by a formulation known as the central limit theorem.
     • This theorem stipulates that the mean of the sampling distribution is identical to the population mean.
     • The average sampling error, i.e., the mean of the (mean - μ) values, would always equal zero.
  70. 70. Standard Error of the Mean• The standard deviation of a sampling distribution of the mean has a special name: the standard error of the mean (SEM).• The smaller the SEM, the more accurate are the sample means as estimates of the population value.
  71. 71.
     • Estimation
     • Hypothesis Testing
     Both activities use sample statistics (for example, X̄) to make inferences about a population parameter (μ).
  72. 72.
     • Why don’t we just use a single number (a point estimate) like, say, X̄ to estimate a population parameter, μ?
     • The problem with using a single point (or value) is that it will very probably be wrong. In fact, with a continuous random variable, the probability that the variable is equal to a particular value is zero. So, P(X̄ = μ) = 0.
     • This is why we use an interval estimator.
     • We can examine the probability that the interval includes the population parameter.
  73. 73. Types of Statistical Inference
     • Parameter estimation:
       – It is used to estimate a population value, such as a mean, relative risk index or a mean difference between two groups.
       – Estimation can take two forms:
         • Point estimation: involves calculating a single statistic to estimate the parameter, e.g., the mean or median.
           – Disadvantages: point estimates offer no context for interpreting their accuracy, and a point estimate gives no information regarding the probability that it is correct or close to the population value.
         • Interval estimation: estimates a range of values that has a high probability of containing the population value.
  74. 74.
     • How wide should the interval be? That depends upon how much confidence you want in the estimate.
     • For instance, say you wanted a confidence interval estimator for the mean income of a college graduate:
       You might have       That the mean income is between
       100% confidence      $0 and $∞
       95% confidence       $35,000 and $41,000
       90% confidence       $36,000 and $40,000
       80% confidence       $37,500 and $38,500
       …                    …
       0% confidence        $38,000 (a point estimate)
     • The wider the interval, the greater the confidence you will have in it as containing the true population parameter μ.
  75. 75. Interval Estimation• For example, it is more likely the population height mean lies between 165-175cm.• Interval estimation involves constructing a confidence interval (CI) around the point estimate.• The upper and lower limits of the CI are called confidence limits.• A CI around a sample mean communicates a range of values for the population value, and the probability of being right. That is, the estimate is made with a certain degree of confidence of capturing the parameter.
  76. 76. Confidence Intervals around a Mean
     • 95% CI = mean ± (1.96 × SEM)
     • This statement indicates that we can be 95% confident that the population mean lies between the confidence limits, and that these limits are 1.96 times the true standard error above and below the sample mean.
     • E.g., if the mean = 61 inches and SEM = 1, what is the 95% CI?
       – Solution: 95% CI = 61 ± (1.96 × 1) = 61 ± 1.96, so 59.04 < μ < 62.96
     • E.g., if the mean = 61 inches and SEM = 1, what is the 99% CI?
       – Solution: 99% CI = 61 ± (2.58 × 1) = 61 ± 2.58, so 58.42 < μ < 63.58
  77. 77. Confidence Intervals and the t distribution
     • When the sample size is small, we cannot base confidence intervals around the mean on the normal distribution; instead, we construct them using the t-distribution.
     • The t-distribution is similar in shape to the standard normal distribution.
     • The exact shape of the t-distribution is influenced by the number of cases in the sample.
     • Statisticians have developed tables for the area under the t-distribution for different sample sizes and probability levels.
     • To use this table, we must enter at the appropriate row based on the number of degrees of freedom.
  78. 78. Confidence Intervals and the t distribution
     • 95% CI = mean ± (t × SEM)
       – Where:
         mean = the sample mean
         t = the tabled t value at 95% CI for df = N-1
         SEM = the calculated SEM for the sample data
     • E.g., SEM = 1, mean = 61, N = 25, df = 25-1 = 24; t for the 95% CI with 24 df is 2.06
       – Solution: 95% CI = 61 ± (2.06 × 1) = 61 ± 2.06, so 58.94 < μ < 63.06
     To compute CIs around a mean with SPSS: Analyze → Descriptive Statistics → Explore, then click on the Statistics pushbutton.
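The interval calculations on these slides can be sketched with the standard library alone. The z multipliers come from `statistics.NormalDist`; the t multiplier 2.06 is taken from the slide's table value for 24 df rather than computed, since the standard library has no t-distribution. The helper name `confidence_interval` is an assumption for illustration.

```python
from statistics import NormalDist


def confidence_interval(mean, sem, critical_value):
    """Return (lower, upper) limits: mean ± critical_value × SEM."""
    margin = critical_value * sem
    return mean - margin, mean + margin


z = NormalDist()              # standard normal distribution
z95 = z.inv_cdf(0.975)        # about 1.96 (2.5% in each tail)
z99 = z.inv_cdf(0.995)        # about 2.58 (0.5% in each tail)
t95_df24 = 2.06               # table value quoted on the slide for df = 24

ci95 = confidence_interval(61, 1, z95)
ci99 = confidence_interval(61, 1, z99)
ci95_t = confidence_interval(61, 1, t95_df24)

print(f"95% CI (z): {ci95[0]:.2f} < mu < {ci95[1]:.2f}")
print(f"99% CI (z): {ci99[0]:.2f} < mu < {ci99[1]:.2f}")
print(f"95% CI (t, df=24): {ci95_t[0]:.2f} < mu < {ci95_t[1]:.2f}")
```

The t-based interval is slightly wider than the z-based 95% interval, reflecting the extra uncertainty from a small sample.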
  79. 79. Types of Statistical Inference
     • Hypothesis testing:
       – Hypothesis testing is a second approach to inferential statistics.
       – Hypothesis testing involves using sampling distributions and the laws of probability to make an objective decision about whether to accept or reject the null hypothesis.
       – The sample may deviate from the defined population’s true nature by a certain amount.
       – This deviation is called sampling error.
       – Drawing the wrong conclusion is called an error of inference.
       – There are two types of errors of inference defined in terms of the null hypothesis:
         • Type I error
         • Type II error
  80. 80.
     • Testing a Claim: Companies often make claims about products. For example, a frozen yogurt company may claim that its product has no more than 90 calories per cup. This claim is about a parameter, i.e., the population mean number of calories per cup (μ).
     • The claim is tested by taking a sample (say, 100 cups) and determining the sample mean. If the sample mean is 90 calories or less, we have no evidence that the company has lied. Even if the sample mean is greater than 90 calories, it is possible the company is still telling the truth (sampling error). However, at some point (perhaps, say, a sample average of 500 calories per cup) it will be clear that the company has not been completely truthful about its product.
  81. 81.
     • A hypothesis is made about the value of a parameter, but the only facts available to estimate the true parameter are those provided by the sample. If the statistic differs (and of course it will) from the hypothesis stated about the parameter, a decision must be made as to whether or not this difference is significant. If it is, the hypothesis is rejected. If not, it cannot be rejected.
     • H0: The null hypothesis. This contains the hypothesized parameter value which will be compared with the sample value.
     • H1: The alternative hypothesis. This will be “accepted” only if H0 is rejected. Technically speaking, we never accept H0. What we actually say is that we do not have the evidence to reject it.
  82. 82.
     • Two types of errors may occur: α (alpha) and β (beta). The α error is often referred to as a Type I error and the β error as a Type II error.
       – You are guilty of an alpha error if you reject H0 when it really is true.
       – You commit a beta error if you “accept” H0 when it is false.
  83. 83.
     • This alpha error is related to the (1 - α) we just learned about when constructing confidence intervals. We will soon see that an α error of .05 in testing a hypothesis (two-tail test) is equivalent to a confidence of 95% in constructing a two-sided interval estimator.
     [Figure: normal curve with a rejection region of area α/2 in each tail, below -Zα/2 and above Zα/2]
  84. 84. TRADEOFF!
     • There is a tradeoff between the alpha and beta errors. We cannot simply reduce both types of error. As one goes down, the other rises.
     • As we lower the α error, the β error goes up: reducing the error of rejecting H0 (the error of rejection) increases the error of “accepting” H0 when it is false (the error of acceptance).
     • This is similar (in fact exactly the same) to the problem we had earlier with confidence intervals. Ideally, we would love a very narrow interval, with a lot of confidence. But, practically, we can never have both: there is a tradeoff.
  85. 85.
     • Our legal system understands this tradeoff very well.
       – If we make it extremely difficult to convict criminals because we do not want to incarcerate any innocent people, we will probably have a legal system in which no one gets convicted.
       – On the other hand, if we make it very easy to convict, then we will have a legal system in which many innocent people end up behind bars.
       – This is why our legal system does not require a guilty verdict to be “beyond a shadow of a doubt” (i.e., complete certainty) but “beyond reasonable doubt.”
  86. 86.
     • Quality Control.
       – A company purchases chips for its smart phones in batches of 50,000. The company is willing to live with a few defects per 50,000 chips. How many defects?
       – If the firm randomly samples 100 chips from each batch of 50,000 and rejects the entire shipment if there are ANY defects, it may end up rejecting too many shipments (error of rejection). If the firm is too liberal in what it accepts and assumes everything is “sampling error,” it is likely to make the error of acceptance.
       – This is why government and industry generally work with an alpha error of .05
  87. 87.
     1. Formulate H0 and H1. H0 is the null hypothesis, a hypothesis about the value of a parameter, and H1 is an alternative hypothesis.
        – e.g., H0: µ = 12.7 years; H1: µ ≠ 12.7 years
     2. Specify the level of significance (α) to be used. This level of significance tells you the probability of rejecting H0 when it is, in fact, true. (Normally, a significance level of 0.05 or 0.01 is used.)
     3. Select the test statistic: e.g., Z, t, F, etc. So far, we have been using the Z distribution. We will be learning about the t-distribution (used for small samples) later on.
     4. Establish the critical value or values of the test statistic needed to reject H0. DRAW A PICTURE!
     5. Determine the actual value (computed value) of the test statistic.
     6. Make a decision: Reject H0 or Do Not Reject H0.
  88. 88. Hypothesis Testing
      • When we formulate H0 and H1, we have to decide whether to use a one-tail or two-tail test.
      • With a “two-tail” hypothesis test, α is split into two and put in both tails. H1 then includes two possibilities: μ > # OR μ < #. This is why the region of rejection is divided into two tails. Note that the region of rejection always corresponds to H1.
      • With a “one-tail” hypothesis test, the α is entirely in one of the tails.
  89. 89. • For example, if the company claims that a certain product has exactly 1 mg of aspirin, that would result in a two-tail test. Note that words like “exactly” suggest two-tail tests. There are problems with too much aspirin and too little aspirin in a drug.
      • On the other hand, if a firm claims that a box of its raisin bran cereal contains at least 100 raisins, a one-tail test has to be used. If the sample mean is more than 100, everything is ok. The problems arise only if the sample mean is less than 100. The question will be whether we are looking at sampling error or perhaps the company is lying and the true (population) mean is less than 100 raisins.
  90. 90. • A company claims that its soda vending machines deliver exactly 8 ounces of soda. Clearly, you do not want the vending machines to deliver too much or too little soda. How would you formulate this?
      Answer:
      H0: µ = 8 ounces
      H1: µ ≠ 8 ounces
      If you are testing at α = .01, the .01 is split into two: .005 in the left tail and .005 in the right tail. The critical values are ±2.575.
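The critical values quoted on these slides can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption for illustration, not something from the slides.

```python
from statistics import NormalDist


def z_critical(alpha: float, tails: int = 2) -> float:
    """Positive critical Z value for a one-tail or two-tail test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tail test puts alpha/2 in each tail; a one-tail test puts all of alpha in one.
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(z_critical(0.01, tails=2), 3))  # two-tail test at alpha = .01
print(round(z_critical(0.05, tails=2), 3))  # two-tail test at alpha = .05
print(round(z_critical(0.05, tails=1), 3))  # one-tail test at alpha = .05
```

The exact two-tail value at α = .01 is about 2.5758; printed Z tables often list it as 2.575 or 2.58, which is where the ±2.575 on the slide comes from.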
  91. 91. • A company claims that its bolts have a circumference of exactly 12.50 inches. (If the bolts are too wide or too narrow, they will not fit properly.)
      Answer:
      H0: µ = 12.50 inches
      H1: µ ≠ 12.50 inches
      • A company claims that a slice of its bread has exactly 2 grams of fiber. Formulate this.
      Answer:
      H0: µ = 2 grams
      H1: µ ≠ 2 grams
  92. 92. • A company claims that its batteries have an average life of at least 500 hours. How would you formulate this?
      Answer:
      H0: µ ≥ 500 hours
      H1: µ < 500 hours
      If you are testing at α = .05, the entire .05 is in the left tail (hint: H1 points to where the rejection region should be). The critical value is -1.645.
  93. 93. • A company claims that its overpriced, bottled spring water has no more than 1 mcg of benzene (poison). How would you formulate this?
      Answer:
      H0: µ ≤ 1 mcg benzene
      H1: µ > 1 mcg benzene
      If you are testing at α = .05, the entire .05 is in the right tail (hint: H1 points to where the rejection region should be). The critical value is +1.645.
  94. 94. A pharmaceutical company claims that each of its pills contains exactly 20.00 milligrams of Cumidin (a blood thinner). You sample 64 pills and find that the sample mean X̅ = 20.50 mg and s = .80 mg. Should the company’s claim be rejected? Test at α = 0.05.
      • Formulate the hypotheses
        H0: µ = 20.00 mg
        H1: µ ≠ 20.00 mg
      • Choose the test statistic and find the critical values; draw the region of rejection
        Test statistic: Z
        At α = 0.05, the critical values are ±1.96.
      • Use the data to get the calculated value of the test statistic
        $Z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{20.50 - 20.00}{.80/\sqrt{64}} = \frac{.50}{.10} = 5$
        [ .80/√64 = .10. This is the standard error of the mean. ]
      • Come to a conclusion: Reject H0 or Do Not Reject H0
        The computed Z value of 5 is deep in the region of rejection. Thus, reject H0 at p < .05.
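The pill example can be checked numerically. This is a minimal sketch of the two-tail Z test using the numbers from the slide; the function name `one_sample_z` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean: float, claimed_mean: float, s: float, n: int) -> float:
    """Z statistic: (sample mean - claimed mean) / standard error of the mean."""
    standard_error = s / sqrt(n)
    return (sample_mean - claimed_mean) / standard_error


alpha = 0.05
z = one_sample_z(sample_mean=20.50, claimed_mean=20.00, s=0.80, n=64)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tail test: alpha/2 in each tail
reject_h0 = abs(z) > z_crit                   # reject if Z falls in either tail

print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_h0}")
```

Because the test is two-tailed, the comparison uses the absolute value of Z against the positive critical value.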
  95. 95. • Suppose we took the above data, ignored the hypothesis, and constructed a 95% confidence interval estimator (CIE).
        20.50 ± 1.96(.10)
        95% CIE: 20.304 mg ←→ 20.696 mg
      • We note that 20.00 mg is not in this interval.
      • As you can see, hypothesis testing and CIE are virtually the same exercise; they are merely two sides of the same coin. Both rely on the sample evidence.
      • If a claim is made about a parameter, do a hypothesis test. If no claim is made and a company wants to use sample evidence to estimate a parameter (perhaps to determine what claims may be made in the future about a parameter), construct a confidence interval estimator.
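The same interval can be computed directly. The sketch below assumes the Z-based interval used on the slide (sample mean ± Z times the standard error); the function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, s: float, n: int, confidence: float = 0.95):
    """Two-sided Z-based confidence interval for the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = z * s / sqrt(n)                        # Z times the standard error
    return sample_mean - half_width, sample_mean + half_width


lower, upper = z_confidence_interval(sample_mean=20.50, s=0.80, n=64)
claimed_mean = 20.00
print(f"95% CIE: {lower:.3f} mg to {upper:.3f} mg")
print("claimed mean inside interval:", lower <= claimed_mean <= upper)
```

A claimed value that falls outside the 95% interval is the same value a two-tail test at α = .05 would reject, which is the "two sides of the same coin" point made above.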
  96. 96. • A company claims that its LED bulbs will last at least 8,000 hours. You sample 100 bulbs and find that X̅ = 7,800 hours and s = 800 hours. Should the company’s claim be rejected? Test at α = 0.05.
      • H0: µ ≥ 8,000 hours
        H1: µ < 8,000 hours
        The entire 5% is in the left tail; the critical value is -1.645.
      • $Z = \frac{7{,}800 - 8{,}000}{800/\sqrt{100}} = \frac{-200}{80} = -2.50$
        [800/√100 = 80, the standard error of the mean]
      • The computed Z value of -2.50 is in the region of rejection. Thus, reject H0 at p < .05.
        – Note: When testing a hypothesis, we often have to perform a one-tail test if the claim requires it. However, we will always use only two-sided confidence interval estimators when using sample statistics to estimate population parameters.
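The one-tail version differs from the two-tail test only in where α is placed. Below is a minimal sketch of the LED bulb example, assuming a lower-tail Z test; the function name `lower_tail_z_test` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_z_test(sample_mean, claimed_mean, s, n, alpha=0.05):
    """One-tail test of H0: mu >= claimed_mean against H1: mu < claimed_mean."""
    std_normal = NormalDist()
    z = (sample_mean - claimed_mean) / (s / sqrt(n))
    z_crit = std_normal.inv_cdf(alpha)  # negative: all of alpha sits in the left tail
    p_value = std_normal.cdf(z)         # area to the left of the computed Z
    return z, z_crit, p_value, z < z_crit


z, z_crit, p_value, reject_h0 = lower_tail_z_test(7800, 8000, 800, 100)
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The rejection rule is `z < z_crit` rather than a comparison of absolute values, because only unusually low sample means count as evidence against an "at least" claim.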
  97. 97. • In estimating µ based on sample statistics, how large a sample do we need for the level of precision we want?
        – To determine the sample size we need, we must know the (1) desired precision and (2) σ.
        $\bar{X} \pm Z_{\alpha}\,\sigma/\sqrt{n}$, where $e = Z_{\alpha}\,\sigma/\sqrt{n}$ is the precision.
      • e, the half-width of the confidence interval estimator, is the precision with which we are estimating. e is also called sampling error.
  98. 98. …continued
      We use e to solve for n:
      If $e = \frac{Z\sigma}{\sqrt{n}}$, then $\sqrt{n} = \frac{Z\sigma}{e}$, and so $n = \frac{Z^2\sigma^2}{e^2}$.
  99. 99. $n = \frac{1.96^2 \cdot 20^2}{10^2} \approx 15.4$, so round up to n = 16.
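The sample-size formula n = Z²σ²/e² is easy to evaluate in code. This sketch plugs in the numbers shown on the slide (Z = 1.96, σ = 20, e = 10); the function name `sample_size_for_mean` is hypothetical, and rounding up to the next whole observation is a common convention rather than something the slides state.

```python
from math import ceil


def sample_size_for_mean(sigma: float, e: float, z: float = 1.96) -> int:
    """Smallest n whose half-width z * sigma / sqrt(n) is no larger than e."""
    n_exact = (z * sigma / e) ** 2  # n = Z^2 * sigma^2 / e^2
    return ceil(n_exact)            # round up so the precision target is met


print(sample_size_for_mean(sigma=20, e=10))
```

Halving e quadruples the required sample, since e appears squared in the denominator.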
  100. 100. • Similarly, taking e (precision) from the formula for the half-width of a confidence interval estimator for P:
        $n = \frac{Z^2 P(1 - P)}{e^2}$
      • Q: If we are trying to estimate the population proportion, P, what do we use for P in this formula?
  101. 101. Suppose a pollster wants a maximum error of e = .01 with 95% confidence. We assume that the variance is the highest possible, so we use P = .5. This is the way we ensure that the sampling error will be within ±.01 of the true population proportion. Then,
      $n = \frac{1.96^2 (.5)(1 - .5)}{.01^2} = 9{,}604$
      That is a VERY large sample.
  102. 102. …continued
      Let’s try that again with e = .03.
      $n = \frac{1.96^2 (.5)(1 - .5)}{.03^2} \approx 1{,}067$
      This is the sample size that most pollsters work with.
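Both polling calculations follow from the same formula, n = Z²P(1 − P)/e². The sketch below uses Z = 1.96 and the worst-case P = .5 from the slides; the function name `sample_size_for_proportion` is an assumption for illustration, and it returns the unrounded value so the rounding choice stays with the reader.

```python
def sample_size_for_proportion(e: float, p: float = 0.5, z: float = 1.96) -> float:
    """Unrounded sample size needed to estimate a proportion to within +/- e."""
    return z**2 * p * (1 - p) / e**2


for e in (0.01, 0.03):
    n = sample_size_for_proportion(e)
    print(f"e = {e:.2f}  ->  n is about {n:,.1f}")

# P = .5 maximizes P(1 - P), so any other P needs a smaller sample.
print(sample_size_for_proportion(0.03, p=0.2) < sample_size_for_proportion(0.03, p=0.5))
```

Using P = .5 is the conservative choice: P(1 − P) peaks at .25 there, so the resulting n is large enough whatever the true proportion turns out to be.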
