Z score



  1. Z-Scores

     Sometimes we want to do more than summarize a bunch of scores. Sometimes we want to talk about particular scores within the bunch. We may want to tell other people whether a score is above or below average, or how far a particular score is from average. We might also want to compare scores from different bunches of data and know which score is better. Z-scores can help with all of this.

     They Tell Us Important Things

     Z-scores tell us whether a particular score is equal to the mean, below the mean or above the mean of a bunch of scores. They can also tell us how far a particular score is from the mean. Is a particular score close to the mean or far away?

     If a Z-score...
       • Has a value of 0, it is equal to the group mean.
       • Is positive, it is above the group mean.
       • Is negative, it is below the group mean.
       • Is equal to +1, it is 1 standard deviation above the mean.
       • Is equal to +2, it is 2 standard deviations above the mean.
       • Is equal to -1, it is 1 standard deviation below the mean.
       • Is equal to -2, it is 2 standard deviations below the mean.

     Z-Scores Can Help Us Understand...

     How typical a particular score is within a bunch of scores. If data are normally distributed, approximately 95% of the data should have Z-scores between -2 and +2. Z-scores that do not fall within this range may be less typical of the data in a bunch of scores.
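     The reading rules above can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, the helper names `z_scores` and `describe` are hypothetical, and the population standard deviation is used because the group is being described rather than sampled.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Return each score's z-score relative to the group's own mean and SD."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD; statistics.stdev gives the sample estimate
    return [(x - m) / sd for x in scores]


def describe(z):
    """Put a z-score into the plain-language reading listed above."""
    if z == 0:
        return "equal to the group mean"
    side = "above" if z > 0 else "below"
    return f"{abs(z):.2f} standard deviations {side} the group mean"


scores = [52, 60, 68, 75, 75, 80, 83, 91]  # made-up example data
for x, z in zip(scores, z_scores(scores)):
    print(x, describe(z))
```

     Because the z-scores are computed from the group itself, they always average to zero, which is why a value of 0 marks the group mean.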
  2. Z-Scores Can Help Us Compare...

     Individual scores from different bunches of data. We can use Z-scores to standardize scores from different groups of data. Once standardized, scores from different bunches of data can be compared directly.

     How Do You Calculate a Z-Score/Sigma Level?
     Jeff Sauro • June 14, 2004

     The benefit of using a z-score in usability metrics was explained in "What's a Z-Score and why use it in Usability Testing?" This article discusses different ways of calculating a z-score. The short answer is: it depends on your data and what you're looking for. If you've encountered the z-score in a statistics book, you usually get a formula like:

         z = (x - μ) / σ

     The above formula is for obtaining a z-score for an entire population, where μ is the population mean and σ is the population standard deviation. Usability testing obviously samples a very small subset of the population, and thus the following formula is used:

         z = (x - x̄) / s

     where x̄ (x-bar) and s are used as estimators for the population's true mean and standard deviation. Both formulas essentially calculate the same thing: how many standard deviations a score lies from the mean.
  3. Calculating a Z-Score Example

     For example, let's say you took the GRE a few weeks ago and got scores of 630 Verbal and 700 Quantitative. How good are these scores? Which is better, the Verbal or the Quantitative score? Using a z-score can tell you how far you are from the mean and thus how well you performed. If you know the means and standard deviations for a set of GRE test takers, you can compare your scores. The means and standard deviations of a set of test takers, from the GRE website:

                   Verbal   Quantitative
         Mean        469        591
         StDev       119        148

     By plugging in your scores you get the following:

         Verbal z       = (630 - 469) ÷ 119 = 1.35σ
         Quantitative z = (700 - 591) ÷ 148 = 0.736σ

     To convert these sigma values into a percentage you can look them up in a standard z-table, use the Excel formula =NORMSDIST(1.35), or use the Z-Score to Percentile Calculator (choose 1-sided) and get the percentages: 91% Verbal and 77% Quantitative. You can see where your score falls within the
  4. sample of other test takers and also see that the verbal score was better than the quantitative score. Assuming the sample data was normally distributed, here's how the scores would look graphically:

     Figure 1: Verbal Score
     Figure 2: Quantitative Score

     Z-Scores and Process Sigma

     An interactive Graph of the Standard Normal Curve similar to Figures 1 & 2 is available for you to visualize how the z-scores and the area under the normal curve correspond. The graphs also allow you to see the difference between one-sided and two-sided (also called two-tailed) areas.

     In Six Sigma the process sigma metric is derived using the same method as a z-score. However, in Six Sigma you are measuring the distance between a sample mean and a specification limit; there can be an upper and lower spec limit that a sample must fall between as
  5. well. As in the z-score, you still use the same normal deviates from the z-table to approximate the area under the curve. The process sigma metric is essentially a Z equivalent.

     When testing software with users, task times are usually a good metric that will reveal the individual differences in performance. For task times there typically is only an upper spec limit. That is, it usually doesn't matter how fast a user completes a task, but it does matter if a user takes too long. For example, say you and your product team determined that a task should be completed in 120 seconds. 120 seconds becomes your Upper Spec Limit (USL). You sampled 10 users and got these task times:

         Sample 1: 100 99 101 125 100 123 96 90 98 116
         USL: 120   Mean: 104   StDev: 12

     To calculate the process sigma you subtract the spec limit (120) from the sample mean (104) and divide by the sample standard deviation (12). For Sample 1 the process sigma is -1.32σ. The visual representation of the data can be seen below:
  6. In the case of task times, a negative process sigma is ideal, as you want more people completing the task below the target task time, not above it. You can simply drop the negative sign when communicating the results in the event it causes confusion. If you were to make radical improvements to the UI and then sampled another set of ten users, here are more results:

         Sample 2: 60 75 99 88 65 72 75 72 87 65
         USL: 120   Mean: 75.8   StDev: 12.14
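     The process sigma calculation for both samples can be sketched as follows. This is a rough illustration rather than the article's own tooling: the function names are hypothetical, the sign convention is (sample mean - USL) so that times under the limit come out negative, and the defect share assumes normally distributed task times. Values computed from the raw times may differ slightly from the rounded figures quoted in the text.

```python
from statistics import NormalDist, mean, stdev


def process_sigma(times, usl):
    """(sample mean - upper spec limit) / sample standard deviation."""
    return (mean(times) - usl) / stdev(times)


def defect_share(times, usl):
    """Estimated share of users over the limit, assuming normal task times."""
    return 1 - NormalDist(mean(times), stdev(times)).cdf(usl)


sample_1 = [100, 99, 101, 125, 100, 123, 96, 90, 98, 116]
sample_2 = [60, 75, 99, 88, 65, 72, 75, 72, 87, 65]
USL = 120  # upper spec limit in seconds

for name, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    sigma = process_sigma(sample, USL)
    print(name, round(sigma, 2), f"{defect_share(sample, USL):.2%} over the limit")
```

     `statistics.stdev` uses the n - 1 denominator, which matches the use of s as an estimator of the population standard deviation.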
  7. In the redesign, the average of the new sample is well below the spec limit and the process sigma is now very high. The corresponding defect area is now only .01% and the quality area is 99.98%. Of course, having users perform that far below the spec limit is not very common due to the inherent variability in user performance. If you need more help with z-scores, see the Crash course in Z-scores, a tutorial with plenty of pictures, examples and review questions to help you grasp this concept.

     The z-score

     The Standard Normal Distribution
  8. Definition of the Standard Normal Distribution

     The Standard Normal distribution follows a normal distribution and has mean 0 and standard deviation 1.

     Notice that the distribution is perfectly symmetric about 0.

     If a distribution is normal but not standard, we can still use the Standard Normal distribution table by first finding how many standard deviations the value is from the mean.

     The z-score

     The number of standard deviations from the mean is called the z-score and can be found by the formula

         z = (x - μ) / σ

     Example
  9. Find the z-score corresponding to a raw score of 132 from a normal distribution with mean 100 and standard deviation 15.

     Solution

     We compute

         z = (132 - 100) / 15 = 2.133

     Example

     A z-score of 1.7 was found from an observation coming from a normal distribution with mean 14 and standard deviation 3. Find the raw score.

     Solution

     We have

         1.7 = (x - 14) / 3

     To solve this we just multiply both sides by the denominator 3:

         (1.7)(3) = x - 14
         5.1 = x - 14
         x = 19.1

     The z-score and Area

     Often we want to find the probability that a z-score will be less than a given
  10. value, greater than a given value, or in between two values. To accomplish this, we use the table from the textbook and a few properties of the normal distribution.

      Example

      Find P(z < 2.37)

      Solution

      We use the table. Notice the picture on the table has a shaded region corresponding to the area to the left of (below) a z-score. This is exactly what we want. Below are a few lines of the table.

          z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
          2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
          2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
          2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936

      The rows correspond to the ones and tenths digits of the z-score and the columns correspond to the hundredths digit. For our problem we want the row 2.3 (from 2.37) and the column .07 (from 2.37). The number in the table that matches this is .9911. Hence

          P(z < 2.37) = .9911

      Example
  11. Find P(z > 1.82)

      Solution

      In this case, we want the area to the right of 1.82. This is not what is given in the table. We can use the identity

          P(z > 1.82) = 1 - P(z < 1.82)

      Reading the table gives

          P(z < 1.82) = .9656

      Our answer is

          P(z > 1.82) = 1 - .9656 = .0344

      Example

      Find P(-1.18 < z < 2.1)

      Solution

      Once again, the table does not exactly handle this type of area. However, the area between -1.18 and 2.1 is equal to the area to the left of 2.1 minus the area to the left of -1.18. That is,

          P(-1.18 < z < 2.1) = P(z < 2.1) - P(z < -1.18)

      To find P(z < 2.1) we rewrite it as P(z < 2.10) and use the table to get

          P(z < 2.10) = .9821
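      As a cross-check on the table lookups in these examples, the same areas can be computed with the standard normal distribution in Python's `statistics` module. This is a sketch, not part of the original tutorial; the helper names `p_less`, `p_greater` and `p_between` are hypothetical. Rounded to four places, the results should agree with the table values.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_less(z):
    """Area to the left of z, which is what the table reports."""
    return Z.cdf(z)


def p_greater(z):
    """Area to the right of z, using P(z > a) = 1 - P(z < a)."""
    return 1 - Z.cdf(z)


def p_between(lo, hi):
    """Area between lo and hi: left area of hi minus left area of lo."""
    return Z.cdf(hi) - Z.cdf(lo)


print(round(p_less(2.37), 4))
print(round(p_greater(1.82), 4))
print(round(p_between(-1.18, 2.1), 4))
```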
  12. The table also tells us that

          P(z < -1.18) = .1190

      Now subtract to get

          P(-1.18 < z < 2.1) = .9821 - .1190 = .8631

      • scoring: Use the table below to determine your BMI rating. The table shows the World Health Organization BMI classification system. The rating scale is the same for males and females. You can also use the reverse lookup BMI table for determining your ideal weight based on height.

          classification    BMI (kg/m2)     sub-classification    BMI (kg/m2)
          underweight       < 18.50         severe thinness       < 16.00
                                            moderate thinness     16.00 - 16.99
                                            mild thinness         17.00 - 18.49
          normal range      18.5 - 24.99    normal                18.5 - 24.99
          overweight        ≥ 25.00         pre-obese             25.00 - 29.99
          obese             ≥ 30.00         obese class I         30.00 - 34.99
                                            obese class II        35.00 - 39.99
                                            obese class III       ≥ 40.00

  13. source: World Health Organization

      Waist to Hip Ratio (WHR)

      • aim: the purpose of this test is to determine the ratio of waist circumference to hip circumference, as this has been shown to be related to the risk of coronary heart disease.
      • equipment required: tape measure
      • procedure: A simple calculation of the waist girth divided by the hip girth. Waist to Hip Ratio (WHR) = Gw / Gh, where Gw = waist girth, Gh = hip girth. It does not matter which units of measurement you use, as long as they are the same for each measure.
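      The WHR calculation is a single division, sketched below with hypothetical measurements. The function name and the girth values are assumptions for illustration only. Measuring the same person in centimetres and in inches gives the same ratio, which is why the unit does not matter.

```python
def waist_to_hip_ratio(waist_girth, hip_girth):
    """WHR = Gw / Gh; both girths must be in the same unit."""
    if waist_girth <= 0 or hip_girth <= 0:
        raise ValueError("girths must be positive")
    return waist_girth / hip_girth


# Hypothetical measurements of one person, in centimetres and in inches.
CM_PER_INCH = 2.54
whr_cm = waist_to_hip_ratio(81.0, 98.0)
whr_in = waist_to_hip_ratio(81.0 / CM_PER_INCH, 98.0 / CM_PER_INCH)
print(round(whr_cm, 3), round(whr_in, 3))
```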
  14. • scoring: The table below gives general guidelines for acceptable levels of waist to hip ratio. You can use any units for the measurements (e.g. cm or inches), as it is only the ratio that is important. Ratings run from acceptable (left) to unacceptable (right).

                    excellent   good          average       high          extreme
          male      < 0.85      0.85 - 0.90   0.90 - 0.95   0.95 - 1.00   > 1.00
          female    < 0.75      0.75 - 0.80   0.80 - 0.85   0.85 - 0.90   > 0.90

      • target population: This measure is often used to determine the coronary artery disease risk factor associated with obesity.

      Anthropometric Results

      Anthropometric results should be interpreted based on the WHO classifications as described below using the WHO standard curves.

      Cut-offs for acute malnutrition (wasting)

      Acute malnutrition based on weight-for-height in z-scores and percentage of the median
  15. Table 6: Cut-off points for acute malnutrition (weight for height)

                                Degree of malnutrition   Definition using z-score   Definition using % of median
          Acute                 None/Mild                ≥ -2.0                     ≥ 80%
                                Moderate                 ≥ -3.0 but < -2.0          ≥ 70% but < 80%
                                Severe                   < -3.0 or oedema           < 70% or oedema
          Global Acute (GAM)    Moderate + Severe        < -2.0 and/or oedema       < 80% and/or oedema
          Severe Acute (SAM)    Severe                   < -3.0 and/or oedema       < 70% and/or oedema

      Cut-off points for chronic malnutrition (stunting)

      Chronic malnutrition based on height-for-age in z-scores and percentage of the median

      Table 7: Cut-off points for chronic malnutrition (height for age)

                                                                             Height for age z-scores   Height for age % of median
          Normal/Not stunted                                                 ≥ -2 z-scores             ≥ 90%
          Moderate chronic malnutrition                                      ≥ -3.0 but < -2.0         ≥ 80% and < 90%
          Severe chronic malnutrition/Severely stunted                       < -3 z-scores             < 80%
          Total chronic malnutrition/Total stunted (moderate + severe)       < -2 z-scores             < 90%

      Cut-off points for Underweight
  16. Underweight based on weight-for-age in z-scores and percentage of the median

      Table 8: Cut-off points for Underweight

          Description of nutritional status            Weight for age z-scores   Weight for age % of median
          Severe underweight                           < -3 z-scores             < 70%
          Moderately underweight                       ≥ -3.0 but < -2.0         ≥ 70% and < 80%
          Total underweight (moderate plus severe)     < -2 z-scores             < 80%
          Normal                                       ≥ -2 z-scores             ≥ 80%

      Using a global classification of malnutrition

      The following classifications for malnutrition have been established by WHO as levels for interpreting WFH, HFA and WFA z-scores (WHO 2002). For acute malnutrition (wasting), care needs to be taken to assess the context; a prevalence classified as "poor/medium" but which is likely to get worse will have different programmatic implications than a prevalence classified as "serious/high" but where the situation is likely to improve (e.g. impending good harvest).

      Table 9: Prevalence of malnutrition and interpretation levels

          Index            Normal/Low   Poor/Medium   Serious/High   Critical/Very high
          Wasting (GAM)    < 5%         5 - 9.9%      10 - 14.9%     > 15%
          Stunting         < 20%        20 - 29.9%    30 - 39.9%     > 40%
          Underweight      < 10%        10 - 19.9%    20 - 29.9%     > 30%

      Risk of mortality using MUAC

      For children taller/greater than 65 cm
  17. Table 10: MUAC cut-offs and risk of mortality

          Nutritional status                  MUAC
          Severe                              < 11.0 cm
          Moderate                            11.0 - 12.5 cm
          Mild malnutrition                   12.5 - 13.5 cm
          Satisfactory nutritional status     > 13.5 cm

      Note: New WHO standards recommend MUAC < 115 mm as the criterion for severe malnutrition among children of age 6 months and above.

      6.1. NCHS/WHO Reference Standards

      The reference standards most commonly used to standardize measurements were developed by the US National Center for Health Statistics (NCHS) and are recommended for international use by the World Health Organization. The reference population chosen by NCHS was a statistically valid random population of healthy infants and children. Questions have frequently been raised about the validity of the US-based NCHS reference standards for populations from other ethnic backgrounds. Available evidence suggests that until the age of approximately 10 years, children from well-nourished and healthy families throughout the world grow at approximately the same rate and attain the same height and weight as children from industrialized countries. The NCHS/WHO reference standards are available for children up to 18 years old but are most accurate when limited to use with children up to the age of 10 years. The NCHS/WHO international reference tables can be used for standardizing anthropometric
  18. data from around the world and can be found on FANTA's website at www.fantaproject.org/publications/anthropometry.shtml.

      6. Comparison of Anthropometric Data to Reference Standards

      6.2. Comparisons to the Reference Standard

      References are used to standardize a child's measurement by comparing the child's measurement with the median or average measure for children of the same age and sex. For example, if the length of a 3 month old boy is 57 cm, it would be difficult to know if that was reflective of a healthy 3 month old boy without comparison to a reference standard. The reference or median length for a population of 3 month old boys is 61.1 cm, and the simple comparison of lengths would conclude that the child was almost 4 cm shorter than could be expected.

      When describing the differences from the reference, a numeric value can be standardized to enable children of different ages and sexes to be compared. Using the example above, the boy is 4 cm shorter than the reference child but this does not take the age or the sex of the child into consideration. A 4 cm difference from the reference for a child 3 months old is not the same as a 4 cm difference from the reference for a 9 year old child, because of their relatively different body sizes.

      Taking age and sex into consideration, differences in measurements can be expressed in a number of ways:
      • standard deviation units, or Z-scores
      • percentage of the median
      • percentiles
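      For the 3 month old boy in the example, the percentage of the median (observed value divided by the reference median, times 100) can be worked out directly from the two lengths given. The short sketch below uses a hypothetical helper name. A Z-score cannot be computed from this example alone, because the standard deviation of the reference population is not stated.

```python
def percent_of_median(observed, reference_median):
    """Observed value as a percentage of the reference median."""
    return observed / reference_median * 100


observed_length_cm = 57.0    # length of the 3 month old boy in the example
reference_median_cm = 61.1   # reference median for 3 month old boys

shortfall_cm = reference_median_cm - observed_length_cm
pct = percent_of_median(observed_length_cm, reference_median_cm)
print(f"{shortfall_cm:.1f} cm below the reference median")
print(f"{pct:.1f}% of the median")
```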
  19. To standardize reporting, USAID recommends that Cooperating Sponsors calculate percentages of children below cut-offs as well as other statistics using Z-scores. If Z-scores cannot be used, percentage of the median should be used.

      6.3. Standard Deviation Units or Z-Scores

      Z-scores are more commonly used by the international nutrition community because they offer two major advantages. First, using Z-scores allows us to identify a fixed point in the distributions of different indices and across different ages. For all indices for all ages, 2.28% of the reference population lie below a cut-off of -2 Z-scores. The percent of the median does not have this characteristic. For example, because weight and height have different distributions (variances), -2 Z-scores on the weight-for-age distribution is about 80% of the median, and -2 Z-scores on the height-for-age distribution is about 90% of the median. Further, the proportion of the population identified by a particular percentage of the median varies at different ages on the same index.

      The second major advantage of using Z-scores is that useful summary statistics can be calculated from them. The approach allows the mean and standard deviation to be calculated for the Z-scores for a group of children. The Z-score application is considered the simplest way of describing the reference population and making comparisons to it. It is the statistic recommended for use when reporting results of nutritional assessments. Examples of Z-score calculations are presented in Appendix 1.

      The Z-score or standard deviation unit (SD) is defined as the difference between the value
  20. for an individual and the median value of the reference population for the same age or height, divided by the standard deviation of the reference population. This can be written in equation form as:

          Z-score (or SD-score) = ((observed value) - (median reference value)) / (standard deviation of reference population)

      6.4. Percentage of the Median and Percentiles

      The percentage of the median is defined as the ratio of a measured or observed value in the individual to the median value of the reference data for the same age or height for the specific sex, expressed as a percentage. This can be written in equation form as:

          Percent of median = (observed value / median value of reference population) x 100

      The median is the value at exactly the midpoint of the ordered measurements: half of the values lie above it and half below. If a child's measurement is exactly the same as the median of the reference population we say that they are "100% of the median." Examples of calculations for percent of median can be found in Appendix 1.

      The percentile is the rank position of an individual on a given reference distribution, stated in terms of what percentage of the group the individual equals or exceeds. Percentiles will not be presented in this guide. The distribution of Z-scores follows a normal (bell-shaped or Gaussian) distribution. The commonly used cut-offs of -3, -2, and -1 Z-scores are, respectively, the 0.13th, 2.28th, and 15.8th percentiles. The percentiles can be
  21. thought of as the percentage of children in the reference population below the equivalent cut-off. Approximately 0.13 percent of children would be expected to be below -3 Z-score in a normally distributed population.

          Z-score   Percentile
          -3        0.13
          -2        2.28
          -1        15.8

      6.5. Cut-offs

      The use of a cut-off enables the different individual measurements to be converted into prevalence statistics. Cut-offs are also used for identifying those children suffering from or at a higher risk of adverse outcomes. The children screened under such circumstances may be identified as eligible for special care.

      The most commonly used cut-off with Z-scores is -2 standard deviations, irrespective of the indicator used. This means children with a Z-score for underweight, stunting or wasting below -2 SD are considered moderately or severely malnourished. For example, a child with a Z-score for height-for-age of -2.56 is considered stunted, whereas a child with a Z-score of -1.78 is not classified as stunted.

      In the reference population, by definition, 2.28% of the children would be below -2 SD and 0.13% would be below -3 SD (a cut-off reflective of a severe condition). In some cases, the cut-off used for defining malnutrition is -1 SD (e.g. in Latin America). In the reference or healthy population, 15.8% would be below a cut-off of -1 SD. The use of -1 SD is generally discouraged as a cut-off due to the large percentage of healthy children normally
  22. falling below this cut-off. For example, the 1995 DHS survey using a -2 SD cut-off for stunting in Uganda found a 36% prevalence of stunting in under-three-year-olds. This level of stunting is about 16 times the level of the reference population.

      A comparison of cut-offs for percent of median and Z-scores illustrates the following:

          90% = -1 Z-score
          80% = -2 Z-score
          70% = -3 Z-score (approx.)
          60% = -4 Z-score (approx.)

      6.5.1. Cut-off points for MUAC for the 6 - 59 month age group

      MUAC cut-offs are somewhat arbitrary due to MUAC's lack of precision as a measure of malnutrition. A cut-off of 11.0 cm can be used for screening severely malnourished children. Those children with MUAC below 12.5 cm, with or without edema, are classified as moderate and severe.

      Global Acute Malnutrition is a term generally used in emergency settings. The global malnutrition rate refers to the percent of children 6 to 59 months with weight-for-height below -2 Z-scores or 80% of the median, or MUAC below 12.5 cm, with or without edema. This refers to all moderate and severe malnutrition combined. Children with a low weight-for-height and any child with edema are both counted in the global acute malnutrition statistic.

      6.5.2. Malnutrition Classification
  23. Systems

      The cut-off points for different malnutrition classification systems are listed below. The most widely used system is the WHO classification (Z-scores). The Road-to-Health (RTH) system is typically seen in clinic-based growth-monitoring systems. The Gomez system was widely used in the 1960s and 1970s, but is only used in a few countries now. An analysis of prevalence elicits different results from different systems. These results would not be directly comparable. The difference is especially broad at the severe malnutrition cut-off between the WHO method (Z-scores) and percent of median methods. At 60% of the median, the closest corresponding Z-score is -4. The WHO method is recommended for analysis and presentation of data (see Part 6.2).

      Mild, moderate and severe are different in each of the classification systems listed below. It is important to use the same system to analyze and present data. The RTH and Gomez classification systems typically use weight-for-age.

          System   Cut-off                   Malnutrition classification
          WHO      < -1 to > -2 Z-score      mild
                   < -2 to > -3 Z-score      moderate
                   < -3 Z-score              severe
          RTH      > 80% of median           normal
                   60% - < 80% of median     mild-to-moderate
                   < 60% of median           severe
          Gomez    > 90% of median           normal
                   75% - < 90% of median     mild
                   60% - < 75% of median     moderate
                   < 60% of median           severe
  24. Our girl therefore has moderate protein-energy malnutrition, as defined by weight-for-height z-score.
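      A classification like the one above can be sketched by computing the Z-score against the reference population and applying the -2 and -3 cut-offs for weight-for-height. The function names are hypothetical and the weight, reference median and reference standard deviation below are made-up numbers, not values from any reference table.

```python
def reference_z_score(observed, ref_median, ref_sd):
    """(observed value - median reference value) / SD of reference population."""
    return (observed - ref_median) / ref_sd


def classify_acute_malnutrition(z, oedema=False):
    """Apply the weight-for-height z-score cut-offs: below -3 or oedema is severe,
    -3 up to (but not including) -2 is moderate, -2 and above is none/mild."""
    if oedema or z < -3.0:
        return "severe"
    if z < -2.0:
        return "moderate"
    return "none/mild"


# Made-up illustration values (kg); real use needs the reference tables.
z = reference_z_score(observed=9.0, ref_median=11.5, ref_sd=1.0)
print(round(z, 2), classify_acute_malnutrition(z))
```

      Keeping the Z-score calculation separate from the cut-off logic makes it easy to reuse the same Z-score with a different classification system.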