Chapter 3

Published in: Economy & Finance, Technology
1. 1. Chapter 3: Data Description
2. 2. Parameter vs. Statistic <ul><li>A statistic is a characteristic or measure obtained by using the data values from a sample. </li></ul><ul><li>A parameter is a characteristic or measure obtained by using all the data values for a specific population. </li></ul>
3. 3. Parameter vs. Statistic <ul><li>In statistics Greek letters are used to denote parameters and Roman letters are used to denote statistics. </li></ul><ul><li>Assume that the data are obtained from samples unless otherwise specified. </li></ul>
4. 4. Measures of Central Tendency: Mean <ul><li>The mean is the sum of the values divided by the total number of values. </li></ul><ul><li>The symbol $\bar{X}$ represents the sample mean: $\bar{X} = \frac{\sum X}{n}$, </li></ul><ul><li>where $n$ represents the total number of values in the sample. </li></ul>
5. 5. Measures of Central Tendency: Mean <ul><li>For a population, the Greek letter μ is used for the mean. </li></ul><ul><li>$\mu = \frac{\sum X}{N}$, where $N$ represents the total number of values in the population. </li></ul>
6. 6. Example: Chief Justices <ul><li>The lengths of service (in years) of eight of the Chief Justices of the Supreme Court are 7, 1, 5, 35, 28, 10, 15, 22. Find the mean. </li></ul>
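A minimal Python sketch of this example, assuming the eight values are treated as a sample. The names `service_years` and `sample_mean` are illustrative and do not come from the slides.

```python
# Lengths of service (years) for the eight Chief Justices in the example.
service_years = [7, 1, 5, 35, 28, 10, 15, 22]


def sample_mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


mean_years = sample_mean(service_years)

# The raw data are whole numbers, so report one decimal place.
print(round(mean_years, 1))
```

The unrounded result is 123 / 8 = 15.375 years.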
7. 7. What Makes the Mean a Center? <ul><li>1 5 7 10 15 22 28 35 </li></ul>
8. 8. Measures of Central Tendency: Mean <ul><li>The mean should be rounded to one more decimal place than occurs in the raw data. </li></ul>
9. 9. Measures of Central Tendency: Mean <ul><li>To estimate the mean from a frequency distribution, use the class midpoint to represent each class. </li></ul><ul><li>$\bar{X} = \frac{\sum f \cdot X_m}{n}$, where $f$ is the class frequency, $X_m$ is the class midpoint, and $n$ is the sum of the frequencies. </li></ul>
10. 10. Example: Mean Age of MAT 120 Students <ul><li>Approximate the mean age for students in MAT 120. </li></ul><ul><li>Class Frequency ($f$) Midpoint ($X_m$) $f \cdot X_m$ </li></ul><ul><li>15 – 19 16 </li></ul><ul><li>20 – 24 34 </li></ul><ul><li>25 – 29 12 </li></ul><ul><li>30 – 34 5 </li></ul><ul><li>35 – 39 1 </li></ul><ul><li>40 – 44 0 </li></ul><ul><li>45 – 49 1 </li></ul>
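The grouped-data estimate can be sketched in Python as follows. This assumes each class is represented by the midpoint of its lower and upper limits; `age_classes` and `grouped_mean` are hypothetical names chosen for illustration.

```python
# (lower limit, upper limit, frequency) for each age class on the slide.
age_classes = [
    (15, 19, 16), (20, 24, 34), (25, 29, 12), (30, 34, 5),
    (35, 39, 1), (40, 44, 0), (45, 49, 1),
]


def grouped_mean(classes):
    """Approximate the mean using class midpoints weighted by frequency."""
    n = sum(freq for _, _, freq in classes)
    weighted_total = sum(freq * (low + high) / 2 for low, high, freq in classes)
    return weighted_total / n


approx_mean_age = grouped_mean(age_classes)
print(round(approx_mean_age, 1))
```

The frequencies sum to 69 and the weighted total is 1588, so the estimate is about 23.0 years.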
11. 11. Measures of Central Tendency: Median <ul><li>The median is the midpoint of the data array. To find the median, the data must be arranged in order. </li></ul>
12. 12. Example: Supreme Court Justices <ul><li>Find the median value for the lengths of service for the sample of Supreme Court Justices 7, 1, 5, 35, 28, 10, 15, 22. </li></ul>
13. 13. Example: Hospital System <ul><li>The number of hospitals for the five largest hospital systems is shown here. Find the median. 340, 75, 123, 259, 151 </li></ul>
14. 14. Measures of Central Tendency: Mode <ul><li>The value that occurs most often in a set of data is called the mode. </li></ul><ul><li>Find the mode for 5, 6, 2, 4, 2, 3, 6, 4, 1, 2. </li></ul><ul><li>A set of data that has two modes is called bimodal. </li></ul><ul><li>A data set may also have no mode. </li></ul>
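Both median examples can be checked with a short Python sketch that uses the standard library's `statistics.median`. The variable names are illustrative only.

```python
from statistics import median

# Lengths of service for the Supreme Court example (even number of values).
justices = [7, 1, 5, 35, 28, 10, 15, 22]
# Number of hospitals in the five largest systems (odd number of values).
hospitals = [340, 75, 123, 259, 151]

# The data must be ordered first; median() sorts internally, but the
# sorted lists are shown to make the midpoint visible.
print(sorted(justices))   # the middle pair is 10 and 15
print(sorted(hospitals))  # the middle value is 151

justices_median = median(justices)
hospitals_median = median(hospitals)
```

With an even count the median is halfway between the two middle values (12.5 here); with an odd count it is the middle value itself (151).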
15. 15. Example: Birth Month Data <ul><li>Find the mode for the class birth month data. </li></ul><ul><li>Birth Month Frequency </li></ul><ul><ul><li>January 4 </li></ul></ul><ul><ul><li>February 3 </li></ul></ul><ul><ul><li>March 4 </li></ul></ul><ul><ul><li>April 5 </li></ul></ul><ul><ul><li>May 6 </li></ul></ul><ul><ul><li>June 3 </li></ul></ul><ul><ul><li>July 11 </li></ul></ul><ul><ul><li>August 9 </li></ul></ul><ul><ul><li>September 7 </li></ul></ul><ul><ul><li>October 5 </li></ul></ul><ul><ul><li>November 6 </li></ul></ul><ul><ul><li>December 6 </li></ul></ul>
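A small Python sketch of both mode examples. `statistics.multimode` (Python 3.8 or later) returns every value tied for the highest count; the helper `modes_from_frequencies` is a hypothetical name for the frequency-table case.

```python
from statistics import multimode

# Raw data from the mode slide.
values = [5, 6, 2, 4, 2, 3, 6, 4, 1, 2]
value_modes = multimode(values)

# Birth month frequency table from the example.
birth_months = {
    "January": 4, "February": 3, "March": 4, "April": 5,
    "May": 6, "June": 3, "July": 11, "August": 9,
    "September": 7, "October": 5, "November": 6, "December": 6,
}


def modes_from_frequencies(table):
    """Return every category whose frequency equals the largest frequency."""
    top = max(table.values())
    return [category for category, freq in table.items() if freq == top]


month_modes = modes_from_frequencies(birth_months)
```

Returning a list keeps the bimodal case visible: two entries would mean two modes.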
16. 16. Measures of Central Tendency: Mode <ul><li>The mode for grouped data is the modal class. The modal class is the class with the largest frequency. </li></ul><ul><li>Age Distribution of MAT 120 Students </li></ul><ul><li>Classes Frequencies </li></ul><ul><li>15 – 19 16 </li></ul><ul><li>20 – 24 34 </li></ul><ul><li>25 – 29 12 </li></ul><ul><li>30 – 34 5 </li></ul><ul><li>35 – 39 1 </li></ul><ul><li>40 – 44 0 </li></ul><ul><li>45 – 49 1 </li></ul>
17. 17. Measures of Central Tendency: Midrange <ul><li>The midrange is defined as the sum of the lowest and highest values in the data set, divided by 2. The symbol MR is used for the midrange. </li></ul><ul><li>$MR = \frac{\text{lowest value} + \text{highest value}}{2}$ </li></ul>
18. 18. Example: Midrange of Ages <ul><li>Find the midrange of the student ages for MAT 120. (Recall that the lowest value was 17 and the highest value was 49.) </li></ul>
19. 19. Measures of Central Tendency: Weighted Mean <ul><li>Find the weighted mean of a variable X by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights. </li></ul><ul><li>$\bar{X} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w}$ </li></ul>
20. 20. Example: Weighted Mean <ul><li>Example: An instructor weights grades as follows: exams, 20%; term paper, 30%; final exam, 50%. A student had grades of 83, 72, and 90, respectively, for exams, term paper, and final exam. Find the student’s final average. </li></ul>
21. 21. Example: Weighted Mean <ul><li>Example: A student has the following grades for the fall term: MAT 120 (3 hrs), A; BIO 210 (4 hrs), B; HIS 201 (3 hrs), C; SOC 101 (3 hrs), A; CPT 170 (3 hrs), A. Calculate the student’s GPA for the fall term. </li></ul>
22. 22. Example: Weighted Mean <ul><li>Example: In a dental survey of third grade students, this distribution was obtained for the number of cavities found. Find the average number of cavities. </li></ul><ul><li>Number of Students Number of Cavities </li></ul><ul><li>12 0 </li></ul><ul><li>8 1 </li></ul><ul><li>5 2 </li></ul><ul><li>5 3 </li></ul>
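The three weighted-mean examples can be sketched with one helper. Two points are assumptions rather than facts from the slides: the GPA example uses the common 4-point scale (A = 4, B = 3, C = 2), and the name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Exams 20%, term paper 30%, final exam 50%.
final_average = weighted_mean([83, 72, 90], [20, 30, 50])

# Assumed grade points: A = 4, B = 3, C = 2. Weights are credit hours.
grade_points = [4, 3, 2, 4, 4]   # MAT 120, BIO 210, HIS 201, SOC 101, CPT 170
credit_hours = [3, 4, 3, 3, 3]
gpa = weighted_mean(grade_points, credit_hours)

# Cavities per student, weighted by the number of students.
cavities = [0, 1, 2, 3]
students = [12, 8, 5, 5]
average_cavities = weighted_mean(cavities, students)
```

Under these assumptions the results are a final average of 83.2, a GPA of 3.375, and 1.1 cavities per student.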
23. 24. Measures of Variation <ul><li>First Valley Bank: 6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7 </li></ul><ul><li>Bank of the USA: 4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0 </li></ul><ul><li>Table to complete for First Valley and Bank of the USA: Mean, Median, Mode, Midrange </li></ul>
24. 25. Back-to-Back Stem & Leaf Plot <ul><li>First Valley Bank: 6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7 </li></ul><ul><li>Bank of the USA: 4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0 </li></ul>
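A Python sketch for filling in the four measures of center for the two banks. The helper `center_summary` is a hypothetical name, and results are rounded to two decimals to hide floating-point noise.

```python
from statistics import mean, median, multimode

first_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_usa = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def center_summary(values):
    """Mean, median, mode(s), and midrange, rounded to two decimals."""
    return {
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
        "mode": multimode(values),
        "midrange": round((min(values) + max(values)) / 2, 2),
    }


first_valley_summary = center_summary(first_valley)
bank_of_usa_summary = center_summary(bank_of_usa)
print(first_valley_summary)
print(bank_of_usa_summary)
```

Both banks produce the same four values, so the measures of center alone cannot distinguish them; a measure of variation is needed to see how differently the two data sets are spread.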
25. 26. Range <ul><li>The range is the highest value minus the lowest value. The symbol R is used for the range. </li></ul><ul><li>$R = \text{highest value} - \text{lowest value}$ </li></ul><ul><li>The range is affected by extremely high or low values. </li></ul><ul><li>The range is easy to compute. </li></ul><ul><li>Example: Determine the range for the First Valley Bank and the Bank of the USA. </li></ul><ul><li>Range </li></ul><ul><ul><li>First Valley _______ </li></ul></ul><ul><ul><li>Bank of the USA _______ </li></ul></ul>
26. 27. Measures of Variation <ul><li>Student A: 75, 75, 75, 75 (mean 75; range ___; standard deviation ___) </li></ul><ul><li>Student B: 100, 100, 100, 0 (mean 75; range ___; standard deviation ___) </li></ul><ul><li>Student C: 73, 74, 76, 77 (mean 75; range ___; standard deviation ___) </li></ul><ul><li>Student D: 60, 70, 80, 90 (mean 75; range ___; standard deviation ___) </li></ul>
27. 28. Deriving the Variance and Standard Deviation Formulas
28. 29. Population Variance & Standard Deviation <ul><li>The variance is the average of the squares of the distance each value is from the mean. The symbol for the population variance is $\sigma^2$. The formula for the population variance is </li></ul><ul><li>$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$ </li></ul><ul><li>The standard deviation is the square root of the variance. The symbol for the population standard deviation is $\sigma$. The formula for the population standard deviation is </li></ul><ul><li>$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$. </li></ul>
29. 30. Sample Variance & Standard Deviation <ul><li>The formula for the sample variance, denoted by $s^2$, is </li></ul><ul><li>$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$ </li></ul><ul><li>The standard deviation for a sample is </li></ul><ul><li>$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$ </li></ul>
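A brief Python sketch of the range example, using the two bank data sets listed on the Measures of Variation slide. `data_range` is an illustrative name.

```python
first_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_usa = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def data_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)


# Rounded to one decimal to hide floating-point noise.
first_valley_range = round(data_range(first_valley), 1)
bank_of_usa_range = round(data_range(bank_of_usa), 1)
```

First Valley comes out at 1.2 and Bank of the USA at 5.8, so the second data set is far more spread out.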
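The population and sample definitions can be written out directly. This sketch assumes the standard formulas (divide by N for a population, by n - 1 for a sample) and applies them to the four students' scores from the Measures of Variation slide; the function names are illustrative.

```python
import math


def population_variance(values):
    """Average squared distance from the mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared distances from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


scores = {
    "A": [75, 75, 75, 75],
    "B": [100, 100, 100, 0],
    "C": [73, 74, 76, 77],
    "D": [60, 70, 80, 90],
}

# The standard deviation is the square root of the variance.
sample_sd = {name: math.sqrt(sample_variance(v)) for name, v in scores.items()}
for name, sd in sample_sd.items():
    print(name, round(sd, 2))
```

All four students have a mean of 75, yet the sample standard deviations differ widely (0 for Student A, 50 for Student B), which is the point of the comparison.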
30. 31. Example <ul><li>Use your calculator to determine the standard deviation and variance for the First Valley Bank and the Bank of the USA. </li></ul><ul><li>Variance Standard Deviation </li></ul><ul><li>First Valley _______ _________ </li></ul><ul><li>Bank of the USA _______ _________ </li></ul>
31. 32. Finding the Standard Deviation From a Frequency Distribution <ul><li>Example: Use your calculator to approximate the variance and standard deviation for the age of students in MAT 120. </li></ul><ul><li>Class Frequency ($f$) Midpoint ($X_m$) _______ </li></ul><ul><li>15 – 19 16 </li></ul><ul><li>20 – 24 34 </li></ul><ul><li>25 – 29 12 </li></ul><ul><li>30 – 34 5 </li></ul><ul><li>35 – 39 1 </li></ul><ul><li>40 – 44 0 </li></ul><ul><li>45 – 49 1 </li></ul>
32. 33. Coefficient of Variation <ul><li>The coefficient of variation is the standard deviation divided by the mean, usually expressed as a percentage. It allows one to compare standard deviations when the units are different. </li></ul><ul><li>For samples: $\text{CVar} = \frac{s}{\bar{X}} \cdot 100\%$. For populations: $\text{CVar} = \frac{\sigma}{\mu} \cdot 100\%$. </li></ul>
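In place of a calculator, the standard library's `statistics.variance` and `statistics.stdev` can be used. This sketch assumes the bank data are samples, following the deck's earlier convention that data come from samples unless stated otherwise.

```python
from statistics import stdev, variance

first_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_usa = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

# variance() and stdev() use the sample formulas (divide by n - 1).
first_valley_var = variance(first_valley)
first_valley_sd = stdev(first_valley)
bank_of_usa_var = variance(bank_of_usa)
bank_of_usa_sd = stdev(bank_of_usa)

print(round(first_valley_var, 2), round(first_valley_sd, 2))
print(round(bank_of_usa_var, 2), round(bank_of_usa_sd, 2))
```

For population data, `statistics.pvariance` and `statistics.pstdev` divide by N instead.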
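A sketch of the grouped-data calculation in Python. It assumes the usual shortcut formula for a sample, $s^2 = \frac{n \sum f X_m^2 - (\sum f X_m)^2}{n(n-1)}$, which the slide does not show explicitly; `grouped_sample_variance` is a hypothetical name.

```python
import math

# (lower limit, upper limit, frequency) for each age class on the slide.
age_classes = [
    (15, 19, 16), (20, 24, 34), (25, 29, 12), (30, 34, 5),
    (35, 39, 1), (40, 44, 0), (45, 49, 1),
]


def grouped_sample_variance(classes):
    """Approximate the sample variance using class midpoints."""
    n = sum(freq for _, _, freq in classes)
    sum_fx = sum(freq * (low + high) / 2 for low, high, freq in classes)
    sum_fx2 = sum(freq * ((low + high) / 2) ** 2 for low, high, freq in classes)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


age_variance = grouped_sample_variance(age_classes)
age_sd = math.sqrt(age_variance)
print(round(age_variance, 1), round(age_sd, 1))
```

Because every age in a class is replaced by the class midpoint, the result is an approximation of the variance of the raw ages.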
33. 34. Example <ul><li>The average score on an English final exam was 85, with a standard deviation of 5. The average score on a history final exam was 110 with a standard deviation of 8. Which class was more variable? </li></ul>
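A minimal Python sketch of the comparison, assuming the coefficient of variation is expressed as a percentage. The function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


english_cv = coefficient_of_variation(5, 85)
history_cv = coefficient_of_variation(8, 110)

# The class with the larger coefficient of variation is more variable.
more_variable = "history" if history_cv > english_cv else "English"
print(round(english_cv, 1), round(history_cv, 1), more_variable)
```

The English exam comes out near 5.9% and the history exam near 7.3%, so the history class was more variable.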
34. 35. Chebyshev’s Theorem <ul><li>The proportion of values from a data set that will fall within k standard deviations of the mean will be at least </li></ul><ul><li>$1 - \frac{1}{k^2}$, </li></ul><ul><li>where k is a number greater than 1. </li></ul>
35. 36. Empirical Rule <ul><li>For data that are bell-shaped, the following statements make up the Empirical Rule. </li></ul><ul><li>Approximately 68% of the data values will fall within 1 standard deviation of the mean. </li></ul><ul><li>Approximately 95% of the data values will fall within 2 standard deviations of the mean. </li></ul><ul><li>Approximately 99.7% of the data values will fall within 3 standard deviations of the mean. </li></ul>
36. 38. Empirical Rule Example <ul><li>A study of the number of paid sick days taken per year by employees results in a mound-shaped distribution with a mean of 8.7 and a standard deviation of 3. According to the empirical rule, what percentage of employees were taking between 2.7 and 14.7 paid sick days per year? </li></ul>
37. 39. Example <ul><li>A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped. </li></ul><ul><li>About 95% of the loaves will have weights that lie within what interval? </li></ul>
38. 40. Example <ul><li>A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped. </li></ul><ul><li>Nearly all the loaves will have weights that lie within what interval? </li></ul>
39. 41. Example <ul><li>A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped. </li></ul><ul><li>Approximately what percentage of loaves will weigh more than 28.8 ounces? </li></ul>
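The three bakery questions can be sketched in Python under the Empirical Rule percentages (68%, 95%, 99.7%). The names `EMPIRICAL_PERCENT` and `empirical_interval` are illustrative, and "nearly all" is read here as the 3-standard-deviation interval.

```python
# Approximate share of bell-shaped data within k standard deviations.
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_interval(mean, sd, k):
    """Return (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


mean_weight, sd_weight = 28, 0.8

# About 95% of loaves: within 2 standard deviations.
low_95, high_95 = empirical_interval(mean_weight, sd_weight, 2)

# Nearly all loaves: within 3 standard deviations.
low_all, high_all = empirical_interval(mean_weight, sd_weight, 3)

# 28.8 oz is 1 standard deviation above the mean. About 68% of loaves lie
# within 1 standard deviation, and by symmetry half of the rest lie above.
percent_above_28_8 = (100 - EMPIRICAL_PERCENT[1]) / 2
```

This gives roughly 26.4 to 29.6 ounces for 95% of loaves, 25.6 to 30.4 ounces for nearly all loaves, and about 16% of loaves above 28.8 ounces.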
40. 42. Example <ul><li>A taxi company has found that its fares average \$7.80 with a standard deviation of \$1.40. What can we say about the percentage of fares that are between \$5.00 and \$10.60 if </li></ul><ul><li>A. The distribution of fares is mound shaped? </li></ul>
41. 43. <ul><li>A taxi company has found that its fares average \$7.80 with a standard deviation of \$1.40. What can we say about the percentage of fares that are between \$5.00 and \$10.60 if </li></ul><ul><li>B. The distribution of fares is not mound shaped? </li></ul>
42. 44. <ul><li>Example: A pharmaceutical company manufactures capsules that contain an average of 507 grams of vitamin C. The standard deviation is 3 grams. At least 96 percent of the capsules will contain what amount of vitamin C? </li></ul>
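When the distribution is not known to be mound shaped, Chebyshev's theorem applies. The sketch below assumes the bound 1 - 1/k² for k > 1 and works the taxi question (part B) and the capsule question; both function names are hypothetical.

```python
import math


def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


def chebyshev_k_for(proportion):
    """Solve 1 - 1/k^2 = proportion for k."""
    return 1 / math.sqrt(1 - proportion)


# Taxi fares: $5.00 and $10.60 are each $2.80 from the mean of $7.80.
taxi_k = (10.60 - 7.80) / 1.40
taxi_min_proportion = chebyshev_min_proportion(taxi_k)

# Capsules: find k so that at least 96% of values are covered.
capsule_k = chebyshev_k_for(0.96)
capsule_low = 507 - capsule_k * 3
capsule_high = 507 + capsule_k * 3
```

For the taxi fares k is 2, so at least 75% of fares fall in the interval, compared with about 95% if the distribution is mound shaped. For the capsules k is 5, giving an interval of 492 to 522 in the units used on the slide.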
43. 45. Measures of Position <ul><li>A z-score or standard score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation. The formula is </li></ul><ul><li>$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$ </li></ul><ul><li>For samples: $z = \frac{X - \bar{X}}{s}$. For populations: $z = \frac{X - \mu}{\sigma}$. </li></ul>
44. 46. <ul><li>The z-score represents the number of standard deviations that a data value falls above or below the mean. </li></ul>
45. 47. Example <ul><li>A student scores 60 on a mathematics test that has a mean of 54 and a standard deviation of 3, and she scores 80 on a history test with a mean of 75 and a standard deviation of 2. On which test did she perform better? </li></ul>
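A short Python sketch of the comparison; `z_score` is an illustrative name.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above or below the mean."""
    return (value - mean) / sd


math_z = z_score(60, 54, 3)
history_z = z_score(80, 75, 2)

# The higher z-score marks the better relative performance.
better_test = "history" if history_z > math_z else "mathematics"
print(math_z, history_z, better_test)
```

The mathematics score is 2 standard deviations above its mean and the history score is 2.5 above, so her relative standing is higher on the history test.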
46. 48. Percentiles <ul><li>Percentiles divide the data set into 100 equal groups. </li></ul><ul><li>The percentile corresponding to a given value X is computed using the following formula </li></ul><ul><li>$\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \cdot 100\%$ </li></ul>
47. 49. Find a Data Value Corresponding to a Given Percentile <ul><li>Arrange the data in order from lowest to highest. </li></ul><ul><li>Substitute into the formula $c = \frac{n \cdot p}{100}$, where $n$ is the total number of values and $p$ is the percentile. </li></ul><ul><li>If c is not a whole number, round up to the next whole number. Starting at the lowest value, count over to the number that corresponds to the rounded-up value. </li></ul><ul><li>If c is a whole number, use the value halfway between the c th and (c + 1) th values when counting up from the lowest value. </li></ul>
48. 50. Example <ul><li>The data given are weights in pounds: 78, 82, 86, 88, 92, 97. </li></ul><ul><li>Find the percentile rank of each weight in the data set. </li></ul><ul><li>What value corresponds to the 30th percentile? </li></ul>
49. 51. Example <ul><li>The data given are test scores: 12, 28, 35, 42, 47, 49, 50. </li></ul><ul><li>a. Find the percentile rank for each test score in the data set. </li></ul><ul><li>b. What value corresponds to the 60th percentile? </li></ul>
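Both percentile examples can be sketched in Python. The rank formula used here, (number of values below X + 0.5) / n × 100, is an assumption, since the slide leaves the formula blank; the lookup follows the c = n·p/100 procedure from the preceding slide. Function names are illustrative, and `p` is assumed to be a whole number strictly between 0 and 100.

```python
def percentile_rank(data, x):
    """Assumed formula: (count of values below x + 0.5) / n * 100."""
    below = sum(1 for value in data if value < x)
    return (below + 0.5) / len(data) * 100


def value_at_percentile(data, p):
    """Find the data value at the p-th percentile using c = n * p / 100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)                 # lowest to highest
    c, remainder = divmod(len(ordered) * p, 100)
    if remainder == 0:
        # c is whole: halfway between the c-th and (c + 1)-th values.
        return (ordered[c - 1] + ordered[c]) / 2
    # c is not whole: round up and take that position (1-based).
    return ordered[c]


weights = [78, 82, 86, 88, 92, 97]
scores = [12, 28, 35, 42, 47, 49, 50]

weight_ranks = [round(percentile_rank(weights, w)) for w in weights]
score_ranks = [round(percentile_rank(scores, s)) for s in scores]
weight_p30 = value_at_percentile(weights, 30)
score_p60 = value_at_percentile(scores, 60)
```

For the weights, c = 6 × 30 / 100 = 1.8 rounds up to 2, so the 30th percentile is the second value, 82. For the scores, c = 7 × 60 / 100 = 4.2 rounds up to 5, so the 60th percentile is the fifth value, 47.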