


- 1. Descriptive Statistics
  - 2-1 Overview
  - 2-2 Summarizing Data with Frequency Tables
  - 2-3 Pictures of Data
  - 2-4 Measures of Center
  - 2-5 Measures of Variation
  - 2-6 Measures of Position
  - 2-7 Exploratory Data Analysis (EDA)
- 2. Section 2-1, Overview
  - Descriptive statistics summarize or describe the important characteristics of a known set of population data.
  - Inferential statistics use sample data to make inferences (or generalizations) about a population.
- 3. Important Characteristics of Data
  1. Center: a representative or average value that indicates where the middle of the data set is located.
  2. Variation: a measure of the amount that the values vary among themselves.
  3. Distribution: the nature or shape of the distribution of the data (such as bell-shaped, uniform, or skewed).
  4. Outliers: sample values that lie very far away from the vast majority of other sample values.
  5. Time: changing characteristics of the data over time.
- 4. Section 2-2, Summarizing Data with Frequency Tables. A frequency table lists classes (or categories) of values, along with frequencies (or counts) of the number of values that fall into each class.
- 5. Table 2-1: Qwerty Keyboard Word Ratings (52 values): 2 2 5 1 2 6 3 3 4 2 4 0 5 7 7 5 6 6 8 10 7 2 2 10 5 8 2 5 4 2 6 2 6 1 7 2 7 2 3 8 1 5 2 5 2 14 2 2 6 3 1 7
- 6. Table 2-3: Frequency Table of Qwerty Word Ratings

  | Rating | Frequency |
  |--------|-----------|
  | 0 - 2 | 20 |
  | 3 - 5 | 14 |
  | 6 - 8 | 15 |
  | 9 - 11 | 2 |
  | 12 - 14 | 1 |
- 7. Frequency Table Definitions
- 8. Lower class limits are the smallest numbers that can actually belong to the different classes. In the ratings table (classes 0 - 2, 3 - 5, 6 - 8, 9 - 11, 12 - 14) they are 0, 3, 6, 9, and 12.
- 9. Upper class limits are the largest numbers that can actually belong to the different classes. In the ratings table they are 2, 5, 8, 11, and 14.
- 10. Class boundaries are the numbers used to separate classes, but without the gaps created by class limits.
- 11. Class boundaries (the numbers separating classes) for the ratings table: -0.5, 2.5, 5.5, 8.5, 11.5, 14.5.
- 12. Class boundaries for the ratings table, shown against the classes: -0.5 | 0 - 2 | 2.5 | 3 - 5 | 5.5 | 6 - 8 | 8.5 | 9 - 11 | 11.5 | 12 - 14 | 14.5.
- 13. Class midpoints are the midpoints of the classes.
- 14. Class midpoints for the ratings table: 1 (class 0 - 2), 4 (class 3 - 5), 7 (class 6 - 8), 10 (class 9 - 11), and 13 (class 12 - 14).
- 15. Class width is the difference between two consecutive lower class limits or two consecutive class boundaries.
- 16. Class width for the ratings table: every class has width 3 (for example, 3 - 0 = 3, or 2.5 - (-0.5) = 3).
- 17. Guidelines for Frequency Tables
  1. Be sure that the classes are mutually exclusive.
  2. Include all classes, even if the frequency is zero.
  3. Try to use the same width for all classes.
  4. Select convenient numbers for class limits.
  5. Use between 5 and 20 classes.
  6. The sum of the class frequencies must equal the number of original data values.
- 18. Constructing a Frequency Table
  1. Decide on the number of classes.
  2. Determine the class width by dividing the range by the number of classes and rounding up, where range = highest score - lowest score: $\text{class width} \approx \text{round up of } \dfrac{\text{range}}{\text{number of classes}}$
  3. Select for the first lower limit either the lowest score or a convenient value slightly less than the lowest score.
  4. Add the class width to the starting point to get the second lower class limit, add the width to the second lower limit to get the third, and so on.
  5. List the lower class limits in a vertical column and enter the upper class limits.
  6. Represent each score by a tally mark in the appropriate class. Total the tally marks to find the frequency for each class.
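
The construction steps can be sketched in Python. This is only an illustration, assuming integer-valued data and the lowest score as the first lower limit; the function name `frequency_table` and its parameters are hypothetical, not part of the slides.

```python
import math
from collections import Counter

# Qwerty word ratings from Table 2-1 (52 values).
ratings = [
    2, 2, 5, 1, 2, 6, 3, 3, 4, 2, 4, 0, 5,
    7, 7, 5, 6, 6, 8, 10, 7, 2, 2, 10, 5, 8,
    2, 5, 4, 2, 6, 2, 6, 1, 7, 2, 7, 2, 3,
    8, 1, 5, 2, 5, 2, 14, 2, 2, 6, 3, 1, 7,
]


def frequency_table(data, num_classes):
    """Return [(lower_limit, upper_limit, frequency), ...] for integer data."""
    low, high = min(data), max(data)
    # Step 2: class width = range / number of classes, rounded up.
    width = math.ceil((high - low) / num_classes)
    # Assumption: widen by one if the last class would miss the highest score.
    if low + width * num_classes - 1 < high:
        width += 1
    counts = Counter(data)
    table = []
    for k in range(num_classes):
        # Steps 3-5: lower limits start at the lowest score and step by the width.
        lower = low + k * width
        upper = lower + width - 1
        # Step 6: tally the scores that fall in this class.
        freq = sum(counts[v] for v in range(lower, upper + 1))
        table.append((lower, upper, freq))
    return table


table = frequency_table(ratings, 5)
for lower, upper, freq in table:
    print(f"{lower:>2} - {upper:<2}  {freq}")
```

With five classes the range 14 gives a width of 3, which matches the classes used in the ratings table.
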
- 19. 19
- 20. Relative Frequency Table: $\text{relative frequency} = \dfrac{\text{class frequency}}{\text{sum of all frequencies}}$
- 21. Relative Frequency Table for the ratings (total frequency = 52; 20/52 = 38.5%, 14/52 = 26.9%, etc.)

  | Rating | Frequency | Relative Frequency |
  |--------|-----------|--------------------|
  | 0 - 2 | 20 | 38.5% |
  | 3 - 5 | 14 | 26.9% |
  | 6 - 8 | 15 | 28.8% |
  | 9 - 11 | 2 | 3.8% |
  | 12 - 14 | 1 | 1.9% |
- 22. Cumulative Frequency Table

  | Rating | Cumulative Frequency |
  |--------|----------------------|
  | Less than 3 | 20 |
  | Less than 6 | 34 |
  | Less than 9 | 49 |
  | Less than 12 | 51 |
  | Less than 15 | 52 |
- 23. Frequency Tables: the frequency, relative frequency, and cumulative frequency tables for the ratings, side by side.

  | Rating | Frequency | Relative Frequency | Rating | Cumulative Frequency |
  |--------|-----------|--------------------|--------|----------------------|
  | 0 - 2 | 20 | 38.5% | Less than 3 | 20 |
  | 3 - 5 | 14 | 26.9% | Less than 6 | 34 |
  | 6 - 8 | 15 | 28.8% | Less than 9 | 49 |
  | 9 - 11 | 2 | 3.8% | Less than 12 | 51 |
  | 12 - 14 | 1 | 1.9% | Less than 15 | 52 |
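
As a rough check of the relative and cumulative columns, the sketch below recomputes them from the class frequencies. The variable names are illustrative only.

```python
from itertools import accumulate

classes = ["0 - 2", "3 - 5", "6 - 8", "9 - 11", "12 - 14"]
frequencies = [20, 14, 15, 2, 1]

total = sum(frequencies)
# relative frequency = class frequency / sum of all frequencies
relative = [f / total for f in frequencies]
# cumulative frequency = running total of the class frequencies
cumulative = list(accumulate(frequencies))

for label, f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{label:>7}  {f:>2}  {100 * r:5.1f}%  {c:>2}")
```
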
- 24. Measures of Center: a value at the center or middle of a data set.
- 25. Definitions. Mean (arithmetic mean, the "average"): the number obtained by adding the values and dividing the total by the number of values.
- 26. Notation
  - $\Sigma$ denotes the addition of a set of values.
  - $x$ is the variable usually used to represent the individual data values.
  - $n$ represents the number of data values in a sample.
  - $N$ represents the number of data values in a population.
- 27. Notation: $\bar{x}$ is pronounced 'x-bar' and denotes the mean of a set of sample values: $\bar{x} = \dfrac{\Sigma x}{n}$
- 28. Notation: $\bar{x} = \dfrac{\Sigma x}{n}$ is the mean of a set of sample values; $\mu$ is pronounced 'mu' and denotes the mean of all values in a population: $\mu = \dfrac{\Sigma x}{N}$. Calculators can calculate the mean of data.
- 29. Definitions. Median: the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude.
- 30. The median is often denoted by $\tilde{x}$ (pronounced 'x-tilde').
- 31. The median is not affected by an extreme value.
- 32. Even number of values: 6.72, 3.46, 3.60, 6.44 in order is 3.46, 3.60, 6.44, 6.72. There is no exact middle; it is shared by two numbers, so the median is $\dfrac{3.60 + 6.44}{2} = 5.02$.
- 33. Odd number of values: 6.72, 3.46, 3.60, 6.44, 26.70 in order is 3.46, 3.60, 6.44, 6.72, 26.70. The exact middle value is 6.44, so the median is 6.44. Without the value 26.70 (even number of values), the median is 5.02.
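
The two median examples can be reproduced with Python's standard `statistics` module. The comparison with the mean is added here only to illustrate why the median is described as resistant to an extreme value; the variable names are illustrative.

```python
import statistics

even_values = [6.72, 3.46, 3.60, 6.44]   # four values: no exact middle
odd_values = even_values + [26.70]       # adding an extreme fifth value

# statistics.median sorts the data and averages the two middle values
# when the number of values is even.
median_even = statistics.median(even_values)
median_odd = statistics.median(odd_values)

mean_even = statistics.mean(even_values)
mean_odd = statistics.mean(odd_values)

print(f"median: {median_even:.2f} -> {median_odd:.2f}")
print(f"mean:   {mean_even:.2f} -> {mean_odd:.2f}")
```

The extreme value 26.70 moves the median only to the next data value, while the mean shifts by several units.
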
- 34. Definitions. Mode: the score that occurs most frequently.
  - A data set can be bimodal, multimodal, or have no mode.
  - Denoted by M.
  - The only measure of central tendency that can be used with nominal data.
- 35. Examples
  - a. 5 5 5 3 1 5 1 4 3 5: the mode is 5.
  - b. 1 2 2 2 3 4 5 6 6 6 7 9: bimodal, with modes 2 and 6.
  - c. 1 2 3 6 7 8 9 10: no mode.
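
A small sketch of how the three examples could be classified in code. The helper `modes` is hypothetical; it assumes a non-empty data set and follows the convention used above that a data set in which no value repeats has no mode.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs exactly once: report "no mode".
        return []
    return sorted(v for v, c in counts.items() if c == top)


example_a = [5, 5, 5, 3, 1, 5, 1, 4, 3, 5]
example_b = [1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 7, 9]
example_c = [1, 2, 3, 6, 7, 8, 9, 10]

for name, data in [("a", example_a), ("b", example_b), ("c", example_c)]:
    print(name, modes(data) or "no mode")
```
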
- 36. Definitions. Midrange: the value midway between the highest and lowest values in the original data set.
- 37. $\text{Midrange} = \dfrac{\text{highest score} + \text{lowest score}}{2}$
- 38. Definitions
  - Symmetric: data are symmetric if the left half of the histogram is roughly a mirror image of the right half.
  - Skewed: data are skewed if they are not symmetric and extend more to one side than the other.
- 39. Skewness. Symmetric: mode = mean = median.
- 40. Skewness. Symmetric: mode = mean = median. Skewed left (negatively): the mean, median, and mode no longer coincide; typically the mean lies to the left of the median, which lies to the left of the mode.
- 41. Skewness. Symmetric: mode = mean = median. Skewed left (negatively): typically mean < median < mode. Skewed right (positively): typically mode < median < mean.
- 42. Waiting Times of Bank Customers at Different Banks (in minutes)

  | Jefferson Valley Bank | Bank of Providence |
  |-----------------------|--------------------|
  | 6.5 | 4.2 |
  | 6.6 | 5.4 |
  | 6.7 | 5.8 |
  | 6.8 | 6.2 |
  | 7.1 | 6.7 |
  | 7.3 | 7.7 |
  | 7.4 | 7.7 |
  | 7.7 | 8.5 |
  | 7.7 | 9.3 |
  | 7.7 | 10.0 |
- 43. Waiting Times of Bank Customers at Different Banks (in minutes): measures of center for the data above

  | Bank | Mean | Median | Mode | Midrange |
  |------|------|--------|------|----------|
  | Jefferson Valley Bank | 7.15 | 7.20 | 7.7 | 7.10 |
  | Bank of Providence | 7.15 | 7.20 | 7.7 | 7.10 |
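
The four measures of center in the table can be recomputed as follows. This is a sketch using the standard `statistics` module; the helper name `center_summary` is illustrative, and results are rounded to two decimals to match the table.

```python
import statistics

jefferson = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def center_summary(data):
    """Mean, median, mode, and midrange, rounded to two decimals."""
    return {
        "mean": round(statistics.mean(data), 2),
        "median": round(statistics.median(data), 2),
        "mode": statistics.mode(data),
        # midrange = (highest score + lowest score) / 2
        "midrange": round((max(data) + min(data)) / 2, 2),
    }


print("Jefferson Valley Bank:", center_summary(jefferson))
print("Bank of Providence:   ", center_summary(providence))
```

Both banks produce identical measures of center even though the individual waiting times are spread very differently, which is the motivation for measures of variation.
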
- 44. Dotplots of Waiting Times (figure)
- 45. Measures of Variation
- 46. Measures of Variation. Range = highest value - lowest value.
- 47. Measures of Variation. Standard deviation: a measure of variation of the scores about the mean (average deviation from the mean).
- 48. Sample Standard Deviation Formula: $s = \sqrt{\dfrac{\Sigma (x - \bar{x})^2}{n - 1}}$. Calculators can compute the sample standard deviation of data.
- 49. Sample Standard Deviation Shortcut Formula: $s = \sqrt{\dfrac{n(\Sigma x^2) - (\Sigma x)^2}{n(n - 1)}}$. Calculators can compute the sample standard deviation of data.
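
To see that the defining formula and the shortcut formula agree, the sketch below applies both to the bank waiting times and compares them with `statistics.stdev`. The function names are illustrative.

```python
import math
import statistics

jefferson = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def sample_sd(data):
    """s = sqrt( sum((x - mean)^2) / (n - 1) )"""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def sample_sd_shortcut(data):
    """s = sqrt( (n * sum(x^2) - (sum x)^2) / (n * (n - 1)) )"""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


for name, data in [("Jefferson Valley", jefferson), ("Providence", providence)]:
    print(f"{name}: {sample_sd(data):.2f} "
          f"(shortcut {sample_sd_shortcut(data):.2f}, "
          f"library {statistics.stdev(data):.2f})")
```

The two banks share the same mean, but the standard deviation for the Bank of Providence comes out several times larger.
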
- 50. Mean Absolute Deviation Formula: $\dfrac{\Sigma |x - \bar{x}|}{n}$
- 51. Population Standard Deviation: $\sigma = \sqrt{\dfrac{\Sigma (x - \mu)^2}{N}}$. Calculators can compute the population standard deviation of data.
- 52. Measures of Variation. Variance: the standard deviation squared.
- 53. Measures of Variation. Variance: the standard deviation squared. Notation: $s^2$ and $\sigma^2$ (use the square key on a calculator).
- 54. Variance. Sample variance: $s^2 = \dfrac{\Sigma (x - \bar{x})^2}{n - 1}$. Population variance: $\sigma^2 = \dfrac{\Sigma (x - \mu)^2}{N}$.
- 55. Estimation of Standard Deviation: Range Rule of Thumb. Range $\approx 4s$, or equivalently the minimum usual value is $\bar{x} - 2s$ and the maximum usual value is $\bar{x} + 2s$.
- 56. Estimation of Standard Deviation: Range Rule of Thumb. Range $\approx 4s$, so $s \approx \dfrac{\text{Range}}{4} = \dfrac{\text{highest value} - \text{lowest value}}{4}$.
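
The range rule of thumb is only a rough estimate. As an illustration, the sketch below compares it with the computed sample standard deviation for the bank waiting times; `range_rule_sd` is a hypothetical helper name.

```python
import statistics

jefferson = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def range_rule_sd(data):
    """Estimate s as (highest value - lowest value) / 4."""
    return (max(data) - min(data)) / 4


for name, data in [("Jefferson Valley", jefferson), ("Providence", providence)]:
    estimate = range_rule_sd(data)
    actual = statistics.stdev(data)
    print(f"{name}: range/4 = {estimate:.2f}, s = {actual:.2f}")
```

For these small samples the estimate lands in the right neighborhood but is noticeably lower than the computed value, which is consistent with treating it as a quick approximation only.
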
- 57. The Empirical Rule (applies to bell-shaped distributions), Figure 2-15: a bell-shaped curve centered at $\bar{x}$.
- 58. The Empirical Rule, Figure 2-15: about 68% of values lie within 1 standard deviation of the mean, between $\bar{x} - s$ and $\bar{x} + s$ (34% on each side of the mean).
- 59. The Empirical Rule, Figure 2-15: about 95% of values lie within 2 standard deviations of the mean, between $\bar{x} - 2s$ and $\bar{x} + 2s$ (34% plus 13.5% on each side).
- 60. The Empirical Rule, Figure 2-15: about 99.7% of data are within 3 standard deviations of the mean, between $\bar{x} - 3s$ and $\bar{x} + 3s$. From the center outward, the figure labels the bands on each side 34%, 13.5%, 2.4%, and 0.1%.
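
The 68%, 95%, and 99.7% figures can be checked against an exact normal distribution. This sketch assumes a standard normal model and uses `statistics.NormalDist` (Python 3.8+); the helper `within` is an illustrative name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * within(k):.1f}%")
```
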
- 61. Chebyshev's Theorem
  - Applies to distributions of any shape.
  - The proportion (or fraction) of any set of data lying within $K$ standard deviations of the mean is always at least $1 - \dfrac{1}{K^2}$, where $K$ is any positive number greater than 1.
  - At least 3/4 (75%) of all values lie within 2 standard deviations of the mean.
  - At least 8/9 (89%) of all values lie within 3 standard deviations of the mean.
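
A brief sketch of the bound, with an empirical check on the Bank of Providence waiting times used earlier in the slides. The function names are illustrative, and the check uses the sample standard deviation as an assumption.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


def proportion_within(data, k):
    """Observed proportion of values within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    inside = [x for x in data if abs(x - mean) <= k * s]
    return len(inside) / len(data)


providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

for k in (2, 3):
    print(f"K = {k}: bound {chebyshev_lower_bound(k):.3f}, "
          f"observed {proportion_within(providence, k):.3f}")
```
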
- 62. Measures of Variation Summary: for typical data sets, it is unusual for a score to differ from the mean by more than 2 or 3 standard deviations.
- 63. Measures of Position. z score (or standard score): the number of standard deviations that a given value $x$ is above or below the mean.
- 64. Measures of Position, z score. Sample: $z = \dfrac{x - \bar{x}}{s}$. Population: $z = \dfrac{x - \mu}{\sigma}$.
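
As an illustration of the sample formula, the sketch below computes z scores for the longest waiting time at each bank from the earlier slides. The helper `z_score` is a hypothetical name.

```python
import statistics

jefferson = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def z_score(x, sample):
    """Sample z score: (x - x_bar) / s."""
    return (x - statistics.mean(sample)) / statistics.stdev(sample)


print(f"Jefferson Valley, 7.7 min: z = {z_score(7.7, jefferson):.2f}")
print(f"Providence, 10.0 min:      z = {z_score(10.0, providence):.2f}")
```
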
- 65. Interpreting z Scores, Figure 2-16: a z axis running from -3 to 3, with ordinary values in the middle and unusual values at both extremes.
- 66. Measures of Position: Quartiles, Deciles, Percentiles
- 67. Quartiles: Q1, Q2, Q3 divide ranked scores into four equal parts of 25% each, running from the minimum through Q1, Q2 (the median), and Q3 to the maximum.
- 68. Deciles: D1, D2, D3, D4, D5, D6, D7, D8, D9 divide ranked data into ten equal parts of 10% each.
- 69. Percentiles: there are 99 percentiles, which divide ranked data into 100 equal parts.
- 70. Quartiles, Deciles, Percentiles. Fractiles (quantiles) partition data into approximately equal parts.
- 71. Exploratory Data Analysis: the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.
- 72. Outliers
  - An outlier is a value located very far away from almost all of the other values.
  - An extreme value can have a dramatic effect on the mean, the standard deviation, and the scale of the histogram, so that the true nature of the distribution is totally obscured.
- 73. Boxplots (Box-and-Whisker Diagrams)
  - Reveal the center of the data, the spread of the data, the distribution of the data, and the presence of outliers.
  - Excellent for comparing two or more data sets.
- 74. Boxplots: 5-number summary
  - Minimum
  - First quartile, Q1
  - Median (Q2)
  - Third quartile, Q3
  - Maximum
- 75. Boxplots: Boxplot of Qwerty Word Ratings (figure). The axis runs from 0 to 14; the 5-number summary is 0, 2, 4, 6, 14.
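
The 5-number summary for the Qwerty word ratings can be recomputed from Table 2-1. Quartile conventions differ between textbooks and software; this sketch assumes the default (exclusive) method of `statistics.quantiles`, which happens to give the same values here. The helper name is illustrative.

```python
import statistics

# Qwerty word ratings from Table 2-1 (52 values).
ratings = [
    2, 2, 5, 1, 2, 6, 3, 3, 4, 2, 4, 0, 5,
    7, 7, 5, 6, 6, 8, 10, 7, 2, 2, 10, 5, 8,
    2, 5, 4, 2, 6, 2, 6, 1, 7, 2, 7, 2, 3,
    8, 1, 5, 2, 5, 2, 14, 2, 2, 6, 3, 1, 7,
]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return (min(data), q1, q2, q3, max(data))


summary = five_number_summary(ratings)
print("min, Q1, median, Q3, max =", summary)
```
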
- 76. Boxplots for bell-shaped, uniform, and skewed distributions (figure)
- 77. Exploring
  - Measures of center: mean, median, and mode
  - Measures of variation: standard deviation and range
  - Measures of spread and relative location: minimum value, maximum value, and quartiles
  - Unusual values: outliers
  - Distribution: histograms, stem-and-leaf plots, and boxplots
