5 Number Summary

8,132 views
7,659 views

Published on

Using the 5 Number Summary to analyze data.

Published in: Education, Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
8,132
On SlideShare
0
From Embeds
0
Number of Embeds
446
Actions
Shares
0
Downloads
24
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

5 Number Summary

  1. 1. The 5-number summary, boxplots and outliers9/4/2011 Slide 1
  2. 2. The 5-number summary• An example 5-number summary: 5 12 14 17 21• 5 is the minimum value in the set• 12 is the first quartile• 14 is the median• 17 is the third quartile• 21 is the maximum value9/4/2011 Slide 2
  3. 3. The 5-number summary• Quartiles? – The 1st quartile is the point at which 25% of the data is below and 75% above. – The 2nd quartile (the MEDIAN) is the point at which 50% of the data is below and 50% above. – The 3rd quartile is the point at which 75% of the data is below and 25% above9/4/2011 Slide 3
  4. 4. The 5-number summary• Back to the example 5 12 14 17 21• So, if these are a scores on a 22 point quiz from a class… – The lowest score in the class was 5 points – 25% of students earned 12 or fewer points (75% earned 12 or more) – 50% of students earned 14 or fewer points (50% earned 14 or more) – the median – 75% of students earned 17 or fewer points (25% earned 17 or more) – The highest score in the class was 21 points9/4/2011 Slide 4
  5. 5. The 5-number summary• Finding the minimum and maximum is pretty simple• Finding the median was discussed in lesson 1.6• So to find the quartiles…9/4/2011 Slide 5
  6. 6. The 5-number summary• Finding the quartiles – To find the 1st quartile, simply find the median of the lower half of the data (the lower half of the data does NOT include the median of the data). – To find the 3rd quartile, simply find the median of the upper half of the data (the upper half of the data does NOT include the median of the data).9/4/2011 Slide 6
  7. 7. The 5-number summary 2• Example of finding 3 Median of lower the quartiles Lower Half of Data 3 half is 1st quartile = 3 5 Median is (8+9)/2 or 8.5 8 9 9 Median of Upper Half of Data 12 upper half is 3rd quartile = 12 13 139/4/2011 Slide 7
  8. 8. The 5-number summary• Find the 5-number 17.1 2.1 summary of the 5.8 2.0 following data (which are salaries (in 5.0 1.0 millions) of an NBA 4.5 1.0 team: 4.3 0.8 4.2 0.7• Go to the next slide to check your work. 3.1 0.39/4/2011 Slide 8
  9. 9. The 5-number summary• Your answer should be: 0.3 1.0 2.6 4.5 17.1• Another measure of spread is found from the 5-number summary, the interquartile range (or IQR).• The IQR is simply the 3rd quartile (Q3) minus the 1st quartile (Q1).• So, IQR = Q3 – Q1• This is simply a variation on the definition of range.9/4/2011 Slide 9
  10. 10. Outliers and extreme values• The 17.1 million dollar salary is quite high.• Is it an outlier among the data?• Tukey’s rule: A data point is an outlier if it falls more than 1.5 IQR below the 1st quartile OR 1.5 IQR above the 3rd quartile.9/4/2011 Slide 10
  11. 11. Outliers and extreme values• Recall the summary: 0.3 1.0 2.6 4.5 17.1• The IQR = 4.5 – 1.0 = 3.5• Check for the high outlier: – Is 17.1 more than 1.5IQR above the 3rd quartile (4.5)? – Is 17.1 > 1.5(3.5) + 4.5 ? – Is 17.1 > 9.75 ? – Yes, so the $17.1 million salary is an outlier on the team’s payroll.9/4/2011 Slide 11
  12. 12. Boxplots• The boxplot displays the 5-number summary: minimum, lower quartile (Q1), median upper quartile (Q3), maximum.• It also shows the Inter-quartile range (IQR) and outliers.• It also gives us information about the symmetry of the distribution.9/4/2011 Slide 12
  13. 13. Example Boxplot The largest observation (max) is approximately 4.The smallestobservation (min)is approximately0. The first The second The third quartile quartile (Q1) is quartile (Q2) is (Q3) is approximately approximately approximately 2.7. 9/4/2011 1. 1.6. Slide 13
  14. 14. Example Boxplot (modified) Outliers show up as circles. In this case, it is now the max. This is the largest observation that is NOT an out outlier.9/4/2011 Slide 14
  15. 15. Boxplots• Example – Verbal GMAT scores of 12 students: 10, 22, 24, 27, 31, 33, 39, 40, 42, 43, 44, 45 – The 5-number summary is: 10 25.5 36 42.5 45 – Now the box-plot is constructed as follows: • The line inside the box indicates the median. • The left side of this box indicates the lower quartile (Q1). • The right side of this box indicates the upper quartile (Q3). • A straight line is then drawn from the lowest value of this distribution to the box (at Q1) and another straight line from the box (at Q3) to the highest value of this distribution.9/4/2011 Slide 15
  16. 16. Boxplot9/4/2011 Slide 16
  17. 17. Modified Boxplot• Consider the 5-number summary of NBA salaries: 0.3 1.0 2.6 4.5 17.1• The modified boxplot shows outliers separately.• On the next slide, note how the outlier 17.1 is plotted separately.• The line only goes to the maximum (or minimum) point that is NOT an outlier.9/4/2011 Slide 17
  18. 18. Modified Boxplot9/4/2011 Slide 18
  19. 19. Using Technology to Make Boxplots• Use StatCrunch• On your calculator: – To create a boxplot: • Enter your list of data. • Now select STAT PLOT by pressing the 2nd key followed by the key labeled Y=. • Press the ENTER key, then select the option of ON, and select the boxplot type that shows outliers (as dots) • The Xlist should indicate the name of your list (L1 or other), and the Freq value should be set to 1. • Now press the ZOOM key and select option 9 (ZoomStat). • Press the ENTER key and the boxplot should be displayed. • You can use the TRACE key along with the arrow keys in order to read the values of the five number summary (Min, Q1, Med, Q3, Max).9/4/2011 Slide 19
  20. 20. The 5-number summary, boxplots and outliers• This concludes the presentation.9/4/2011 Slide 20

×