# 2 7 exploratory data analysis

1. 5-number summary
   - Minimum,
   - Lower quartile,
     - Quartiles are relatively insensitive to unusual values
   - Median (second quartile),
   - Upper quartile, and
     - Interquartile range = Q3 – Q1
   - Maximum.
2. For example
   - 25 55 59 59 63 71 71 74 80 80 80 83 84 84 87 88 95 95 100 100
   - Minimum
   - Lower Quartile (Q1)
   - Median
   - Upper Quartile (Q3)
   - Maximum
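The five values for the example data can be computed with a short script. This is a minimal sketch, not the slides' own method: the slides do not say which quartile convention is intended, so it assumes the median-of-halves convention (the overall median is left out of both halves when the count is odd). The helper name `five_number_summary` is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data; the overall median is excluded from both halves
    when the number of values is odd.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # smallest half of the values
    upper = data[-half:]   # largest half of the values
    return data[0], median(lower), median(data), median(upper), data[-1]


# Example data from the slide
scores = [25, 55, 59, 59, 63, 71, 71, 74, 80, 80,
          80, 83, 84, 84, 87, 88, 95, 95, 100, 100]

print(five_number_summary(scores))
```

Under this convention the sketch gives Q1 = 67, median = 80 and Q3 = 87.5 for the data above. Other conventions, such as the ones offered by `statistics.quantiles`, can give slightly different quartiles.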
3. Boxplot (box and whiskers diagram)
   - Depicts the 5-number summary
   - Useful for comparing two data sets.
4. Components
   - Number line extends from the theoretical (or practical) minimum value to the theoretical (or practical) maximum
   - A box spans the lower and upper quartiles
   - Dotted line through the box at the median
   - Whiskers extend from the box to the minimum and maximum
5. Outliers
   - Outliers are:
     - Less than the lower quartile minus 1.5 times the interquartile range, or
     - Greater than the upper quartile plus 1.5 times the interquartile range
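The outlier rule can be sketched in code as follows. The function names `iqr_fences` and `find_outliers` are hypothetical, and the quartiles are again assumed to follow the median-of-halves convention, which the slides do not specify.

```python
from statistics import median


def iqr_fences(values):
    """Return (lower_fence, upper_fence) for the 1.5 x IQR rule.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (median excluded from both when the count is odd).
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the values that fall outside the two fences."""
    low, high = iqr_fences(values)
    return [v for v in values if v < low or v > high]


# Example data from the earlier slide
scores = [25, 55, 59, 59, 63, 71, 71, 74, 80, 80,
          80, 83, 84, 84, 87, 88, 95, 95, 100, 100]

print(iqr_fences(scores))
print(find_outliers(scores))
```

With these assumptions the fences for the example data are 36.25 and 118.25, so the only value flagged is 25.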
6. Live Example
   - Minimum
   - First Quartile (Q1)
   - Median
   - Third Quartile (Q3)
   - Maximum
7. Your turn
   - Use a box and whiskers diagram to compare the 2005 and 2008 September temperatures
     - Daily high temperatures for Trenton
   - Identify any outliers
   - 79 79 83 82 83 85 91 86 85 75 73 77 76 68 66 70 73 78 83 84 81 90 82 89 90 86 78 75 75
   - 86 90 88 92 90 77 81 81 73 74 74 68 83 91 81 68 75 75 68 70 81 73 72 74 66 65 71 69 70 71
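One way to check a hand-drawn answer is to compute both summaries and lay them side by side. This is a sketch under assumptions: the slide does not say which row belongs to which year, so the rows are named `first_row` and `second_row`; the helper `summarize` is hypothetical; and quartiles use the median-of-halves convention. The first row as transcribed has 29 values and the second has 30.

```python
from statistics import median


def summarize(values):
    """Return the 5-number summary and any 1.5 x IQR outliers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (median excluded from both when the count is odd).
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in data if v < low or v > high]
    return (data[0], q1, median(data), q3, data[-1]), outliers


# Rows copied from the slide, in the order they appear
first_row = [79, 79, 83, 82, 83, 85, 91, 86, 85, 75, 73, 77, 76, 68, 66,
             70, 73, 78, 83, 84, 81, 90, 82, 89, 90, 86, 78, 75, 75]
second_row = [86, 90, 88, 92, 90, 77, 81, 81, 73, 74, 74, 68, 83, 91, 81,
              68, 75, 75, 68, 70, 81, 73, 72, 74, 66, 65, 71, 69, 70, 71]

for label, row in (("first row", first_row), ("second row", second_row)):
    summary, outliers = summarize(row)
    print(label, "n =", len(row), "summary =", summary, "outliers =", outliers)
```

The box for each row would span Q1 to Q3 from its summary, with whiskers out to the minimum and maximum.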
8. Two Ways to Characterize Data

   | | Mean and standard deviation | 5-number summary |
   |---|---|---|
   | Statistics | $\bar{X}$ and $s$ | Minimum, first quartile, median, third quartile, maximum |
   | Picture | Distribution spread, frequency diagram, bar chart | Box and whiskers diagram |
   | Outliers | $x_i < \bar{X} - 2s$ or $x_i > \bar{X} + 2s$ | $x_i < Q1 - 1.5 \times IQR$ or $x_i > Q3 + 1.5 \times IQR$ |
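The two outlier rules can be compared on the same data. The sketch below assumes that s is the sample standard deviation (`statistics.stdev`) and that quartiles follow the median-of-halves convention; neither is stated in the slides, and the function names are hypothetical.

```python
from statistics import mean, median, stdev


def outliers_mean_sd(values):
    """Flag values more than 2 standard deviations from the mean.

    Assumption: s is the sample standard deviation (n - 1 denominator).
    """
    center, s = mean(values), stdev(values)
    return [v for v in values if v < center - 2 * s or v > center + 2 * s]


def outliers_iqr(values):
    """Flag values beyond Q1 - 1.5 x IQR or Q3 + 1.5 x IQR.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves.
    """
    data = sorted(values)
    half = len(data) // 2
    q1, q3 = median(data[:half]), median(data[-half:])
    iqr = q3 - q1
    return [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]


# Example data from the earlier slide
scores = [25, 55, 59, 59, 63, 71, 71, 74, 80, 80,
          80, 83, 84, 84, 87, 88, 95, 95, 100, 100]

print("mean/sd rule:", outliers_mean_sd(scores))
print("IQR rule:    ", outliers_iqr(scores))
```

For this data set both rules flag only the value 25. On other data the two rules can disagree, because the mean and standard deviation are themselves pulled by extreme values while the quartiles are relatively insensitive to them.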