# Exploring Data

Study Notes and Guide for the AP Statistics Exam Theme 1

### Exploring Data

1. 1. AP Review: Exploring Data
2. 2. Describing a Distribution <ul><li>Discuss center, shape, and spread in context. </li></ul><ul><li>Center: Mean or Median </li></ul><ul><li>Shape: Roughly Symmetric, Skewed Right, or Skewed Left </li></ul><ul><li>Spread: Standard Deviation, IQR, or Range </li></ul>
3. 3. Checking for Outliers <ul><li>A survey was conducted to gather ratings of the quality of service at local restaurants at a nearby mall. Respondents were to rate overall service using values between 0 (terrible) and 100 (excellent). The five number summary is 32, 47.5, 51, 63.5, 92. The data values above Q3 are 65, 66, 70, 71, and 92. Are there outliers on the high end? </li></ul>
4. 4. Checking for Outliers <ul><li>Outliers > Q3 + 1.5(IQR) </li></ul><ul><li>Outliers > 63.5 + 1.5 (63.5 – 47.5) </li></ul><ul><li>Outliers > 87.5 </li></ul><ul><li>Therefore, 92 is an outlier. </li></ul>
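
The fence calculation above can be checked with a short Python sketch. The function name `upper_fence` is illustrative and not from the slides; the numbers are the five-number summary and upper data values from the restaurant example.

```python
def upper_fence(q1, q3):
    """Return the high-end outlier cutoff, Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q3 + 1.5 * iqr


# Five-number summary from the slide: 32, 47.5, 51, 63.5, 92
q1, q3 = 47.5, 63.5
values_above_q3 = [65, 66, 70, 71, 92]

fence = upper_fence(q1, q3)
# Any value strictly above the fence is flagged as a high outlier.
high_outliers = [v for v in values_above_q3 if v > fence]

print(fence)          # 87.5
print(high_outliers)  # [92]
```

The same rule applies on the low end with Q1 – 1.5(IQR).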
5. 5. Robust and Sensitive Statistics <ul><li>Robust (not affected by extreme values) </li></ul><ul><li>Median, IQR </li></ul><ul><li>Sensitive (affected by extreme values) </li></ul><ul><li>Mean, s, range </li></ul>
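
A small sketch can make the robust/sensitive distinction concrete. The ratings below are made up for illustration (they are not from the slides); the code adds one extreme value and compares how much each statistic moves.

```python
from statistics import mean, median, stdev

# Hypothetical ratings, invented for this illustration only
data = [47, 49, 50, 51, 52, 53, 55]
with_extreme = data + [100]  # same data plus one extreme value


def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


# Robust: the median barely moves when the extreme value is added.
median_shift = median(with_extreme) - median(data)

# Sensitive: the mean, standard deviation, and range all change a lot.
mean_shift = mean(with_extreme) - mean(data)
stdev_shift = stdev(with_extreme) - stdev(data)
range_shift = value_range(with_extreme) - value_range(data)

print(median_shift, mean_shift, stdev_shift, range_shift)
```

With these numbers the median shifts by 0.5 while the mean shifts by more than 6 and the range grows from 8 to 53.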
6. 6. Parameters and Statistics <ul><li>Parameters are numerical values that describe a population. </li></ul><ul><li>Statistics are numerical values that describe a sample. </li></ul>
7. 7. Z-Scores and Percentiles <ul><li>Barron’s p. 41 #10 </li></ul><ul><li>Assuming that batting averages have a bell-shaped distribution, arrange in ascending order: </li></ul><ul><li>I. An average with a z-score of –1 </li></ul><ul><li>II. An average with a percentile rank of 20%. </li></ul><ul><li>III. An average at the first quartile, Q1. </li></ul><ul><li>Answer: I, II, III. A z-score of –1 is at about the 16th percentile, which is below the 20th percentile, which is below Q1 (the 25th percentile). </li></ul>
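
One way to check the ordering is to put all three items on the same z-scale. The sketch below assumes an exactly normal model for the "bell-shaped" distribution and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

z_I = -1.0                        # I: z-score of -1
z_II = std_normal.inv_cdf(0.20)   # II: z-score at the 20th percentile
z_III = std_normal.inv_cdf(0.25)  # III: z-score at Q1, the 25th percentile

# Percentile rank of item I, for comparison with 20% and 25%
percentile_I = std_normal.cdf(z_I)

# Sort the labels by z-score to get the ascending order.
ascending = sorted([("I", z_I), ("II", z_II), ("III", z_III)],
                   key=lambda pair: pair[1])
print([label for label, _ in ascending])  # ['I', 'II', 'III']
```

Because a larger z-score always means a larger batting average under the same distribution, ordering by z-score orders the averages themselves.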
8. 8. Normal Distribution <ul><li>Barron’s P. 367 #3 </li></ul><ul><li>The average yearly snowfall in a city is 55 inches. What is the standard deviation if 15% of the years have snowfalls above 60 inches? Assume yearly snowfalls are normally distributed. </li></ul>
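
The slide states the problem without a solution. A minimal sketch of one way to solve it: if 15% of years are above 60 inches, then 60 inches sits at the 85th percentile, so z = (60 – 55)/σ must equal the 85th-percentile z-score, and σ = 5/z. The variable names below are illustrative.

```python
from statistics import NormalDist

mean_snowfall = 55.0  # inches, from the problem
cutoff = 60.0         # inches
upper_tail = 0.15     # proportion of years above the cutoff

# z-score with 15% of the area above it (85% below it)
z = NormalDist().inv_cdf(1 - upper_tail)

# Solve z = (cutoff - mean) / sigma for sigma
sigma = (cutoff - mean_snowfall) / z

# Check: with this sigma, the area above 60 inches should be 0.15.
check = 1 - NormalDist(mean_snowfall, sigma).cdf(cutoff)

print(round(z, 2), round(sigma, 1))  # 1.04 4.8
```

So the standard deviation is roughly 4.8 inches under the normal assumption given in the problem.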
9. 9. Linear Regression <ul><li>Don’t forget about the formulas on the chart. </li></ul><ul><li>r is the correlation coefficient. </li></ul><ul><li>r^2 is the coefficient of determination. </li></ul><ul><li>r has no units. </li></ul><ul><li>A strong r indicates association, not causation. </li></ul><ul><li>r is not affected if x & y are reversed, if a constant is added to or subtracted from each x or each y, or if each x or each y is multiplied or divided by a positive constant. Multiplying or dividing by a negative constant flips the sign of r. </li></ul>
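
The invariance properties of r can be checked numerically. The sketch below computes Pearson's r directly from its definition; the function name `pearson_r` and the data are hypothetical, chosen only for illustration. It also shows the one caveat: multiplying a variable by a negative constant flips the sign of r.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for this illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
r_swapped = pearson_r(y, x)                              # reverse x and y
r_rescaled = pearson_r([2.54 * xi + 10 for xi in x], y)  # change units of x
r_negated = pearson_r([-xi for xi in x], y)              # multiply x by -1

print(r, r_swapped, r_rescaled, r_negated)
```

The first three values agree up to rounding error, and the last has the same size with the opposite sign.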
10. 10. Linear Regression <ul><li>r^2 is the percentage of the variation in the dependent variable, y, that is explained by the linear relationship (LSRL) with the independent variable, x. PUT IN CONTEXT! </li></ul><ul><li>When discussing r, describe the linear relationship between x & y as weak, moderate, or strong. </li></ul>
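
To see what "variation explained" means, r^2 can be computed as 1 – SSE/SST, where SSE is the sum of squared residuals about the LSRL and SST is the sum of squared deviations of y about its mean. This is a sketch with made-up data; `fit_lsrl` is a hypothetical helper name.

```python
def fit_lsrl(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_lsrl(x, y)
predicted = [intercept + slope * xi for xi in x]
mean_y = sum(y) / len(y)

sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # unexplained
sst = sum((yi - mean_y) ** 2 for yi in y)                  # total
r_squared = 1 - sse / sst

print(round(r_squared, 2))  # 0.6
```

In context, a result like this would be read as "about 60% of the variation in y is explained by the linear relationship with x", with y and x replaced by the actual variables.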
11. 11. Linear Regression <ul><li>Influential Point – pulls the regression line toward it, so removing it noticeably changes the line. An influential point is usually an outlier in the x-direction. </li></ul><ul><li>Outlier – has a large residual, so it stands out in the residual plot, usually in the y-direction. </li></ul>
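
A quick way to see influence is to fit the line with and without the suspect point. The data and the helper name `lsrl_slope` below are hypothetical; the added point (10, 2) is far from the others in the x-direction.

```python
def lsrl_slope(xs, ys):
    """Slope of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: four points that lie exactly on y = x + 1
x = [1, 2, 3, 4]
y = [2, 3, 4, 5]

slope_without = lsrl_slope(x, y)
# Add one point that is extreme in x and does not follow the pattern.
slope_with = lsrl_slope(x + [10], y + [2])

print(slope_without, slope_with)
```

Here a single point changes the slope from 1 to a slightly negative value, which is what "pulls the regression line toward it" looks like numerically.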
12. 12. Linear Regression <ul><li>When performing Linear Regression, do the following: </li></ul><ul><ul><li>Create a scatterplot </li></ul></ul><ul><ul><li>Calculate the equation of the regression line </li></ul></ul><ul><ul><li>Plot the residuals </li></ul></ul><ul><ul><li>Remember: a residual is the observed y – predicted y. </li></ul></ul>
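
The numerical part of these steps can be sketched in plain Python. The data and function name are hypothetical, and the plotting steps are left out; the residuals computed here are what would go on the vertical axis of the residual plot.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data, invented for this illustration only
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

slope, intercept = regression_line(x, y)

# Residual = observed y - predicted y, one per data point
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

for xi, res in zip(x, residuals):
    print(xi, round(res, 2))
```

For a least-squares line the residuals always sum to zero (up to rounding), which is a handy check on the arithmetic.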
13. 13. Barron’s Problems <ul><li>Multiple Choice </li></ul><ul><li>P. 370 #13, 14, 16, 19, 21, 24, 27, 30, 38 </li></ul><ul><li>Free Response </li></ul><ul><li>P. 430 #2 </li></ul>