Unit 8 (powerpoint)
Transcript

  • 1. Unit 8 Linear Modeling
  • 2. Linear Models
      • The correlation coefficient measures the strength of the linear relationship between two quantitative variables x and y.
      • A linear equation describing how a dependent variable, y, is associated with an explanatory variable, x, looks like
      • y = a + bx
  • 3. Example
      • A college charges a basic fee of $100 a semester for a meal plan plus $2 a meal. The linear equation describing the association between the cost of the meal plan, y, and the number of meals purchased, x, is:
      • y = 100 + 2x
  • 4. Linear Equations
      • A linear equation takes the form
      • y = a + bx
      • b = slope
      • a = y-intercept
      • The slope measures the rate of change of y with respect to x.
      • The y-intercept measures the initial value of y (the value of y when x = 0).
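As a small illustration of slides 3 and 4, the meal plan equation can be written as a function in which the y-intercept is the base fee and the slope is the cost per meal. This is only a sketch; the function name `meal_plan_cost` and the sample meal counts are assumptions made for the example, not part of the slides.

```python
def meal_plan_cost(meals, base_fee=100.0, per_meal=2.0):
    """Cost y = a + b*x with a = base fee (y-intercept) and b = price per meal (slope)."""
    return base_fee + per_meal * meals

# With no meals purchased, the cost equals the y-intercept.
cost_no_meals = meal_plan_cost(0)

# Each extra meal adds exactly the slope to the cost.
cost_50_meals = meal_plan_cost(50)
cost_51_meals = meal_plan_cost(51)

print(cost_no_meals, cost_50_meals, cost_51_meals - cost_50_meals)
```

The difference between the last two costs is the slope, which is what "rate of change of y with respect to x" means for this example.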
  • 5. Linear Modeling
      • Rarely does an exact linear relationship exist between two studied variables.
      • The correlation coefficient and the scatter plot help us decide if there is a reasonably strong linear relationship between two studied variables.
  • 6. Data
      • The table gives the age and systolic blood pressure (SBP) of 30 subjects.
  • 7. Approximate Positive Linear Relationship
  • 8. Equation of Fitted Line
      • SBP = 98.7 + 0.97(AGE)
      • y = 98.7 + 0.97x
  • 9. Interpretation of Slope
      • The slope of the SBP vs Age fitted equation is 0.97.
      • 0.97 = rate of change of SBP with respect to age
      • For each additional year of age, the predicted SBP rises by approximately 0.97 units.
  • 10. Least Squares Method for Line of Best Fit
      • Interactive Unit D2, Basics, Basics 1
      • Interactive Unit D2, Basics, Practice 1
  • 11. Residuals
      • One method for assessing how well a linear equation models the data is assessing the extent to which points differ from the line.
      • A residual is the difference between an observed y value and the corresponding value of y on the fitted line (predicted y).
      • Residual = Observed y - Predicted y
  • 12. Sum of Squares of the Residuals
      • The line of best fit is the one with the smallest sum of squares of the residuals.
      • It is called the least squares line or sometimes the least squares regression line.
      • The challenge is to find the slope and y-intercept of this least squares line.
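The criterion on slides 11 and 12 can be sketched in a few lines of Python: compute each residual as observed y minus predicted y, square it, and add the squares. The function name and the three data points below are made up purely for illustration; they are not taken from the slides.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Return the sum of squared residuals for the candidate line y = a + b*x."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = a + b * x       # value of y on the candidate line
        residual = y - predicted    # observed y minus predicted y
        total += residual ** 2      # squaring keeps every term nonnegative
    return total

# Illustrative data only (not from the slides): points lying exactly on y = 1 + 2x.
xs = [0, 1, 2]
ys = [1, 3, 5]

exact_fit = sum_squared_residuals(xs, ys, a=1, b=2)  # line through every point
shifted = sum_squared_residuals(xs, ys, a=0, b=2)    # same slope, intercept off by 1

print(exact_fit, shifted)
```

Of the two candidate lines, the one with the smaller sum is the better fit under this criterion. The least squares line is the candidate that makes the sum as small as possible.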
  • 13. More Practice with Finding the Least Squares Line
      • Interactive D2, Basics, Basics2
  • 14. The “Formulas”
      • The methods of calculus can be used to find equations for the slope and y-intercept of the least squares line. Here are the results.
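The formulas shown on this slide did not survive in the transcript. As an assumption about what was displayed, the standard closed-form results for the least squares line y = a + bx are given here, where x̄ and ȳ denote the sample means of x and y:

```latex
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
```

An equivalent way to write the slope is b = r (s_y / s_x), where r is the correlation coefficient and s_x, s_y are the sample standard deviations of x and y. The slide may have used a different but algebraically equivalent form.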
  • 15. The Good News
      • Many computer programs, including Excel and MINITAB, as well as graphing calculators, provide the slope and y-intercept of the least squares line.
  • 16. Example
      • Find the slope and y-intercept for the least squares line describing the association between age and blood pressure suggested by this data.
  • 17. The Line of Best Fit
      • The line that best fits the data is taken to be the one with the “smallest” residuals.
      • Since residuals can be both positive and negative, they are squared to ensure that none are negative.
      • The squared residuals are then added to find a measure of the total amount the fitted values deviate from the observed values.
  • 18. Least Squares Line
      • Y = SBP
      • X = Age
      • Y = 98.7 + 0.97X
  • 19. Predictions
      • The prediction equation y = 98.7 + 0.97x can be used to predict a person’s SBP based on their age.
      • For a randomly selected person who is 40 years old, the least squares equation predicts an SBP of
      • 98.7 + 0.97(40) = 98.7 + 38.8 = 137.5
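The prediction step can be sketched as a one-line function. The intercept 98.7 and slope 0.97 come from the fitted line on the slides; the function name `predict_sbp` is an assumption made for this example.

```python
def predict_sbp(age, intercept=98.7, slope=0.97):
    """Predicted SBP from the fitted line y = 98.7 + 0.97x, with x = age in years."""
    return intercept + slope * age

# Reproduce the slide's prediction for a 40-year-old.
predicted_40 = predict_sbp(40)
print(round(predicted_40, 1))  # 137.5
```

The same function can be applied to any other age to fill in a table of predictions, keeping in mind that the line was fitted only to the ages present in the sample.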
  • 20. Making Predictions
      • Use the sample least squares line
      • y = 98.7 + 0.97x
      • to complete the table
  • 21. Back to Residuals
      • SSRes = $\sum_i (y_i - \hat{y}_i)^2$, the sum of the squared residuals (observed y minus predicted y), is a measure of the total amount of deviation from the fitted line.
      • It is a measure of the variability in the data that is not explained by the linear relationship with the variable x.
      • It measures the variability due to factors other than the explanatory variable x.
  • 22. Back to Age vs SBP
      • SSRes = $\sum_i (y_i - \hat{y}_i)^2$ = 8393.44
      • SSTotal = $\sum_i (y_i - \bar{y})^2$ = 14787.47
      • SSRes / SSTotal = 8393.44 / 14787.47 ≈ 0.5676, so 56.76% of the variability in the SBP data is explained by factors other than age.
      • 1 - 56.76% = 43.24% of the variability in SBP can be explained by the linear relationship with age.
  • 23. The value of $r^2$
      • The correlation coefficient, r, for the SBP vs Age data is 0.65757.
      • $r^2 = (0.65757)^2 = 0.4324$
      • When $r^2$ is converted to a percent, 43.24%, it corresponds to the percent variability in SBP that is explained by age.
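The link between the sums of squares on slide 22 and the value of r on slide 23 can be checked numerically. The sketch below uses only the three numbers quoted on the slides; the variable names are chosen for this example.

```python
# Values quoted on slides 22 and 23 for the SBP vs Age data.
ss_res = 8393.44      # sum of squared residuals
ss_total = 14787.47   # total sum of squares of SBP about its mean
r = 0.65757           # correlation coefficient

unexplained = ss_res / ss_total   # share of variability not explained by age
explained = 1 - unexplained       # share explained by the linear relationship
r_squared = r ** 2

print(f"unexplained: {unexplained:.2%}")
print(f"explained:   {explained:.2%}")
print(f"r squared:   {r_squared:.4f}")
```

Both routes give about 0.4324, which is why $r^2$ can be read as the fraction of variability in y explained by the linear relationship with x.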
  • 24. Interpretation of $r^2$
      • When $r^2$ is converted to a percent, it can be interpreted as the percent of the variability in the response variable, y, that can be explained by the linear relationship with the explanatory variable, x.
  • 25.
      • Find the least squares line and the values of r and $r^2$
      • Interpret $r^2$
      • Interpret the slope
  • 26. Scatter Graph
      • r = -0.816
  • 27. Residuals
      Model                 Weight (lb)   City MPG   Residual
      BMW 318Ti                    2790         23    0.69556
      BMW Z3                       2960         19   -2.12366
      Chevrolet Camaro             3545         17   -0.06038
      Chevrolet Corvette           3295         17   -1.79682
      Ford Mustang                 3270         17   -1.97047
      Honda Prelude                3040         22    1.43200
      Hyundai Tiburon              2705         22   -0.89483
      Mazda Miata                  2365         25   -0.25640
      Mercury Cougar               3140         20    0.12658
      Mercedes Benz SLK            3020         22    1.29309
      Mitsubishi Eclipse           3235         23    3.78643
      Pontiac Firebird             3545         18    0.93962
      Porsche Boxster              2905         19   -2.50568
      Saturn SC                    2420         27    2.12562
      Toyota Celica                2720         22   -0.79065
  • 28. Vehicles with the Largest Positive and Negative Residuals
      • Mitsubishi Eclipse got 3.786 city MPG more than expected.
      • Porsche Boxster got 2.506 city MPG less than expected.
  • 29. Analysis
      • City MPG = 41.7 - 0.00695 Weight
      • Each additional pound translates into a loss of approximately 0.00695 city MPG.
      • Each additional 1000 pounds translates into a loss of approximately 6.95 city MPG.
      • $r^2$ = 66.6%
      • 66.6% of the variability in city MPG can be explained by the linear association with the weight of the vehicle. 33.4% of the variability in city MPG is due to factors other than the weight of the vehicle.
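The whole analysis of slides 25 to 29 can be reproduced from the 15 vehicles listed on slide 27. The sketch below assumes the usual closed-form least squares formulas (slope = Sxy / Sxx, intercept = ȳ - slope · x̄). The slides do not say which software was used, and the function name `least_squares` is hypothetical.

```python
import math

# (model, weight in pounds, city MPG) as listed on slide 27
cars = [
    ("BMW 318Ti", 2790, 23), ("BMW Z3", 2960, 19),
    ("Chevrolet Camaro", 3545, 17), ("Chevrolet Corvette", 3295, 17),
    ("Ford Mustang", 3270, 17), ("Honda Prelude", 3040, 22),
    ("Hyundai Tiburon", 2705, 22), ("Mazda Miata", 2365, 25),
    ("Mercury Cougar", 3140, 20), ("Mercedes Benz SLK", 3020, 22),
    ("Mitsubishi Eclipse", 3235, 23), ("Pontiac Firebird", 3545, 18),
    ("Porsche Boxster", 2905, 19), ("Saturn SC", 2420, 27),
    ("Toyota Celica", 2720, 22),
]

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return intercept, slope, r

weights = [w for _, w, _ in cars]
mpgs = [m for _, _, m in cars]
intercept, slope, r = least_squares(weights, mpgs)

# Residual = observed city MPG minus the city MPG predicted from weight.
residuals = {name: mpg - (intercept + slope * w) for name, w, mpg in cars}
largest_positive = max(residuals, key=residuals.get)
largest_negative = min(residuals, key=residuals.get)

print(f"City MPG = {intercept:.1f} + ({slope:.5f}) * Weight")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(largest_positive, round(residuals[largest_positive], 3))
print(largest_negative, round(residuals[largest_negative], 3))
```

Run on this data, the fit should agree with the slides to the precision they quote: an intercept near 41.7, a slope near -0.00695, and r near -0.816, with the Mitsubishi Eclipse and the Porsche Boxster at the two extremes of the residuals.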
