9.2 lin reg coeff of det


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

9.2 lin reg coeff of det

  1. 1. 9.2 Linear Regression and the Coefficient of Determination PART 1: LINEAR REGRESSION
  2. 2. Scatter Diagrams and Linear RelationshipsLooking at a scatter diagram, we can ask somequestions.1. Do the data indicate a linear relationship between x and y?2. Can you find an equation for the “best-fitting line” relating x and y? Can you use this relationship to make predictions?3. What fractional part of the variability in y can be associated with the variability in x? What fractional part of the variability in y is not associated with a corresponding variability in x?
  3. 3. Scatter Diagrams and Linear Relationships The first step in answering these questions is to try to express the relationship as a mathematical equation.  There are many possible equations, but the simplest and most widely used is the linear equation, or the equation of a straight line. Because we will be using this line to predict the y values from the x values, we call x the explanatory variable and y the response variable. Our job is to find the linear equation that “best” represents the points of the scatter diagram.
  4. 4. Least-Squares Criterion For our criterion of “best-fitting line”, we use the least-squares criterion  The least-squares criterion states that the line we fit to the data points must be such that the sum of the squares of the vertical distances from the points (x, y) to the line be made as small as possible. In Figure 9-10, d represents the difference between the y coordinate of the data point and the corresponding y coordinate on the line. Least-Squares Criterion Figure 9.10
  5. 5. The Least-Squares Line
  6. 6. Finding the Equation of the Least-Squares Line
  7. 7. Graphing the Least-Squares Line
  8. 8. Meaning of Slope
  9. 9. Example In Denali National Park, Alaska, the wolf population is dependent on a large, strong caribou population. In this wild setting, caribou are found in very large herds. The well-being of an entire caribou herd is not threatened by wolves. In fact, it is thought that wolves keep caribou herds strong by helping prevent overpopulation. Let x be a random variable that represents the fall caribou population (in hundreds) in Denali National Park, and let y be a random variable that represents the late-winter wolf population in the park. A random sample of recent years gave the following information (Reference: U.S. department of the Interior, National Biological Service).
  10. 10. Examplea) Draw a scatter diagram displaying the dataSolution Caribou and Wolf Populations Figure 9.9
  11. 11. Example
  12. 12. Example Caribou and Wolf Populations Figure 9.11
  13. 13. Influential Points Some points in the data set have a strong influence on the equation of the least-squares line. A data pair is influential if removing it would substantially change the equation of the least- squares line or other calculations associated with linear regression.  An influential point often has an x-value near the extreme high or low value of the data set.  If a data set has an influential point, look at it to make sure it is not the result of a data collection error or a recording error.  A valid influential point affects the equation of the least-squares line.
  14. 14. Influential PointsInfluential Point Present Influential Point Removed Figure 9.12(a) Figure 9.12(b)
  15. 15. 9.2 Linear Regression and the Coefficient of Determination PART 2: COEFFICIENT OF DETERMINATION
  16. 16. Using the Least-Squares Line for Prediction
  17. 17. How well does the least-squares line fit the original data?
  18. 18. Does the prediction involve interpolation or extrapolation?
  19. 19. The data are sample data
  20. 20.
  21. 21. Example: Predictions
  22. 22. Example
  23. 23. Example
  24. 24. Example
  25. 25. Explained and Unexplained Deviations Explained and Unexplained Deviations Figure 9.14
  26. 26. Deviations and Variations
  27. 27. Coefficient of Determination
  28. 28. Example If r = 0.90, then r2 = 0.81 is the coefficient of determination. We can say that about 81% of the (variation) behavior of the y variable can be explained by the corresponding (variation) behavior of the x variable if we use the equation of the least-squares line. The remaining 19% of the (variation) behavior of the y variable is due to random chance or to the possibility of lurking variables that influence y.