# How To Compute The Best Fit Straight Line To A Set Of Data?


Objective: Learn how to compute a regression equation for a set of data and how to use it.

### Keywords and Concepts

1. Best-fit straight line
2. Bivariate scatter plot
3. Regression equation
4. Y' = mX + C
5. Independent variable
6. Dependent variable
7. Slope (rate of change) of the best-fit line
8. Y-intercept
9. Correlation coefficient (r)

A bivariate scatter plot (a plot of a two-variable relationship) reveals the association between two variables, so that an individual's score on one variable can be used to predict the corresponding score on a related variable (see Fig. 1). This prediction depends on the equation for the straight line that minimizes the variation of the data points about it. The equation used to draw the best-fit straight line is called a regression equation and was first used by Sir Francis Galton (1822-1911) to show that when tall or short couples have children, the children's heights tend to "regress", or revert, toward the mean height of the population.

Figure 1. Scatter plot of Fahrenheit versus Celsius temperature (x-axis: Celsius, degrees; y-axis: Fahrenheit, degrees), showing the values used to construct the line and the regression line y = 1.80x + 32.
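The "variation of data points" about the line can be written out explicitly. As a sketch, assuming the usual ordinary least-squares criterion (the text does not name the criterion, and the index $i$ and point count $N$ are notation introduced here), the best-fit line is the choice of slope $m$ and intercept $C$ that minimizes the sum of squared vertical distances between the observed $Y_i$ and the line:

$$\min_{m,\,C} \; \sum_{i=1}^{N} \bigl( Y_i - (m X_i + C) \bigr)^2$$

Under that assumption, the slope and intercept formulas used in this how-to are the values of $m$ and $C$ at which this sum is smallest.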
### Regression Equation

Given a collection of paired sample data, the following formula (the regression equation) describes the relationship between X (the independent, or predictor, variable) and Y' (the predicted value of the dependent, or response, variable):

$$Y' = mX + C \qquad \text{(eq. 1)}$$

where m is the slope (rate of change) of the best-fit line, X is any particular X value within the range of the data set, and C is the Y-intercept, that is, the value of Y' when X equals zero. Use equation 2 to compute m:

$$m = r\,\frac{SD_Y}{SD_X} \qquad \text{(eq. 2)}$$

The symbol r is the correlation coefficient between X and Y; $SD_Y$ and $SD_X$ are the standard deviations of the Y and X variables, respectively. Equation 3 gives the Y-intercept C:

$$C = \bar{Y} - m\bar{X} \qquad \text{(eq. 3)}$$

where $\bar{Y}$ is the mean of the Y scores, m is the slope, and $\bar{X}$ is the mean of the X scores. Substituting equation 2 for m, equation 3 can be rewritten as:

$$C = \bar{Y} - r\left(\frac{SD_Y}{SD_X}\right)\bar{X} \qquad \text{(eq. 4)}$$

Combining equations 2 and 3 with equation 1 gives the best-fit regression line for predicting Y (Y') from any X value:

$$Y' = r\left(\frac{SD_Y}{SD_X}\right)X + \bar{Y} - m\bar{X} \qquad \text{(eq. 5)}$$

Alternatively, combining equations 2 and 4 expresses the equation for a straight line as:
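$$Y' = r\left(\frac{SD_Y}{SD_X}\right)X + \bar{Y} - r\left(\frac{SD_Y}{SD_X}\right)\bar{X} \qquad \text{(eq. 6)}$$

The slope and Y-intercept of the best-fit regression line can also be computed from raw scores. Equation 7 gives the slope:

$$m = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2} \qquad \text{(eq. 7)}$$

where the numerator equals the numerator of the correlation coefficient (r) and the denominator equals the square of $SD_X$. The raw score equation (eq. 8), obtained by substituting equation 7 into equation 3, computes the Y-intercept (C) as:

$$C = \frac{\left(\dfrac{\sum Y}{N}\right)\left(\dfrac{\sum X^2}{N}\right) - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum XY}{N}\right)}{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2} \qquad \text{(eq. 8)}$$

The denominator in equation 8 equals the square of $SD_X$.

### Example

Given the following 5 data points for temperature in degrees Fahrenheit (Y variable) and temperature in degrees Celsius (X variable), compute the equation for the best-fitting straight line.

| Y, Fahrenheit temperature | X, Celsius temperature |
| --- | --- |
| 32 | 0 |
| 40 | 4.44 |
| 60 | 15.55 |
| 80 | 26.6 |
| 100 | 37.77 |

Step 1. Compute r, $SD_X$, and $SD_Y$ (the standard deviations here use N in the denominator).

$\sum Y = 312$; $\sum Y^2 = 22624$; $\sum X = 84.36$; $\sum X^2 = 2395.65$; $\sum XY = 7015.6$; $N = 5$

$r = 0.9999$

$\bar{Y} = 62.4$; $SD_Y = 25.12$

$\bar{X} = 16.87$; $SD_X = 13.95$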
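The Step 1 quantities can be checked with a short script. The following is a minimal sketch in Python, assuming population standard deviations (division by N), which is what reproduces the SD values quoted above. The function name `summary_stats` and the dictionary keys are illustrative choices, not part of the original handout.

```python
import math

# Example data from the text: X is Celsius, Y is Fahrenheit.
xs = [0.0, 4.44, 15.55, 26.6, 37.77]
ys = [32.0, 40.0, 60.0, 80.0, 100.0]


def summary_stats(xs, ys):
    """Return the raw sums, means, population SDs, r, and the eq. 7 slope."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    mean_x = sum_x / n
    mean_y = sum_y / n

    # Assumption: population variance and SD (divide by N, not N - 1).
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    sd_x = math.sqrt(var_x)
    sd_y = math.sqrt(var_y)

    # Numerator shared by r and by the raw-score slope (eq. 7).
    cov = sum_xy / n - mean_x * mean_y

    return {
        "n": n,
        "sum_x": sum_x, "sum_y": sum_y,
        "sum_x2": sum_x2, "sum_y2": sum_y2, "sum_xy": sum_xy,
        "mean_x": mean_x, "mean_y": mean_y,
        "sd_x": sd_x, "sd_y": sd_y,
        "r": cov / (sd_x * sd_y),
        "slope_eq7": cov / var_x,  # eq. 7: denominator is SD_X squared
    }


stats = summary_stats(xs, ys)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Because the script keeps full floating-point precision, its values for r and the slope may differ in the last digits from hand calculations done with rounded intermediate values.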
Step 2. Compute the slope (m) and Y-intercept (C) using equations 2 and 4, respectively.

$$m = r\,\frac{SD_Y}{SD_X} = 0.9999\left(\frac{25.12}{13.95}\right) = 1.80 \qquad \text{(eq. 2)}$$

$$C = \bar{Y} - r\left(\frac{SD_Y}{SD_X}\right)\bar{X} = 62.4 - 0.9999\left(\frac{25.12}{13.95}\right)(16.87) = 32.0 \qquad \text{(eq. 4)}$$

Step 3. The equation for the regression of degrees Fahrenheit (Y) on degrees Celsius (X) becomes:

$$Y' = mX + C = 1.8X + 32.0$$

Step 4. Determine the best-fit straight line of the regression, and plot the individual data points as a scatter diagram (see Figure 1). Arbitrarily select a value of X near the maximum observed value of X, and substitute it into the equation to solve for Y' (predicted Y). Plot this point (X, Y') on the scattergram. Repeat the procedure for another value of X near the minimum observed value of X. The straight line joining the two points is the best-fitting straight line generated from the regression equation.
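Steps 2 through 4 can be sketched in Python as follows. This is an illustrative sketch, not the handout's own procedure: it assumes population standard deviations (`statistics.pstdev`), and the helper names `fit_line` and `predict` are hypothetical. It computes the slope from equation 2 and the intercept from equation 3, then evaluates the line at the smallest and largest observed X to get the two points that define the plotted line.

```python
from statistics import mean, pstdev

# Example data from the text: X is Celsius, Y is Fahrenheit.
xs = [0.0, 4.44, 15.55, 26.6, 37.77]
ys = [32.0, 40.0, 60.0, 80.0, 100.0]


def fit_line(xs, ys):
    """Return (m, c) for Y' = m*X + c using eq. 2 and eq. 3."""
    mean_x, mean_y = mean(xs), mean(ys)
    # Assumption: population SDs (divide by N), matching the worked example.
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    # Correlation coefficient r = covariance / (SD_X * SD_Y).
    cov = mean([x * y for x, y in zip(xs, ys)]) - mean_x * mean_y
    r = cov / (sd_x * sd_y)
    m = r * sd_y / sd_x        # eq. 2
    c = mean_y - m * mean_x    # eq. 3
    return m, c


def predict(x, m, c):
    """Predicted Y (Y') for a given X, from eq. 1."""
    return m * x + c


m, c = fit_line(xs, ys)
print(f"Y' = {m:.2f} X + {c:.1f}")

# Step 4: two points near the ends of the observed X range define the line.
x_low, x_high = min(xs), max(xs)
endpoints = [(x_low, predict(x_low, m, c)), (x_high, predict(x_high, m, c))]
for x, y_pred in endpoints:
    print(f"X = {x:.2f} -> Y' = {y_pred:.2f}")
```

The fitted coefficients should come out close to the familiar conversion F = 1.8 C + 32, with small differences because the Celsius values in the example are rounded.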