10. Strength of Linear Association

r value   Interpretation
 1        perfect positive linear relationship
 0        no linear relationship
-1        perfect negative linear relationship
14. Formula

Σ = the sum
n = number of paired items
x_i = input variable
y_i = output variable
x̄ = x-bar = mean of the x's
ȳ = y-bar = mean of the y's
s_x = standard deviation of the x's
s_y = standard deviation of the y's
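The formula itself did not survive on this slide. One standard form of the sample correlation coefficient that uses exactly the symbols in the legend is sketched here. It is offered as an assumption about what the slide showed, not as a reproduction of it, and it takes s_x and s_y to be sample standard deviations (divisor n - 1).

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```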
28. Example

Temperature (F)   Water Consumption (ounces)
75                16
83                20
85                25
85                27
92                32
97                48
99                48
Typically, in the summer as the temperature increases people are thirstier. Consider the two numerical variables, temperature and water consumption. We would expect the higher the temperature, the more water a given person would consume. Thus we would say that in the summer time, temperature and water consumption are positively correlated.
(The data is shown in the table with the temperature placed in increasing order.)
This graph helps us visualize what appears to be a somewhat linear relationship between temperature and the amount of water one drinks.
Direction of the Association: The association can be either positive or negative.
Positive Correlation: as the x variable increases, so does the y variable. Example: In the summer, as the temperature increases, so does thirst.
Negative Correlation: as the x variable increases, the y variable decreases. Example: As the price of an item increases, the number of items sold decreases.
Strength of the Association: The strength of the linear association is measured by the sample Correlation Coefficient, r. r can be any value from –1 to +1. The closer r is to one (in magnitude) the stronger the linear association. If r equals zero, then there is no linear association between the two variables.
* No other values of r have precise definitions of strength. See the chart below. Note: All of the values in the second table are positive. Thus the associations are positive. The same strength interpretations hold for negative values of r, only the direction interpretations of the association would change.
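As a rough illustration of how r is computed, the sketch below applies the standardized-products form of the sample correlation coefficient to the Temperature/Water example data from this deck. The function name `sample_correlation` is hypothetical, and the choice of this particular form of the formula is an assumption; other algebraically equivalent forms exist.

```python
from statistics import mean, stdev

# Temperature/Water example data from the slides.
temps_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]


def sample_correlation(xs, ys):
    """Return the sample correlation coefficient r for paired data."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sample standard deviations (divisor n - 1).
    s_x, s_y = stdev(xs), stdev(ys)
    # Sum the products of the standardized x and y values, then divide by n - 1.
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)


r = sample_correlation(temps_f, water_oz)
print(f"r = {r:.3f}")
```

For this data the result is close to +1, which agrees with the strong positive linear association described for temperature and water consumption.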
To describe or model a set of data with one dependent variable and one (or more) independent variables.
To predict or estimate the values of the dependent variable based on given value(s) of the independent variable(s).
To control or administer standards from a useable statistical relationship.
Simple: only one independent variable.
Linear in the Independent Variable: the independent variable only appears to the first power.
Regression analysis tries to fit a model to one response variable based on one or more explanatory variables. In most cases, there will be error. See the graph below for an example of Simple Linear Regression. The actual data points (x,y) are the blue dots. The Least Squares linear model (regression line) of the response (y) variable based on the explanatory (x) variable is shown in black. The errors (residuals), the vertical distances from each data point to its predicted value, are the vertical lines shown in red. The goal, in general, is to minimize the errors from the actual data to the regression line. The least squares line minimizes the sum of the squares of the errors.
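The least squares idea can be sketched in a few lines. The example below fits a line to the Temperature/Water example data from this deck using the usual closed-form slope and intercept, then computes the residuals and their sum of squares. The function name `least_squares_line` is hypothetical, and using this data set here is an assumption; the graph described above may have been drawn from different data.

```python
# Temperature/Water example data from the slides.
temps_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = least_squares_line(temps_f, water_oz)

# Residual = actual y minus the y predicted by the line (the red vertical lines).
residuals = [y - (slope * x + intercept) for x, y in zip(temps_f, water_oz)]
sse = sum(e ** 2 for e in residuals)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, SSE = {sse:.2f}")
```

Any other line through the same points gives a sum of squared residuals at least as large, which is what "least squares" refers to.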
If you were planning an outdoor party in the summer time, you might need to estimate how many soft drinks to buy. In your planning, you will, of course, need to know how many people will be attending. However, as you determine the number of soft drinks for each person, you might also want to consider how hot it will be. The temperature is forecast to be around 95 degrees Fahrenheit on the day of the outdoor party. (In real life, you would probably make an estimate based on your previous experience and make a logical increase based on the temperature.) We will use some hypothetical data, the Temperature/Water example data, to determine how thirsty people might be.
And we cannot expect it to be accurate at such low temperatures, because the sample of data was taken in the summer time and the temperatures range from 75 to 99 degrees Fahrenheit. Thus the model only predicts for temperatures in approximately that range.
Solution: Substitute the value x = 95 (degrees F) into the regression equation y = 1.5*x - 96.9 and solve for y (water consumption).
If x = 95, then y = 1.5*95 - 96.9 = 45.6 ounces.
Note: Since 95 degrees F is within the range of values that were used to find the regression equation, we can use this equation to predict the water consumption. If the desired prediction had been for 50 degrees F, then the model should not be used, since that value of x does not fall in the range of predictability for this model.
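The substitution and the range check can be sketched together. The example below uses the regression equation exactly as rounded on the slide and refuses to predict outside the sampled temperatures. The function name `predict_water_oz` and the strict 75 to 99 degree cutoff are assumptions for illustration; the slides only say the model applies in approximately that range.

```python
# Coefficients as rounded on the slide: y = 1.5 * x - 96.9
SLOPE = 1.5
INTERCEPT = -96.9

# Range of temperatures (degrees F) in the sample used to fit the line.
# Treating these as hard limits is an assumption made for this sketch.
TEMP_MIN_F = 75
TEMP_MAX_F = 99


def predict_water_oz(temp_f):
    """Predict water consumption in ounces, refusing to extrapolate."""
    if not TEMP_MIN_F <= temp_f <= TEMP_MAX_F:
        raise ValueError(
            f"{temp_f} F is outside the sampled range "
            f"{TEMP_MIN_F}-{TEMP_MAX_F} F; the model should not be used"
        )
    return SLOPE * temp_f + INTERCEPT


print(round(predict_water_oz(95), 1))

try:
    predict_water_oz(50)
except ValueError as err:
    print(err)
```

Because the slope is rounded to 1.5, this prediction is somewhat higher than an unrounded least squares fit of the same seven data points would give (roughly 41 ounces), so the rounding of the coefficients matters when the x values are this large.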
Note: This means that approximately 7% of the variation in the water consumption is not explained by the temperature. So perhaps there is another variable that accounts for the remaining portion of the variation. Can you think of a reasonable variable? Multiple Regression is the name of the method that would include more than one independent variable to predict amount of water people would drink.
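The unexplained share of the variation can be checked with a short calculation. The sketch below fits the least squares line to the Temperature/Water example data and computes the coefficient of determination as one minus the ratio of residual variation to total variation. The variable names are illustrative, and the assumption is that the figure on the slide was derived this way.

```python
# Temperature/Water example data from the slides.
temps_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]

n = len(temps_f)
x_bar = sum(temps_f) / n
y_bar = sum(water_oz) / n

# Least squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps_f, water_oz))
s_xx = sum((x - x_bar) ** 2 for x in temps_f)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Total variation in y, and the variation left over after fitting the line.
total_variation = sum((y - y_bar) ** 2 for y in water_oz)
residual_variation = sum(
    (y - (slope * x + intercept)) ** 2 for x, y in zip(temps_f, water_oz)
)

r_squared = 1 - residual_variation / total_variation
unexplained = 1 - r_squared

print(f"explained: {r_squared:.1%}, unexplained: {unexplained:.1%}")
```

For this data the unexplained share comes out at roughly 7%, in line with the note above.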
Predicting the Solar Maximum
Planning for satellite orbits and space missions often requires advance knowledge of solar activity levels. NASA scientists are using new techniques to predict sunspot maxima years in advance.
Compare the height and arm span of students in the class, except for one student. Test for the strength of association between the two variables. Measure the arm span of the one student and ask the class to predict the student's height.
Have students search the Internet or other sources for data on basketball players. In particular, find the number of points scored in a game and the number of minutes played in that same game. Based on the data, could the number of points scored in a game be predicted by the amount of time a player plays in a game? Have students suggest alternative ways to set up the situation that would provide better prediction capabilities and ask them to find the data to check their theories. (Idea modified from Steven King, Aiken, SC. NCTM presentation 1997.)