Transcript of "Further6 displaying bivariate data"
Further Mathematics: Displaying Bivariate Data
K McMullen 2012
Displaying Bivariate Data
• Bivariate data: data with two variables (two quantities or qualities that change)
• Generally one variable depends on the other: the dependent variable depends on the independent variable
  Eg. Height and weight
  Eg. Hours studied and test result
• We tend to focus more on dependent and independent variables when plotting scatterplots
Displaying Bivariate Data: Back-to-back stem plots
• Used to display the relationship between a numerical variable and a two-valued categorical variable
• Used to compare data sets using summary statistics such as measures of centre and measures of spread
• Eg. Comparing Further Maths study scores (numerical variable) with gender (male or female: a two-valued categorical variable)
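A back-to-back stem plot can be sketched in a few lines of Python. This is only an illustration: the study scores are made up, the function names stem_rows and format_plot are hypothetical, and the stem is assumed to be the tens digit with the units digit as the leaf.

```python
def stem_rows(left, right):
    """Group two lists of whole numbers by stem (tens digit) for a back-to-back plot."""
    stems = range(min(left + right) // 10, max(left + right) // 10 + 1)
    rows = []
    for stem in stems:
        # Left-hand leaves are listed largest first so they read outwards from the stem.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        rows.append((left_leaves, stem, right_leaves))
    return rows


def format_plot(rows, left_label, right_label):
    """Lay the rows out as text with the shared stem column in the middle."""
    left_text = [" ".join(str(d) for d in leaves) for leaves, _, _ in rows]
    width = max(len(text) for text in left_text + [left_label])
    lines = [f"{left_label:>{width}} | Stem | {right_label}"]
    for text, (_, stem, right_leaves) in zip(left_text, rows):
        right_text = " ".join(str(d) for d in right_leaves)
        lines.append(f"{text:>{width}} | {stem:^4} | {right_text}")
    return "\n".join(lines)


# Hypothetical study scores, invented for this example only.
male_scores = [25, 28, 31, 33, 34, 37, 40, 42]
female_scores = [27, 30, 32, 35, 36, 38, 41, 45]

rows = stem_rows(male_scores, female_scores)
print(format_plot(rows, "Male", "Female"))
```

Because both groups share one column of stems, the centre and spread of the two distributions can be compared at a glance.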
Displaying Bivariate Data: Parallel box plots
• Used to display the relationship between a numerical variable and a categorical variable with two or more categories
• Used to compare sets of data using summary statistics such as measures of centre and measures of spread (also think of the five-number summary)
• Remember that parallel box plots must be placed on the same axis (you can also do this on CAS)
• Eg. The results achieved by 4 different Further Maths classes
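Each box in a parallel box plot is drawn from a five-number summary (minimum, Q1, median, Q3, maximum). The sketch below computes one summary per class so they can be compared on the same scale. The class results are invented, and the quartiles are assumed to be the medians of the lower and upper halves (leaving out the middle value when the number of values is odd); a CAS or other software may use a slightly different quartile rule.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For an odd number of values the middle value belongs to neither half.
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])


# Hypothetical results for two classes, invented for this example only.
class_results = {
    "Class A": [22, 25, 28, 30, 33, 35, 40],
    "Class B": [20, 24, 27, 29, 31, 36, 38, 44],
}

summaries = {name: five_number_summary(scores) for name, scores in class_results.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Comparing the medians gives a measure of centre, while the range and the interquartile range (Q3 minus Q1) give measures of spread.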
Displaying Bivariate Data: Two-way frequency tables
• Used to display the relationship between two categorical variables; can be represented graphically as a segmented bar chart
• Remember that it is easier to compare data sets if you are working with percentages instead of totals
• In a frequency table, place your independent variable along the top row and your dependent variable down the left column (when you convert to percentages, each column must then add to 100% if done correctly)
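Converting a two-way frequency table to column percentages can be sketched as follows. The counts, the variable names (gender as the independent variable, tutoring attendance as the dependent variable) and the function name column_percentages are all made up for illustration.

```python
# Hypothetical counts: each outer key is a column (independent variable),
# each inner key is a row (dependent variable).
counts = {
    "Male": {"Yes": 12, "No": 28},
    "Female": {"Yes": 30, "No": 30},
}


def column_percentages(table):
    """Convert each column of counts to percentages of that column's total."""
    result = {}
    for column, cells in table.items():
        total = sum(cells.values())
        result[column] = {row: 100 * count / total for row, count in cells.items()}
    return result


percentages = column_percentages(counts)
for column, cells in percentages.items():
    # Each column should add to 100%, which is a quick check on the working.
    print(column, cells, "column total:", sum(cells.values()))
```

The two groups have different totals (40 and 60 here), so the raw counts are hard to compare directly; the column percentages put both groups on the same footing.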
Displaying Bivariate Data: Scatterplots
• Used to display the relationship (correlation) between two numerical variables
• The dependent variable is displayed on the vertical axis
• The independent variable is displayed on the horizontal axis
• The relationship between variables on a scatterplot can be described in terms of:
  Strength (strong, moderate, weak)
  Direction (positive, negative)
  Form (linear, non-linear)
Displaying Bivariate Data: Scatterplots (continued)
• Pearson's product-moment correlation coefficient (r) is used to measure the strength of the linear relationship shown in a scatterplot
• The values of r range from -1 (perfect negative) to 1 (perfect positive)
• You can approximate the value of r (look at the formula on p. 101) but you can also calculate it using CAS (obviously more reliable)
• To interpret r, look at and copy the table on page 100 of your textbook
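For reference, r can also be computed directly from its standard definition: the sum of the products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below is not necessarily the form of the formula given in the textbook, and the data (hours studied and test result) are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for this example only.
hours_studied = [1, 2, 3, 4, 5]    # independent variable
test_result = [52, 58, 61, 70, 74]  # dependent variable

r = pearson_r(hours_studied, test_result)
print(round(r, 3))
```

A value close to 1 like this one indicates a strong positive linear relationship; a CAS should give the same value for the same data.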
Displaying Bivariate Data: Scatterplots (continued)
• The coefficient of determination (r²): this provides information about the degree to which one variable can be predicted from another variable, provided that the variables have a linear correlation
• The coefficient of determination is calculated by squaring the correlation coefficient (r)
• When commenting using r², always convert your value into a percentage
• Comment: “The coefficient of determination tells us that (r² × 100)% of the variation in the dependent variable is explained by the variation in the independent variable”
Displaying Bivariate Data
• You must remember the difference between correlation and causation
• To interpret your scatterplot, stick to the variables given and don't make any unnecessary assumptions
• If your scatterplot is negative then: “As the independent variable (IV) increases, the dependent variable (DV) decreases”
• If your scatterplot is positive then: “As the IV increases, the DV increases”
Displaying Bivariate Data
Eg. Age and arm span of teenage boys
• Comment: As the age of teenage boys increases, the length of their arm span also increases
• Assumption: As teenage boys get taller, their arm span increases
• Obviously they get taller, but height is not one of the variables and therefore you should not comment on it
Displaying Bivariate Data
Eg. The number of cigarettes smoked and fitness level
• Comment: As the number of cigarettes smoked increases, the fitness level of participants decreases
• Assumption: Smoking cigarettes causes fitness levels to decrease
• You must remember that there can be other factors that account for low levels of fitness, such as lack of exercise or weight
Displaying Bivariate Data
Eg. People catching public transport and the sales of designer handbags
• Comment: As the number of people catching public transport increases, the number of people buying designer handbags decreases
• Assumption: A high proportion of people catching public transport has caused a decline in the sales of designer handbags
• These two variables are clearly unrelated even though there can be some correlation. Always question the validity of statistics: what else could have caused public transport use to increase and designer handbag sales to decrease?
Displaying Bivariate Data
• Work through Ch 4 questions and chapter review