MYP5S 6.2 Central tendency and spread
Measures of central tendency
As the title suggests, we are concerned with finding a single value that
represents the ‘middle’ of the data.
This is commonly known as the average.
There are three averages:
• mode: the value that occurs most frequently
• median: arrange the data in order, then find the middle value
• mean: add the data, then divide by how many there are
Examples
2, 2, 3, 7, 2, 4, 280    Mode = 2
2, 2, 3, 7, 7, 3, 280    Mode = 2 or 3 or 7!
With three values tied, the mode has no useful meaning.
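The two examples above can be checked with a short Python sketch. It assumes Python 3.8 or later, where the standard library's `statistics.multimode` returns every value tied for the highest count; the variable names are only illustrative.

```python
from statistics import multimode

# The two example data sets from the slide.
first = [2, 2, 3, 7, 2, 4, 280]
second = [2, 2, 3, 7, 7, 3, 280]

# multimode returns every value tied for the highest count,
# in the order each value first appears in the data.
modes_first = multimode(first)
modes_second = multimode(second)

print(modes_first)   # [2]
print(modes_second)  # [2, 3, 7]
```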
The mean
The mean is the statistic that most people think of when someone
says ‘average’.
We use a special symbol to stand for the mean:
$\bar{x}$, which we read as ‘x-bar’
The median
The median can be thought of as the ‘middle one’, once the data has been arranged in numerical order.
Step 1: Arrange in order: 1 2 4 7 9 10 11 12 16 17 18
Step 2: Find the middle number: Median = 10
There is an ODD number of data (11 data).
This value is the 6th number in the list, which can be found using $\frac{n + 1}{2} = \frac{11 + 1}{2} = 6$.
What happens if we add another number, 10.5, to our list of data?
1 2 4 7 9 10 10.5 11 12 16 17 18
There is an EVEN number of data (12 data).
The position $\frac{n + 1}{2} = \frac{12 + 1}{2} = 6.5$ lies between the 6th and 7th numbers in the list.
We find the median using the 6th and 7th data values: Median = (10 + 10.5) ÷ 2 = 10.25
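As a rough check of both cases, the sketch below uses Python's standard `statistics.median`. The helper `middle_position` is a hypothetical name for the (n + 1)/2 rule and is not part of any library.

```python
from statistics import median

odd_data = [1, 2, 4, 7, 9, 10, 11, 12, 16, 17, 18]
even_data = odd_data + [10.5]   # median() sorts the data itself

def middle_position(n):
    """Position of the median in an ordered list of n values: (n + 1) / 2."""
    return (n + 1) / 2

print(middle_position(len(odd_data)), median(odd_data))    # 6.0 10
print(middle_position(len(even_data)), median(even_data))  # 6.5 10.25
```

A position of 6.5 means "halfway between the 6th and 7th values", which is why the two middle values are averaged.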
What is the median of these data?
5, 6, 8, 8, 9, 10, 11, 13, 15
Choices: 8, 10, 9, 9.5
What is the median of these data?
3, 5, 8, 10, 11, 11
Choices: 10, 9, 8.5, 11
What is the mean of these data?
2, 8, 6, 4
Choices: 5, 20, 7, 5.5
What is the mean of these data?
1, 6, 8, 2, 3, 4
Choices: 24, 5, 3.5, 4
What is the median of these data?
4, 2, 11, 10, 1
Choices: 4, 10, 5, 11
What is the mode of these data?
5, 6, 8, 8, 9, 10, 11, 13, 15
Choices: 8, 10, 9, 15
The mode
In the frequency table, we can see that 7 students owned one hat.
So the mode is 1 hat.
Be careful not to say that the mode is 7!
The thinking is the same when the data is organised into a grouped frequency table.
The highest frequency is 10.
So, what do you think the mode will be?
We can only refer to the modal class, which is 16 < y ≤ 17.
The mean and the frequency table
As we have already seen, data can often be organised using a frequency table.
How could we find the mean of the data?
What is the mode of the data?
The highest frequency is 8 ⟹ the mode = 5
Imagine that we were given the data as a list.
What would it look like?
7, 7, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, ...
2 × 7    4 × 6    8 × 5    ...
We can show this in an ‘improved’ frequency table.
So, in general, when finding the mean from a frequency table, use
$\bar{x} = \frac{\sum fx}{\sum f}$
Frequency Tables
Watson’s Chicks recorded the number of eggs each of the little beauties produced over 1 week. Here are the results.
Eggs        0   1   2   3   4
Frequency   2   3   4   5   1
Finding the Mean
Eggs                   0     1     2     3     4
Frequency              2     3     4     5     1     Total Chickens = 15
Frequency × Eggs       2×0   3×1   4×2   5×3   1×4
Total number of eggs   0     3     8     15    4     Total Eggs = 30
Mean = Total Eggs ÷ Total Chickens = 30 ÷ 15 = 2 eggs
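The same working can be sketched in Python. The list names are illustrative; the data is the eggs table above.

```python
# Eggs laid per chicken (x) and how many chickens laid that many (f).
eggs = [0, 1, 2, 3, 4]
frequency = [2, 3, 4, 5, 1]

# Multiply each value by its frequency, then total both rows.
fx = [f * x for x, f in zip(eggs, frequency)]
total_eggs = sum(fx)
total_chickens = sum(frequency)

mean = total_eggs / total_chickens
print(fx, total_eggs, total_chickens, mean)  # [0, 3, 8, 15, 4] 30 15 2.0
```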
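Finding the Median
Eggs                   0     1     2     3     4
Frequency              2     3     4     5     1
Frequency × Eggs       2×0   3×1   4×2   5×3   1×4
Total number of eggs   0     3     8     15    4
There are 15 chickens, so the median is the (15 + 1) ÷ 2 = 8th value.
Counting along the frequencies (2, then 5, then 9), the 8th value is in the 2-egg column, so the median is 2 eggs.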
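One possible way to locate the median in a frequency table is to keep a running total of the frequencies. This is a minimal sketch that assumes an odd number of data, as with the 15 chickens here; the names are illustrative.

```python
from itertools import accumulate

eggs = [0, 1, 2, 3, 4]
frequency = [2, 3, 4, 5, 1]

n = sum(frequency)          # 15 chickens
position = (n + 1) / 2      # 8.0, so the median is the 8th value

# Running totals show where each egg count ends in the ordered list.
running = list(accumulate(frequency))   # [2, 5, 9, 14, 15]

# The median is the first egg count whose running total reaches the position.
# (This simple lookup assumes n is odd, as it is here.)
median_eggs = next(x for x, c in zip(eggs, running) if c >= position)
print(running, median_eggs)
```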
Mean from a table
PLT Skills: Which ones are you using? Effective Participator, Self Manager, Independent Enquirer, Creative Thinker, Team Worker, Reflective Learner
EXAMPLE
Find the estimate of the mean and modal group from the table below:
POCKET MONEY (£P)   FREQUENCY (F)   MIDPOINT (X)   FX
0 < P ≤ 1           2               0.5            1
1 < P ≤ 2           5               1.5            7.5
2 < P ≤ 3           5               2.5            12.5
3 < P ≤ 4           9               3.5            31.5
4 < P ≤ 5           15              4.5            67.5
TOTAL               36                             120
MEAN = FX total ÷ F total
ESTIMATED MEAN = 120 ÷ 36 = £3.33
MODAL GROUP = one with highest frequency = 4 < P ≤ 5
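The pocket money example can be reproduced with a short Python sketch. Storing each row as (lower bound, upper bound, frequency) is just one illustrative layout, not something the slide prescribes.

```python
# Each row: (lower bound, upper bound, frequency) for the class lower < P <= upper.
table = [(0, 1, 2), (1, 2, 5), (2, 3, 5), (3, 4, 9), (4, 5, 15)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in table]
fx = [m * f for m, (_, _, f) in zip(midpoints, table)]

f_total = sum(f for _, _, f in table)
fx_total = sum(fx)
estimated_mean = fx_total / f_total

# The modal group is the class with the highest frequency.
modal_lo, modal_hi, _ = max(table, key=lambda row: row[2])

print(midpoints)                  # [0.5, 1.5, 2.5, 3.5, 4.5]
print(fx_total, f_total)          # 120.0 36
print(round(estimated_mean, 2))   # 3.33
print(modal_lo, modal_hi)         # 4 5
```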
The mean and grouped frequency tables
Why do you think we are being asked for an estimate of the mean?
As the data is grouped, we do not know the specific age of each cat.
What do you think we need to ‘add’ to the table?
We need a column headed ‘mid-interval’ or ‘mid-value’.
We then use these mid-values to calculate the estimate of the mean.
Add a column, and then find the mean of the upper and lower bounds of each class.
This gives the mid-values 1, 3, 5 and 7.
Add a final column headed fm.
Using $\bar{x} = \frac{\sum fm}{\sum f}$, the estimated mean age is 3.6 years.
When using the TI, make sure that you enter the mid-values.
Task (Grade C)
Mean from a table
For each table, calculate the estimate of the mean and the modal group.
1)
Score         Frequency   Midpoint   fx
1-5           11
6-10          12
11-15         15
16-20         9
21-25         3
Total
2)
Marks         Frequency   Midpoint   fx
10-19         7
20-29         5
30-39         8
40-49         15
50-59         25
Total
3)
Mass          Frequency   Midpoint   fx
50-54         17
55-59         52
60-64         36
65-69         12
70-74         4
Total
4)
Height        Frequency   Midpoint   fx
130 ≤ h < 135   10
135 ≤ h < 140   15
140 ≤ h < 145   17
145 ≤ h < 150   11
150 ≤ h < 155   2
Total
Task (Grade C)
Answers
1)
Score         Frequency   Midpoint   fx
1-5           11          3          33
6-10          12          8          96
11-15         15          13         195
16-20         9           18         162
21-25         3           23         69
Total         50                     555
Estimated Mean = 555/50 = 11.1
Modal Group = 11-15
2)
Marks         Frequency   Midpoint   fx
10-19         7           14.5       101.5
20-29         5           24.5       122.5
30-39         8           34.5       276
40-49         15          44.5       667.5
50-59         25          54.5       1362.5
Total         60                     2530
Estimated Mean = 2530/60 = 42.17
Modal Group = 50-59
3)
Mass          Frequency   Midpoint   fx
50-54         17          52         884
55-59         52          57         2964
60-64         36          62         2232
65-69         12          67         804
70-74         4           72         288
Total         121                    7172
Estimated Mean = 7172/121 = 59.27
Modal Group = 55-59
4)
Height          Frequency   Midpoint   fx
130 ≤ h < 135   10          132.5      1325
135 ≤ h < 140   15          137.5      2062.5
140 ≤ h < 145   17          142.5      2422.5
145 ≤ h < 150   11          147.5      1622.5
150 ≤ h < 155   2           152.5      305
Total           55                     7737.5
Estimated Mean = 7737.5/55 = 140.68
Modal Group = 140 ≤ h < 145
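Answers like these can be checked with a small reusable function. The sketch below is one possible approach; `estimate` is a hypothetical helper name, and each class is assumed to be given as a (lower, upper) pair whose midpoint is the average of the two bounds.

```python
def estimate(classes, frequencies):
    """Return (estimated mean, modal class) for a grouped frequency table.

    classes: list of (lower, upper) pairs; the midpoint is their average.
    """
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    fx_total = sum(m * f for m, f in zip(midpoints, frequencies))
    mean = fx_total / sum(frequencies)
    modal = classes[frequencies.index(max(frequencies))]
    return round(mean, 2), modal

scores = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25)]
heights = [(130, 135), (135, 140), (140, 145), (145, 150), (150, 155)]

print(estimate(scores, [11, 12, 15, 9, 3]))     # (11.1, (11, 15))
print(estimate(heights, [10, 15, 17, 11, 2]))   # (140.68, (140, 145))
```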
Measures of dispersion or spread
We have looked at the distribution of the data in the ‘middle section’,
i.e. the average.
Other useful information about the distribution of data can be found by
examining its dispersion or spread.
The range
The range is the simplest measure of spread.
It is found by calculating the difference between the maximum data value
and the minimum data value.
Range = max − min
Example
5, 5, 6, 4, 4, 14, 7, 8, 6, 14, 10, 4, 8, 7, 11
Range = 14 − 4 = 10
The interquartile range
By definition, the quartiles split the ordered data set into quarters, each holding 25% of the data.
Looking at this data set ... with 15 data
4, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 10, 11, 14, 14
(min = 4, max = 14)
There are 15 data ⟹ the middle value is the 8th number.
The median is 7, which is 50%, or the second quartile (Q2).
Now find the first quartile (Q1), by finding the middle of the lower half of the data: 4, 4, 4, 5, 5, 6, 6 ⟹ Q1 = 5
And finally, find the third quartile (Q3), by finding the middle of the upper half of the data: 7, 8, 8, 10, 11, 14, 14 ⟹ Q3 = 10
The interquartile range = Q3 − Q1
= 10 − 5
= 5
Looking at this data set ... with 14 data
4, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 10, 11, 14
Q2 = 6.5
The 6 and the 7 have not been ‘used’, so they stay in the lower half and the upper half.
lower half: 4, 4, 4, 5, 5, 6, 6 ⟹ Q1 = 5
upper half: 7, 7, 8, 8, 10, 11, 14 ⟹ Q3 = 8
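The ‘median of each half’ method used in these two examples can be sketched in Python as follows. The function name `quartiles` is illustrative. Other software, including Python's own `statistics.quantiles`, may use a different quartile convention and give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the 'median of each half' method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # values below the median position
    upper = ordered[half + n % 2:]     # skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

fifteen = [4, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 10, 11, 14, 14]
fourteen = fifteen[:-1]                # the same list without the final 14

for data in (fifteen, fourteen):
    q1, q2, q3 = quartiles(data)
    print(q1, q2, q3, "IQR =", q3 - q1)
```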
Box plots
Box plots can also be referred to as box and whisker plots.
Box plots tend to look very similar!
Box plots provide an excellent method of presenting some important
statistics about the median and the spread of the data:
1. The minimum value
2. The lower quartile, Q1
3. The median
4. The upper quartile, Q3
5. The maximum value
As a consequence, this information is referred to as a five-number summary.
Outliers
Outliers are data points that don’t ‘fit’ the rest of the data.
Look at this question.
Clearly, the income of the factory owner is an outlier!
An outlier is often defined as follows: a data value that is less than Q1 − 1.5 × IQR or greater than Q3 + 1.5 × IQR.
(a)
Page 299
Choose the spreadsheet.
Type in data.
Put the data into the column.
Choose menu, 4 Statistics, 1 Stat Calculations.
Choose 1 One-variable Statistics ...
There is only 1 list of data, so click OK.
Choose data from the drop-down menu.
Click on OK.
(b) Begin by drawing a number line.
Then mark the five numbers.
Finally, connect the box, add the whiskers and
write the 5-number summary.
min = 37, Q1 = 44, median = 47, Q3 = 51.5, max = 54
(c) IQR = Q3 – Q1
= 51.5 – 44
= 7.5
1.5 × 7.5 = 11.25
44 – 11.25 = 32.75
As 37 > 32.75, it is not an outlier.
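The check in part (c) can be sketched in Python. The upper fence (Q3 + 1.5 × IQR) is an assumption added for completeness, since the worked answer only tests the minimum; `is_outlier` is an illustrative helper name.

```python
# Five-number summary from part (b): min, Q1, median, Q3, max.
minimum, q1, med, q3, maximum = 37, 44, 47, 51.5, 54

iqr = q3 - q1                   # 7.5
lower_fence = q1 - 1.5 * iqr    # 32.75
upper_fence = q3 + 1.5 * iqr    # 62.75 (assumed symmetric rule for the upper side)

def is_outlier(value):
    """True if the value lies outside the 1.5 x IQR fences."""
    return value < lower_fence or value > upper_fence

print(is_outlier(minimum), is_outlier(maximum))  # False False
```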
Should outliers be kept in the data set, or rejected?
Outliers strongly affect the mean and the range, but have little or no effect on the median, mode and
interquartile range.
Variance and standard deviation
Consider the following data
0, 1, 1, 2, 2, 2, 3, 4, 4, 5
We can easily find the range, and (with a little effort) we can find the IQR.
Each of these two statistics uses only two data values: the range uses the max and min, and the IQR uses Q1 and Q3.
The variance is a measure of dispersion (spread)
which uses every piece of data.
To calculate the variance, we follow these steps:
1. Calculate the mean of the data
2. Find the difference from the mean for every piece of data (data value − mean)
For data below the mean, this difference is negative
For data above the mean, this difference is positive
3. Square each difference, to get rid of any negatives
4. Find the sum
5. Divide by the number of data
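The five steps can be followed line by line in a short Python sketch, using the data set from the previous slide. The variable names are illustrative, and the last line also takes the square root to give the standard deviation.

```python
from math import sqrt

data = [0, 1, 1, 2, 2, 2, 3, 4, 4, 5]

mean = sum(data) / len(data)               # step 1: the mean
deviations = [x - mean for x in data]      # step 2: difference from the mean
squares = [d ** 2 for d in deviations]     # step 3: square each difference
total = sum(squares)                       # step 4: find the sum
variance = total / len(data)               # step 5: divide by the number of data

print(mean, round(variance, 2), round(sqrt(variance), 3))  # 2.4 2.24 1.497
```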
The standard deviation
In order for this measure of dispersion to be useful for statisticians, we take
the square root of the variance, and rename it the standard deviation.
The standard deviation is denoted by σ (lower case sigma).
This is not in the formula booklet as you will use your TI to calculate it.
Using the TI to find the standard deviation
0, 1, 1, 2, 2, 2, 3, 4, 4, 5
Choose Lists and spreadsheet.
Type in the word data into column A.
Put the data into the column.
Choose menu, 4 Statistics, 1 Stat Calculations.
Choose 1 One-variable Statistics ...
There is only 1 list of data, so click OK.
Choose data from the drop-down menu.
Click on OK.
NOTE: To find the variance, simply square the standard deviation!
Use your TI to find the values to complete the table shown below.
17, 13, 15, 9, 13
Now add 4 to each data value.
Now multiply each data value by 2.
                Data                 Mean   Median   Mode   Range   IQR   Std Dev
Original        9, 13, 13, 15, 17    13.4   13       13     8       5     2.653...
Add 4           13, 17, 17, 19, 21   17.4   17       17     8       5     2.653...
Multiply by 2   18, 26, 26, 30, 34   26.8   26       26     16      10    5.306...
Look at your results.
What happens to the averages?
What happens to the spread?
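The same experiment can be tried without the TI. This sketch assumes the calculator's standard deviation is the population value (divide by n), which is what Python's `statistics.pstdev` computes and what the 2.653... in the table matches.

```python
from statistics import mean, pstdev

data = [9, 13, 13, 15, 17]
plus_four = [x + 4 for x in data]
times_two = [x * 2 for x in data]

# Print the mean, the range and the standard deviation for each version.
for label, values in [("original", data), ("add 4", plus_four), ("multiply by 2", times_two)]:
    spread = max(values) - min(values)
    print(label, mean(values), spread, round(pstdev(values), 3))
```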
A question ...
Sera and Zeynep are in the basketball team.
The number of points scored in their last 5 games is as follows:
Sera 12, 15, 13, 11, 14
Zeynep 11, 4, 23, 8, 19
The team is playing in the final and YOU are the basketball coach.
The score in the final is close, with 5 minutes left in the game.
Should you put Sera into the game, or Zeynep?
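To inform the decision, it may help to compare an average and a measure of spread for each player. The sketch below assumes the population standard deviation (`statistics.pstdev`) and leaves the coaching choice to you.

```python
from statistics import mean, pstdev

points = {
    "Sera": [12, 15, 13, 11, 14],
    "Zeynep": [11, 4, 23, 8, 19],
}

# For each player: mean, range and standard deviation of points scored.
for player, scores in points.items():
    print(player, mean(scores), max(scores) - min(scores), round(pstdev(scores), 2))
```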