Statistical Analysis
IB Diploma Biology
Stephen Taylor
Image: 'Hummingbird Checks Out Flower'
http://www.flickr.com/photos/25659032@N07/7200193254 Found on flickrcc.net
Assessment Statements Obj.
1.1.1
State that error bars are a graphical representation of the variability of data.
• Range and standard deviation show the variability/spread in the data
• 95% Confidence Interval error bars suggest significance of difference where there is
no overlap.
1
1.1.2
Calculate the mean and standard deviation of a set of values
• Using Excel (Formula =STDEV(rawdata))
• Using your calculator
2
1.1.3
State that the term standard deviation (s) is used to summarize the spread of
values around the mean, and that 68% of all data fall within (±) 1 standard
deviation of the mean.
1
1.1.4
Explain how the standard deviation is useful for comparing the means and the
spread of data between two or more samples.
• A greater standard deviation shows a greater variability of data around the mean.
• This can be used to infer reliability in methods or results.
3
1.1.5
Deduce the significance of the difference between two sets of data using
calculated values for t and the appropriate tables.
• Using t-values, t-tables and critical values
• Directly calculating P values using Excel in lab reports.
3
1.1.6
Explain that the existence of a correlation does not establish that there is a
causal relationship between two variables.
3
Assessment statements from: Online IB Biology Subject Guide
Command terms: http://i-biology.net/ibdpbio/command-terms/
MrT’s Excel Statbook
has guidance and ‘live’ examples of
tables, graphs and statistical tests.
http://i-biology.net/ict-in-ib-biology/spreadsheets-graphing/statexcel/
“Why is this Biology?”
Variation in populations.
Variability in results.
affects
Confidence
in conclusions.
The key methodology in Biology is hypothesis
testing through experimentation.
Carefully-designed and controlled
experiments and surveys give us quantitative
(numeric) data that can be compared.
We can use the data collected to test our
hypothesis and form explanations of the
processes involved… but only if we can be
confident in our results.
We therefore need to be able to evaluate the
reliability of a set of data and the significance
of any differences we have found in the data.
Image: 'Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex'
http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net
“Which medicine should I prescribe?”
Image from: http://www.msf.org/international-activity-report-2010-sierra-leone
Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good/
Generic drugs are out-of-patent, and are
much cheaper than the proprietary
(brand-name) equivalents. Doctors need to
balance needs with available resources.
Which would you choose?
Means (averages) in Biology are almost
never good enough. Biological systems
(and our results) show variability.
Which would you choose now?
Hummingbirds are nectarivores (herbivores
that feed on the nectar of some species of
flower).
In return for food, they pollinate the flower.
This is an example of mutualism –
benefit for all.
As a result of natural selection,
hummingbird bills have evolved.
Birds with a bill best suited to
their preferred food source have
the greater chance of survival.
Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels.
Researchers studying comparative anatomy collect
data on bill-length in two species of hummingbirds:
Archilochus colubris
(ruby-throated hummingbird) and
Cynanthus latirostris (broad-billed hummingbird).
To do this, they need to collect sufficient
relevant, reliable data so they can test
the Null hypothesis (H0) that:
“there is no significant difference
in bill length between the two species.”
Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid
The sample size must
be large enough to provide
sufficient reliable data and for us
to carry out relevant statistical
tests for significance.
We must also be mindful of
uncertainty in our measuring tools
and error in our results.
Photo: Broadbilled hummingbird (wikimedia commons).
The mean is a measure of the central tendency
of a set of data.
Table 1: Raw measurements of bill
length in A. colubris and C. latirostris.
Bill length (±0.1mm)
n A. colubris C. latirostris
1 13.0 17.0
2 14.0 18.0
3 15.0 18.0
4 15.0 18.0
5 15.0 19.0
6 16.0 19.0
7 16.0 19.0
8 18.0 20.0
9 18.0 20.0
10 19.0 20.0
Mean
s
Calculate the mean using:
• Your calculator
(sum of values / n)
• Excel
=AVERAGE(highlight raw data)
n = sample size. The bigger the better.
In this case n=10 for each group.
All values should be centred in the cell, with
decimal places consistent with the measuring
tool uncertainty.
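The calculator method (sum of values / n) can also be sketched in Python. This is only an illustration using the bill lengths from Table 1; the variable and function names are assumptions made for this example and are not part of the original slides.

```python
# Bill lengths (mm, ±0.1 mm) copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def mean(values):
    """Return the sum of the values divided by the sample size n."""
    return sum(values) / len(values)


mean_a = mean(a_colubris)
mean_c = mean(c_latirostris)

# Report to one decimal place, in line with the ±0.1 mm measuring uncertainty.
print(f"A. colubris: mean = {mean_a:.1f} mm (n = {len(a_colubris)})")
print(f"C. latirostris: mean = {mean_c:.1f} mm (n = {len(c_latirostris)})")
```

Excel's =AVERAGE gives the same result when the raw data cells are highlighted.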
Mean bill length (from Table 1): A. colubris 15.9 mm, C. latirostris 18.8 mm.
Raw data and the mean need to have
consistent decimal places (in line with
uncertainty of the measuring tool)
Uncertainties must be included.
Descriptive table title and number.
Graph 1: Comparing mean bill lengths in two
hummingbird species, A. colubris and C. latirostris.
(Bar chart. x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), from 0.0 to 20.0. Labeled points: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.)
Descriptive title, with graph
number.
Labeled point
Y-axis clearly labeled, with
uncertainty.
Make sure that the y-axis
begins at zero.
x-axis labeled
From the means alone
you might conclude
that C. latirostris has a
longer bill than A.
colubris.
But the mean only tells
part of the story.
http://click4biology.info/c4b/1/gcStat.htm
http://mathbits.com/MathBits/TINSection/Statistics1/Spreadsheet.html
Standard deviation is a measure of the spread of
most of the data.
Table 1: Raw measurements of bill
length in A. colubris and C. latirostris.
Bill length (±0.1mm)
n A. colubris C. latirostris
1 13.0 17.0
2 14.0 18.0
3 15.0 18.0
4 15.0 18.0
5 15.0 19.0
6 16.0 19.0
7 16.0 19.0
8 18.0 20.0
9 18.0 20.0
10 19.0 20.0
Mean 15.9 18.8
s 1.91 1.03
Standard deviation can have one more decimal place.
=STDEV (highlight RAW data).
Which of the two sets of data has:
a. The longest mean bill length?
b. The greatest variability in the data?
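As a cross-check on the Excel values, the sample standard deviation can be sketched in Python. The standard-library function statistics.stdev divides by n - 1, which matches Excel's =STDEV; the variable names are assumptions made for this example.

```python
import statistics

# Bill lengths (mm, ±0.1 mm) copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation (divides by n - 1).
s_a = statistics.stdev(a_colubris)
s_c = statistics.stdev(c_latirostris)

# One more decimal place than the raw data, as the slide suggests.
print(f"A. colubris: s = {s_a:.2f} mm")
print(f"C. latirostris: s = {s_c:.2f} mm")

# The larger s marks the data set with the greater variability around its mean.
more_variable = "A. colubris" if s_a > s_c else "C. latirostris"
print(f"Greater variability: {more_variable}")
```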
a. C. latirostris
b. A. colubris
Standard deviation is a measure of the spread of
most of the data. Error bars are a graphical
representation of the variability of data.
Which of the two sets of data has:
a. The highest mean?
b. The greatest variability in the data?
A
B
Error bars could represent standard deviation, range or confidence intervals.
Put the error bars for standard deviation on our graph.
Delete the horizontal error bars
Graph 1: Comparing mean bill lengths in two
hummingbird species, A. colubris and C. latirostris.
(error bars = standard deviation)
(Bar chart. x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm). Labeled points: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.)
Title is adjusted to
show the source of the
error bars. This is very
important.
You can see the clear
difference in the size of
the error bars.
Variability has been
visualised.
The error bars overlap
somewhat.
What does this mean?
The overlap of a set of error bars gives a clue as to the
significance of the difference between two sets of data.
Large overlap
• Lots of shared data points within each data set.
• Results are not likely to be significantly different from each other.
• Any difference is most likely due to chance.
No overlap
• No (or very few) shared data points within each data set.
• Results are more likely to be significantly different from each other.
• The difference is more likely to be ‘real’.
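The overlap check can be sketched in code. This minimal example assumes each error bar runs from mean - s to mean + s and uses the means and standard deviations from Table 1; the function names are hypothetical, and the result is only a visual clue, not a statistical test.

```python
def error_bar(mean, s):
    """Return the (low, high) ends of an error bar of one standard deviation."""
    return (mean - s, mean + s)


def bars_overlap(bar_1, bar_2):
    """Return True if two (low, high) error bars share any values."""
    low_1, high_1 = bar_1
    low_2, high_2 = bar_2
    return low_1 <= high_2 and low_2 <= high_1


# Means and standard deviations (mm) from Table 1.
bar_a = error_bar(15.9, 1.91)  # A. colubris: about 13.99 to 17.81
bar_c = error_bar(18.8, 1.03)  # C. latirostris: about 17.77 to 19.83

overlap = bars_overlap(bar_a, bar_c)
print(f"A. colubris bar: {bar_a[0]:.2f} to {bar_a[1]:.2f} mm")
print(f"C. latirostris bar: {bar_c[0]:.2f} to {bar_c[1]:.2f} mm")
print(f"Error bars overlap: {overlap}")
```

With these values the bars overlap by only a few hundredths of a millimetre, so the picture alone cannot settle whether the difference is significant.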
Graph 1: Comparing mean bill lengths in two
hummingbird species, A. colubris and C.
latirostris. (error bars = standard deviation)
(Bar chart. x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm). Labeled points: A. colubris, 15.9 mm (n=10); C. latirostris, 18.8 mm (n=10).)
Our results show a very small overlap
between the two sets of data.
So how do we know if the difference is
significant or not?
We need to use a statistical test.
The t-test is a statistical
test that helps us determine
the significance of the
difference between the
means of two sets of data.
The Null Hypothesis (H0):
“There is no significant
difference.”
This is the ‘default’ hypothesis that we always test.
In our conclusion, we either accept the null hypothesis or reject it.
A t-test can be used to test whether the difference between two means is significant.
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.
Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and
then analyse the results using statistical analysis.
P value =            0.1     0.05    0.02    0.01
confidence           90%     95%     98%     99%
degrees of freedom
1                    6.31    12.71   31.82   63.66
2                    2.92    4.30    6.96    9.92
3                    2.35    3.18    4.54    5.84
4                    2.13    2.78    3.75    4.60
5                    2.02    2.57    3.37    4.03
6                    1.94    2.45    3.14    3.71
7                    1.89    2.36    3.00    3.50
8                    1.86    2.31    2.90    3.36
9                    1.83    2.26    2.82    3.25
10                   1.81    2.23    2.76    3.17
We can calculate the value of ‘t’ for a given set of data and compare it
to critical values that depend on the size of our sample and the level of
confidence we need.
Example two-tailed t-table.
“Degrees of Freedom (df)” is
the total sample size minus two.
What happens to the value of P
as the confidence in the results
increases?
What happens to the critical
value as the confidence level
increases?
“critical values”
“Degrees of Freedom (df)” is
the total sample size (both groups
combined) minus two*.
We usually use P<0.05 (95%
confidence) in Biology, as our
data can be highly variable
*Simple explanation: we are working in
two directions – within each population
and across populations.
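Looking up a critical value can be sketched as follows. The dictionary copies the P = 0.05 column of the example two-tailed t-table; the two sample sizes are hypothetical values chosen only so that df falls inside that table, and the names are assumptions made for this example.

```python
# P = 0.05 (95% confidence) column of the example two-tailed t-table,
# keyed by degrees of freedom.
CRITICAL_T_P05 = {
    1: 12.71, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57,
    6: 2.45, 7: 2.36, 8: 2.31, 9: 2.26, 10: 2.23,
}


def degrees_of_freedom(n_1, n_2):
    """Total sample size across both groups, minus two."""
    return n_1 + n_2 - 2


# Hypothetical sample sizes: two groups of six individuals each.
df = degrees_of_freedom(6, 6)
cv = CRITICAL_T_P05[df]
print(f"df = {df}, critical value at P = 0.05: {cv}")
```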
“critical values”
2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php
t was calculated as 2.15 (this is done for you).
cv (critical value at P = 0.05) = 2.069
t > cv (2.15 > 2.069)
If t < cv, accept H0 (there is no significant difference)
If t > cv, reject H0 (there is a significant difference)
Conclusion:
“There is a significant difference in the wing spans of
the two populations of birds.”
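The decision rule above can be written as a small function. This is a sketch under two assumptions that the slides do not state: the sign of t is ignored (a two-tailed comparison), and a t exactly equal to the critical value is treated as not exceeding it. The function name is hypothetical.

```python
def t_test_conclusion(t, cv):
    """Compare a calculated t with the critical value (cv) from the t-table."""
    # Assumption: only the size of t matters, so its sign is dropped.
    if abs(t) > cv:
        return "reject H0: there is a significant difference"
    # Assumption: t equal to cv is treated the same as t < cv.
    return "accept H0: there is no significant difference"


# Values from the worked example: t = 2.15, cv = 2.069 at P = 0.05.
conclusion = t_test_conclusion(2.15, 2.069)
print(conclusion)
```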
cv = 2.045
“There is no significant difference in the size of shells
between north-side and south-side snail populations.”
cv = 2.086
“There is a significant difference in the resting heart
rates between the two groups of swimmers.”
Excel can jump straight to a value of P for our results.
One function (=ttest) compares both sets of data.
As it calculates P directly (the
probability that the difference is due
to chance), we can determine
significance directly.
In this case, P=0.00051
This is much smaller than 0.05, so
we are confident that we can:
reject H0.
The difference is unlikely to be due to
chance.
Conclusion:
There is a significant difference in bill
length between A. colubris and C.
latirostris.
Two tails: we assume data are normally distributed, with two ‘tails’ moving away from mean.
Type 2 (unpaired): we are comparing one whole population with the other whole population.
(Type 1 pairs the results of each individual in set A with the same individual in set B).
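For readers who want to see what sits behind the unpaired (type 2) test, here is a minimal sketch that calculates t for the Table 1 data using a pooled variance, which is the equal-variance form of the two-sample t-test. It uses only the Python standard library, so it stops at t and df rather than calculating P; the function name is an assumption made for this example.

```python
import math
import statistics

# Bill lengths (mm, ±0.1 mm) copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]


def unpaired_t(sample_1, sample_2):
    """Return (t, df) for an unpaired, equal-variance two-sample t-test."""
    n_1, n_2 = len(sample_1), len(sample_2)
    df = n_1 + n_2 - 2
    # Pool the two sample variances, weighting each by its own n - 1.
    pooled_variance = (
        (n_1 - 1) * statistics.variance(sample_1)
        + (n_2 - 1) * statistics.variance(sample_2)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error
    return t, df


t, df = unpaired_t(c_latirostris, a_colubris)
print(f"t = {t:.2f}, df = {df}")
```

With df = 18, the calculated t can be compared with the critical value of 2.10 listed for 18 degrees of freedom at P = 0.05 in the longer t-table in this deck.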
95% Confidence Intervals can also be plotted as error bars.
These give a clearer indication of the significance of a result:
• Where there is overlap, there is not a significant difference
• Where there is no overlap, there is a significant difference.
• If the overlap (or difference) is small, a t-test should still be carried out.
no overlap
=CONFIDENCE.NORM(0.05,stdev,samplesize)
e.g =CONFIDENCE.NORM(0.05,C15,10)
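The calculation behind =CONFIDENCE.NORM can be sketched in Python: it returns the half-width of the interval, z × s / √n, where z comes from the normal distribution. The function name below is a hypothetical stand-in for the Excel function, and the inputs are the standard deviations and sample size from Table 1. Because it relies on the normal distribution, treat the result as approximate for a sample as small as n = 10.

```python
import math
from statistics import NormalDist


def confidence_norm(alpha, stdev, n):
    """Half-width of a (1 - alpha) confidence interval for the mean."""
    # z value for a two-tailed interval, about 1.96 when alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * stdev / math.sqrt(n)


# Standard deviations (mm) and sample size from Table 1.
ci_a = confidence_norm(0.05, 1.91, 10)  # A. colubris
ci_c = confidence_norm(0.05, 1.03, 10)  # C. latirostris

print(f"A. colubris: 15.9 ± {ci_a:.2f} mm")
print(f"C. latirostris: 18.8 ± {ci_c:.2f} mm")

# The upper end of one interval lies below the lower end of the other.
no_overlap = (15.9 + ci_a) < (18.8 - ci_c)
print(f"No overlap between the 95% confidence intervals: {no_overlap}")
```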
Error bars can have very different purposes.
Standard deviation
• You really need to know this
• Look for relative size of bars
• Used to indicate spread of most
of the data around the mean
• Can imply reliability of data
95% Confidence Intervals
• Adds value to labs where we are
looking for differences.
• Look for overlap, not size
• Overlap → no sig. diff.
• No overlap → sig. diff.
Interesting Study: Do “Better” Lecturers Cause More Learning?
Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/
Students watched a one-minute video of a lecture. In one video, the lecturer was
fluent and engaging. In the other video, the lecturer was less fluent.
They predicted how much they would learn on the topic
(genetics) and this was compared to their actual score.
(Error bars = standard deviation).
n=21 n=21
Is there a significant difference in the actual learning?
Evaluate the study:
1. What do the error bars (standard deviation) tell us about reliability?
2. How valid is the study in terms of sufficiency of data (population sizes (n))?
Dog fleas jump higher than cat fleas: winner of
the Ig Nobel Prize for Biology, 2008.
http://www.youtube.com/watch?v=fJEZg4QN760
P value = 0.1 0.05 0.02 0.01 0.005
confidence 90% 95% 98% 99% 99.5%
degrees of freedom (first column)
1 6.31 12.71 31.82 63.66 127.34
2 2.92 4.30 6.96 9.92 14.09
3 2.35 3.18 4.54 5.84 7.45
4 2.13 2.78 3.75 4.60 5.60
5 2.02 2.57 3.37 4.03 4.77
6 1.94 2.45 3.14 3.71 4.32
7 1.89 2.36 3.00 3.50 4.03
8 1.86 2.31 2.90 3.36 3.83
9 1.83 2.26 2.82 3.25 3.69
10 1.81 2.23 2.76 3.17 3.58
11 1.80 2.20 2.72 3.11 3.50
12 1.78 2.18 2.68 3.05 3.43
13 1.77 2.16 2.65 3.01 3.37
14 1.76 2.14 2.62 2.98 3.33
15 1.75 2.13 2.60 2.95 3.29
16 1.75 2.12 2.58 2.92 3.25
17 1.74 2.11 2.57 2.90 3.22
18 1.73 2.10 2.55 2.88 3.20
19 1.73 2.09 2.54 2.86 3.17
20 1.72 2.09 2.53 2.85 3.15
21 1.72 2.08 2.52 2.83 3.14
22 1.72 2.07 2.51 2.82 3.12
23 1.71 2.07 2.50 2.81 3.10
24 1.71 2.06 2.49 2.80 3.09
25 1.71 2.06 2.49 2.79 3.08
26 1.71 2.06 2.48 2.78 3.07
27 1.70 2.05 2.47 2.77 3.06
28 1.70 2.05 2.47 2.76 3.05
29 1.70 2.05 2.46 2.76 3.04
30 1.70 2.04 2.46 2.75 3.03
31 1.70 2.04 2.45 2.74 3.02
32 1.69 2.04 2.45 2.74 3.02
33 1.69 2.03 2.44 2.73 3.01
34 1.69 2.03 2.44 2.73 3.00
35 1.69 2.03 2.44 2.72 3.00
36 1.69 2.03 2.43 2.72 2.99
37 1.69 2.03 2.43 2.72 2.99
38 1.69 2.02 2.43 2.71 2.98
39 1.68 2.02 2.43 2.71 2.98
40 1.68 2.02 2.42 2.70 2.97
Cartoon from: http://www.xkcd.com/552/
Correlation does not imply causation, but it does waggle its eyebrows
suggestively and gesture furtively while mouthing "look over there."
From MrT’s Excel Statbook.
http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity
Interpreting Graphs: See – Think – Wonder
See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?
Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and
what explanations are not possible?
Wonder: Questions about the graph.
• What do you need to
know more about?
See – Think - Wonder
Visible Thinking Routine
http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity
Diabetes and obesity are ‘risk factors’ of each other.
There is a strong correlation between them,
but does this mean one causes the other?
Correlation does not imply causality.
Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming
Where correlations exist, we must then design solid scientific experiments to determine the
cause of the relationship. Sometimes a correlation exists because of confounding variables:
factors that influence both correlated variables, even though the two variables do not
directly affect each other.
To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the
independent variable
• Strict control of all other variables that might have a measurable impact on the
dependent variable.
We need: sufficient relevant, repeatable and statistically significant data.
Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity
Flamenco Dancer, by Steve Corey
http://www.flickr.com/photos/22016744@N06/7952552148
i-Biology.net
This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted.
Please consider a donation to charity via Biology4Good.
@IBiologyStephen

 
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading RoomSternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
Sternal Fractures & Dislocations - EMGuidewire Radiology Reading Room
 
An Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge AppAn Overview of the Odoo 17 Knowledge App
An Overview of the Odoo 17 Knowledge App
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical Principles
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
 
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdfFICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
 
MOOD STABLIZERS DRUGS.pptx
MOOD     STABLIZERS           DRUGS.pptxMOOD     STABLIZERS           DRUGS.pptx
MOOD STABLIZERS DRUGS.pptx
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
 
Major project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesMajor project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategies
 
How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 

Statistical Analysis

  • 1. Statistical Analysis. IB Diploma Biology. Stephen Taylor. Image: 'Hummingbird Checks Out Flower' http://www.flickr.com/photos/25659032@N07/7200193254 Found on flickrcc.net
  • 2. Assessment Statements (Obj. = objective level)
      1.1.1 (Obj. 1) State that error bars are a graphical representation of the variability of data.
          • Range and standard deviation show the variability/spread in the data.
          • 95% Confidence Interval error bars suggest significance of difference where there is no overlap.
      1.1.2 (Obj. 2) Calculate the mean and standard deviation of a set of values.
          • Using Excel (formula: =STDEV(rawdata))
          • Using your calculator
      1.1.3 (Obj. 1) State that the term standard deviation (s) is used to summarize the spread of values around the mean, and that 68% of all data fall within (±) 1 standard deviation of the mean.
      1.1.4 (Obj. 3) Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples.
          • A greater standard deviation shows a greater variability of data around the mean.
          • This can be used to infer reliability in methods or results.
      1.1.5 (Obj. 3) Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables.
          • Using t-values, t-tables and critical values
          • Directly calculating P values using Excel in lab reports.
      1.1.6 (Obj. 3) Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.
      Assessment statements from: Online IB Biology Subject Guide. Command terms: http://i-biology.net/ibdpbio/command-terms/
  • 3. MrT’s Excel Statbook has guidance and ‘live’ examples of tables, graphs and statistical tests. http://i-biology.net/ict-in-ib-biology/spreadsheets-graphing/statexcel/
  • 4. “Why is this Biology?” Variation in populations. Variability in results. This affects confidence in conclusions. The key methodology in Biology is hypothesis testing through experimentation. Carefully-designed and controlled experiments and surveys give us quantitative (numeric) data that can be compared. We can use the data collected to test our hypothesis and form explanations of the processes involved… but only if we can be confident in our results. We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data. Image: 'Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex' http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net
  • 5. “Which medicine should I prescribe?” Image from: http://www.msf.org/international-activity-report-2010-sierra-leone Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good/
  • 6. “Which medicine should I prescribe?” Generic drugs are out-of-patent, and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?
  • 7. “Which medicine should I prescribe?” Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability. Which would you choose now?
  • 8. Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower). In return for food, they pollinate the flower. This is an example of mutualism – benefit for all. As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival. Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels.
  • 9. Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broad-billed hummingbird). To do this, they need to collect sufficient relevant, reliable data so they can test the null hypothesis (H0) that: “there is no significant difference in bill length between the two species.” Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid
  • 10. The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance. We must also be mindful of uncertainty in our measuring tools and error in our results. Photo: Broad-billed hummingbird (wikimedia commons).
  • 12. The mean is a measure of the central tendency of a set of data.
      Table 1: Raw measurements of bill length in A. colubris and C. latirostris.
      Bill length (±0.1 mm)
         n     A. colubris   C. latirostris
         1        13.0           17.0
         2        14.0           18.0
         3        15.0           18.0
         4        15.0           18.0
         5        15.0           19.0
         6        16.0           19.0
         7        16.0           19.0
         8        18.0           20.0
         9        18.0           20.0
        10        19.0           20.0
        Mean
        s
      Calculate the mean using:
          • Your calculator (sum of values / n)
          • Excel: =AVERAGE(highlight raw data)
      n = sample size. The bigger the better. In this case n = 10 for each group. All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.
  • 13. The mean is a measure of the central tendency of a set of data. Table 1 as on slide 12, with the Mean row completed: A. colubris 15.9, C. latirostris 18.8 (s still blank). Raw data and the mean need to have consistent decimal places (in line with the uncertainty of the measuring tool). Uncertainties must be included. Descriptive table title and number.
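The mean calculation on slides 12 and 13 can also be checked outside Excel. The following is a minimal sketch in Python using only the standard library; the variable names are illustrative and not part of the original slides, and the data are the raw values from Table 1.

```python
from statistics import mean

# Raw bill lengths (±0.1 mm) from Table 1, n = 10 per species
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Mean = sum of values / n, reported to the same decimal places as the raw data
mean_a = round(mean(a_colubris), 1)
mean_c = round(mean(c_latirostris), 1)

print(f"A. colubris mean: {mean_a} mm")
print(f"C. latirostris mean: {mean_c} mm")
```

Rounding to one decimal place follows the slide's advice to keep the mean consistent with the ±0.1 mm measuring uncertainty.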
  • 18. [Graph 1: bar chart. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm), 0.0 to 20.0. Labeled points: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.] Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. Annotations: descriptive title, with graph number; labeled point; y-axis clearly labeled, with uncertainty; make sure that the y-axis begins at zero; x-axis labeled.
  • 19. [Graph 1, as on slide 18.] From the means alone you might conclude that C. latirostris has a longer bill than A. colubris. But the mean only tells part of the story.
  • 35. Standard deviation is a measure of the spread of most of the data.
      Table 1: Raw measurements of bill length in A. colubris and C. latirostris.
      Bill length (±0.1 mm)
         n     A. colubris   C. latirostris
         1        13.0           17.0
         2        14.0           18.0
         3        15.0           18.0
         4        15.0           18.0
         5        15.0           19.0
         6        16.0           19.0
         7        16.0           19.0
         8        18.0           20.0
         9        18.0           20.0
        10        19.0           20.0
        Mean      15.9           18.8
        s         1.91           1.03
      Standard deviation can have one more decimal place. Excel: =STDEV(highlight RAW data).
      Which of the two sets of data has:
          a. The greatest mean bill length?
          b. The greatest variability in the data?
  • 36. Same table and questions as slide 35, with answers: a. C. latirostris; b. A. colubris.
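The standard deviations in Table 1 can be reproduced with a short sketch in Python. It assumes, as Excel's =STDEV does, the sample standard deviation (dividing by n − 1); the variable names are illustrative only.

```python
from statistics import stdev

# Raw bill lengths (±0.1 mm) from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# statistics.stdev is the sample standard deviation (n - 1 denominator),
# which is what Excel's =STDEV computes
s_a = round(stdev(a_colubris), 2)
s_c = round(stdev(c_latirostris), 2)

print(f"A. colubris s = {s_a}")
print(f"C. latirostris s = {s_c}")

# The larger s indicates the greater variability around the mean
more_variable = "A. colubris" if s_a > s_c else "C. latirostris"
print(f"Greater variability: {more_variable}")
```

The standard deviation is reported to two decimal places, one more than the raw data, as the slide allows.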
  • 37. Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data. Which of the two sets of data (A or B) has: a. The highest mean? b. The greatest variability in the data? Error bars could represent standard deviation, range or confidence intervals.
  • 40. Put the error bars for standard deviation on our graph. Delete the horizontal error bars
  • 41. [Graph 1: bar chart with error bars. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm), 0.0 to 20.0. Labeled points: A. colubris, 15.9 mm; C. latirostris, 18.8 mm.] Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Title is adjusted to show the source of the error bars. This is very important. You can see the clear difference in the size of the error bars. Variability has been visualised. The error bars overlap somewhat. What does this mean?
  • 42. The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.
      Large overlap: Lots of shared data points within each data set. Results are not likely to be significantly different from each other. Any difference is most likely due to chance.
      No overlap: No (or very few) shared data points within each data set. Results are more likely to be significantly different from each other. The difference is more likely to be ‘real’.
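To see how much the standard-deviation error bars for the hummingbird data actually overlap, the bars can be computed as mean ± s and compared. This is a minimal sketch in Python, assuming the Table 1 data; the names are illustrative and the overlap check is only a visual clue, not a significance test.

```python
from statistics import mean, stdev

# Raw bill lengths (±0.1 mm) from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def sd_bar(values):
    """Return (lower, upper) ends of an error bar of mean ± 1 standard deviation."""
    m, s = mean(values), stdev(values)
    return m - s, m + s

low_a, high_a = sd_bar(a_colubris)
low_c, high_c = sd_bar(c_latirostris)

# Positive value = the two bars share this much of the y-axis (in mm);
# zero or negative = the bars do not overlap
overlap = min(high_a, high_c) - max(low_a, low_c)

print(f"A. colubris bar: {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris bar: {low_c:.2f} to {high_c:.2f} mm")
print(f"Overlap: {overlap:.2f} mm")
```

A small positive overlap like this one is exactly the case where the bars alone cannot settle the question and a statistical test is needed.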
  • 46. [Graph 1: bar chart with error bars. x-axis: species of hummingbird. y-axis: mean bill length (±0.1 mm). Labeled points: A. colubris, 15.9 mm (n=10); C. latirostris, 18.8 mm (n=10).] Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). Our results show a very small overlap between the two sets of data. So how do we know if the difference is significant or not? We need to use a statistical test. The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.
  • 48. The Null Hypothesis (H0): “There is no significant difference.” This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it. A t-test can be used to test whether the difference between two means is significant. • If we accept H0, then the means are not significantly different. • If we reject H0, then the means are significantly different. Remember: • We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.
  • 49. We can calculate the value of ‘t’ for a given set of data and compare it to critical values that depend on the size of our sample and the level of confidence we need.
      Example two-tailed t-table (the body of the table holds the “critical values”):
        P value =      0.1     0.05    0.02    0.01
        confidence     90%     95%     98%     99%
        df  1          6.31   12.71   31.82   63.66
        df  2          2.92    4.30    6.96    9.92
        df  3          2.35    3.18    4.54    5.84
        df  4          2.13    2.78    3.75    4.60
        df  5          2.02    2.57    3.37    4.03
        df  6          1.94    2.45    3.14    3.71
        df  7          1.89    2.36    3.00    3.50
        df  8          1.86    2.31    2.90    3.36
        df  9          1.83    2.26    2.82    3.25
        df 10          1.81    2.23    2.76    3.17
      “Degrees of Freedom (df)” is the total sample size minus two. What happens to the value of P as the confidence in the results increases? What happens to the critical value as the confidence level increases?
  • 50. Same t-table as slide 49. “Degrees of Freedom (df)” is the total sample size minus two*. We usually use P<0.05 (95% confidence) in Biology, as our data can be highly variable. *Simple explanation: we are working in two directions – within each population and across populations.
  • 51. 2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php
  • 55. t was calculated as 2.15 (this is done for you). The critical value (cv) at P = 0.05 is 2.069, so t > cv (2.15 > 2.069). If t < cv, accept H0 (there is no significant difference). If t > cv, reject H0 (there is a significant difference). Conclusion: “There is a significant difference in the wing spans of the two populations of birds.” 2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php
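The decision rule on slide 55 can be written as a small function. This is a sketch in Python; the function name `decide` is hypothetical, and the only inputs used are the slide's own values (t = 2.15, cv = 2.069). The slides do not say what to do when t equals cv exactly; this sketch treats that case as not significant.

```python
def decide(t: float, critical_value: float) -> str:
    """Apply the slide's rule: reject H0 only if t exceeds the critical value."""
    # abs() because a two-tailed test ignores the sign of t
    if abs(t) > critical_value:
        return "reject H0 (there is a significant difference)"
    return "accept H0 (there is no significant difference)"

# Values from the worked example: t = 2.15, cv = 2.069 at P = 0.05
print(decide(2.15, 2.069))
```

The same function applies to any row of a two-tailed t-table once t and the degrees of freedom are known.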
  • 58. Critical value highlighted in the table: 2.045. “There is no significant difference in the size of shells between north-side and south-side snail populations.” 2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php
  • 60. Critical value highlighted in the table: 2.086. “There is a significant difference in the resting heart rates between the two groups of swimmers.” 2-tailed t-table source: http://www.medcalc.org/manual/t-distribution.php
  • 61. Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data. As it calculates P directly (the probability of getting a difference at least this large by chance alone if H0 were true), we can determine significance directly. In this case, P = 0.00051. This is much smaller than 0.05, so we are confident that we can reject H0. The difference is unlikely to be due to chance. Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.
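For readers who want to see what sits behind Excel's P value, the t statistic for the hummingbird data can be computed by hand. This is a minimal sketch in Python, assuming an unpaired two-sample t-test with pooled (equal) variance; it stops at t and the degrees of freedom rather than computing P, because the standard library has no t-distribution. The critical value 2.10 is read from the df = 18, P = 0.05 entry of the t-table on slide 71.

```python
from math import sqrt
from statistics import mean, variance

# Raw bill lengths (±0.1 mm) from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

n_a, n_c = len(a_colubris), len(c_latirostris)

# Degrees of freedom = total sample size minus two
df = n_a + n_c - 2

# Pooled sample variance (assumes both populations have similar spread)
pooled_var = ((n_a - 1) * variance(a_colubris)
              + (n_c - 1) * variance(c_latirostris)) / df

# Standard error of the difference between the two means
std_err = sqrt(pooled_var * (1 / n_a + 1 / n_c))

t = (mean(c_latirostris) - mean(a_colubris)) / std_err

critical_value = 2.10  # two-tailed, P = 0.05, df = 18 (from the slide 71 table)

print(f"t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > critical_value else "accept H0")
```

Because t is well above the critical value, the hand calculation agrees with the conclusion Excel reaches from P.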
  • 63. Two tails: we assume data are normally distributed, with two ‘tails’ moving away from mean. Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B).
  • 65. 95% Confidence Intervals can also be plotted as error bars. These give a clearer indication of the significance of a result:
          • Where there is overlap, there is not a significant difference.
          • Where there is no overlap, there is a significant difference.
          • If the overlap (or difference) is small, a t-test should still be carried out.
      Excel: =CONFIDENCE.NORM(0.05,stdev,samplesize), e.g. =CONFIDENCE.NORM(0.05,C15,10)
  • 66. Error bars can have very different purposes.
      Standard deviation:
          • You really need to know this
          • Look for relative size of bars
          • Used to indicate spread of most of the data around the mean
          • Can imply reliability of data
      95% Confidence Intervals:
          • Adds value to labs where we are looking for differences
          • Look for overlap, not size
          • Overlap → no sig. diff.
          • No overlap → sig. diff.
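The 95% confidence interval error bars can be sketched in Python as well. This example assumes the same calculation as Excel's =CONFIDENCE.NORM, that is, a margin of z × s / √n with z taken from the normal distribution; with n = 10 a t-based margin would be somewhat wider. The names are illustrative and the data are from Table 1.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Raw bill lengths (±0.1 mm) from Table 1
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def confidence_margin(values, alpha=0.05):
    """Half-width of the confidence interval: z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z * stdev(values) / sqrt(len(values))

margin_a = confidence_margin(a_colubris)
margin_c = confidence_margin(c_latirostris)

upper_a = mean(a_colubris) + margin_a
lower_c = mean(c_latirostris) - margin_c

print(f"A. colubris: {mean(a_colubris):.1f} ± {margin_a:.2f} mm")
print(f"C. latirostris: {mean(c_latirostris):.1f} ± {margin_c:.2f} mm")

# No overlap if the top of the lower bar sits below the bottom of the upper bar
print("no overlap" if upper_a < lower_c else "overlap")
```

Unlike the standard-deviation bars, these intervals do not overlap, which is consistent with the significant t-test result.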
  • 67. Interesting Study: Do “Better” Lecturers Cause More Learning? Find out more here: http://priceonomics.com/is-this-why-ted-talks-seem-so-convincing/ Students watched a one-minute video of a lecture. In one video, the lecturer was fluent and engaging. In the other video, the lecturer was less fluent. They predicted how much they would learn on the topic (genetics) and this was compared to their actual score. (Error bars = standard deviation; n=21 in each group.)
  • 68. Interesting Study: Do “Better” Lecturers Cause More Learning? Is there a significant difference in the actual learning?
  • 69. Interesting Study: Do “Better” Lecturers Cause More Learning? Evaluate the study: 1. What do the error bars (standard deviation) tell us about reliability? 2. How valid is the study in terms of sufficiency of data (sample sizes (n))?
  • 70. Dog fleas jump higher than cat fleas, winner of the Ig Nobel prize for Biology, 2008. http://www.youtube.com/watch?v=fJEZg4QN760
  • 71. Two-tailed t-table of critical values (df = degrees of freedom):
        P value =      0.1     0.05    0.02    0.01    0.005
        confidence     90%     95%     98%     99%     99.50%
        df  1          6.31   12.71   31.82   63.66   127.34
        df  2          2.92    4.30    6.96    9.92    14.09
        df  3          2.35    3.18    4.54    5.84     7.45
        df  4          2.13    2.78    3.75    4.60     5.60
        df  5          2.02    2.57    3.37    4.03     4.77
        df  6          1.94    2.45    3.14    3.71     4.32
        df  7          1.89    2.36    3.00    3.50     4.03
        df  8          1.86    2.31    2.90    3.36     3.83
        df  9          1.83    2.26    2.82    3.25     3.69
        df 10          1.81    2.23    2.76    3.17     3.58
        df 11          1.80    2.20    2.72    3.11     3.50
        df 12          1.78    2.18    2.68    3.05     3.43
        df 13          1.77    2.16    2.65    3.01     3.37
        df 14          1.76    2.14    2.62    2.98     3.33
        df 15          1.75    2.13    2.60    2.95     3.29
        df 16          1.75    2.12    2.58    2.92     3.25
        df 17          1.74    2.11    2.57    2.90     3.22
        df 18          1.73    2.10    2.55    2.88     3.20
        df 19          1.73    2.09    2.54    2.86     3.17
        df 20          1.72    2.09    2.53    2.85     3.15
        df 21          1.72    2.08    2.52    2.83     3.14
        df 22          1.72    2.07    2.51    2.82     3.12
        df 23          1.71    2.07    2.50    2.81     3.10
        df 24          1.71    2.06    2.49    2.80     3.09
        df 25          1.71    2.06    2.49    2.79     3.08
        df 26          1.71    2.06    2.48    2.78     3.07
        df 27          1.70    2.05    2.47    2.77     3.06
        df 28          1.70    2.05    2.47    2.76     3.05
        df 29          1.70    2.05    2.46    2.76     3.04
        df 30          1.70    2.04    2.46    2.75     3.03
        df 31          1.70    2.04    2.45    2.74     3.02
        df 32          1.69    2.04    2.45    2.74     3.02
        df 33          1.69    2.03    2.44    2.73     3.01
        df 34          1.69    2.03    2.44    2.73     3.00
        df 35          1.69    2.03    2.44    2.72     3.00
        df 36          1.69    2.03    2.43    2.72     2.99
        df 37          1.69    2.03    2.43    2.72     2.99
        df 38          1.69    2.02    2.43    2.71     2.98
        df 39          1.68    2.02    2.43    2.71     2.98
        df 40          1.68    2.02    2.42    2.70     2.97
  • 72. Cartoon from: http://www.xkcd.com/552/ Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing "look over there."
  • 76. From MrT’s Excel Statbook.
  • 77. Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine). Graph source: http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity
      See: What is factual about the graph?
          • What are the axes?
          • What is being plotted?
          • What values are present?
      Think: How is the graph interpreted?
          • What relationship is present?
          • Is cause implied?
          • What explanations are possible and what explanations are not possible?
      Wonder: Questions about the graph.
          • What do you need to know more about?
  • 78. http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?
  • 79. Correlation does not imply causality. Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming
  • 80. Correlation does not imply causality. Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming
      Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: conditions that the correlated variables have in common but that do not directly affect each other.
      To be able to determine causality through experimentation we need:
          • One clearly identified independent variable
          • Carefully measured dependent variable(s) that can be attributed to change in the independent variable
          • Strict control of all other variables that might have a measurable impact on the dependent variable.
      We need: sufficient relevant, repeatable and statistically significant data.
      Some known causal relationships:
          • Atmospheric CO2 concentrations and global warming
          • Atmospheric CO2 concentrations and the rate of photosynthesis
          • Temperature and enzyme activity
  • 82. Flamenco Dancer, by Steve Corey http://www.flickr.com/photos/22016744@N06/7952552148
  • 83. i-Biology.net This is a Creative Commons presentation. It may be linked and embedded but not sold or re-hosted. Please consider a donation to charity via Biology4Good. Click here for more information about Biology4Good charity donations. @IBiologyStephen