Statistical Analysis
IB Diploma Biology
Objectives of this Unit:
• Types of data, types of graphs, and the applications and statistics that match your data
– Bar graph, line graph, scatter plot, histogram, pie chart
– Mean, S.D., regression, chi-square analysis
• State that error bars are a graphical representation of the variability of data
– Range and standard deviation show the variability/spread in the data
• Calculate the mean and standard deviation of a set of values
– Using Excel formulas
– Given a mean and S.D., state the range for different parameters
• State that the term standard deviation is used to summarize the spread of values around the mean
– For normally distributed data, about 68% of all values lie within ±1 standard deviation of the mean and about 95% within ±2 S.D.
• Explain how S.D. is useful for comparing the means and spread of data between two or more samples
– Greater S.D. shows greater variability of data
– This can be used to infer reliability in methods or results, BUT in Biology we also expect variability
• Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables
– Using the t value, a t table and critical values
– Directly calculating P values using Excel in lab reports
– Difference between P and t
• Explain that correlation does not establish that there is a causal relationship between two variables
• Proper lab format
• Designing the lab process
What are statistics?
• Statistics are numbers used to:
Describe and draw conclusions about DATA
• These are called descriptive (or “univariate”) and
inferential (or “analytical”) statistics, respectively.
Variables
• A variable is anything we can measure/observe
• Three types:
– Continuous: values span an uninterrupted range (e.g. height)
– Discrete: only certain fixed values are possible (e.g. counts)
– Categorical: values are qualitatively assigned (e.g. low/med/hi)
• Dependence in variables:
“Dependent variables depend on independent ones”
– Independent variable – variable you are changing
– Dependent variable – variable you measure to see result
– Controlled variables – variables that could also affect the
dependent variable, so you keep them constant
*** Experimental Control – NOT the same as controlled variables
Descriptive statistics
Numerical
– Mean
– Variance
• Standard deviation
• Standard error
– Median
– Mode
– Skew
– etc.
Graphical
– Histogram
– Boxplot
– Scatterplot
– etc.
Techniques to summarize data
Graphic Applications
What graph to use?
(Blank table for your notes. Columns: Line, Scatter, Histogram, Bar. Rows: Appropriate for data when; Important features include; Sample and other notes.)
Outlier - An outlier is an observation that lies an abnormal distance from other values in
a random sample from a population. In a sense, this definition leaves it up to the analyst
(or a consensus process) to decide what will be considered abnormal. Before abnormal
observations can be singled out, it is necessary to characterize normal observations.
Numeric Descriptive Statistics
The Mean:
Most important measure of “central tendency”
Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Sample Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Additional central tendency measures
Median: the 50th percentile
$$M = X_{(n+1)/2} \quad (n \text{ is odd})$$
$$M = \frac{X_{n/2} + X_{(n/2)+1}}{2} \quad (n \text{ is even})$$
Mode: the most common value
1, 1, 2, 4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 12, 15
Which to use: mean, median or mode?
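As a small illustration, the three measures can be computed for the data set listed above. This is only a sketch using Python's standard `statistics` module; the variable names are chosen for the example and are not part of the original slides.

```python
import statistics

# Data set listed on the slide for the mode example.
values = [1, 1, 2, 4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 12, 15]

mean = statistics.mean(values)      # sum of the values divided by n
median = statistics.median(values)  # n is even, so the two middle values are averaged
mode = statistics.mode(values)      # the most common value

print(f"mean   = {mean:.2f}")
print(f"median = {median}")
print(f"mode   = {mode}")
```

Here the mean is pulled slightly below the median and mode by the cluster of small values, which is one reason to look at more than one measure.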
Variance:
Most important measure of “dispersion”
Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Sample Variance
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
From now on, we’ll ignore sample vs. population. But remember:
We are almost always interested in the population, but can measure only a sample.
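To make the N versus n - 1 distinction concrete, here is a minimal sketch that applies both variance formulas to the same numbers (the data set from the mode example). Treating these numbers as a sample is an assumption made purely for illustration.

```python
import statistics

# Same numbers as the mode example; treated here as a sample for illustration.
values = [1, 1, 2, 4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 12, 15]

n = len(values)
mean = sum(values) / n
sum_of_squares = sum((x - mean) ** 2 for x in values)

population_variance = sum_of_squares / n        # divide by N
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1

print(f"population variance = {population_variance:.3f}")
print(f"sample variance     = {sample_variance:.3f}")

# The standard library implements the same two formulas.
print(f"statistics.pvariance = {statistics.pvariance(values):.3f}")
print(f"statistics.variance  = {statistics.variance(values):.3f}")
```

The sample variance is always a little larger than the population variance computed from the same numbers, and the gap shrinks as n grows.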
“Graphical Statistics”
Let’s look deeper into graphs now
The Friendly Histogram
• Histograms represent the distribution of data
• They allow you to visualize the mean, median,
mode, variance, and skew at once!
Constructing a Histogram is Easy
X (data): 7.4, 7.6, 8.4, 8.9, 10.0, 10.3, 11.5, 11.5, 12.0, 12.3
[Figure: Histogram of X. x-axis: Value (6 to 14); y-axis: Frequency (count).]
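Constructing a histogram amounts to counting how many values fall into each bin. The sketch below does this for the X data above; the bin edges (6 to 14 in steps of 2) are an assumption for the example and may not match the bins used in the original figure.

```python
# X data from the slide.
data = [7.4, 7.6, 8.4, 8.9, 10.0, 10.3, 11.5, 11.5, 12.0, 12.3]

# Assumed binning: 6-8, 8-10, 10-12, 12-14 (left edge included, right edge excluded).
low, high, width = 6, 14, 2
n_bins = (high - low) // width

counts = [0] * n_bins
for x in data:
    index = int((x - low) // width)  # which bin this value falls into
    counts[index] += 1

# Print a simple text histogram.
for i, count in enumerate(counts):
    left = low + i * width
    print(f"{left:>2} to {left + width:<2} | {'#' * count} ({count})")
```

Changing `width` changes the shape of the histogram, so the bin width is worth reporting alongside the graph.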
The Normal Distribution
aka “Gaussian” distribution
• Occurs frequently in nature
• Especially for measures that
are based on sums, such as:
– sample means
– body weight
– “error”
• Many statistics are based on
the assumption of normality
– You must make sure your data
are normal, or try something
else!
Sample normal data:
Histogram + theoretical distribution
(i.e. sample vs. population)
Properties of the Normal Distribution
• Symmetric
Mean = Median = Mode
• Theoretical percentiles can be computed exactly
~68% of data are within 1 standard deviation of the mean
~95% within 2 s.d.
>99% within 3 s.d.
“skinny tails”
Important!
Handy!
Amazing!
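These percentages can be recovered directly from the theoretical distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `fraction_within` is made up for this example):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k s.d. of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} s.d.: {fraction_within(k):.4f}")
```

The fractions are the same for any mean and standard deviation, because they depend only on the distance from the mean measured in standard deviations.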
What if my data aren’t Normal?
• It’s OK!
• Although lots of data are Gaussian (because of the Central Limit Theorem, CLT),
many simply aren’t.
– Example: Fire return intervals (time between fires, in years)
• Solutions:
– Transform data to make it
normal (e.g. take logs)
– Use a test that doesn’t
assume normal data
• Don’t worry, there are plenty
• Especially these days...
• Many stats work OK as long as data are “reasonably” normal
That is enough for today
Please complete the flipped notes while watching the
video before next class
IMPORTANT: Bring a device to the next class with
either Excel or Google Sheets
Inferential Statistics:
Day 2
Inference: the process by which we draw
conclusions about an unknown based on
evidence or prior experience.
In statistics: make conclusions about a
population based on samples taken from
that population.
Important: Your sample must reflect the
population you’re interested in, otherwise
your conclusions will be misleading!
Statistical Hypotheses
• Should be related to a scientific hypothesis!
• Very often presented in pairs:
– Null Hypothesis (H0):
the “boring” hypothesis of “no difference”
– Alternative Hypothesis (HA)
the interesting hypothesis of “there is an effect”
• Statistical tests attempt to (mathematically)
reject the null hypothesis
Significance
• Your sample will never match H0 perfectly,
even when H0 is in fact true
• The question is whether your sample is
different enough from the expectation under
H0 to be considered significant
• If your test finds a significant difference, then
you reject H0.
p-Values Measure Significance
The p-value of a test is the probability of observing data
at least as extreme as your sample, assuming H0 is true
• If p is very small, the data are hard to reconcile with H0
(in other words, if H0 were true, your observed sample would be unlikely)
• How small does p have to be?
– 0.05 is a common cutoff
• If p<0.05, then there is less than a 5% chance that you would observe
a sample at least as extreme as yours if the null hypothesis were true.
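A hypothetical coin-flip example (not from the original slides) shows how a p-value follows from this definition. Suppose H0 is "the coin is fair" and we observe 8 heads in 10 flips; the sketch below adds up the probabilities of all outcomes at least as extreme as that, assuming H0 is true.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_flips = 10
observed_heads = 8  # hypothetical observation, chosen for illustration

# One-tailed: outcomes with at least as many heads as observed.
p_one_tail = sum(binomial_pmf(k, n_flips) for k in range(observed_heads, n_flips + 1))

# Two-tailed: outcomes at least as far from the expected 5 heads, in either direction.
distance = abs(observed_heads - n_flips / 2)
p_two_tail = sum(
    binomial_pmf(k, n_flips)
    for k in range(n_flips + 1)
    if abs(k - n_flips / 2) >= distance
)

print(f"one-tailed p = {p_one_tail:.4f}")
print(f"two-tailed p = {p_two_tail:.4f}")
```

Both values are above 0.05, so with the usual cutoff this hypothetical result would not lead us to reject H0.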
‘Proof’ in statistics
• Failing to reject (i.e. “accepting”) H0 does not
prove that H0 is true!
• And accepting HA doesn’t prove that HA is true
either!
Why?
• Statistical inference tries to draw conclusions
about the population from a small sample
– By chance, the samples may be misleading
– Example: if you reject H0 whenever p<0.05, then about
1 in 20 times that H0 is actually true you will be wrong!
Play it Safe
Avoid using the term “prove” in your labs.
Instead say “the data support (or do not support)” the
hypothesis.
Watch out for overreaching, a classic student error:
stick to the scope of your lab data in your
conclusions. This is not your life’s work.
“Why is this Biology?”
Variation in populations and variability in results
affect confidence in conclusions.
The key methodology in Biology is hypothesis
testing through experimentation.
Carefully-designed and controlled
experiments and surveys give us quantitative
(numeric) data that can be compared.
We can use the data collected to test our
hypothesis and form explanations of the
processes involved… but only if we can be
confident in our results.
We therefore need to be able to evaluate the
reliability of a set of data and the significance
of any differences we have found in the data.
Image: 'Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex'
http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net
“Which medicine should I prescribe?”
Image from: http://www.msf.org/international-activity-report-2010-sierra-leone
Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good/
Generic drugs are out-of-patent, and are
much cheaper than the proprietary
(brand-name) equivalents. Doctors need to
balance needs with available resources.
Which would you choose?
Means (averages) in Biology are almost
never good enough. Biological systems
(and our results) show variability.
Which would you choose now?
Hummingbirds are nectarivores (herbivores
that feed on the nectar of some species of
flower).
In return for food, they pollinate the flower.
This is an example of mutualism –
benefit for all.
As a result of natural selection,
hummingbird bills have evolved.
Birds with a bill best suited to
their preferred food source have
the greater chance of survival.
Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels.
Researchers studying comparative anatomy collect
data on bill-length in two species of hummingbirds:
Archilochus colubris
(ruby-throated hummingbird) and
Cynanthus latirostris (broad-billed hummingbird).
To do this, they need to collect sufficient
relevant, reliable data so they can test
the Null hypothesis (H0) that:
“there is no significant difference
in bill length between the two species.”
Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid
The sample size must
be large enough to provide
sufficient reliable data and for us
to carry out relevant statistical
tests for significance.
We must also be mindful of
uncertainty in our measuring tools
and error in our results.
Photo: Broadbilled hummingbird (wikimedia commons).
The mean is a measure of the central tendency
of a set of data.
Table 1: Raw measurements of bill length in
A. colubris and C. latirostris.
Bill length (±0.1 mm)
n      A. colubris    C. latirostris
1      13.0           17.0
2      14.0           18.0
3      15.0           18.0
4      15.0           18.0
5      15.0           19.0
6      16.0           19.0
7      16.0           19.0
8      18.0           20.0
9      18.0           20.0
10     19.0           20.0
Mean
s
Calculate the mean using:
• Your calculator
(sum of values / n)
• Excel
=AVERAGE(highlight raw data)
n = sample size. The bigger the better.
In this case n=10 for each group.
All values should be centred in the cell, with
decimal places consistent with the measuring
tool uncertainty.
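For anyone working outside Excel, the same calculation (sum of values / n) can be sketched in Python with the Table 1 data. The variable and function names are chosen for this example only.

```python
# Bill lengths in mm (±0.1 mm), from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def mean(values):
    """Sum of the values divided by the sample size n."""
    return sum(values) / len(values)

# Report to one decimal place, consistent with the ±0.1 mm uncertainty.
print(f"A. colubris:    n = {len(a_colubris)}, mean = {mean(a_colubris):.1f} mm")
print(f"C. latirostris: n = {len(c_latirostris)}, mean = {mean(c_latirostris):.1f} mm")
```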
Table 1 (as above), with the mean row filled in:
Mean: A. colubris 15.9, C. latirostris 18.8
Raw data and the mean need to have
consistent decimal places (in line with
uncertainty of the measuring tool)
Uncertainties must be included.
Descriptive table title and number.
[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0. Points labeled A. colubris, 15.9 mm and C. latirostris, 18.8 mm.]
Descriptive title, with graph number.
Labeled points.
Y-axis clearly labeled, with uncertainty.
X-axis labeled.
From the means alone
you might conclude
that C. latirostris has a
longer bill than A.
colubris.
But the mean only tells
part of the story.
Standard deviation is a measure of the spread of
most of the data.
Table 1 (as above), with the mean and standard deviation rows filled in:
Mean: A. colubris 15.9, C. latirostris 18.8
s: A. colubris 1.91, C. latirostris 1.03
Standard deviation can have one more decimal place.
Excel: =STDEV(highlight RAW data)
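The standard deviations can be checked outside Excel as well. This sketch assumes the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes; the variable names are chosen for the example.

```python
import statistics

# Bill lengths in mm (±0.1 mm), from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation: square root of the sample variance (n - 1).
sd_a = statistics.stdev(a_colubris)
sd_c = statistics.stdev(c_latirostris)

# One more decimal place than the raw data, as suggested on the slide.
print(f"A. colubris:    s = {sd_a:.2f} mm")
print(f"C. latirostris: s = {sd_c:.2f} mm")
```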
Which of the two sets of data has:
a. The longest mean bill length?
b. The greatest variability in the data?
Answers:
a. C. latirostris
b. A. colubris
Standard deviation is a measure of the spread of
most of the data. Error bars are a graphical
representation of the variability of data.
Which of the two sets of data (A or B) has:
a. The highest mean?
b. The greatest variability in the data?
Error bars could represent standard deviation, range or confidence intervals.
Put the error bars for standard deviation on our graph.
Delete the horizontal error bars
[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0. Points labeled A. colubris, 15.9 mm and C. latirostris, 18.8 mm.]
Title is adjusted to
show the source of the
error bars. This is very
important.
You can see the clear
difference in the size of
the error bars.
Variability has been
visualised.
The error bars overlap
somewhat.
What does this mean?
The overlap of a set of error bars gives a clue as to the
significance of the difference between two sets of data.
Large overlap:
Lots of shared data points within each data set.
Results are not likely to be significantly different from each other.
Any difference is most likely due to chance.
No overlap:
No (or very few) shared data points within each data set.
Results are more likely to be significantly different from each other.
The difference is more likely to be ‘real’.
[Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris (error bars = standard deviation). x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), -3.0 to 22.0. Points labeled A. colubris, 15.9 mm (n=10) and C. latirostris, 18.8 mm (n=10).]
Our results show a very small overlap
between the two sets of data.
So how do we know if the difference is
significant or not?
We need to use a statistical test.
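The size of the overlap can also be checked numerically rather than by eye. The sketch below computes the mean ± 1 standard deviation interval for each species from the Table 1 data and measures how far the two intervals overlap; it is an illustration of the idea, not a significance test.

```python
import statistics

# Bill lengths in mm (±0.1 mm), from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

def error_bar(values):
    """Return (mean - s, mean + s), the span of a standard-deviation error bar."""
    centre = statistics.mean(values)
    spread = statistics.stdev(values)
    return centre - spread, centre + spread

low_a, high_a = error_bar(a_colubris)
low_c, high_c = error_bar(c_latirostris)

# A positive value means the two error bars overlap by that many mm.
overlap = min(high_a, high_c) - max(low_a, low_c)

print(f"A. colubris:    {low_a:.2f} to {high_a:.2f} mm")
print(f"C. latirostris: {low_c:.2f} to {high_c:.2f} mm")
print(f"overlap: {overlap:.2f} mm")
```

The overlap is only a few hundredths of a millimetre, which is why the error bars alone cannot settle whether the difference is significant.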
The t-test is a statistical
test that helps us determine
the significance of the
difference between the
means of two sets of data.
The Null Hypothesis (H0):
“There is no significant
difference.”
This is the ‘default’ hypothesis that we always test.
In our conclusion, we either accept the null hypothesis or reject it.
A t-test can be used to test whether the difference between two means is significant.
• If we accept H0, then the means are not significantly different.
• If we reject H0, then the means are significantly different.
Remember:
• We are never ‘trying’ to get a difference. We design carefully-controlled experiments and
then analyse the results using statistical analysis.
Excel can jump straight to a value of P for our results.
One function (=TTEST) compares both sets of data.
As it calculates P directly (the
probability of seeing a difference at least
this large by chance alone, if H0 were true),
we can determine significance directly.
In this case, P=0.00026
This is much smaller than 0.05, so
we are confident that we can:
reject H0.
The difference is unlikely to be due to
chance.
Conclusion:
There is a significant difference in bill
length between A. colubris and C.
latirostris.
Two tails: we assume the data are normally distributed and look for a difference in either direction, using both ‘tails’ of the distribution away from the mean.
Type 2 (unpaired): we are comparing one whole sample with the other whole sample (Excel’s type 2 also assumes the two samples have equal variance).
(Type 1 pairs the results of each individual in set A with the same individual in set B).
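The t value and critical value route named in the unit objectives can be sketched by hand for the same data. The code below assumes an unpaired, equal-variance (pooled) t-test, matching Excel's type 2, and takes the critical value 2.101 from a standard t table for 18 degrees of freedom at the 0.05 level (two-tailed). The variable names are chosen for this example.

```python
import math
import statistics

# Bill lengths in mm (±0.1 mm), from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

n_a, n_c = len(a_colubris), len(c_latirostris)
mean_a, mean_c = statistics.mean(a_colubris), statistics.mean(c_latirostris)
var_a, var_c = statistics.variance(a_colubris), statistics.variance(c_latirostris)

# Pooled variance: the two sample variances weighted by their degrees of freedom.
degrees_of_freedom = n_a + n_c - 2
pooled_variance = ((n_a - 1) * var_a + (n_c - 1) * var_c) / degrees_of_freedom

# t = difference between the means / standard error of that difference.
standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_c))
t_value = (mean_a - mean_c) / standard_error

# Critical value from a t table: df = 18, p = 0.05, two-tailed.
critical_value = 2.101

print(f"t = {t_value:.2f}, df = {degrees_of_freedom}")
if abs(t_value) > critical_value:
    print("|t| exceeds the critical value: reject H0")
else:
    print("|t| does not exceed the critical value: fail to reject H0")
```

The sign of t only reflects which sample was listed first; it is the size of |t| that is compared with the table.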
Your R^2 value is the square of the
correlation coefficient (r).
In Excel, make a scatter plot of
your data.
Next, add a trend line and check the
boxes for displaying the equation as
well as the R-squared value. The
closer that value is to 1.0, the
stronger the correlation.
Table 2: Correlation between bill length and body weight in A. colubris
Bill length (mm) (±0.1 mm):  13.0  14.0  15.0  15.0  15.0  16.0  16.0  18.0  18.0  19.0
Weight (g) (±0.05 g):        2.7   2.8   2.8   2.9   2.9   2.9   3.0   3.1   3.4   3.6
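As a cross-check on the value a spreadsheet trend line reports, the correlation can be computed directly from the Table 2 data. This sketch uses the Pearson correlation coefficient r; for a straight-line trend line, the R-squared value is r squared. The function name `pearson_r` is made up for this example.

```python
import math

# Table 2 data for A. colubris.
bill_length_mm = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
weight_g = [2.7, 2.8, 2.8, 2.9, 2.9, 2.9, 3.0, 3.1, 3.4, 3.6]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

r = pearson_r(bill_length_mm, weight_g)
r_squared = r ** 2

print(f"r   = {r:.3f}")
print(f"R^2 = {r_squared:.3f}")
```

A value this close to 1 indicates a strong positive correlation in this sample, but on its own it says nothing about cause.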
http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity
Interpreting Graphs: See – Think – Wonder
See: What is factual about the graph?
• What are the axes?
• What is being plotted?
• What values are present?
Think: How is the graph interpreted?
• What relationship is present?
• Is cause implied?
• What explanations are possible and
what explanations are not possible?
Wonder: Questions about the graph.
• What do you need to
know more about?
See – Think - Wonder
Visible Thinking Routine
http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity
Diabetes and obesity are ‘risk factors’ of each other.
There is a strong correlation between them,
but does this mean one causes the other?
Correlation does not imply causality.
Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming
Cartoon from: http://www.xkcd.com/552/
Correlation does not imply causation, but it does waggle its eyebrows
suggestively and gesture furtively while mouthing "look over there."
Check out these funny “Correlations”
Where correlations exist, we must then design solid scientific experiments to determine the
cause of the relationship. Sometimes a correlation exists because of confounding variables:
conditions that the correlated variables have in common, even though the correlated variables
do not directly affect each other.
To be able to determine causality through experimentation we need:
• One clearly identified independent variable
• Carefully measured dependent variable(s) that can be attributed to change in the
independent variable
• Strict control of all other variables that might have a measurable impact on the
dependent variable.
We need: sufficient relevant, repeatable and statistically significant data.
Some known causal relationships:
• Atmospheric CO2 concentrations and global warming
• Atmospheric CO2 concentrations and the rate of photosynthesis
• Temperature and enzyme activity

Digital Artifact 1 - 10VCD Environments Unit
chanes7
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Excellence Foundation for South Sudan
 
How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17
Celine George
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Dr. Vinod Kumar Kanvaria
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Life upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for studentLife upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for student
NgcHiNguyn25
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 

Recently uploaded (20)

Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
Top five deadliest dog breeds in America
Top five deadliest dog breeds in AmericaTop five deadliest dog breeds in America
Top five deadliest dog breeds in America
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.Types of Herbal Cosmetics its standardization.
Types of Herbal Cosmetics its standardization.
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
How to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold MethodHow to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold Method
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
 
How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Life upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for studentLife upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for student
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 

Statistics for IB Biology

  • 2. Objectives of this Unit:
      • Types of Data, Types of Graphs, Applications and Statistics to match your data
          o Bar Graphs, Line Graph, Scatter Plot, Histogram, Pie Chart
          o Mean, S.D., Regression, Chi Square Analysis
      • State that error bars are a graphical representation of the variability of data
          o Range and standard deviation show the variability/spread in the data
      • Calculate the mean and standard deviation of a set of values
      • Using Excel formulas
          o Given a mean and S.D., state the range for different parameters
      • State that the term standard deviation is used to summarize the spread of values around the mean
          o 68% of all data lie within ±1 standard deviation of the mean, 95% within 2 S.D.
      • Explain how S.D. is useful for comparing the means and spread of data between two or more samples
          o Greater S.D. shows greater variability of data
          o This can be used to infer reliability in methods or results, BUT in Biology we also expect variability
      • Deduce the significance of the difference between two sets of data using calculated values for t and tables
          o Using t value and t table and critical values
          o Directly calculating P values using Excel in lab reports
          o Difference between P and t
      • Explain that correlation does not establish that there is a causal relationship between two variables
      • Proper Lab Format
      • Designing Lab Process
  • 3. What are statistics? • Statistics are numbers used to: Describe and draw conclusions about DATA • These are called descriptive (or “univariate”) and inferential (or “analytical”) statistics, respectively.
  • 4. Variables • A variable is anything we can measure/observe • Three types: – Continuous: values span an uninterrupted range (e.g. height) – Discrete: only certain fixed values are possible (e.g. counts) – Categorical: values are qualitatively assigned (e.g. low/med/hi) • Dependence in variables: “Dependent variables depend on independent ones” – Independent variable – variable you are changing – Dependent variable – variable you measure to see result – Controlled variables – other variables that could also affect the dependent variable, which you identify and keep constant *** Experimental Control – NOT the same as controlled variables
  • 5. Descriptive statistics: techniques to summarize data
      Numerical: – Mean – Variance (• Standard deviation • Standard error) – Median – Mode – Skew – etc.
      Graphical: – Histogram – Boxplot – Scatterplot – etc.
  • 7. What graph to use? [Blank comparison table to fill in. Columns: Line, Scatter, Histogram, Bar. Rows: Appropriate for data when; Important features include; Sample and other notes.] Outlier: An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal. Before abnormal observations can be singled out, it is necessary to characterize normal observations.
  • 9. The Mean: Most important measure of “central tendency”. Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
  • 10. The Mean: Most important measure of “central tendency”. Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
  • 11. Additional central tendency measures. Median: the 50th percentile. $M = X_{(n+1)/2}$ (n is odd); $M = \frac{X_{n/2} + X_{(n/2)+1}}{2}$ (n is even). Mode: the most common value. Example data: 1, 1, 2, 4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 12, 15. Which to use: mean, median or mode?
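As a quick check on the example data above, here is a minimal Python sketch using the standard `statistics` module. The variable names are chosen here for illustration and are not from the slides.

```python
import statistics

# The 18 example values listed on the slide (already sorted).
values = [1, 1, 2, 4, 6, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 12, 15]

mean_value = statistics.mean(values)      # sum of values / n
median_value = statistics.median(values)  # n is even, so the average of the two middle values
mode_value = statistics.mode(values)      # the most common value

print(f"mean   = {mean_value:.2f}")
print(f"median = {median_value}")
print(f"mode   = {mode_value}")
```

For these values the median and mode coincide while the mean is pulled slightly away from them, which is one way to start thinking about which measure to report.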
  • 12. Variance: Most important measure of “dispersion”. Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
  • 13. Variance: Most important measure of “dispersion”. Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$ From now on, we’ll ignore sample vs. population. But remember: We are almost always interested in the population, but can measure only a sample.
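The two variance formulas differ only in the divisor (N versus n - 1). A small sketch, assuming we reuse the ten X values from the histogram slide as sample data, computes both by hand and compares them with Python's `statistics` module:

```python
import statistics

# Ten X values taken from the histogram slide; reused here only as sample data.
x = [7.4, 7.6, 8.4, 8.9, 10.0, 10.3, 11.5, 11.5, 12.0, 12.3]

n = len(x)
x_bar = sum(x) / n
sum_of_squares = sum((xi - x_bar) ** 2 for xi in x)

population_variance = sum_of_squares / n    # divide by N
sample_variance = sum_of_squares / (n - 1)  # divide by n - 1

# The library functions should agree with the hand calculation.
print(population_variance, statistics.pvariance(x))
print(sample_variance, statistics.variance(x))
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.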
  • 14. “Graphical Statistics” Lets look deeper into graphs now
  • 15. The Friendly Histogram • Histograms represent the distribution of data • They allow you to visualize the mean, median, mode, variance, and skew at once!
  • 16. Constructing a Histogram is Easy. X (data): 7.4, 7.6, 8.4, 8.9, 10.0, 10.3, 11.5, 11.5, 12.0, 12.3. [Figure: Histogram of X. x-axis: Value (6 to 14); y-axis: Frequency (count).]
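Building a histogram amounts to counting how many values fall into each bin. The sketch below does this in plain Python for the X data above. The bin edges (width 2, from 6 to 14) are an assumption made for illustration and may not match the bins used in the slide's figure.

```python
x = [7.4, 7.6, 8.4, 8.9, 10.0, 10.3, 11.5, 11.5, 12.0, 12.3]

# Assumed bin edges: 6-8, 8-10, 10-12, 12-14 (not taken from the slide).
bin_edges = [6, 8, 10, 12, 14]
counts = [0] * (len(bin_edges) - 1)

for value in x:
    for i in range(len(counts)):
        lower, upper = bin_edges[i], bin_edges[i + 1]
        is_last_bin = i == len(counts) - 1
        # Each bin includes its lower edge; only the last bin also includes its upper edge.
        if lower <= value < upper or (is_last_bin and value == upper):
            counts[i] += 1
            break

# Text histogram: one '#' per observation in the bin.
for i, count in enumerate(counts):
    print(f"{bin_edges[i]:>2}-{bin_edges[i + 1]:<2} {'#' * count} ({count})")
```

Changing the bin width changes the shape of the histogram, so it is worth stating the bins you used.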
  • 17. The Normal Distribution aka “Gaussian” distribution • Occurs frequently in nature • Especially for measures that are based on sums, such as: – sample means – body weight – “error” • Many statistics are based on the assumption of normality – You must make sure your data are normal, or try something else! Sample normal data: Histogram + theoretical distribution (i.e. sample vs. population)
  • 18. Properties of the Normal Distribution • Symmetric Mean = Median = Mode • Theoretical percentiles can be computed exactly ~68% of data are within 1 standard deviation of the mean >99% within 3 s.d. “skinny tails”
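The percentages quoted for the normal distribution can be reproduced from its theoretical form. A minimal sketch, using the standard identity that the fraction within k standard deviations of the mean equals erf(k / √2); the function name is hypothetical:

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} s.d.: {fraction_within(k):.4f}")
```

These are theoretical values for a perfectly normal population; a real sample will only approximate them.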
  • 24. What if my data aren’t Normal? • It’s OK! • Although lots of data are Gaussian (because of the CLT), many simply aren’t. – Example: Fire return intervals Time between fires (yr) • Solutions: – Transform data to make it normal (e.g. take logs) – Use a test that doesn’t assume normal data • Don’t worry, there are plenty • Especially these days... • Many stats work OK as long as data are “reasonably” normal
  • 25. That is enough for today. Please complete the flipped notes while watching the video before next class. IMPORTANT: Bring a device next class with either Excel or Google Sheets.
  • 27. Inference: the process by which we draw conclusions about an unknown based on evidence or prior experience. In statistics: make conclusions about a population based on samples taken from that population. Important: Your sample must reflect the population you’re interested in, otherwise your conclusions will be misleading!
  • 28. Statistical Hypotheses • Should be related to a scientific hypothesis! • Very often presented in pairs: – Null Hypothesis (H0): the “boring” hypothesis of “no difference” – Alternative Hypothesis (HA) the interesting hypothesis of “there is an effect” • Statistical tests attempt to (mathematically) reject the null hypothesis
  • 29. Significance • Your sample will never match H0 perfectly, even when H0 is in fact true • The question is whether your sample is different enough from the expectation under H0 to be considered significant • If your test finds a significant difference, then you reject H0.
  • 30. p-Values Measure Significance The p-value of a test is the probability of observing data at least as extreme as your sample, assuming H0 is true • If p is very small, your data are hard to reconcile with H0 (in other words, if H0 were true, your observed sample would be unlikely) • How small does p have to be? – 0.05 is a common cutoff • If p<0.05, then there is less than a 5% chance that you would observe a sample at least as extreme as yours if the null hypothesis were true.
  • 31. ‘Proof’ in statistics • Failing to reject (i.e. “accepting”) H0 does not prove that H0 is true! • And accepting HA doesn’t prove that HA is true either! Why? • Statistical inference tries to draw conclusions about the population from a small sample – By chance, the samples may be misleading – Example: if you reject H0 whenever p<0.05, then in cases where H0 is actually true you will wrongly reject it about 1 time in 20!
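The "1 in 20" idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the test used later in these slides: both samples are drawn from the same normal population (so H0 is true by construction), the population standard deviation is treated as known, and a simple z-test is used instead of a t-test so that only the Python standard library is needed. All names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean

random.seed(1)      # fixed seed so the run is repeatable

SIGMA = 2.0         # assumed known population standard deviation
N = 10              # sample size per group
TRIALS = 2000       # number of simulated experiments
ALPHA = 0.05        # significance cutoff

standard_normal = NormalDist()
false_rejections = 0

for _ in range(TRIALS):
    # Both groups come from the SAME population, so H0 is true.
    group_a = [random.gauss(15.0, SIGMA) for _ in range(N)]
    group_b = [random.gauss(15.0, SIGMA) for _ in range(N)]

    standard_error = SIGMA * (2 / N) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    p = 2 * (1 - standard_normal.cdf(abs(z)))  # two-tailed p-value

    if p < ALPHA:
        false_rejections += 1

false_positive_rate = false_rejections / TRIALS
print(f"H0 wrongly rejected in {false_positive_rate:.1%} of experiments")
```

The rate should land near 5%, which is what the 0.05 cutoff implies when there is truly no difference.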
  • 32. Play it Safe. Avoid using the term “prove” in your labs. Instead say “the data support” (or “do not support”) the hypothesis. Watch out for overreaching, a classic student error: stick to the scope of your lab data in your conclusions. This is not your life’s work.
  • 33. “Why is this Biology?” Variation in populations leads to variability in results, which affects confidence in conclusions. The key methodology in Biology is hypothesis testing through experimentation. Carefully-designed and controlled experiments and surveys give us quantitative (numeric) data that can be compared. We can use the data collected to test our hypothesis and form explanations of the processes involved… but only if we can be confident in our results. We therefore need to be able to evaluate the reliability of a set of data and the significance of any differences we have found in the data. Image: 'Transverse section of part of a stem of a Dead-nettle (Lamium sp.) showing a vascular bundle and part of the cortex' http://www.flickr.com/photos/71183136@N08/6959590092 Found on flickrcc.net
  • 34. “Which medicine should I prescribe?” Image from: http://www.msf.org/international-activity-report-2010-sierra-leone Donate to Médecins Sans Frontières through Biology4Good: http://i-biology.net/about/biology4good/
  • 35. “Which medicine should I prescribe?” Generic drugs are out-of-patent, and are much cheaper than the proprietary (brand-name) equivalents. Doctors need to balance needs with available resources. Which would you choose?
  • 36. “Which medicine should I prescribe?” Means (averages) in Biology are almost never good enough. Biological systems (and our results) show variability. Which would you choose now?
  • 38. Hummingbirds are nectarivores (herbivores that feed on the nectar of some species of flower). In return for food, they pollinate the flower. This is an example of mutualism – benefit for all. As a result of natural selection, hummingbird bills have evolved. Birds with a bill best suited to their preferred food source have the greater chance of survival. Photo: Archilochus colubris, from wikimedia commons, by Dick Daniels.
  • 39. Researchers studying comparative anatomy collect data on bill length in two species of hummingbirds: Archilochus colubris (ruby-throated hummingbird) and Cynanthus latirostris (broadbilled hummingbird). To do this, they need to collect sufficient relevant, reliable data so they can test the null hypothesis (H0) that: “there is no significant difference in bill length between the two species.” Photo: Archilochus colubris (male), wikimedia commons, by Joe Schneid
  • 40. The sample size must be large enough to provide sufficient reliable data and for us to carry out relevant statistical tests for significance. We must also be mindful of uncertainty in our measuring tools and error in our results. Photo: Broadbilled hummingbird (wikimedia commons).
  • 41. The mean is a measure of the central tendency of a set of data.
      Table 1: Raw measurements of bill length in A. colubris and C. latirostris.
      Bill length (±0.1 mm)
      n      A. colubris   C. latirostris
      1      13.0          17.0
      2      14.0          18.0
      3      15.0          18.0
      4      15.0          18.0
      5      15.0          19.0
      6      16.0          19.0
      7      16.0          19.0
      8      18.0          20.0
      9      18.0          20.0
      10     19.0          20.0
      Mean
      s
      Calculate the mean using: • Your calculator (sum of values / n) • Excel =AVERAGE(highlight raw data)
      n = sample size. The bigger the better. In this case n=10 for each group. All values should be centred in the cell, with decimal places consistent with the measuring tool uncertainty.
  • 42. The mean is a measure of the central tendency of a set of data.
      Table 1: Raw measurements of bill length in A. colubris and C. latirostris.
      Bill length (±0.1 mm)
      n      A. colubris   C. latirostris
      1      13.0          17.0
      2      14.0          18.0
      3      15.0          18.0
      4      15.0          18.0
      5      15.0          19.0
      6      16.0          19.0
      7      16.0          19.0
      8      18.0          20.0
      9      18.0          20.0
      10     19.0          20.0
      Mean   15.9          18.8
      s
      Raw data and the mean need to have consistent decimal places (in line with uncertainty of the measuring tool). Uncertainties must be included. Descriptive table title and number.
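The slides use a calculator or Excel's =AVERAGE for this step. For readers working outside a spreadsheet, a minimal Python sketch of the same calculation on the Table 1 data follows; the variable names are chosen here for illustration.

```python
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

mean_a = statistics.mean(a_colubris)        # sum of values / n
mean_c = statistics.mean(c_latirostris)

# Report to one decimal place, in line with the measuring tool uncertainty.
print(f"A. colubris mean:    {mean_a:.1f} mm (n={len(a_colubris)})")
print(f"C. latirostris mean: {mean_c:.1f} mm (n={len(c_latirostris)})")
```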
  • 47. Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. [Figure: x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0; labeled points: A. colubris, 15.9 mm and C. latirostris, 18.8 mm.] Annotations: Descriptive title, with graph number. Labeled point. Y-axis clearly labeled, with uncertainty. X-axis labeled.
  • 48. Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. [Figure: x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0; labeled points: A. colubris, 15.9 mm and C. latirostris, 18.8 mm.] From the means alone you might conclude that C. latirostris has a longer bill than A. colubris. But the mean only tells part of the story.
  • 51. Standard deviation is a measure of the spread of most of the data.
      Table 1: Raw measurements of bill length in A. colubris and C. latirostris.
      Bill length (±0.1 mm)
      n      A. colubris   C. latirostris
      1      13.0          17.0
      2      14.0          18.0
      3      15.0          18.0
      4      15.0          18.0
      5      15.0          19.0
      6      16.0          19.0
      7      16.0          19.0
      8      18.0          20.0
      9      18.0          20.0
      10     19.0          20.0
      Mean   15.9          18.8
      s      1.91          1.03
      Standard deviation can have one more decimal place. Excel: =STDEV(highlight RAW data).
      Which of the two sets of data has: a. The longest mean bill length? b. The greatest variability in the data?
  • 52. Standard deviation is a measure of the spread of most of the data. Answers, using the values in Table 1: a. The longest mean bill length: C. latirostris (mean 18.8 mm). b. The greatest variability in the data: A. colubris (s = 1.91).
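Excel's =STDEV computes the sample standard deviation (dividing by n - 1). A minimal Python sketch of the same calculation, assuming the Table 1 data and using `statistics.stdev`, which applies the same n - 1 formula:

```python
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

# Sample standard deviation (n - 1 in the denominator).
sd_a = statistics.stdev(a_colubris)
sd_c = statistics.stdev(c_latirostris)

print(f"A. colubris s    = {sd_a:.2f}")
print(f"C. latirostris s = {sd_c:.2f}")
```

The larger value for A. colubris reflects the wider spread of its raw measurements (13.0 to 19.0 mm versus 17.0 to 20.0 mm).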
  • 53. Standard deviation is a measure of the spread of most of the data. Error bars are a graphical representation of the variability of data. Which of the two sets of data (A or B) has: a. The highest mean? b. The greatest variability in the data? Error bars could represent standard deviation, range or confidence intervals.
  • 54. Put the error bars for standard deviation on our graph.
  • 56. Put the error bars for standard deviation on our graph, then delete the horizontal error bars.
  • 57. Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. (error bars = standard deviation) [Figure: x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm), 0.0 to 20.0; labeled points: A. colubris, 15.9 mm and C. latirostris, 18.8 mm.] Title is adjusted to show the source of the error bars. This is very important. You can see the clear difference in the size of the error bars. Variability has been visualised. The error bars overlap somewhat. What does this mean?
  • 58. The overlap of a set of error bars gives a clue as to the significance of the difference between two sets of data.
      Large overlap: Lots of shared data points within each data set. Results are not likely to be significantly different from each other. Any difference is most likely due to chance.
      No overlap: No (or very few) shared data points within each data set. Results are more likely to be significantly different from each other. The difference is more likely to be ‘real’.
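To see how much the hummingbird error bars overlap, one can compare the mean ± 1 standard deviation intervals directly. This is a rough sketch using the rounded values from Table 1; the variable names are hypothetical, and overlap is only a visual clue, not a significance test.

```python
# Mean and standard deviation from Table 1 (mm).
mean_a, sd_a = 15.9, 1.91   # A. colubris
mean_c, sd_c = 18.8, 1.03   # C. latirostris

# Error bars drawn as mean ± 1 standard deviation.
a_low, a_high = mean_a - sd_a, mean_a + sd_a
c_low, c_high = mean_c - sd_c, mean_c + sd_c

# Length of the shared region; 0 if the bars do not overlap at all.
overlap = max(0.0, min(a_high, c_high) - max(a_low, c_low))

print(f"A. colubris:    {a_low:.2f} to {a_high:.2f} mm")
print(f"C. latirostris: {c_low:.2f} to {c_high:.2f} mm")
print(f"overlap:        {overlap:.2f} mm")
```

The bars share only a sliver of the axis, so a formal test is needed to decide whether the difference is significant.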
  • 62. Graph 1: Comparing mean bill lengths in two hummingbird species, A. colubris and C. latirostris. (error bars = standard deviation) [Figure: x-axis: Species of hummingbird; y-axis: Mean bill length (±0.1 mm); labeled points: A. colubris, 15.9 mm (n=10) and C. latirostris, 18.8 mm (n=10).] Our results show a very small overlap between the two sets of data. So how do we know if the difference is significant or not? We need to use a statistical test. The t-test is a statistical test that helps us determine the significance of the difference between the means of two sets of data.
  • 64. The Null Hypothesis (H0): “There is no significant difference.” This is the ‘default’ hypothesis that we always test. In our conclusion, we either accept the null hypothesis or reject it. A t-test can be used to test whether the difference between two means is significant. • If we accept H0, then the means are not significantly different. • If we reject H0, then the means are significantly different. Remember: • We are never ‘trying’ to get a difference. We design carefully-controlled experiments and then analyse the results using statistical analysis.
  • 65. Excel can jump straight to a value of P for our results. One function (=TTEST) compares both sets of data. As it calculates P directly (the probability of seeing a difference at least this large if H0 were true), we can determine significance directly. In this case, P=0.00026. This is much smaller than the 0.05 cutoff, so we are confident that we can reject H0. The difference is unlikely to be due to chance. Conclusion: There is a significant difference in bill length between A. colubris and C. latirostris.
  • 67. Two tails: we assume data are normally distributed, with two ‘tails’ moving away from mean. Type 2 (unpaired): we are comparing one whole population with the other whole population. (Type 1 pairs the results of each individual in set A with the same individual in set B).
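The unit objectives also mention using a calculated t value with a t table. As a sketch of that route, the code below computes the t statistic for an unpaired, equal-variance test (the "Type 2" case described above) on the Table 1 data and compares it with a critical value. The critical value 2.101 (two-tailed, 0.05 level, 18 degrees of freedom) is taken from a standard t table and is not stated in the slides. Variable names are chosen for illustration.

```python
import statistics

# Bill lengths in mm (±0.1 mm), copied from Table 1.
a_colubris = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
c_latirostris = [17.0, 18.0, 18.0, 18.0, 19.0, 19.0, 19.0, 20.0, 20.0, 20.0]

n_a, n_c = len(a_colubris), len(c_latirostris)
degrees_of_freedom = n_a + n_c - 2

# Pooled sample variance: assumes both groups share the same variance.
pooled_variance = (
    (n_a - 1) * statistics.variance(a_colubris)
    + (n_c - 1) * statistics.variance(c_latirostris)
) / degrees_of_freedom

standard_error = (pooled_variance * (1 / n_a + 1 / n_c)) ** 0.5
t_value = (statistics.mean(a_colubris) - statistics.mean(c_latirostris)) / standard_error

# Two-tailed critical value for p = 0.05 and 18 degrees of freedom (standard t table).
T_CRITICAL = 2.101

reject_h0 = abs(t_value) > T_CRITICAL
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```

Because |t| is larger than the critical value, this route leads to the same decision as the Excel P value: reject H0.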
  • 73. In Excel, the strength of a correlation is read from the trend line’s R-squared (R²) value, which for a straight-line fit is the square of the correlation coefficient r. Make a scatter plot of your data, then add a trend line and check the boxes for displaying the equation as well as the R-squared value. The closer to 1.0 that value is, the stronger the correlation.
      Table 2: Correlation between bill length and body weight in A. colubris
      bill length (mm) (±0.1 mm)   weight (g) (±0.05 g)
      13.0                         2.7
      14.0                         2.8
      15.0                         2.8
      15.0                         2.9
      15.0                         2.9
      16.0                         2.9
      16.0                         3.0
      18.0                         3.1
      18.0                         3.4
      19.0                         3.6
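For readers without Excel, the same quantities can be computed directly. The sketch below calculates Pearson's correlation coefficient r and its square for the Table 2 data in plain Python. The function name is hypothetical, and for a straight-line trend line r² is the R-squared value that Excel displays.

```python
# Table 2 data for A. colubris.
bill_length_mm = [13.0, 14.0, 15.0, 15.0, 15.0, 16.0, 16.0, 18.0, 18.0, 19.0]
weight_g = [2.7, 2.8, 2.8, 2.9, 2.9, 2.9, 3.0, 3.1, 3.4, 3.6]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / (sum_xx * sum_yy) ** 0.5

r = pearson_r(bill_length_mm, weight_g)
r_squared = r ** 2

print(f"r   = {r:.3f}")
print(f"R^2 = {r_squared:.3f}")
```

A value of r close to +1 indicates a strong positive correlation, but on its own it says nothing about what causes the relationship.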
  • 74. http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity Interpreting Graphs: See – Think – Wonder (Visible Thinking Routine) See: What is factual about the graph? • What are the axes? • What is being plotted? • What values are present? Think: How is the graph interpreted? • What relationship is present? • Is cause implied? • What explanations are possible and what explanations are not possible? Wonder: Questions about the graph. • What do you need to know more about?
  • 75. http://diabetes-obesity.findthedata.org/b/240/Correlations-between-diabetes-obesity-and-physical-activity Diabetes and obesity are ‘risk factors’ of each other. There is a strong correlation between them, but does this mean one causes the other?
  • 76. Correlation does not imply causality. Pirates vs global warming, from http://en.wikipedia.org/wiki/Flying_Spaghetti_Monster#Pirates_and_global_warming
  • 77. Cartoon from: http://www.xkcd.com/552/ Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing "look over there." Check out these funny “Correlations”
  • 78. Correlation does not imply causality. Where correlations exist, we must then design solid scientific experiments to determine the cause of the relationship. Sometimes a correlation exists because of confounding variables: factors that influence both of the correlated variables, even though those variables do not directly affect each other. To be able to determine causality through experimentation we need: • One clearly identified independent variable • Carefully measured dependent variable(s) that can be attributed to change in the independent variable • Strict control of all other variables that might have a measurable impact on the dependent variable. We need: sufficient relevant, repeatable and statistically significant data. Some known causal relationships: • Atmospheric CO2 concentrations and global warming • Atmospheric CO2 concentrations and the rate of photosynthesis • Temperature and enzyme activity