SlideShare a Scribd company logo
Introductory statistics for
medical research
Julia Saperia
European Medicines Agency
Barcelona, 18 June 2012
www.eurordis.org
2
What does statistics mean to you?
mean
information
percentage
variability
data
median
average
graphs
measurement
probability
confidence
evidence
spread
3
An example
• The idea of (clinical) research is to provide answers to questions
• Let’s ask a question…
 How tall are Eurordis summer school 2012
participants?
4
An example
• Thinking about one (silly) question tells us some important
things about data.
 people are different variability in data
 the reliability of an average depends on the
population being studied and the sample drawn
from that population  valid inference
5
A sillier example
• Let’s say I find out 5 heights (in cm)
165, 166, 167, 168, 169
mean = 167.0 cm
standard deviation = 1.6 cm
• Let’s say I find out 5 different heights (in cm)
154, 165, 168, 172, 183
mean = 168.4 cm
standard deviation = 10.5 cm
• So the means are the quite similar…but the two groups look
different
6
A sillier example
• what we learn from the second example is that we get more
information from data if we know how variable they are
 with these pieces of information, we can draw a picture of
the data (provided some assumptions are true)
7
The histogram
•Taken from http://www.mathsisfun.com/data/histograms.html on 28 May 2012
8
Johann Carl Friedrich Gauss 1777-1855
68,3%
~95,5%
~99%,
mean mean + sd mean + 2sdmean - sdmean - 2sd
Picturing “normal” data
9
Biostatistics
… is the method of gaining empirical knowledge in the life sciences
… is the method for quantifying variation and uncertainty
1. How do I gain adequate data?
Thorough planning of studies
2. What can I do with my data?
Descriptive and inferential statistics
If (1) is not satisfactorily resolved, (2) will never be adequate.
10
draw a
sample
Population
sample
sample data
Inferential statistics
Draw conclusions for the
whole population based
on information gained
from a sample
Target of study design
representative sample
= ~ all "typical"
representatives are
included
Descriptive
statistics
(a population is a set of all
conceivable observations of a
certain phenomenon)
11
Descriptive statistics (1)
• For continuous variables (e.g. height, blood pressure,
body mass index, blood glucose)
• Histograms
• Measures of location
–mean, median…
• Measures of spread
–variance, standard deviation, range, interquartile
range
12
Descriptive statistics (2)
• For categorical variables (e.g. sex, dead/alive,
smoker/non-smoker, blood group)
• Bar charts, (pie charts)…
• Frequencies, percentages, proportions
–when using a percentage, always state the number
it’s based on
–(e.g. 68% (n=279))
13
Describing data
• Going back to the heights example
–we didn’t have time to measure the heights of the
population (all participants), so we took a sample
–we calculated the mean of the sample
–we can use the mean of the sample as an estimate
of the mean in the population
–but seeing as it’s only an estimate, we want to know
how reliable it is; that is, how well it represents the
true (population) value
–we calculated the standard deviation of the sample
–this is a fact (like the sample mean) about the
spread of the data in the sample
14
Inferential statistics – confidence intervals
• the standard deviation also helps us to understand how precise our
estimate of the mean is
• the precision depends on the number of data points in the sample
– we can calculate the standard error of the mean
– (in our more realistic example, the standard error is 4.7)
– the larger the sample, the smaller the standard error, the
more precise the estimate
• so now we have an estimate of the population mean and a measure
of its precision
• so what?
• we can use the standard error of the mean and the properties of the
normal distribution to gain some level of confidence about our
estimate
sampleinnumber
deviationstandard
meantheoferrorstandard
15
Confidence intervals
• How confident do you need to be about the outcome of a race in order to
place a bet?
• In statistics it’s typical to talk about 95% confidence
• we can use the standard error of the mean to calculate a 95%
confidence interval for the mean
• in our example: 95%CI for the mean ≈ 168.4 2 x 4.7
≈ [159.0, 177.8]
errorstandard2meanintervalconfidence95%
measure of precision of
the sample mean
relates to the normal distribution
estimate of population mean
16
Confidence intervals
• What does a 95% confidence interval mean?
• remember that we have drawn a sample from a population
• we want to know how good our estimate of the population mean (the
truth) is
• the 95% CI tells us that the true mean lies in 95 out of 100 confidence
intervals calculated using different samples from this population
 true mean lies between 159 cm and 177.8 cm with 95% probability
• remember: the larger the number in the sample, the smaller the standard
error
• the smaller the standard error, the narrower the confidence interval
 the larger the number in the sample, the more precise the estimate
17
A real example
Lateral wedge insoles for medial knee osteoarthritis: 12 month
randomised controlled trial (BMJ 2011; 342:d2912 )
Participants 200 people aged 50 or more with clinical and radiographic
diagnosis of mild to moderately severe medial knee osteoarthritis.
Interventions Full length 5 degree lateral wedged insoles or flat control
insoles worn inside the shoes daily for 12 months.
Main outcome measure Change in overall knee pain (past week)
measured on an 11 point numerical rating scale (0=no pain, 10=worst
pain imaginable).
Research question:
Are full length 5 degree lateral wedged insoles better than flat control
insoles at reducing pain after 12 months in patients with mild to
moderately severe medial knee osteoarthritis?
18
A real example
• Assume 5 degree lateral wedged insoles and flat control insoles reduce
pain to the same extent (this is our null hypothesis)
• Collect data in both groups of patients and calculate:
the mean reduction from baseline in the 5 deg lateral insole group (0.9)
the mean reduction from baseline in the flat insole group (1.2)
the difference between the two groups in mean change from baseline (0.9
– 1.2 = -0.3)
• We can see that there was some reduction in pain in both groups and
that the reduction was greater in the flat insole group
• The 95% confidence interval tells us how important that difference is
flat insole better 5 deg lateral insole better
-1.0 0.3- 0.3 0
19
A real example
• Conclusion: with the available evidence, there is no reason
to suggest that one treatment is better than the other in
reducing pain scores after one year.
• Or: there is no statistically significant difference in
treatments (because the 95% confidence interval crosses
the null value)
20
Statistical significance vs clinical significance
• Remember: the larger the number of data points in a sample, the
more precise the estimate
• Let’s imagine that instead of enrolling 200 patients in their study, the
researchers enrolled 1500.
• This time, the confidence interval does not cross the null value
• So we can conclude that the flat insole reduces pain by 0.25 points
more than the 5 degree lateral insole.
• Hooray?
• Is a difference of 0.25 points clinically relevant?
-0.4 -0.1-0.25 0
flat insole better 5 deg lateral insole better
21
Quantifying statistical significance – the p-value
• In the last example, we looked at the 95% confidence interval to decide
whether we had a statistically significant result
• (the confidence interval is useful because it gives clinicians and patients a
range of values in which the true value is likely to lie)
• another measure of statistical significance is the p-value
• Remember our null hypothesis: 5 degree lateral wedged insoles and
flat control insoles reduce pain to the same extent
• We go into our experiment assuming that the null hypothesis is true
• After we’ve collected the data, the p-value helps us decide whether our
results are due to chance alone
22
The p-value
• the p-value is a probability  it takes values between 0 and 1
• the p-value is the probability of observing the data (or data more
extreme) if the null hypothesis is true
 (if the null hypothesis is true, the two treatments are the same)
 a low p-value (a value very close to zero) means there is low
probability of observing such data when the null hypothesis is true
 reject the null hypothesis (i.e. conclude that there is difference in
treatments)
23
Significance level
• p-values are compared to a pre-specified significance level, alpha
• alpha is usually 5% (analogous to 95% confidence intervals)
 if p ≤ 0.05 reject the null hypothesis (i.e., the result is statistically significant
at the 5% level)  conclude that there is a difference in treatments
 if p > 0.05 do not reject the null hypothesis  conclude that there is not
sufficient information to reject the null hypothesis
(See Statistics notes: Absence of evidence is not evidence of absence:
BMJ 1995;311:485)
The authors of the insole paper did not cite the p-value for the result we looked
at but we can conclude that p>0.05 because the 95%CI crosses the null value
and so we cannot reject the null hypothesis (of no difference).
24
H0 H1
H0 Correct
1-
H1 False positive
Testdecision
Hypothesisaccepted
True state of nature
Type I error:
Error of rejecting a null hypothesis when it is actually true
( error, error of the first kind)
represents a FALSE POSITIVE decision: A placebo is declared to be more
effective than another placebo!)
Statistical mistakes/errors
25
H0 H1
H0 Correct
1-
False negative
H1 False positive Correct
1-
Testdecision
Hypothesisaccepted
True state of nature
Type II error:
The alternative hypothesis is true, but the null hypothesis is erroneously
not rejected. (β error, error of the second kind)
FALSE NEGATIVE decision: an effective treatment could not be
significantly distinguished from placebo
Statistical mistakes/errors
26
Which sorts of error can occur when performing statistical tests?
H0 H1
H0 Correct
1-
False negative
H1 False positive Correct
1-
Testdecision
Hypothesisaccepted
True state of nature
Significance level is usually predetermined in the study protocol, e.g.,
=0.05 (5%, 1 in 20), = 0.01 (1%, 1 in 100), = 0.001 (0.1%, 1 in 1000)
POWER 1- : The power of a statistical test is the probability that the test will reject
a false null hypothesis.
Statistical mistakes/errors
27
H0 H1
H0 Correct
1-
False negative
H1 False positive Correct
1-
Testdecision
Hypothesisaccepted
True state of nature
Consumer‘s risk?
Producer‘s risk?
Statistical mistakes/errors
28
Compare a statistical test to a court of justice
H0
Innocent
H1
Not Innocent
H0 Correct
1-
False negative
(i.e. guilty but not caught)
H1 False positive
(i.e. innocent but convicted)
Correct
1-
Testdecision
Decisionofthecourt
True state of nature
False positive: A court finds a person guilty of a crime that they did not actually commit.
False negative: A court finds a person not guilty of a crime that they did commit.
Judged
"innocent"
Judged
“Notinnocent"
Statistical tests vs court of law
29
Minimising statistical errors
Remember:
How do I gain adequate data?
Thorough planning of studies
• Defining acceptable levels of statistical error is key to the planning
of studies
• alpha (in clinical trials) is pre-defined by regulatory guidance
(usually)
• beta is not, but deciding on the power (1-beta) of the study is
crucial to enrolling sufficient patients
• the power of a study is usually chosen to be 80% or 90%
• conducting an “underpowered” study is not ethically acceptable
because you know in advance that your results will be inconclusive
30
Deciding how many patients to enrol
• the sample size calculation depends on:
– the clinically relevant effect that is expected (1)
– the amount of variability in the data that is expected (2)
– the significance level at which you plan to test (3)
– the power that you hope to achieve in your study (4)
• if you knew (1) and (2), you wouldn’t need to conduct a study
 picking your sample size is a gamble
• the smaller the treatment effect, the more patients you need
• the more variable the treatment effect, the more patients you need
• the smaller the risks (or statistical errors) you’re prepared to take,
the more patients you need
31
Conclusion
• amount of variability in data defines conclusions from
statistical tests
• clinical trial methodology is underpinned by statistics
• valid statistical inference, and therefore decision-
making, depends on robust planning
• understanding statistics isn’t as hard as it looks (I
hope)
32
Further reading
• www.consort-statement.org
 standards for reporting clinical trials in the
literature
• Statistical Principles for Clinical Trials ICH E9
 useful glossary
• http://openwetware.org/wiki/BMJ_Statistics_Notes_series
 coverage of a number of topics related to
statistics in clinical research, mostly by Douglas
Altman and Martin Bland

More Related Content

What's hot

Basis of statistical inference
Basis of statistical inferenceBasis of statistical inference
Basis of statistical inferencezahidacademy
 
How to do the maths
How to do the mathsHow to do the maths
How to do the maths
Ahmed Elfaitury
 
On p-values
On p-valuesOn p-values
On p-values
Maarten van Smeden
 
Minimally important differences v2
Minimally important differences v2Minimally important differences v2
Minimally important differences v2
Stephen Senn
 
Basics of statistics
Basics of statisticsBasics of statistics
Basics of statisticsGaurav Kr
 
Normal and standard normal distribution
Normal and standard normal distributionNormal and standard normal distribution
Normal and standard normal distribution
Avjinder (Avi) Kaler
 
Lect w7 t_test_amp_chi_test
Lect w7 t_test_amp_chi_testLect w7 t_test_amp_chi_test
Lect w7 t_test_amp_chi_test
Rione Drevale
 
The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective Statisticians
Stephen Senn
 
Minimally important differences
Minimally important differencesMinimally important differences
Minimally important differences
Stephen Senn
 
What is a Single Sample Z Test?
What is a Single Sample Z Test?What is a Single Sample Z Test?
What is a Single Sample Z Test?
Ken Plummer
 
Calculating a single sample z test by hand
Calculating a single sample z test by handCalculating a single sample z test by hand
Calculating a single sample z test by hand
Ken Plummer
 
Intro to tests of significance qualitative
Intro to tests of significance qualitativeIntro to tests of significance qualitative
Intro to tests of significance qualitativePandurangi Raghavendra
 
Test of hypothesis (z)
Test of hypothesis (z)Test of hypothesis (z)
Test of hypothesis (z)Marlon Gomez
 
Is ignorance bliss
Is ignorance blissIs ignorance bliss
Is ignorance bliss
Stephen Senn
 
What is your question
What is your questionWhat is your question
What is your question
StephenSenn2
 
Single sample z test - explain (final)
Single sample z test - explain (final)Single sample z test - explain (final)
Single sample z test - explain (final)
CTLTLA
 
What should we expect from reproducibiliry
What should we expect from reproducibiliryWhat should we expect from reproducibiliry
What should we expect from reproducibiliry
Stephen Senn
 

What's hot (20)

Basis of statistical inference
Basis of statistical inferenceBasis of statistical inference
Basis of statistical inference
 
How to do the maths
How to do the mathsHow to do the maths
How to do the maths
 
On p-values
On p-valuesOn p-values
On p-values
 
Minimally important differences v2
Minimally important differences v2Minimally important differences v2
Minimally important differences v2
 
Basics of statistics
Basics of statisticsBasics of statistics
Basics of statistics
 
Normal and standard normal distribution
Normal and standard normal distributionNormal and standard normal distribution
Normal and standard normal distribution
 
Lect w7 t_test_amp_chi_test
Lect w7 t_test_amp_chi_testLect w7 t_test_amp_chi_test
Lect w7 t_test_amp_chi_test
 
The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective Statisticians
 
Minimally important differences
Minimally important differencesMinimally important differences
Minimally important differences
 
Chapter10
Chapter10Chapter10
Chapter10
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Lund 2009
Lund 2009Lund 2009
Lund 2009
 
What is a Single Sample Z Test?
What is a Single Sample Z Test?What is a Single Sample Z Test?
What is a Single Sample Z Test?
 
Calculating a single sample z test by hand
Calculating a single sample z test by handCalculating a single sample z test by hand
Calculating a single sample z test by hand
 
Intro to tests of significance qualitative
Intro to tests of significance qualitativeIntro to tests of significance qualitative
Intro to tests of significance qualitative
 
Test of hypothesis (z)
Test of hypothesis (z)Test of hypothesis (z)
Test of hypothesis (z)
 
Is ignorance bliss
Is ignorance blissIs ignorance bliss
Is ignorance bliss
 
What is your question
What is your questionWhat is your question
What is your question
 
Single sample z test - explain (final)
Single sample z test - explain (final)Single sample z test - explain (final)
Single sample z test - explain (final)
 
What should we expect from reproducibiliry
What should we expect from reproducibiliryWhat should we expect from reproducibiliry
What should we expect from reproducibiliry
 

Similar to 3b. Introductory Statistics - Julia Saperia

Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
Kiran Ramakrishna
 
Clinical research ( Medical stat. concepts)
Clinical research ( Medical stat. concepts)Clinical research ( Medical stat. concepts)
Clinical research ( Medical stat. concepts)
Mohamed Fahmy Dehim
 
Introduction to medical statistics
Introduction to medical statistics Introduction to medical statistics
Introduction to medical statistics
Mohamed Alhelaly
 
Basics of Statistics.pptx
Basics of Statistics.pptxBasics of Statistics.pptx
Basics of Statistics.pptx
Ammar Sami AL-Bayati
 
RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3
DR. TIRIMBA IBRAHIM
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdf
Dikshathawait
 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis Testing
Sr Edith Bogue
 
Standard Error & Confidence Intervals.pptx
Standard Error & Confidence Intervals.pptxStandard Error & Confidence Intervals.pptx
Standard Error & Confidence Intervals.pptx
hanyiasimple
 
Sample size determination
Sample size determinationSample size determination
Sample size determination
Augustine Gatimu
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testing
o_devinyak
 
Ds vs Is discuss 3.1
Ds vs Is discuss 3.1Ds vs Is discuss 3.1
Ds vs Is discuss 3.1
Makati Science High School
 
COM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptxCOM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptx
AkinsolaAyomidotun
 
Soni_Biostatistics.ppt
Soni_Biostatistics.pptSoni_Biostatistics.ppt
Soni_Biostatistics.ppt
Ogunsina1
 
Descriptive And Inferential Statistics for Nursing Research
Descriptive And Inferential Statistics for Nursing ResearchDescriptive And Inferential Statistics for Nursing Research
Descriptive And Inferential Statistics for Nursing Research
enamprofessor
 
250Lec5INFERENTIAL STATISTICS FOR RESEARC
250Lec5INFERENTIAL STATISTICS FOR RESEARC250Lec5INFERENTIAL STATISTICS FOR RESEARC
250Lec5INFERENTIAL STATISTICS FOR RESEARC
LeaCamillePacle
 
1. complete stats notes
1. complete stats notes1. complete stats notes
1. complete stats notes
Bob Smullen
 
Malimu statistical significance testing.
Malimu statistical significance testing.Malimu statistical significance testing.
Malimu statistical significance testing.
Miharbi Ignasm
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...
David Pratap
 
Confidence Intervals in the Life Sciences PresentationNamesS.docx
Confidence Intervals in the Life Sciences PresentationNamesS.docxConfidence Intervals in the Life Sciences PresentationNamesS.docx
Confidence Intervals in the Life Sciences PresentationNamesS.docx
maxinesmith73660
 

Similar to 3b. Introductory Statistics - Julia Saperia (20)

Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
 
Clinical research ( Medical stat. concepts)
Clinical research ( Medical stat. concepts)Clinical research ( Medical stat. concepts)
Clinical research ( Medical stat. concepts)
 
Introduction to medical statistics
Introduction to medical statistics Introduction to medical statistics
Introduction to medical statistics
 
Basics of Statistics.pptx
Basics of Statistics.pptxBasics of Statistics.pptx
Basics of Statistics.pptx
 
RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdf
 
Review & Hypothesis Testing
Review & Hypothesis TestingReview & Hypothesis Testing
Review & Hypothesis Testing
 
Standard Error & Confidence Intervals.pptx
Standard Error & Confidence Intervals.pptxStandard Error & Confidence Intervals.pptx
Standard Error & Confidence Intervals.pptx
 
Sample size determination
Sample size determinationSample size determination
Sample size determination
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testing
 
Ds vs Is discuss 3.1
Ds vs Is discuss 3.1Ds vs Is discuss 3.1
Ds vs Is discuss 3.1
 
COM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptxCOM 201_Inferential Statistics_18032022.pptx
COM 201_Inferential Statistics_18032022.pptx
 
Soni_Biostatistics.ppt
Soni_Biostatistics.pptSoni_Biostatistics.ppt
Soni_Biostatistics.ppt
 
Descriptive And Inferential Statistics for Nursing Research
Descriptive And Inferential Statistics for Nursing ResearchDescriptive And Inferential Statistics for Nursing Research
Descriptive And Inferential Statistics for Nursing Research
 
250Lec5INFERENTIAL STATISTICS FOR RESEARC
250Lec5INFERENTIAL STATISTICS FOR RESEARC250Lec5INFERENTIAL STATISTICS FOR RESEARC
250Lec5INFERENTIAL STATISTICS FOR RESEARC
 
1. complete stats notes
1. complete stats notes1. complete stats notes
1. complete stats notes
 
Malimu statistical significance testing.
Malimu statistical significance testing.Malimu statistical significance testing.
Malimu statistical significance testing.
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...
 
Confidence Intervals in the Life Sciences PresentationNamesS.docx
Confidence Intervals in the Life Sciences PresentationNamesS.docxConfidence Intervals in the Life Sciences PresentationNamesS.docx
Confidence Intervals in the Life Sciences PresentationNamesS.docx
 
Hypo
HypoHypo
Hypo
 

More from EURORDIS Rare Diseases Europe (20)

Gene editing to treat cystic fibrosis_Vertex
Gene editing to treat cystic fibrosis_VertexGene editing to treat cystic fibrosis_Vertex
Gene editing to treat cystic fibrosis_Vertex
 
From Laboratory to Patient
From Laboratory to PatientFrom Laboratory to Patient
From Laboratory to Patient
 
Committee for Advanced Therapies
Committee for Advanced TherapiesCommittee for Advanced Therapies
Committee for Advanced Therapies
 
3.nf
3.nf3.nf
3.nf
 
3.i
3.i3.i
3.i
 
2.it
2.it2.it
2.it
 
2.ab
2.ab2.ab
2.ab
 
1.3 df
1.3 df1.3 df
1.3 df
 
1.4 av
1.4 av1.4 av
1.4 av
 
2.it
2.it2.it
2.it
 
2.ab
2.ab2.ab
2.ab
 
1.2 fpr
1.2 fpr1.2 fpr
1.2 fpr
 
1.1 annaritamiccio
1.1 annaritamiccio1.1 annaritamiccio
1.1 annaritamiccio
 
1 engaging with ema methodology and support
1 engaging with ema methodology and support1 engaging with ema methodology and support
1 engaging with ema methodology and support
 
panorama of actions patient organisations
 panorama of actions patient organisations panorama of actions patient organisations
panorama of actions patient organisations
 
Chmp 2018 houyez
Chmp 2018 houyezChmp 2018 houyez
Chmp 2018 houyez
 
Cup case study
Cup case studyCup case study
Cup case study
 
Hta fhz
Hta fhzHta fhz
Hta fhz
 
Patients chmp 2018
Patients chmp 2018Patients chmp 2018
Patients chmp 2018
 
Vigilance 2018 final
Vigilance 2018 finalVigilance 2018 final
Vigilance 2018 final
 

Recently uploaded

Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Savita Shen $i11
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
Krishan Murari
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
Anurag Sharma
 
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #GirlsFor Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
Savita Shen $i11
 
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
VarunMahajani
 
Ophthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE examOphthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE exam
KafrELShiekh University
 
THOA 2.ppt Human Organ Transplantation Act
THOA 2.ppt Human Organ Transplantation ActTHOA 2.ppt Human Organ Transplantation Act
THOA 2.ppt Human Organ Transplantation Act
DrSathishMS1
 
The Normal Electrocardiogram - Part I of II
The Normal Electrocardiogram - Part I of IIThe Normal Electrocardiogram - Part I of II
The Normal Electrocardiogram - Part I of II
MedicoseAcademics
 
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptxMaxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 
How to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for DoctorsHow to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for Doctors
LanceCatedral
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
Anujkumaranit
 
Triangles of Neck and Clinical Correlation by Dr. RIG.pptx
Triangles of Neck and Clinical Correlation by Dr. RIG.pptxTriangles of Neck and Clinical Correlation by Dr. RIG.pptx
Triangles of Neck and Clinical Correlation by Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
touseefaziz1
 
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
Catherine Liao
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
Dr. Rabia Inam Gandapore
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdfAlcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
Dr Jeenal Mistry
 
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.GawadHemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
NephroTube - Dr.Gawad
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
DR SETH JOTHAM
 
Antiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptxAntiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptx
Rohit chaurpagar
 

Recently uploaded (20)

Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
 
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #GirlsFor Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
For Better Surat #ℂall #Girl Service ❤85270-49040❤ Surat #ℂall #Girls
 
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
 
Ophthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE examOphthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE exam
 
THOA 2.ppt Human Organ Transplantation Act
THOA 2.ppt Human Organ Transplantation ActTHOA 2.ppt Human Organ Transplantation Act
THOA 2.ppt Human Organ Transplantation Act
 
The Normal Electrocardiogram - Part I of II
The Normal Electrocardiogram - Part I of IIThe Normal Electrocardiogram - Part I of II
The Normal Electrocardiogram - Part I of II
 
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptxMaxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
 
How to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for DoctorsHow to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for Doctors
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
 
Triangles of Neck and Clinical Correlation by Dr. RIG.pptx
Triangles of Neck and Clinical Correlation by Dr. RIG.pptxTriangles of Neck and Clinical Correlation by Dr. RIG.pptx
Triangles of Neck and Clinical Correlation by Dr. RIG.pptx
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
 
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
 
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdfAlcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
Alcohol_Dr. Jeenal Mistry MD Pharmacology.pdf
 
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.GawadHemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
 
Antiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptxAntiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptx
 

3b. Introductory Statistics - Julia Saperia

  • 1. Introductory statistics for medical research Julia Saperia European Medicines Agency Barcelona, 18 June 2012 www.eurordis.org
  • 2. 2 What does statistics mean to you? mean information percentage variability data median average graphs measurement probability confidence evidence spread
  • 3. 3 An example • The idea of (clinical) research is to provide answers to questions • Let’s ask a question…  How tall are Eurordis summer school 2012 participants?
  • 4. 4 An example • Thinking about one (silly) question tells us some important things about data.  people are different variability in data  the reliability of an average depends on the population being studied and the sample drawn from that population  valid inference
  • 5. 5 A sillier example • Let’s say I find out 5 heights (in cm) 165, 166, 167, 168, 169 mean = 167.0 cm standard deviation = 1.6 cm • Let’s say I find out 5 different heights (in cm) 154, 165, 168, 172, 183 mean = 168.4 cm standard deviation = 10.5 cm • So the means are the quite similar…but the two groups look different
  • 6. 6 A sillier example • what we learn from the second example is that we get more information from data if we know how variable they are  with these pieces of information, we can draw a picture of the data (provided some assumptions are true)
  • 7. 7 The histogram •Taken from http://www.mathsisfun.com/data/histograms.html on 28 May 2012
  • 8. 8 Johann Carl Friedrich Gauss 1777-1855 68,3% ~95,5% ~99%, mean mean + sd mean + 2sdmean - sdmean - 2sd Picturing “normal” data
  • 9. 9 Biostatistics … is the method of gaining empirical knowledge in the life sciences … is the method for quantifying variation and uncertainty 1. How do I gain adequate data? Thorough planning of studies 2. What can I do with my data? Descriptive and inferential statistics If (1) is not satisfactorily resolved, (2) will never be adequate.
  • 10. 10 draw a sample Population sample sample data Inferential statistics Draw conclusions for the whole population based on information gained from a sample Target of study design representative sample = ~ all "typical" representatives are included Descriptive statistics (a population is a set of all conceivable observations of a certain phenomenon)
  • 11. 11 Descriptive statistics (1) • For continuous variables (e.g. height, blood pressure, body mass index, blood glucose) • Histograms • Measures of location –mean, median… • Measures of spread –variance, standard deviation, range, interquartile range
  • 12. 12 Descriptive statistics (2) • For categorical variables (e.g. sex, dead/alive, smoker/non-smoker, blood group) • Bar charts, (pie charts)… • Frequencies, percentages, proportions –when using a percentage, always state the number it’s based on –(e.g. 68% (n=279))
  • 13. 13 Describing data • Going back to the heights example –we didn’t have time to measure the heights of the population (all participants), so we took a sample –we calculated the mean of the sample –we can use the mean of the sample as an estimate of the mean in the population –but seeing as it’s only an estimate, we want to know how reliable it is; that is, how well it represents the true (population) value –we calculated the standard deviation of the sample –this is a fact (like the sample mean) about the spread of the data in the sample
  • 14. 14 Inferential statistics – confidence intervals • the standard deviation also helps us to understand how precise our estimate of the mean is • the precision depends on the number of data points in the sample – we can calculate the standard error of the mean – (in our more realistic example, the standard error is 4.7) – the larger the sample, the smaller the standard error, the more precise the estimate • so now we have an estimate of the population mean and a measure of its precision • so what? • we can use the standard error of the mean and the properties of the normal distribution to gain some level of confidence about our estimate sampleinnumber deviationstandard meantheoferrorstandard
  • 15. 15 Confidence intervals • How confident do you need to be about the outcome of a race in order to place a bet? • In statistics it’s typical to talk about 95% confidence • we can use the standard error of the mean to calculate a 95% confidence interval for the mean • in our example: 95%CI for the mean ≈ 168.4 2 x 4.7 ≈ [159.0, 177.8] errorstandard2meanintervalconfidence95% measure of precision of the sample mean relates to the normal distribution estimate of population mean
  • 16. 16 Confidence intervals • What does a 95% confidence interval mean? • remember that we have drawn a sample from a population • we want to know how good our estimate of the population mean (the truth) is • the 95% CI tells us that the true mean lies in 95 out of 100 confidence intervals calculated using different samples from this population  true mean lies between 159 cm and 177.8 cm with 95% probability • remember: the larger the number in the sample, the smaller the standard error • the smaller the standard error, the narrower the confidence interval  the larger the number in the sample, the more precise the estimate
  • 17. 17 A real example Lateral wedge insoles for medial knee osteoarthritis: 12 month randomised controlled trial (BMJ 2011; 342:d2912 ) Participants 200 people aged 50 or more with clinical and radiographic diagnosis of mild to moderately severe medial knee osteoarthritis. Interventions Full length 5 degree lateral wedged insoles or flat control insoles worn inside the shoes daily for 12 months. Main outcome measure Change in overall knee pain (past week) measured on an 11 point numerical rating scale (0=no pain, 10=worst pain imaginable). Research question: Are full length 5 degree lateral wedged insoles better than flat control insoles at reducing pain after 12 months in patients with mild to moderately severe medial knee osteoarthritis?
  • 18. 18 A real example • Assume 5 degree lateral wedged insoles and flat control insoles reduce pain to the same extent (this is our null hypothesis) • Collect data in both groups of patients and calculate: the mean reduction from baseline in the 5 deg lateral insole group (0.9) the mean reduction from baseline in the flat insole group (1.2) the difference between the two groups in mean change from baseline (0.9 – 1.2 = -0.3) • We can see that there was some reduction in pain in both groups and that the reduction was greater in the flat insole group • The 95% confidence interval tells us how important that difference is flat insole better 5 deg lateral insole better -1.0 0.3- 0.3 0
  • 19. 19 A real example • Conclusion: with the available evidence, there is no reason to suggest that one treatment is better than the other in reducing pain scores after one year. • Or: there is no statistically significant difference in treatments (because the 95% confidence interval crosses the null value)
  • 20. 20 Statistical significance vs clinical significance • Remember: the larger the number of data points in a sample, the more precise the estimate • Let’s imagine that instead of enrolling 200 patients in their study, the researchers enrolled 1500. • This time, the confidence interval does not cross the null value • So we can conclude that the flat insole reduces pain by 0.25 points more than the 5 degree lateral insole. • Hooray? • Is a difference of 0.25 points clinically relevant? -0.4 -0.1-0.25 0 flat insole better 5 deg lateral insole better
  • 21. 21 Quantifying statistical significance – the p-value • In the last example, we looked at the 95% confidence interval to decide whether we had a statistically significant result • (the confidence interval is useful because it gives clinicians and patients a range of values in which the true value is likely to lie) • another measure of statistical significance is the p-value • Remember our null hypothesis: 5 degree lateral wedged insoles and flat control insoles reduce pain to the same extent • We go into our experiment assuming that the null hypothesis is true • After we’ve collected the data, the p-value helps us decide whether our results are due to chance alone
  • 22. 22 The p-value • the p-value is a probability  it takes values between 0 and 1 • the p-value is the probability of observing the data (or data more extreme) if the null hypothesis is true  (if the null hypothesis is true, the two treatments are the same)  a low p-value (a value very close to zero) means there is low probability of observing such data when the null hypothesis is true  reject the null hypothesis (i.e. conclude that there is difference in treatments)
  • 23. 23 Significance level • p-values are compared to a pre-specified significance level, alpha • alpha is usually 5% (analogous to 95% confidence intervals)  if p ≤ 0.05 reject the null hypothesis (i.e., the result is statistically significant at the 5% level)  conclude that there is a difference in treatments  if p > 0.05 do not reject the null hypothesis  conclude that there is not sufficient information to reject the null hypothesis (See Statistics notes: Absence of evidence is not evidence of absence: BMJ 1995;311:485) The authors of the insole paper did not cite the p-value for the result we looked at but we can conclude that p>0.05 because the 95%CI crosses the null value and so we cannot reject the null hypothesis (of no difference).
  • 24. 24 H0 H1 H0 Correct 1- H1 False positive Testdecision Hypothesisaccepted True state of nature Type I error: Error of rejecting a null hypothesis when it is actually true ( error, error of the first kind) represents a FALSE POSITIVE decision: A placebo is declared to be more effective than another placebo!) Statistical mistakes/errors
  • 25. 25 H0 H1 H0 Correct 1- False negative H1 False positive Correct 1- Testdecision Hypothesisaccepted True state of nature Type II error: The alternative hypothesis is true, but the null hypothesis is erroneously not rejected. (β error, error of the second kind) FALSE NEGATIVE decision: an effective treatment could not be significantly distinguished from placebo Statistical mistakes/errors
  • 26. 26 Which sorts of error can occur when performing statistical tests? H0 H1 H0 Correct 1- False negative H1 False positive Correct 1- Testdecision Hypothesisaccepted True state of nature Significance level is usually predetermined in the study protocol, e.g., =0.05 (5%, 1 in 20), = 0.01 (1%, 1 in 100), = 0.001 (0.1%, 1 in 1000) POWER 1- : The power of a statistical test is the probability that the test will reject a false null hypothesis. Statistical mistakes/errors
  • 27. 27 H0 H1 H0 Correct 1- False negative H1 False positive Correct 1- Testdecision Hypothesisaccepted True state of nature Consumer‘s risk? Producer‘s risk? Statistical mistakes/errors
  • 28. 28 Compare a statistical test to a court of justice H0 Innocent H1 Not Innocent H0 Correct 1- False negative (i.e. guilty but not caught) H1 False positive (i.e. innocent but convicted) Correct 1- Testdecision Decisionofthecourt True state of nature False positive: A court finds a person guilty of a crime that they did not actually commit. False negative: A court finds a person not guilty of a crime that they did commit. Judged "innocent" Judged “Notinnocent" Statistical tests vs court of law
  • 29. 29 Minimising statistical errors Remember: How do I gain adequate data? Thorough planning of studies • Defining acceptable levels of statistical error is key to the planning of studies • alpha (in clinical trials) is pre-defined by regulatory guidance (usually) • beta is not, but deciding on the power (1-beta) of the study is crucial to enrolling sufficient patients • the power of a study is usually chosen to be 80% or 90% • conducting an “underpowered” study is not ethically acceptable because you know in advance that your results will be inconclusive
  • 30. 30 Deciding how many patients to enrol • the sample size calculation depends on: – the clinically relevant effect that is expected (1) – the amount of variability in the data that is expected (2) – the significance level at which you plan to test (3) – the power that you hope to achieve in your study (4) • if you knew (1) and (2), you wouldn’t need to conduct a study  picking your sample size is a gamble • the smaller the treatment effect, the more patients you need • the more variable the treatment effect, the more patients you need • the smaller the risks (or statistical errors) you’re prepared to take, the more patients you need
  • 31. 31 Conclusion • amount of variability in data defines conclusions from statistical tests • clinical trial methodology is underpinned by statistics • valid statistical inference, and therefore decision- making, depends on robust planning • understanding statistics isn’t as hard as it looks (I hope)
  • 32. 32 Further reading • www.consort-statement.org  standards for reporting clinical trials in the literature • Statistical Principles for Clinical Trials ICH E9  useful glossary • http://openwetware.org/wiki/BMJ_Statistics_Notes_series  coverage of a number of topics related to statistics in clinical research, mostly by Douglas Altman and Martin Bland