This document provides an introduction to introductory statistics concepts for medical research. It discusses key statistical terms like mean, median, variability, and probability. It provides examples to illustrate descriptive statistics like histograms and measures of central tendency and variability. The document also discusses inferential statistics concepts like confidence intervals and p-values. It explains how statistical significance is determined using a significance level like 0.05 and how studies aim to avoid type 1 and type 2 errors in evaluating hypotheses. The overall document serves as a high-level primer on important foundational statistical concepts for medical researchers.
Statistical significance vs Clinical significanceVini Mehta
esults are said to be "statistically significant" if the probability that the result is compatible with the null hypothesis is very small. Clinical significance, or clinical importance: Is the difference between new and old therapy found in the study large enough for you to alter your practice?
Statistical significance vs Clinical significanceVini Mehta
esults are said to be "statistically significant" if the probability that the result is compatible with the null hypothesis is very small. Clinical significance, or clinical importance: Is the difference between new and old therapy found in the study large enough for you to alter your practice?
These slides were presented on November 22 2016 during the Annual Julius Symposium, organised by the Julius Center for Health Sciences and Primary Care, University Medical Hospital Utrecht.
Only a few months ago, the American Statistical Association authoritatively issued an official statement on significance and p-values (American Statistician, 2016, 70:2, 129-133), claiming that the p-value is: “commonly misused and misinterpreted.”
In this presentation I focus on the principles of the ASA statement.
When estimating sample sizes for clinical trials there are several different views that might be taken as to what definition and meaning should be given to the sought-for treatment effect. However, if the concept of a ‘minimally important difference’ (MID) does have relevance to interpreting clinical trials (which can be disputed) then its value cannot be the same as the ‘clinically relevant difference’ (CRD) that would be used for planning them.
A doubly pernicious use of the MID is as a means of classifying patients as responders and non-responders. Not only does such an analysis lead to an increase in the necessary sample size but it misleads trialists into making causal distinctions that the data cannot support and has been responsible for exaggerating the scope for personalised medicine.
In this talk these statistical points will be explained using a minimum of technical detail.
The Seven Habits of Highly Effective StatisticiansStephen Senn
If you know why the title of this talk is extremely stupid, then you clearly know something about control, data and reasoning: in short, you have most of what it takes to be a statistician. If you have studied statistics then you will also know that a large amount of anything, and this includes successful careers, is luck.
In this talk I shall try share some of my experiences of being a statistician in the hope that it will help you make the most of whatever luck life throws you, In so doing, I shall try my best to overcome the distorting influence of that easiest of sciences hindsight. Without giving too much away, I shall be recommending that you read, listen, think, calculate, understand, communicate, and do. I shall give you some example of what I think works and what I think doesn’t
In all of this you should never forget the power of negativity and also the joy of being able to wake up every day and say to yourself ‘I love the small of data in the morning’.
It is argued that when it comes to nuisance parameters an assumption of ignorance is harmful. On the other hand this raises problems as to how far one should go in searching for further data when combining evidence.
There are many questions one might ask of a clinical trial, ranging from what was the effect in the patients studied to what might the effect be in future patients via what was the effect in individual patients? The extent to which the answer to these questions is similar depends on various assumptions made and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear model and true multivariate based modelling. These distinctions don’t matter much for some purposes and under some circumstances but for others they do.
What should we expect from reproducibiliryStephen Senn
Is there really a reproducibility crisis and if so are P-values to blame? Choose any statistic you like and carry out two identical independent studies and report this statistic for each. In advance of collecting any data, you ought to expect that it is just as likely that statistic 1 will be smaller than statistic 2 as vice versa. Once you have seen statistic 1, things are not so simple but if they are not so simple, it is that you have other information in some form. However, it is at least instructive that you need to be careful in jumping to conclusions about what to expect from reproducibility. Furthermore, the forecasts of good Bayesians ought to obey a Martingale property. On average you should be in the future where you are now but, of course, your inferential random walk may lead to some peregrination before it homes in on “the truth”. But you certainly can’t generally expect that a probability will get smaller as you continue. P-values, like other statistics are a position not a movement. Although often claimed, there is no such things as a trend towards significance.
Using these and other philosophical considerations I shall try and establish what it is we want from reproducibility. I shall conclude that we statisticians should probably be paying more attention to checking that standard errors are being calculated appropriately and rather less to inferential framework.
These slides were presented on November 22 2016 during the Annual Julius Symposium, organised by the Julius Center for Health Sciences and Primary Care, University Medical Hospital Utrecht.
Only a few months ago, the American Statistical Association authoritatively issued an official statement on significance and p-values (American Statistician, 2016, 70:2, 129-133), claiming that the p-value is: “commonly misused and misinterpreted.”
In this presentation I focus on the principles of the ASA statement.
When estimating sample sizes for clinical trials there are several different views that might be taken as to what definition and meaning should be given to the sought-for treatment effect. However, if the concept of a ‘minimally important difference’ (MID) does have relevance to interpreting clinical trials (which can be disputed) then its value cannot be the same as the ‘clinically relevant difference’ (CRD) that would be used for planning them.
A doubly pernicious use of the MID is as a means of classifying patients as responders and non-responders. Not only does such an analysis lead to an increase in the necessary sample size but it misleads trialists into making causal distinctions that the data cannot support and has been responsible for exaggerating the scope for personalised medicine.
In this talk these statistical points will be explained using a minimum of technical detail.
The Seven Habits of Highly Effective StatisticiansStephen Senn
If you know why the title of this talk is extremely stupid, then you clearly know something about control, data and reasoning: in short, you have most of what it takes to be a statistician. If you have studied statistics then you will also know that a large amount of anything, and this includes successful careers, is luck.
In this talk I shall try share some of my experiences of being a statistician in the hope that it will help you make the most of whatever luck life throws you, In so doing, I shall try my best to overcome the distorting influence of that easiest of sciences hindsight. Without giving too much away, I shall be recommending that you read, listen, think, calculate, understand, communicate, and do. I shall give you some example of what I think works and what I think doesn’t
In all of this you should never forget the power of negativity and also the joy of being able to wake up every day and say to yourself ‘I love the small of data in the morning’.
It is argued that when it comes to nuisance parameters an assumption of ignorance is harmful. On the other hand this raises problems as to how far one should go in searching for further data when combining evidence.
There are many questions one might ask of a clinical trial, ranging from what was the effect in the patients studied to what might the effect be in future patients via what was the effect in individual patients? The extent to which the answer to these questions is similar depends on various assumptions made and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear model and true multivariate based modelling. These distinctions don’t matter much for some purposes and under some circumstances but for others they do.
What should we expect from reproducibiliryStephen Senn
Is there really a reproducibility crisis and if so are P-values to blame? Choose any statistic you like and carry out two identical independent studies and report this statistic for each. In advance of collecting any data, you ought to expect that it is just as likely that statistic 1 will be smaller than statistic 2 as vice versa. Once you have seen statistic 1, things are not so simple but if they are not so simple, it is that you have other information in some form. However, it is at least instructive that you need to be careful in jumping to conclusions about what to expect from reproducibility. Furthermore, the forecasts of good Bayesians ought to obey a Martingale property. On average you should be in the future where you are now but, of course, your inferential random walk may lead to some peregrination before it homes in on “the truth”. But you certainly can’t generally expect that a probability will get smaller as you continue. P-values, like other statistics are a position not a movement. Although often claimed, there is no such things as a trend towards significance.
Using these and other philosophical considerations I shall try and establish what it is we want from reproducibility. I shall conclude that we statisticians should probably be paying more attention to checking that standard errors are being calculated appropriately and rather less to inferential framework.
In order to understand medical statistics, you have to learn the very basic concepts as mean, median, and standard deviation. interpretation and understanding of clinical study results depends mainly on statistical background.
Standard Error & Confidence Intervals.pptxhanyiasimple
Certainly! Let's delve into the concept of **standard error**.
## What Is Standard Error?
The **standard error (SE)** is a statistical measure that quantifies the **variability** between a sample statistic (such as the mean) and the corresponding population parameter. Specifically, it estimates how much the sample mean would **vary** if we were to repeat the study using **new samples** from the same population. Here are the key points:
1. **Purpose**: Standard error helps us understand how well our **sample data** represents the entire population. Even with **probability sampling**, where elements are randomly selected, some **sampling error** remains. Calculating the standard error allows us to estimate the representativeness of our sample and draw valid conclusions.
2. **High vs. Low Standard Error**:
- **High Standard Error**: Indicates that sample means are **widely spread** around the population mean. In other words, the sample may not closely represent the population.
- **Low Standard Error**: Suggests that sample means are **closely distributed** around the population mean, indicating that the sample is representative of the population.
3. **Decreasing Standard Error**:
- To decrease the standard error, **increase the sample size**. Using a large, random sample minimizes **sampling bias** and provides a more accurate estimate of the population parameter.
## Standard Error vs. Standard Deviation
- **Standard Deviation (SD)**: Describes variability **within a single sample**. It can be calculated directly from sample data.
- **Standard Error (SE)**: Estimates variability across **multiple samples** from the same population. It is an **inferential statistic** that can only be estimated (unless the true population parameter is known).
### Example:
Suppose we have a random sample of 200 students, and we calculate the mean math SAT score to be 550. In this case:
- **Sample**: The 200 students
- **Population**: All test takers in the region
The standard error helps us understand how well this sample represents the entire population's math SAT scores.
Remember, the standard error is crucial for making valid statistical inferences. By understanding it, researchers can confidently draw conclusions based on sample data. 📊🔍
If you need further clarification or have additional questions, feel free to ask! 😊
---
I've provided a concise explanation of standard error, emphasizing its importance in statistical analysis. If you'd like more details or specific examples, feel free to ask! ¹²³⁴
Source: Conversation with Copilot, 5/31/2024
(1) What Is Standard Error? | How to Calculate (Guide with Examples) - Scribbr. https://www.scribbr.com/statistics/standard-error/.
(2) Standard Error (SE) Definition: Standard Deviation in ... - Investopedia. https://www.investopedia.com/terms/s/standard-error.asp.
(3) Standard error Definition & Meaning - Merriam-Webster. https://www.merriam-webster.com/dictionary/standard%20error.
(4) Standard err
A sample design is a definite plan for obtaining a sample from a given population. Researcher must select/prepare a sample design which should be reliable and appropriate for his research study.
Biostatistics - the application of statistical methods in the life sciences including medicine, pharmacy, and agriculture.
An understanding is needed in practice issues requiring sound decisions.
Statistics is a decision science.
Biostatistics therefore deals with data.
Biostatistics is the science of obtaining, analyzing and interpreting data in order to understand and improve human health.
Applications of Biostatistics
Design and analysis of clinical trials
Quality control of pharmaceuticals
Pharmacy practice research
Public health, including epidemiology
Genomics and population genetics
Ecology
Biological sequence analysis
Bioinformatics etc.
P-values the gold measure of statistical validity are not as reliable as many...David Pratap
This is an article that appeared in the NATURE as News Feature dated 12-February-2014. This article was presented in the journal club at Oman Medical College , Bowshar Campus on December, 17, 2015. This article was presented by Pratap David , Biostatistics Lecturer.
Confidence Intervals in the Life Sciences PresentationNamesS.docxmaxinesmith73660
Confidence Intervals in the Life Sciences Presentation
Names
Statistics for the Life Sciences STAT/167
Date
Fahad M. Gohar M.S.A.S
1
Conservation Biology of Bears
Normal Distribution
Standard normal distribution
Confidence Interval
Population Mean
Population Variance
Confidence Level
Point Estimate
Critical Value
Margin of Error
Welcome to the presentation on Confidence Intervals of Conservation Biology on Bears.
The team will define normal distribution and use an example of variables why this is important. A standard and normal distribution is discussed as well as the difference between standard and other normal distributions. Confidence interval will be defined and how it is used in Conservation Biology and Bears. We will learn how a confidence interval helps researchers estimate of population mean and population variance. The presenters defined a point estimate and try to explain how a point estimate found from a confidence interval. Confidence level is defined and a short explanation of confidence level is related to the confidence interval. Lastly, a critical value and margin of error are explained with examples from the Statdisk.
2
Normal Distribution
A normal distribution is one which has the mean, median, and mode are the same and the standard deviations are apart from the mean in the probabilities that go with the empirical rule. Not all data has the measures of central tendency, since some data sets may not have one unique value which occurs more than once. But every data set has a mean and median. The mean is only good with interval and ratio data, while the median can be used with interval, ratio and ordinal data. Mean is used when they're a lot of outliers, and median is used when there are few.
The normal distribution is continuous, and has only two parameters - mean and variance. The mean can be any positive number and variance can be any positive number (can't be negative - the mean and variance), so there are an infinite number of normal distributions. You want your data to represent the population distribution because when you make claims from the distribution of the sample you took, you want it to represent the whole entire population.
Some examples in the business world: Some industries which use normal distributions are pharmaceutical companies. They model the average blood pressure through normal distributions, and can make medicine which will help majority of the people with high blood pressure. A company can also model its average time to create something using the normal distribution. Several statistics can be calculated with the normal distribution, and hypothesis tests can be done with the normal distribution which models the average time.
Our chosen life science is BEARS. The age of the bears can be modeled by normal distributions and it is important to monitor since that tells us the average age of the bear, and can tell us a lot about the population. If the mean is high and the standard deviatio.
micro teaching on communication m.sc nursing.pdfAnurag Sharma
Microteaching is a unique model of practice teaching. It is a viable instrument for the. desired change in the teaching behavior or the behavior potential which, in specified types of real. classroom situations, tends to facilitate the achievement of specified types of objectives.
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...VarunMahajani
Disruption of blood supply to lung alveoli due to blockage of one or more pulmonary blood vessels is called as Pulmonary thromboembolism. In this presentation we will discuss its causes, types and its management in depth.
These lecture slides, by Dr Sidra Arshad, offer a quick overview of physiological basis of a normal electrocardiogram.
Learning objectives:
1. Define an electrocardiogram (ECG) and electrocardiography
2. Describe how dipoles generated by the heart produce the waveforms of the ECG
3. Describe the components of a normal electrocardiogram of a typical bipolar leads (limb II)
4. Differentiate between intervals and segments
5. Enlist some common indications for obtaining an ECG
Study Resources:
1. Chapter 11, Guyton and Hall Textbook of Medical Physiology, 14th edition
2. Chapter 9, Human Physiology - From Cells to Systems, Lauralee Sherwood, 9th edition
3. Chapter 29, Ganong’s Review of Medical Physiology, 26th edition
4. Electrocardiogram, StatPearls - https://www.ncbi.nlm.nih.gov/books/NBK549803/
5. ECG in Medical Practice by ABM Abdullah, 4th edition
6. ECG Basics, http://www.nataliescasebook.com/tag/e-c-g-basics
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdfAnujkumaranit
Artificial intelligence (AI) refers to the simulation of human intelligence processes by machines, especially computer systems. It encompasses tasks such as learning, reasoning, problem-solving, perception, and language understanding. AI technologies are revolutionizing various fields, from healthcare to finance, by enabling machines to perform tasks that typically require human intelligence.
Ethanol (CH3CH2OH), or beverage alcohol, is a two-carbon alcohol
that is rapidly distributed in the body and brain. Ethanol alters many
neurochemical systems and has rewarding and addictive properties. It
is the oldest recreational drug and likely contributes to more morbidity,
mortality, and public health costs than all illicit drugs combined. The
5th edition of the Diagnostic and Statistical Manual of Mental Disorders
(DSM-5) integrates alcohol abuse and alcohol dependence into a single
disorder called alcohol use disorder (AUD), with mild, moderate,
and severe subclassifications (American Psychiatric Association, 2013).
In the DSM-5, all types of substance abuse and dependence have been
combined into a single substance use disorder (SUD) on a continuum
from mild to severe. A diagnosis of AUD requires that at least two of
the 11 DSM-5 behaviors be present within a 12-month period (mild
AUD: 2–3 criteria; moderate AUD: 4–5 criteria; severe AUD: 6–11 criteria).
The four main behavioral effects of AUD are impaired control over
drinking, negative social consequences, risky use, and altered physiological
effects (tolerance, withdrawal). This chapter presents an overview
of the prevalence and harmful consequences of AUD in the U.S.,
the systemic nature of the disease, neurocircuitry and stages of AUD,
comorbidities, fetal alcohol spectrum disorders, genetic risk factors, and
pharmacotherapies for AUD.
- Video recording of this lecture in English language: https://youtu.be/lK81BzxMqdo
- Video recording of this lecture in Arabic language: https://youtu.be/Ve4P0COk9OI
- Link to download the book free: https://nephrotube.blogspot.com/p/nephrotube-nephrology-books.html
- Link to NephroTube website: www.NephroTube.com
- Link to NephroTube social media accounts: https://nephrotube.blogspot.com/p/join-nephrotube-on-social-media.html
The prostate is an exocrine gland of the male mammalian reproductive system
It is a walnut-sized gland that forms part of the male reproductive system and is located in front of the rectum and just below the urinary bladder
Function is to store and secrete a clear, slightly alkaline fluid that constitutes 10-30% of the volume of the seminal fluid that along with the spermatozoa, constitutes semen
A healthy human prostate measures (4cm-vertical, by 3cm-horizontal, 2cm ant-post ).
It surrounds the urethra just below the urinary bladder. It has anterior, median, posterior and two lateral lobes
It’s work is regulated by androgens which are responsible for male sex characteristics
Generalised disease of the prostate due to hormonal derangement which leads to non malignant enlargement of the gland (increase in the number of epithelial cells and stromal tissue)to cause compression of the urethra leading to symptoms (LUTS
Anti ulcer drugs and their Advance pharmacology ||
Anti-ulcer drugs are medications used to prevent and treat ulcers in the stomach and upper part of the small intestine (duodenal ulcers). These ulcers are often caused by an imbalance between stomach acid and the mucosal lining, which protects the stomach lining.
||Scope: Overview of various classes of anti-ulcer drugs, their mechanisms of action, indications, side effects, and clinical considerations.
2. 2
What does statistics mean to you?
mean
information
percentage
variability
data
median
average
graphs
measurement
probability
confidence
evidence
spread
3. 3
An example
• The idea of (clinical) research is to provide answers to questions
• Let’s ask a question…
How tall are Eurordis summer school 2012
participants?
4. 4
An example
• Thinking about one (silly) question tells us some important
things about data.
people are different variability in data
the reliability of an average depends on the
population being studied and the sample drawn
from that population valid inference
5. 5
A sillier example
• Let’s say I find out 5 heights (in cm)
165, 166, 167, 168, 169
mean = 167.0 cm
standard deviation = 1.6 cm
• Let’s say I find out 5 different heights (in cm)
154, 165, 168, 172, 183
mean = 168.4 cm
standard deviation = 10.5 cm
• So the means are the quite similar…but the two groups look
different
6. 6
A sillier example
• what we learn from the second example is that we get more
information from data if we know how variable they are
with these pieces of information, we can draw a picture of
the data (provided some assumptions are true)
8. 8
Johann Carl Friedrich Gauss 1777-1855
68,3%
~95,5%
~99%,
mean mean + sd mean + 2sdmean - sdmean - 2sd
Picturing “normal” data
9. 9
Biostatistics
… is the method of gaining empirical knowledge in the life sciences
… is the method for quantifying variation and uncertainty
1. How do I gain adequate data?
Thorough planning of studies
2. What can I do with my data?
Descriptive and inferential statistics
If (1) is not satisfactorily resolved, (2) will never be adequate.
10. 10
draw a
sample
Population
sample
sample data
Inferential statistics
Draw conclusions for the
whole population based
on information gained
from a sample
Target of study design
representative sample
= ~ all "typical"
representatives are
included
Descriptive
statistics
(a population is a set of all
conceivable observations of a
certain phenomenon)
11. 11
Descriptive statistics (1)
• For continuous variables (e.g. height, blood pressure,
body mass index, blood glucose)
• Histograms
• Measures of location
–mean, median…
• Measures of spread
–variance, standard deviation, range, interquartile
range
12. 12
Descriptive statistics (2)
• For categorical variables (e.g. sex, dead/alive,
smoker/non-smoker, blood group)
• Bar charts, (pie charts)…
• Frequencies, percentages, proportions
–when using a percentage, always state the number
it’s based on
–(e.g. 68% (n=279))
13. 13
Describing data
• Going back to the heights example
–we didn’t have time to measure the heights of the
population (all participants), so we took a sample
–we calculated the mean of the sample
–we can use the mean of the sample as an estimate
of the mean in the population
–but seeing as it’s only an estimate, we want to know
how reliable it is; that is, how well it represents the
true (population) value
–we calculated the standard deviation of the sample
–this is a fact (like the sample mean) about the
spread of the data in the sample
14. 14
Inferential statistics – confidence intervals
• the standard deviation also helps us to understand how precise our
estimate of the mean is
• the precision depends on the number of data points in the sample
– we can calculate the standard error of the mean
– (in our more realistic example, the standard error is 4.7)
– the larger the sample, the smaller the standard error, the
more precise the estimate
• so now we have an estimate of the population mean and a measure
of its precision
• so what?
• we can use the standard error of the mean and the properties of the
normal distribution to gain some level of confidence about our
estimate
sampleinnumber
deviationstandard
meantheoferrorstandard
15. 15
Confidence intervals
• How confident do you need to be about the outcome of a race in order to
place a bet?
• In statistics it’s typical to talk about 95% confidence
• we can use the standard error of the mean to calculate a 95%
confidence interval for the mean
• in our example: 95%CI for the mean ≈ 168.4 2 x 4.7
≈ [159.0, 177.8]
errorstandard2meanintervalconfidence95%
measure of precision of
the sample mean
relates to the normal distribution
estimate of population mean
16. 16
Confidence intervals
• What does a 95% confidence interval mean?
• remember that we have drawn a sample from a population
• we want to know how good our estimate of the population mean (the
truth) is
• the 95% CI tells us that the true mean lies in 95 out of 100 confidence
intervals calculated using different samples from this population
true mean lies between 159 cm and 177.8 cm with 95% probability
• remember: the larger the number in the sample, the smaller the standard
error
• the smaller the standard error, the narrower the confidence interval
the larger the number in the sample, the more precise the estimate
17. 17
A real example
Lateral wedge insoles for medial knee osteoarthritis: 12 month
randomised controlled trial (BMJ 2011; 342:d2912 )
Participants 200 people aged 50 or more with clinical and radiographic
diagnosis of mild to moderately severe medial knee osteoarthritis.
Interventions Full length 5 degree lateral wedged insoles or flat control
insoles worn inside the shoes daily for 12 months.
Main outcome measure Change in overall knee pain (past week)
measured on an 11 point numerical rating scale (0=no pain, 10=worst
pain imaginable).
Research question:
Are full length 5 degree lateral wedged insoles better than flat control
insoles at reducing pain after 12 months in patients with mild to
moderately severe medial knee osteoarthritis?
18. 18
A real example
• Assume 5 degree lateral wedged insoles and flat control insoles reduce
pain to the same extent (this is our null hypothesis)
• Collect data in both groups of patients and calculate:
the mean reduction from baseline in the 5 deg lateral insole group (0.9)
the mean reduction from baseline in the flat insole group (1.2)
the difference between the two groups in mean change from baseline (0.9
– 1.2 = -0.3)
• We can see that there was some reduction in pain in both groups and
that the reduction was greater in the flat insole group
• The 95% confidence interval tells us how important that difference is
flat insole better 5 deg lateral insole better
-1.0 0.3- 0.3 0
19. 19
A real example
• Conclusion: with the available evidence, there is no reason
to suggest that one treatment is better than the other in
reducing pain scores after one year.
• Or: there is no statistically significant difference in
treatments (because the 95% confidence interval crosses
the null value)
20. 20
Statistical significance vs clinical significance
• Remember: the larger the number of data points in a sample, the
more precise the estimate
• Let’s imagine that instead of enrolling 200 patients in their study, the
researchers enrolled 1500.
• This time, the confidence interval does not cross the null value
• So we can conclude that the flat insole reduces pain by 0.25 points
more than the 5 degree lateral insole.
• Hooray?
• Is a difference of 0.25 points clinically relevant?
-0.4 -0.1-0.25 0
flat insole better 5 deg lateral insole better
21. 21
Quantifying statistical significance – the p-value
• In the last example, we looked at the 95% confidence interval to decide
whether we had a statistically significant result
• (the confidence interval is useful because it gives clinicians and patients a
range of values in which the true value is likely to lie)
• another measure of statistical significance is the p-value
• Remember our null hypothesis: 5 degree lateral wedged insoles and
flat control insoles reduce pain to the same extent
• We go into our experiment assuming that the null hypothesis is true
• After we’ve collected the data, the p-value helps us decide whether our
results are due to chance alone
22. 22
The p-value
• the p-value is a probability it takes values between 0 and 1
• the p-value is the probability of observing the data (or data more
extreme) if the null hypothesis is true
(if the null hypothesis is true, the two treatments are the same)
a low p-value (a value very close to zero) means there is low
probability of observing such data when the null hypothesis is true
reject the null hypothesis (i.e. conclude that there is difference in
treatments)
23. 23
Significance level
• p-values are compared to a pre-specified significance level, alpha
• alpha is usually 5% (analogous to 95% confidence intervals)
if p ≤ 0.05 reject the null hypothesis (i.e., the result is statistically significant
at the 5% level) conclude that there is a difference in treatments
if p > 0.05 do not reject the null hypothesis conclude that there is not
sufficient information to reject the null hypothesis
(See Statistics notes: Absence of evidence is not evidence of absence:
BMJ 1995;311:485)
The authors of the insole paper did not cite the p-value for the result we looked
at but we can conclude that p>0.05 because the 95%CI crosses the null value
and so we cannot reject the null hypothesis (of no difference).
24. 24
H0 H1
H0 Correct
1-
H1 False positive
Testdecision
Hypothesisaccepted
True state of nature
Type I error:
Error of rejecting a null hypothesis when it is actually true
( error, error of the first kind)
represents a FALSE POSITIVE decision: A placebo is declared to be more
effective than another placebo!)
Statistical mistakes/errors
25. 25
H0 H1
H0 Correct
1-
False negative
H1 False positive Correct
1-
Testdecision
Hypothesisaccepted
True state of nature
Type II error:
The alternative hypothesis is true, but the null hypothesis is erroneously
not rejected. (β error, error of the second kind)
FALSE NEGATIVE decision: an effective treatment could not be
significantly distinguished from placebo
Statistical mistakes/errors
26. 26
Which sorts of error can occur when performing statistical tests?
H0 H1
H0 Correct
1-
False negative
H1 False positive Correct
1-
Testdecision
Hypothesisaccepted
True state of nature
Significance level is usually predetermined in the study protocol, e.g.,
=0.05 (5%, 1 in 20), = 0.01 (1%, 1 in 100), = 0.001 (0.1%, 1 in 1000)
POWER 1- : The power of a statistical test is the probability that the test will reject
a false null hypothesis.
Statistical mistakes/errors
28. 28
Compare a statistical test to a court of justice
H0
Innocent
H1
Not Innocent
H0 Correct
1-
False negative
(i.e. guilty but not caught)
H1 False positive
(i.e. innocent but convicted)
Correct
1-
Testdecision
Decisionofthecourt
True state of nature
False positive: A court finds a person guilty of a crime that they did not actually commit.
False negative: A court finds a person not guilty of a crime that they did commit.
Judged
"innocent"
Judged
“Notinnocent"
Statistical tests vs court of law
29. 29
Minimising statistical errors
Remember:
How do I gain adequate data?
Thorough planning of studies
• Defining acceptable levels of statistical error is key to the planning
of studies
• alpha (in clinical trials) is pre-defined by regulatory guidance
(usually)
• beta is not, but deciding on the power (1-beta) of the study is
crucial to enrolling sufficient patients
• the power of a study is usually chosen to be 80% or 90%
• conducting an “underpowered” study is not ethically acceptable
because you know in advance that your results will be inconclusive
30. 30
Deciding how many patients to enrol
• the sample size calculation depends on:
– the clinically relevant effect that is expected (1)
– the amount of variability in the data that is expected (2)
– the significance level at which you plan to test (3)
– the power that you hope to achieve in your study (4)
• if you knew (1) and (2), you wouldn’t need to conduct a study
picking your sample size is a gamble
• the smaller the treatment effect, the more patients you need
• the more variable the treatment effect, the more patients you need
• the smaller the risks (or statistical errors) you’re prepared to take,
the more patients you need
31. 31
Conclusion
• amount of variability in data defines conclusions from
statistical tests
• clinical trial methodology is underpinned by statistics
• valid statistical inference, and therefore decision-
making, depends on robust planning
• understanding statistics isn’t as hard as it looks (I
hope)
32. 32
Further reading
• www.consort-statement.org
standards for reporting clinical trials in the
literature
• Statistical Principles for Clinical Trials ICH E9
useful glossary
• http://openwetware.org/wiki/BMJ_Statistics_Notes_series
coverage of a number of topics related to
statistics in clinical research, mostly by Douglas
Altman and Martin Bland