Clinical Trials
Dr Nagarjuna B
Flow of presentation
• Clinical Trials
• Design of RCT
- Hypothesis testing
- Sample size calculation
- Statistical analysis
• Survival analysis
- Kaplan Meier curve
• Meta-analysis
- Forest plot
Clinical Trial
• A clinical trial tests potential interventions in
humans to determine if the intervention
represents an advance and should be adopted
for general use
FDA 2003
What do clinical trials test?
• Clinical trials test research hypotheses
• Good clinical trials test a specific research
hypothesis
• A clinical research hypothesis is a carefully
formulated assumption developed in order to
test its logical consequences
Phase I trial
• First evaluation of a new therapy in humans
1. Classical Goals:
Identify dose-limiting toxicities (DLT)
Identify maximum tolerated dose (MTD)
Assess pharmacokinetics (PK)
Assess pharmacodynamic (PD) endpoints
Phase II trial
• To define antitumor activity
• To further demonstrate safety
• To gain new insights into the
pharmacokinetics, pharmacodynamics &
metabolism of drugs
• To evaluate biologic correlates which may
predict response or resistance to treatment
and/or toxicity
• Phase II trials are exploratory studies and rarely are
definitive
• Efficient to exclude inactive therapies
• Results are interpreted cautiously, in the context of
the availability of other therapies
• Estimate clinical activity and provide further safety
information – important in the “go/no go” decision
• Require confirmation in pivotal phase III trials
Phase III trial
• Comparative Trials with or without controls
• Primary goal is to establish actual clinical
value
1. Survival endpoints (overall survival [OS], event-free survival [EFS],
progression-free survival [PFS]) are the primary endpoints
2. Compares new treatment to current standard of
care
3. Randomized (with allocation concealment) to
minimize bias
4. May sometimes be placebo-controlled and even
blinded
Phase IV trial
• Post-marketing surveillance studies
• Assess long-term toxic effects & risk-benefit
ratio
Design of RCT
Select suitable population (target population)
Select suitable sample (experimental or study population)
Make necessary exclusions (not eligible/no consent)
Randomise into experimental group and control group
Follow up & assessment
Definition of RCT
• An RCT is a study in which a group of investigators study
an intervention in a series of individuals who are allocated
at random to receive either the intervention or a comparator.
• Participants who receive the intervention to be tested form the
experimental group
• The other group of participants is called the control
group.
• The control can be conventional practice, a placebo,
or no intervention at all
Why randomize
• We need to analyse groups at the end of the trial
• To ensure that any difference between groups is due to the
treatment (Rx)
• For this you need comparable groups at the start of
trial
• Purpose of randomization is to make the treatment
groups comparable
Value of randomization
• It reduces the risk of serious imbalance in
unknown but important factors that could
influence the clinical course of the
participants.
Types of Randomization
• Common types of randomization methods
are:
1. Simple randomization
2. Stratified randomization – randomize within pre-decided
groups (based on possible prognostic factors)
3. Block (restricted) randomization – randomize within a block of
x patients – keeps numbers roughly equal between groups
4. Cluster randomization – randomize groups of subjects rather
than individual subjects
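As an illustration of how block (restricted) randomization keeps group sizes balanced, here is a minimal Python sketch. The function name `block_randomize`, the block size of 4 and the arm labels are assumptions made for this example and are not taken from the presentation; a real trial would use a validated randomization system with allocation concealment.

```python
import random

def block_randomize(n_patients, block_size=4,
                    arms=("experimental", "control"), seed=None):
    """Permuted-block randomization (illustrative sketch only).

    Every complete block contains each arm equally often, so group
    sizes stay roughly equal throughout recruitment.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_patients:
        # Build one balanced block, then shuffle the order within it.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_patients]

# Hypothetical usage: allocation list for 12 patients in blocks of 4.
schedule = block_randomize(12, block_size=4, seed=1)
```

Stratified randomization could be sketched the same way by keeping one such allocation list per stratum (for example, per prognostic group).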
Blinding
• The best way to protect a trial against bias is
by keeping the people involved in the trial
unaware of the identity of the interventions
for as long as possible
• RCTs are classified according to whether the investigators
and participants know which intervention is
being assessed:
– Open trials
– Single blind trials
– Double blind trials
– Triple blind trials
• Formats of RCTs
1. Parallel group
2. Cross over
3. Factorial
Factorial Design
• Factorial designs allow for researchers to test multiple
interventions or treatment combinations in a single study.
• The simplest form of this design is a 2x2 factorial design.
• Looks like a “grid”
• Used to effectively test multiple treatments in a single study.
• More efficient and more statistically powerful than multiple
single intervention studies.
Example 2x2 grid (statin by dose cycle; each cell records M LDL):
Dose cycle     Rosuvastatin (Crestor)   Atorvastatin (Lipitor)
3x per week    M LDL                    M LDL
Every day      M LDL                    M LDL
Drawing sample from population
HYPOTHESIS TESTING
learning objectives:
» to understand the role of significance
» to distinguish the null and alternative
hypotheses
» to interpret p-value, type I and II errors
Hypothesis testing
• Hypotheses are defined as formal statements of
explanations stated in a testable form.
• To test statistical hypotheses two presumptions are
made to draw the inference from sample value.
• Logic- designed to detect significant differences:
differences that did not occur by random chance.
Formulate hypotheses → Collect data to test hypotheses → Accept hypothesis or Reject hypothesis
Null and alternate hypothesis
1. Null Hypothesis (H0)
– The difference is caused by random chance.
– The H0 always states there is “no significant difference.” It
means that there is no significant difference between the
population mean and the sample mean.
2. Alternate hypothesis (H1)
– “The difference is real”.
– (H1) always contradicts the H0.
• One (and only one) of these explanations must be true.
Testing of hypotheses
Type I and Type II errors: example

Decision        No disease      Disease
Not diagnosed   OK              Type II error
Diagnosed       Type I error    OK

Type I error: the person is treated but not harmed by the treatment.
Type II error: irreparable damage would be done.
Inference: to avoid a Type II error in this setting, use a high level of
significance (a larger α).
Significance
Probability that you reject the null hypothesis (in
favor of the alternative hypothesis) when the
null hypothesis is true:
α = Pr[reject H0 | H0 is true] = Pr[accept H1 | H0
is true]
What does this mean? If we set α = 0.05, then we accept a 5% chance of rejecting H0 when it is in fact true.
Power
Probability that you reject the null hypothesis (in
favor of the alternative hypothesis) when the
null hypothesis is false:
1–β = Pr[reject H0 | H1 is true] = Pr[accept H1 |
H1 is true]
P value
• Probability that a difference at least as large as that
found in the observed data would have occurred by
chance alone, i.e. if the null hypothesis were true.
• p-value is defined as the probability of obtaining a
result equal to or more extreme than what was
actually observed.
• P values evaluate how well the sample data support
that the null hypothesis is true.
• High P values: your data are likely with a true
null.
• Low P values: your data are unlikely with a
true null.
Type I error/ False positive conclusion
• Stating difference when there is no difference
• Probability (Type I Error) = α
• α is usually set at 1/20 or 0.05. It is never 0, and the p-value
should be below the value of ‘α’ for concluding statistical
significance.
• In a two-tailed test, the probability of a type I error is split between
the tails of the normal curve, i.e. 0.025 in either tail.
Type II Error/ false negative conclusion
• Stating no difference when actually there is one, i.e.
missing a true difference
• More likely to occur when the sample size is too small.
• Probability (Type II Error) = β
• Conventionally accepted to be 0.1 – 0.2
• Power of a study = (1 – β)
• Researchers consider a power of 0.8 – 0.9 (80-90%) as
satisfactory.
Cut off for p value
• Arbitrary cut-off of 0.05 (5% chance of a false +ve
conclusion).
• If p<0.05: statistically significant. Reject H0, accept
H1
• If p>0.05: statistically not significant. Do not reject H0
(the data do not support H1)
• When testing potentially harmful interventions, the ‘α’ value is
set below 0.05
One/Two sided p values
• If we are interested only in finding out whether the test
drug is better than the control drug, we put the α of
0.05 under only one tail of the distribution: this is called a
one-tailed test.
• To know whether one drug performs better or worse
than the other, we distribute the α of 0.05 to both
tails of the distribution, i.e. 0.025 to each tail: a two-
tailed test.
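The difference between one-tailed and two-tailed testing can be made concrete with a short Python sketch. It assumes the test statistic is a z value that follows a standard normal distribution under H0; the function name `p_values_from_z` and the value z = 1.8 are illustrative assumptions, not figures from the presentation.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (one-sided upper-tail p, two-sided p) for a z statistic.

    Assumes the statistic is standard normal under H0 (sketch only).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # One-tailed: all of alpha sits in the upper tail (test drug better).
    one_sided = 1.0 - std_normal.cdf(z)
    # Two-tailed: alpha is split between both tails (better or worse).
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_sided, two_sided

# Hypothetical z statistic chosen for illustration.
one_sided, two_sided = p_values_from_z(1.8)
```

With this hypothetical z, the one-sided p-value falls below 0.05 while the two-sided p-value does not, which is why the choice of a one- or two-tailed test should be fixed before the data are seen.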
The Curve for Two- vs. One-tailed Tests at α = .05:
Two-tailed test: “is there a significant difference?”
One-tailed tests: “is the sample mean greater than µ or Pu?” or
“is the sample mean less than µ or Pu?”
Confidence level and interval
CONFIDENCE INTERVAL
A range of values so constructed that there
is a specified probability of including the
true value of a parameter within it
CONFIDENCE LEVEL
• Probability of including the true value of
a parameter within a confidence interval
• Percentage
CONFIDENCE LIMITS
• The two extreme values between which the true
value of the parameter is expected to lie
• End points of the confidence interval
• Larger confidence – Wider interval
• A point estimate is a single number
• A confidence interval contains a certain
set of possible values of the parameter
[Figure: a point estimate lying between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval]
Confidence level is usually set at 95%
(1 – α) = 0.95
95% CI corresponds to hypothesis testing with
P < 0.05
Margin of Error
$$ME = z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$$
Ways to reduce the margin of error
• Reduce the SD (σ↓)
• Increase the sample size (n↑)
• Lower the confidence level (1 – α) ↓
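The effect of each of these three levers can be checked numerically. The sketch below evaluates the margin of error as z(α/2) × σ/√n under the assumption of a known population SD and a normal sampling distribution; the function name `margin_of_error` and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error z_{alpha/2} * sigma / sqrt(n) (known-SD sketch)."""
    alpha = 1.0 - confidence
    # Critical value leaving alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(n)

# Hypothetical baseline: SD of 10 units, 100 participants, 95% confidence.
baseline = margin_of_error(sigma=10, n=100)
smaller_sd = margin_of_error(sigma=5, n=100)             # sigma reduced
larger_n = margin_of_error(sigma=10, n=400)              # n increased
lower_conf = margin_of_error(sigma=10, n=100, confidence=0.90)
```

Each of the three variations gives a smaller margin of error than the baseline; note that quadrupling the sample size only halves the margin, because n enters through its square root.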
Sample size and Power calculation
Recipe for most common formulation
1. Specify hypothesis
2. Specify the significance level (α)
3. Specify the effect size that is clinically relevant
4. Specify the power (1-β)
5. Use appropriate software / formulae to determine the
minimum sample size
A more preferable approach:
4. Specify sample sizes you can reasonably test (resources,
ethics, etc.)
5. Use appropriate software / formulae to determine the
power
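Both recipes above can be sketched in Python for one common case. The sketch assumes a two-arm trial comparing means with equal group sizes, a common known SD and a two-sided test, and uses the standard normal-approximation formula n = 2(z(1-α/2) + z(1-β))² σ²/δ² per group. This formula, the function names and the example values are assumptions for illustration; the presentation does not state which formula it uses.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Minimum sample size per group for a two-sided comparison of means.

    delta: minimal clinically relevant difference; sigma: common SD.
    Normal approximation; illustrative sketch only.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_beta) * sigma / delta) ** 2)

def power_for_n(n, delta, sigma, alpha=0.05):
    """Approximate power when n participants per group are feasible."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    standard_error = sigma * sqrt(2.0 / n)
    return z.cdf(delta / standard_error - z_alpha)

# Hypothetical inputs: detect a difference of 5 units when the SD is 10.
required_n = n_per_group(delta=5, sigma=10)          # first recipe
achieved_power = power_for_n(n=30, delta=5, sigma=10)  # second recipe
```

The first call follows the first recipe (fix power, solve for n); the second follows the alternative approach (fix the feasible n, report the power), which here shows that 30 participants per group would be underpowered for this effect size.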
Selecting the minimal clinically
relevant effect size
Could base it on previous data:
• Published data
• Pilot study
• Expert scientific opinion
Sample Size Formula
• The formula requires that we
(i)specify the amount of confidence we wish to have, (ii)
estimate the variance in the population, and
(iii) specify the level of desired accuracy we want.
• When we specify the above, the formula tells us what
sample size we need to use….n
PROCEDURE FOR CALCULATING SAMPLE SIZE
There are 3 procedures that could be used for
calculating sample size:
1. Use of formulae
2. Ready made tables
3. Computer software
Three types of analysis
• Univariate analysis
– the examination of the distribution of cases on only one
variable at a time (e.g., weight of college students)
• Bivariate analysis
– the examination of two variables simultaneously (e.g., the
relation between gender and weight of college students )
• Multivariate analysis
– the examination of more than two variables simultaneously
(e.g., the relationship between gender, race and weight of
college students)
Purpose of diff. types of analysis
• Univariate analysis
– Purpose: mainly description
• Bivariate analysis
– Purpose: determining the empirical relationship
between the two variables
• Multivariate analysis
– Purpose: determining the empirical relationship
among multiple variables
Statistics
Descriptive:
• Collecting
• Organising
• Summarising
• Presenting data
Inferential:
• Making inference
• Hypothesis testing
• Determining relationships
• Making predictions
UNIVARIATE ANALYSIS
DESCRIPTIVE STATISTICS
1) Measures of central tendency
Mean
Median
Mode
2) Measures of dispersion
Range
Variance
Standard deviation
INFERENTIAL STATISTICS
1) Z test
2) T test
3) Chi-square test
• Multivariate analysis:
It helps to ensure that the results are not biased
or influenced by other factors that are not
accounted for.
• Choice of technique to be used:
Chi-square test for two nominal variables
Correlation test to assess the relationship between two
interval or ratio measures.
Survival analysis
SURVIVAL:
• It is the probability of remaining alive for a
specific length of time.
• Our point of interest : prognosis of disease i.e
5 year survival
What is survival analysis?
• Statistical methods for analyzing longitudinal data on
the occurrence of events.
• Events may include death,onset of illness, recovery
from illness (binary variables) or failure etc.
• Accommodates data from randomized clinical trial or
cohort study design.
Need for survival analysis:
– Investigators frequently must analyze data before all
patients have died; otherwise, it may be many years
before they know which treatment is better.
– Survival analysis gives patients credit for how long
they have been in the study, even if the outcome has
not yet occurred.
– The Kaplan–Meier procedure is the most commonly
used method to illustrate survival curves.
Objectives of survival analysis:
• To estimate time-to-event for a group of individuals
• To compare time-to-event between two or more
groups
Survival Analysis: Terms
• Time-to-event: The time from entry into a study
until a subject has a particular outcome.
• Censoring: Subjects are said to be censored if they
are lost to follow up or drop out of the study, or if
the study ends before they die or have an outcome
of interest. They are counted as alive or disease-
free for the time they were enrolled in the study.
CENSORING:
• Subjects are said to be censored
– if they are lost to follow up
– drop out of the study,
– if the study ends before they die or have an outcome
of interest.
• They are counted as alive or disease-free for the
time they were enrolled in the study.
• In simple words, some important information
required to make a calculation is not available to
us. i.e. censored.
Types of censoring:
Three types of censoring:
• Right censoring
• Left censoring
• Interval censoring
Right Censoring:
• Right censoring is the most common type of concern.
• It means that we are not certain what happened to
people after some point in time.
• This happens when some people cannot be
followed the entire time because they died or were
lost to follow-up or withdrew from the study.
Left Censoring:
• Left censoring is when we are not certain what
happened to people before some point in time.
• The commonest example is when people already
have the disease of interest when the study starts.
Interval/Random Censoring:
• Interval/random censoring is when we know that something
happened in an interval (i.e. not before the starting time and not
after the ending time of the study), but do not know exactly when
in the interval it happened.
• For example, we know that the patient was well at the time of
start of the study and was diagnosed with disease at the time of
end of the study, so when did the disease actually begin?
• All we know is the interval.
Importance of censoring in survival
analysis
• Example:
we want to know the survival rates of a disease in two groups,
and our outcome of interest is death due to the disease.

Group 1                     Group 2
Time in months   Event      Time in months   Event
5                death      9                death
6                death      8                death
8                death      12               death
9                death      20               death
10               death      6                death
12               death      7                death
16               death      4                death

These data do not need survival analysis methods, because there
are no censored data. In this case all patients died, so we can
take the mean time to death and see which group has the longer
survival time.
Also, the data shouldn’t have >50% censored observations.
SURVIVAL FUNCTION:
Let T = time of death (disease), and let F(t) = Pr(T ≤ t) be its
cumulative distribution function.
• Survival function S(t) = 1 − F(t)
= Pr(alive at time t)
= Pr(T > t)
In simple terms it can be defined as
S(t) = (No. of pts. surviving longer than ‘t’) / (Total no. of pts.)
Kaplan-Meier estimate of survival
function:
• Calculate the survival of study population.
• Easy to calculate.
• Non-parametric estimate of the survival function.
• Commonly used to compare two study populations.
• Applicable to small,moderate and large samples.
Beginning of study End of study Time in
months 
Subject B
Subject A
Subject C
Subject D
Subject E
Survival Data (right-censored)
1. subject E
dies at 4
months
X
0
100
%
 Time in months
Corresponding Kaplan-Meier Curve
Probability
of surviving
to 4 months
is 100% =
5/5
Fraction
surviving
this death
= 4/5Subject E dies
at 4 months
4
Beginning of study End of study
 Time in
months 
Subject B
Subject A
Subject C
Subject D
Subject E
Survival Data
2. subject
A drops out
after 6
months
1. subject E
dies at 4
months
X
3. subject C
dies at 7
months
X
100
%
 Time in months
Corresponding Kaplan-Meier Curve
subject C
dies at 7
months
Fraction
surviving
this death
= 2/3
74
Beginning of study End of study
 Time in
months 
Subject B
Subject A
Subject C
Subject D
Subject E
Survival Data
2. subject
A drops out
after 6
months
4.
Subjects
B and D
survive
for the
whole
year-long
study
period1. subject E
dies at 4
months
X
3. subject C
dies at 7
months
X
12
100
%
 Time in months
Corresponding Kaplan-Meier Curve
Rule from probability theory:
P(A and B) = P(A) × P(B) if A and B are independent
In Kaplan-Meier, intervals are defined by failures (2 intervals
leading to failures here).
P(surviving intervals 1 and 2) = P(surviving interval 1) × P(surviving
interval 2)
Product-limit estimate of survival =
P(surviving interval 1 | at risk up to failure 1) ×
P(surviving interval 2 | at risk up to failure 2)
= 4/5 × 2/3 = 0.5333
The probability of surviving the entire year, taking
into account censoring
= (4/5) (2/3) = 53%
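The product-limit calculation above can be reproduced with a few lines of Python. This is a minimal sketch of the Kaplan-Meier estimator written for this example, not a replacement for a statistics package; the function name `kaplan_meier` and the encoding of the five subjects (follow-up time in months, with True meaning the death was observed and False meaning censored) are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate, illustrative sketch.

    times:  follow-up time for each subject
    events: True if the event (death) was observed, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Multiply by the fraction of those at risk who survive this death.
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after time t.
        at_risk -= leaving
        i += leaving
    return curve

# Subjects A, B, C, D, E from the example above.
times = [6, 12, 7, 12, 4]
events = [False, False, True, False, True]
curve = kaplan_meier(times, events)
```

The censored subject A contributes to the risk set at 4 months (5 at risk) but has left it by 7 months (3 at risk), which is how the estimate gives credit for the time a subject was followed.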
Properties of survival function:
1. Step function
2. Median survival time estimate (i.e. the survival time reached by 50% of pts.)
Reading two example survival curves:
• Median survival? 12 & 22
• Which has better survival? (the 2nd one)
• What proportion survives 20 weeks? (around 35% in the 1st
graph and around 62% in the 2nd one)
Limitations of Kaplan-Meier:
1. Must have >50% uncensored observations.
2. Median survival time.
3. Assumes that censoring occurs independently of survival
times (what if a person develops an adverse effect due to
some treatment and is forced to leave, or dies?).
Comparison between 2 survival curves
• Don’t make judgments simply on the
basis of the amount of separation
between two lines
• Several methods may be used to compare survival
curves:
– Logrank statistic
– Breslow statistic
– Tarone-Ware statistic
LOGRANK TEST:
• The logrank statistic is one of the most
commonly used methods to learn if two
curves are significantly different.
• This method is also known as the Mantel-logrank
statistic or the Cox-Mantel-logrank statistic.
• Under the H0 that the survival functions of the two
groups are the same, the logrank statistic is
approximately distributed as χ2
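A minimal Python sketch of the two-group logrank calculation follows. At each distinct event time it compares the observed deaths in group 1 with the number expected if both groups shared one survival function, and refers the resulting statistic to a χ2 distribution with 1 degree of freedom. The function name `logrank_test` is an assumption for this example, and the data reuse the two groups from the censoring example earlier in the presentation (all events observed).

```python
from math import erfc, sqrt

def logrank_test(times1, events1, times2, events2):
    """Two-group logrank test (sketch). Returns (chi-square, p-value), 1 df."""
    pooled = ([(t, e, 1) for t, e in zip(times1, events1)] +
              [(t, e, 2) for t, e in zip(times2, events2)])
    event_times = sorted({t for t, e, _ in pooled if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for tt, _, g in pooled if tt >= t and g == 1)  # at risk
        n2 = sum(1 for tt, _, g in pooled if tt >= t and g == 2)
        d1 = sum(1 for tt, e, g in pooled if tt == t and e and g == 1)
        d2 = sum(1 for tt, e, g in pooled if tt == t and e and g == 2)
        n, d = n1 + n2, d1 + d2
        # Observed minus expected deaths in group 1 under H0.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value

group1 = [5, 6, 8, 9, 10, 12, 16]
group2 = [9, 8, 12, 20, 6, 7, 4]
all_died = [True] * 7
chi2, p_value = logrank_test(group1, all_died, group2, all_died)
```

The Breslow and Tarone-Ware statistics listed earlier use the same observed-minus-expected terms but weight them differently across event times; that weighting is not shown here.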
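Hazard function:
• Complements the survival function: it describes the rate of
failure rather than the probability of survival
• The hazard function is the negative derivative of the
survival function over time, divided by the survival
function: h(t) = −[dS(t)/dt] / S(t)
• Instantaneous risk of the event at time t
(conditional failure rate)
• Over a short interval, h(t) × (interval length) approximates the
probability that a person will die in the next interval of time,
given that he survived until the beginning of the interval.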
Summary of survival analysis
• Survival analysis is used to estimate time-to-event for a group of
individuals and to compare time-to-event between two or more groups.
• In survival analysis, data are classified as censored or uncensored
• All those who achieve the outcome of interest are “uncensored” data
• Those who do not achieve the outcome are “censored” data
• The log-rank test is used to compare 2 survival curves but does not
control for confounding.
Meta-analysis
• Forest plots display the results of meta-analysis
graphically
• The plot is sometimes said to have been named after a breast
cancer researcher called Pat Forrest, and as a result the
name has sometimes been spelt "forrest plot"
Intention-to-treat analysis
• The intention-to-treat (ITT) principle states that every patient randomized to the clinical study should enter
the primary analysis. Accordingly, patients who drop out prematurely, are non-compliant to the study
treatment, or even take the wrong study treatment, are included in the primary analysis within the
respective treatment group they have been assigned to at randomization (“as randomized”).
Consequently, in an analysis according to the ITT principle, the original randomization and the number of
patients in the treatment groups remain unchanged, the analysis population is as complete as possible,
and a potential bias due to exclusion of patients is avoided. Thus, the patient set used for the primary
analysis according to the ITT principle is called “full analysis set”.
There are only some specific reasons that might cause an exclusion of a patient from the full analysis set:
– no treatment was applied at all
– there are no data available after randomization
• In addition, the ICH E9 guideline mentions “failure of major entry criteria” as a reason for exclusion.
However, as these major entry criteria are quite specific and only valid under certain circumstances, they
are not commonly used for the definition of a full analysis set
Per protocol analysis
• While an analysis according to the ITT principle aims to preserve the original randomization
and to avoid potential bias due to exclusion of patients, the aim of a per-protocol (PP)
analysis is to identify a treatment effect which would occur under optimal conditions; i.e. to
answer the question: what is the effect if patients are fully compliant? Therefore, some
patients (from the full analysis set) need to be excluded from the population used for the PP
analysis (PP population)
• Usually, this applies to patients fulfilling any of the following criteria:
– non-availability of measurements of the primary endpoint
– insufficient exposure to study treatment
• There might be further criteria for selecting a PP population; however, the following approaches are
essential:
– The assignment to the PP analysis set needs to take place prior to the analysis (if
possible in a blinded manner).
– Deviations that might be affected by the actual treatment should not be used as
exclusion criteria: e.g., “premature discontinuation from the study” might not be a good
choice of criterion for exclusion from the PP analysis, if this discontinuation was due to
lack of efficacy (and therefore associated with the treatment received)
• Both approaches, the ITT and the PP approach, are valid but have different roles in the
analysis of clinical studies. Let’s come back to the question at the beginning of this article:
What is worse, scenario A (claim a non-existing effect) or B (neglect an existing effect)?
When to stop a clinical trial
• During a clinical trial, we can perform interim analysis (or DMC, DSMB review) for
three different reasons:
1. Interim analysis for safety
– with a pre-specified stopping rule (for example, stop the trial if we see a
pre-specified number of cases of Serious Adverse Events)
– without a pre-specified stopping rule (rely on DMC members to review the overall
safety)
2. Interim analysis for efficacy: to see if the new treatment is overwhelmingly
better than control; if so, stop the trial for efficacy
3. Interim analysis for futility: to see if the new treatment is unlikely to beat the
control; if so, stop the trial for futility. This is called ‘futility analysis’.
• In situations 2 and 3, the criteria for the stopping rule for efficacy could be different
from the stopping rule for futility, but need to be pre-specified.
Chapter 28 clinical trials

Chapter 28 clincal trials

  • 1.
  • 2.
    Flow of presentation •Clinical Trials • Design of RCT - Hypothesis testing - Sample size calculation - Statistical analysis • Survival analysis - Kaplan Meier curve • Meta-analysis - Forest plot
  • 3.
    Clinical Trial • Aclinical trial tests potential interventions in humans to determine if the intervention represents an advance and should be adopted for general use FDA 2003
  • 4.
    What do clinicaltrials test? • Clinical trials test research hypotheses • Good clinical trials test specific research hypothesis • A clinical research hypothesis is a carefully formulated assumption developed in order to test its logical consequences
  • 7.
    Phase I trial •First evaluation of a new therapy in humans 1. Classical Goals: Identify dose limiting toxicities (DLT) Identify maximum tolerated dose(MTD) Assess pharmacokinetics(PK) Assess pharmacodynamic(PD)endpoints
  • 9.
    Phase II trial •To define antitumor activity • To further demonstrate safety • To gain new insights into the pharmacokinetics, pharmacodynamics & metabolism of drugs • To evaluate biologic correlates which may predict response or resistance to treatment and/or toxicity
  • 10.
    • Phase IItrials are exploratory studies and rarely are definitive • Efficient to exclude inactive therapies • Results are interpreted cautiously, in the context of the availability of other therapies • Estimate clinical activity and provide further safety information – important in the “go/no go” decision • Require confirmation in pivotal phase III trials
  • 12.
    Phase III trial •Comparative Trials with or without controls • Primary goal is to establish actual clinical value 1. Survival (OAS, EFS,PFS) are primary endpoints 2. Compares new treatment to current standard of care 3. Randomized (with allocation concealment) to minimize bias 4. May be sometimes placebo controlled and even blinded
  • 15.
    Phase IV trial •Post-marketing surveillance studies • Assess long-term toxic effects & risk-benefit ratio
  • 16.
    Design of RCTTrial Select suitable population(target population) Select suitable sample (experimental or study population) Make necessary exclusions(not eligible/no consent) Randomise Experimental group Control group Follow up & assessment
  • 17.
    Definition of RCT •RCT is a study in which a group of investigators study an intervention in a series of individuals who receive the intervention in a random order. • Intervention to be tested is called the experimental group • The other group of participants is called the control group. • The control can be conventional practice, a placebo, or no intervention at all
  • 18.
    Why to randomize •We need to analyse groups at the end of the trial • To ensure that difference in groups is because of the Rx • For this you need comparable groups at the start of trial • Purpose of randomization is to make the treatment groups comparable
  • 19.
    Value of randomization •It reduces the risk of serious imbalance in unknown but important factors that could influence the clinical course of the participants.
  • 20.
    Types of Randomization •Common types of randomization methods are: 1. Simple randomization 2. Stratified randomization – randomize within pre-decided groups (based on possible prognostic factors) 3. Block (restricted) randomization – randomize within a block of x patients – keeps numbers roughly equal between groups 4. Cluster randomization – randomize groups of subjects rather than individual subjects
  • 21.
    Blinding • The bestway to protect a trial against bias is by keeping the people involved in the trial unaware of the identity of the interventions for as long as possible
  • 22.
    • RCTs accordingto whether the investigators and participants know which intervention is being assessed – Open trials – Single blind trials – Double blind trials – Triple blind trials
  • 23.
    • Formats ofRCTs 1. Parallel group 2. Cross over 3. Factorial
  • 26.
    Factorial Design • Factorialdesigns allow for researchers to test multiple interventions or treatment combinations in a single study. • The simplest form of this design is a 2x2 factorial design. • Looks like a “grid” • Used to effectively test multiple treatments in a single study. • More efficient and more statistically powerful than multiple single intervention studies. Dose Cycle Statin Rosuvastatin (Crestor) Atorvastatin (Lipitor) 3x Per Week M LDL M LDL Everyday M LDL M LDL
  • 28.
  • 29.
    HYPOTHESIS TESTING learning objectives: »to understand the role of significance » to distinguish the null and alternative hypotheses » to interpret p-value, type I and II errors
  • 30.
    Hypothesis testing • Hypothesesare defined as formal statements of explanations stated in a testable form. • To test statistical hypotheses two presumptions are made to draw the inference from sample value. • Logic- designed to detect significant differences: differences that did not occur by random chance. Formulate hypotheses Collect data to test hypotheses Accept hypothesis Reject hypothesis
  • 31.
    Null and alternatehypothesis 1. Null Hypothesis (H0) – The difference is caused by random chance. – The H0 always states there is “no significant difference.” It means that there is no significant difference between the population mean and the sample mean. 2. Alternate hypothesis (H1) – “The difference is real”. – (H1) always contradicts the H0. • One (and only one) of these explanations must be true.
  • 33.
    Testing of hypotheses TypeI and Type II Errors. Example. Decision No disease Disease Not diagnosed OK Type II error Diagnosed Type I error OK treated but not harmed by the treatment irreparable damage would be done Inference: to avoid Type error II, have high level of significance
  • 34.
    Significance Probability that youreject the null hypothesis (in favor of the alternative hypothesis) when the null hypothesis is true: α = Pr[reject H0 | H0 is true] = Pr[accept H1 | H0 is true] What does this mean? If we set α = 0.05, then
  • 35.
    Power Probability that youreject the null hypothesis (in favor of the alternative hypothesis) when the null hypothesis is false: 1–β = Pr[reject H0 | H1 is true] = Pr[accept H1 | H1 is true]
  • 36.
    P value • Probabilitythat difference at least as large as those found in the observed data would have occurred by chance. • p-value is defined as the probability of obtaining a result equal to or more extreme than what was actually observed. • P values evaluate how well the sample data support that the null hypothesis is true.
  • 37.
    • High Pvalues: your data are likely with a true null. • Low P values: your data are unlikely with a true null.
  • 38.
    Type I error/False positive conclusion • Stating difference when there is no difference • Probability (Type I Error) =  • Usually set at 1/20 or 0.05. never 0 and it should be below the value of ‘α’ for concluding statistical significance. • The probability of a type I error is distributed at the tails of the normal curve i.e. 0.025 on either tail.
  • 39.
    Type II Error/false negative conclusion • Stating no difference when actually there is i.e. missing a true difference • Occurs when sample size is too small. • Probability (Type II Error) =  • Conventionally accepted to be 0.1 – 0.2 • Power of a study =(1- ) • Researchers consider a power 0.8 – 0.9 (80-90%) as satisfactory.
  • 40.
    Cut off forp value • Arbitrary cut-off 0.05 (5% chance of a false +ve conclusion. • If p<0.05 statistically significant- Reject H0, Accept H1 • If p>0.05 statistically not-significant- Accept H0, Reject H1 • Testing potential harmful interventions ‘α’ value is set below 0.05
  • 41.
    One/Two sided pvalues • If we are interested only to find out whether the test drug is better than the control drug, we put the α of 0.05 under only one tail of hypothesis - called one tailed test. • To know whether one drug performs better or worse than the other, we would distribute the of 0.05 to both tails under the hypothesis i.e. 0.025 to each tail – two tailed test.
  • 42.
    The Curve forTwo- vs. One-tailed Tests at α = .05: Two-tailed test: “is there a significant difference?” One-tailed tests: “is the sample mean greater than µ or Pu?” “is the sample mean less than µ or Pu?”
  • 43.
  • 44.
    CONFIDENCE INTERVAL A rangeof values so constructed that there is a specified probability of including the true value of a parameter within it
  • 45.
    CONFIDENCE LEVEL • Probabilityof including the true value of a parameter within a confidence interval • Percentage
  • 46.
    CONFIDENCE LIMITS • Twoextreme measurements within which an observation lies • End points of the confidence interval • Larger confidence – Wider interval
  • 47.
    • A pointestimate is a single number • A confidence interval contains a certain set of possible values of the parameter Point Estimate Lower Confidence Limit Upper Confidence LimitWidth of confidence interval
  • 49.
    Confidence level isusually set at 95% (1– ) = 0.95 95% CI corresponds to hypothesis testing with P <0.05
  • 50.
  • 51.
    Margin of error •Reduce the SD (σ↓) • Increase the sample size (n↑) • Narrow confidence level (1 – ) ↓
  • 52.
    Sample size andPower calculation
  • 53.
    Recipe for mostcommon formulation 1. Specify hypothesis 2. Specify the significance level (α) 3. Specify the effect size that is clinically relevant 4. Specify the power (1-β) 5. Use appropriate software / formulae to determine the minimum sample size A more preferable approach: 4. Specify sample sizes you can reasonably test (resources, ethics, etc.) 5. Use appropriate software / formulae to determine the power
  • 54.
    Selecting the minimal clinically relevant effect size • Could base it on previous data: • Published data • Pilot study • Expert scientific opinion
  • 55.
    Sample Size Formula • The formula requires that we (i) specify the amount of confidence we wish to have, (ii) estimate the variance in the population, and (iii) specify the level of desired accuracy we want. • When we specify the above, the formula tells us what sample size, n, we need to use.
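The slide does not show the formula itself. One common form, assumed here for estimating a population mean with a normal approximation, combines the three inputs as follows. The symbols are notation introduced for this sketch: z_{1-α/2} reflects the chosen confidence, σ² is the estimated population variance, and E is the desired accuracy (the largest acceptable margin of error).

```latex
n = \left( \frac{z_{1-\alpha/2}\,\sigma}{E} \right)^{2}
```

For example, with 95% confidence (z ≈ 1.96), σ = 10 and E = 2, this gives n = (1.96 × 10 / 2)² ≈ 96.04, which is rounded up to 97 subjects.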
  • 57.
    PROCEDURE FOR CALCULATING SAMPLE SIZE • There are 3 procedures that could be used for calculating sample size: 1. Use of formulae 2. Ready-made tables 3. Computer software
  • 58.
    Three types of analysis • Univariate analysis – the examination of the distribution of cases on only one variable at a time (e.g., weight of college students) • Bivariate analysis – the examination of two variables simultaneously (e.g., the relation between gender and weight of college students) • Multivariate analysis – the examination of more than two variables simultaneously (e.g., the relationship between gender, race and weight of college students)
  • 59.
    Purpose of different types of analysis • Univariate analysis – Purpose: mainly description • Bivariate analysis – Purpose: determining the empirical relationship between the two variables • Multivariate analysis – Purpose: determining the empirical relationship among multiple variables
  • 60.
  • 61.
    UNIVARIATE ANALYSIS • DESCRIPTIVE STATISTICS: 1) Measures of central tendency: mean, median, mode 2) Measures of dispersion: range, variance, standard deviation • INFERENTIAL STATISTICS: 1) Z test 2) t test 3) Chi-square test
  • 62.
    • Multivariate analysis: It helps ensure that the results are not biased or influenced by other factors that are not accounted for. • Choice of technique to be used: – Chi-square test for two nominal variables – Correlation test to assess the relationship between two interval or ratio measures.
  • 63.
  • 64.
    SURVIVAL: • It is the probability of remaining alive for a specific length of time. • Our point of interest: prognosis of disease, e.g. 5-year survival
  • 65.
    What is survival analysis? • Statistical methods for analyzing longitudinal data on the occurrence of events. • Events may include death, onset of illness, recovery from illness (binary variables) or failure etc. • Accommodates data from randomized clinical trial or cohort study designs.
  • 66.
    Need for survival analysis: – Investigators frequently must analyze data before all patients have died; otherwise, it may be many years before they know which treatment is better. – Survival analysis gives patients credit for how long they have been in the study, even if the outcome has not yet occurred. – The Kaplan–Meier procedure is the most commonly used method to illustrate survival curves.
  • 67.
    Objectives of survival analysis: • Estimate time-to-event for a group of individuals • Compare time-to-event between two or more groups
  • 68.
    Survival Analysis: Terms • Time-to-event: The time from entry into a study until a subject has a particular outcome. • Censoring: Subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they die or have an outcome of interest. They are counted as alive or disease-free for the time they were enrolled in the study.
  • 69.
    CENSORING: • Subjects are said to be censored – if they are lost to follow-up, – if they drop out of the study, – if the study ends before they die or have an outcome of interest. • They are counted as alive or disease-free for the time they were enrolled in the study. • In simple words, some important information required to make a calculation is not available to us, i.e. it is censored.
  • 70.
    Three types of censoring: • Right censoring • Left censoring • Interval censoring
  • 71.
    Right Censoring: • Right censoring is the most common type and the one of most concern. • It means that we are not certain what happened to people after some point in time. • This happens when some people cannot be followed for the entire time because they died from a cause other than the event of interest, were lost to follow-up, or withdrew from the study.
  • 72.
    Left Censoring: • Left censoring is when we are not certain what happened to people before some point in time. • The commonest example is when people already have the disease of interest when the study starts.
  • 73.
    Interval/Random Censoring: • Interval/random censoring is when we know that something happened in an interval (i.e. not before the starting time and not after the ending time of the study), but do not know exactly when in the interval it happened. • For example, we know that the patient was well at the start of the study and was diagnosed with disease at the end of the study, so when did the disease actually begin? • All we know is the interval.
  • 75.
    Importance of censoring in survival analysis • Example: we want to know the survival rates of a disease in two groups, and our outcome of interest is death due to the disease.
    Time to death in months (the event was death for every patient):
      Group 1: 5, 6, 8, 9, 10, 12, 16
      Group 2: 9, 8, 12, 20, 6, 7, 4
    • These data do not need survival analysis methods, because there are no censored observations: all patients died, so we can compare the mean time to death to see which group survived longer. • Also, the data should not have >50% censored observations.
  • 76.
    SURVIVAL FUNCTION: Let T = time of death (or disease) • Survival function S(t) = 1 − F(t) = prob.(alive at time t) = prob.(T > t), where F(t) = prob.(T ≤ t) is the cumulative distribution function of T • In simple terms it can be defined as S(t) = (no. of pts. surviving longer than ‘t’) / (total no. of pts.)
  • 77.
    Kaplan-Meier estimate of survival function: • Calculates the survival of the study population. • Easy to calculate. • Non-parametric estimate of the survival function. • Commonly used to compare two study populations. • Applicable to small, moderate and large samples.
  • 78.
    [Figure: Survival data (right-censored) for subjects A to E; x-axis: time in months, from beginning of study to end of study] 1. Subject E dies at 4 months
  • 79.
    [Figure: Corresponding Kaplan-Meier curve; y-axis starts at 100%, x-axis: time in months] • Subject E dies at 4 months • Probability of surviving up to 4 months is 100% = 5/5 • Fraction surviving this death = 4/5
  • 80.
    [Figure: Survival data for subjects A to E; x-axis: time in months, from beginning of study to end of study] 1. Subject E dies at 4 months 2. Subject A drops out after 6 months 3. Subject C dies at 7 months
  • 81.
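    [Figure: Corresponding Kaplan-Meier curve; y-axis starts at 100%, x-axis: time in months] • Subject C dies at 7 months • Fraction surviving this death = 2/3 (only 3 subjects are still at risk, because subject A was censored at 6 months)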
  • 82.
    [Figure: Survival data for subjects A to E; x-axis: time in months, from beginning of study to end of study] 1. Subject E dies at 4 months 2. Subject A drops out after 6 months 3. Subject C dies at 7 months 4. Subjects B and D survive for the whole year-long study period
  • 83.
    [Figure: Corresponding Kaplan-Meier curve; y-axis starts at 100%, x-axis: time in months, 0 to 12] • Rule from probability theory: P(A and B) = P(A) × P(B) if A and B are independent • In Kaplan-Meier, intervals are defined by failures (here, 2 intervals, each ending in a failure) • P(surviving intervals 1 and 2) = P(surviving interval 1) × P(surviving interval 2) • Product-limit estimate of survival = P(surviving interval 1 | at risk up to failure 1) × P(surviving interval 2 | at risk up to failure 2) = 4/5 × 2/3 = 0.5333 • The probability of surviving the entire year, taking censoring into account, is (4/5) × (2/3) ≈ 53%
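The product-limit calculation for the five subjects can be reproduced with a short sketch. The function name and the data encoding (1 = death, 0 = censored) are assumptions made for this illustration; subjects B and D are treated as censored at 12 months, when the study ends.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] by the product-limit rule.

    `events` holds 1 for a death and 0 for a censored observation.
    Subjects censored at time t still count as at risk at time t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the fraction surviving this failure time.
            survival *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Subjects from the slides, in the order E, A, C, B, D.
times = [4, 6, 7, 12, 12]
events = [1, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(f"month {t}: S(t) = {s:.4f}")
```

The curve drops to 0.8 at month 4 and to about 0.5333 at month 7, matching 4/5 × 2/3 above; the censored subjects never cause a drop, they only shrink the number at risk.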
  • 84.
    Properties of survival function: 1. Step function 2. Gives an estimate of the median survival time (i.e. the time at which 50% of pts. are still surviving)
  • 85.
    Median survival? 12 and 22 • Which has better survival? (the 2nd one) • What proportion survives 20 weeks? (around 35% in the 1st graph and around 62% in the 2nd one)
  • 86.
    Limitations of Kaplan-Meier: 1. Must have >50% uncensored observations. 2. Median survival time. 3. Assumes that censoring occurs independently of survival times (what if a person develops an adverse effect of the treatment and is forced to leave the study, or dies?)
  • 87.
    Comparison between 2 survival curves • Don’t make judgments simply on the basis of the amount of separation between two lines
  • 88.
    Comparison between 2 survival curves: • Several methods may be used to compare survival curves: – Log-rank statistic – Breslow statistic – Tarone-Ware statistic
  • 89.
    LOGRANK TEST: • The log-rank statistic is one of the most commonly used methods to learn whether two curves are significantly different. • This method is also known as the Mantel-logrank statistic or the Cox-Mantel-logrank statistic. • Under H0 (the survival functions of the two groups are the same), the log-rank statistic approximately follows a χ2 distribution
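A minimal sketch of the two-group log-rank test follows, assuming the usual construction: at each event time, the observed deaths in group 1 are compared with the number expected under H0, and the squared total difference is divided by its hypergeometric variance. The function name, the 1/0 event encoding and the example data are hypothetical; a validated statistics package is preferable in practice.

```python
from math import erfc, sqrt


def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test; events are 1 (event) or 0 (censored).

    Returns (chi-square statistic, p value on 1 degree of freedom).
    """
    group1 = list(zip(times1, events1))
    group2 = list(zip(times2, events2))
    event_times = sorted({t for t, e in group1 + group2 if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for u, _ in group1 if u >= t)  # at risk in group 1
        n2 = sum(1 for u, _ in group2 if u >= t)  # at risk in group 2
        d1 = sum(1 for u, e in group1 if u == t and e == 1)
        d2 = sum(1 for u, e in group2 if u == t and e == 1)
        n, d = n1 + n2, d1 + d2
        # Expected group 1 deaths under H0 are d * n1 / n.
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))

    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical follow-up times in months (1 = death, 0 = censored).
stat, p = logrank_test(
    [6, 7, 10, 15, 19, 25], [1, 1, 0, 1, 1, 0],
    [12, 18, 22, 30, 34, 36], [1, 0, 1, 0, 1, 0],
)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

Because the test uses the whole follow-up period rather than a single time point, it addresses the warning above about judging curves by their visual separation alone.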
  • 90.
    Hazard function: • Complements the survival function: it describes failure rather than survival • The hazard function is the negative derivative of the survival function over time, divided by the survival function: h(t) = −[dS(t)/dt] / S(t) • Instantaneous risk of the event at time t (conditional failure rate) • Over a short interval, it approximates the probability, per unit time, that a person will die in the next interval of time, given that he survived until the beginning of the interval.
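In standard notation, the link between the hazard and the survival function can be written out as below. This is the textbook relationship for a continuous survival time T, not a formula taken from the slides.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = -\frac{1}{S(t)} \frac{dS(t)}{dt}
     = -\frac{d}{dt} \ln S(t),
\qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

The minus sign reflects that S(t) can only decrease over time, so the hazard is never negative.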
  • 92.
    Summary of survival analysis • Survival analysis is used to estimate time-to-event for a group of individuals and to compare time-to-event between two or more groups. • Survival data are divided into censored and uncensored observations • All those who achieve the outcome of interest are “uncensored” data • Those who do not achieve the outcome are “censored” data • The log-rank test is used to compare 2 survival curves but does not control for confounding.
  • 93.
  • 96.
    • Forest plots display the results of meta-analysis graphically • The plot is sometimes said to have been named after a breast cancer researcher called Pat Forrest, and as a result the name has sometimes been spelt "forrest plot"
  • 111.
    Intention-to-treat analysis • The intention-to-treat principle defines that every patient randomized to the clinical study should enter the primary analysis. Accordingly, patients who drop out prematurely, are non-compliant to the study treatment, or even take the wrong study treatment, are included in the primary analysis within the respective treatment group they have been assigned to at randomization (“as randomized”). Consequently, in an analysis according to the ITT principle, the original randomization and the number of patients in the treatment groups remain unchanged, the analysis population is as complete as possible, and a potential bias due to exclusion of patients is avoided. Thus, the patient set used for the primary analysis according to the ITT principle is called the “full analysis set”. • There are only some specific reasons that might cause an exclusion of a patient from the full analysis set: – no treatment was applied at all – there are no data available after randomization • In addition, the ICH E9 guideline mentions “failure of major entry criteria” as a reason for exclusion. However, as these major entry criteria are quite specific and only valid under certain circumstances, they are not commonly used for the definition of a full analysis set
  • 112.
    Per-protocol analysis • While an analysis according to the ITT principle aims to preserve the original randomization and to avoid potential bias due to exclusion of patients, the aim of a per-protocol (PP) analysis is to identify a treatment effect which would occur under optimal conditions; i.e. to answer the question: what is the effect if patients are fully compliant? Therefore, some patients (from the full analysis set) need to be excluded from the population used for the PP analysis (PP population) • Usually, this applies to patients fulfilling any of the following criteria: – non-availability of measurements of the primary endpoint – insufficient exposure to study treatment • There might be further criteria for selecting a PP population; however, the following approaches are essential: – The assignment to the PP analysis set needs to take place prior to the analysis (if possible in a blinded manner). – Deviations that might be affected by the actual treatment should not be used as exclusion criteria: e.g., “premature discontinuation from the study” might not be a good choice of criterion for exclusion from the PP analysis, if this discontinuation was due to lack of efficacy (and therefore associated with the treatment received) • Both approaches, the ITT and the PP approach, are valid but have different roles in the analysis of clinical studies. Let’s come back to the question at the beginning of this article: What is worse, scenario A (claim a non-existing effect) or B (neglect an existing effect)?
  • 113.
    When to stop a clinical trial • During a clinical trial, we can perform an interim analysis (or DMC/DSMB review) for three different reasons: 1. Interim analysis for safety: a) with a pre-specified stopping rule (for example, stop the trial if a pre-specified number of Serious Adverse Events is seen) b) without a pre-specified stopping rule (rely on DMC members to review the overall safety) 2. Interim analysis for efficacy: to see if the new treatment is overwhelmingly better than control, in which case the trial is stopped for efficacy 3. Interim analysis for futility: to see if the new treatment is unlikely to beat the control, in which case the trial is stopped for futility; this is called ‘futility analysis’ • In situations 2 and 3, the stopping rule for efficacy can differ from the stopping rule for futility, but both need to be pre-specified.