Statistical Methods in Clinical
Research
Dr Ranjith P
DNB Resident, ACME Pariyaram,
Kerala
Overview
 Data types
 Summarizing data using descriptive statistics
 Standard error
 Confidence Intervals
 P values
 Alpha and Beta errors
 Statistics for comparing 2 or more groups with
continuous data
 Non-parametric tests
 Regression and Correlation
 Risk Ratios and Odds Ratios
 Survival Analysis
 Cox Regression
 Forest plot
 PICOT
Types of Data
 Discrete data: a limited number of choices
   Binary: two choices (yes/no)
     Dead or alive
     Disease-free or not
   Categorical (nominal): more than two choices, not ordered
     Race
   Ordinal: more than two choices, ordered
     Stages of a cancer
     Age group
     Likert scale for response, e.g. strongly agree, agree, neither agree nor disagree, etc.
Types of data
 Continuous data
   Theoretically infinite possible values (within physiologic limits), including fractional values
     Height, age, weight
   Can be interval
     The interval between measures has meaning.
     The ratio of two interval data points has no meaning.
     Examples: temperature in Celsius, day of the year.
   Can be ratio
     The ratio of the measures has meaning.
     Examples: weight, height.
Types of Data
 Why important?
 The type of data defines:
   The summary measures used
     Mean and standard deviation for continuous data
     Proportions for discrete data
   The statistics used for analysis, for example:
     t-test for normally distributed continuous data
     Wilcoxon rank-sum test for non-normally distributed continuous data
Descriptive Statistics
 Characterize the data set
   Graphical presentation
     Histograms
     Frequency distribution
     Box and whiskers plot
   Numeric description
     Mean, median, SD, interquartile range
Histogram
Continuous Data
No segmentation of data into groups
Frequency Distribution
Segmentation of data into groups
Discrete or continuous data
Box and Whisker Plots
Popular in Epidemiologic Studies
Useful for presenting comparative data graphically
Numeric Descriptive Statistics
 Measures of central tendency of data
   Mean
   Median
   Mode
 Measures of variability of data (dispersion)
   Standard deviation, mean deviation
   Interquartile range, variance
Mean
 Most commonly used measure of central tendency
 Best applied to normally distributed continuous data
 Not applicable to categorical data
 Definition:
   Sum of all the values in a sample, divided by the number of values
 Example: mean height of 6 adolescent children: 146, 142, 150, 148, 156, 140
   Mean = 882 / 6 = 147
Median
 Used to indicate the “average” in a skewed population
 Often reported with the mean
   If the mean and the median are close, the distribution is roughly symmetric, as a normal distribution is.
 It is the middle value from an ordered listing of the values
   If there is an odd number of values, it is the middle value: for 1, 2, 3, 4, 5 it is 3
   If there is an even number of values, it is the average of the two middle values: for 1, 2, 3, 4, 5, 6 it is (3 + 4)/2 = 3.5
 It is the 50th percentile and lies within the interquartile range
Mode
 Infrequently reported as a value in studies.
 Is the most common value, e.g. in 1, 3, 8, 9, 5, 8, 5, 6
   5 and 8 each occur twice, so this list has two modes (5 and 8)
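A short sketch of the median and mode examples, using Python's standard `statistics` module. `multimode` returns every value tied for most frequent, which matters here because the example list has two values that each occur twice.

```python
from statistics import median, multimode

odd_values = [1, 2, 3, 4, 5]
even_values = [1, 2, 3, 4, 5, 6]
mode_values = [1, 3, 8, 9, 5, 8, 5, 6]

median_odd = median(odd_values)    # the single middle value
median_even = median(even_values)  # average of the two middle values
modes = multimode(mode_values)     # all values tied for the highest count

print(median_odd, median_even, sorted(modes))
```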
Interquartile range
 Is the range of data from the 25th percentile
to the 75th percentile
 Common component of a box and whiskers
plot
 It is the box, and the line across the box is the
median or middle value
 Rarely, mean will also be displayed.
Mean deviation and standard deviation
 Mean deviation = Σ|X − X̄| / n
   n is the number of observations, X̄ is the mean, X is each observation
 Mean square deviation = variance = Σ(X − X̄)² / n
 Root mean square deviation = standard deviation (SD) = √( Σ(X − X̄)² / n )
Variance
 Square of the SD (standard deviation)
 Coefficient of variation = (SD / mean) × 100
 E.g. if SD is 3 and mean is 150:
   Variance is 9, coefficient of variation is 300/150 = 2%
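The dispersion measures can be sketched as follows. As an assumption for illustration, the data are the six heights from the mean example; the formulas divide by n as written on the slide (many texts divide by n − 1 when estimating from a sample).

```python
from math import sqrt

values = [146, 142, 150, 148, 156, 140]  # reused heights, illustrative only
n = len(values)
x_bar = sum(values) / n

mean_deviation = sum(abs(x - x_bar) for x in values) / n
variance = sum((x - x_bar) ** 2 for x in values) / n  # divides by n, as on the slide
sd = sqrt(variance)
cv_percent = sd / x_bar * 100

# The slide's own example: SD = 3, mean = 150
slide_variance = 3 ** 2
slide_cv_percent = 3 / 150 * 100

print(round(mean_deviation, 2), round(variance, 2), round(sd, 2), round(cv_percent, 2))
print(slide_variance, slide_cv_percent)
```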
Standard Error
 A fundamental goal of statistical analysis is to
estimate a parameter of a population based
on a sample
 The values of a specific variable from a
sample are an estimate of the entire
population of individuals who might have
been eligible for the study.
 A measure of the precision of a sample
Standard Error
 Standard error of the mean
   Standard deviation / square root of (sample size), i.e. SD / √n
   (if sample greater than 60)
 Important: dependent on sample size
   The larger the sample, the smaller the standard error
Clarification
 Standard Deviation measures the
variability or spread of the data in an
individual sample.
 Standard error measures the precision
of the estimate of a population
parameter provided by the sample
mean or proportion.
Standard Error
 Significance:
   Is the basis of confidence intervals
   A 95% confidence interval is defined by
     Sample mean (or proportion) ± 1.96 × standard error
   Since standard error is inversely related to the square root of the sample size:
     The larger the study (sample size), the narrower the confidence interval and the greater the precision of the estimate.
 For normally distributed data:
   Mean ± 1 SD covers 68.27% of values
   Mean ± 2 SD covers 95.45% of values
   Mean ± 3 SD covers 99.73% of values
   Mean ± 4 SD covers 99.99% of values
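The coverage figures and the 1.96 multiplier can be reproduced from the standard normal distribution. This is a sketch using `statistics.NormalDist` from the standard library; the SD of 10 and the sample sizes are hypothetical values chosen only to show how the standard error shrinks as n grows.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def coverage(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD / sqrt(n)."""
    return sd / sqrt(n)

z_95 = std_normal.inv_cdf(0.975)  # multiplier for a two-sided 95% interval

for k in (1, 2, 3, 4):
    print(f"mean ± {k} SD covers {coverage(k):.2%}")

# Hypothetical SD of 10: quadrupling the sample size halves the standard error
se_small = standard_error(10, 25)
se_large = standard_error(10, 100)
print(round(z_95, 2), se_small, se_large)
```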
Confidence Intervals
 May be used to assess a single point
estimate such as mean or proportion.
 Most commonly used in assessing the
estimate of the difference between two
groups.
Confidence Intervals
Commonly reported in studies to provide an estimate of the precision
of the mean.
P Values
 The probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true
 Typically, a result with a p value of 0.05 or less is considered “statistically significant”, i.e. unlikely to have occurred by chance alone, and the null hypothesis is rejected
 The threshold used is an arbitrary convention
   A p value of 0.05 equals a 1 in 20 chance
   A p value of 0.01 equals a 1 in 100 chance
   A p value of 0.001 equals a 1 in 1000 chance
Errors
 Type I error
 Claiming a difference between two
samples when in fact there is none.

Remember there is variability among samples-
they might seem to come from different
populations but they may not.
 Also called the α error.
 Typically 0.05 is used
Errors
 Type II error
 Claiming there is no difference between
two samples when in fact there is.
 Also called a β error.
 The probability of not making a Type II
error is 1 - β, which is called the power of
the test.
 Hidden error because can’t be detected
without a proper power analysis
Errors
                          Test result: H0 (null)   Test result: H1 (alternative)
Truth: H0 (null)          No error                 Type I error (α)
Truth: H1 (alternative)   Type II error (β)        No error
Sample Size Calculation
 Also called “power analysis”.
 When designing a study, one needs to
determine how large a study is needed.
 Power is the ability of a study to avoid a
Type II error.
 Sample size calculation yields the
number of study subjects needed, given
a certain desired power to detect a
difference and a certain level of P value
that will be considered significant.
Sample Size Calculation
 Depends on:
 Level of Type I error: 0.05 typical
 Level of Type II error: 0.20 typical
 One sided vs two sided: nearly always two
 Inherent variability of population

Usually estimated from preliminary data
 The difference that would be meaningful
between the two assessment arms.
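How these inputs combine can be sketched with one common textbook approximation for comparing two means with equal-sized groups and a two-sided test: n per group = 2 (z for 1 − α/2 + z for 1 − β)² σ² / Δ². The slides do not give a formula, so this choice is an assumption, and the values of σ (variability) and Δ (meaningful difference) below are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(sigma: float, delta: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate subjects per arm for a two-sided comparison of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = z.inv_cdf(1 - beta)        # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical inputs: SD of 10 units, meaningful difference of 5 units
n_per_group = sample_size_per_group(sigma=10, delta=5)
n_larger_difference = sample_size_per_group(sigma=10, delta=10)
print(n_per_group, n_larger_difference)
```

A smaller meaningful difference or a larger variability pushes the required sample size up, which is why both appear in the list of inputs.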
One-sided vs. Two-sided
 Most tests should be framed as a two-sided test.
   When comparing two samples, we usually cannot be sure which is going to be better.
     You never know which direction study results will go.
 For routine medical research, use only two-sided tests.
Statistical Tests
 Parametric tests
 Continuous data normally distributed
 Non-parametric tests
 Continuous data not normally distributed
 Categorical or Ordinal data
Comparison of 2 Sample Means
 Student’s t test
   Assumes normally distributed continuous data.
   t value = (difference between means) / (standard error of the difference)
   The t value is then looked up in a table to determine significance
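A minimal sketch of the t value for two independent samples, assuming equal variances (the pooled form of Student's t test). The two small samples are made-up numbers chosen so the arithmetic is easy to follow; the resulting t and degrees of freedom would then be looked up in a t table.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Return (t value, degrees of freedom) for two independent samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se_diff = sqrt(pooled_var * (1 / na + 1 / nb))  # standard error of the difference
    return (mean(a) - mean(b)) / se_diff, df

group_a = [1, 2, 3, 4, 5]  # hypothetical measurements
group_b = [3, 4, 5, 6, 7]
t_value, df = pooled_t(group_a, group_b)
print(t_value, df)
```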
Paired T Tests
 Uses the change before
and after intervention in a
single individual
 Reduces the degree of
variability between the
groups
 Given the same number
of patients, has greater
power to detect a
difference between groups
Analysis of Variance (ANOVA)
 Used to determine if two or more samples are from the same population
   If two samples, is the same as the t test.
   Usually used for 3 or more samples.
Non-parametric Tests
 Testing proportions
   (Pearson’s) Chi-Squared (χ2) Test
   Fisher’s Exact Test
 Testing ordinal variables
   Mann-Whitney “U” Test
   Kruskal-Wallis One-way ANOVA
 Testing ordinal paired variables
   Sign Test
   Wilcoxon Signed Rank Test
Use of non-parametric tests
 Use for categorical, ordinal or non-normally
distributed continuous data
 May check both parametric and non-
parametric tests to check for congruity
 Most non-parametric tests are based on
ranks or other non- value related methods
 Interpretation:
 Is the P value significant?
(Pearson’s) Chi-Squared (χ2) Test
 Used to compare observed proportions of an
event compared to expected.
 Used with nominal data (better/ worse;
dead/alive)
 If there is a substantial difference between
observed and expected, then it is likely that
the null hypothesis is rejected.
 Often presented graphically as a 2 X 2 Table
Non-parametric tests
 For comparing 2 related samples: Wilcoxon Signed Rank Test
 For comparing 2 unrelated samples: Mann-Whitney U Test
 For comparing more than 2 groups: Kruskal-Wallis Test
Mann–Whitney U test
 The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum, or Wilcoxon–Mann–Whitney test) is a non-parametric test of the null hypothesis that two samples come from the same population, against the alternative that one population tends to have larger values than the other.
 It has greater efficiency than the t-test on non-normal distributions, such as a mixture of normal distributions, and it is nearly as efficient as the t-test on normal distributions.
STUDENT T TEST
 A t-test is any statistical hypothesis test in which the test statistic follows a Student’s t distribution if the null hypothesis is supported.
 It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known.
 The Kaplan–Meier estimator,also known
as the product limit estimator, is
an estimator for estimating the survival
function from lifetime data.
 In medical research, it is often used to
measure the fraction of patients living for a
certain amount of time after treatment.
 The estimator is named after Edward L.
Kaplan and Paul Meier.
 A plot of the Kaplan–Meier
estimate of the survival function is
a series of horizontal steps of
declining magnitude which, when
a large enough sample is taken,
approaches the true survival
function for that population.
 ODDS RATIO
In a case-control study, it is a measure of the strength of the association between a risk factor and an outcome
Odds ratio
              Lung cancer (cases)   No lung cancer (controls)
Smokers       33 (a)                55 (b)
Non-smokers   2 (c)                 27 (d)
Total         35 (a+c)              82 (b+d)
 Odds ratio = ad/bc
   = (33 × 27) / (55 × 2)
   = 8.1
 i.e. the odds of lung cancer are 8.1 times higher in smokers than in non-smokers
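The cross-product calculation can be written as a small helper. A minimal sketch; the function name is illustrative, and the cell labels a, b, c, d follow the table.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio ad/bc for a 2 x 2 case-control table."""
    return (a * d) / (b * c)

# a = smoker cases, b = smoker controls, c = non-smoker cases, d = non-smoker controls
or_smoking = odds_ratio(a=33, b=55, c=2, d=27)
print(round(or_smoking, 1))
```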
RELATIVE RISK
 Measure of risk in a cohort study
 RR = Incidence of disease among exposed / Incidence among non-exposed
Cigarette smoking   Developed lung cancer   Did not develop lung cancer   Total
Yes                 70 (a)                  6930 (b)                      7000 (a+b)
No                  3 (c)                   2997 (d)                      3000 (c+d)
 Incidence among smokers = 70/7000 = 10/1000
 Incidence among non-smokers = 3/3000 = 1/1000
 Total incidence = 73/10000 = 7.3/1000
 Relative risk of lung cancer = 10/1 = 10
 Incidence of lung cancer is 10 times higher in the exposed group (smokers), i.e. there is a positive association with smoking
 The larger the RR, the greater the strength of the association
Attributable risk
 It is the difference in incidence rates of disease between the exposed group (EG) and the non-exposed group (NEG)
 Often expressed in percent
 AR = ((Incidence of disease in EG − Incidence of disease in NEG) / Incidence of disease in EG) × 100
 AR = ((10 − 1)/10) × 100 = 90%
 i.e. 90% of lung cancers in smokers were due to their smoking
Population attributable risk
 PAR = ((Incidence of the disease in the total population − Incidence of disease among those who were not exposed to the suspected causal factor) / Incidence of disease in the total population) × 100
 PAR = ((7.3 − 1)/7.3) × 100 = 86.3%, i.e. 86.3% of the disease could be avoided if risk factors like cigarettes were avoided
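The relative risk, attributable risk and population attributable risk from the cohort table can be recomputed together. A minimal sketch with illustrative function and variable names.

```python
def incidence_per_1000(cases: int, population: int) -> float:
    return cases / population * 1000

exposed = incidence_per_1000(70, 7000)          # smokers
unexposed = incidence_per_1000(3, 3000)         # non-smokers
total = incidence_per_1000(70 + 3, 7000 + 3000)

relative_risk = exposed / unexposed
attributable_risk_pct = (exposed - unexposed) / exposed * 100
population_attributable_risk_pct = (total - unexposed) / total * 100

print(round(relative_risk, 1),
      round(attributable_risk_pct, 1),
      round(population_attributable_risk_pct, 1))
```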
Mortality rates & Ratios
 Crude death rate (CDR)
   Number of deaths (from all causes) per 1000 estimated mid-year population (MYP) in one year in a given place
   CDR = (No. of deaths during the year / MYP) × 1000
   E.g. CDR in Panchayath A is 15.2/1000 population; in Panchayath B it is 8.2/1000 population
   Health status of Panchayath B is better than that of A
 Specific death rate = (No. of deaths due to a specific disease during a calendar year / MYP) × 1000
   Death rates can be calculated separately by:
     Disease, e.g. TB and HIV: 2/1000 and 1/1000 respectively
     Age group, e.g. 5-20 yrs and <5 yrs: 1/1000 and 3/3000 respectively
     Sex, e.g. more in males
     Specific months, etc.
Case fatality rate (ratio)
 (Total no. of deaths due to a particular disease / Total no. of cases of the same disease) × 100
 Usually described in acute infectious diseases
   Dengue, cholera, food poisoning, etc.
 Represents the killing power of the disease
Proportional mortality rate (ratio)
 Due to a specific disease = (No. of deaths from the specific disease in a year / Total deaths in that year) × 100
 Under-5 proportional mortality rate = (No. of deaths under 5 years of age in a given year / Total no. of deaths during the same period) × 100
Survival rate
 (Total no. of patients alive after 5 yrs / Total no. of patients diagnosed or treated) × 100
 Used to describe prognosis in certain disease conditions, mainly cancers
INCIDENCE
 Number of new cases occurring in a defined population during a specified period of time
 Incidence = (No. of new cases of a specific disease during a given time period / Population at risk) × 1000
 E.g. 500 new cases of TB in a population of 30,000: incidence is (500/30,000) × 1000,
   i.e. 16.7/1000/yr, expressed as an incidence rate
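The incidence example works out as below, using the population of 30,000 stated in the example. The helper name `rate_per` is illustrative; the same pattern (events divided by population, times a multiplier) applies to the other rates in this section.

```python
def rate_per(events: int, population: int, per: int = 1000) -> float:
    """Events per `per` people in the population at risk."""
    return events / population * per

tb_incidence = rate_per(500, 30_000)  # new TB cases per 1000 per year
print(round(tb_incidence, 1))
```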
Incidence-uses
 Can be expressed as special incidence rate, attack rate, hospital admission rate, case rate, etc.
 Measures the rate at which new cases are occurring in a population
 Not influenced by duration
 Generally, use is restricted to acute conditions
PREVALENCE
 Refers specifically to all current cases (old & new) existing at a given point of time, or over a period of time, in a given population
 Although referred to as a rate, it is really a ratio
 Point prevalence = (No. of all current cases (old & new) of a specified disease existing at a given point of time / Estimated population at the same point of time) × 100
 Period prevalence = (No. of existing cases (old & new) of a specified disease during a given period of time / Estimated mid-interval population at risk) × 100
 Incidence - 3,4,5,8
 Point prevalence at jan 1- 1,2& 7
 Point prevalence at Dec 31- 1,3,5&8
 Period prevalence(jan-Dec)-
1,2,3,4,5,7&8
Relationship between Incidence & Prevalence
 Prevalence = Incidence × Mean duration
 P = I × D, I = P/D, D = P/I
 E.g. Incidence = 10 cases/1000 population/yr
   Mean duration = 5 yrs
   Prevalence = 10 × 5 = 50/1000 population
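The three rearrangements of P = I × D can be sketched as follows. The relationship is an approximation that holds when incidence and duration are stable over time; the function names are illustrative.

```python
def prevalence(incidence: float, mean_duration: float) -> float:
    return incidence * mean_duration

def incidence_from(prevalence_value: float, mean_duration: float) -> float:
    return prevalence_value / mean_duration

def duration_from(prevalence_value: float, incidence: float) -> float:
    return prevalence_value / incidence

p = prevalence(incidence=10, mean_duration=5)  # cases per 1000 population
print(p, incidence_from(p, 5), duration_from(p, 10))
```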
PREVALENCE-USES
 Helps to estimate the magnitude of health/disease problems in the community, & identify potential high-risk populations
 Prevalence rates are especially useful for administrative and planning purposes, e.g. hospital beds, manpower needs, rehabilitation facilities, etc.
Statistical significance
 P value (hypothesis)
 95% CI (Interval)
P value & its interpretation
 Often described as “the probability of a type I error”; more precisely, it is the chance of seeing a difference or association at least this large when actually there is none
 Example 1: study of the prevalence of obesity in male & female children in a classroom of 50 students
   Of 25 boys: 10 obese
   Of 25 girls: 16 obese
   p value: 0.02
   Null hypothesis: “no difference in obesity among boys & girls in the classroom”
 Example 2: study of bubble vs conventional CPAP for prevention of extubation failure (EF) in preterm very low birth weight infants
   EF with bCPAP = 4(16)
   EF with cCPAP = 9(16)
   p value: 0.14
   Null hypothesis: “no difference in EF among preterm babies treated with bCPAP & cCPAP”
95% CI
 95% CI = Mean ± 1.96 SE (approximately Mean ± 2 SE), where SE = SD/√n
 Example: 100 children attending pediatric OP.
   Mean wt = 15 kg, SD = 2
   95% CI = ?
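One way to work the example, assuming the interval is built from the standard error (SD/√n) as in the earlier Standard Error slides. A minimal sketch with illustrative variable names.

```python
from math import sqrt

mean_wt = 15.0  # kg
sd = 2.0        # kg
n = 100

se = sd / sqrt(n)            # standard error of the mean
lower = mean_wt - 1.96 * se
upper = mean_wt + 1.96 * se

print(f"SE = {se:.2f} kg, 95% CI = {lower:.2f} to {upper:.2f} kg")
```

With these numbers the interval runs from about 14.6 kg to about 15.4 kg.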
Interpretation of 95%CI
 If a study is repeated 100 times, about 95 of the resulting intervals will contain the true population mean.
 If the CIs of 2 groups overlap, the chance of a significant difference is low.
Measures Of Risk
 case control study- Odds ratio
 Cohort study -RR,AR
Chi-Squared (χ2) Test
 Chi-Squared (χ2) formula: χ2 = Σ (O − E)² / E, where O is the observed and E the expected count in each cell
 Not applicable in small samples
   If fewer than 5 expected observations per cell, use Fisher’s exact test
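A sketch of Pearson's chi-squared statistic for a 2 × 2 table, applied to the smoking and lung cancer counts from the odds ratio slide. Expected counts come from the row and column totals, no continuity correction is applied, and the p value uses the closed form for 1 degree of freedom. The function name is illustrative.

```python
from math import erfc, sqrt

def chi_squared_2x2(table):
    """Return (chi-squared, p value) for a 2 x 2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Rows: smokers, non-smokers; columns: cases, controls
chi2, p_value = chi_squared_2x2([[33, 55], [2, 27]])
print(round(chi2, 2), p_value < 0.05)
```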
BREAK
Correlation
 Assesses the linear relationship between two variables
   Example: height and weight
 Strength of the association is described by a correlation coefficient, r
   r = 0 - 0.2: low, probably meaningless
   r = 0.2 - 0.4: low, possible importance
   r = 0.4 - 0.6: moderate correlation
   r = 0.6 - 0.8: high correlation
   r = 0.8 - 1: very high correlation
 Can be positive or negative
 Pearson’s and Spearman’s correlation coefficients
 Tells nothing about causation
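Pearson's r can be computed directly from its definition. This is a sketch in plain Python; the height and weight values are hypothetical and serve only to show a strong positive correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [150, 160, 170, 180, 190]  # hypothetical, cm
weights = [50, 58, 66, 72, 84]       # hypothetical, kg

r = pearson_r(heights, weights)
r_perfect = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_negative = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
print(round(r, 3), r_perfect, r_negative)
```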
Correlation
Source: Harris and Taylor. Medical Statistics Made Easy
Correlation
Perfect Correlation
Source: Altman. Practical Statistics for Medical Research
Regression
 Based on fitting a line to data
   Provides a regression coefficient, which is the slope of the line
     Y = aX + b, where a is the slope and b is the intercept
 Used to predict a dependent variable’s value based on the value of an independent variable.
   Very helpful: in analysis of height and weight, for a known height, one can predict weight.
 Much more useful than correlation
   Allows prediction of values of Y rather than just whether there is a relationship between two variables.
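A least-squares fit of Y = aX + b can be sketched in a few lines. The height and weight data are hypothetical, and ordinary least squares is assumed as the fitting method since the slides do not name one.

```python
def fit_line(xs, ys):
    """Return (slope a, intercept b) of the least-squares line Y = aX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

heights = [150, 160, 170, 180, 190]  # hypothetical, cm
weights = [50, 58, 66, 72, 84]       # hypothetical, kg

a, b = fit_line(heights, weights)
predicted_weight = a * 175 + b  # predicted weight for a known height of 175 cm
print(round(a, 2), round(b, 1), round(predicted_weight, 1))
```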
Regression
 Types of regression
   Linear: uses continuous data to predict a continuous outcome
   Logistic: uses continuous data to predict the probability of a dichotomous outcome
   Poisson regression: counts (rates) of rare events
   Cox proportional hazards regression: survival analysis
Multiple Regression Models
 Determining the association between two
variables while controlling for the values of
others.
 Example: Uterine Fibroids
 Both age and race impact the incidence of
fibroids.
 Multiple regression allows one to test the impact of
age on the incidence while controlling for race
(and all other factors)
Multiple Regression Models
 In published papers, the multivariable models are
more powerful than univariable models and take
precedence.
 Therefore we discount the univariable model as it does not
control for confounding variables.
 Eg: Coronary disease is potentially affected by age, HTN,
smoking status, gender and many other factors.
 If assessing whether height is a factor:

If it is significant on univariable analysis, but not on
multivariable analysis, these other factors confounded the
analysis.
Survival Analysis
 Evaluation of time to an event (death, recurrence, recovery).
 Provides a means of handling censored data
   Patients who do not reach the event by the end of the study or who are lost to follow-up
 Most common type is Kaplan-Meier analysis
   Curves presented as stepwise change from baseline
   There are no fixed intervals of follow-up; the survival proportion is recalculated after each event.
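The recalculation after each event can be sketched as a small product-limit (Kaplan-Meier) routine. The follow-up times and event flags below are hypothetical; `True` means the event occurred and `False` means the patient was censored at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival proportion)] at each time an event occurs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]  # True counts as 1, censored as 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # censored patients leave the risk set without a step
    return curve

times = [2, 3, 3, 5, 8, 10]                        # hypothetical months of follow-up
events = [True, True, False, True, False, True]    # False = censored
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to the time they leave, which is how the method uses incomplete follow-up without treating it as an event.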
Survival Analysis
Source: Altman. Practical Statistics for Medical Research
Kaplan-Meier Curve
Source: Wikipedia
Kaplan-Meier Analysis
 Provides a graphical means of comparing the
outcomes of two groups that vary by intervention or
other factor.
 Survival rates can be measured directly from curve.
 Difference between curves can be tested for
statistical significance.
Cox Regression Model
 Proportional hazards survival model.
 Used to investigate the relationship between an event (death, recurrence) occurring over time and possible explanatory factors.
 Reported result: hazard ratio (HR).
   Ratio of the hazard in one group divided by the hazard in another.
   Interpreted the same as risk ratios and odds ratios
     HR = 1: no effect
     HR > 1: increased risk
     HR < 1: decreased risk
Cox Regression Model
 Common use in long-term studies
where various factors might predispose
to an event.
 Example: after uterine embolization, which
factors (age, race, uterine size, etc) might
make recurrence more likely.
True disease state vs. Test result
                     Test: H0 not rejected               Test: H0 rejected
No disease (D = 0)   Correct; specificity                Type I error (false +), α
Disease (D = 1)      Type II error (false −), β          Correct; power 1 − β; sensitivity
Specific Example
[Figure: distributions of test results for patients with the disease and patients without the disease]
Threshold
[Figure: a threshold on the test result; patients below it are called “negative”, patients above it are called “positive”]
Some definitions ...
 True positives: patients with the disease called “positive”
 False positives: patients without the disease called “positive”
 True negatives: patients without the disease called “negative”
 False negatives: patients with the disease called “negative”
Moving the Threshold: right
Moving the Threshold: left
[Figures: the same two distributions with the threshold between ‘−’ and ‘+’ moved right and moved left]
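The effect of moving the threshold can be sketched with a handful of made-up test results. The scores are hypothetical, and a patient is called “positive” when the result is at or above the threshold (an assumption for this illustration). Each threshold gives one point on the ROC curve: (1 − specificity, sensitivity).

```python
def sens_spec(diseased, healthy, threshold):
    """Return (sensitivity, specificity) when results >= threshold are called positive."""
    tp = sum(x >= threshold for x in diseased)  # true positives
    fn = len(diseased) - tp                     # false negatives
    fp = sum(x >= threshold for x in healthy)   # false positives
    tn = len(healthy) - fp                      # true negatives
    return tp / (tp + fn), tn / (tn + fp)

diseased = [5, 6, 7, 8, 9]  # hypothetical results, patients with the disease
healthy = [2, 3, 4, 5, 6]   # hypothetical results, patients without the disease

middle = sens_spec(diseased, healthy, threshold=6)
moved_right = sens_spec(diseased, healthy, threshold=8)  # fewer false positives, more false negatives
moved_left = sens_spec(diseased, healthy, threshold=4)   # fewer false negatives, more false positives

for sens, spec in (moved_left, middle, moved_right):
    print(f"sensitivity={sens:.1f} specificity={spec:.1f} ROC point=({1 - spec:.1f}, {sens:.1f})")
```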
ROC curve
[Figure: ROC curve plotting true positive rate (sensitivity) against false positive rate (1 − specificity), both axes from 0% to 100%]
ROC curve comparison
[Figure: ROC curves of a good test and a poor test]
ROC curve extremes
[Figure: best test, where the distributions don’t overlap at all; worst test, where the distributions overlap completely]
FOREST PLOT
 An example forest plot of five odds
ratios (squares) with the summary
measure (centre line of diamond)
and associated confidence
intervals (lateral tips of diamond),
and solid vertical line of no effect.
Names of (fictional) studies are
shown on the left, odds ratios and
 A forest plot (or blobbogram) is a graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of randomized controlled trials.
 i. Probably a small study, with a wide CI, crossing the line of no effect (OR = 1). Unable to say if the intervention works
 ii. Probably a small study, wide CI, but does not cross OR = 1; suggests intervention works but weak evidence
 iii. Larger study, narrow CI, but crosses OR = 1; no evidence that intervention works
 iv. Large study, narrow confidence intervals, entirely to left of OR = 1; suggests intervention works
 v. Small study, wide confidence intervals; suggests intervention is detrimental
 vi. Meta-analysis of all identified studies: suggests intervention works.
PICOT
 Used to frame questions in evidence-based research
 Population
 Intervention or issue
 Comparison with another intervention
 Outcome
 Time frame

More Related Content

What's hot

Seminaar on meta analysis
Seminaar on meta analysisSeminaar on meta analysis
Seminaar on meta analysisPreeti Rai
 
Statistical tests for categorical data
Statistical tests for categorical dataStatistical tests for categorical data
Statistical tests for categorical data
Rizwan S A
 
Sample determinants and size
Sample determinants and sizeSample determinants and size
Sample determinants and size
Tarek Tawfik Amin
 
Sample Size Determination
Sample Size DeterminationSample Size Determination
Odds ratios (Basic concepts)
Odds ratios (Basic concepts)Odds ratios (Basic concepts)
Odds ratios (Basic concepts)
Tarekk Alazabee
 
Survival analysis & Kaplan Meire
Survival analysis & Kaplan MeireSurvival analysis & Kaplan Meire
Survival analysis & Kaplan Meire
Dr Athar Khan
 
Power Analysis and Sample Size Determination
Power Analysis and Sample Size DeterminationPower Analysis and Sample Size Determination
Power Analysis and Sample Size DeterminationAjay Dhamija
 
Study designs
Study designsStudy designs
Study designs
Dr Lipilekha Patnaik
 
How to determine sample size
How to determine sample size How to determine sample size
How to determine sample size
saifur rahman
 
A.6 confidence intervals
A.6  confidence intervalsA.6  confidence intervals
A.6 confidence intervalsUlster BOCES
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
Dinesh Chaurasiya
 
ODDS RATIO AND RELATIVE RISK EVALUATION
ODDS RATIO AND RELATIVE RISK EVALUATIONODDS RATIO AND RELATIVE RISK EVALUATION
ODDS RATIO AND RELATIVE RISK EVALUATION
Kanhu Charan
 
Systematic review and meta analaysis course - part 2
Systematic review and meta analaysis course - part 2Systematic review and meta analaysis course - part 2
Systematic review and meta analaysis course - part 2
Ahmed Negida
 
Non inferiority clinical trials
Non inferiority clinical trialsNon inferiority clinical trials
Non inferiority clinical trials
Francois MAIGNEN
 
Clinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-StatisticiansClinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-Statisticians
Brook White, PMP
 
Heterogeneity in meta-analysis
Heterogeneity in meta-analysisHeterogeneity in meta-analysis
Heterogeneity in meta-analysis
Rizwan S A
 

What's hot (20)

Bio stat
Bio statBio stat
Bio stat
 
Seminaar on meta analysis
Seminaar on meta analysisSeminaar on meta analysis
Seminaar on meta analysis
 
Statistical tests for categorical data
Statistical tests for categorical dataStatistical tests for categorical data
Statistical tests for categorical data
 
Meta analysis with R
Meta analysis with RMeta analysis with R
Meta analysis with R
 
Sample determinants and size
Sample determinants and sizeSample determinants and size
Sample determinants and size
 
SAMPLE SIZE, CONSENT, STATISTICS
SAMPLE SIZE, CONSENT, STATISTICSSAMPLE SIZE, CONSENT, STATISTICS
SAMPLE SIZE, CONSENT, STATISTICS
 
Sample Size Determination
Sample Size DeterminationSample Size Determination
Sample Size Determination
 
Odds ratios (Basic concepts)
Odds ratios (Basic concepts)Odds ratios (Basic concepts)
Odds ratios (Basic concepts)
 
Survival analysis & Kaplan Meire
Survival analysis & Kaplan MeireSurvival analysis & Kaplan Meire
Survival analysis & Kaplan Meire
 
Power Analysis and Sample Size Determination
Power Analysis and Sample Size DeterminationPower Analysis and Sample Size Determination
Power Analysis and Sample Size Determination
 
Study designs
Study designsStudy designs
Study designs
 
How to determine sample size
How to determine sample size How to determine sample size
How to determine sample size
 
A.6 confidence intervals
A.6  confidence intervalsA.6  confidence intervals
A.6 confidence intervals
 
Meta analysis
Meta analysisMeta analysis
Meta analysis
 
ODDS RATIO AND RELATIVE RISK EVALUATION
ODDS RATIO AND RELATIVE RISK EVALUATIONODDS RATIO AND RELATIVE RISK EVALUATION
ODDS RATIO AND RELATIVE RISK EVALUATION
 
Systematic review and meta analaysis course - part 2
Systematic review and meta analaysis course - part 2Systematic review and meta analaysis course - part 2
Systematic review and meta analaysis course - part 2
 
Survival analysis
Survival  analysisSurvival  analysis
Survival analysis
 
Non inferiority clinical trials
Non inferiority clinical trialsNon inferiority clinical trials
Non inferiority clinical trials
 
Clinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-StatisticiansClinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-Statisticians
 
Heterogeneity in meta-analysis
Heterogeneity in meta-analysisHeterogeneity in meta-analysis
Heterogeneity in meta-analysis
 

Viewers also liked

Bio statistics
Bio statisticsBio statistics
Bio statisticsNc Das
 
Management of epidemics
Management of epidemicsManagement of epidemics
Management of epidemicsNc Das
 
Mortality measurement
Mortality measurementMortality measurement
Mortality measurement
Abino David
 
Types of epidemics and epidemic investigations
Types of epidemics and epidemic investigationsTypes of epidemics and epidemic investigations
Types of epidemics and epidemic investigations
Tarek Tawfik Amin
 

Viewers also liked (8)

Bio statistics
Bio statisticsBio statistics
Bio statistics
 
Management of epidemics
Management of epidemicsManagement of epidemics
Management of epidemics
 
Mortality measurement
Mortality measurementMortality measurement
Mortality measurement
 
Types of epidemics and epidemic investigations
Types of epidemics and epidemic investigationsTypes of epidemics and epidemic investigations
Types of epidemics and epidemic investigations
 
Measures of mortality
Measures of mortalityMeasures of mortality
Measures of mortality
 
Statistical ppt
Statistical pptStatistical ppt
Statistical ppt
 
16
1616
16
 
Epidemiology ppt
Epidemiology pptEpidemiology ppt
Epidemiology ppt
 

Similar to bio statistics for clinical research

biostatistics
biostatisticsbiostatistics
biostatistics
Mehul Shinde
 
Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
Kiran Ramakrishna
 
Soni_Biostatistics.ppt
Soni_Biostatistics.pptSoni_Biostatistics.ppt
Soni_Biostatistics.ppt
Ogunsina1
 
Spss mahareak
Spss mahareakSpss mahareak
Spss mahareak
Ali Mahareak
 
Overview of different statistical tests used in epidemiological
Overview of different  statistical tests used in epidemiologicalOverview of different  statistical tests used in epidemiological
Overview of different statistical tests used in epidemiological
shefali jain
 
Biostatistics clinical research & trials
Biostatistics clinical research & trialsBiostatistics clinical research & trials
Biostatistics clinical research & trials
eclinicaltools
 
Parametric and Non parametric Tests.pptx
Parametric and Non parametric Tests.pptxParametric and Non parametric Tests.pptx
Parametric and Non parametric Tests.pptx
DrDeveshPandey1
 
TEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptxTEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptx
JoicePjiji
 
Critical Appriaisal Skills Basic 1 | May 4th 2011
Critical Appriaisal Skills Basic 1 | May 4th 2011Critical Appriaisal Skills Basic 1 | May 4th 2011
Critical Appriaisal Skills Basic 1 | May 4th 2011
NES
 
Parametric vs non parametric test
Parametric vs non parametric testParametric vs non parametric test
Parametric vs non parametric test
ar9530
 
Quantitative_analysis.ppt
Quantitative_analysis.pptQuantitative_analysis.ppt
Quantitative_analysis.ppt
mousaderhem1
 
A lesson on statistics
A lesson on statisticsA lesson on statistics
A lesson on statistics
Andrea Josephine
 
Data analysis powerpoint
Data analysis powerpointData analysis powerpoint
Data analysis powerpointjamiebrandon
 
Analysis and Interpretation
Analysis and InterpretationAnalysis and Interpretation
Analysis and Interpretation
Francisco J Grajales III
 
Bio-Statistics in Bio-Medical research
Bio-Statistics in Bio-Medical researchBio-Statistics in Bio-Medical research
Bio-Statistics in Bio-Medical research
Shinjan Patra
 

bio statistics for clinical research

  • 1. Statistical Methods in Clinical Research Dr Ranjith P DNB Resident ACME Pariyaram , Kerala
  • 2. Overview: data types; summarizing data using descriptive statistics; standard error; confidence intervals.
  • 3. Overview: P values; alpha and beta errors; statistics for comparing 2 or more groups with continuous data; non-parametric tests.
  • 4. Overview: regression and correlation; risk ratios and odds ratios; survival analysis; Cox regression.
  • 5. Overview: forest plot; PICOT.
  • 6. Types of Data. Discrete data have a limited number of choices. Binary: two choices (yes/no), e.g. dead or alive, disease-free or not. Categorical: more than two choices, not ordered, e.g. race, age group. Ordinal: more than two choices, ordered, e.g. stages of a cancer, Likert scale responses (strongly agree, agree, neither agree nor disagree, etc.).
  • 7. Types of Data. Continuous data: theoretically infinite possible values (within physiologic limits), including fractional values, e.g. height, age, weight. Can be interval: the interval between measures has meaning but the ratio of two data points does not (temperature in Celsius, day of the year). Can be ratio: the ratio of the measures has meaning (weight, height).
  • 8. Types of Data. Why important? The type of data defines the summary measures used (mean and standard deviation for continuous data; proportions for discrete data) and the statistics used for analysis, e.g. t-test for normally distributed continuous data, Wilcoxon rank sum for non-normally distributed continuous data.
  • 9. Descriptive Statistics. Characterize the data set. Graphical presentation: histograms, frequency distributions, box and whiskers plots. Numeric description: mean, median, SD, interquartile range.
  • 11. Frequency Distribution. Segmentation of data into groups; discrete or continuous data.
  • 13. Box and Whisker Plots. Popular in epidemiologic studies; useful for presenting comparative data graphically.
  • 14. Numeric Descriptive Statistics. Measures of central tendency: mean, median, mode. Measures of variability (dispersion): standard deviation, mean deviation, interquartile range, variance.
  • 15. Mean. Most commonly used measure of central tendency. Best applied to normally distributed continuous data; not applicable to categorical data. Definition: the sum of all the values in a sample, divided by the number of values.
  • 16. Example: mean height of 6 adolescent children measuring 146, 142, 150, 148, 156, 140. Answer: 882/6 = 147.
  • 17. Median. Used to indicate the “average” in a skewed population. Often reported with the mean. If the mean and the median are the same, the distribution is symmetric, which is consistent with (but does not prove) a normal distribution.
  • 18. The median is the middle value from an ordered listing of the values. With an odd number of values it is the middle value: for 1, 2, 3, 4, 5 it is 3. With an even number of values it is the average of the two middle values: for 1, 2, 3, 4, 5, 6 it is (3 + 4)/2 = 3.5. It is the 50th percentile, so it lies within the interquartile range.
  • 19. Mode. Infrequently reported as a value in studies. It is the most common value, and a data set can have more than one: in 1, 3, 8, 9, 5, 8, 5, 6 both 5 and 8 occur twice, so the modes are 5 and 8.
  • 20. Interquartile range. The range of data from the 25th percentile to the 75th percentile. A common component of a box and whiskers plot: it is the box, and the line across the box is the median (middle value). Rarely, the mean is also displayed.
  • 21.
  • 22. Mean deviation and standard deviation. Mean deviation = $\frac{\sum |X - \bar{X}|}{n}$, where $n$ is the number of observations, $\bar{X}$ is the mean and $X$ is each observation. Mean square deviation = variance = $\frac{\sum (X - \bar{X})^2}{n}$. Root mean square deviation = standard deviation (SD) = $\sqrt{\frac{\sum (X - \bar{X})^2}{n}}$.
  • 23. Variance. The square of the SD (standard deviation). Coefficient of variation = (SD / mean) × 100. E.g. if the SD is 3 and the mean is 150, the variance is 9 and the coefficient of variation is 300/150 = 2%.
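The numeric summaries on slides 15 to 23 can be checked with a short script. This is a minimal sketch, not part of the original deck: the function name `describe` is an assumption, and the variance divides by n to match the formula on slide 22 (many packages divide by n − 1 for a sample estimate).

```python
import math
from collections import Counter


def describe(values):
    """Return mean, median, modes, variance (divide by n), SD and CV in percent."""
    n = len(values)
    ordered = sorted(values)
    mean = sum(ordered) / n
    mid = n // 2
    # Median: middle value, or the average of the two middle values when n is even.
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    counts = Counter(ordered)
    top = max(counts.values())
    # Every value that reaches the highest frequency is a mode.
    modes = sorted(v for v, c in counts.items() if c == top)
    variance = sum((x - mean) ** 2 for x in ordered) / n
    sd = math.sqrt(variance)
    return {
        "mean": mean,
        "median": median,
        "modes": modes,
        "variance": variance,
        "sd": sd,
        "cv_percent": sd / mean * 100,
    }


# Heights of the six adolescents from slide 16.
heights = [146, 142, 150, 148, 156, 140]
stats = describe(heights)
print(stats["mean"], stats["median"])  # 147.0 147.0
```

Running `describe([1, 3, 8, 9, 5, 8, 5, 6])` on the data from slide 19 reports two modes, 5 and 8.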
  • 24. Standard Error. A fundamental goal of statistical analysis is to estimate a parameter of a population based on a sample. The values of a specific variable from a sample are an estimate for the entire population of individuals who might have been eligible for the study. The standard error is a measure of the precision of that sample estimate.
  • 25. Standard Error. Standard error of the mean = standard deviation / square root of the sample size = SD/√n (if the sample is greater than 60). Important: it depends on sample size; the larger the sample, the smaller the standard error.
  • 26. Clarification. Standard deviation measures the variability or spread of the data in an individual sample. Standard error measures the precision of the estimate of a population parameter provided by the sample mean or proportion.
  • 27. Standard Error. Significance: it is the basis of confidence intervals. A 95% confidence interval is defined by sample mean (or proportion) ± 1.96 × standard error. Since standard error is inversely related to the square root of the sample size, the larger the study (sample size), the narrower the confidence interval and the greater the precision of the estimate.
  • 28. For normally distributed data: mean ± 1 SD covers 68.27% of values; mean ± 2 SD covers 95.45%; mean ± 3 SD covers 99.73%; mean ± 4 SD covers 99.99%.
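The coverage figures for a normal distribution follow from the error function. The sketch below is illustrative only; `coverage_within` is a hypothetical helper name, and it assumes the data are exactly normally distributed.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 4):
    print(f"mean ± {k} SD covers {coverage_within(k) * 100:.2f}% of values")
```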
  • 29. Confidence Intervals. May be used to assess a single point estimate such as a mean or proportion. Most commonly used in assessing the estimate of the difference between two groups.
  • 30. Confidence Intervals. Commonly reported in studies to provide an estimate of the precision of the mean.
  • 31. P Values. The probability of obtaining a result at least as extreme as the one observed by chance alone, assuming that the null hypothesis is true. Typically, an estimate with a p value of 0.05 or less is considered “statistically significant”, i.e. unlikely to occur due to chance alone, and the null hypothesis is rejected.
  • 32. The P value threshold used is arbitrary. A P value of 0.05 equals a 1 in 20 chance; 0.01 equals 1 in 100; 0.001 equals 1 in 1000.
  • 33. Errors. Type I error: claiming a difference between two samples when in fact there is none. Remember there is variability among samples: they might seem to come from different populations but they may not. Also called the α error. Typically α = 0.05 is used.
  • 34. Errors. Type II error: claiming there is no difference between two samples when in fact there is. Also called a β error. The probability of not making a Type II error is 1 − β, which is called the power of the test. It is a hidden error because it cannot be detected without a proper power analysis.
  • 35. Errors (truth vs. test result). Truth H0, test result H0: no error. Truth H0, test result H1: Type I error (α). Truth H1, test result H0: Type II error (β). Truth H1, test result H1: no error.
  • 36. Sample Size Calculation. Also called “power analysis”. When designing a study, one needs to determine how large a study is needed. Power is the ability of a study to avoid a Type II error. Sample size calculation yields the number of study subjects needed, given a certain desired power to detect a difference and a certain level of P value that will be considered significant.
  • 37. Sample Size Calculation. Depends on: level of Type I error (0.05 typical); level of Type II error (0.20 typical); one-sided vs two-sided test (nearly always two); inherent variability of the population (usually estimated from preliminary data); the difference that would be meaningful between the two assessment arms.
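The inputs listed on slide 37 can be combined in one common textbook approximation for comparing two means, n per group = 2 × ((z for α/2 + z for β) × SD / difference)². The deck does not give a formula, so this choice is an assumption, as are the function name and the example numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sd, delta, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided comparison of two means.

    sd    : assumed standard deviation of the outcome (from preliminary data)
    delta : smallest difference between arms considered meaningful
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type II error
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)


# Hypothetical inputs: SD of 10 units, meaningful difference of 5 units.
print(sample_size_per_group(sd=10, delta=5))  # 63
```

Halving the meaningful difference roughly quadruples the required sample, which is why the choice of `delta` tends to dominate the calculation.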
  • 38. One-sided vs. Two-sided. Most tests should be framed as two-sided tests. When comparing two samples, we usually cannot be sure which is going to be better; you never know which direction study results will go. For routine medical research, use only two-sided tests.
  • 39. Statistical Tests. Parametric tests: continuous data, normally distributed. Non-parametric tests: continuous data not normally distributed; categorical or ordinal data.
  • 40. Comparison of 2 Sample Means. Student’s t test assumes normally distributed continuous data. t value = (difference between means) / (standard error of the difference). The t value is then looked up in a table to determine significance.
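The t value on slide 40 can be computed by hand for two independent samples. This sketch assumes the classic pooled-variance (equal variance) form of Student's t test; the sample values are made up for illustration, and the resulting t and degrees of freedom would still need to be looked up in a t table or passed to a statistics package.

```python
import math


def students_t(sample_a, sample_b):
    """Return (t value, degrees of freedom) using a pooled variance estimate."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean_a - mean_b) / se_diff, df


# Hypothetical measurements in two groups.
t_value, df = students_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_value, df)
```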
  • 41. Paired T Tests. Use the change before and after an intervention in a single individual. Reduce the degree of variability between the groups. Given the same number of patients, have greater power to detect a difference between groups.
  • 42. Analysis of Variance (ANOVA). Used to determine whether two or more samples are from the same population. With two samples it is the same as the t test. Usually used for 3 or more samples.
  • 43. Non-parametric Tests. Testing proportions: (Pearson’s) chi-squared (χ²) test, Fisher’s exact test. Testing ordinal variables: Mann-Whitney U test, Kruskal-Wallis one-way ANOVA. Testing ordinal paired variables: sign test, Wilcoxon signed rank test.
  • 44. Use of non-parametric tests. Use for categorical, ordinal or non-normally distributed continuous data. May run both parametric and non-parametric tests to check for congruity. Most non-parametric tests are based on ranks or other non-value related methods. Interpretation: is the P value significant?
  • 45. (Pearson’s) Chi-Squared (χ²) Test. Used to compare observed proportions of an event with those expected. Used with nominal data (better/worse; dead/alive). If there is a substantial difference between observed and expected, the null hypothesis is likely to be rejected. Often presented as a 2 × 2 table.
  • 46. Non-parametric tests. For comparing 2 related samples: Wilcoxon signed rank test. For comparing 2 unrelated samples: Mann-Whitney U test. For comparing more than 2 groups: Kruskal-Wallis test.
  • 47. Mann–Whitney U test. The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum, or Wilcoxon–Mann–Whitney test) is a non-parametric test of the null hypothesis that two samples come from the same population, against the alternative that one population tends to have larger values than the other. It has greater efficiency than the t-test on non-normal distributions, such as a mixture of normal distributions, and it is nearly as efficient as the t-test on normal distributions.
  • 48. Student t test. A t-test is any statistical hypothesis test in which the test statistic follows a Student’s t distribution if the null hypothesis is supported. It can be used to determine whether two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known.
  • 49. The Kaplan–Meier estimator, also known as the product limit estimator, estimates the survival function from lifetime data. In medical research, it is often used to measure the fraction of patients living for a certain amount of time after treatment. The estimator is named after Edward L. Kaplan and Paul Meier.
  • 50. A plot of the Kaplan–Meier estimate of the survival function is a series of horizontal steps of declining magnitude which, when a large enough sample is taken, approaches the true survival function for that population.
  • 51.
  • 52.
  • 53. Odds ratio. In a case-control study, a measure of the strength of the association between risk factor and outcome.
  • 54. Odds ratio table. Smokers: 33 lung cancer cases (a), 55 controls without lung cancer (b). Non-smokers: 2 cases (c), 27 controls (d). Total: 35 cases (a + c), 82 controls (b + d).
  • 55. Odds ratio = ad/bc = (33 × 27)/(55 × 2) = 8.1, i.e. the odds of lung cancer are 8.1 times higher among smokers than among non-smokers.
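The odds ratio from the 2 × 2 table on slide 54 can be reproduced in a few lines. The confidence interval uses the standard log (Woolf) approximation, which is an addition here and not something the deck presents; the function name is hypothetical.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate 95% CI on the log scale.

    a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - z * se_log)
    high = math.exp(math.log(estimate) + z * se_log)
    return estimate, low, high


# Counts from slide 54: smokers 33 cases / 55 controls, non-smokers 2 / 27.
or_value, or_low, or_high = odds_ratio(33, 55, 2, 27)
print(f"OR = {or_value:.1f}, approx. 95% CI {or_low:.1f} to {or_high:.1f}")
```

With only 2 cases among non-smokers the interval is wide, which illustrates how small cell counts limit the precision of an odds ratio.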
  • 56. Relative Risk. A measure of risk in a cohort study. RR = incidence of disease among exposed / incidence among non-exposed.
  • 58. Incidence among smokers = 70/7000 = 10/1000. Incidence among non-smokers = 3/3000 = 1/1000. Total incidence = 73/10000 = 7.3/1000.
  • 59. RR = incidence of disease among exposed / incidence among non-exposed. Relative risk of lung cancer = 10/1 = 10. The incidence of lung cancer is 10 times higher in the exposed group (smokers), i.e. there is a positive association with smoking. The larger the RR, the greater the strength of association.
  • 60. Attributable risk. The difference in incidence rates of disease between the exposed group (EG) and the non-exposed group (NEG). Often expressed as a percentage.
  • 61. AR = ((incidence rate in EG − incidence rate in NEG) / incidence rate in EG) × 100. AR = (10 − 1)/10 × 100 = 90%, i.e. 90% of lung cancers in smokers were due to their smoking.
  • 62. Population attributable risk. PAR = (incidence of the disease in the total population − incidence among those not exposed to the suspected causal factor) / incidence in the total population. PAR = (7.3 − 1)/7.3 = 86.3%, i.e. 86.3% of the disease could be avoided if risk factors like cigarettes were avoided.
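The cohort figures on slides 58 to 62 (70 cases among 7000 smokers, 3 among 3000 non-smokers) can be run through the three formulas together. This is a small sketch with a hypothetical function name; it only restates the arithmetic already on the slides.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Relative risk, attributable risk (%) and population attributable risk (%)."""
    inc_exposed = exposed_cases / exposed_total
    inc_unexposed = unexposed_cases / unexposed_total
    inc_total = (exposed_cases + unexposed_cases) / (exposed_total + unexposed_total)
    return {
        "rr": inc_exposed / inc_unexposed,
        "ar_percent": (inc_exposed - inc_unexposed) / inc_exposed * 100,
        "par_percent": (inc_total - inc_unexposed) / inc_total * 100,
    }


measures = risk_measures(70, 7000, 3, 3000)
print({name: round(value, 1) for name, value in measures.items()})
# {'rr': 10.0, 'ar_percent': 90.0, 'par_percent': 86.3}
```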
  • 63. Mortality rates & ratios. Crude death rate (CDR): number of deaths (from all causes) per 1000 estimated mid-year population (MYP) in one year in a given place. CDR = (number of deaths during the year / MYP) × 1000.
  • 64. If the CDR in Panchayath A is 15.2/1000 and in Panchayath B is 8.2/1000 population, the health status of Panchayath B is better than that of A.
  • 65. Specific death rate = (number of deaths due to a specific disease during a calendar year / MYP) × 1000. Death rates can be calculated for separate diseases (e.g. TB 2/1000, HIV 1/1000), age groups (e.g. 5-20 yrs 1/1000, <5 yrs 3/3000), sex (e.g. more in males), specific months, etc.
  • 66. Case fatality rate (ratio) = (total number of deaths due to a particular disease / total number of cases of the same disease) × 100. Usually described for acute infectious diseases such as dengue, cholera and food poisoning. Represents the killing power of the disease.
  • 67. Proportional mortality rate (ratio). Due to a specific disease = (number of deaths from the specific disease in a year / total deaths in that year) × 100. Under-5 proportional mortality rate = (number of deaths under 5 years of age in a given year / total number of deaths during the same period) × 100.
  • 68. Survival rate = (total number of patients alive after 5 yrs / total number of patients diagnosed or treated) × 100. A method of describing prognosis in certain disease conditions, mainly cancers.
  • 69. Incidence. The number of new cases occurring in a defined population during a specified period of time. Incidence rate = (number of new cases of a specific disease during a given time period / population at risk) × 1000. E.g. 500 new cases of TB in a year in a population of 30000: incidence rate = (500/30000) × 1000 = 16.7 per 1000 per year.
  • 70. Incidence: uses. Can be expressed as special incidence rate, attack rate, hospital admission rate, case rate, etc. Measures the rate at which new cases are occurring in a population. Not influenced by duration. Generally use is restricted to acute
  • 71. Prevalence. Refers specifically to all current cases (old & new) existing at a given point in time, or over a period of time, in a given population. Although referred to as a rate, it is really a ratio.
  • 72. Point prevalence = (number of all current cases (old & new) of a specified disease existing at a given point in time / estimated population at the same point in time) × 100. Period prevalence = (number of existing cases (old & new) of a specified disease during a given period of time / estimated mid-interval population at risk) × 100.
  • 73.
  • 74. Incidence: cases 3, 4, 5, 8. Point prevalence at Jan 1: cases 1, 2 & 7. Point prevalence at Dec 31: cases 1, 3, 5 & 8. Period prevalence (Jan-Dec): cases 1, 2, 3, 4, 5, 7 & 8.
  • 75. Relationship between incidence & prevalence. Prevalence = incidence × mean duration: P = I × D, I = P/D, D = P/I. E.g. incidence = 10 cases/1000 population/yr and mean duration = 5 yrs give prevalence = 10 × 5 = 50/1000 population.
  • 76. Prevalence: uses. Helps to estimate the magnitude of health/disease problems in the community and to identify potential high-risk populations. Prevalence rates are especially useful for administrative and planning purposes, e.g. hospital beds, manpower needs, rehabilitation facilities.
  • 77. Statistical significance. P value (hypothesis testing); 95% CI (interval estimation).
  • 78. P value & its interpretation. Loosely described as “the probability of type 1 error”: the chance of seeing a difference or association at least this large when actually there is none.
  • 79. Study of the prevalence of obesity in male & female children in a classroom of 50 students: of 25 boys, 10 were obese; of 25 girls, 16 were obese. P value: 0.02.
  • 80. Null hypothesis: “no difference in obesity among boys & girls in the classroom”.
  • 81. Study of bubble vs conventional CPAP for prevention of extubation failure (EF) in preterm very low birth weight infants. EF: bCPAP = 4 (16), cCPAP = 9 (16). P value: 0.14.
  • 82. Null hypothesis: “no difference in EF among preterm babies treated with bCPAP & cCPAP”.
  • 83. 95% CI. 95% CI = mean ± 1.96 SE ≈ mean ± 2 SE, where SE = SD/√n. Example: 100 children attending pediatric OP, mean weight = 15 kg, SD = 2 kg; 95% CI = ?
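The question posed on slide 83 (100 children, mean weight 15 kg, SD 2 kg) can be worked through with the standard error definition from slide 25. This is a minimal sketch; `ci95_mean` is a hypothetical helper and it assumes the normal approximation with z = 1.96.

```python
import math


def ci95_mean(mean, sd, n, z=1.96):
    """95% confidence interval for a mean: mean ± z × SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se


low, high = ci95_mean(mean=15, sd=2, n=100)
print(f"SE = {2 / math.sqrt(100):.1f} kg, 95% CI = {low:.2f} to {high:.2f} kg")
# SE = 0.2 kg, 95% CI = 14.61 to 15.39 kg
```

Note that mean ± 1.96 SD (11.08 to 18.92 kg) would instead describe the range covering about 95% of individual children, not the precision of the mean.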
  • 84. Interpretation of 95% CI. If a study is repeated 100 times, about 95 of the resulting intervals will contain the true population value. If the CIs of 2 groups overlap substantially, a statistically significant difference is less likely.
  • 85. Measures of Risk. Case-control study: odds ratio. Cohort study: RR, AR.
  • 86. Chi-Squared (χ²) Test. Chi-squared (χ²) formula. Not applicable in small samples: if there are fewer than 5 observations per cell, use Fisher’s exact test.
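The slide's formula did not survive in the transcript, so the sketch below uses the usual shortcut for a 2 × 2 table, χ² = N(ad − bc)² / ((a+b)(c+d)(a+c)(b+d)), without continuity correction; treating that as the intended formula is an assumption. It is applied to the smoking and lung cancer counts from slide 54, and also reports the smallest expected cell count so the "fewer than 5" rule can be checked.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (1 df, no continuity correction) for a 2 x 2 table.

    Returns (chi2, p_value, smallest_expected_count).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    smallest_expected = min(row1, row2) * min(col1, col2) / n
    return chi2, p_value, smallest_expected


chi2, p_value, smallest_expected = chi_squared_2x2(33, 55, 2, 27)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, min expected = {smallest_expected:.1f}")
```

When the smallest expected count falls below 5, a statistics package's Fisher's exact test would be the safer choice, as the slide recommends.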
  • 88. Correlation. Assesses the linear relationship between two variables, e.g. height and weight. The strength of the association is described by a correlation coefficient, r: r = 0 to 0.2, low, probably meaningless; r = 0.2 to 0.4, low, possible importance; r = 0.4 to 0.6, moderate correlation; r = 0.6 to 0.8, high correlation; r = 0.8 to 1, very high correlation. Can be positive or negative. Pearson’s and Spearman correlation coefficients. Tells nothing about causation.
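Pearson's r for the height and weight example can be computed directly from its definition. The sketch is illustrative: the five height and weight pairs are invented, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg).
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [50, 54, 61, 64, 70]
r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```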
  • 89. Correlation Source: Harris and Taylor. Medical Statistics Made Easy
  • 90. Correlation Perfect Correlation Source: Altman. Practical Statistics for Medical Research
  • 91. Regression. Based on fitting a line to data. Provides a regression coefficient, which is the slope of the line: Y = aX + b. Used to predict a dependent variable’s value based on the value of an independent variable. Very helpful: in an analysis of height and weight, for a known height one can predict weight. Much more useful than correlation, because it allows prediction of values of Y rather than just whether there is a relationship between two variables.
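A least-squares fit of the line Y = aX + b can be sketched as follows, with a as the slope (regression coefficient) and b as the intercept. The data are invented for illustration and the function name is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope a and intercept b for Y = aX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical heights (cm) and weights (kg).
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [50, 54, 61, 64, 70]
a, b = fit_line(heights_cm, weights_kg)

# Predict the weight for a known height of 162 cm.
predicted = a * 162 + b
print(f"weight = {a:.2f} x height + ({b:.1f}); predicted at 162 cm: {predicted:.1f} kg")
```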
  • 92. Regression. Types of regression. Linear: uses continuous data to predict a continuous outcome. Logistic: uses continuous data to predict the probability of a dichotomous outcome. Poisson: models counts of rare events. Cox proportional hazards: survival analysis.
  • 93. Multiple Regression Models. Determine the association between two variables while controlling for the values of others. Example: uterine fibroids. Both age and race impact the incidence of fibroids. Multiple regression allows one to test the impact of age on the incidence while controlling for race (and all other factors in the model).
  • 94. Multiple Regression Models. In published papers, the multivariable models are more powerful than univariable models and take precedence. Therefore we discount the univariable model, as it does not control for confounding variables. E.g. coronary disease is potentially affected by age, HTN, smoking status, gender and many other factors. If assessing whether height is a factor: if it is significant on univariable analysis but not on multivariable analysis, these other factors confounded the analysis.
  • 95. Survival Analysis. Evaluation of time to an event (death, recurrence, recovery). Provides a means of handling censored data: patients who do not reach the event by the end of the study or who are lost to follow-up. The most common type is Kaplan-Meier analysis. Curves are presented as stepwise changes from baseline. There are no fixed intervals of follow-up; the survival proportion is recalculated after each event.
  • 96. Survival Analysis Source: Altman. Practical Statistics for Medical Research
  • 98. Kaplan-Meier Analysis. Provides a graphical means of comparing the outcomes of two groups that vary by intervention or other factor. Survival rates can be measured directly from the curve. The difference between curves can be tested for statistical significance.
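The product limit idea (recalculate the survival proportion after each event, and drop censored patients from the risk set without counting them as events) can be sketched in plain Python. The follow-up times below are invented, and `kaplan_meier` is a hypothetical helper, not a library function.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) occurred at that time, 0 if censored
    Returns a list of (time, survival probability) steps, one per event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Multiply by the proportion surviving this event time.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical follow-up in months; 0 marks a censored patient.
curve = kaplan_meier(times=[1, 2, 3, 4, 5], events=[1, 0, 1, 1, 0])
for t, s in curve:
    print(f"month {t}: S = {s:.3f}")
```

The censored patient at month 2 does not produce a step, but does shrink the denominator used at month 3.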
  • 99. Cox Regression Model. Proportional hazards survival model. Used to investigate the relationship between an event (death, recurrence) occurring over time and possible explanatory factors. Reported result: hazard ratio (HR), the hazard in one group divided by the hazard in another. Interpreted in the same way as risk ratios and odds ratios: HR = 1, no effect; HR > 1, increased risk; HR < 1, decreased risk.
  • 100. Cox Regression Model. Commonly used in long-term studies where various factors might predispose to an event. Example: after uterine embolization, which factors (age, race, uterine size, etc.) might make recurrence more likely.
  • 101. True disease state vs. test result. No disease (D = 0): test not rejected = correct (specificity); test rejected = Type I error (false positive), α. Disease (D = 1): test not rejected = Type II error (false negative), β; test rejected = correct (power 1 − β; sensitivity).
  • 102. Specific example. Figure: distributions of test results for patients with the disease and patients without the disease.
  • 103. Figure: a threshold on the test result separates patients called “negative” from patients called “positive”.
  • 104. Some definitions. True positives: patients with the disease who are called “positive”.
  • 105. False positives: patients without the disease who are called “positive”.
  • 106. True negatives: patients without the disease who are called “negative”.
  • 107. False negatives: patients with the disease who are called “negative”.
  • 108. Moving the Threshold: right  Test Result  without the disease  with the disease  '-'  '+'
  • 109. Moving the Threshold: left  Test Result  without the disease  with the disease  '-'  '+'
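The effect of moving the threshold can be sketched as follows. This assumes, as in the figures, that higher test results indicate disease and that a result at or above the threshold is called positive; the function name `counts_at_threshold` and the data are hypothetical.

```python
def counts_at_threshold(diseased, healthy, threshold):
    """Return (TP, FP, TN, FN) when results >= threshold are called positive."""
    tp = sum(1 for x in diseased if x >= threshold)
    fn = len(diseased) - tp
    fp = sum(1 for x in healthy if x >= threshold)
    tn = len(healthy) - fp
    return tp, fp, tn, fn


# Hypothetical test results; the two groups overlap at 5 and 6.
diseased = [5, 6, 7, 8, 9]
healthy = [2, 3, 4, 5, 6]

low = counts_at_threshold(diseased, healthy, 4)   # threshold moved left
high = counts_at_threshold(diseased, healthy, 7)  # threshold moved right
```

Moving the threshold left catches every diseased patient at the cost of more false positives; moving it right removes the false positives but misses some diseased patients.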
  • 111. ROC curve comparison  [Figure: two ROC curves, "A good test" and "A poor test"; vertical axis: True Positive Rate (0% to 100%), horizontal axis: False Positive Rate (0% to 100%)]
  • 112. ROC curve extremes  Best test: the distributions don’t overlap at all  Worst test: the distributions overlap completely  [Figure: two ROC curves; vertical axis: True Positive Rate (0% to 100%), horizontal axis: False Positive Rate (0% to 100%)]
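The two extremes can be checked numerically. The sketch below builds ROC points by sweeping every observed value as a threshold and estimates the area under the curve with the trapezoid rule; the function names and data are hypothetical, and higher results are assumed to indicate disease.

```python
def roc_points(diseased, healthy):
    """Return (false positive rate, true positive rate) pairs, from (0, 0) to (1, 1)."""
    thresholds = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]
    for thr in thresholds:
        tpr = sum(1 for x in diseased if x >= thr) / len(diseased)
        fpr = sum(1 for x in healthy if x >= thr) / len(healthy)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Distributions that do not overlap at all, and ones that overlap completely.
best = auc(roc_points([7, 8, 9], [1, 2, 3]))
worst = auc(roc_points([1, 2, 3], [1, 2, 3]))
```

With no overlap the area is 1; with complete overlap the curve follows the diagonal and the area is 0.5.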
  • 115.  An example forest plot of five odds ratios (squares) with the summary measure (centre line of diamond) and associated confidence intervals (lateral tips of diamond), and solid vertical line of no effect. Names of (fictional) studies are shown on the left, odds ratios and ...
  • 116.  A forest plot (or blobbogram) is a graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. It was developed for use in medical research as a means of graphically representing a meta-analysis of the results of randomized controlled trials.
  • 118.  i. Probably a small study, with a wide CI, crossing the line of no effect (OR = 1). Unable to say if the intervention works  ii. Probably a small study, wide CI, but does not cross OR = 1; suggests intervention works but weak evidence  iii. Larger study, narrow CI, but crosses OR = 1; no evidence that intervention works
  • 119.  iv. Large study, narrow confidence intervals: entirely to left of OR = 1; suggests intervention works  v. Small study, wide confidence intervals, suggests intervention is detrimental  vi. Meta-analysis of all identified studies: suggests intervention works.
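How study size drives the width of the confidence interval can be sketched as below. The slides do not say how the intervals were computed; this example assumes the common log-odds (Woolf) approximation for a 2x2 table, and the function names and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a, b -- events / non-events in the intervention group
    c, d -- events / non-events in the control group
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the log-odds approximation (an assumption).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def crosses_no_effect(lower, upper):
    """True when the interval includes OR = 1, the line of no effect."""
    return lower <= 1.0 <= upper


# Same odds ratio, very different sample sizes.
small = odds_ratio_ci(4, 16, 8, 12)
large = odds_ratio_ci(400, 1600, 800, 1200)
```

Both hypothetical studies have the same odds ratio, but only the large one has an interval narrow enough to sit entirely to the left of OR = 1.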
  • 120. PICOT  Used to frame and test evidence-based research questions  Population  Intervention or issue  Comparison with another intervention  Outcome  Time frame

Editor's Notes

  1. Similar: use both to compare groups
  2. SD: take the difference between each value and the mean, square it, add all the squared differences together, divide by (n-1), THEN take the square root of this value
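The steps in note 2 can be written out directly. This is a minimal sketch; the function name `sample_sd` and the data are hypothetical.

```python
import math


def sample_sd(values):
    """Sample standard deviation, dividing by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    # Square each difference from the mean and add them together.
    squared_diffs = sum((x - mean) ** 2 for x in values)
    # Divide by (n - 1), then take the square root.
    return math.sqrt(squared_diffs / (n - 1))


sd = sample_sd([2, 4, 4, 4, 5, 5, 7, 9])
```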