SlideShare a Scribd company logo
1 of 32
Download to read offline
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Descriptive analysis
of continuous variables
Tuan V. Nguyen
Professor and NHMRC Senior Research Fellow
Garvan Institute of Medical Research
University of New South Wales
Sydney, Australia
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Some old words
“If it were not for the great variability among
individuals, medicine might be a Science,
not an Art”
William Osler, 1882
The Principles and Practice of Medicine
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Normal (Gaussian) distribution
•  Given a series of values xi (i = 1, … , n): x1, x2, …, xn,
the mean is:
x
n
xi
i
n
=
=
∑
1
1
•  Study 1: the color scores of 6 consumers are: 6, 7, 8, 4, 5, and
6. The mean is:
6
6
36
6
6
5
4
8
7
6
1
1
=
=
+
+
+
+
+
=
∑
=
=
n
i
i
x
n
x
•  Study 2: the color scores of 4 consumers are: 10, 2, 3, and 9.
The mean is:
6
4
24
4
9
3
2
10
1
1
=
=
+
+
+
=
∑
=
=
n
i
i
x
n
x
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Variation
•  The mean does not adequately describe the data. We
need to know the variation in the data.
•  An obvious measure is the sum of difference from the
mean.
•  For study 1, the scores 6, 7, 8, 4, 5, and 6, we have:
(6-6) + (7-6) + (8-6) + (4-6) + (5-6) + (6-6)
= 0 + 1 + 2 – 2 – 1 + 0
= 0
NOT SATISFACTORY!
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Sum of squares
•  We need to make the difference positive by squaring
them. This is called “Sum of squares” (SS)
•  For study 1: 6, 7, 8, 4, 5, 6, we have:
SS = (6-6)2 + (7-6)2 + (8-6)2 + (4-6)2 + (5-6)2 + (6-6)2
= 10
•  For study 2: 10, 2, 3, 9, we have:
SS = (10-6)2 + (2-6)2 + (3-6)2 + (9-6)2 = 50
•  This is better!
•  But it does not take into account sample size n.
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Variance
•  We have to divide the SS by sample size n. But
in each square we use the mean to calculate the
square, so we lose 1 degree of freedom.
•  Therefore the correct denominator is n-1. This is
called variance (denoted by s2)
( )
∑ −
−
=
=
n
i
i x
x
n
s
1
2
2
1
1
( ) ( ) ( )
1
... 2
2
2
2
1
2
−
−
+
+
−
+
−
=
n
x
x
x
x
x
x
s n
•  Or, in the sum notation:
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Variance - example
•  For study 1: 6, 7, 8, 4, 5, and 6, the variance is:
( ) ( ) ( ) ( ) ( ) 2
5
10
1
6
6
6
6
5
6
8
6
7
6
6 2
2
2
2
2
2
=
=
−
−
+
−
+
−
+
−
+
−
=
s
•  For study 2: 10, 2, 3, 9, the variance is:
( ) ( ) ( ) ( ) 7
.
16
3
50
1
4
6
9
6
3
6
2
6
10 2
2
2
2
2
=
=
−
−
+
−
+
−
+
−
=
s
•  The scores in study 2 were much more variable than those
in study 1.
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Standard deviation
•  The problem with variance is that it is expressed in unit
squared, whereas the mean is in the actual unit. We
need a way to convert variance back to the actual unit of
measurement.
•  We take the square root of variance – this is called
“standard deviation” (denote by s)
•  For study 1, s = sqrt(2) = 1.41
For study 2, s = sqrt(16.7) = 4.1
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Coefficient of variation
•  In many studies, the standard deviation can vary with
the mean (eg higher/lower mean values are associated
with higher/lower SD)
•  Another statistic commonly used to quantify this
phenomenon is the coefficient of variation (CV).
•  A CV expresses the SD as percentage of the mean. CV
= SD/mean*100
•  For study 1, CV = 1.41 / 6 * 100 = 23.5%
For study 2, CV = 4.1 / 6 * 100 = 68.3%
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Summary statistics
•  Summary statistics are usually shown in
sample size, mean and standard deviation.
•  In our examples
Study N Mean SD
1 6 6.0 1.4
2 4 6.0 4.1
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Implications of the mean and SD
•  “In the Vietnamese population aged 30+ years, the
average of weight was 55.0 kg, with the SD being 8.2 kg.”
•  What does this mean?
•  If the data are normally distributed, this means that the
probability that an individual randomly selected from the
population with weight being w kg is:
( ) ( )
⎥
⎥
⎦
⎤
⎢
⎢
⎣
⎡ −
−
=
= 2
2
2
exp
2
1
s
x
w
s
w
Weight
P
π
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Implications of the mean and SD
•  In our example, x = 55, s = 8.2
•  The probability that an individual randomly selected from the
population with weight being 40 kg is:
( ) ( ) 009
.
0
2
.
8
2
.
8
2
55
40
exp
1416
.
3
2
2
.
8
1
40
2
=
⎥
⎥
⎦
⎤
⎢
⎢
⎣
⎡
×
×
−
−
×
×
=
=
Weight
P
( ) ( ) 040
.
0
2
.
8
2
.
8
2
55
50
exp
1416
.
3
2
2
.
8
1
50
2
=
⎥
⎥
⎦
⎤
⎢
⎢
⎣
⎡
×
×
−
−
×
×
=
=
Weight
P
( ) ( ) 0004
.
0
2
.
8
2
.
8
2
55
80
exp
1416
.
3
2
2
.
8
1
80
2
=
⎥
⎥
⎦
⎤
⎢
⎢
⎣
⎡
×
×
−
−
×
×
=
=
Weight
P
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Implications of the mean and SD
•  The distribution of weight of the entire population can be
shown to be:
0
1
2
3
4
5
6
22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 92
Weight (kg)
Percent
(%)
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Z-scores
•  Actual measurements can be converted to z-scores
•  A z-score is the number of SDs from the mean
s
x
x
Z
−
=
•  A weight = 55 kg à z = (55 - 55)/8.2 = 0 SDs
•  A weight = 40 kg à z = (40 - 55)/8.2 = -1.8 SDs
•  A weight = 80 kg à z = (80-55)/8.2 = 3.0 SDs
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Z-scores = Standard Normal Distribution
•  A z-score is unitless, allowing comparison between
variables with different measurements
•  Z-scores have mean 0 and variance of 1.
•  Z-scores à Standard Normal Distribution
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Z-scores and area under the curve
•  Z-scores and weight – another look:
0
1
2
3
4
5
6
-4.0 -3.5 -3.0 -2.6 -2.1 -1.6 -1.1 -0.6 -0.1 0.4 0.9 1.3 1.8 2.3 2.8 3.3 3.8 4.3
Percent
(%)
•  Area under the curve for z < -1.96 = 0.025
•  Area under the curve for -1.0 < z < 1.0 = 0.6828
•  Area under the curve for -2.0 < z < 2.0 = 0.9544
•  Area under the curve for -3.0 < z < 3.0 = 0.9972
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
95% confidence interval
•  A sample of n measurements (x1, x2, …, xn), with mean x
and standard deviation s.
•  95% of the individual values of xi lies between x-1.96s and
x+1.96s
•  Mean weight = 55 kg, SD = 8.2 kg
•  95% of individuals’ weight lies between 39 kg and 71 kg.
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Cumulative probability (area under the curve)
for Z-scores
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
-4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
Z-scores
Percent
(%)
Z < -3 -2.5 -2.0 -1.5 -1.0 -0.5 0 0.5 1.0 1.5 2.0 2.5 3.0
Prob .0013 .006 .0227 .0668 .1587 .3085 .5000 .6915 .8413 .9332 .9772 .9938 .9987
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Standard error (SE)
•  SE = standard error
•  s : standard deviation
•  n: sample size
What does it mean ?
s
SE
n
=
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
The meaning of SE
•  Consider a population of 10 people:
130, 189, 200, 156, 154, 160, 162, 170, 145, 140
•  Mean m = 160.6 cm
•  We repeated take random samples, each sample has 5
people:
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
The meaning of SE
•  We repeated take random samples, each sample has 5
people:
1st sample: 140, 160, 200, 140, 145 mean x1 = 157.0
2nd sample: 154, 170, 162, 160, 162 mean x2 = 161.6
3rd sample: 145, 140, 156, 140, 156 mean x3 = 147.4
4th sample: 140, 170, 162, 170, 145 mean x4 = 157.4
5th sample: 156, 156, 170, 189, 170 mean x5 = 168.2
6th sample: 130, 170, 170, 170, 170 mean x6 = 162.0
7th sample: 156, 154, 145, 154, 189 mean x7 = 159.6
8th sample: 200, 154, 140, 170, 170 mean x8 = 166.8
9th sample: 140, 170, 145, 162, 160 mean x9 = 155.4
10th sample: 200, 200, 162, 170, 162 mean x10 = 178.8
….
SD of x1, x2, x3, …, x10 is the SE
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Use of SD and SE
Let the population mean be m (we do not know m). Let the sample mean
be x and SD be s.
•  68% individuals in the population will have values from x─s to x+s
•  95% individuals in the population will have values from x─2s to x+2s
•  99% individuals in the population will have values from x─3s to x+3s
Let the population mean be m (we do not know m). Let the sample mean
be x and SE be se.
•  68% averages from repeated samples will have values from x─se to x+se
•  95% averages from repeated samples will have values from x─2se to x+2se
•  99% averages from repeated samples will have values from x─3se to x+3se
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Central location: Median
•  The median is the value with a depth of (n+1)/2
•  When n is even, average the two values that straddle a
depth of (n+1)/2
•  For the 10 values listed below, the median has depth
(10+1) / 2 = 5.5, placing it between 27 and 28. Average
these two values to get median = 27.5
05 11 21 24 27 28 30 42 50 52
!
median
Average the adjacent values: M = 27.5
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
More examples of medians
•  Example A: 2 4 6
Median = 4
•  Example B: 2 4 6 8
Median = 5 (average of 4 and 6)
•  Example C: 6 2 4
Median ! 2
(Values must be ordered first)
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
The median is robust
This data set has a mean of 1636:
1362 1439 1460 1614 1666 1792 1867
The median is 1614 in both instances,
demonstrating its robustness in the face of outliers.
The median is more resistant to skews and
outliers than the mean; it is more robust.
Here’s the same data set with a data entry error
“outlier” (highlighted). This data set has a mean of
2743:
1362 1439 1460 1614 1666 1792 9867
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Mode
•  The mode is the most commonly encountered value in the
dataset
•  This data set has a mode of 7
{4, 7, 7, 7, 8, 8, 9}
•  This data set has no mode
{4, 6, 7, 8}
(each point appears only once)
•  The mode is useful only in large data sets with repeating
values
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Comparison of Mean, Median, Mode
Note how the mean gets pulled toward
the longer tail more than the median
mean = median → symmetrical distrib
mean > median → positive skew
mean < median → negative skew
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Spread: Quartiles
•  Quartile 1 (Q1): cuts off bottom 25% of data
•  Quartile 3 (Q3): cuts off top 25% of data
= median of the to half of the data set
•  Interquartile Range (IQR) = Q3 – Q1
05 11 21 24 27 28 30 42 50 52
! ! !
Q1 median Q3
Q1 = 21, Q3 = 42, and IQR = 42 – 21 = 21
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Box plot
•  5 pt summary: {5, 21, 27.5, 42, 52};
box from 21 to 42 with line @ 27.5
•  IQR = 42 – 21 = 21.
FU = Q3 + 1.5(IQR) = 42 + (1.5)(21) =
73.5
FL = Q1 – 1.5(IQR) = 21 – (1.5)(21) = –
10.5
•  None values above upper fence
None values below lower fence
•  Upper inside value = 52
Lower inside value = 5
Draws whiskers
Data: 05 11 21 24 27 28 30 42 50 52
60
50
40
30
20
10
0
Upper inside = 52
Q3 = 42
Q1 = 21
Lower inside = 5
Q2 = 27.5
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Box plot
Seven metabolic rates:
1362 1439 1460 1614 1666 1792 1867
7
N =
Data source: Moore,
2000
1900
1800
1700
1600
1500
1400
1300
1.  5-point summary: 1362, 1449.5, 1614,
1729, 1867
2.  IQR = 1729 – 1449.5 = 279.5
FU = Q3 + 1.5(IQR) = 1729 + (1.5)(279.5) =
2148.25
FL = Q1 – 1.5(IQR) = 1449.5 – (1.5)(279.5) =
1030.25
3. None outside
4. Whiskers end @ 1867 and 1362
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Report of statistical summary
•  Always report a measure of central location, a measure
of spread, and the sample size
•  Symmetrical mound-shaped distributions ! report
mean and standard deviation
•  Non-normally distributed data: median, interquartile
ranges
Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Summary
•  Mean indicates the typical value of sample values.
•  Standard deviation indicates the between-subjects
variability of sample values;
•  Standard deviation indicates the variability among sample
means = standard deviation of the means.
•  (There is no such thing called “standard error of the
means” (SEM))
•  Coefficient of variation indicates the relative variability
(about the mean) among subjects within a sample.
•  95% confidence interval loosely means the probable values
of a sample with 95% probability.

More Related Content

What's hot

3. Calculate samplesize for prevalence studies
3. Calculate samplesize for prevalence studies3. Calculate samplesize for prevalence studies
3. Calculate samplesize for prevalence studiesAzmi Mohd Tamil
 
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTA
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTABayesian multivariate meta-analysis for modelling surrogate endpoints in HTA
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTAcheweb1
 
9. Calculate samplesize for diagnostic study
9. Calculate samplesize for diagnostic study9. Calculate samplesize for diagnostic study
9. Calculate samplesize for diagnostic studyAzmi Mohd Tamil
 
Measures and Feedback (miller & schuckard, 2014)
Measures and Feedback (miller & schuckard, 2014)Measures and Feedback (miller & schuckard, 2014)
Measures and Feedback (miller & schuckard, 2014)Scott Miller
 
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...Application of the Rasch Model in Assessing and Streamlining an Instrument Me...
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...SherwinBalbuena1
 
How to write a paper statistics
How to write a paper statisticsHow to write a paper statistics
How to write a paper statisticsAmany El-seoud
 
Has modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurtHas modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurtStephen Senn
 
Approximate ANCOVA
Approximate ANCOVAApproximate ANCOVA
Approximate ANCOVAStephen Senn
 
4. Calculate samplesize for cross-sectional studies
4. Calculate samplesize for cross-sectional studies4. Calculate samplesize for cross-sectional studies
4. Calculate samplesize for cross-sectional studiesAzmi Mohd Tamil
 
5. Calculate samplesize for case-control studies
5. Calculate samplesize for case-control studies5. Calculate samplesize for case-control studies
5. Calculate samplesize for case-control studiesAzmi Mohd Tamil
 
Introduction to biostatistics
Introduction to biostatisticsIntroduction to biostatistics
Introduction to biostatisticso_devinyak
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overviewAzmi Mohd Tamil
 
statistics in pharmaceutical sciences
statistics in pharmaceutical sciencesstatistics in pharmaceutical sciences
statistics in pharmaceutical sciencesTechmasi
 
2. Tools to calculate samplesize
2. Tools to calculate samplesize2. Tools to calculate samplesize
2. Tools to calculate samplesizeAzmi Mohd Tamil
 
Basic biostatistics dr.eezn
Basic biostatistics dr.eeznBasic biostatistics dr.eezn
Basic biostatistics dr.eeznEhealthMoHS
 
Motivation for biostatistics
Motivation for biostatisticsMotivation for biostatistics
Motivation for biostatisticso_devinyak
 

What's hot (20)

3. Calculate samplesize for prevalence studies
3. Calculate samplesize for prevalence studies3. Calculate samplesize for prevalence studies
3. Calculate samplesize for prevalence studies
 
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTA
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTABayesian multivariate meta-analysis for modelling surrogate endpoints in HTA
Bayesian multivariate meta-analysis for modelling surrogate endpoints in HTA
 
9. Calculate samplesize for diagnostic study
9. Calculate samplesize for diagnostic study9. Calculate samplesize for diagnostic study
9. Calculate samplesize for diagnostic study
 
Normality tests
Normality testsNormality tests
Normality tests
 
Measures and Feedback (miller & schuckard, 2014)
Measures and Feedback (miller & schuckard, 2014)Measures and Feedback (miller & schuckard, 2014)
Measures and Feedback (miller & schuckard, 2014)
 
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...Application of the Rasch Model in Assessing and Streamlining an Instrument Me...
Application of the Rasch Model in Assessing and Streamlining an Instrument Me...
 
How to write a paper statistics
How to write a paper statisticsHow to write a paper statistics
How to write a paper statistics
 
PMED: APPM Workshop: Overview of Methods for Subgroup Identification in Clini...
PMED: APPM Workshop: Overview of Methods for Subgroup Identification in Clini...PMED: APPM Workshop: Overview of Methods for Subgroup Identification in Clini...
PMED: APPM Workshop: Overview of Methods for Subgroup Identification in Clini...
 
Has modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurtHas modelling killed randomisation inference frankfurt
Has modelling killed randomisation inference frankfurt
 
Approximate ANCOVA
Approximate ANCOVAApproximate ANCOVA
Approximate ANCOVA
 
4. Calculate samplesize for cross-sectional studies
4. Calculate samplesize for cross-sectional studies4. Calculate samplesize for cross-sectional studies
4. Calculate samplesize for cross-sectional studies
 
5. Calculate samplesize for case-control studies
5. Calculate samplesize for case-control studies5. Calculate samplesize for case-control studies
5. Calculate samplesize for case-control studies
 
Introduction to biostatistics
Introduction to biostatisticsIntroduction to biostatistics
Introduction to biostatistics
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overview
 
statistics in pharmaceutical sciences
statistics in pharmaceutical sciencesstatistics in pharmaceutical sciences
statistics in pharmaceutical sciences
 
Statistical analysis by iswar
Statistical analysis by iswarStatistical analysis by iswar
Statistical analysis by iswar
 
2. Tools to calculate samplesize
2. Tools to calculate samplesize2. Tools to calculate samplesize
2. Tools to calculate samplesize
 
Basic biostatistics dr.eezn
Basic biostatistics dr.eeznBasic biostatistics dr.eezn
Basic biostatistics dr.eezn
 
Fragility Index
Fragility IndexFragility Index
Fragility Index
 
Motivation for biostatistics
Motivation for biostatisticsMotivation for biostatistics
Motivation for biostatistics
 

Similar to Ct lecture 4. descriptive analysis of cont variables

Ct lecture 1. theory of measurements
Ct lecture 1. theory of measurementsCt lecture 1. theory of measurements
Ct lecture 1. theory of measurementsHau Pham
 
Ct lecture 12. simple linear regression analysis
Ct lecture 12. simple linear regression analysisCt lecture 12. simple linear regression analysis
Ct lecture 12. simple linear regression analysisHau Pham
 
Ct lecture 5. descriptive analysis of categorical variables
Ct lecture 5. descriptive analysis of categorical variablesCt lecture 5. descriptive analysis of categorical variables
Ct lecture 5. descriptive analysis of categorical variablesHau Pham
 
Epidemiology Lectures for UG
Epidemiology Lectures for UGEpidemiology Lectures for UG
Epidemiology Lectures for UGamitakashyap1
 
Ct lecture 20. survival analysis (part 2)
Ct lecture 20. survival analysis (part 2)Ct lecture 20. survival analysis (part 2)
Ct lecture 20. survival analysis (part 2)Hau Pham
 
Ct lecture 8. comparing two groups categorical data
Ct lecture 8. comparing two groups   categorical dataCt lecture 8. comparing two groups   categorical data
Ct lecture 8. comparing two groups categorical dataHau Pham
 
scope and need of biostatics
scope and need of  biostaticsscope and need of  biostatics
scope and need of biostaticsdr_sharmajyoti01
 
Applied statistics lecture_3
Applied statistics lecture_3Applied statistics lecture_3
Applied statistics lecture_3Daria Bogdanova
 
Ct lecture 6. test of significance and test of h
Ct lecture 6. test of significance and test of hCt lecture 6. test of significance and test of h
Ct lecture 6. test of significance and test of hHau Pham
 
Ct lecture 17. introduction to logistic regression
Ct lecture 17. introduction to logistic regressionCt lecture 17. introduction to logistic regression
Ct lecture 17. introduction to logistic regressionHau Pham
 
Lecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxLecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxshakirRahman10
 
Metanalysis Lecture
Metanalysis LectureMetanalysis Lecture
Metanalysis Lecturedrmomusa
 
Comparative hospital performance: new data, borrowed methods, more targeted a...
Comparative hospital performance: new data, borrowed methods, more targeted a...Comparative hospital performance: new data, borrowed methods, more targeted a...
Comparative hospital performance: new data, borrowed methods, more targeted a...cheweb1
 
Data Display and Summary
Data Display and SummaryData Display and Summary
Data Display and SummaryDrZahid Khan
 
Data analysis ( Bio-statistic )
Data analysis ( Bio-statistic )Data analysis ( Bio-statistic )
Data analysis ( Bio-statistic )Amany Elsayed
 

Similar to Ct lecture 4. descriptive analysis of cont variables (20)

Ct lecture 1. theory of measurements
Ct lecture 1. theory of measurementsCt lecture 1. theory of measurements
Ct lecture 1. theory of measurements
 
Ct lecture 12. simple linear regression analysis
Ct lecture 12. simple linear regression analysisCt lecture 12. simple linear regression analysis
Ct lecture 12. simple linear regression analysis
 
Ct lecture 5. descriptive analysis of categorical variables
Ct lecture 5. descriptive analysis of categorical variablesCt lecture 5. descriptive analysis of categorical variables
Ct lecture 5. descriptive analysis of categorical variables
 
Epidemiology Lectures for UG
Epidemiology Lectures for UGEpidemiology Lectures for UG
Epidemiology Lectures for UG
 
Ct lecture 20. survival analysis (part 2)
Ct lecture 20. survival analysis (part 2)Ct lecture 20. survival analysis (part 2)
Ct lecture 20. survival analysis (part 2)
 
Ct lecture 8. comparing two groups categorical data
Ct lecture 8. comparing two groups   categorical dataCt lecture 8. comparing two groups   categorical data
Ct lecture 8. comparing two groups categorical data
 
scope and need of biostatics
scope and need of  biostaticsscope and need of  biostatics
scope and need of biostatics
 
Applied statistics lecture_3
Applied statistics lecture_3Applied statistics lecture_3
Applied statistics lecture_3
 
Ct lecture 6. test of significance and test of h
Ct lecture 6. test of significance and test of hCt lecture 6. test of significance and test of h
Ct lecture 6. test of significance and test of h
 
Ct lecture 17. introduction to logistic regression
Ct lecture 17. introduction to logistic regressionCt lecture 17. introduction to logistic regression
Ct lecture 17. introduction to logistic regression
 
Lecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptxLecture 11 Paired t test.pptx
Lecture 11 Paired t test.pptx
 
Metanalysis Lecture
Metanalysis LectureMetanalysis Lecture
Metanalysis Lecture
 
Comparative hospital performance: new data, borrowed methods, more targeted a...
Comparative hospital performance: new data, borrowed methods, more targeted a...Comparative hospital performance: new data, borrowed methods, more targeted a...
Comparative hospital performance: new data, borrowed methods, more targeted a...
 
Lab 1 intro
Lab 1 introLab 1 intro
Lab 1 intro
 
Unit 2 - Statistics
Unit 2 - StatisticsUnit 2 - Statistics
Unit 2 - Statistics
 
Intro to Biostat. ppt
Intro to Biostat. pptIntro to Biostat. ppt
Intro to Biostat. ppt
 
Unit 3 Sampling
Unit 3 SamplingUnit 3 Sampling
Unit 3 Sampling
 
Data Display and Summary
Data Display and SummaryData Display and Summary
Data Display and Summary
 
Test of significance
Test of significanceTest of significance
Test of significance
 
Data analysis ( Bio-statistic )
Data analysis ( Bio-statistic )Data analysis ( Bio-statistic )
Data analysis ( Bio-statistic )
 

More from Hau Pham

2008_Plague-Slide_Ref SR20080097
2008_Plague-Slide_Ref SR200800972008_Plague-Slide_Ref SR20080097
2008_Plague-Slide_Ref SR20080097Hau Pham
 
Thuc hanh Dich Te Hoc Y Ha Noi 2003
Thuc hanh Dich Te Hoc Y Ha Noi 2003Thuc hanh Dich Te Hoc Y Ha Noi 2003
Thuc hanh Dich Te Hoc Y Ha Noi 2003Hau Pham
 
Lecture 3. planning data analysis
Lecture 3. planning data analysisLecture 3. planning data analysis
Lecture 3. planning data analysisHau Pham
 
Ct lecture 16. model selection
Ct lecture 16. model selectionCt lecture 16. model selection
Ct lecture 16. model selectionHau Pham
 
Ct lecture 2. questionnaire deisgn
Ct lecture 2. questionnaire deisgnCt lecture 2. questionnaire deisgn
Ct lecture 2. questionnaire deisgnHau Pham
 
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bản
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bảnThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bản
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bảnHau Pham
 

More from Hau Pham (6)

2008_Plague-Slide_Ref SR20080097
2008_Plague-Slide_Ref SR200800972008_Plague-Slide_Ref SR20080097
2008_Plague-Slide_Ref SR20080097
 
Thuc hanh Dich Te Hoc Y Ha Noi 2003
Thuc hanh Dich Te Hoc Y Ha Noi 2003Thuc hanh Dich Te Hoc Y Ha Noi 2003
Thuc hanh Dich Te Hoc Y Ha Noi 2003
 
Lecture 3. planning data analysis
Lecture 3. planning data analysisLecture 3. planning data analysis
Lecture 3. planning data analysis
 
Ct lecture 16. model selection
Ct lecture 16. model selectionCt lecture 16. model selection
Ct lecture 16. model selection
 
Ct lecture 2. questionnaire deisgn
Ct lecture 2. questionnaire deisgnCt lecture 2. questionnaire deisgn
Ct lecture 2. questionnaire deisgn
 
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bản
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bảnThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bản
ThongKe Y-Sinh Hoc_Bài 1 một số kiến thức toán cơ bản
 

Recently uploaded

Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Miss joya
 
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.MiadAlsulami
 
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...Garima Khatri
 
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safenarwatsonia7
 
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Service
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls ServiceCall Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Service
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Servicenarwatsonia7
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...narwatsonia7
 
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Miss joya
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...narwatsonia7
 
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service CoimbatoreCall Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatorenarwatsonia7
 
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore EscortsVIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escortsaditipandeya
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photosnarwatsonia7
 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escortsvidya singh
 
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Miss joya
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...narwatsonia7
 
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowSonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowRiya Pathan
 
Call Girl Chennai Indira 9907093804 Independent Call Girls Service Chennai
Call Girl Chennai Indira 9907093804 Independent Call Girls Service ChennaiCall Girl Chennai Indira 9907093804 Independent Call Girls Service Chennai
Call Girl Chennai Indira 9907093804 Independent Call Girls Service ChennaiNehru place Escorts
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...Miss joya
 
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safenarwatsonia7
 
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune) Girls Service
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune)  Girls ServiceCALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune)  Girls Service
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune) Girls ServiceMiss joya
 

Recently uploaded (20)

Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
Low Rate Call Girls Pune Esha 9907093804 Short 1500 Night 6000 Best call girl...
 
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
 
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
VIP Mumbai Call Girls Hiranandani Gardens Just Call 9920874524 with A/C Room ...
 
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Majestic 📞 9907093804 High Profile Service 100% Safe
 
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Service
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls ServiceCall Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Service
Call Girls Service Bellary Road Just Call 7001305949 Enjoy College Girls Service
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
Russian Call Girls in Pune Tanvi 9907093804 Short 1500 Night 6000 Best call g...
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service CoimbatoreCall Girl Coimbatore Prisha☎️  8250192130 Independent Escort Service Coimbatore
Call Girl Coimbatore Prisha☎️ 8250192130 Independent Escort Service Coimbatore
 
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Servicesauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
sauth delhi call girls in Bhajanpura 🔝 9953056974 🔝 escort Service
 
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore EscortsVIP Call Girls Indore Kirti 💚😋  9256729539 🚀 Indore Escorts
VIP Call Girls Indore Kirti 💚😋 9256729539 🚀 Indore Escorts
 
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original PhotosCall Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
Call Girl Service Bidadi - For 7001305949 Cheap & Best with original Photos
 
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore EscortsCall Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
Call Girls Horamavu WhatsApp Number 7001035870 Meeting With Bangalore Escorts
 
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
Call Girls Service Pune Vaishnavi 9907093804 Short 1500 Night 6000 Best call ...
 
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
Russian Call Girl Brookfield - 7001305949 Escorts Service 50% Off with Cash O...
 
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowSonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Sonagachi Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 
Call Girl Chennai Indira 9907093804 Independent Call Girls Service Chennai
Call Girl Chennai Indira 9907093804 Independent Call Girls Service ChennaiCall Girl Chennai Indira 9907093804 Independent Call Girls Service Chennai
Call Girl Chennai Indira 9907093804 Independent Call Girls Service Chennai
 
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
College Call Girls Pune Mira 9907093804 Short 1500 Night 6000 Best call girls...
 
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% SafeBangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
Bangalore Call Girls Marathahalli 📞 9907093804 High Profile Service 100% Safe
 
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune) Girls Service
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune)  Girls ServiceCALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune)  Girls Service
CALL ON ➥9907093804 🔝 Call Girls Baramati ( Pune) Girls Service
 

Ct lecture 4. descriptive analysis of cont variables

  • 1. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Descriptive analysis of continuous variables Tuan V. Nguyen Professor and NHMRC Senior Research Fellow Garvan Institute of Medical Research University of New South Wales Sydney, Australia
  • 2. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Some old words “If it were not for the great variability among individuals, medicine might be a Science, not an Art” William Osler, 1882 The Principles and Practice of Medicine
  • 3. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Normal (Gaussian) distribution •  Given a series of values xi (i = 1, … , n): x1, x2, …, xn, the mean is: x n xi i n = = ∑ 1 1 •  Study 1: the color scores of 6 consumers are: 6, 7, 8, 4, 5, and 6. The mean is: 6 6 36 6 6 5 4 8 7 6 1 1 = = + + + + + = ∑ = = n i i x n x •  Study 2: the color scores of 4 consumers are: 10, 2, 3, and 9. The mean is: 6 4 24 4 9 3 2 10 1 1 = = + + + = ∑ = = n i i x n x
  • 4. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Variation •  The mean does not adequately describe the data. We need to know the variation in the data. •  An obvious measure is the sum of difference from the mean. •  For study 1, the scores 6, 7, 8, 4, 5, and 6, we have: (6-6) + (7-6) + (8-6) + (4-6) + (5-6) + (6-6) = 0 + 1 + 2 – 2 – 1 + 0 = 0 NOT SATISFACTORY!
  • 5. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Sum of squares •  We need to make the difference positive by squaring them. This is called “Sum of squares” (SS) •  For study 1: 6, 7, 8, 4, 5, 6, we have: SS = (6-6)2 + (7-6)2 + (8-6)2 + (4-6)2 + (5-6)2 + (6-6)2 = 10 •  For study 2: 10, 2, 3, 9, we have: SS = (10-6)2 + (2-6)2 + (3-6)2 + (9-6)2 = 50 •  This is better! •  But it does not take into account sample size n.
  • 6. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Variance •  We have to divide the SS by sample size n. But in each square we use the mean to calculate the square, so we lose 1 degree of freedom. •  Therefore the correct denominator is n-1. This is called variance (denoted by s2) ( ) ∑ − − = = n i i x x n s 1 2 2 1 1 ( ) ( ) ( ) 1 ... 2 2 2 2 1 2 − − + + − + − = n x x x x x x s n •  Or, in the sum notation:
  • 7. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Variance - example •  For study 1: 6, 7, 8, 4, 5, and 6, the variance is: ( ) ( ) ( ) ( ) ( ) 2 5 10 1 6 6 6 6 5 6 8 6 7 6 6 2 2 2 2 2 2 = = − − + − + − + − + − = s •  For study 2: 10, 2, 3, 9, the variance is: ( ) ( ) ( ) ( ) 7 . 16 3 50 1 4 6 9 6 3 6 2 6 10 2 2 2 2 2 = = − − + − + − + − = s •  The scores in study 2 were much more variable than those in study 1.
  • 8. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Standard deviation •  The problem with variance is that it is expressed in unit squared, whereas the mean is in the actual unit. We need a way to convert variance back to the actual unit of measurement. •  We take the square root of variance – this is called “standard deviation” (denote by s) •  For study 1, s = sqrt(2) = 1.41 For study 2, s = sqrt(16.7) = 4.1
  • 9. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Coefficient of variation •  In many studies, the standard deviation can vary with the mean (eg higher/lower mean values are associated with higher/lower SD) •  Another statistic commonly used to quantify this phenomenon is the coefficient of variation (CV). •  A CV expresses the SD as percentage of the mean. CV = SD/mean*100 •  For study 1, CV = 1.41 / 6 * 100 = 23.5% For study 2, CV = 4.1 / 6 * 100 = 68.3%
  • 10. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Summary statistics •  Summary statistics are usually shown in sample size, mean and standard deviation. •  In our examples Study N Mean SD 1 6 6.0 1.4 2 4 6.0 4.1
  • 11. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Implications of the mean and SD •  “In the Vietnamese population aged 30+ years, the average of weight was 55.0 kg, with the SD being 8.2 kg.” •  What does this mean? •  If the data are normally distributed, this means that the probability that an individual randomly selected from the population with weight being w kg is: ( ) ( ) ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎣ ⎡ − − = = 2 2 2 exp 2 1 s x w s w Weight P π
  • 12. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Implications of the mean and SD •  In our example, x = 55, s = 8.2 •  The probability that an individual randomly selected from the population with weight being 40 kg is: ( ) ( ) 009 . 0 2 . 8 2 . 8 2 55 40 exp 1416 . 3 2 2 . 8 1 40 2 = ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎣ ⎡ × × − − × × = = Weight P ( ) ( ) 040 . 0 2 . 8 2 . 8 2 55 50 exp 1416 . 3 2 2 . 8 1 50 2 = ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎣ ⎡ × × − − × × = = Weight P ( ) ( ) 0004 . 0 2 . 8 2 . 8 2 55 80 exp 1416 . 3 2 2 . 8 1 80 2 = ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎣ ⎡ × × − − × × = = Weight P
  • 13. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Implications of the mean and SD •  The distribution of weight of the entire population can be shown to be: 0 1 2 3 4 5 6 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 92 Weight (kg) Percent (%)
  • 14. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Z-scores •  Actual measurements can be converted to z-scores •  A z-score is the number of SDs from the mean s x x Z − = •  A weight = 55 kg à z = (55 - 55)/8.2 = 0 SDs •  A weight = 40 kg à z = (40 - 55)/8.2 = -1.8 SDs •  A weight = 80 kg à z = (80-55)/8.2 = 3.0 SDs
  • 15. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Z-scores = Standard Normal Distribution •  A z-score is unitless, allowing comparison between variables with different measurements •  Z-scores have mean 0 and variance of 1. •  Z-scores à Standard Normal Distribution
  • 16. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Z-scores and area under the curve •  Z-scores and weight – another look: 0 1 2 3 4 5 6 -4.0 -3.5 -3.0 -2.6 -2.1 -1.6 -1.1 -0.6 -0.1 0.4 0.9 1.3 1.8 2.3 2.8 3.3 3.8 4.3 Percent (%) •  Area under the curve for z < -1.96 = 0.025 •  Area under the curve for -1.0 < z < 1.0 = 0.6828 •  Area under the curve for -2.0 < z < 2.0 = 0.9544 •  Area under the curve for -3.0 < z < 3.0 = 0.9972
  • 17. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 95% confidence interval •  A sample of n measurements (x1, x2, …, xn), with mean x and standard deviation s. •  95% of the individual values of xi lies between x-1.96s and x+1.96s •  Mean weight = 55 kg, SD = 8.2 kg •  95% of individuals’ weight lies between 39 kg and 71 kg.
  • 18. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Cumulative probability (area under the curve) for Z-scores 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 Z-scores Percent (%) Z < -3 -2.5 -2.0 -1.5 -1.0 -0.5 0 0.5 1.0 1.5 2.0 2.5 3.0 Prob .0013 .006 .0227 .0668 .1587 .3085 .5000 .6915 .8413 .9332 .9772 .9938 .9987
  • 19. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Standard error (SE) •  SE = standard error •  s : standard deviation •  n: sample size What does it mean ? s SE n =
  • 20. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 The meaning of SE •  Consider a population of 10 people: 130, 189, 200, 156, 154, 160, 162, 170, 145, 140 •  Mean m = 160.6 cm •  We repeated take random samples, each sample has 5 people:
  • 21. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 The meaning of SE •  We repeated take random samples, each sample has 5 people: 1st sample: 140, 160, 200, 140, 145 mean x1 = 157.0 2nd sample: 154, 170, 162, 160, 162 mean x2 = 161.6 3rd sample: 145, 140, 156, 140, 156 mean x3 = 147.4 4th sample: 140, 170, 162, 170, 145 mean x4 = 157.4 5th sample: 156, 156, 170, 189, 170 mean x5 = 168.2 6th sample: 130, 170, 170, 170, 170 mean x6 = 162.0 7th sample: 156, 154, 145, 154, 189 mean x7 = 159.6 8th sample: 200, 154, 140, 170, 170 mean x8 = 166.8 9th sample: 140, 170, 145, 162, 160 mean x9 = 155.4 10th sample: 200, 200, 162, 170, 162 mean x10 = 178.8 …. SD of x1, x2, x3, …, x10 is the SE
  • 22. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Use of SD and SE Let the population mean be m (we do not know m). Let the sample mean be x and SD be s. •  68% individuals in the population will have values from x─s to x+s •  95% individuals in the population will have values from x─2s to x+2s •  99% individuals in the population will have values from x─3s to x+3s Let the population mean be m (we do not know m). Let the sample mean be x and SE be se. •  68% averages from repeated samples will have values from x─se to x+se •  95% averages from repeated samples will have values from x─2se to x+2se •  99% averages from repeated samples will have values from x─3se to x+3se
  • 23. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Central location: Median •  The median is the value with a depth of (n+1)/2 •  When n is even, average the two values that straddle a depth of (n+1)/2 •  For the 10 values listed below, the median has depth (10+1) / 2 = 5.5, placing it between 27 and 28. Average these two values to get median = 27.5 05 11 21 24 27 28 30 42 50 52 ! median Average the adjacent values: M = 27.5
  • 24. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 More examples of medians •  Example A: 2 4 6 Median = 4 •  Example B: 2 4 6 8 Median = 5 (average of 4 and 6) •  Example C: 6 2 4 Median ! 2 (Values must be ordered first)
  • 25. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 The median is robust This data set has a mean of 1636: 1362 1439 1460 1614 1666 1792 1867 The median is 1614 in both instances, demonstrating its robustness in the face of outliers. The median is more resistant to skews and outliers than the mean; it is more robust. Here’s the same data set with a data entry error “outlier” (highlighted). This data set has a mean of 2743: 1362 1439 1460 1614 1666 1792 9867
  • 26. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Mode •  The mode is the most commonly encountered value in the dataset •  This data set has a mode of 7 {4, 7, 7, 7, 8, 8, 9} •  This data set has no mode {4, 6, 7, 8} (each point appears only once) •  The mode is useful only in large data sets with repeating values
  • 27. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Comparison of Mean, Median, Mode Note how the mean gets pulled toward the longer tail more than the median mean = median → symmetrical distrib mean > median → positive skew mean < median → negative skew
  • 28. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Spread: Quartiles •  Quartile 1 (Q1): cuts off bottom 25% of data •  Quartile 3 (Q3): cuts off top 25% of data = median of the to half of the data set •  Interquartile Range (IQR) = Q3 – Q1 05 11 21 24 27 28 30 42 50 52 ! ! ! Q1 median Q3 Q1 = 21, Q3 = 42, and IQR = 42 – 21 = 21
  • 29. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Box plot •  5 pt summary: {5, 21, 27.5, 42, 52}; box from 21 to 42 with line @ 27.5 •  IQR = 42 – 21 = 21. FU = Q3 + 1.5(IQR) = 42 + (1.5)(21) = 73.5 FL = Q1 – 1.5(IQR) = 21 – (1.5)(21) = – 10.5 •  None values above upper fence None values below lower fence •  Upper inside value = 52 Lower inside value = 5 Draws whiskers Data: 05 11 21 24 27 28 30 42 50 52 60 50 40 30 20 10 0 Upper inside = 52 Q3 = 42 Q1 = 21 Lower inside = 5 Q2 = 27.5
  • 30. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Box plot Seven metabolic rates: 1362 1439 1460 1614 1666 1792 1867 7 N = Data source: Moore, 2000 1900 1800 1700 1600 1500 1400 1300 1.  5-point summary: 1362, 1449.5, 1614, 1729, 1867 2.  IQR = 1729 – 1449.5 = 279.5 FU = Q3 + 1.5(IQR) = 1729 + (1.5)(279.5) = 2148.25 FL = Q1 – 1.5(IQR) = 1449.5 – (1.5)(279.5) = 1030.25 3. None outside 4. Whiskers end @ 1867 and 1362
  • 31. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Report of statistical summary •  Always report a measure of central location, a measure of spread, and the sample size •  Symmetrical mound-shaped distributions ! report mean and standard deviation •  Non-normally distributed data: median, interquartile ranges
  • 32. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Summary •  Mean indicates the typical value of sample values. •  Standard deviation indicates the between-subjects variability of sample values; •  Standard deviation indicates the variability among sample means = standard deviation of the means. •  (There is no such thing called “standard error of the means” (SEM)) •  Coefficient of variation indicates the relative variability (about the mean) among subjects within a sample. •  95% confidence interval loosely means the probable values of a sample with 95% probability.