DR DEEPIKA G
1ST YEAR PG
DEPT OF PHARMACOLOGY
MEASURES OF CENTRAL TENDENCY AND
DISTRIBUTION AND HYPOTHESIS
• Statistics:- science of data
- study of uncertainty
• Biostatistics: data from: Medicine, Biological
sciences (business, education, psychology,
• Types: Descriptive statistics and inferential statistics
1. Descriptive Statistics - overview
of the attributes of a data set. These include
measurements of central tendency (frequency
histograms, mean, median, & mode) and
dispersion (range, variance & standard
deviation)
2. Inferential Statistics - provides measures of how
well the data support a hypothesis and whether the
data are generalizable beyond what was
tested (significance tests)
Data: Observations recorded during research
Types of data:
1. Nominal data - synonymous with categorical
data; assigned names/categories based on
characteristics, without ranking between categories.
ex. male/female, yes/no, death/survival
2. Ordinal data - ordered or graded data,
expressed as scores or ranks
ex. pain graded as mild, moderate and severe
3. Interval data - an equal and definite interval
between two measurements;
it can be continuous or discrete
ex. weight expressed as 20, 21, 22, 23, 24;
the interval between 20 & 21 is the same as between 23 & 24
Measures of Central Tendencies:
• In a normal distribution, mean and median are the
same
• If median and mean are different, this indicates that the
data are not normally distributed
•The mode is of little if any practical use
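A minimal sketch of the three measures, using Python's standard `statistics` module. The readings are hypothetical values chosen for illustration and are not taken from the slides.

```python
import statistics

# Hypothetical diastolic BP readings (mmHg), for illustration only.
readings = [78, 80, 82, 82, 98]

mean = statistics.mean(readings)      # arithmetic average
median = statistics.median(readings)  # middle value of the sorted data
mode = statistics.mode(readings)      # most frequently occurring value

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the single high reading pulls the mean above the median, which is the kind of gap that suggests the data are not normally distributed.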
MEASURES OF VARIABILITY
Range: It is the interval between the highest
and lowest observations.
• Ex. Diastolic BP of 5 individuals
Highest observation is 98
Lowest observation is 78
Range is 98-78= 20.
Standard deviation (SD): it is defined as the
positive square root of the arithmetic mean of the
squares of the deviations taken from the arithmetic
mean.
• It describes the variability of the observations
about the mean.
Variance: average squared deviation around the
mean.
variance = ∑(X − X̄)² / n (population) or ∑(X − X̄)² / (n − 1) (sample)
Coefficient of Variation (CV):
It is the standard deviation (SD) expressed as a
percentage of the mean.
CV = (SD / mean) × 100
• It is dimensionless (independent of any unit of
measurement).
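The measures of variability can be sketched as follows. The five readings are hypothetical; they are chosen only so that the highest is 98 and the lowest is 78, as in the range example.

```python
import statistics

# Hypothetical diastolic BP readings (mmHg); highest 98, lowest 78.
bp = [78, 80, 82, 82, 98]

data_range = max(bp) - min(bp)         # highest minus lowest observation
pop_variance = statistics.pvariance(bp)  # sum of squared deviations / n
sample_variance = statistics.variance(bp)  # sum of squared deviations / (n - 1)
sd = statistics.stdev(bp)              # square root of the sample variance
cv = sd / statistics.mean(bp) * 100    # SD as a percentage of the mean

print("range:", data_range)
print("variance (n):", pop_variance, "variance (n - 1):", sample_variance)
print("SD:", sd, "CV (%):", round(cv, 2))
```

The n − 1 divisor is the usual choice when the data are a sample rather than the whole population.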
Correlation coefficient (CC):
It measures the relationship between two variables.
Denoted by ‘r’; it is a unitless quantity,
i.e. a pure number.
Values lie between -1 and +1.
If the variables are not correlated, CC will be zero.
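A short sketch of Pearson's r computed from its definition. The function name `pearson_r` and the dose/response values are hypothetical, for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (numerator of r).
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations of each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / math.sqrt(ss_x * ss_y)

dose = [1, 2, 3, 4, 5]               # hypothetical values
linear_response = [2, 4, 6, 8, 10]   # rises in step with dose
unrelated_response = [1, 2, 3, 2, 1] # no linear trend with dose

r_linear = pearson_r(dose, linear_response)
r_none = pearson_r(dose, unrelated_response)
print(r_linear, r_none)
```

A perfectly linear increasing relationship gives r = +1, while the second pair has no linear association and gives r = 0.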
1. Binomial Distribution:
The conditions to be fulfilled
i. There is fixed number(n) of trials;
ii. Only two outcomes, ‘success’ and ‘failure’, are
possible at each trial;
iii. The trials are independent,
iv. There is constant probability (𝜋) of success at
each trial;
v. The variable is the total number of successes
in n trials.
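Under these conditions, the probability of exactly k successes in n trials is C(n, k) 𝜋^k (1 − 𝜋)^(n − k). A minimal sketch, with hypothetical values of n and 𝜋:

```python
from math import comb

def binomial_pmf(k, n, pi):
    """Probability of exactly k successes in n independent trials,
    each with success probability pi."""
    return comb(n, k) * pi ** k * (1 - pi) ** (n - k)

# Hypothetical example: 10 trials, 30% chance of success at each trial.
n, pi = 10, 0.3
probs = [binomial_pmf(k, n, pi) for k in range(n + 1)]

for k, p in enumerate(probs):
    print(k, round(p, 4))
print("total probability:", round(sum(probs), 6))
```

The probabilities over k = 0 to n add up to 1, since the number of successes must take one of these values.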
2. Poisson Distribution:
• There are situations in which number of times an
event occurs is meaningful and can be counted
but the number of times the event did not occur is
meaningless or cannot be counted.
• It is discrete and has an infinite number of
possible values.
• It has single parameter λ.
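For a Poisson variable with parameter λ, the probability of observing exactly k events is e^(−λ) λ^k / k!. A minimal sketch, with a hypothetical λ:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 2.0  # hypothetical mean number of events per interval
# k has no upper limit; the first 50 terms carry essentially all the probability.
probs = [poisson_pmf(k, lam) for k in range(50)]

total = sum(probs)
mean_count = sum(k * p for k, p in enumerate(probs))
print("total:", round(total, 6), "mean:", round(mean_count, 6))
```

The mean of the distribution equals λ, which is why a single parameter is enough to describe it.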
3.Gaussian or Normal Distribution:
Important characteristics are:
i. The shape of the distribution resembles a bell
and is symmetric around the midpoint;
ii. At the centre of distribution which is peaked,
mean median and mode coincide;
iii. The area under the curve between any two
points which correspond to the proportion of
observations between any two values of the
variate can be found out in terms of a
relationship between the mean and the
standard deviation;
iv. The parameters used are the mean (𝜇) and SD (𝜎)
• Standard Error Of Mean (SEM):
The square root of the variance of the sample
mean.
SE of sample mean = SD / √n
SE of sample proportion = √(pq / n), where q = 1 − p
• Applications of SEM:
i. To determine whether a sample is drawn from the
same population or not when its mean is known.
ii. To work out the limits of desired confidence within
which the population mean should lie.
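A minimal sketch of the two standard errors, SD/√n for a mean and √(p(1 − p)/n) for a proportion. The function names and the data are hypothetical.

```python
import math
import statistics

def se_mean(sample):
    """Standard error of the sample mean: SD / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def se_proportion(p, n):
    """Standard error of a sample proportion p observed in n subjects."""
    return math.sqrt(p * (1 - p) / n)

bp = [78, 80, 82, 82, 98]          # hypothetical readings, SD = 8
sem = se_mean(bp)                  # 8 / sqrt(5)
sep = se_proportion(0.5, 100)      # hypothetical: 50% responders among 100

print("SE of mean:", round(sem, 3))
print("SE of proportion:", round(sep, 3))
```

Both standard errors shrink as n grows, so larger samples estimate the population value more precisely.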
Confidence Interval Or Fiducial Limits:
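• Confidence limits are the two extremes of
measurements within which 95% of observations
(here, sample means) are expected to lie; the population
mean is taken to lie between them with 95% confidence.
Lower confidence limit = mean − (t0.05 × SEM)
Upper confidence limit = mean + (t0.05 × SEM)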
• The important difference between ‘p’ value and
confidence interval is confidence interval
represents clinical significance and ‘p’ value
indicates statistical significance.
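The confidence limit formulas can be sketched as below. The standard library has no t distribution, so the critical value t0.05 is passed in as read from a t table; the data and the function name are hypothetical.

```python
import math
import statistics

def confidence_limits(sample, t_crit):
    """Return (lower, upper) limits: mean -/+ t_crit * SEM."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - t_crit * sem, mean + t_crit * sem

bp = [78, 80, 82, 82, 98]   # hypothetical readings, n = 5
# Two-tailed t0.05 for n - 1 = 4 degrees of freedom, taken from a t table.
T_CRIT_DF4 = 2.776

lower, upper = confidence_limits(bp, T_CRIT_DF4)
print("95% confidence limits:", round(lower, 2), "to", round(upper, 2))
```

With a small sample the t value is well above 1.96, so the interval is wider than the large-sample approximation would give.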
Standard Normal Distribution
Mean +/- 1 SD encompasses 68% of observations
Mean +/- 2 SD encompasses 95% of observations
Mean +/- 3SD encompasses 99.7% of observations
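These percentages can be checked against the standard normal curve with `statistics.NormalDist`; a small sketch (the helper name `coverage` is hypothetical):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def coverage(k):
    """Proportion of observations within mean +/- k SD."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"mean +/- {k} SD: {coverage(k):.2%}")

# The multiplier that encloses exactly 95% of observations.
z_95 = std_normal.inv_cdf(0.975)
print("z for 95%:", round(z_95, 2))
```

The 68% and 95% figures are rounded: 2 SD encloses slightly more than 95%, and exactly 95% corresponds to about 1.96 SD.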
• Statistical hypotheses are hypotheses that are stated in such a way
that they may be evaluated by appropriate
statistical techniques.
• There are two types of hypotheses:
• Null hypothesis H0: It is the hypothesis which
assumes that there is no difference between two
values. H0: 𝜇1 = 𝜇2
• Alternative hypothesis HA : It is the hypothesis
that differs from null hypothesis.
• HA: 𝜇1 ≠ 𝜇2 or 𝜇1 > 𝜇2 or 𝜇1 < 𝜇2
Type I error:
• It is the probability of finding a difference when
no such difference actually exists, resulting in
acceptance of an inactive compound as an active one.
• It is also known as 𝛼 error / false positive.
Type II error:
• It is the probability of failing to detect a difference
when such a difference actually exists, thus
resulting in rejection of an active compound as an
inactive one.
• It is also called 𝛽 error / false negative.
Level of significance(l.o.s):
• The probability of committing type I error
• Denoted by 𝛼
• L.o.s of 0.05 (5%) means the risk of making a wrong
decision is only 5 out of 100 cases, i.e. 95%
confidence.
Power of the test:
• The probability of committing a type II error is
denoted by 𝛽; 1 − 𝛽 is the power of the test
• Power is probability of rejecting H0 when H0 is
false i.e correct decision.
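As an illustration of how 𝛼, 𝛽 and power relate, here is a sketch for one specific case: a one-tailed z test with known SD, where power = 1 − Φ(z(1−𝛼) − δ√n / 𝜎). The function name, the true difference δ and the other values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-tailed z test to detect a true
    difference delta, given population SD sigma and sample size n."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cut-off for rejecting H0
    shift = delta * sqrt(n) / sigma          # true difference in SE units
    return 1 - std_normal.cdf(z_crit - shift)

# Hypothetical: true difference 5 mmHg, SD 10 mmHg.
power_n25 = z_test_power(delta=5, sigma=10, n=25)
power_n50 = z_test_power(delta=5, sigma=10, n=50)
beta_n25 = 1 - power_n25                     # type II error probability

print("power (n=25):", round(power_n25, 3), "beta:", round(beta_n25, 3))
print("power (n=50):", round(power_n50, 3))
```

When there is no true difference (δ = 0) the rejection probability equals 𝛼, and for a fixed δ the power rises as the sample size increases.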
• The p-value is defined as the smallest
value of α for which the null hypothesis can
be rejected.
• If the p-value is less than α, we reject the
null hypothesis (p < α)
• If the p-value is greater than or equal to α, we do not
reject the null hypothesis (p ≥ α)
One tailed test:
• The rejection region is in one or the other tail of the distribution
• The difference could only be there in one
direction
• Ex. English men are taller than Indian men.
Two Tailed Test:
• The rejection region is split between the two sides or tails of
the distribution
• The difference could be in either direction
• Ex. Comparative study of drug ‘X’ with atenolol
for antihypertensive property
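A sketch of how the number of tails changes the p-value for the same test statistic, using the normal curve. The z value is hypothetical, and the one-tailed case assumes the observed difference lies in the hypothesised direction.

```python
from statistics import NormalDist

def p_value(z, tails=2):
    """p-value for a z statistic; tails is 1 or 2."""
    upper_tail = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return tails * upper_tail

def decide(p, alpha=0.05):
    """Apply the rule: reject H0 only when p < alpha."""
    return "reject H0" if p < alpha else "do not reject H0"

z = 1.8  # hypothetical test statistic
p_one = p_value(z, tails=1)
p_two = p_value(z, tails=2)

print("one-tailed p:", round(p_one, 4), "->", decide(p_one))
print("two-tailed p:", round(p_two, 4), "->", decide(p_two))
```

The same statistic is significant at 𝛼 = 0.05 in the one-tailed test but not in the two-tailed test, which is why the choice of tails should be made before looking at the data.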
• Large Sample: sample size is more than 30
• Small Sample: sample size is less than or equal
to 30
• Many statistical tests are based upon the
assumption that the data are sampled from a
normal (Gaussian) distribution.
• Procedures for testing hypotheses about
parameters in a population described by a
specified distributional form, (normal distribution)
are called parametric tests.
Types of Parametric tests
1. Large sample tests
2. Small sample tests
* Independent/ unpaired t-test
* Paired t-test
ANOVA (Analysis of variance)
* One way ANOVA
* Two way ANOVA
• A z-test is used for testing the mean of a
population versus a standard, or comparing
the means of two populations, with large (n
≥ 30) samples whether you know the
population standard deviation or not.
• It is also used for testing the proportion of some
characteristic versus a standard proportion, or
comparing the proportions of two populations.
Ex. Comparing the average engineering
salaries of men versus women.
Ex. Comparing the fraction defectives from two
production lines.
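A minimal sketch of a two-sample z test on means, working from summary statistics. The function name and all numbers are hypothetical; the sample SDs stand in for the population SDs, which is the usual large-sample approximation.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic and two-tailed p-value for the difference of two means."""
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # SE of the difference
    z = (mean1 - mean2) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical summary data for two large groups (n = 50 each).
z, p = two_sample_z(mean1=130, sd1=10, n1=50, mean2=126, sd2=10, n2=50)
print("z =", round(z, 3), "p =", round(p, 4))
```

Here z is 2.0, just beyond the 1.96 cut-off, so the difference is significant at the 5% level in a two-tailed test.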
T- test: Derived by W S Gosset in 1908.
• Properties of t distribution:
i. It has mean 0
ii. It has variance greater than one
iii. It is a bell-shaped distribution, symmetrical about
the mean
• Assumption for t test:
i. Sample must be random, observations
independent
ii. Standard deviation is not known
iii. Normal distribution of population
Uses of t test (to test the significance of):
i. The mean of a sample
ii. The difference between two means, i.e. to compare two
samples
iii. A correlation coefficient
Types of t test:
a. Paired t test
b. Unpaired t test
Paired t test:
• Consists of a sample of matched pairs of
similar units, or one group of units that has been
tested twice (a "repeated measures" t-test).
• Ex. where subjects are tested prior to a
treatment, say for high blood pressure, and the
same subjects are tested again after treatment
with a blood-pressure lowering medication.
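A sketch of the paired t statistic, which is the mean of the within-subject differences divided by its standard error, with n − 1 degrees of freedom. The before/after readings and the function name are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched readings."""
    diffs = [b - a for b, a in zip(before, after)]  # one difference per subject
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1

# Hypothetical systolic BP (mmHg) in the same 5 subjects.
before = [150, 160, 155, 165, 170]
after = [140, 152, 150, 155, 158]

t, df = paired_t(before, after)
print("t =", round(t, 3), "df =", df)
```

The statistic is compared with the tabulated t0.05 for 4 degrees of freedom (2.776); a larger calculated t indicates a significant fall in blood pressure.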
Unpaired t test:
• When two separate sets of independent and
identically distributed samples are obtained, one
from each of the two populations being
compared.
• Ex: 1. compare the height of girls and boys.
2. compare 2 stress reduction interventions
when one group practiced mindfulness
meditation while the other learned progressive
muscle relaxation.
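A sketch of the unpaired (Student's) t statistic with a pooled variance, which assumes the two groups have equal variances. The scores and the function name are hypothetical.

```python
import math
import statistics

def unpaired_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two
    independent groups (assumes equal population variances)."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df

# Hypothetical stress scores in two separate groups of 5 subjects.
group_a = [10, 12, 14, 16, 18]
group_b = [8, 10, 12, 14, 16]

t, df = unpaired_t(group_a, group_b)
print("t =", round(t, 3), "df =", df)
```

Here t is 1.0 with 8 degrees of freedom, below the tabulated t0.05 of 2.306, so the difference between the group means is not significant.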
ANALYSIS OF VARIANCE(ANOVA):
• Analysis of variance (ANOVA) is a collection
of statistical models, and their associated procedures
(such as partitioning "variation" among and between groups),
used to analyze the differences between group means.
• Compares multiple groups at one time
• Developed by R.A. Fisher.
• Two types: i. One way ANOVA
ii. Two way ANOVA
One Way ANOVA:
It compares three or more unmatched groups
when data are categorized in one way
Ex:
1. Compare control group with three different
doses of aspirin in rats
2. Effect of supplementation of vit C in each
subject before, during and after the treatment.
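A sketch of the one-way ANOVA F ratio, which divides the variation between group means by the variation within groups. The three groups and the function name are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical responses in three unmatched groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova_f(groups)
print("F =", round(f_ratio, 3), "df =", (df_b, df_w))
```

A large F means the group means differ by more than the scatter within groups would explain; the value is compared with the tabulated F for the two degrees of freedom.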
Two way ANOVA:
• Used to determine the effect of two nominal
predictor variables on a continuous outcome
variable.
• A two-way ANOVA test analyzes the effect of the
independent variables on the expected outcome
along with their relationship to the outcome itself.
Difference between one & two way
ANOVA
• An example of when a one-way ANOVA could be
used is if we want to determine if there is a
difference in the mean height of stalks of three
different types of seeds. Since there are more than
two means, we can use a one-way ANOVA, since
there is only one factor that could be making the
heights different.
• Now, if we take these three different types of
seeds, and then add the possibility that three
different types of fertilizer are used, then we would
want to use a two-way ANOVA.
• The mean height of the stalks could be different
for a combination of several reasons:
• The types of seed could cause the change,
the types of fertilizer could cause the change,
and/or there is an interaction between the type of
seed and the type of fertilizer.
• There are two factors here (type of seed and type
of fertilizer), so, if the assumptions hold, then we
can use a two-way ANOVA.
Summary of parametric tests applied for
different types of data
Sl no   Type of group                                   Parametric test
1.      Comparison of two paired groups                 Paired ‘t’ test
2.      Comparison of two unpaired groups               Unpaired ‘t’ test
3.      Comparison of three or more matched groups      Two way ANOVA
4.      Comparison of three or more unmatched groups    One way ANOVA
5.      Correlation between two variables               Pearson correlation