T‑tests

© 2015 Journal of the Practice of Cardiovascular Sciences | Published by Wolters Kluwer - Medknow 185
Statistical Pages
Commonly Used t‑tests in Medical Research
R. M. Pandey
Department of Biostatistics, All India Institute of Medical Sciences, New Delhi, India
Introduction
To draw some conclusion about a population parameter
(true result of any phenomena in the population) using the
information contained in a sample, two approaches of statistical
inference are used, that is, confidence interval (range of results
likely to be obtained, usually, 95% of the times) and hypothesis
testing, to find how often the observed finding could be due
to chance alone, reported by P value which is the probability
of obtaining the result as extreme as observed under null
hypothesis. Statistical tests used for hypothesis testing are
broadly classified into two groups, that is, parametric tests
and nonparametric tests. In parametric tests, some assumption
is made about the distribution of population from which
the sample is drawn. In all parametric tests, the distribution
of quantitative variable in the population is assumed to be
normally distributed. As one does not have access to the
population values to say normal or nonnormal, assumption of
normality is made based on the sample values. Nonparametric
statistical methods are also known as distribution‑free methods
or methods based on ranks where no assumptions are made
about the distribution of variable in the population.
The family of t‑tests falls in the category of parametric statistical
tests where the mean value(s) is (are) compared against a
hypothesized value. In hypothesis testing of any statistic
(summary), for example, mean or proportion, the hypothesized
value of the statistic is specified while the population variance
is not specified, in such a situation, available information is
only about variability in the sample. Therefore, to compute
the standard error (measure of variability of the statistic of
interest which is always in the denominator of the test statistic),
it is considered reasonable to use sample standard deviation.
William Sealy Gosset, a chemist working for a brewery in
Dublin Ireland introduced the t‑statistic. As per the company
policy, chemists were not allowed to publish their findings, so
Gosset published his mathematical work under the pseudonym
“Student,” his pen name. The Student’s t‑test was published
in the journal Biometrika in 1908.[1,2]
In medical research, various t‑tests and Chi‑square tests are
the two types of statistical tests most commonly used. In
any statistical hypothesis testing situation, if the test statistic
follows a Student’s t‑test distribution under null hypothesis,
it is a t‑test. Most frequently used t‑tests are: For comparison
of mean in single sample; two samples related; two samples
unrelated tests; and testing of correlation coefficient and
regression coefficient against a hypothesized value which is
usually zero. In one‑sample location test, it is tested whether
or not the mean of the population has a value as specified in
a null hypothesis; in two independent sample location test,
Address for correspondence: Dr. R. M. Pandey,
Department of Biostatistics, All India Institute of Medical Sciences,
New Delhi, India.
E‑Mail: rmpandey@yahoo.com
Access this article online
Quick Response Code:
Website:
www.j‑pcs.org
DOI:
10.4103/2395-5414.166321
Student’s t-test is a method of testing hypotheses about the mean of a small sample drawn from a normally distributed population when
the population standard deviation is unknown. In 1908 William Sealy Gosset, an Englishman publishing under the pseudonym Student,
developed the t-test. This article discusses the types of T test and shows a simple way of doing a T test.
Ker words: Student's T test, method, William Gosset
Abstract
How to cite this article: Pandey RM. Commonly used t-tests in medical
research. J Pract Cardiovasc Sci 2015;1:185-8.
This is an open access article distributed under the terms of the Creative Commons
Attribution‑NonCommercial‑ShareAlike 3.0 License, which allows others to remix,
tweak, and build upon the work non‑commercially, as long as the author is credited and
the new creations are licensed under the identical terms.
For reprints contact: reprints@medknow.com
[Downloaded free from http://www.j-pcs.org on Sunday, December 27, 2015, IP: 117.195.155.159]

Pandey: The T Test
Journal of the Practice of Cardiovascular Sciences ¦ May‑August 2015 ¦ Volume 1 ¦ Issue 2186
equality of means of two populations is tested; to compare
the mean delta (difference between two related samples)
against hypothesized value of zero in a null hypothesis,
also known as paired t‑test or repeated‑measures t‑test; and,
to test whether or not the slope of a regression line differs
significantly from zero. For a binary variable (such as cure,
relapse, hypertension, diabetes, etc.,) which is either yes
or no for a subject, if we take 1 for yes and 0 for no and
consider this as a score attached to each study subject then
the sample proportion (p) and the sample mean would be the
same. Therefore, the approach of t‑test for mean can be used
for proportion as well.
The focus here is on describing a situation where a
particular t‑test would be used. This would be divided
into t‑tests used for testing: (a) Mean/proportion in one
sample, (b) mean/proportion in two unrelated samples,
(c) mean/proportion in two related samples, (d) correlation
coefficient, and (e) regression coefficient. The process
of hypothesis testing is same for any statistical test:
Formulation of null and alternate hypothesis; identification
and computation of test statistics based on sample values;
deciding of alpha level, one‑tailed or two‑tailed test;
rejection or acceptance of null hypothesis by comparing
the computed test statistic with the theoretical value of “t”
from the t‑distribution table corresponding to given degrees
of freedom. In hypothesis testing, P value is reported as
P < 0.05. However, in significance testing, the exact P value
is reported so that the reader is in a better position to judge
the level of statistical significance.
• t‑test for one sample: For example, in a random sample
of 30 hypertensive males, the observed mean body mass
index (BMI) is 27.0 kg/m2
and the standard deviation
is 4.0. Also, suppose it is known that the mean BMI in
nonhypertensive males is 25 kg/m2
. If the question is to
know whether or not these 30 observations could have
come from a population with a mean of 25 kg/m2
. To
determine this, one sample t‑test is used with the null
hypothesis H0: Mean = 25, against alternate hypothesis
of H1: Mean ≠ 25. Since the standard deviation of the
hypothesized population is not known, therefore, t‑test
would be appropriate; otherwise, Z‑test would have been
used
• t‑test for two related samples: Two samples can be
regarded as related in a pre‑ and post‑design (self‑pairing)
or in two groups where the subjects have been matched
on a third factor a known confounder (artificial pairing).
In a pre‑ and post–design, each subject is used as his or
her own control. For example, an investigator wants to
assess effect of an intervention in reducing systolic blood
pressure (SBP) in a pre‑ and post‑design. Here, for each
patient, there would be two observations of SBP, that is,
before and after. Here instead of individual observations,
difference between pairs of observations would be of
interest and the problem reduces to one‑sample situation
where the null hypothesis would be to test the mean
difference in SBP equal to zero against the alternate
hypothesis of mean SBP being not equal to zero. The
underlying assumption for using paired t‑test is that
under the null hypothesis the population of difference
in normally distributed and this can be judged using
the sample values. Using the mean difference and the
standard error of the mean difference, 95% confidence
interval can be computed. The other situation of the two
sample being related is the two group matched design.
For example, in a case–control study to assess association
between smoking and hypertension, both hypertensive
and nonhypertensive are matched on some third factor,
say obesity, in a pair‑wise manner. Same approach of
paired analysis would be used. In this situation, cases and
controls are different subjects. However, they are related
by the factor
• t‑test for two independent samples: To test the null
hypothesis that the means of two populations are equal;
Student’s t‑test is used provided the variances of the
two populations are equal and the two samples are
assumed to be random sample. When this assumption
of equality of variance is not fulfilled, the form of
the test used is a modified t‑test. These tests are
also known as two‑sample independent t‑tests with
equal variance or unequal variance, respectively.
The only difference in the two statistical tests lies in
the denominator, that is, in determining the pooled
variance. Prior to choosing t‑test for equal or unequal
variance, very often a test of variance is carried out
to compare the two variances. It is recommended
that this should be avoided.[3]
Using a modified t‑test
even in a situation when the variances are equal,
has high power, therefore, to compare the means in
the two unrelated groups, using a modified t‑test is
sufficient.[4]
When there are more than two groups, use
of multiple t‑test (for each pair of groups) is incorrect
because it may give false‑positive result, hence, in such
situations, one‑way analysis of variance (ANOVA),
followed by correction in P value for multiple
comparisons (post‑hoc ANOVA), if required, is used
to test the equality of more than two means as the null
hypothesis, ensuring that the total P value of all the
pair‑wise does not exceed 0.05
• t‑test for correlation coefficient: To quantify the strength of
relationship between two quantitative variables, correlation
coefficient is used. When both the variables follow normal
distribution, Pearson’s correlation coefficient is computed;
and when one or both of the variables are nonnormal
or ordinal, Spearman’s rank correlation coefficient
(based on ranks) are used. For both these measures, in the
case of no linear correlation, null value is zero and under
null hypothesis, the test statistic follows t‑distribution
and therefore, t‑test is used to find out whether or not
the Pearson’s/Spearman’s rank correlation coefficient is
significantly different from zero
• Regression coefficient: Regression methods are used to

Pandey: The T Test
Journal of the Practice of Cardiovascular Sciences ¦ May‑August 2015 ¦ Volume 1 ¦ Issue 2 187
model a relationship between a factor and its potential
predictors. Type of regression method to be used depends
on the type of dependent/outcome/effect variable. Three
most commonly used regression methods are multiple linear
regression, multiple logistic regression, and Cox regression.
The form of the dependent variable in these three methods is
quantitative, categorical, and time to an event, respectively.
A multiple linear regression would be of the form
Y = a + b1X1 + b2X2 + ..., where Y is the outcome and X’s
are the potential covariates. In logistic and Cox regression,
the equation is nonlinear and using transformation the
equation is converted into linear equation because it is easy
to obtain unknowns in the linear equation using sample
observations. The computed values of a and b vary from
sample to sample. Therefore, to test the null hypothesis
that there is no relationship between X and Y, t‑test, which
is the coefficient divided by its standard error, is used to
determine the P value. This is also commonly referred to
as Wald t‑test and using the numerator and denominator of
the Wald t‑statistic, 95% confidence interval is computed as
coefficient ± 1.96 (standard error of the coefficient).
The above is an illustration of the most common situations
where t‑test is used. With availability of software,
computation is not the issue anymore. Any software where
basic statistical methods are provided will have these tests.
All one needs to do is to identify the t‑test to be used in a
given situation, arrange the data in the manner required by
the particular software, and use mouse to perform the test
and report the following: Number of observations, summary
statistic, P value, and the 95% confidence interval of summary
statistic of interest.
Using an Online Calculator to Compute
T‑statistics
In addition to the statistical software, you can also use
online calculators for calculating the t‑statistics, P values,
95% confidence interval, etc., Various online calculators are
available over the World Wide Web. However, for explaining
how to use these calculators, a brief description is given below.
A link to one of the online calculator available over the internet
is http://www.graphpad.com/quickcalcs/.
• Step 1: The first screen that will appear by typing this URL
in address bar will be somewhat as shown in Figure 1.
• Step 2: Check on the continuous data option as shown in
Figure 1 and press continue
• Step 3: On pressing the continue tab, you will be guided
to another screen as shown in Figure 2.
• Step 4: For calculating the one‑sample t‑statistic,
click on the one‑sample t‑test. Compare observed and
expected means option as shown in Figure 2 and press
continue. For comparing the two means as usually done
in the paired t‑test for related samples and two‑sample
independent t‑test, click on the t‑test to compare two
means option.
• Step 5:After pressing the continue tab, you will be guided to
another screen as shown in Figure 3. Choose the data entry
format, like for the BMI and hypertensive males’ example
givenfortheone‑samplet‑test,wehaven,mean,andstandard
deviation of the sample that has to be compared with the
hypothetical mean value of 25 kg/m2
. Enter the values in the
calculator and set the hypothetical value to 25 and then press
the calculate now tab. Refer to Figure 3 for details
• Step 6: On pressing the calculate now tab, you will be
Figure 1: First step.
Figure 2: Second step.
Figure 3: Third step.

Pandey: The T Test
Journal of the Practice of Cardiovascular Sciences ¦ May‑August 2015 ¦ Volume 1 ¦ Issue 2188
guided to next screen as shown in Figure 4, which will
give you the results of your one‑sample t‑test. It can be
seen from the results given in Figure 4 that the P value for
our one‑sample t‑test is 0.0104. 95% confidence interval
is 0.51–3.49 and one‑sample t‑statistics is 2.7386.
Similarly online t‑test calculators can be used to calculate the
paired t‑test (t‑test for two related samples) and t‑test for two
independent samples.You just need to look that in what format
you are having the data and a basic knowledge of in which
condition which test has to be applied and what is the correct
form for entering the data in the calculator.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References
1. Mankiewicz R. The Story of Mathematics (Paperback ed.). Princeton,
NJ: Princeton University Press; 2004. p. 158.
2. Fisher Box J. Guinness, Gosset, Fisher, and small samples. Stat Sci
1987;2:45‑52.
3. Markowski CA, Markowski EP. Conditions for the effectiveness of a
preliminary test of variance. Am Stat 1990;44:322‑6.
4. Moser BK, Stevens GR. Homogeneity of variance in the two sample
t‑test. Am Stat 1992;46:19‑21.Figure 4: Results.

T‑tests

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to T‑tests

Similar to T‑tests (20)

More from Ramachandra Barik

More from Ramachandra Barik (20)

Recently uploaded

Recently uploaded (20)

T‑tests