2. Theoretical work on “t distribution” was done by William
Sealy Gosset in 1980. He has published his findings under
the pen name “Student” because he was working with
“Guinness Son & Company, Dublin Brewery, Ireland” and
company didn’t permit employees to publish research
findings under their own names. That’s why it is called
Student’s t Test.
Student’s t Test
3. A t-test is an inferential statistic used to determine if
there is a statistically significant difference between
the means of two variables.
The t-test is a test used for hypothesis testing in
statistics.
Calculating a t-test requires three fundamental data
values including the difference between the mean
values from each data set, the standard deviation of
each group, and the number of data values.
T-tests can be dependent or independent.
Key Takeaways
4. Uses of t-test / Application
Student’s t-test is used when sample size is less than 30
Population standard deviation (𝜎) is unknown.
Compare the means of two groups on two samples.
When parameter of population are normal.
Correlation of coefficient in population is Zero.
5. Type of t-test
Investigate whether there’s a difference
within a group between two points in time
(within-subjects).
6. Paired t-test
If the groups come from a single population (e.g.,
measuring before and after an experimental treatment),
perform a paired t-test. This is a within-subjects design.
Type of t-test Cont..
8. Independent t-test
If the groups come from two different populations (e.g.,
two different species, or people from two separate
cities), perform a two-sample t-test (independent t-test).
This is a between-subjects design.
Type of t-test Cont..
9. Investigate whether there's a difference
between a group and a standard value or
Whether a subgroup belongs to a
population.
Type of t-test Cont..
10. One-sample t-test
If there is one group being compared against a standard
value (e.g., comparing the acidity of a liquid to a neutral
pH of 7), perform a one-sample t-test.
Type of t-test Cont..
11. If you only care whether the two populations are
different from one another, perform a two-tailed t-test.
If you want to know whether one population mean is
greater than or less than the other, perform a one-
tailed t-test.
One-tailed or two-tailed t-test
12. The following flowchart can be used to determine which
t-test to use based on the characteristics of the sample
sets. The key items to consider include the similarity of
the sample records, the number of data records in each
sample set, and the variance of each sample set.
Which t-test to Use
13.
14. Calculating a t-test requires three fundamental data values.
They include the difference between the mean values from
each data set, or the mean difference, the standard deviation
of each group, and the number of data values of each
group.
where x
̄ = mean of sample
μ = mean of population
n = sample size
s = standard deviation of sample
t-test Formula
t = 𝒙
̅−µ
𝒔
𝒏
15. The t-score
The t score is a ratio between the difference between
two groups and the difference within the groups.
Larger t scores = more difference between groups.
Smaller t score = more similarity between groups.
16. The p value is a number, calculated from a statistical
test, that describes how likely you are to have found a
particular set of observations if the null hypothesis
were true.
P values are used in hypothesis testing to help decide
whether to reject the null hypothesis. The smaller the
p value, the more likely you are to reject the null
hypothesis.
Understanding P-values
17. Degrees of Freedom
Degrees of freedom are the maximum number of logically
independent values, which may vary in a data sample.
Degrees of freedom are calculated by subtracting one from
the number of items within the data sample.
Formula:
where:
Df = Degrees of freedom
N = Sample size
Df = N −𝟏
18. The T-Distribution Table is available in one-tail and two-
tails formats. The former is used for assessing cases that
have a fixed value or range with a clear direction, either
positive or negative. For instance, what is the probability
of the output value remaining below -3, or getting more
than seven when rolling a pair of dice? The latter is used
for range-bound analysis, such as asking if the coordinates
fall between -2 and +2.
How is the t-distribution table used