Parametric test

Sampling and Hypothesis Testing (I)
in MATLAB
Kajal Rai
kajalrai.pu@gmail.com

Contents:
• Sampling
• Hypothesis Test
• Types of parametric test
• One sample t-test
• Paired t-test
• Tailed t-test
• Two sample t-test
• z-test
• F-test
• Difference between t-test, z-test and F-test

Sampling:
Sampling is the technique used to select the items for a
sample from the population.
Simple Random Sampling: every unit of the population has
an equal chance of being selected in the sample.
It can be done with or without replacement.
If done with replacement, each item has a probability of
1/N of being drawn at each selection.
If done without replacement, any given item has a
probability of 1/N of being drawn first, any remaining item
has a probability of 1/(N-1) of being drawn second, and so on.

Random Sampling in MATLAB
y = randsample(n,k) returns a vector of k values
sampled uniformly at random, without
replacement, from the integers 1 to n.
y = randsample(population,k) returns a vector
of k values sampled uniformly at random,
without replacement, from the values in the
vector population.

Random Sampling in MATLAB contd.
y = randsample(n,k,replacement) or
y = randsample(population,k,replacement)
returns a sample taken with replacement
if replacement is true, or without replacement
if replacement is false. The default is false.
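The calls above can be sketched as follows. The seed, the sample sizes and the vector named population are illustrative assumptions, not values from the slides.

```matlab
rng(1);                               % fix the seed so the sketch is repeatable (assumed seed)

y1 = randsample(10, 4);               % 4 distinct integers from 1..10, without replacement

population = [12 15 11 16 14 18 20];  % hypothetical population vector
y2 = randsample(population, 3);       % 3 values drawn from the vector, without replacement

y3 = randsample(10, 15, true);        % 15 draws from 1..10 with replacement; repeats are allowed
```

Drawing more values than the population holds (k > n) only works with replacement, which is why y3 passes true.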

Random Sampling in MATLAB contd.
y = randsample(n,k,true,w) or
y = randsample(population,k,true,w) returns a
weighted sample taken with replacement, using a
vector of positive weights w, whose length is n.
The probability that the integer i is selected for
an entry of y is w(i)/sum(w). Usually w is a
vector of probabilities, but the weights need not sum to 1.
randsample does not support weighted sampling
without replacement.

Generate a random sequence of the characters A, C, G,
and T, with replacement, according to the specified
probabilities.
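A minimal sketch of this example is shown here. The slides do not give the sequence length or the probabilities, so the length 48 and the weights below are assumed values chosen only for illustration.

```matlab
% Assumed weights for A, C, G and T (they sum to 1, but randsample only needs w(i)/sum(w)).
w = [0.15 0.35 0.35 0.15];

% Draw 48 characters from 'ACGT' with replacement, using the weights above.
R = randsample('ACGT', 48, true, w);
disp(R);
</gr_insert_placeholder>
```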

Hypothesis Tests:
A hypothesis test is a procedure for determining
whether an assertion about a characteristic of a
population is supported by sample data.
In hypothesis testing, the goal is to see if there is
sufficient statistical evidence to reject a
presumed null hypothesis in favor of
the alternative hypothesis[1].
The null hypothesis is usually denoted H0 while
the alternative hypothesis is usually denoted H1.

Types of parametric test:
• One sample t-test: Used when we want to know whether our sample comes
from a population with a particular mean but we do not have full population
information available to us. Used when we don't know the population
variance.
• Paired t-test: A paired t-test looks at the differences between paired values
in two samples, takes into account the variation of those differences,
and produces a single number known as a t-value.
• Two sample t-test: Used to compare responses from two independent
groups. These two groups can come from different experimental treatments,
or different "populations".
• z-test: An appropriate parametric statistical procedure when there is
one sample that is being compared to a population with a known mean and
standard deviation.
• F-test: The F-test is designed to test whether two population variances are equal.

One sample t-test:
[h,p,ci,stats] = ttest(X,M) performs a t-test of the hypothesis that
the data in X come from a distribution with mean M.
h is 1 if the null hypothesis is rejected at the 5% significance level
and 0 otherwise; p is the p-value of the test.
ci returns a 100*(1-ALPHA)% confidence interval for the true
mean of X.
stats returns a structure with the following fields:
'tstat' -- the value of the test statistic
'df' -- the degrees of freedom of the test
'sd' -- the estimated population standard deviation.

One sample t-test example:
• Ex: Specimens of copper wire drawn from a
large lot have the following breaking strengths (in
kg. weight):
• 578, 572, 570, 568, 572, 578, 570, 572, 596, 544
• Test (using Student's t-statistic) whether the
mean breaking strength of the lot may be taken
to be 578 kg. weight (test at the 5 per cent level of
significance).
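The copper wire example can be sketched as below, using the default two-tailed test at the 5% level. The variable name x is an illustrative choice.

```matlab
% Breaking strengths of the 10 copper wire specimens (kg. weight).
x = [578 572 570 568 572 578 570 572 596 544];

% H0: the mean breaking strength of the lot is 578.
[h, p, ci, stats] = ttest(x, 578);

fprintf('h = %d, p = %.4f, t = %.4f, df = %d\n', h, p, stats.tstat, stats.df);
```

By hand, the sample mean is 572 and the sample standard deviation is about 12.72, so t is about -1.49 with 9 degrees of freedom. Since |t| is below the two-tailed critical value 2.262, h is 0 and the mean of 578 cannot be rejected.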

t-test with own significance level:
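[h,p,ci,stats] = ttest(...,ALPHA) performs the test at the (100*ALPHA)%
significance level. ALPHA must be a scalar between 0 and 1. Recent MATLAB
releases document this option as the name-value pair ttest(...,'Alpha',ALPHA).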
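As a sketch, the copper wire data can be retested at the 1% level. The 'Alpha' name-value form is used here, and the 1% level is an assumed value for illustration.

```matlab
x = [578 572 570 568 572 578 570 572 596 544];

[h05, p05, ci05] = ttest(x, 578);                 % default: 5% level, 95% confidence interval
[h01, p01, ci01] = ttest(x, 578, 'Alpha', 0.01);  % 1% level, 99% confidence interval

% The p-value does not depend on alpha; only the decision threshold and the interval change.
fprintf('95%% CI: [%.2f, %.2f]\n', ci05(1), ci05(2));
fprintf('99%% CI: [%.2f, %.2f]\n', ci01(1), ci01(2));
```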

Paired t-test:
A paired t-test looks at the difference between
paired values in two samples, takes into
account the variation of values within each
sample, and produces a single number known
as a t-value.

Paired t-test in MATLAB:
h = ttest(X,Y) performs a paired t-test of the
hypothesis that two matched samples, in the
vectors X and Y, come from distributions with
equal means. The difference X-Y is assumed to
come from a normal distribution with unknown
variance.
X and Y must have the same length.

Example: Paired t-test
• The memory capacity of 9 students was tested
before and after training. State at the 5 percent
level of significance whether the training was
effective from the following scores:
• Before: 10, 15, 9, 3, 7, 12, 16, 17, 4
• After: 12, 17, 8, 5, 6, 11, 18, 20, 3
• Take the score before training as X and the
score after training as Y, and take as the
null hypothesis that the mean of the differences is
zero.
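A sketch of this example follows. It assumes a left-tailed alternative (mean of X-Y below zero), since "effective" training should raise the scores after training; the slides do not state which tail was used.

```matlab
before = [10 15 9 3 7 12 16 17 4];   % X: scores before training
after  = [12 17 8 5 6 11 18 20 3];   % Y: scores after training

% H0: mean(X - Y) = 0, H1: mean(X - Y) < 0 (assumed one-tailed alternative).
[h, p, ci, stats] = ttest(before, after, 'Tail', 'left');

fprintf('h = %d, p = %.4f, t = %.4f, df = %d\n', h, p, stats.tstat, stats.df);
```

By hand, the mean difference is -7/9 and t is about -1.36 with 8 degrees of freedom, which does not fall below the one-tailed critical value of -1.860, so h is 0.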

We fail to reject H0 and conclude that the difference in score before and after training is not
significant, i.e., it can be attributed to sampling fluctuations. Hence there is no evidence at the
5 percent level that the training was effective.

Tailed t-test:
A one- or two-tailed t-test is determined by
whether the total area of α is placed in one tail or
divided equally between the two tails.
The one-tailed t-test is performed if the results
are interesting only if they turn out in a particular
direction.
The two-tailed t-test is performed if the results
would be interesting in either direction.

One-Tailed t-Test:
There are two different one-tailed t-tests, one for each
tail.
In a one-tailed t-test, all the area associated with α is
placed in either one tail or the other. Selection of the tail
depends upon which direction t would be (+ or -) if the
results of the experiment came out as expected.
The selection of the tail must be made before the
experiment is conducted and analyzed.
Example: a test to see whether one mean is higher than the other.

One-tailed t-test in the positive direction
The value tcrit would be positive. For example when α is set to .05
with ten degrees of freedom (df=10), tcrit would be equal to
+1.812.

One-tailed t-test in the negative direction
The value tcrit would be negative. For example, when α is set to .05
with ten degrees of freedom (df=10), tcrit would be equal to -1.812.

Two-Tailed t-Test:
A two-tailed t-test divides α in half, placing half in each tail. The null hypothesis in
this case is a particular value, and the alternative covers both directions, one positive and
one negative. The critical value of t, tcrit, is written with both a plus and minus sign
(±). For example, the critical value of t when there are ten degrees of freedom (df=10)
and α is set to .05 is tcrit = ±2.228.
We would use a two-tailed test to see whether two means are different from each other (i.e.,
from different populations) or are from the same population.
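The critical values quoted above can be reproduced with tinv, the inverse of the Student's t cumulative distribution function in the Statistics Toolbox. This is a small sketch; the variable names are illustrative.

```matlab
alpha = 0.05;   % significance level
df    = 10;     % degrees of freedom

tcritRight = tinv(1 - alpha, df);      % one-tailed, positive direction
tcritLeft  = tinv(alpha, df);          % one-tailed, negative direction
tcritTwo   = tinv(1 - alpha/2, df);    % two-tailed: alpha/2 in each tail, use +/- this value

fprintf('one-tailed: %+.3f and %+.3f, two-tailed: +/-%.3f\n', tcritRight, tcritLeft, tcritTwo);
```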

Tailed t-test in MATLAB
h = ttest(...,TAIL) performs the test against
the alternative hypothesis specified by TAIL:
'both' -- "mean is not M" (two-tailed test)
'right' -- "mean is greater than M" (right-tailed
test)
'left' -- "mean is less than M" (left-tailed test)
Recent MATLAB releases document this option as the
name-value pair ttest(...,'Tail',TAIL).
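To illustrate how the tail option changes the p-value, the sketch below runs all three alternatives on the copper wire data from the one-sample example, using the 'Tail' name-value form.

```matlab
x = [578 572 570 568 572 578 570 572 596 544];
M = 578;

[hBoth,  pBoth]  = ttest(x, M, 'Tail', 'both');   % H1: mean is not M
[hRight, pRight] = ttest(x, M, 'Tail', 'right');  % H1: mean is greater than M
[hLeft,  pLeft]  = ttest(x, M, 'Tail', 'left');   % H1: mean is less than M

fprintf('both: %.4f, right: %.4f, left: %.4f\n', pBoth, pRight, pLeft);
```

The sample mean (572) lies below M, so the left-tailed p-value is the smallest. The two one-tailed p-values sum to 1, and the two-tailed p-value is twice the smaller of them.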

Two sample t-test
h = ttest2(X,Y) performs a t-test of the hypothesis
that two independent samples, in the vectors X and Y,
come from distributions with equal means, and returns
the result of the test in h.
h=0 indicates that the null hypothesis ("means are
equal") cannot be rejected at the 5% significance level.
h=1 indicates that the null hypothesis can be rejected at
the 5% level.
The data are assumed to come from normal
distributions with unknown, but equal, variances.
X and Y can have different lengths.

Example:
• A group of seven-week-old chickens reared on a high protein
diet weigh 12, 15, 11, 16, 14, 14, and 16. A second group of
five chickens, similarly treated except that they receive a low
protein diet, weigh 8, 10, 14, 10 and 13. Test at the 5 percent
level whether there is significant evidence that additional
protein has increased the weight of the chickens. (For hand
calculation, use assumed mean = 10 for the sample of 7 and
assumed mean = 8 for the sample of 5 chickens.)
• Take the null hypothesis that additional protein has not
increased the weight of the chickens.
Two sample t-test in MATLAB
We reject H0 and conclude that additional protein has increased the weight of the chickens, at the 5
per cent level of significance.

Two sample t-test in MATLAB cntd…
h = ttest2(X,Y,ALPHA,TAIL,VARTYPE) allows
you to specify the type of test. When VARTYPE is
'equal', ttest2 performs the default test assuming
equal variances.
When VARTYPE is 'unequal', ttest2 performs the
test assuming that the two samples come from normal
distributions with unknown and unequal variances.
This is known as the Behrens-Fisher problem.
Recent MATLAB releases document these options as the
name-value pairs 'Alpha', 'Tail' and 'Vartype'.
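As a sketch, the chicken data from the two-sample example can be rerun without the equal-variance assumption. Reusing that data here is an illustrative choice, not something the slides do.

```matlab
highProtein = [12 15 11 16 14 14 16];
lowProtein  = [8 10 14 10 13];

% Default test: pooled variance, df = n1 + n2 - 2.
[hEq, pEq, ~, statsEq] = ttest2(highProtein, lowProtein, 'Tail', 'right');

% Unequal variances: separate variance estimates and approximate (non-integer) df.
[hUn, pUn, ~, statsUn] = ttest2(highProtein, lowProtein, ...
    'Tail', 'right', 'Vartype', 'unequal');

fprintf('equal:   t = %.4f, df = %.2f, p = %.4f\n', statsEq.tstat, statsEq.df, pEq);
fprintf('unequal: t = %.4f, df = %.2f, p = %.4f\n', statsUn.tstat, statsUn.df, pUn);
```

With 'unequal' the degrees of freedom come from an approximation and are smaller than the pooled value of 10, so the test is somewhat more conservative.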

z-test in MATLAB
A z-test is used for testing the mean of a population or
comparing the means of two populations, with large (n ≥ 30)
samples when we know the population standard deviation.
h = ztest(X,M,SIGMA) performs a z-test of the hypothesis
that the data in the vector X come from a distribution with mean
M, and returns the result of the test in h.
h=0 indicates that the null hypothesis ("mean is M") cannot be
rejected at the 5% significance level. h=1 indicates that the null
hypothesis can be rejected at the 5% level.
The data are assumed to come from a normal distribution with
known standard deviation SIGMA.

Example:
• A dog food manufacturer has created a new Super Vitamin
Enriched Puppy Chow, specially designed for the active and
growing Doberman Pinscher.
• A sample of 10 Doberman puppies was fed nothing but
Super Vitamin Enriched Puppy Chow. The weights in the sample are
27.5, 33.5, 36.8, 39.5, 40.5, 42.5, 40.0, 22.9, 39.8, 40.8 kg.
The mean adult weight used for comparison is 39.7 kg (M), with
σ = 6.2 kg.
• Did Puppy Chow make them grow especially big? Test with α =
.05.
• H0: The puppy chow did not make the dogs grow larger than
normal (mean = M).
• H1: The puppy chow did make the dogs grow larger than
normal (mean > M).

The sample mean is 36.38 kg, which is below M, so z is negative (about -1.69) and the test returns
h = 0. We fail to reject H0: these data give no evidence that the Super Vitamin Enriched Puppy
Chow makes Doberman Pinschers grow significantly larger.
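A sketch of the puppy example with ztest is shown below. It assumes that 39.7 and 6.2 are the hypothesized mean M and the known σ passed to ztest, and that the alternative of interest is right-tailed (dogs grow larger); the variable names are illustrative.

```matlab
weights = [27.5 33.5 36.8 39.5 40.5 42.5 40.0 22.9 39.8 40.8];  % sample of 10 dogs (kg)
M     = 39.7;   % hypothesized mean (kg)
sigma = 6.2;    % known population standard deviation (kg)

% Right-tailed z-test at the default 5% level.
[h, p, ci, zval] = ztest(weights, M, sigma, 'Tail', 'right');

fprintf('mean = %.2f, z = %.4f, p = %.4f, h = %d\n', mean(weights), zval, p, h);
```

The sample mean is 36.38, so z = (36.38 - 39.7)/(6.2/sqrt(10)) is about -1.69. A negative z can never reject a right-tailed test, so h is 0 here.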

F-test:
• The F-test is used to compare the variances of
two independent samples.
• This test is also used in the context of analysis
of variance (ANOVA) for judging the
significance of more than two sample means at
one and the same time.
• It is also used for judging the significance of
multiple correlation coefficients.

F-test in MATLAB
• H = vartest2(X,Y) performs an F test of the
hypothesis that two independent samples, in the
vectors X and Y, come from normal distributions
with the same variance, against the alternative that
they come from normal distributions with different
variances.
• The result is H=0 if the null hypothesis ("variances
are equal") cannot be rejected at the 5% significance
level, or H=1 if the null hypothesis can be rejected at
the 5% level.
• X and Y can have different lengths.

Example:
• Two random samples drawn from two normal
populations are:
• Sample1: 20 16 26 27 23 22 18 24 25 19
• Sample2: 27 33 42 35 32 34 38 28 41 43 30 37
• Test at the 5% significance level whether the two
populations have the same variance.
• We take the null hypothesis that the two populations
from which the samples have been drawn have the
same variance.
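This example can be sketched with vartest2 as follows, using the default two-tailed test at the 5% level. The variable names are illustrative.

```matlab
sample1 = [20 16 26 27 23 22 18 24 25 19];
sample2 = [27 33 42 35 32 34 38 28 41 43 30 37];

% H0: the two populations have equal variances.
[h, p, ci, stats] = vartest2(sample1, sample2);

fprintf('h = %d, p = %.4f, F = %.4f, df1 = %d, df2 = %d\n', ...
    h, p, stats.fstat, stats.df1, stats.df2);
```

By hand, the sample variances are 120/9 (about 13.33) and 314/11 (about 28.55), so F is about 0.467 with 9 and 11 degrees of freedom, and the null hypothesis is not rejected (h is 0).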

Since the p-value is more than 0.05, we fail to reject the null hypothesis and conclude that the
data are consistent with the samples having been drawn from two populations with the same variance.

Difference between t-test, z-test and F-test:
• t-test: Used for testing the mean of one population
against a standard, or for comparing the means of two
populations, when you do not know the populations'
standard deviation and when you have a limited sample
(n < 30).
• z-test: Used for testing the mean of a population versus a
standard, or comparing the means of two populations, with
large (n ≥ 30) samples when we know the population standard
deviation. It is also used for testing the proportion of some
characteristic versus a standard proportion, or comparing the
proportions of two populations.
• F-test: Used to compare two populations' variances.
The samples can be any size. It is the basis of ANOVA.

References:
Kothari, C.R., 1985, Research Methodology: Methods and
Techniques, New Delhi, Wiley Eastern Limited.
Gupta, S.P., 2009, Statistical Methods, eighth revised edition.
http://www.mathworks.in/help/stats/ztest.html#btriieq
http://www.math.uah.edu/stat/hypothesis/Introduction.html
http://www.mathworks.in/products/statistics/description7.html
How to Do a T-Test in MATLAB, eHow,
http://www.ehow.com/how_12211819_ttest-matlab.html#ixzz2WSQ6BN6o
