Quantitative Research Methods
Lecture 6
1. Probability distribution
2. Sample distribution
3. Estimation
4. Hypothesis Testing
5. T-test
Inferential statistics
– Differences between groups: T-test, ANOVA, MANOVA
– Relationships between variables: Correlation, Multiple Regression
Review 2: Statistical Inference…
Statistical inference is the process by which we
acquire information and draw conclusions about
populations from samples.
In order to do inference, we require the skills and
knowledge of descriptive statistics, probability
distributions, and sampling distributions.
1. Probability distribution…
• … is a table, formula, or graph that describes the
values of a random variable and the probabilities
associated with these values.
1. Probability distribution:
Random Variables…
A random variable is a function or rule that
assigns a number to each outcome of an
experiment.
Alternatively, the value of a random variable is a
numerical event.
Instead of talking about the coin flipping event as
{heads, tails} think of it as
“the number of heads when flipping a coin”
{0, 1, 2…} (numerical events)
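For a concrete feel of these definitions, the distribution of the number of heads in two fair coin flips can be tabulated directly; the short Python sketch below is my own illustration, not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Random variable X = number of heads when flipping a fair coin twice.
outcomes = list(product(["H", "T"], repeat=2))   # HH, HT, TH, TT
dist = {}
for outcome in outcomes:
    x = outcome.count("H")                       # numerical event: 0, 1, or 2 heads
    dist[x] = dist.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# X takes the values 0, 1, 2 with probabilities 1/4, 1/2, 1/4 -- a discrete distribution.
for value, prob in sorted(dist.items()):
    print(value, prob)
```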
Two Types of Random Variables…
Discrete Random Variable
– one that takes on a countable number of values
– E.g. values on the roll of dice: 2, 3, 4, …, 12
Continuous Random Variable
– one whose values are not discrete, not countable
– E.g. time (30.1 minutes? 30.10000001 minutes?)
Analogy:
Integers are Discrete, while Real Numbers are Continuous
The Normal Distribution…
The normal distribution is the most important of
all probability distributions. The probability density
function of a normal random variable is given by:
f(x) = (1 / (σ√(2π))) · e^(−(x − µ)² / (2σ²))
Its graph is bell shaped and symmetrical around the mean.
Standard Normal Distribution…
A normal distribution whose mean is zero and
standard deviation is one is called the
standard normal distribution.
Any normal distribution can be converted to a
standard normal distribution with simple algebra:
Z = (X − µ) / σ
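A minimal Python sketch of this conversion (my own illustration; the mean 350 and standard deviation 75 are borrowed from the lead-time demand example later in the lecture, and scipy is assumed to be available):

```python
from scipy.stats import norm

# Hypothetical normal population (mean and sd echo the inventory example).
mu, sigma = 350, 75
x = 425

z = (x - mu) / sigma                 # standardize: Z = (X - mu) / sigma
print(z)                             # 1.0

# P(X > 425) computed two ways: directly, and via the standard normal distribution.
print(1 - norm.cdf(x, loc=mu, scale=sigma))   # ~0.1587
print(1 - norm.cdf(z))                        # same probability
```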
Normal Distribution…
The normal distribution is described by two
parameters:
its mean (µ) and its standard deviation (σ).
Increasing the mean shifts the curve to the right…
Normal Distribution…
The normal distribution is described by two
parameters:
its mean (µ) and its standard deviation (σ). Increasing
the standard deviation “flattens” the curve…
2. Sampling Distributions
The sampling distribution of the mean of a random
sample drawn from any population is
approximately normal for a sufficiently
large sample size.
The larger the sample size, the more closely the
sampling distribution of x̄ will resemble a normal
distribution.
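This claim (the Central Limit Theorem) can be checked with a quick simulation; the sketch below is my own, assumes numpy is available, and uses an exponential population as a deliberately non-normal starting point.

```python
import numpy as np

rng = np.random.default_rng(0)

# A clearly non-normal (right-skewed) population: exponential with mean 1.
for n in (2, 30, 200):
    # 10,000 sample means, each based on a sample of size n
    sample_means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(n, round(sample_means.mean(), 3), round(sample_means.std(ddof=1), 3))
    # The means stay near 1.0, the spread shrinks roughly like 1/sqrt(n),
    # and a histogram of sample_means looks increasingly bell shaped.
```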
3. Statistical Inference
Estimation
There are two types of inference: estimation and
hypothesis testing; estimation is introduced first.
The objective of estimation is to determine the
approximate value of a population parameter
on the basis of a sample statistic.
E.g., the sample mean (x̄) is employed to
estimate the population mean (µ).
Estimation…
The objective of estimation is to determine the
approximate value of a population parameter
on the basis of a sample statistic.
There are two types of estimators:
Point Estimator
Interval Estimator
Point Estimator…
A point estimator draws inferences about a
population by estimating the value of an unknown
parameter using a single value or point.
Example Xm10-01
To lower costs, the operations manager wants to
use an inventory model. He notes demand during
lead time is normally distributed and he needs to
know the mean to compute the optimum inventory
level.
He observes 25 lead time periods and records the
demand during each period.
Assume that the manager knows that the standard
deviation is 75 computers.
Data
235 374 309 499 253
421 361 514 462 369
394 439 348 344 330
261 374 302 466 535
386 316 296 332 334
Solution
• The parameter to be estimated is the mean demand during lead time (µ).
Standard Errors
• The standard error of the sample mean is σ/√n.
• Here σ/√n = 75/√25 = 15 computers.
Confidence Interval
• A 95% confidence interval estimate of µ is x̄ ± 1.96·σ/√n.
• With x̄ = 370.16, the interval is 370.16 ± 29.40, i.e. 340.76 to 399.56 computers.
Confidence Intervals…
• The 95% CI of the mean is the range of values
that will contain the true mean with a probability
of 0.95.
• Other CIs can be calculated by replacing 1.96 by a
different multiplying factor.
• The multiplying factor depends on the number of
observations in the sample and the confidence
level required.
• As the confidence level increases, so does the
multiplying factor.
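As a cross-check, the interval estimate for the lead-time demand example can be computed in a few lines of Python; this is a sketch, assuming scipy is available, using the 25 observations listed above and the known σ = 75.

```python
import math
from scipy.stats import norm

demand = [235, 374, 309, 499, 253, 421, 361, 514, 462, 369,
          394, 439, 348, 344, 330, 261, 374, 302, 466, 535,
          386, 316, 296, 332, 334]

sigma = 75                          # population standard deviation, assumed known
n = len(demand)                     # 25 lead-time periods
xbar = sum(demand) / n              # sample mean = 370.16

z = norm.ppf(0.975)                 # ~1.96 for 95% confidence
se = sigma / math.sqrt(n)           # standard error of the mean = 15
lower, upper = xbar - z * se, xbar + z * se
print(round(xbar, 2), round(lower, 2), round(upper, 2))   # 370.16 340.76 399.56
```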
4. Statistical Inference
Hypothesis Testing
A criminal trial is an example of hypothesis testing
without the statistics.
In a trial a jury must decide between two hypotheses. The
null hypothesis is
H0: The defendant is innocent
The alternative hypothesis or research hypothesis is
H1: The defendant is guilty
The jury does not know which hypothesis is true. They must
make a decision on the basis of evidence presented.
4. Statistical Inference
Hypothesis Testing
In the language of statistics convicting the
defendant is called
rejecting the null hypothesis in favor of the
alternative hypothesis.
That is, the jury is saying that there is enough
evidence to conclude that the defendant is guilty
(i.e., there is enough evidence to support the
alternative hypothesis).
4. Statistical Inference
Hypothesis Testing
If the jury acquits it is stating that
there is not enough evidence to support the
alternative hypothesis.
Notice that the jury is not saying that the defendant is
innocent, only that there is not enough evidence to
support the alternative hypothesis. That is why we
never say that we accept the null hypothesis.
4. Statistical Inference
Hypothesis Testing
There are two possible errors.
A Type I error occurs when we reject a true null
hypothesis. That is, a Type I error occurs when the
jury convicts an innocent person.
A Type II error occurs when we don’t reject a false
null hypothesis. That occurs when a guilty
defendant is acquitted.
4. Statistical Inference
Hypothesis Testing
The probability of a Type I error is denoted as α
(Greek letter alpha). The probability of a type II
error is β (Greek letter beta).
The two probabilities are inversely related.
Decreasing one increases the other.
Hypothesis Testing
In our judicial system Type I errors are regarded as
more serious. We try to avoid convicting innocent
people. We are more willing to acquit guilty people.
Hypothesis Testing
The critical concepts are:
1. There are two hypotheses, the null and the alternative
hypotheses.
2. The procedure begins with the assumption that the null
hypothesis is true.
3. The goal is to determine whether there is enough
evidence to infer that the alternative hypothesis is true.
4. There are two possible decisions:
Conclude that there is enough evidence to support the
alternative hypothesis.
Conclude that there is not enough evidence to support
the alternative hypothesis.
Hypothesis Testing
5. Two possible errors can be made.
Type I error: Reject a true null hypothesis
Type II error: Do not reject a false null
hypothesis.
P(Type I error) = α
P(Type II error) = β
α is called the significance level.
Concepts of Hypothesis Testing (1)
There are two hypotheses. One is called the null
hypothesis and the other the alternative or
research hypothesis. The usual notation is:
H0: — the ‘null’ hypothesis
H1: — the ‘alternative’ or ‘research’ hypothesis
The null hypothesis (H0) will always state that the
parameter equals the value specified in the
alternative hypothesis (H1)
(H0 is pronounced “H nought”)
Concepts of Hypothesis Testing
Consider Example 10.1 (mean demand for computers
during assembly lead time) again. Rather than
estimate the mean demand, our operations manager
wants to know whether the mean is different
from 350 units. We can rephrase this request into a
test of the hypothesis:
H0:µ = 350
Thus, our research hypothesis becomes:
H1:µ ≠ 350 This is what we are interested
in determining…
Concepts of Hypothesis Testing (2)
The testing procedure begins with the
assumption that the null hypothesis is true.
Thus, until we have further statistical evidence, we will
assume:
H0: µ = 350 (assumed to be TRUE)
Concepts of Hypothesis Testing (3)
The goal of the process is to determine whether
there is enough evidence to infer that the
alternative hypothesis is true.
That is, is there sufficient statistical information to
determine if this statement is true?
H1:µ ≠ 350
Concepts of Hypothesis Testing (4)
There are two possible decisions that can be made:
Conclude that there is enough evidence to support the alternative
hypothesis
(also stated as: rejecting the null hypothesis in favor of the alternative)
Conclude that there is not enough evidence to support the
alternative hypothesis
(also stated as: not rejecting the null hypothesis in favor of the
alternative)
NOTE: we do not say that we accept the null hypothesis…
Concepts of Hypothesis Testing
Once the null and alternative hypotheses are
stated, the next step is to randomly sample the
population and calculate a test statistic (in this
example, the sample mean).
If the test statistic’s value is inconsistent with the
null hypothesis we reject the null hypothesis
and infer that the alternative hypothesis is
true.
Concepts of Hypothesis Testing
For example, if we’re trying to decide whether the
mean is not equal to 350, a large value of x̄ (say,
600) would provide enough evidence.
If x̄ is close to 350 (say, 355) we could not say that
this provides a great deal of evidence to infer that
the population mean is different from 350.
Types of Errors
A Type I error occurs when we reject a true null
hypothesis (i.e. Reject H0 when it is TRUE)
A Type II error occurs when we don’t reject a false null
hypothesis (i.e. Do NOT reject H0 when it is FALSE)
                     H0 is TRUE         H0 is FALSE
Reject H0            Type I error       Correct decision
Do not reject H0     Correct decision   Type II error
Example 11.1
The manager of a department store is thinking about
establishing a new billing system for the store's credit customers.
She determines that the new system will be cost-effective only if
the mean monthly account is more than $170. A random sample
of 400 monthly accounts is drawn, for which the sample mean is
$178.
The manager knows that the accounts are approximately
normally distributed with a standard deviation of $65. Can the
manager conclude from this that the new system will be
cost-effective?
Example 11.1
The system will be cost effective if the mean account
balance for all customers is greater than $170.
We express this belief as our research hypothesis,
that is:
H1: µ > 170 (this is what we want to
determine)
Thus, our null hypothesis becomes:
H0: µ = 170 (this specifies a single value for
the parameter of interest)
Example 11.1
What we want to show:
H0: µ = 170 (we’ll assume this is true)
H1: µ > 170
We know:
n = 400,
x̄ = 178, and
σ = 65
What to do next?!
IDENTIFY
Example 11.1
To test our hypotheses, we can use two different
approaches:
The rejection region approach (typically used
when computing statistics manually), and
The p-value approach (which is generally used
with a computer and statistical software).
We will explore the latter…
COMPUTE
p-Value of a Test
The p-value of a test is the probability of observing a
test statistic at least as extreme as the one computed
given that the null hypothesis is true.
In the case of our department store example, what is
the probability of observing a sample mean at least
as extreme as the one already observed (i.e. x̄ = 178),
given that the null hypothesis (H0: µ = 170) is true?
P-Value of a Test
p-value = P(Z > 2.46)
p-value = .0069
z = 2.46
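For readers who want to reproduce these numbers, here is a minimal sketch (mine, not the lecture's) of the z statistic and right-tail p-value for the department store example (n = 400, x̄ = 178, σ = 65, H0: µ = 170):

```python
import math
from scipy.stats import norm

n, xbar, sigma, mu0 = 400, 178, 65, 170

z = (xbar - mu0) / (sigma / math.sqrt(n))   # (178 - 170) / 3.25 = 2.46
p_value = 1 - norm.cdf(z)                   # right-tail test: P(Z > 2.46)
print(round(z, 2), round(p_value, 4))       # 2.46 0.0069
```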
Interpreting the p-value
p < .01: overwhelming evidence (highly significant)
.01 ≤ p < .05: strong evidence (significant)
.05 ≤ p < .10: weak evidence (not significant)
p ≥ .10: no evidence (not significant)
Here p = .0069, which falls in the “overwhelming evidence” range.
Interpreting the p-value
Compare the p-value with the selected value of the
significance level α:
If the p-value is less than α, we judge the p-value to be
small enough to reject the null hypothesis.
If the p-value is greater than α, we do not reject the
null hypothesis.
Since p-value = .0069 < α = .05, we reject H0 in
favor of H1.
One– and Two–Tail Testing
The department store example (Example 11.1) was a
one-tail test, because the rejection region is located
in only one tail of the sampling distribution.
More correctly, this was an example of a right-tail
test.
Right-Tail Testing
Left-Tail Testing
Two–Tail Testing
Two tail testing is used when we want to test a
research hypothesis that a parameter is not equal
(≠) to some value
Inferential statistics
– Differences between groups: T-test, ANOVA, MANOVA
– Relationships between variables: Correlation, Multiple Regression
5. Two-Sample T-Test
Example 13.1 Xm13-01
Millions of investors buy mutual funds choosing from
thousands of possibilities.
Some funds can be purchased directly from banks or other
financial institutions while others must be purchased
through brokers, who charge a fee for this service.
This raises the question: can investors do better by buying
mutual funds directly than by purchasing mutual funds
through brokers?
Example 13.1 Xm13-01
To help answer this question a group of
researchers randomly sampled the annual returns
from mutual funds that can be acquired directly
and mutual funds that are bought through brokers
and recorded the net annual returns, which are the
returns on investment after deducting all relevant
fees.
Can we conclude at the 5% significance level that
directly-purchased mutual funds outperform
mutual funds bought through brokers?
Example 13.1
To answer the question we need to compare the
population of returns from directly purchased funds with
the returns from broker-bought mutual funds.
The data are obviously interval (we've recorded
real numbers).
Example 13.1
The hypothesis to be tested is that the mean net annual
return from directly-purchased mutual funds (µ1) is
larger than the mean of broker-purchased funds (µ2).
Hence the alternative hypothesis is
H1: µ1- µ2 > 0
and
H0: µ1- µ2 = 0
Identifying Factors I…
Factors that identify the equal-variances t-test and
estimator of µ1 − µ2:
Identifying Factors II…
Factors that identify the unequal-variances t-test
and estimator of µ1 − µ2:
Using SPSS
Using SPSS
SPSS Output
The value of the test statistic is 2.29. The one-tail
p-value is .0122.
We observe that the p-value of the test is small.
As a result we conclude that there is sufficient
evidence to infer that, on average, directly purchased
mutual funds outperform broker-purchased mutual funds.
Conclusion
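Outside SPSS, the same equal-variances test can be run with scipy. The sketch below uses hypothetical return figures as stand-ins for the Xm13-01 data (which are not reproduced here), so its output will not match the 2.29 / .0122 reported above.

```python
from scipy import stats

# Hypothetical net annual returns (%); the real Xm13-01 data are not reproduced here.
direct = [9.3, 6.7, 12.1, 4.5, 8.8, 10.2, 7.6, 11.4, 5.9, 9.9]
broker = [4.2, 7.1, 3.8, 6.5, 5.0, 8.3, 2.9, 6.1, 4.7, 5.5]

# Equal-variances (pooled) two-sample t-test; SciPy returns a two-tailed p-value.
t_stat, p_two_tail = stats.ttest_ind(direct, broker, equal_var=True)
p_one_tail = p_two_tail / 2 if t_stat > 0 else 1 - p_two_tail / 2   # H1: mu1 - mu2 > 0
print(round(t_stat, 2), round(p_one_tail, 4))
```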
Checking the Required Condition
Both the equal-variances and unequal-variances
techniques require that the populations be
normally distributed.
Example 13.4
In the last few years a number of web-based companies
that offer job placement services have been created.
The manager of one such company wanted to
investigate the job offers recent MBAs were obtaining.
In particular, she wanted to know whether finance
majors were being offered higher salaries than
marketing majors.
Example 13.4
In a preliminary study she randomly sampled 50
recently graduated MBAs, half of whom majored in
finance and half in marketing.
From each, she obtained the highest salary
(including benefits) offer (Xm13-04).
Can we infer that finance majors obtain higher
salary offers than do marketing majors among
MBAs?
Example 13.4
The parameter is the difference between two
means (where µ1 = mean highest salary offer to
finance majors and µ2 = mean highest salary offer
to marketing majors).
Because we want to determine whether finance
majors are offered higher salaries, the alternative
hypothesis will specify that µ1 − µ2 is greater than 0.
Example 13.4
The hypotheses are
H0: (µ1 − µ2) = 0
H1: (µ1 − µ2) > 0
Example 13.5
Suppose now that we redo the experiment in the
following way.
We examine the transcripts of finance and marketing
MBA majors.
We randomly sample a finance and a marketing major
whose grade point average (GPA) falls between 3.92
and 4 (based on a maximum of 4).
We then randomly sample a finance and a marketing
major whose GPA is between 3.84 and 3.92.
Example 13.5
We continue this process until the 25th pair of finance
and marketing majors is selected, whose GPAs fall
between 2.0 and 2.08.
(The minimum GPA required for graduation is 2.0.)
As we did in Example 13.4, we recorded the highest
salary offer (Xm13-05).
Can we conclude from these data that finance majors
draw larger salary offers than do marketing majors?
Paired t-test
The experiment described in Example 13.4 is one in which
the samples are independent. That is, there is no
relationship between the observations in one sample and
the observations in the second sample. However, in this
example the experiment was designed in such a way that
each observation in one sample is matched with an
observation in the other sample. The matching is conducted
by selecting finance and marketing majors with similar
GPAs. Thus, it is logical to compare the salary offers for
finance and marketing majors in each group. This type of
experiment is called matched pairs.
Example 13.5
For each GPA group, we calculate the matched pair
difference between the salary offers for finance and
marketing majors.
Example 13.5
The numbers in black are the original starting salary
data (Xm13-05); the numbers in blue were calculated.
Although a student is either in Finance or
in Marketing (i.e. independent), the fact that the
data are grouped in this fashion makes it a
matched pairs experiment (i.e. the two
students in group #1 are ‘matched’ by
their GPA range).
The difference of the means is equal to the mean of the differences, hence
we will consider the “mean of the paired differences” (µD) as our parameter of interest.
IDENTIFY
Example 13.5
Do Finance majors have higher salary offers than Marketing
majors?
Since the parameter of interest is the mean of the paired differences:
We want to research this hypothesis: H1: µD > 0
(and our null hypothesis becomes H0: µD = 0)
IDENTIFY
Test Statistic for µD
The test statistic for the mean of the
population of differences (µD) is:
t = (x̄D − µD) / (sD / √nD)
which is Student t distributed with nD − 1 degrees of
freedom, provided that the differences are
normally distributed.
IDENTIFY
Using SPSS
Analyze > Compare Means > Paired-Samples T Test
SPSS Output
Example 13.5
The p-value is .0004. There is overwhelming
evidence that Finance majors do obtain higher
starting salary offers than their peers in Marketing.
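The matched-pairs procedure corresponds to scipy.stats.ttest_rel. The sketch below uses hypothetical salary pairs, not the actual Xm13-05 data, so its p-value is only illustrative.

```python
from scipy import stats

# Hypothetical highest salary offers (in $000s) for 10 matched GPA groups.
finance   = [95, 88, 102, 79, 91, 84, 99, 76, 87, 93]
marketing = [89, 85,  96, 77, 86, 80, 94, 74, 82, 90]

# Matched pairs (paired) t-test on the differences, with nD - 1 degrees of freedom.
t_stat, p_two_tail = stats.ttest_rel(finance, marketing)
p_one_tail = p_two_tail / 2 if t_stat > 0 else 1 - p_two_tail / 2   # H1: mu_D > 0
print(round(t_stat, 2), round(p_one_tail, 4))
```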
Statistical analyses
• Group differences between 2 groups:
▫ T-tests
• Group differences among 3 or more groups
▫ One-way ANOVA
– Scheffe post-hoc test
• Relationship between 2 variables
▫ Correlation
• Relationship among 3 or more variables
▫ Multiple regression
Main Analysis
Hypothesis: prediction of group difference in some
variables → T-test, ANOVA, MANOVA
Hypothesis: prediction of relationship between
variables → Correlation, Multiple regression
Hypotheses regarding group
difference
• The hypothesis language
• Group A will be more (or less) in (something)
than Group B
• “It is hypothesized that females would be
more likely to shop online than males.”
• “It is predicted that males would trust online
shopping more than females.”
Testing group difference
• Comparing 2 groups’ difference in some
variables
• We use Independent-samples t-test
Male group: Subj. 1, 2, 3, 4, …
Female group: Subj. 1, 2, 3, 4, …
• Note: t-tests can compare only 2 groups at a
time
Comparing males and females on
these variables
Degree of enjoyment on online shopping M > F
(Table of group means by gender: Variable | Female | Male; e.g. Variable A)
Testing group difference
Testing group difference
• The concept of being “statistically significant”
Male group: Mean = 2.42    Female group: Mean = 2.11
• Can we jump to the conclusion that males score
higher than females on variable A?
• Not yet…
• We have to find out whether the difference can
really be claimed to be ‘a difference’
• “Is the difference statistically significant?”
• A t-test takes into consideration the difference in
means, the variability, and the sample size to determine
whether the difference is statistically significant
The concept of being “statistically-
significant”
• We can only claim a difference as a real
difference when the statistics tell us so
• The concept of being “statistically-significant”
• The SPSS language: p<.05
Being “statistically- significant”
• Significant level: p<.05
• If p<.05 (significant)
• You could claim that the difference is a real
difference, because it is statistically-
significant.
• If p>.05 (non-significant)
• You couldn’t claim there is a difference
Being “statistically- significant”
• Significant level: p<.05
• The logic behind:
• Statistics is about probability
• What does ‘p’ stand for?
• p = the probability of wrongly concluding that
there is a significant difference when in fact there
is not
• Type 1 error: making a false claim that there
is a real difference between 2 groups when
there is indeed none
• When this probability is smaller than 5 out
of 100, it is considered acceptable
Being “statistically- significant”
• Type 1 error: Making a false claim that there is a
real difference between 2 groups when there is
indeed none
• When this probability is smaller than 5 out of
100, it is considered acceptable
• p < .05 = the probability of committing this Type 1
error is less than 5/100
• Over 95% of the time when you make the claim
that there is a difference between the groups in
certain aspects, you are correct
• If p > .05, this is not acceptable: we cannot claim a
real difference between the 2 groups.
Running T-tests
• Steps for running a t-test:
• Analyze > Compare Means > Independent-Samples T Test
• Grouping variable > Define Groups (which two
groups)
• Test variable(s)
Running T-tests
• Task: Perform t-tests to see if there is any gender
difference in income using the GSS2008 data.
Interpreting T-test results
• T-test output
• 1) Look at the means of the 2 groups (To
see which group has a higher mean)
• 2) Look at ‘Levene’s test for equality of
variances’:
• If non-significant → no significant difference
in the variances → equal variances → rely on
the top row
Interpreting T-test results
• T-test output
• Step 1: The output from the t-test procedure
is presented in two parts: group descriptive
statistics and the test results.
• Step 2: For each dependent variable, SPSS
reports descriptive statistics in the first part.
Look at the means of the 2 groups (To see
which group has a higher mean than the
other in a variable)
• Step 3: To see whether there is a significant
difference, we need to refer to part 2:
Interpreting T-test results
• Step 4: First look at “Levene’s test for
equality of variances”. It will help you
determine which t-test value to use. Note: It
doesn’t tell you whether the 2 groups are
statistically different.
▫ If “Levene’s test for equality of variances” is not
significant (the variances are not too different),
then use ‘equal variances assumed’ that is, look
at the 1st row and neglect the 2nd row
▫ If “Levene’s test for equality of variances” is
significant, then use ‘equal variances not
assumed’ that is, look at the 2nd row and neglect
1st row
• If p > .05 → non-significant → the sample variances
do not differ → variances are equal → equal
variances assumed → read the 1st row
• If p < .05 → significant → the sample variances
differ → variances are not equal → equal variances
not assumed → read the 2nd row
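The same decision rule can be mirrored outside SPSS; here is a sketch (my own, with hypothetical scores) in which Levene's test decides which version of the independent-samples t-test to read.

```python
from scipy import stats

# Hypothetical scores for two groups (e.g. males vs. females on some variable).
male   = [2.1, 2.8, 2.4, 2.6, 2.2, 2.9, 2.5, 2.3]
female = [2.0, 2.3, 1.9, 2.4, 2.1, 2.2, 2.5, 1.8]

# Levene's test for equality of variances: non-significant -> assume equal variances.
_, p_levene = stats.levene(male, female)
equal_var = p_levene > 0.05

# Independent-samples t-test, reading the "row" that Levene's test points to.
t_stat, p_value = stats.ttest_ind(male, female, equal_var=equal_var)
print(f"Levene p = {p_levene:.3f}, equal_var = {equal_var}, "
      f"t = {t_stat:.2f}, p = {p_value:.3f}")
```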
Interpreting T-test results
• Step 5: Look at these figures: mean
difference, t value, and significance. This is
where the important information lies.
• Look at the “significance level”
• If p<.05
• There is a significant difference between the
2 groups
• If p>.05
• The 2 groups are not different in a particular
variable
• Run t-tests and complete the table
Reporting T-test results
• In reporting significant results:
• “The means for the Chinese-Canadian females
and Chinese-Canadian males in Maintenance
of Chinese culture were M=5.50 (SD =.98) and
M = 4.33 (SD =.97) respectively. T-test showed
that the Chinese-Canadian female subjects
scored significantly higher than their male
counterparts in the variable of Maintenance of
Chinese culture , t(99)= -3.01, p<.05.”
• You need to report the means, SDs, degrees of
freedom, t-value, and significance.
Reporting T-test results
• In reporting non-significant results
• T-test showed no significant difference
between the Chinese-Canadian female
and male subjects in shyness, t(94) = .12,
n.s.
• *t(df)= t-value, significance level
• Units for significance level:
• p<.05, p<.01, or p<.001
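For completeness, here is a small helper (hypothetical, not part of the lecture) that turns scipy t-test output into a report line in the format shown above:

```python
from scipy import stats

def report_ttest(group_a, group_b, label="difference"):
    """Return a report line in the form: label: t(df) = value, p-level (or n.s.)."""
    t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
    df = len(group_a) + len(group_b) - 2          # equal-variances degrees of freedom
    if p_value < .001:
        sig = "p<.001"
    elif p_value < .01:
        sig = "p<.01"
    elif p_value < .05:
        sig = "p<.05"
    else:
        sig = "n.s."
    return f"{label}: t({df}) = {t_stat:.2f}, {sig}"

# Hypothetical scores for two groups.
print(report_ttest([5.1, 5.6, 5.9, 4.8, 5.3], [4.2, 4.6, 4.1, 4.9, 4.4], "shyness"))
```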
Week 3 Assignment
• Read chapters 9-13
• Assignment: