1192012 155942 f023_=_statistical_inference

Statistics
It is a branch of mathematics used to summarize,
analyze & interpret a group of numbers or
observations.
Types of Statistics
Descriptive Statistics :
It summarizes data to make sense of, or give meaning to,
a list of numeric values.
Inferential Statistics :
It is used to infer or generalize observations made
with samples to the larger population from which
they were selected. Broadly, it is classified into the theory
of estimation and the testing of hypotheses.

Estimation & Testing of
Hypothesis
Estimation
The method of estimating the value of a population
parameter from the value of the corresponding
sample statistic.
Testing of Hypothesis
A procedure for using sample data to assess a claim or
belief about an unknown parameter value.

Types of Estimation
Point estimation
It is the value of the sample statistic that is used to
estimate the most likely value of the unknown population
parameter.
Interval estimation
It is the range of values that is likely to contain the
population parameter value with a specified level of
confidence.

Properties of estimation
Consistency-
The statistic tends to become closer to the population parameter
as the sample size increases.
Unbiasedness-
E(Statistic) = Parameter
Efficiency-
Refers to the size of the standard error (SE). E.g., for a normal
population the SE of the sample median is greater than that of the
sample mean, so the sample mean is more efficient.
Sufficiency-
Refers to how much of the sample information the statistic uses.
E.g., the sample mean uses the value of every observation, whereas
the sample median depends only on the middle value(s).
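The efficiency comparison above can be checked by simulation. The following is a minimal sketch, assuming a standard normal population, samples of size 25 and 2000 replications (all illustrative choices, not values from the slides); the function name is hypothetical.

```python
import random
import statistics


def simulated_standard_errors(n=25, replications=2000, seed=0):
    """Estimate the SE of the sample mean and sample median by simulation.

    Samples are drawn from a standard normal population, which is an
    assumption made only for this illustration.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    # The SE of a statistic is the standard deviation of its sampling distribution.
    return statistics.stdev(means), statistics.stdev(medians)


se_mean, se_median = simulated_standard_errors()
print(f"SE of sample mean   ~ {se_mean:.3f}")
print(f"SE of sample median ~ {se_median:.3f}")
```

Under these assumptions the simulated SE of the median should come out larger than that of the mean, which is what "the sample mean is more efficient" means.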

Drawback of point estimation
No information is available regarding its reliability, i.e.,
how close it is to the true population parameter.
In fact, the probability that a single sample statistic
actually equals the population parameter is
extremely small.

Interval Estimation
Confidence Interval = Point estimate ± margin of error
Margin of error = critical value of ‘Z’ or ‘t’ at the chosen
confidence level (90%, 95% & so on) × standard
error of the particular statistic

Estimation
• Population mean – Avg. salary
• Population proportion – Stock Market

Interval Estimation for population
mean(µ)
SAMPLE SIZE: Large Sample (n≥30)
FORMULAE:
Known SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Unknown SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\frac{S}{\sqrt{n}}$
Sample mean square: $S^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$
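Interval Estimation for population
mean(µ)
SAMPLE SIZE: Small Sample (n<30)
FORMULAE:
Known SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Unknown SD (σ): $\bar{x} \pm t_{\alpha/2}\,\frac{S}{\sqrt{n}}$
Sample mean square: $S^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$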
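A short worked example of the small-sample interval with unknown SD, using hypothetical numbers (n = 16, sample mean 50, S = 4) and the 95% table value of t for 15 degrees of freedom, 2.131:

```latex
\bar{x} \pm t_{\alpha/2}\,\frac{S}{\sqrt{n}}
  = 50 \pm 2.131 \times \frac{4}{\sqrt{16}}
  = 50 \pm 2.131
  \quad\Rightarrow\quad (47.869,\ 52.131)
```

With the same data, the Z value 1.96 would give the narrower interval 50 ± 1.96; the t value widens the interval to account for estimating σ from a small sample.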

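Interval Estimation for population
proportion(p)
$p = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
where $\hat{p}$ is the sample proportion.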
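The proportion interval can be sketched in the same way. The counts below are hypothetical (120 of 400 surveyed investors), and the function name is chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Interval p_hat ± Z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 120 of 400 investors hold shares.
p_low, p_high = proportion_confidence_interval(successes=120, n=400)
print(f"95% CI for the proportion: ({p_low:.4f}, {p_high:.4f})")
```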

Test of hypothesis
Hypothesis
Statements about characteristics of populations,
denoted as H.
Types of Hypothesis
• Null & Alternative hypothesis
• Simple & Composite hypothesis

Hypothesis Testing
Null Hypothesis-
The hypothesis actually tested is called the null hypothesis. It is
denoted as H0. It is the claim that is initially assumed to be true.
It may usually be considered the skeptic’s hypothesis: Nothing
new or interesting happening here!
Alternative Hypothesis-
The other hypothesis, assumed true if the null is false, is the
alternative hypothesis. It is denoted as H1 or Ha . Ha may usually
be considered the researcher’s hypothesis.
These two hypotheses are mutually exclusive and exhaustive so
that one is true to the exclusion of the other.
Possible conclusions from hypothesis-testing analysis are reject
H0 or fail to reject H0.

Hypothesis Testing
Simple Hypothesis -
It specifies the distribution completely, e.g. H: μ = μ0 when σ is known.
Composite Hypothesis -
It does not specify the distribution completely, e.g. H: μ > μ0,
H: μ < μ0 or H: μ ≠ μ0.
One-tailed test:
H0: μ1 = μ2
HA: μ1 > μ2 or μ1 < μ2
Two-tailed test:
H0: μ1 = μ2
HA: μ1 ≠ μ2
Examples of Hypothesis :
There exists a positive relationship between
attendance and result.
Bankers assumed high-income earners are more
profitable than low-income earners

Rules for Hypotheses
H0 is always stated as an equality claim involving
parameters.
Ha is an inequality claim that contradicts H0. It may
be one-sided (using either > or <) or two-sided
(using ≠).
A test of hypotheses is a method for using sample
data to decide whether the null hypothesis should
be rejected.
Rejection region - Values of the test statistic for
which we reject the null in favor of the alternative
hypothesis

Test Procedure
A test procedure is specified by
1. A test statistic, a function of the sample
data on which the decision is to be based.
2. (Sometimes, not always!) A rejection region,
the set of all test statistic values for which
H0 will be rejected

Hypothesis Testing
                       True State
Test Result            H0 True             H0 False
Fail to reject H0      Correct Decision    Type II Error
Reject H0              Type I Error        Correct Decision
$\alpha = P(\text{Type I Error}), \quad \beta = P(\text{Type II Error})$
• Goal: Keep α, β reasonably small

Errors in Hypothesis Testing
A type I error consists of rejecting the null
hypothesis H0 when it was true.
A type II error consists of not rejecting H0 when
H0 is false.
α and β are the probabilities of type I and
type II error, respectively.
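To make α and β concrete, the sketch below computes β for a right-tailed Z-test once α is fixed. All numbers (μ0 = 100, a true mean of 102, σ = 10, n = 25) are hypothetical, and the function name is invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_right_tailed(mu0, mu1, sd, n, alpha):
    """beta for a right-tailed Z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)  # reject H0 when Z > z_critical
    shift = (mu1 - mu0) / (sd / sqrt(n))        # mean of Z when the true mean is mu1
    # beta = P(fail to reject H0 | true mean is mu1) = P(Z <= z_critical)
    return std_normal.cdf(z_critical - shift)


alpha = 0.05  # type I error probability, fixed by the experimenter
beta = type_ii_error_right_tailed(mu0=100, mu1=102, sd=10, n=25, alpha=alpha)
print(f"alpha = {alpha:.2f}, beta = {beta:.4f}")
```

For a fixed sample size, lowering α pushes the critical value outward and raises β, which is why the goal is to keep both reasonably small rather than to minimize one alone.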

Level α Test
A test corresponding to the significance level α is
called a level α test. A test with significance
level α is one for which the type I error
probability is controlled at the specified level.
Sometimes, the experimenter will fix the value of
α, also known as the significance level.

Steps in Hypothesis-Testing Analysis
1. Set up H0 and Ha
2. Identify the nature of the sampling distribution curve
and specify the appropriate test statistic
3. Determine whether the hypothesis test is one-tailed or
two-tailed
4. Taking into account the specified significance level,
determine the critical value (two critical values for a two-
tailed test) for the test statistic from the appropriate
statistical table
5. State the decision rule for rejecting H0
6. Compute the value for the test statistic from the
sample data
7. Using the decision rule specified in step 5, either reject
H0 or fail to reject H0
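The steps above can be sketched as a small function for a single-mean Z-test. This is an illustrative outline, not a prescribed implementation: the function name, the `tail` labels and the sample figures are assumptions, and the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_single_mean(x_bar, mu0, sd, n, alpha, tail):
    """Single-mean Z-test. `tail` is 'two', 'left' or 'right' (labels chosen here)."""
    std_normal = NormalDist()
    # Steps 3-5: critical value and decision rule for the chosen tail.
    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        rule = lambda z: abs(z) > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        rule = lambda z: z < critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        rule = lambda z: z > critical
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    # Step 6: test statistic from the sample data.
    z = (x_bar - mu0) / (sd / sqrt(n))
    # Step 7: True means reject H0, False means fail to reject H0.
    return z, critical, rule(z)


# Hypothetical data: H0: mu = 50 against Ha: mu != 50.
z_stat, z_crit, reject_h0 = z_test_single_mean(
    x_bar=52, mu0=50, sd=6, n=36, alpha=0.05, tail="two"
)
print(f"Z = {z_stat:.2f}, critical = ±{z_crit:.2f}, reject H0: {reject_h0}")
```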

Large sample test(Z-test)
TEST FOR SINGLE MEAN
A cinema hall has a cool drinks fountain supplying
orange & cola. When the machine is turned on, it
fills a 550 ml cup with 500 ml of the required drink.
The manager has 3 problems.
1. The clients have been complaining that the machine supplies less than
500 ml.
2. The manager wants to make sure that the amount of cola does not exceed
500 ml.
3. The manager wants to minimize customer complaints & at the same time
does not want any overflow.
In the case of the cinema hall, suppose n = 36, sample
mean = 499 ml & the specifications of the machine
give the standard deviation of the output as 1 ml. The
significance level is 10%.
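As a worked sketch of the cinema hall case, the single-mean Z statistic can be computed with the known SD σ = 1 ml and the hypothesized mean μ0 = 500 ml. Reading the three problems as a left-tailed, a right-tailed and a two-tailed test is an interpretation made here; the critical values are the 10% entries from the table of critical values of Z.

```latex
Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
  = \frac{499 - 500}{1/\sqrt{36}}
  = \frac{-1}{1/6}
  = -6
```

1. H0: μ = 500 against Ha: μ < 500 (left-tailed): Z = -6 < -1.28, so H0 is rejected; the data support the clients' complaint.
2. H0: μ = 500 against Ha: μ > 500 (right-tailed): Z = -6 is not greater than 1.28, so we fail to reject H0; there is no evidence of overfilling.
3. H0: μ = 500 against Ha: μ ≠ 500 (two-tailed): |Z| = 6 > 1.64, so H0 is rejected.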

TEST FOR DIFFERENCE MEAN
Business Today has conducted a survey in
Sonepur & Muzaffarpur on the hourly wages of
landless laborers. Results of the survey are as follows.
Town          Mean Hourly Wages   S.D.       Sample Size
Sonepur       Rs.8.95             Rs.0.40    200
Muzaffarpur   Rs.9.10             Rs.0.60    175
Business Today wants to test the hypothesis at the
0.05 significance level that there is no difference
between hourly wages for the landless laborers in the
two towns.
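A minimal sketch of this two-town comparison follows. It assumes a two-tailed test of H0: μ1 = μ2 and, because both samples are large, uses the sample SDs in place of σ1 and σ2; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z for H0: mu1 = mu2, with sample SDs standing in for sigma1 and sigma2."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Survey figures from the problem: Sonepur first, Muzaffarpur second.
z_wages = two_sample_z(8.95, 0.40, 200, 9.10, 0.60, 175)
critical_wages = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = 0.05
reject_equal_wages = abs(z_wages) > critical_wages
print(f"Z = {z_wages:.3f}, critical = ±{critical_wages:.2f}, "
      f"reject H0: {reject_equal_wages}")
```

Here |Z| is about 2.81, which exceeds 1.96, so under these assumptions the hypothesis of no difference in hourly wages is rejected at the 0.05 level.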

TEST FOR PROPORTION
A cable TV operator claims that 40% of the homes in a
city have opted for his services. Before sponsoring
advertisements on the local cable channel, a company
conducted a survey & found that 250 out of 550 persons
had cable TV services from the operator.
On the basis of this data, can we accept the claim of the
cable TV operator at 1% level of significance?
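The cable TV claim can be checked with the proportion Z statistic. The sketch below treats it as a two-tailed test of H0: p = 0.40 against Ha: p ≠ 0.40, which is an assumption about how the question is meant to be read; the function name is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


z_cable = proportion_z(successes=250, n=550, p0=0.40)
critical_cable = NormalDist().inv_cdf(1 - 0.01 / 2)  # two-tailed, alpha = 0.01
reject_claim = abs(z_cable) > critical_cable
print(f"z = {z_cable:.3f}, critical = ±{critical_cable:.3f}, "
      f"reject H0: {reject_claim}")
```

The statistic is about 2.61 against a critical value of about 2.58, so the result is close to the boundary: H0 is rejected at the 1% level, but only narrowly.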

Q1. An ambulance service claims that it takes, on the
average 8.9 minutes to reach its destination in
emergency calls . To check on this claim, the agency
which licenses ambulance services has them timed on 50
emergency calls, getting a mean of 9.3 minutes with a
standard deviation of 1.8 minutes. Test the claim at 1%
level of significance.
Q2. An automobile company decided to introduce a new
car whose mean petrol consumption is claimed to be
lower than that of the existing auto engine. It was found
that the mean petrol consumption for 50 cars was 10 km
per litre with a standard deviation of 3.5 km per litre. Test
for the company, at 5% level of significance, the claim
that in the new car petrol consumption is 9.5 km per litre
on the average.

Q3. Two types of new cars produced in India are tested
for petrol mileage. One group consisting of 36 cars
averaged 14 km per litre, while the other group
consisting of 72 cars averaged 12.5 km per litre.
i) What test statistic is appropriate if the standard deviations of
petrol consumption for the two types of cars are 1.5 and 2.0 km per litre
respectively?
ii) Test whether there exists a significant difference in
petrol consumption of those two types of cars at 1% level
of significance.

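Single Mean:
$Z = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$ (use σ in place of S when the population SD is known)
Difference Mean:
$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Under H0: μ1 = μ2 this reduces to
$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Proportion:
$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$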

Critical values of Z
Level of significance (α)                10%      5%       1%
Critical values for two-tailed test      ±1.64    ±1.96    ±2.58
Critical values for left-tailed test     -1.28    -1.64    -2.33
Critical values for right-tailed test    1.28     1.64     2.33
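The table entries can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist().inv_cdf` from the Python standard library; the function name and dictionary keys are chosen for this example. Note that the unrounded 5% one-tailed (and 10% two-tailed) value is about 1.645, which the table shows as 1.64.

```python
from statistics import NormalDist


def z_critical_values(alpha):
    """Critical values of Z at significance level alpha for each kind of test."""
    std_normal = NormalDist()
    return {
        "two-tailed": std_normal.inv_cdf(1 - alpha / 2),  # used as ±value
        "left-tailed": std_normal.inv_cdf(alpha),
        "right-tailed": std_normal.inv_cdf(1 - alpha),
    }


for level in (0.10, 0.05, 0.01):
    values = z_critical_values(level)
    print(f"alpha = {level:.2f}: "
          f"two-tailed ±{values['two-tailed']:.3f}, "
          f"left-tailed {values['left-tailed']:.3f}, "
          f"right-tailed {values['right-tailed']:.3f}")
```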

Small sample test(t-test)
TEST FOR SINGLE MEAN
The average breaking strength of steel rods is
specified to be 18.5 thousand kg. To check this, a sample
of 14 rods was tested. The mean & standard
deviation obtained were 17.85 and 1.955 thousand kg
respectively. Test the significance of the deviation
at 5% level of significance.
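A worked sketch of the steel rod test, assuming a two-tailed test of H0: μ = 18.5 and that the reported 1.955 is the sample SD S computed with divisor n - 1. The critical value 2.160 is the standard table value of t for 13 degrees of freedom at the 5% two-tailed level.

```latex
t = \frac{\bar{x} - \mu}{S/\sqrt{n}}
  = \frac{17.85 - 18.5}{1.955/\sqrt{14}}
  \approx \frac{-0.65}{0.5225}
  \approx -1.244
```

Since |t| ≈ 1.244 < 2.160, we fail to reject H0: under these assumptions the deviation from the specified strength is not significant at the 5% level.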

TEST FOR DIFFERENCE MEAN
The average life of a sample of 10 electric light bulbs
was found to be 1456 hours with a standard
deviation of 423 hours. A second sample of 17
bulbs chosen from a different batch showed a
mean life of 1280 hours with a standard deviation of
398 hours. Is there a significant difference
between the means of the two batches? Test at 5%
level of significance.
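The bulb comparison can be sketched with the pooled-variance t statistic. This assumes the reported SDs are sample values s1 and s2 (divisor n - 1), equal population variances, and a two-tailed test of H0: μ1 = μ2; the critical value 2.060 is the standard table value of t for 25 degrees of freedom at the 5% two-tailed level. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t for H0: mu1 = mu2 using the pooled standard deviation S."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


t_bulbs, df_bulbs = pooled_t(1456, 423, 10, 1280, 398, 17)
T_CRITICAL_25 = 2.060  # t table value, df = 25, two-tailed alpha = 0.05
reject_equal_life = abs(t_bulbs) > T_CRITICAL_25
print(f"t = {t_bulbs:.3f}, df = {df_bulbs}, reject H0: {reject_equal_life}")
```

With t of about 1.08 on 25 degrees of freedom, the difference between the two batch means is not significant at the 5% level under these assumptions.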

PAIRED SAMPLES
The HRD manager wishes to see if there has been any
change in the ability of trainees after a specific
program. The trainees take a test before the start of
the program and an equivalent one after they have
completed it. The scores recorded are given below.
Has any change taken place? Test at 5% level of
significance.
Trainee : A B C D E F G H I
Score before training: 75 70 46 68 68 43 55 68 77
Score after training: 70 77 57 60 79 64 55 77 76
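A minimal sketch of the paired test on these scores. It takes each difference as after minus before (a choice made here), tests H0: μd = 0 against a two-sided alternative, and uses 2.306, the standard table value of t for 8 degrees of freedom at the 5% two-tailed level.

```python
from math import sqrt
from statistics import mean, stdev

before = [75, 70, 46, 68, 68, 43, 55, 68, 77]
after = [70, 77, 57, 60, 79, 64, 55, 77, 76]

# d = after - before for each trainee (direction chosen for this sketch).
d = [a - b for a, b in zip(after, before)]
n_pairs = len(d)
d_bar = mean(d)   # mean of the differences
s_d = stdev(d)    # sample SD of the differences (divisor n - 1)

# Paired t statistic under H0: mu_d = 0.
t_paired = (d_bar - 0) / (s_d / sqrt(n_pairs))

T_CRITICAL_8 = 2.306  # t table value, df = 8, two-tailed alpha = 0.05
reject_no_change = abs(t_paired) > T_CRITICAL_8
print(f"d_bar = {d_bar}, S_d = {s_d:.3f}, t = {t_paired:.3f}, "
      f"reject H0: {reject_no_change}")
```

The mean gain is 5 points, but t is about 1.63, below 2.306, so the data do not show a significant change at the 5% level.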

Single Mean Difference Mean
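Single Mean:
$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$, where $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Difference Mean:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_{(\bar{x}_1 - \bar{x}_2)}}$
$S_{(\bar{x}_1 - \bar{x}_2)} = S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$S = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$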

Paired t-test
$t = \frac{\bar{d} - \mu_d}{S_d/\sqrt{n}}$
$S_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}$
Where ‘d’ is the difference between the paired
scores, $\bar{d}$ is the mean of the differences
between paired observations, and n is the number of pairs.
