EKONOMETRIKA Insights

Statistics
Characteristics of populations and samples
Sampling distributions
Confidence intervals and hypothesis testing

Populations and Samples
• Population: The entire group about which information is desired.
• Sample: A portion or part of the population, usually the portion from which information is gathered.

Population vs. Sample
[Figure: a sample drawn from the population of interest]
A population is described by a parameter; a sample is described by a statistic.
We measure the sample using statistics in order to draw
inferences about the population and its parameters.

Tiger Woods
The probability that Tiger Woods wins

Probability when tossing a coin 3 times:
f(x) = P(X = x), where X = number of tails
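As a rough sketch of f(x) = P(X = x) for three tosses, the snippet below tabulates the binomial probabilities. It assumes a fair coin (P(tail) = 0.5, which the slide does not state) and reads X as the number of tails; the function name `tails_pmf` is made up for illustration.

```python
from math import comb

def tails_pmf(n_tosses, p_tail=0.5):
    """Return {x: P(X = x)} for X = number of tails in n_tosses tosses.

    Assumes independent tosses with the same tail probability (binomial model).
    """
    return {
        x: comb(n_tosses, x) * p_tail**x * (1 - p_tail) ** (n_tosses - x)
        for x in range(n_tosses + 1)
    }

pmf = tails_pmf(3)
for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")
```

With a fair coin the four probabilities are 1/8, 3/8, 3/8 and 1/8, which sum to 1.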

Example of computer simulation…
• How many heads come up in 100 coin tosses?
• Flip coins virtually
– Flip a coin 100 times; count the number of heads.
– Repeat this over and over again a large number of
times (we’ll try 30,000 repeats!)
– Plot the 30,000 results.
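The virtual coin-flipping steps above could be sketched as follows. This is only an illustration, not the simulation used for the slides: it assumes a fair coin, uses a fixed seed for repeatability, and prints a summary instead of drawing a plot. The name `simulate_head_counts` is hypothetical.

```python
import random
from collections import Counter

def simulate_head_counts(n_tosses=100, n_repeats=30_000, seed=0):
    """Repeat 'flip a coin n_tosses times, count heads' n_repeats times.

    Returns a Counter mapping number of heads -> how often it occurred.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_repeats):
        # Each of the n_tosses random bits is one fair coin flip (1 = head).
        heads = bin(rng.getrandbits(n_tosses)).count("1")
        counts[heads] += 1
    return counts

counts = simulate_head_counts()
total = sum(counts.values())
share_40_60 = sum(c for h, c in counts.items() if 40 <= h <= 60) / total
print(f"Share of experiments with 40 to 60 heads: {share_40_60:.3f}")
print(f"Fewest heads seen: {min(counts)}, most heads seen: {max(counts)}")
```

Plotting `counts` as a bar chart gives the histogram of the 30,000 results.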

Coin tosses…
Conclusions:
We usually get between 40 and 60 heads when we flip a coin 100 times.
It’s extremely unlikely that we will get 30 heads or 70 heads (didn’t happen in 30,000 experiments!).

Sampling
• In its broadest sense, sampling is a procedure by which
one or more members of a population are picked from
the population.
• The objective is to make certain observations upon the
members of the sample and then, on the basis of these
results, to draw conclusions about the characteristics of
the entire population.

Looking at the Process
When we randomly select a sample from a population, we can use the sample mean as an estimate of the population mean. This raises the question of how good the sample mean (a sample statistic) is as an estimate of the population mean (a population parameter).
The essence of this question is how well the process works: the process of using a sample to make estimates about the population.

How Good is a Sample Mean
The essential question is “How good is a sample mean
as an estimate of the population mean?”
One way to examine this question is to understand
that we used a process that involved randomly
selecting a sample from the population and then
calculating the mean for the values of the
observations in the sample.
We can repeat this process as many times as we wish
and examine what it produces.

Population of cholesterol values
Person   Cholesterol (mg/dl)
1        201
2        182
3        199
...      ...
128      124
129      180

Sampling Distributions
Individual observations: 149, 146, 132, ...
n = 1, µ = 150 lbs
\( \sigma^2 = 100\ \text{lbs}^2,\quad \sigma = 10\ \text{lbs} \)

Sample with n = 5
[Figure: a sample of 5 weights drawn from the population of weights]
\( n = 5;\quad \sum x = 732;\quad \bar{x} = \frac{\sum x}{n} = \frac{732}{5} = 146.4 \)

Ten Different Samples, n = 5
Sample    n    Mean      s²        s
1         5    147.43    88.14     9.39
2         5    153.98    117.91    10.86
3         5    146.50    103.66    10.18
4         5    155.53    91.99     9.59
5         5    147.87    149.65    12.23
6         5    143.60    66.76     8.17
7         5    146.87    64.23     8.01
8         5    149.19    280.88    16.76
9         5    150.05    200.28    14.15
10        5    146.92    173.36    13.17
Average        148.79    133.69    11.25
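A table like the one above could be generated along these lines. This is a sketch under assumptions: the slides give µ = 150 lbs and σ = 10 lbs but not the shape of the population, so a Normal population is assumed here, and the function name `sample_summaries` and the seed are illustrative only. The numbers it prints will not match the table.

```python
import random
import statistics

def sample_summaries(n, n_samples=10, mu=150.0, sigma=10.0, seed=1):
    """Draw n_samples random samples of size n and summarize each one.

    Assumes a Normal(mu, sigma) population (an assumption, not from the slides).
    Returns a list of (n, mean, s2, s) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        s2 = statistics.variance(sample)  # sample variance, divides by n - 1
        rows.append((n, mean, s2, s2 ** 0.5))
    return rows

rows = sample_summaries(n=5)
print("Sample   n     Mean       s2       s")
for i, (n, mean, s2, s) in enumerate(rows, start=1):
    print(f"{i:>6} {n:>3} {mean:>8.2f} {s2:>8.2f} {s:>7.2f}")
```

Calling `sample_summaries(n=20)` gives the corresponding table for samples of size 20.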

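Individual observations (n = 1): 149, 146, ...
\( \mu = 150\ \text{lbs},\quad \sigma^2 = 100\ \text{lbs}^2,\quad \sigma = 10\ \text{lbs} \)
Means for n = 5: 153.0, 146.4, ...
\( \mu = 150\ \text{lbs},\quad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 20\ \text{lbs}^2,\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 4.47\ \text{lbs} \)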

Standard Error of the Mean
\( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
The population that includes the means of all possible samples of size n is a long list of numbers, and the variance of these numbers can, in theory, be calculated:
\( \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \)
The square root of this variance is called the standard error of the mean. It is simply the standard deviation of this population of means.

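The standard error of the mean, σ/√n, can be compared against the standard deviation of many simulated sample means. The sketch below uses σ = 10 lbs and µ = 150 lbs from the slides and assumes a Normal population (the slides do not say so); the helper names are hypothetical.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulated_sd_of_means(n, mu=150.0, sigma=10.0, n_samples=20_000, seed=2):
    """Standard deviation of the means of n_samples random samples of size n.

    Assumes a Normal(mu, sigma) population (an assumption for illustration).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_samples)
    ]
    return statistics.pstdev(means)

for n in (1, 5, 20):
    formula = standard_error(10.0, n)
    simulated = simulated_sd_of_means(n)
    print(f"n = {n:>2}: formula {formula:.2f} lbs, simulated {simulated:.2f} lbs")
```

The simulated values should land close to the formula values, and both shrink as n grows.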
Sample with n = 20
[Figure: a sample of 20 weights drawn from the population of weights]
\( n = 20;\quad \sum x = 3057;\quad \bar{x} = \frac{\sum x}{n} = \frac{3057}{20} = 152.85 \)

Ten Different Samples, n = 20
Sample    n     Mean      s²        s
1         20    150.86    100.96    10.05
2         20    146.88    122.70    11.08
3         20    147.65    119.51    10.93
4         20    149.37    51.07     7.15
5         20    153.30    109.54    10.47
6         20    152.83    111.96    10.58
7         20    148.62    91.94     9.59
8         20    152.16    140.83    11.87
9         20    154.40    179.56    13.40
10        20    151.43    115.85    10.76
Average         150.75    114.39    10.59

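Individual observations: 149, 146, ...
\( \mu = 150\ \text{lbs},\quad \sigma^2 = 100\ \text{lbs}^2,\quad \sigma = 10\ \text{lbs} \)
Means for n = 5: 153.0, 146.4, ...
\( \mu = 150\ \text{lbs},\quad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 20\ \text{lbs}^2,\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 4.47\ \text{lbs} \)
Means for n = 20: 151.6, 151.3, ...
\( \mu = 150\ \text{lbs},\quad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = 5\ \text{lbs}^2,\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 2.24\ \text{lbs} \)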

A Sampling Distribution
Let’s create a sampling distribution of means…
Take a sample of size 1,500 from the US. Record the mean income. Our
census said the mean is $30K.

Take another sample of size 1,500 from the US. Record the mean income.

Let’s keep drawing samples of size 1,500 from the US, recording the mean income each time.
The sample means would stack up
in a normal curve. A normal
sampling distribution.

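Say that the standard deviation of this distribution is $10K.
Think back to the empirical rule. What are the odds you would get a sample mean that is more than $20K off?
[Figure: normal curve centered at $30K with z-scores from -3 to 3; 2.5% of sample means lie in each tail beyond ±2z]
$20K is 2 standard deviations, so about 5% of sample means (2.5% in each tail) would be more than $20K off.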
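The empirical-rule figure of 2.5% in each tail can be compared with the exact Normal tail area. The sketch below assumes the sampling distribution is exactly Normal with mean $30K and standard deviation $10K, as in the example; the function name is made up for illustration.

```python
from statistics import NormalDist

def prob_more_than_k_sd_off(k):
    """P(|Z| > k) for a standard Normal Z: both tails beyond k standard deviations."""
    z = NormalDist()  # mean 0, standard deviation 1
    return 2 * (1 - z.cdf(k))

sd_of_sample_means = 10_000   # $10K, from the example
distance = 20_000             # "more than $20K off"
k = distance / sd_of_sample_means
print(f"P(sample mean more than ${distance:,} off) = {prob_more_than_k_sd_off(k):.4f}")
```

The exact two-tail area beyond 2 standard deviations is about 4.55%, which the empirical rule rounds to 5%.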

An Example:
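A population’s car values have mean \( \mu \) = $12K and standard deviation \( \sigma \) = $4K.
Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s?
[Figure: two sampling distributions centered at $12K, each with z-scores from -3 to 3 and the middle 95% of M’s marked; the dollar values at the edges are shown as question marks]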

An Example:
Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s?
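n = 625: s.e. = $4K/√625 = $4K/25 = $160
n = 2500: s.e. = $4K/√2500 = $4K/50 = $80
[Figure: two sampling distributions centered at $12K, each with z-scores from -3 to 3. For n = 625, 95% of M’s fall within 2 s.e., between $11,680 and $12,320. For n = 2500, 95% of M’s fall between $11,840 and $12,160.]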
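The standard errors in this example, and the range holding roughly 95% of sample means (M ± 2 s.e.), can be worked out with a few lines. This is a sketch that takes µ = $12K and σ = $4K from the example and uses the 2-standard-error rule of thumb; `se_and_95_range` is a hypothetical helper name.

```python
import math

def se_and_95_range(mu, sigma, n):
    """Return (standard error, (low, high)) where low/high = mu -/+ 2 standard errors."""
    se = sigma / math.sqrt(n)
    return se, (mu - 2 * se, mu + 2 * se)

for n in (625, 2500):
    se, (low, high) = se_and_95_range(12_000, 4_000, n)
    print(f"n = {n}: s.e. = ${se:,.0f}; about 95% of M's between ${low:,.0f} and ${high:,.0f}")
```

Quadrupling the sample size from 625 to 2500 halves the standard error, so the larger sample gives the narrower distribution.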

Which sampling distribution is for sample size 625 and which is for 2500?
Which sample will be more precise? If you get a particularly bad sample, which sample size will
help you be sure that you are closer to the true mean?

• The Idea of a Confidence Interval
Definition:
A confidence interval for a parameter has two parts:
• An interval calculated from the data, which has the form:
estimate ± margin of error
The margin of error tells how close the estimate tends to be to the unknown parameter in repeated random sampling.
• A confidence level C, the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value.
We usually choose a confidence level of 90% or higher because we want to be quite sure of our conclusions. The most common confidence level is 95%.
The big idea: The sampling distribution of \( \bar{x} \) tells us how close to \( \mu \) the sample mean \( \bar{x} \) is likely to be. All confidence intervals we construct will have a form similar to this:
estimate ± margin of error

Confidence Intervals: The Basics
• Constructing a Confidence Interval
Why settle for 95% confidence when estimating a parameter? The price we pay for greater confidence is a wider interval.
When we calculated a 95% confidence interval for the mystery mean µ, we started with
estimate ± margin of error
Our estimate came from the sample statistic \( \bar{x} \). Since the sampling distribution of \( \bar{x} \) is Normal, about 95% of the values of \( \bar{x} \) will lie within 2 standard deviations (\( 2\sigma_{\bar{x}} \)) of the mystery mean. That is, our interval could be written as:
\( 240.79 \pm 2 \cdot 5 = \bar{x} \pm 2\sigma_{\bar{x}} \)
This leads to a more general formula for confidence intervals:
statistic ± (critical value) • (standard deviation of statistic)
• Calculating a Confidence Interval
The confidence interval for estimating a population parameter has the form
statistic ± (critical value) • (standard deviation of statistic)
where the statistic we use is the point estimator for the parameter.
Properties of Confidence Intervals:
• The “margin of error” is the (critical value) • (standard deviation of statistic).
• The user chooses the confidence level, and the margin of error follows from this choice.
• The critical value depends on the confidence level and the sampling distribution of the statistic.
• Greater confidence requires a larger critical value.
• The standard deviation of the statistic depends on the sample size n.
The margin of error gets smaller when:
• The confidence level decreases
• The sample size n increases
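How the margin of error responds to the confidence level and the sample size can be sketched for the simplest case. The code assumes the statistic is a sample mean with a Normal sampling distribution and a known population standard deviation σ, so the critical value is a z* value; `margin_of_error` is a hypothetical helper, and σ = 10 is borrowed from the weights example.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence_level, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma.

    confidence_level is a fraction such as 0.95; z* is the standard Normal
    critical value that leaves (1 - confidence_level) / 2 in each tail.
    """
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    return z_star * sigma / math.sqrt(n)

for level in (0.90, 0.95, 0.99):
    for n in (5, 20, 80):
        print(f"C = {level:.0%}, n = {n:>2}: margin of error = {margin_of_error(level, 10, n):.2f}")
```

Within each confidence level the margin shrinks as n grows, and for a fixed n it grows as the confidence level rises.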
