The z-scores and probabilities that we have considered so far are limited to situations in which the sample consists of a single score. Most research studies use much larger samples, such as n = 100. We will now extend the concepts of z-scores and probability to cover situations with larger samples.
When we conduct research, we use samples of known size drawn from a population of unknown size and shape. The samples will have different individuals, different scores, different means, and so on. In most cases, it is possible to obtain thousands of different samples from one population. How can we tell which sample best describes its population? What is the probability of selecting a sample that has a certain sample mean?
In general, the difficulty of working with samples is that a sample provides an incomplete picture of the population. In addition, any statistics that are computed for the sample will not be identical to the corresponding parameters for the entire population. This difference, or error between sample statistics and the corresponding population parameters, is called sampling error.
THE DISTRIBUTION OF SAMPLE MEANS
As noted, two separate samples will probably be different even though they are taken from the same population. If we continue to draw samples of the same size from the population and compute a statistic such as the mean for each one, those values form a new distribution whose elements are sample means rather than individual observations.
The distribution of sample means is the distribution formed by taking repeated samples from the same population, computing the mean of each sample, and forming the distribution of those sample means.
Consider a population that consists of only 4 scores: 2, 4, 6, 8. This population is pictured in the frequency histogram in Fig 7.1.
We are going to use this population as the basis for constructing the distribution of sample means for n = 2. Remember, the distribution is the collection of sample means from all the possible random samples of n = 2 from the population. For this example, there are 16 different samples (each score is returned to the population before the next one is drawn, so there are 4 × 4 = 16 ordered pairs), and they are listed in Table 7.1.
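The construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming sampling with replacement where the order of selection matters (which is what the count of 16 samples implies); the variable names are ours and do not come from the text.

```python
from collections import Counter
from itertools import product

population = [2, 4, 6, 8]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=n))

# The mean of each sample.
sample_means = [sum(s) / n for s in samples]

# How often each distinct sample mean occurs.
frequencies = Counter(sample_means)

print(len(samples))
for m in sorted(frequencies):
    print(m, frequencies[m])
```

The frequencies pile up around the middle value, 5, and thin out toward 2 and 8, so the distribution of sample means is already more peaked than the flat population of four scores.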
DISTRIBUTIONS OF SCORES VERSUS DISTRIBUTIONS OF SAMPLE MEANS
The distribution of sample means is different from the distributions of scores that we have considered before. Until now we have always discussed distributions of scores; now the values in the distribution are not scores, but statistics (sample means). Because statistics are obtained from samples, a distribution of statistics is referred to as a sampling distribution.
A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.
Thus, the distribution of sample means is an example of a sampling distribution. It is often called “the sampling distribution of M.”
THE CENTRAL LIMIT THEOREM
In more realistic circumstances, with larger populations and larger samples, the number of possible samples increases dramatically, and it is virtually impossible to actually obtain every possible random sample. Fortunately, it is possible to determine exactly what the distribution of sample means will look like without taking hundreds or thousands of samples. A mathematical proposition known as the central limit theorem provides a precise description of the distribution that would be obtained if we selected every possible sample, calculated every sample mean, and constructed the distribution of the sample means. The theorem states:
For any population with mean µ and standard deviation σ, the distribution of sample means for sample size n will have a mean of µ and a standard deviation of σ/√n, and will approach a normal distribution as n approaches infinity.
THE SHAPE AND THE MEAN OF THE DISTRIBUTION OF SAMPLE MEANS
The distribution of sample means tends to be a normal distribution, and it will be almost perfectly normal if either of the following two conditions is satisfied:
1. The population from which the samples are selected is a normal distribution.
2. The number of scores (n) in each sample is relatively large, around 30 or more.
The mean of the distribution of sample means is exactly equal to the value of the population mean. The mean value is called the expected value of M.
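A small simulation can make the shape and the expected value concrete. The sketch below is only an illustration under our own assumptions: the population is an exponential distribution with mean 1 (chosen because it is strongly skewed), n = 30, and 5000 samples are drawn with a fixed seed. None of these choices come from the text.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

def skewness(values):
    """Mean cubed deviation divided by the cubed standard deviation."""
    m = mean(values)
    variance = mean([(v - m) ** 2 for v in values])
    third_moment = mean([(v - m) ** 3 for v in values])
    return third_moment / variance ** 1.5

n = 30               # scores per sample
num_samples = 5000   # number of samples drawn

# Skewed population: exponential with mean 1 (an illustrative assumption).
population_mean = 1.0
scores = [rng.expovariate(1.0) for _ in range(num_samples)]

# One mean per sample of n scores.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

print("mean of sample means:", mean(sample_means))
print("skewness of individual scores:", skewness(scores))
print("skewness of sample means:", skewness(sample_means))
```

In a run like this, the mean of the sample means should land close to the population mean of 1, and the sample means should be far less skewed than the individual scores, which is what the two conditions above lead us to expect for n around 30.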
THE VARIABILITY OF THE DISTRIBUTION OF SAMPLE MEANS
The standard deviation of the distribution of sample means is called the standard error of M. The notation that is used to identify the standard error is σ_M.
The magnitude of the standard error is determined by two factors: the sample size, and the population standard deviation.
The formula for standard error is
Standard error = σ_M = σ/√n
Remember that a sample is not expected to provide a perfectly accurate reflection of its population; there typically will be some error between the sample and the population. The standard error measures how much difference should be expected, on average, between a sample mean M and the population mean µ. The standard error is an extremely valuable measure because it specifies how well a sample mean estimates its population mean. For any sample size n, we can now compute the standard error: the larger the sample, the smaller the standard error.
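The standard error can be checked against a population small enough to enumerate. The sketch below assumes the usual formula, standard error = σ/√n, and reuses the four-score population 2, 4, 6, 8 with n = 2; the helper name standard_error and the value σ = 12 in the loop are our own illustrative choices.

```python
import math
from itertools import product
from statistics import mean, pstdev

def standard_error(sigma, n):
    """Standard error of M for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

population = [2, 4, 6, 8]
n = 2
sigma = pstdev(population)  # population standard deviation

# Means of all 16 ordered samples of size 2 drawn with replacement.
sample_means = [mean(s) for s in product(population, repeat=n)]
sd_of_means = pstdev(sample_means)

# The two values agree up to floating-point rounding.
print(standard_error(sigma, n), sd_of_means)

# The standard error shrinks as n grows (sigma = 12 is illustrative).
for size in (1, 4, 36, 144):
    print(size, standard_error(12, size))
```

With n = 1 the standard error equals the population standard deviation, and quadrupling the sample size cuts the standard error in half.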
PROBABILITY AND THE DISTRIBUTION OF SAMPLE MEANS
Because the distribution of sample means tends to be normal, we can use z-scores and the unit normal table to find probabilities for specific sample means.
For a normally distributed population with µ = 60 and σ = 12, what is the probability of selecting a random sample of n = 36 scores with a sample mean greater than 64? In symbols, p(M > 64) = ?
Step 1. Find the standard error. (Remember, we must use the standard error and not the standard deviation, because we are dealing with the distribution of sample means.)
σ_M = σ/√n = 12/√36 = 12/6 = 2
Step 2. Compute the z-score for the sample mean.
z = (M − µ)/σ_M = (64 − 60)/2 = 2
Step 3. Look up the probability in the unit normal table: p(M > 64) = p(z > +2.00) = 0.0228
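The same three steps can be reproduced in Python as a check on the table lookup. This is a minimal sketch that uses NormalDist from the standard library's statistics module in place of the unit normal table; the variable names are ours.

```python
import math
from statistics import NormalDist

mu, sigma, n = 60, 12, 36
M = 64

# Step 1: standard error of M.
standard_error = sigma / math.sqrt(n)   # 12 / 6 = 2.0

# Step 2: z-score for the sample mean.
z = (M - mu) / standard_error           # (64 - 60) / 2 = 2.0

# Step 3: area in the upper tail of the standard normal beyond z.
p = 1 - NormalDist().cdf(z)

print(standard_error, z, round(p, 4))   # 2.0 2.0 0.0228
```

An equivalent shortcut is to skip the z-score and evaluate 1 - NormalDist(mu, standard_error).cdf(M), which treats the distribution of sample means directly as a normal distribution with mean µ and standard deviation σ/√n.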
LOOKING AHEAD TO INFERENTIAL STATISTICS
Inferential statistics are methods that use sample data as the basis for drawing general conclusions about populations. However, we have noted that a sample is not expected to give a perfectly accurate reflection of its population. There will be some error or discrepancy between a sample statistic and the corresponding population parameter. In particular, a sample mean will not be exactly equal to the population mean. The standard error of M specifies how much difference is expected on average between the mean for a sample and the mean for the population. In the next topics we will introduce a variety of statistical methods that use sample means to draw inferences about population means.