MTH 245 Lesson 17 Notes
Sampling Distributions
Sample statistics (such as x̄) are themselves random variables. The reason for this is that no two samples of the same size n from a given population will contain the exact same set of data values, except by (extremely rare) coincidence. Since the samples are chosen randomly, their statistics will not necessarily be equal due to random variation.
Since sample statistics are random variables, they have their own probability distributions, which are called sampling distributions. A sampling distribution defines the probabilities associated with values of a statistic for all possible samples of size n drawn from the same population.
Theoretically, a sampling distribution for a particular statistic can be constructed by selecting every possible sample of size n from the population, calculating the value of the statistic for each sample, and then constructing a grouped frequency histogram of those values. This only works in practice for populations whose size N is relatively small. Therefore, statisticians use a set of standard sampling distributions. One of the more commonly used of these is the normal distribution, which we will use in a number of cases in Lessons 18-31.
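The construction described above can be sketched for a very small population. This is only an illustration: the five population values and the sample size n = 2 are made up, and samples are drawn without replacement.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical tiny population (N = 5); the values are made up for illustration.
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))

# The statistic (here the sample mean) for each sample.
sample_means = [mean(s) for s in samples]

# Tally how often each value occurs, then convert counts to probabilities.
counts = Counter(sample_means)
sampling_distribution = {m: c / len(samples) for m, c in sorted(counts.items())}

for m, prob in sampling_distribution.items():
    print(f"x-bar = {m}: probability {prob:.1f}")
```

Even with N = 5 there are already 10 samples of size 2; the number of possible samples grows very quickly with N and n, which is why this approach is impractical for large populations.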
The Sampling Distribution of the Sample Proportion (p̂)
The population proportion p is the proportion of "favorable" observations (or "successes") associated with a binomial random variable x. Note: this is the same p (the probability of observing a "success" in an individual binomial trial) that we were introduced to in Chapter 5.
Up to this point, we have accepted the value of p as a given. However, since it is itself a population parameter, it usually needs to be estimated using a sample proportion p̂. The sample statistic p̂ is the proportion of "successes" in a given sample of n binomial trials.
The true sampling distribution of p̂ is binomial (p̂ is the binomial count of successes divided by n). However, it can be shown that when np̂ > 5 and n(1 − p̂) > 5, the sampling distribution of p̂ can be closely approximated by a normal distribution. We won't explore the math behind this approximation here but will make use of it in later chapters.
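As a rough illustration of the rule of thumb above, the sketch below compares an exact binomial probability with its normal approximation. The values n = 100 and p = 0.30 are hypothetical, p is used in place of p̂ when checking the rule, and the standard error formula √(p(1 − p)/n) is the usual one but is not stated in these notes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.30  # hypothetical values, chosen only for illustration

# Rule of thumb from the notes (p stands in for p-hat here).
rule_ok = n * p > 5 and n * (1 - p) > 5

# Exact: p-hat <= 0.25 means at most 25 successes in 100 binomial trials.
k = 25
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with mean p and standard error sqrt(p(1 - p)/n)
# (assumed formula; not given in the notes).
approx = NormalDist(p, sqrt(p * (1 - p) / n)).cdf(k / n)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"exact binomial P(p-hat <= 0.25) = {exact:.4f}")
print(f"normal approximation            = {approx:.4f}")
```

The two probabilities should be close but not identical; the approximation improves as n grows.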
The Sampling Distribution of the Sample Mean (x̄)
Consider a random variable x whose probability distribution has mean μ and standard deviation σ.
The Central Limit Theorem (CLT) states that for all random samples of x of the same size n, and for n ≥ 30, the sampling distribution of x̄ can be approximated by a normal distribution with mean μ_x̄ = μ and standard deviation σ_x̄ = σ/√n, regardless of the probability distribution of x.
The value σ_x̄ is called the standard error of x̄ to distinguish it from σ, the standard deviation of the original random variable x.
Note that if x is normal to begin with, then x̄ is normally distributed regardless of sample size.
Example 1: Complete the Sampling Distribution activity (located in the Canvas module for this lesson). After 100,000 simulated samples of size n = 50, what is the shape of the distribution of the x̄ values?
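A simulation in the spirit of this activity can be sketched as follows. It is not the Canvas activity itself: the population here is assumed to be exponential (strongly right-skewed, with μ = 1 and σ = 1), and only 10,000 samples are drawn to keep the run short.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(245)  # fixed seed so the run is repeatable

n = 50                # sample size, as in Example 1
num_samples = 10_000  # fewer than the activity's 100,000, to keep the run short

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0
standard_error = sigma / sqrt(n)

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = stdev(sample_means)

# For a normal distribution, about 68% of values fall within one standard
# deviation of the mean; use that as a crude check on the shape.
within_one_se = sum(abs(m - mu) <= standard_error for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.3f} (CLT predicts {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (CLT predicts {standard_error:.3f})")
print(f"share within one standard error: {within_one_se:.3f}")
```

Even though the assumed population is far from normal, the sample means should cluster symmetrically around μ with a spread close to σ/√n.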
Applying the Central Limit Theorem
If the original random variable x is normal, or if it’s not normal and n ≥ 30, then we can use the methods presented in Lesson 15 to calculate probabilities associated with x̄, as well as determine which values of x̄ (if any) are significant.
If x is not normal and n < 30, this normal approximation doesn’t work, and we need to use an alternative method (such as bootstrapping, introduced in Lesson 23).
Example 2: For a certain brand of gumdrops, x, the weight of an individual candy, has a non-normal distribution with a mean of 0.41 oz and a standard deviation of 0.05 oz. Suppose the gumdrops are sold in bulk packages containing 200 candies each.
a. What is the distribution of x̄, the mean weight per candy in a bag of 200 candies?
Since n = 200 ≥ 30, we can invoke the Central Limit Theorem and assume x̄ has an approximately normal distribution.
b. What are the mean and standard error of that distribution?
The mean is μ_x̄ = μ = 0.41 oz and the standard error is σ_x̄ = σ/√n = 0.05/√200 ≈ 0.004 oz.
c. What is the probability that a bag of 200 candies will have a mean weight per candy of at most 0.40 oz?
Using the StatCrunch normal distribution calculator with μ = 0.41 and σ_x̄ = 0.004, we get P(x̄ ≤ 0.40) = 0.006.
We use x̄ instead of x because we're calculating a probability related to the mean weight of 200 candies instead of to the weight of an individual candy.
Note: StatCrunch's normal distribution calculator does not distinguish between standard deviation and standard error, so we need to insert σ_x̄ = 0.004 in the "Std. Dev." field when solving Central Limit Theorem problems.
d. Assuming significance level α = 0.05, is 0.40 oz a significantly low value of x̄?
Using the probability method for determining significant values of a normal distribution (Lesson 15), since P(x̄ ≤ 0.40) = 0.006 < 0.025 (that is, less than α/2), 0.40 oz is a significantly low value of x̄.
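The calculations in Example 2 can be checked without StatCrunch. The sketch below uses Python's standard-library statistics.NormalDist as a stand-in for the calculator; the variable names are illustrative only. It also shows the effect of rounding the standard error to 0.004 before computing the probability.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 0.41, 0.05, 200  # values from Example 2
alpha = 0.05

# Part b: standard error of the sample mean.
se = sigma / sqrt(n)

# Part c: P(x-bar <= 0.40), first with the rounded standard error used in
# the notes, then with the unrounded value.
p_rounded = NormalDist(mu, 0.004).cdf(0.40)
p_unrounded = NormalDist(mu, se).cdf(0.40)

# Part d: probability method, comparing against alpha / 2.
significantly_low = p_rounded < alpha / 2

print(f"standard error = {se:.4f}")
print(f"P(x-bar <= 0.40) with SE = 0.004: {p_rounded:.3f}")
print(f"P(x-bar <= 0.40) with unrounded SE: {p_unrounded:.3f}")
print(f"significantly low at alpha = {alpha}: {significantly_low}")
```

With the unrounded standard error the probability is somewhat smaller than 0.006, but the conclusion in Part d is the same either way.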
Example 3: Suppose an elevator is rated with a capacity of 16 passengers and a maximum load of 2,500 lb, for a maximum weight per person of 156.25 lb. The random variable x, the weight of an individual adult, is distributed as shown in the table below. In a worst-case scenario, an elevator could be loaded with 16 men. If so, would the resulting load exceed rated capacity?
               Males      Females
μ              182.9 lb   165.0 lb
σ              40.8 lb    45.6 lb
Distribution   Normal     Normal
a. Find the probability that a single randomly selected adult male has a weight greater than 156.25 lb.
To calculate this probability, we need to use the distribution of individual weights. Plugging the parameter values from the "Males" column in the table above into the StatCrunch normal distribution calculator gives P(x ≥ 156.25) = 0.743.
b. Find the probability that a random sample of 16 adult males has a mean weight greater than 156.25 lb. What does the result suggest about the posted maximum capacity of 16?
Here we need the probability that the mean weight of a random sample of 16 adult males exceeds 156.25 lb. This requires us to find the standard error: σ_x̄ = 40.8/√16 = 10.2. Inputting this value and μ = 182.9 into the normal distribution calculator, we get P(x̄ ≥ 156.25) = 0.996. Since the elevator is likely to fail if it is loaded with 16 men, the passenger capacity needs to be reduced.
c. Suppose the maximum capacity is reduced to 10 passengers. (Assume the maximum load remains at 2,500 lb.) How likely is the elevator to fail?
If the passenger capacity decreases to 10 and the load remains at 2,500 lb, the maximum weight per person is now 2500/10 = 250 lb/person. The standard error is now σ_x̄ = 40.8/√10 = 12.9, and the probability of failure is now P(x̄ ≥ 250) ≈ 0.000. The elevator is now unlikely to fail.
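The three parts of Example 3 follow the same pattern, which can be sketched with statistics.NormalDist standing in for the StatCrunch calculator. Variable names are illustrative; the parameter values come from the "Males" column of the table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 182.9, 40.8  # adult males, from the table in Example 3
max_load = 2500          # lb

# Part a: one man heavier than the 16-passenger per-person limit.
limit_16 = max_load / 16
p_single = 1 - NormalDist(mu, sigma).cdf(limit_16)

# Part b: mean weight of 16 men above the same limit (uses the standard error).
se_16 = sigma / sqrt(16)
p_mean_16 = 1 - NormalDist(mu, se_16).cdf(limit_16)

# Part c: capacity cut to 10 passengers, so the per-person limit rises.
limit_10 = max_load / 10
se_10 = sigma / sqrt(10)
p_mean_10 = 1 - NormalDist(mu, se_10).cdf(limit_10)

print(f"a. P(x > {limit_16}) = {p_single:.3f}")
print(f"b. SE = {se_16:.1f}, P(x-bar > {limit_16}) = {p_mean_16:.3f}")
print(f"c. SE = {se_10:.1f}, P(x-bar > {limit_10}) = {p_mean_10:.3f}")
```

The only change between Part a and Part b is the spread: σ for an individual versus σ/√n for a sample mean.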
Example 4: A school needs to purchase 25 desks for a kindergarten classroom. These desks must accommodate the sitting heights of five-year-old kindergarten students, the distribution of which is described in the table below.
               Boys       Girls
μ              61.8 cm    61.2 cm
σ              2.9 cm     3.1 cm
Distribution   Normal     Normal
a. What sitting height will accommodate 95% of individual boys?
We need to find the 95th percentile of the distribution of individual sitting heights. For that, we simply enter the parameter values for boys from the table above (μ = 61.8, σ = 2.9) into the normal distribution calculator, which yields a 95th percentile of 66.6 cm.
b. What is the 95th percentile of the mean sitting heights of random samples of 25 boys?
To answer this part of the problem, we need the 95th percentile of the distribution of average sitting heights for random samples of 25 boys. This requires us to find the standard error: σ_x̄ = 2.9/√25 = 0.58. Inputting this value and μ = 61.8 into the normal distribution calculator, we get a 95th percentile for samples of 25 boys of 62.8 cm.
c. What percentage of the population of kindergarten boys would be able to fit in desks designed with the sitting height from Part a but would not fit in desks designed with the sitting height from Part b?
The boys who would fit into the bigger desks but not the smaller ones have sitting heights of between 62.8 and 66.6 cm. To find the percentage, we need to calculate P(62.8 ≤ x ≤ 66.6) using the distribution of individual sitting heights (μ = 61.8, σ = 2.9). Using the normal distribution calculator, we get P(62.8 ≤ x ≤ 66.6) = 0.316 = 31.6%.
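Example 4 runs the calculator in the other direction (from a probability to a value). A sketch of the same steps, using the inv_cdf method of statistics.NormalDist in place of StatCrunch and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 61.8, 2.9, 25  # boys, from the table in Example 4

individuals = NormalDist(mu, sigma)              # individual sitting heights
sample_means = NormalDist(mu, sigma / sqrt(n))   # means of samples of 25 boys

# Part a: 95th percentile of individual sitting heights.
p95_individual = individuals.inv_cdf(0.95)

# Part b: 95th percentile of the sample mean.
p95_mean = sample_means.inv_cdf(0.95)

# Part c: share of individual boys between the two (rounded) cutoffs,
# computed with the distribution of individuals, as in the notes.
low, high = round(p95_mean, 1), round(p95_individual, 1)
share_between = individuals.cdf(high) - individuals.cdf(low)

print(f"a. 95th percentile, individuals: {p95_individual:.1f} cm")
print(f"b. 95th percentile, means of 25: {p95_mean:.1f} cm")
print(f"c. P({low} <= x <= {high}) = {share_between:.3f}")
```

Designing desks around the Part b cutoff would leave out a sizable share of individual boys, because the sample mean varies far less than individual heights do.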
Example 5: A certain drink container has a labeled weight of 12 fluid ounces. The manufacturer claims that x, the weight of an individual container, has mean μ = 12.00 fl oz and standard deviation σ = 0.11 fl oz. Suppose a random sample of n = 36 containers is weighed, and the mean weight is x̄ = 12.19 fl oz.
a. Assuming the manufacturer's claim is true, what is the probability that a random sample of 36 containers has a mean weight of 12.19 fl oz or greater?
Since n = 36 ≥ 30, the Central Limit Theorem applies. The standard error is σ_x̄ = 0.11/√36 ≈ 0.018. Inputting this value and μ = 12.00 into the normal distribution calculator, we get P(x̄ ≥ 12.19) ≈ 0.000.
b. Assuming the manufacturer's claim is true, is a mean weight of 12.19 fl oz or greater a significantly high value of x̄? (Assume α = 0.01.)
Since P(x̄ ≥ 12.19) < 0.005 (that is, less than α/2), 12.19 is a significantly high value. In fact, its z-score is 10.56 when computed with the rounded standard error 0.018, or about 10.36 with the unrounded standard error (you should be able to verify this using the formula in Lesson 14).
c. Is it likely that a sample of 36 containers would have a mean weight of 12.19 fl oz due to random chance? What does this result suggest about the manufacturer's claim that μ = 12.00?
If the manufacturer's claim is true, it is extremely unlikely that a sample of 36 containers would have a mean weight of 12.19 fl oz simply by random chance. This suggests that the manufacturer's claim is probably false (and the true value of μ is likely greater than 12 fl oz).
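The numbers in Example 5 can be reproduced with the short sketch below, again using statistics.NormalDist as a stand-in for StatCrunch. The z-score is computed as (x̄ − μ)/σ_x̄, which is assumed here to be the Lesson 14 formula applied to the sampling distribution; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.00, 0.11, 36  # manufacturer's claim and sample size
x_bar = 12.19                   # observed sample mean
alpha = 0.01

# Part a: standard error and the upper-tail probability.
se = sigma / sqrt(n)
p_upper = 1 - NormalDist(mu, se).cdf(x_bar)

# Part b: z-score of the observed mean, with and without rounding the
# standard error to 0.018 as the notes do.
z_unrounded = (x_bar - mu) / se
z_rounded = (x_bar - mu) / 0.018
significantly_high = p_upper < alpha / 2

print(f"standard error = {se:.4f}")
print(f"P(x-bar >= {x_bar}) = {p_upper:.3f}")
print(f"z-score (unrounded SE) = {z_unrounded:.2f}")
print(f"z-score (SE = 0.018)   = {z_rounded:.2f}")
print(f"significantly high at alpha = {alpha}: {significantly_high}")
```

Either way the sample mean lies more than ten standard errors above the claimed mean, which is why the probability is effectively zero.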
8110 Week 8 Discussion Info
QUALITATIVE RESEARCH EVALUATION 2
Discussion: Designing Qualitative Research
Typically, when speaking of validity, qualitative researchers are
referring to research that is credible and trustworthy, i.e., the
extent to which one can have confidence in the study’s findings
(Lincoln & Guba, 1985). Generalizability, a marker of
reliability, is typically not a main purpose of qualitative
research because the researcher rarely selects a random sample
with a goal to generalize to a population or to other settings and
groups. Rather, a qualitative researcher’s goal is often to
understand a unique event or a purposively selected group of
individuals. Therefore, when speaking of reliability, qualitative
researchers are typically referring to research that is consistent
or dependable (Lincoln & Guba, 1985), i.e., the extent to which
the findings of the study are consistent with the data that was
collected.
References & Resources
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry.
Thousand Oaks, CA: Sage.
Golafshani, N. (2003). Understanding reliability and validity in
qualitative research. The Qualitative Report, 8(4), 597–606.
Retrieved from http://nsuworks.nova.edu/tqr/vol8/iss4/6
For this Discussion, you will explain criteria for evaluating the
quality of qualitative research and consider the connection of
such criteria to philosophical orientations. You will also
consider the ethical implications of designing qualitative
research.
With these thoughts in mind:
Assignment Task 1
Write 1½ pages addressing the following:
· Explanation of two criteria for evaluating the quality of
qualitative research designs.
· Next, explain how these criteria are tied to epistemological
and ontological assumptions underlying philosophical
orientations and the standards of your discipline.
· Then, identify a potential ethical issue in qualitative research
and explain how it might influence design decisions.
· Finally, explain what it means for a research topic to
be amenable to scientific study using a qualitative approach.
Write the information with in-text citations, and cite references
to the week’s Learning Resources and other scholarly evidence
in APA Style.
Assignment Task 2
Respond to a classmate in a 150-word response by offering a
strategy to address the ethical issue she or he identified.
