Problem I
- Write your first name, middle name, and last name in capital
letters. The letters involved in your full name would comprise
your data set. In case you do not have a middle name, or you do
not want to include your real middle name, make one up. Then,
do the following.
Write your data in order from A to Z and double check.
For example, the student whose complete name is First Middle
Last would have
A A E E E J K M N N R R S Y
Your full name: ………………..
Letters in order with existing repetitions:
What is the type of your data? Circle, or list, all that apply:
Numerical, continuous, discrete, categorical, non-numerical,
quantitative, qualitative
What is the size of your data set?
What scale of measurement is applicable to your data (nominal,
ordinal, interval, ratio)?
Support your answer briefly.
Is the word “range,” with its actual definition in statistics,
applicable to your data set? How can you say something about
your data involving “range” in your statement, anyway?
Is your data set a sample or a population?
Support your answer briefly.
Depending on your answer to Question 6 above, and recalling
what we said in class, what is the correct notation to show the
size of your data set in statistics?
What is (are) the mode(s) of your data set, if any? Is your
data set unimodal, bimodal, trimodal, …?
What is the frequency of the mode? In case you have more than
one mode, provide the frequency of each.
Recalling the example discussed in class or provided in your
eTextbooks, construct a “Frequency Distribution Table,” a
complete (seven-column) frequency distribution table. You
should use the following headings for your table:
Letter, Frequency (F), Relative Frequency (RF), Percent
Relative Frequency (PRF), Cumulative Frequency (CF),
Cumulative Relative Frequency (CRF), and Cumulative Percent
Relative Frequency (CPRF).
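The seven columns can be built mechanically from the letter counts. The following is a minimal Python sketch, not a required method; it uses the example data set given earlier (A A E E E J K M N N R R S Y), and the function name `frequency_table` is an illustrative choice rather than anything from class or the eTextbook.

```python
from collections import Counter

def frequency_table(letters):
    """Return rows (Letter, F, RF, PRF, CF, CRF, CPRF) in A-to-Z order."""
    n = len(letters)                 # size of the data set
    counts = Counter(letters)        # frequency of each distinct letter
    rows = []
    cumulative = 0
    for letter in sorted(counts):
        f = counts[letter]
        cumulative += f              # running total gives the CF column
        rows.append((letter, f, f / n, 100 * f / n,
                     cumulative, cumulative / n, 100 * cumulative / n))
    return rows

# Example data set from the problem statement
example = list("AAEEEJKMNNRRSY")
table = frequency_table(example)

print("Letter  F  RF      PRF      CF  CRF     CPRF")
for letter, f, rf, prf, cf, crf, cprf in table:
    print(f"{letter:<6}  {f}  {rf:.4f}  {prf:6.2f}%  {cf:2d}  {crf:.4f}  {cprf:6.2f}%")
```

A quick check on any such table: the last row's CF must equal the size of the data set, and its CPRF must be 100%.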
By examining appropriate rows and columns of the frequency
table that you have constructed for Step 10 above, write down
(in a small table) the fractions (in percentage) of your data set
that the vowels A, E, I, O, and U comprise individually and
collectively.
Using the frequency table created in Step 10 above and,
preferably, hand drawing on graph paper (show at least some
work, in case you use technology):
(a) Construct a bar chart for the frequency (F) distribution.
(See NOTE below)
(b) Construct a bar chart for the percent relative frequency
(PRF) distribution. (See NOTE below)
(c) Compare your F distribution with your PRF distribution.
Briefly explain your finding(s).
NOTE: You may do Parts (a) and (b) displaying the categories from
highest F (or PRF) to lowest F (or PRF) from left to right; the
resulting bar chart is called a “Pareto Bar Chart.”
Please note that each bar chart must have a descriptive title,
and the x and y axes must have descriptive labels.
Plot the points corresponding to the cumulative percent
relative frequency (CPRF) distribution and connect them; the
graph constructed by the line segments is called an “ogive.”
Please note that your ogive would be an “increasing curve.” The
plot should have a title and the axes should be appropriately
tick marked and labeled.
Construct a pie chart to graphically display the relative
frequencies in percentages. (In case you use technology to
produce the pie chart, show some calculations to demonstrate
that you know what is involved in finding the share of each
category from the whole circle.)
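The share of each category from the whole circle is its relative frequency times 360 degrees. The sketch below illustrates that calculation in Python on the example data set from Problem I; the helper name `pie_slices` is hypothetical, and the code is meant only to show the arithmetic, not to replace the hand calculation.

```python
from collections import Counter

def pie_slices(letters):
    """Map each letter to (percent of the data set, central angle in degrees)."""
    n = len(letters)
    counts = Counter(letters)
    return {letter: (100 * f / n, 360 * f / n)
            for letter, f in sorted(counts.items())}

# Example data set from Problem I
slices = pie_slices(list("AAEEEJKMNNRRSY"))
for letter, (percent, angle) in slices.items():
    print(f"{letter}: {percent:5.2f}% of the data -> {angle:6.2f} degrees")
```

The central angles must add up to 360 degrees, just as the percentages add up to 100%.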
Problem II
- Choose and write down 10 distinct (different) whole numbers
(no modes) less than 100 in a way that your data set would have
a range of 76, a mean of 55, and a median of 50.
Show your data set and your work to demonstrate that your data
set does have the statistical characteristics mentioned.
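One way to double check a candidate data set against the stated requirements is sketched below in Python. The list `candidate` is a hypothetical illustration made up for this sketch, not the expected answer, and `check_data_set` is an assumed helper name.

```python
from statistics import median

def check_data_set(data):
    """Report which of the Problem II requirements a data set satisfies."""
    return {
        "ten_values": len(data) == 10,
        "whole_numbers_below_100": all(
            isinstance(x, int) and 0 <= x < 100 for x in data),
        "distinct_no_modes": len(set(data)) == len(data),
        "range_is_76": max(data) - min(data) == 76,
        "mean_is_55": sum(data) == 55 * len(data),   # mean 55 <=> sum is 55 * n
        "median_is_50": median(data) == 50,          # average of the two middle values
    }

# Hypothetical candidate, used only to exercise the checks
candidate = [20, 30, 40, 45, 49, 51, 60, 70, 89, 96]
for requirement, satisfied in check_data_set(candidate).items():
    print(f"{requirement}: {satisfied}")
```

With ten values, the median is the average of the fifth and sixth values in sorted order, so those two must add up to 100; a mean of 55 forces the sum of all ten values to be 550.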
Calculate the midrange.
Estimate the standard deviation using the “range rule of thumb,”
which is based on the fact that four standard deviations
practically cover the span of the data (about 95% for normal
distributions). Do not calculate the actual standard deviation
value.
Determine the Interquartile Range (IQR).
Are the verified/computed values “statistics” or “parameters”?
Explain your answer briefly.
Construct a boxplot for your data set. Any outliers?
Based on the appearance of your boxplot, is the distribution of
your data set normal, close-to-normal, left skewed, or right
skewed?
Using your estimate of the standard deviation, determine what
percentage of the data points fall within one standard deviation
of the mean. Briefly explain why your computed percentage
is close to, or far from, what the “Empirical Rule” says.
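The calculation can be sketched in Python as follows, assuming the range rule of thumb (standard deviation estimated as range divided by 4) and counting values that lie within one estimated standard deviation of the mean, endpoints included. The data set shown is a hypothetical illustration, and the function name is an assumption made for this sketch.

```python
def share_within_one_sd(data):
    """Return (estimated sd, percent of values within one sd of the mean)."""
    s_est = (max(data) - min(data)) / 4      # range rule of thumb
    mean = sum(data) / len(data)
    inside = [x for x in data if mean - s_est <= x <= mean + s_est]
    return s_est, 100 * len(inside) / len(data)

# Hypothetical data set with range 76 and mean 55
data = [20, 30, 40, 45, 49, 51, 60, 70, 89, 96]
s_est, percent = share_within_one_sd(data)
print(f"estimated sd = {s_est}, within one sd of the mean: {percent}%")
```

The Empirical Rule gives about 68% within one standard deviation for bell-shaped distributions, so the comparison depends on how close to normal the chosen data set is.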
Problem III
- In problem 84 of Chapter 1 of Illowsky’s eTextbook (Table
1.37), which was one of your homework problems, the class
intervals or bins have been listed under “Age.” We are
interested in knowing the mean age of the chief executive
officers (CEOs) involved in the study.
Can we calculate the exact mean age of the CEOs studied, based
on the information provided in the table?
Briefly explain your answer.
To estimate the mean age of the CEOs studied, we can resort to
the class intervals under Age (the left-most column of the
table); see Section 2.5 of Illowsky’s eTextbook. Consider the
midpoint (midrange) of each class interval to represent the age
of each CEO in that class interval. For example, the three
CEOs in the class interval “40-44” are considered to be
(40 + 44)/2 = 42 years of age each. Then, three values of our
data set would be 42, 42, 42, noting that the class frequency
is 3. Find the midpoints of the remaining class intervals and
note their corresponding class frequencies to complete your
data set; you may devote one column to the midpoint values.
Then, find the estimated mean CEO age and report the value to
two decimal places.
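The midpoint method amounts to a frequency-weighted average of the class midpoints. The Python sketch below shows the arithmetic under stated assumptions: only the “40-44” class with frequency 3 comes from the text above, and every other row in `hypothetical_classes` is made up for illustration and is not the data of Table 1.37.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower limit, upper limit, frequency) rows."""
    n = sum(f for _, _, f in classes)
    # Each observation in a class is represented by the class midpoint
    weighted_sum = sum((lo + hi) / 2 * f for lo, hi, f in classes)
    return weighted_sum / n

# First row is from the text; the remaining rows are hypothetical
hypothetical_classes = [
    (40, 44, 3),
    (45, 49, 7),
    (50, 54, 12),
    (55, 59, 7),
    (60, 64, 3),
]
estimated_mean = grouped_mean(hypothetical_classes)
print(f"estimated mean age = {estimated_mean:.2f}")
```

Because every value in a class is replaced by the class midpoint, the result is an estimate of the mean rather than its exact value.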
Find an estimate for the median age. Support your answer by
relating your answer to the actual definition of median.
Having estimated the mean and median in Parts (b) and (c)
above, respectively, estimate the “midrange value” for the data
represented in the table, as the third measure of the center of
the data.
Among the estimates found in Parts (b), (c), and (d) above,
which one is relatively more accurate than others? Support your
answer by a brief explanation.
Draw a histogram based on the information provided in the
table.
Based on the appearance of your histogram (drawn for Part (f)
above), state whether the frequency distribution is almost
normal, skewed to the left, or skewed to the right; explain
your answer briefly.
Estimate the standard deviation to accompany the estimated
mean, as the mean and standard deviation go hand in hand.
Hand calculations would be easy for the case at hand and are
highly recommended for this quiz; please see the formulas
provided in Section 2.7 of Illowsky’s eTextbook. Standard
deviation is simply the square root of the variance.
In case you use technology to do the calculation,
you should show some hand calculations to demonstrate that
you know how it is done manually; otherwise, you will not earn
full credit.
As a “sanity check,” show that your result for Part (h) is
somewhere between one fourth of the range and one sixth of the
range; how does it compare with the mean of the two bounds?
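The grouped standard deviation and the sanity check can be sketched together in Python. Several assumptions are made here and should be checked against Section 2.7: the sample form (division by n - 1) is used, the range is taken from the lowest class limit to the highest class limit, and all rows of `hypothetical_classes` except “40-44” with frequency 3 are made up for illustration.

```python
from math import sqrt

def grouped_sample_sd(classes):
    """Estimate the sample sd from (lower limit, upper limit, frequency) rows."""
    n = sum(f for _, _, f in classes)
    mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n
    # Sum of squared deviations of the midpoints, weighted by class frequency
    ss = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes)
    return sqrt(ss / (n - 1))        # assumption: sample form, n - 1

def range_bounds(classes):
    """Return (range / 6, range / 4), using the extreme class limits."""
    data_range = max(hi for _, hi, _ in classes) - min(lo for lo, _, _ in classes)
    return data_range / 6, data_range / 4

# First row is from the text; the remaining rows are hypothetical
hypothetical_classes = [
    (40, 44, 3),
    (45, 49, 7),
    (50, 54, 12),
    (55, 59, 7),
    (60, 64, 3),
]
sd = grouped_sample_sd(hypothetical_classes)
low, high = range_bounds(hypothetical_classes)
print(f"sd = {sd:.2f}, bounds = [{low:.2f}, {high:.2f}], "
      f"mean of bounds = {(low + high) / 2:.2f}")
```

If the estimate falls outside the two bounds, that does not by itself prove an error, but it is a signal to recheck the midpoints, frequencies, and the divisor used.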