Ashford 2: - Week 1 - Instructor Guidance
Week Overview:
The video series Against All Odds: Inside Statistics is a helpful optional resource if you would like to watch it.
http://www.learner.org/resources/series65.html?pop=yes&pid=3138
For this week, we’ll learn that statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data to assist in making more effective decisions.
In today’s world, numerical information is everywhere. Statistical techniques are used to make decisions that affect our daily lives. The knowledge of statistical methods will help you understand how decisions are made and give you a better understanding of how they affect you. No matter what line of work you select, you will find yourself faced with decisions where an understanding of data analysis is helpful.
The concepts introduced this week include levels of measurement, measures of center, and measures of variation. The normal distribution and related probability calculations are also introduced this week.
Measurements
You should be able to distinguish among the nominal, ordinal, interval, and ratio levels of measurement.
Nominal level - data that is classified into categories and cannot be arranged in any particular order.
EXAMPLES: eye color, gender, religious affiliation.
Ordinal level – data arranged in some order, but the differences between data values cannot be determined or are meaningless.
EXAMPLE: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1, Sprite number 2, Seven-up number 3, and Orange Crush number 4.
Interval level - similar to the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.
EXAMPLE: Temperature on the Fahrenheit scale.
Ratio level - the interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturer’s representatives per month.
Why do you need to know the level of measurement of your data? Because the level of measurement dictates which calculations can be used to summarize and present the data. It also determines which statistical tests should be performed on the data.
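As a rough illustration of that point, the sketch below maps each level of measurement to summary statistics that are commonly treated as meaningful at that level. The names `LEVEL_ORDER`, `NEW_AT_LEVEL`, and `allowed_summaries`, and the exact lists, are illustrative assumptions and are not taken from the course materials.

```python
# Hypothetical lookup table (an assumption for illustration, not from the textbook):
# each level adds summaries on top of those allowed at the levels below it.
LEVEL_ORDER = ["nominal", "ordinal", "interval", "ratio"]

NEW_AT_LEVEL = {
    "nominal": ["counts", "mode"],
    "ordinal": ["median", "percentiles"],
    "interval": ["mean", "standard deviation"],
    "ratio": ["ratios between values"],
}


def allowed_summaries(level):
    """Return the summaries usable at `level`, including those inherited from lower levels."""
    idx = LEVEL_ORDER.index(level.lower())
    result = []
    for name in LEVEL_ORDER[: idx + 1]:
        result.extend(NEW_AT_LEVEL[name])
    return result


print(allowed_summaries("ordinal"))  # ['counts', 'mode', 'median', 'percentiles']
print("mean" in allowed_summaries("nominal"))  # False
```

The lookup is cumulative because each level keeps the properties of the levels below it, which is why a mean appears only from the interval level upward.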
Probability
PROBABILITY is a value between zero and one, inclusive, describing the relative possibility (chance or likelihood) that an event will occur.
There are three ways of assigning probability:
1. Classical Probability
This is based on the assumption that the outcomes of an experiment are equally likely.
2. Empirical Probability
The probability of an event happening is the fraction of the time similar events happened in the past.
Example: On February 1, 2003, the Space Shuttle Columbia broke apart during reentry. This was the second disaster in 113 space shuttle missions for NASA. On the basis of this information, what is the probability that a future mission is successfully completed?
Probability of successful flight = 111/113 = 0.98
3. Subjective Concept Of Probability
The likelihood (probability) of a particular event happening that
is assigned by an individual based on whatever information is
available.
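The first two approaches above reduce to simple fractions, which can be checked with a short sketch. The function names and the die example are assumptions added for illustration; the shuttle figures (113 missions, 2 disasters) come from the example above.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """Equally likely outcomes: favorable outcomes divided by total outcomes."""
    return Fraction(favorable, total)


def empirical_probability(times_occurred, total_observations):
    """Relative frequency of the event among past observations."""
    return Fraction(times_occurred, total_observations)


# Classical (illustrative assumption): an even number on a fair six-sided die.
p_even = classical_probability(3, 6)

# Empirical: 113 shuttle missions with 2 disasters leaves 111 successful flights.
p_success = empirical_probability(113 - 2, 113)

print(p_even)                      # 1/2
print(round(float(p_success), 2))  # 0.98
```

A subjective probability has no formula of this kind; it is simply a number between 0 and 1 that an individual assigns.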
Discussion
To prepare for this week’s discussion, you need to be familiar
with statistics such as the mean, median, mode, variance,
standard deviation, and range. The first three measure the
center of the data. The rest measure variation in the data. You
also need to understand the concepts of probability.
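For readers who want to check these statistics outside Excel, here is a minimal sketch using Python's standard `statistics` module. The sample values are hypothetical and are not taken from the course data set.

```python
import statistics

# Hypothetical sample (an assumption for illustration only).
values = [22, 23, 23, 24, 24, 24, 35]

# Measures of center
mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

# Measures of variation (sample versions, dividing by n - 1)
variance = statistics.variance(values)
std_dev = statistics.stdev(values)
value_range = max(values) - min(values)

print(mean, median, mode)  # 25 24 24
print(variance, value_range)  # 20 13
print(round(std_dev, 3))  # 4.472
```

Note how the single large value (35) pulls the mean above the median and mode, which is one reason to report more than one measure of center.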
Assignment Expectation:
This assignment is to be done by using Excel.
1. You need to become familiar with the different levels of
measurement: nominal, ordinal, interval, and ratio. For instance,
salary is ratio, etc.
2. You can use individual functions such as “average” for the
mean and “stdev.s” for the standard deviation. (Choose Formulas,
then Insert Function, and scroll down the list to find “average”.)
3. This item asks you to calculate probabilities (see the formula
and examples on p. 42)
a. P(a male in grade E) = (# of males in grade E)/(total # of
employees)
4. a. You need to rearrange the data from largest to smallest
before you can find the cutoff.
b. A z-score is the signed distance between a selected value, X,
and the mean, μ, divided by the population standard deviation, σ.
The formula is: z = (X - μ) / σ
You can also review the example on pages 58 and 59.
c. through g. You need to use the distribution table on page 56
to find the probabilities. Also review the example on pp. 58-60.
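The z-score and table lookup in item 4 can be sketched in a few lines. The values of `x`, `mu`, and `sigma` are hypothetical, and `NormalDist` from Python's standard library stands in for the printed distribution table. Treat this as a way to check hand calculations rather than a substitute for the Excel work.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Signed distance of x from the mean, in standard deviation units."""
    return (x - mu) / sigma


# Hypothetical values (assumptions for illustration only).
x, mu, sigma = 60.0, 45.0, 10.0

z = z_score(x, mu, sigma)

# Probability that a standard normal value exceeds z (upper-tail area).
p_above = 1 - NormalDist().cdf(z)

print(z)                  # 1.5
print(round(p_above, 4))  # 0.0668
```

In Excel, the same upper-tail area corresponds to 1 minus the standard normal cumulative function evaluated at z.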
Reference
Lind, D., Marchal, W., & Wathen, S. (2010). Statistical
Techniques in Business and Economics (14th ed.). McGraw-Hill.
Ashford 2: - Week 1 - Discussion 1
Your initial discussion thread is due on Day 3 (Thursday) and
you have until Day 7 (Monday) to respond to your classmates.
Your grade will reflect both the quality of your initial post and
the depth of your responses. Reference the Discussion Forum
Grading Rubric for guidance on how your discussion will be
evaluated.
Language
Numbers and measurements are the language of business.
Organizations look at results in many ways: expenses, quality
levels, efficiencies, time, costs, etc. What measures does your
department keep track of? Are they descriptive or inferential
data, and what is the difference between these? (Note: If you
do not have a job where measures are available to you, ask
someone you know for some examples, or conduct outside
research on an interest of yours, or use personal measures.)
Guided Response: Review several of your classmates’ posts.
Respond to at least two of your classmates by providing
recommendations for the measures being discussed.
Ashford 2: - Week 1 - Discussion 2
Your initial discussion thread is due on Day 3 (Thursday) and
you have until Day 7 (Monday) to respond to your classmates.
Your grade will reflect both the quality of your initial post and
the depth of your responses. Reference the Discussion Forum
Grading Rubric for guidance on how your discussion will be
evaluated.
Probability
Read the article, "Better Living Through...Statistics?!" and give
an example of how you might use increasing information to
make actual business decisions. Respond to at least two of your
classmates’ posts.
Guided Response: Review several of your classmates’ posts.
Respond to at least two classmates by commenting on the
situations that are being illustrated.
Better Living Through...Statistics?!
You’ve probably heard of Nate Silver. He’s the “King of
Quants,” and his book The Signal and the Noise is an excellent
discussion of some of the problems we have with prediction.
You’ve probably never heard of the Reverend Thomas Bayes,
who is responsible for a theorem (called “Bayes’ Theorem”)
that helps us understand how we can update our estimates of the
probabilities of different events given new pieces of
information.
It’s still pretty counter-intuitive. Fortunately, the people at
Nowsourcing, Inc, who have provided content for this space
before, were kind enough to produce the infographic below that
introduces Bayes’ Theorem with a contrived example involving
baseball: what’s a good estimate of the probability that the
Yankees will win game #101 if they have won 72 of their first
100 games and Sportscaster Bob–who is correct 55% of the time
when he predicts a Yankees victory–has predicted that they will
win?
Since the Yankees have won 72 of 100 games, a good estimate
of the probability that they will win their 101st game would be
72%. Now, we introduce some information: since Bob is right
just over half the time when he predicts a Yankees victory, it
will nudge our estimate of the probability of a Yankees victory
up just a little bit (if Sportscaster Bob were right less than half
the time, it would nudge our estimate of the probability
downward).
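The size of that nudge can be worked out with Bayes' Theorem. The sketch below rests on a loud assumption that the article does not state: the 55% figure is read as the chance Bob predicts a win when the Yankees do win, and Bob is assumed to predict a win 45% of the time when they lose. The function name `bayes_update` is also illustrative.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    evidence = numerator + (1 - prior) * p_evidence_if_false
    return numerator / evidence


prior_win = 0.72  # Yankees have won 72 of their first 100 games

# ASSUMPTION: P(Bob predicts win | win) = 0.55, P(Bob predicts win | loss) = 0.45.
posterior_win = bayes_update(prior_win, 0.55, 0.45)

print(round(posterior_win, 3))  # 0.759
```

Under these assumptions the estimate moves from 72% to roughly 76%. Swapping the two likelihoods (a forecaster who is right less than half the time) pushes it below 72% instead.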
Our estimate of the probability changes as we add more
information. Is it a night game? Who are the Yankees playing?
Who is pitching? Did it rain last night? Is a key player injured?
And so on: the more accurate information we add, the better our
estimates will be. The applications are numerous and important:
while Bayesian reasoning can help us understand baseball
(except for the Yankees’ hypothetical 72-28 record in this
example), it also helps us understand far more important things
like medical diagnostics. And elections. And all sorts of other
interesting things.
I am grateful to my Samford colleague Tom Woolley for
comments and suggestions. The original version of the
infographic appears here courtesy of sports-management-
degrees.com.
Ashford 2: - Week 1 - Assignment
Problem Set Week One
Week One Assignment will require students to utilize the
following resources to complete the assignment. Assignment
instructions are contained within the following resources.
All statistical calculations will use the Employee Salary Data
Set and Week 1 assignment sheet.
Carefully review the Grading Rubric for the criteria that will be
used to evaluate your assignment.
1 <1 point>
Measurement issues. Data, even numerically coded variables,
can be one of 4 levels: nominal, ordinal, interval, or ratio.
It is important to identify which level a variable is, as this
impacts the kind of analysis we can do with the data. For
example, descriptive statistics such as means can only be done
on interval or ratio level data.
a.
Please list under each label the variables in our data set that
belong in each group.
Nominal    Ordinal    Interval    Ratio
b.
For each variable that you did not call ratio, why did you make
that decision?
2 <1 point>
The first step in analyzing data sets is to find some summary
descriptive statistics for key variables.
For salary, compa, age, performance rating, and service, find
the mean, standard deviation, and range for 3 groups: overall
sample, Females, and Males.
You can use either the Data Analysis Descriptive Statistics tool
or the Fx =average and =stdev functions.
(The range must be found using the difference between the
=max and =min functions with Fx.)
Note: Place data to the right. If you use Descriptive Statistics,
place that to the right as well.
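The same three statistics can be cross-checked outside Excel. The sketch below uses only a handful of (Gender1, Salary) pairs read from the data set rows shown in this document (IDs 8, 10, 11, 14, 19, 25, and 40), so its output is illustrative and will not match results for the full data set. The helper name `summarize` is an assumption.

```python
import statistics

# A few (Gender1, Salary) pairs from the data set rows in this document.
# ASSUMPTION: this is a small subset used for illustration, not the full sample.
rows = [("F", 23), ("F", 22), ("F", 23), ("F", 24), ("M", 24), ("M", 24), ("M", 25)]


def summarize(values):
    """Mean, sample standard deviation, and range (max minus min) of a list."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


groups = {
    "Overall": [salary for _, salary in rows],
    "Female": [salary for gender, salary in rows if gender == "F"],
    "Male": [salary for gender, salary in rows if gender == "M"],
}

summary = {name: summarize(values) for name, values in groups.items()}

for name, stats in summary.items():
    print(name, stats)
```

Repeating the call for compa, age, performance rating, and service only requires swapping the column that is collected into `rows`.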
4
For each group (overall, females, and males) find:
Overall Female Male
a.
The value that cuts off the top 1/3 salary in each group.
Hint: can use these Fx functions
b.
The z score for each value:
Excel's standardize function
c.
The normal curve probability of exceeding this score:
1-normsdist function
d.
What is the empirical probability of being at or exceeding this
salary value?
e.
The value that cuts off the top 1/3 compa in each group.
f.
The z score for each value:
g.
The normal curve probability of exceeding this score:
h.
What is the empirical probability of being at or exceeding this
compa value?
i.
How do you interpret the relationship between the data sets?
What do they mean about our?
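The steps in problem 4 can be sketched as follows for one group. The salary list is hypothetical, and the cutoff rule used here (sort from largest to smallest and take the last value inside the top third) is one possible convention. The assignment's Excel functions may define the cutoff slightly differently.

```python
import math
from statistics import NormalDist, mean, stdev


def top_third_cutoff(values):
    """Smallest value that still falls in the top third when sorted descending."""
    ordered = sorted(values, reverse=True)
    k = math.ceil(len(ordered) / 3)  # number of values in the top third
    return ordered[k - 1]


# Hypothetical salaries (assumption for illustration, not the course data).
salaries = [22, 23, 23, 24, 24, 24, 34, 35, 36]

cutoff = top_third_cutoff(salaries)

# z score of the cutoff, using the sample mean and sample standard deviation.
z = (cutoff - mean(salaries)) / stdev(salaries)

# Normal curve probability of exceeding the cutoff.
normal_p = 1 - NormalDist().cdf(z)

# Empirical probability: share of observed values at or above the cutoff.
empirical_p = sum(s >= cutoff for s in salaries) / len(salaries)

print(cutoff)                 # 34
print(round(empirical_p, 3))  # 0.333
```

Comparing `normal_p` with `empirical_p` shows how well the normal curve describes the group. When several values tie at the cutoff, the empirical probability can exceed one third.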
Can we make any conclusions about equal pay for equal work
yet?
Description:
Total Possible Score: 8.00
1. Identifies Data Variable level and Reasons
Total: 1.60
Distinguished - Performs the following: 1.Correctly identifies
all variable data levels and 2. Provides correct reasoning for
placing variables in the nominal, ordinal, or interval, categories.
Proficient - Performs the following but misidentifies no more
than three data types and/or the reasons: 1. Identifies all
variable data levels and 2. Provides reasoning for placing
variables in the nominal, ordinal, or interval categories.
Basic - Performs the following but misidentifies no more than
eight of the data types and/or the reasons: 1. Identifies variable
data levels and 2. Provides reasoning for placing variables in
the nominal, ordinal, or interval, categories.
Below Expectations - Performs the following but misidentifies
nine or more of the data types and/or the reasons: 1. Identifies
variable data levels and 2. Provides reasoning for placing
variables in the nominal, ordinal, or interval, categories.
Non-Performance - There is either no response to problem one,
or it fails to provide any correct identification and/or reasoning.
2. Generates Mean, Standard Deviation, and Range for Salary,
Compa, Age, Performance Rating, and Service For the Overall
Group as well as the Males and Females Separately
Total: 1.60
Distinguished - Performs all of the following correctly: 1.The
data necessary for computations was selected accurately. 2.
Accurate results produced. 3. The results are presented in a
clear format. 4. Identified which variables this function does not
work properly for. Correctly calculated and displayed asked for
values for all three groups.
Proficient - One of the following was not done correctly: 1.The
data necessary for computations was selected accurately. 2.
Accurate results produced. 3. The results are presented in a
clear format. 4. Identified which variables this function does not
work properly for. Incorrectly calculated no more than three
values.
Basic - Two of the following were not done correctly: 1.The
data necessary for computations was selected accurately. 2.
Accurate results produced. 3. The results are presented in a
clear format. 4. Identified which variables this function does not
work properly for. Incorrectly calculated no more than 14 total
values.
Below Expectations - Three of the following were not done
correctly: 1.The data necessary for computations was selected
accurately. 2. Accurate results produced. 3. The results are
presented in a clear format. 4. Identified which variables this
function does not work properly for. Incorrectly calculated more
than 15 values.
Non-Performance - There is either no response to problem two,
or it does not provide correct statistical outcomes as asked for.
3. Determines Probability
Total: 1.60
Distinguished - Performed all the following correctly: 1.The
data necessary for computations was selected accurately. 2.
Data counts were correct. 3. Produced accurate results. 4.
Difference in values explained clearly.
Proficient - One of the following was not done correctly: 1.The
data necessary for computations was selected accurately. 2.
Data counts were correct. 3. Produced accurate results. 4.
Difference in values explained clearly.
Basic - Two of the following were not done correctly: 1.The
data necessary for computations was selected accurately. 2.
Data counts were correct. 3. Produced accurate results. 4.
Difference in values explained clearly.
Below Expectations - Three of the following were not done
correctly: 1.The data necessary for computations was selected
accurately. 2. Data counts were correct. 3. Produced accurate
results. 4. Difference in values explained clearly.
Non-Performance - There is either no response to problem three,
or it does not provide probability values as asked for.
4. Finds Selected Values For Raw Scores That Cut Off the Top
1/3 of the Values Within the Selected Groups
Total: 1.60
Distinguished - Performed all of the following correctly:
1.Correct raw score identified for each group. 2. Z score
correctly calculated. 3. Related probability determined. 4.
Interpretation presented.
Proficient - No more than four errors were noted in the
following: 1. Raw score identified for each group. 2. Z score
calculated. 3. Related probability determined. 4. Interpretation
presented.
Basic - No more than eight errors were noted in the following:
1. Raw score identified for each group. 2. Z score calculated. 3.
Related probability determined. 4. Interpretation presented.
Below Expectations - No more than 15 errors were noted in the
following: 1. Raw score identified for each group. 2. Z score
calculated. 3. Related probability determined. 4. Interpretation
presented.
Non-Performance - There is either no response to problem four,
or it fails to provide any information on z-scores, distributions
and relative value of different measures asked for in the
question.
5. Conclusions About the Male and Female Pay Equality
Total: 1.60
Distinguished - Provides thorough and accurate conclusions
about the following issues: 1. Male and female pay inequality.
2. Consistency between and among different statistical measures
of equality.
Proficient - Provides complete and mostly accurate conclusions
about the following issues: 1. Male and female pay inequality.
2. Consistency between and among different statistical measures
of equality.
Basic - Provides incomplete and/or inaccurate conclusions about
the following issues: 1. Male and female pay inequality. 2.
Consistency between and among different statistical measures
of equality.
Below Expectations - Provides incomplete and inaccurate
conclusions about the following issues: 1. Male and female pay
inequality. 2. Consistency between and among different
statistical measures of equality.
Non-Performance - There is either no response to problem five,
or it fails to provide any correct response to the results about
males and females.
See the comments accompanying the data set.
The ongoing question that the weekly assignments will focus on
is: Are males and females paid the same for equal work (under
the Equal Pay Act)?
Note: to simplify the analysis, we will assume that jobs within
each grade comprise equal work.
The column labels in the table mean:
ID - Employee sample number
Salary - Salary in thousands
Compa - salary divided by midpoint
Midpoint - salary grade midpoint
Age - Age in years
Performance Rating - Appraisal rating (Employee evaluation score)
Service - Years of service (rounded)
Gender: 0 = male, 1 = female
Raise - percent of last raise
Degree (0 = BSBA, 1 = MS)
Gender1 (Male or Female)
Grade - job/pay grade
ID  Salary  Compa  Midpoint  Age  Performance Rating  Service  Gender  Raise  Degree  Gender1  Grade
8   23      1.000  23        32   90                  9        1       5.8    0       F        A
10  22      0.956  23        30   80                  7        1       4.7    0       F        A
11  23      1.000  23        41   100                 19       1       4.8    0       F        A
14  24      1.043  23        32   90                  12       1       6      0       F        A
15  24      1.043  23        32   80                  8        1       4.9    0       F        A
23  23      1.000  23        36   65                  6        1       3.3    1       F        A
26  24      1.043  23        22   95                  2        1       6.2    1       F        A
31  24      1.043  23        29   60                  4        1       3.9    0       F        A
35  24      1.043  23        23   90                  4        1       5.3    1       F        A
36  23      1.000  23        27   75                  3        1       4.3    1       F        A
37  22      0.956  23        22   95                  2        1       6.2    1       F        A
42  24      1.043  23        32   100                 8        1       5.7    0       F        A
3   34      1.096  31        30   75                  5        1       3.6    0       F        B
18  36      1.161  31        31   80                  11       1       5.6    1       F        B
20  34      1.096  31        44   70                  16       1       4.8    1       F        B
39  35      1.129  31        27   90                  6        1       5.5    1       F        B
7   41      1.025  40        32   100                 8        1       5.7    0       F        C
13  42      1.050  40        30   100                 2        1       4.7    1       F        C
22  57      1.187  48        48   65                  6        1       3.8    0       F        D
24  50      1.041  48        30   75                  9        1       3.8    1       F        D
45  55      1.145  48        36   95                  8        1       5.2    0       F        D
17  69      1.210  57        27   55                  3        1       3      0       F        E
48  65      1.140  57        34   90                  11       1       5.3    1       F        E
28  75      1.119  67        44   95                  9        1       4.4    1       F        F
43  77      1.149  67        42   95                  20       1       5.5    1       F        F
19  24      1.043  23        32   85                  1        0       4.6    1       M        A
25  24      1.043  23        41   70                  4        0       4      0       M        A
40  25      1.086  23        24   90                  2        0       6.3    0       M        A