2. Statistics – The science of collecting, organizing,
analyzing and interpreting data in order to make
decisions
Data – information coming from observations, counts,
measurements, or responses
Where have you seen statistics being used before?
3. DATA SETS: POPULATIONS
VS. SAMPLES
A population is the
collection of all
outcomes, responses,
measurements, or counts
that are of interest
A sample is a subset of a
population
4. In a recent survey, 1708 adults in the US were asked
if they think global warming is a problem that
requires immediate government action. 939 of the
adults said yes.
Identify the population and the sample.
5. The US Department of Energy conducts weekly surveys
of approximately 800 gasoline stations to determine
the average price per gallon of regular gasoline. On
Feb. 12, 2007, the average price was $2.24 per gallon.
Identify the population and the sample.
6. Parameter – a numerical description of a
POPULATION characteristic
Statistic – a numerical description of a SAMPLE
characteristic
**P’s stay together, and S’s stay together
**Population = parameter
**Sample = statistic
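As a minimal sketch of the distinction, using the global warming survey from slide 4: the proportion computed from the 1708 surveyed adults is a statistic. The matching proportion for all US adults would be a parameter, which the survey never observes directly. The variable names are illustrative only.

```python
# Survey from slide 4: 1708 US adults were sampled and 939 answered yes.
sample_size = 1708
yes_count = 939

# A statistic describes the SAMPLE: the share of surveyed adults who said yes.
sample_proportion = yes_count / sample_size

# The parameter (the share of ALL US adults who would say yes) is unknown;
# the statistic is only an estimate of it.
print(f"Sample proportion (statistic): {sample_proportion:.3f}")
```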
7. DISTINGUISH BETWEEN
A PARAMETER AND
STATISTIC
1. A recent survey of a sample of MBAs
reported that the average salary for an MBA is
more than $82,000.
2. Starting salaries for the 667 MBA graduates
from the University of Chicago Graduate
School of Business increased 8.5% from the
previous year.
3. In a random check of a sample of retail
stores, the Food and Drug Administration
found that 34% of the stores were not storing
fish at the proper temperature.
4. In 2006, major league baseball teams spent
a total of $2,326,706,685 on players’ salaries.
8. BRANCHES OF STATISTICS
Descriptive Statistics – the branch
of statistics that involves the
organization, summarization, and display
of data.
Inferential Statistics – the branch of
statistics that involves using a sample to
draw conclusions about a population.
11. DATA CLASSIFICATION
Data can be just about ANYTHING pertinent to the question at hand:
Data about Students at BJSHS:
13. TYPES OF DATA
Qualitative Data – consists of attributes, labels, or
nonnumerical entries (movie ratings, favorite color,
teams, etc…)
Quantitative Data – consists of numerical
measurements or counts (amounts, times, etc…)
NOTE: NUMBERS DO NOT ALWAYS MEAN QUANTITATIVE (e.g., jersey numbers are labels)
14. LEVELS OF MEASUREMENT
1.Nominal – qualitative only
2.Ordinal – qualitative or quantitative
3.Interval – quantitative only
4.Ratio – quantitative only
16. LEVELS OF MEASUREMENT
1. Nominal – categorized by names, labels or
qualities
Yes/No Questions
Jersey Numbers
Names
Hair Color
17. 2. Ordinal – able to be ranked or ordered, but differences
between entries are not meaningful
S/M/L/XL shirts
1st, 2nd, 3rd,…
Movie Ratings
18. 3. Interval – when 0 does NOT mean “nothing”; can’t
find ratios
Temperature
Years (NOT TIME BETWEEN THINGS)
4. Ratio – when 0 means “none” or “nothing”; true
count, ratio between two data points can be formed
Population
# of pages in a book
Length
Price/Money
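A short sketch of why ratios only make sense at the ratio level. The values (10 and 20 degrees, 10 and 20 meters) are made up for illustration. The temperature "ratio" changes when the unit changes, because 0 degrees is an arbitrary point on the scale. The length ratio does not change, because 0 length really means "none".

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

METERS_TO_FEET = 3.28084  # conversion factor, meters to feet

# Interval level: the "ratio" of two temperatures depends on the unit chosen.
temp_ratio_c = 20 / 10                                                 # 2.0
temp_ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)   # 68 / 50

# Ratio level: the ratio of two lengths is the same in any unit.
length_ratio_m = 20 / 10
length_ratio_ft = (20 * METERS_TO_FEET) / (10 * METERS_TO_FEET)

print("Temperature ratios:", temp_ratio_c, temp_ratio_f)
print("Length ratios:", length_ratio_m, length_ratio_ft)
```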
19. SECTION 1.2 ASSIGNMENT
Case Study on Page 17 (SUBMIT) [Groups of 3 or fewer]
INDIVIDUAL:
Pg. 15 – 16 #1 - #24 ALL
(Level of Measurement means: nominal, ordinal, interval
or ratio)
21. DESIGNING A STATISTICAL
STUDY
1.Identify the variables
2.Develop a plan for collecting data
3.Collect the data
4.Describe the data (using DESCRIPTIVE
statistics)
5.Interpret the data (using INFERENTIAL
statistics)
6.Identify any possible errors.
22. DATA COLLECTION
1.Do an Observational Study
2.Perform an Experiment
3.Use a Simulation
4.Use a Survey
24. DATA COLLECTION
2. Perform an Experiment
- a TREATMENT is applied to part of a population
and responses are observed
- Control Group – part of population where NO
treatment is applied
- Subjects are given a PLACEBO – harmless,
unmedicated treatment that is made to look like
the real treatment
- Effects of treatment can be compared to control
group
- Subjects of a study are also known as
EXPERIMENTAL UNITS
25. DATA COLLECTION
INSIGHT
IN AN OBSERVATIONAL STUDY, A
RESEARCHER DOES NOT INFLUENCE THE
RESPONSES. IN AN EXPERIMENT, A
RESEARCHER DELIBERATELY APPLIES A
TREATMENT BEFORE OBSERVING THE
RESPONSES.
26. DATA COLLECTION
3. Use a Simulation
-Use of a mathematical or
physical model to reproduce
the conditions of a situation or
process
-Allows you to study situations
that are impractical, or
dangerous
-Saves companies time and money
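A minimal sketch of a simulation, using a made-up situation that is not from the slides: instead of rolling two physical dice thousands of times, a mathematical model (a random number generator) reproduces the process and estimates how often the sum is 7. The function name and the fixed seed are assumptions for illustration.

```python
import random

def estimate_sum_seven(trials, seed=0):
    """Simulate rolling two dice `trials` times; return the share of sums equal to 7."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == 7:
            hits += 1
    return hits / trials

estimate = estimate_sum_seven(100_000)
print(f"Simulated share of sevens: {estimate:.3f} (exact value is 1/6)")
```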
27. DATA COLLECTION
4. Use a Survey
-An investigation of one or more
characteristics of a population
-Customer Service Surveys
- QUESTIONS MUST BE WORDED SO THEY
DO NOT LEAD TO BIASED RESULTS
28. Which method of data collection would
you use to collect data for each study?
1.A study of the effect of exercise on
relieving depression.
2.A study of the success of graduates of a
large university in finding a job within one
year of graduation.
29. EXPERIMENTAL DESIGN
3 KEY ELEMENTS OF A WELL-
DESIGNED EXPERIMENT
1.CONTROL
2.RANDOMIZATION
3.REPLICATION
30. EXPERIMENTAL DESIGN:
CONTROL
Confounding variable – occurs when an
experimenter cannot tell the difference
between the effects of different factors on a
variable
Example:
-Coffee Shop owner redecorates to attract more
customers
-At the same time, a shopping mall nearby has
a grand opening
-VARIABLES ARE CONFOUNDED
31. EXPERIMENTAL DESIGN:
CONTROL
PLACEBO EFFECT – when a subject reacts
favorably to a placebo when in fact, he or she
has been given no medicated treatment at all
To avoid this, we use BLINDING
32. EXPERIMENTAL DESIGN:
CONTROL
BLINDING – WHEN THE SUBJECT DOES
NOT KNOW WHETHER HE OR SHE IS
RECEIVING A TREATMENT OR A PLACEBO
DOUBLE BLINDING – NEITHER THE
SUBJECT NOR THE EXPERIMENTER
KNOWS IF THE SUBJECT IS RECEIVING A
TREATMENT OR PLACEBO (PREFERRED)
34. EXPERIMENTAL DESIGN:
RANDOMIZATION
2. Randomized Block Design
- Divide subjects with similar
characteristics into blocks,
and randomly assign
subjects to treatments within
each block
All Subjects
- 30 – 39 year olds: Control, Treatment
- 40 – 49 year olds: Control, Treatment
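The randomized block design above can be sketched as follows, assuming hypothetical subject IDs and ages. Subjects are first grouped into age blocks, then shuffled within each block and split between treatment and control. The function name, the even split, and the fixed seed are illustrative choices, not part of the slides.

```python
import random

def randomized_block_assignment(subjects, block_of, seed=0):
    """Group subjects into blocks, then randomly assign within each block."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)

    assignment = {}
    for label, members in blocks.items():
        shuffled = members[:]          # copy so the input list is untouched
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = (label, "treatment")
        for subject in shuffled[half:]:
            assignment[subject] = (label, "control")
    return assignment

# Hypothetical subjects as (id, age) pairs.
ages = [31, 35, 38, 33, 42, 47, 44, 49]
subjects = [(f"S{i:02d}", age) for i, age in enumerate(ages, start=1)]

def age_block(subject):
    return "30-39" if subject[1] < 40 else "40-49"

assignment = randomized_block_assignment(subjects, age_block)
for subject, (block, group) in sorted(assignment.items()):
    print(subject, block, group)
```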
35. EXPERIMENTAL DESIGN:
RANDOMIZATION
3. Matched-Pairs Design
-Subjects are paired according to a
similarity
-Subjects may be paired based on age,
residency, etc.
-One receives one treatment, and the
other receives another treatment
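A minimal sketch of a matched-pairs assignment, assuming hypothetical subjects paired by age. Subjects are sorted by age so that neighbors are similar, paired off, and then one member of each pair is randomly given each treatment. The names `matched_pairs`, `treatment_a`, and `treatment_b` are made up for illustration.

```python
import random

def matched_pairs(subjects, key, seed=0):
    """Pair similar subjects, then randomly split each pair between two treatments."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)   # neighbors in this list are most similar
    pairs = []
    # Step through the sorted list two at a time; an odd leftover subject is dropped.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)                 # random choice of who gets which treatment
        pairs.append({"treatment_a": pair[0], "treatment_b": pair[1]})
    return pairs

# Hypothetical subjects as (id, age) pairs.
people = [("A", 23), ("B", 45), ("C", 24), ("D", 44),
          ("E", 31), ("F", 60), ("G", 33), ("H", 58)]

pairs = matched_pairs(people, key=lambda person: person[1])
for pair in pairs:
    print(pair)
```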
36. EXPERIMENTAL DESIGN:
REPLICATION
Replication – repetition of an experiment
using a large group of subjects
-More subjects, more value added to the
result of your experiment
-We’re always looking for a large sample
size
37. SAMPLING TECHNIQUES
1.Census – count or measure of ENTIRE population
2.Sampling – count or measure of PART of a population
- Random Sample
- Simple Random Sample
- Stratified Sample
- Cluster Sample
- Systematic Sample
- Sampling Error – difference between the results of a
sample and those of the population
38. SAMPLING TECHNIQUES
Sampling Error – difference between the results
of a sample and those of the population
Biased Sample – one that is NOT representative
of the population from which it is drawn.
Example: A sample of 18 – 22 year old college
students would NOT be representative of the
entire 18 – 22 year old population in the
country.
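The sampling error defined above can be illustrated with a small sketch. The population here is made-up data (1000 randomly generated heights in centimeters), so both the population mean and the sample mean can be computed and compared. In a real study the population mean is usually unknown.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 1000 heights in cm (invented for illustration).
population = [rng.randint(150, 200) for _ in range(1000)]

# A sample of 30 members drawn at random from that population.
sample = rng.sample(population, 30)

population_mean = mean(population)   # a parameter
sample_mean = mean(sample)           # a statistic

# Sampling error: the difference between the sample result and the population result.
sampling_error = sample_mean - population_mean
print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"Sampling error:  {sampling_error:.2f}")
```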
39. SAMPLING TECHNIQUES
Random Sample – every member of the
population has an equal chance of being
selected
Simple Random Sample – every possible sample
of the same size has the same chance of being
selected
USE OF RANDOM NUMBER GENERATORS!
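A minimal sketch of drawing a simple random sample with a random number generator. The population size (500 numbered households) and the sample size (8) are assumptions for illustration. Python's `random.sample` draws without replacement, so every group of 8 households is equally likely to be chosen.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: households numbered 1 through 500.
households = list(range(1, 501))

# Simple random sample of 8 households, drawn without replacement.
selected = rng.sample(households, 8)
print("Selected household numbers:", sorted(selected))
```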
40. SAMPLING TECHNIQUES
WHEN IT IS IMPORTANT FOR THE SAMPLE TO HAVE
MEMBERS FROM EACH SEGMENT OF THE POPULATION
Stratified Sample – members of the population are divided
into two or more subsets (strata), then a sample is
randomly selected from each stratum
**Ensures that each segment of the population is
represented
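Stratified sampling can be sketched as follows, assuming a hypothetical school population split into strata by grade level. A random sample of the same size is taken from every stratum, so each segment is represented. The stratum names and sizes are invented for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=0):
    """Randomly select `per_stratum` members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical strata: students grouped by grade level.
strata = {
    "freshman":  [f"F{i}" for i in range(1, 41)],
    "sophomore": [f"So{i}" for i in range(1, 36)],
    "junior":    [f"J{i}" for i in range(1, 31)],
    "senior":    [f"Sr{i}" for i in range(1, 26)],
}

stratified = stratified_sample(strata, per_stratum=5)
for name, chosen in stratified.items():
    print(name, chosen)
```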
41. SAMPLING TECHNIQUES
WHEN THE POPULATION FALLS INTO NATURALLY
OCCURRING SUBGROUPS
CLUSTER SAMPLE – Divide the population into groups
(clusters), and select ALL of the members in one or more
(but NOT ALL) of the clusters.
**It is important that all clusters have similar
characteristics
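Cluster sampling can be sketched as follows, assuming a hypothetical town divided into four neighborhoods. Two neighborhoods are chosen at random, and then every member of the chosen neighborhoods is included. The cluster names and members are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly choose whole clusters and return ALL members of the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: residents grouped by neighborhood.
clusters = {
    "north": ["N1", "N2", "N3"],
    "south": ["S1", "S2", "S3", "S4"],
    "east":  ["E1", "E2"],
    "west":  ["W1", "W2", "W3"],
}

chosen_clusters, cluster_members = cluster_sample(clusters, n_clusters=2)
print("Chosen clusters:", chosen_clusters)
print("Sample members:", cluster_members)
```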
43. SAMPLING TECHNIQUES:
INSIGHT
For STRATIFIED SAMPLING, each of the strata
contains members with a certain characteristic.
For CLUSTER SAMPLING, each cluster is typically a geographic
grouping and should contain members with
ALL characteristics.
-Stratified – Some of the members of each group are
used
-Cluster – All of the members of one or more groups
are used
44. SAMPLING TECHNIQUES
-Systematic Sample – a sample in which each
member of the population is assigned a
number, the members are ordered, and
sample members are then selected at
regular intervals, beginning from a randomly
chosen starting number.
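A minimal sketch of systematic sampling, assuming a hypothetical population of 100 numbered members and an interval of 10. A starting position is picked at random within the first interval, then every 10th member after it is selected. The function name and fixed seed are illustrative.

```python
import random

def systematic_sample(population, step, seed=0):
    """Select every `step`-th member, beginning at a random position in the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(step)      # random index from 0 to step - 1
    return population[start::step]   # regular intervals from the starting point

# Hypothetical ordered population: members numbered 1 through 100.
members = list(range(1, 101))

systematic = systematic_sample(members, step=10)
print("Systematic sample:", systematic)
```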
47. HOMEWORK SELECTED ANSWERS
Section 1.1
5. False
6. True
7. True
8. False
9. False
10. True
11. Pop
12. Sam
13. Sam
14. Pop
15. Sam
16. Pop
21. Pop: all adults in US
Sam: 1000 surveyed
22. Pop: all infants in Italy
Sam: 33043 infants in
study
23. Pop: all households in
US
Sam: 1906 households
surveyed
24. Pop: all computer users
Sam: 496 students
surveyed
29. Statistic
30. Statistic
31. Parameter
32. Parameter
33. Statistic
34. Parameter
35. Statistic
36. Parameter
Section 1.2
1. N and O
2. O, I and R
3. False
4. False
5. False
6. False
7. Qualitative
8. Quantitative
9. Quantitative
10. Qualitative
11. Qualitative
12. Quantitative
13. Qualitative, O
14. Qualitative, N
15. Qualitative
16. Quantitative, R
17. Qualitative, O
18. Quantitative, R
19. O
20. R
21. N
22. R
23. I, N, R, O
24. I, N, I, R
Section 1.3
5. True
6. False
7. False
8. False
9. False
10. True
11. Perform an Experiment
12. Survey
13. Simulation
14. Census
17. SRS
18. Stratified
19. Convenience
20. Cluster
21. SRS
22. Systematic
23. Stratified
24. Convenience
25. Systematic
26. SRS
29. Census
30. Survey
Editor's Notes
Population – responses of the adults in the US
Sample – responses of adults in the survey
Population – All the gasoline stations
Sample – 800 surveyed that week
Sample statistic
Population parameter
Sample Statistic
Population parameter
Students generate list…homework assignment will be to classify the data generated
Example: Automobile manufacturers use simulations with dummies to study the effects of crashes on humans.
Example: A survey is conducted on a sample of female physicians to determine whether the primary reason for their career choice is financial stability. It would be acceptable to make a list of reasons and ask each individual in the sample to select her first choice.
What’s the population of each study?
Run an experiment
Use a survey
Experiments can be ruined by a variety of factors. Being able to CONTROL those influential factors is important. One such factor is a CONFOUNDING VARIABLE.
Example: Use a simple random sample to count the number of people who live in West Ridge County households. Assign a different number to each household, use a random number generator to select numbers, then count the number of people living in the selected households (those whose numbers match).