Statistics – The science of collecting, organizing, analyzing
and interpreting data in order to make decisions
Data – information coming from observations, counts,
measurements, or responses
Where have you seen statistics being used before?
DATA SETS: POPULATIONS VS. SAMPLES
A population is the collection of all outcomes, responses,
measurements, or counts that are of interest
A sample is a subset of a population
In a recent survey, 1708 adults in the US were asked if they
think global warming is a problem that requires immediate
government action. 939 of the adults said yes.
Identify the population and the sample.
The US Department of Energy conducts weekly surveys of
approximately 800 gasoline stations to determine the average
price per gallon of regular gasoline. On Feb. 12, 2007, the
average price was $2.24 per gallon.
Identify the population and the sample.
Parameter – a numerical description of a POPULATION
characteristic
Statistic – a numerical description of a SAMPLE characteristic
**P’s stay together, and S’s stay together
**Population = parameter
**Sample = statistic
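The population/parameter and sample/statistic pairing can be sketched in Python. This is only an illustration: the salary values are randomly generated, and the names `population`, `sample`, `parameter`, and `statistic` are assumptions made for this example, not data from any survey mentioned in these notes.

```python
import random
from statistics import mean

# Hypothetical population: starting salaries (in dollars) for 667 graduates.
# The values are randomly generated for illustration only.
rng = random.Random(0)
population = [rng.randint(60_000, 120_000) for _ in range(667)]

# Parameter: a numerical description computed from EVERY member of the population.
parameter = mean(population)

# Statistic: a numerical description computed from a sample (50 graduates chosen at random).
sample = rng.sample(population, 50)
statistic = mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

The two numbers will usually be close but not equal, which is why it matters whether a reported value describes a population or a sample.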
DISTINGUISH BETWEEN A PARAMETER AND STATISTIC
1. A recent survey of a sample of MBAs reported that the average
salary for an MBA is more than $82,000.
2. Starting salaries for the 667 MBA graduates from the University
of Chicago Graduate School of Business increased 8.5% from
the previous year.
3. In a random check of a sample of retail stores, the Food and
Drug Administration found that 34% of the stores were not
storing fish at the proper temperature.
4. In 2006, major league baseball teams spent a total of
$2,326,706,685 on players’ salaries.
BRANCHES OF STATISTICS
Descriptive Statistics – the branch of
statistics that involves the organization,
summarization, and display of data.
Inferential Statistics – the branch of
statistics that involves using a sample to draw
conclusions about a population.
SECTION 1.1 ASSIGNMENT
Pg. 8 - 11 #1 - #36 ALL
DATA CLASSIFICATION
Data can be just about ANYTHING pertinent to the question at hand:
Data about Students at BJSHS:
TYPES OF DATA
Qualitative Data – consists of attributes, labels, or
nonnumerical entries (movie ratings, favorite color, teams,
etc…)
Quantitative Data – consists of numerical measurements or
counts (amounts, times, etc…)
NOTE: NUMBERS DO NOT ALWAYS MEAN QUANTITATIVE (e.g., jersey numbers are labels)
LEVELS OF MEASUREMENT
1. Nominal – qualitative only
2. Ordinal – qualitative or quantitative
3. Interval – quantitative only
4. Ratio – quantitative only
LEVELS OF MEASUREMENT
1. Nominal – categorized by names, labels or qualities
Yes/No Questions
Jersey Numbers
Names
Hair Color
2. Ordinal – able to be ranked or ordered, but differences between
entries are not meaningful
S/M/L/XL shirts
1st, 2nd, 3rd,…
Movie Ratings
3. Interval – can be ordered and differences are meaningful, but 0 does
NOT mean “nothing”, so ratios are not meaningful
Temperature
Years (NOT TIME BETWEEN THINGS)
4. Ratio – when 0 means “none” or “nothing”; true count, ratio
between two data points can be formed
Population
# of pages in a book
Length
Price/Money
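The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below uses made-up temperatures and page counts; all variable names are assumptions for illustration.

```python
# Interval level: Celsius temperatures. Zero does not mean "no temperature".
temp_a_c = 10.0
temp_b_c = 20.0

# Differences ARE meaningful at the interval level.
difference = temp_b_c - temp_a_c  # B is 10 degrees warmer than A

# A ratio of Celsius values is NOT meaningful: 20 C is not "twice as hot" as 10 C.
# Converting to kelvin (a scale with a true zero) shows the actual ratio.
naive_ratio = temp_b_c / temp_a_c
kelvin_ratio = (temp_b_c + 273.15) / (temp_a_c + 273.15)

# Ratio level: page counts. Zero pages means "none", so ratios make sense.
pages_a = 150
pages_b = 300
page_ratio = pages_b / pages_a  # book B has twice as many pages as book A

print(f"Celsius ratio (misleading): {naive_ratio:.2f}")
print(f"Kelvin ratio:               {kelvin_ratio:.3f}")
print(f"Page ratio (meaningful):    {page_ratio:.2f}")
```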
SECTION 1.2 ASSIGNMENT
Case Study on Page 17 (SUBMIT) [Groups of 3 or fewer]
INDIVIDUAL:
Pg. 15 – 16 #1 - #24 ALL
(Level of Measurement means: nominal, ordinal, interval or ratio)
DESIGNING A STATISTICAL STUDY
1. Identify the variables
2. Develop a plan for collecting data
3. Collect the data
4. Describe the data (using DESCRIPTIVE statistics)
5. Interpret the data (using INFERENTIAL statistics)
6. Identify any possible errors
DATA COLLECTION
1. Do an Observational Study
2. Perform an Experiment
3. Use a Simulation
4. Use a Survey
DATA COLLECTION
1. Observational Study
- Researcher observes and measures
characteristics of interest, but does NOT
change existing conditions.
DATA COLLECTION
2. Perform an Experiment
- a TREATMENT is applied to part of a population and
responses are observed
- Control Group – part of population where NO treatment is
applied
- Subjects in the control group may be given a PLACEBO – a harmless,
unmedicated treatment that is made to look like the real treatment
- Effects of treatment can be compared to control group
- Subjects of a study are also known as EXPERIMENTAL UNITS
DATA COLLECTION
INSIGHT
IN AN OBSERVATIONAL STUDY, A RESEARCHER
DOES NOT INFLUENCE THE RESPONSES. IN
AN EXPERIMENT, A RESEARCHER
DELIBERATELY APPLIES A TREATMENT
BEFORE OBSERVING THE RESPONSES.
DATA COLLECTION
3. Use a Simulation
- Use of a mathematical or physical
model to reproduce the conditions
of a situation or process
- Allows you to study situations that
are impractical, or dangerous
- Saves companies time and money
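A small simulation can be sketched in Python. The example below is not from the course materials: it assumes a simple scenario (rolling two fair dice) and uses a hypothetical helper, `estimate_probability_of_seven`, to reproduce the process many times instead of rolling real dice.

```python
import random

def estimate_probability_of_seven(trials, seed=0):
    """Simulate rolling two fair dice `trials` times and return the
    fraction of rolls whose sum is 7."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials

estimate = estimate_probability_of_seven(100_000)
print(f"Estimated P(sum = 7): {estimate:.3f}  (exact value is 1/6)")
```

The same idea scales to situations that would be impractical or dangerous to test directly: a model stands in for the real process.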
DATA COLLECTION
4. Use a Survey
- An investigation of one or more characteristics
of a population
- Customer Service Surveys
- QUESTIONS MUST BE WORDED SO THEY DO
NOT LEAD TO BIASED RESULTS
Which method of data collection would you use
to collect data for each study?
1. A study of the effect of exercise on relieving
depression.
2. A study of the success of graduates of a large
university in finding a job within one year of
graduation.
EXPERIMENTAL DESIGN
3 KEY ELEMENTS OF A WELL-
DESIGNED EXPERIMENT
1. CONTROL
2. RANDOMIZATION
3. REPLICATION
EXPERIMENTAL DESIGN: CONTROL
Confounding variable – occurs when an experimenter
cannot tell the difference between the effects of
different factors on a variable
Example:
- Coffee Shop owner redecorates to attract more
customers
- At the same time, a shopping mall nearby has a grand
opening
- VARIABLES ARE CONFOUNDED
EXPERIMENTAL DESIGN: CONTROL
PLACEBO EFFECT – when a subject reacts favorably to a
placebo when in fact, he or she has been given no
medicated treatment at all
To avoid this, we use BLINDING
EXPERIMENTAL DESIGN: CONTROL
BLINDING – WHEN THE SUBJECT DOES NOT
KNOW WHETHER HE OR SHE IS RECEIVING A
TREATMENT OR A PLACEBO
DOUBLE BLINDING – NEITHER THE SUBJECT
NOR THE EXPERIMENTER KNOWS IF THE
SUBJECT IS RECEIVING A TREATMENT OR
PLACEBO (PREFERRED)
EXPERIMENTAL DESIGN: RANDOMIZATION
Randomization – process of randomly
assigning subjects to different treatment
groups
1. Completely Randomized Design
2. Randomized Block Design
3. Matched-Pairs Design
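As a rough sketch of the first design in the list, a completely randomized design assigns subjects to groups purely by chance. The helper `completely_randomized` and the subject labels below are hypothetical, and the even split into two groups is an assumption made for this example.

```python
import random

def completely_randomized(subjects, seed=0):
    """Shuffle all subjects and split them into a treatment group
    and a control group of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subject labels.
subjects = [f"subject_{i}" for i in range(1, 21)]
groups = completely_randomized(subjects)
print(groups["treatment"])
print(groups["control"])
```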
EXPERIMENTAL DESIGN: RANDOMIZATION
2. Randomized Block Design
- Divide subjects with similar
characteristics into blocks, and
randomly assign subjects to
treatments within each block
All Subjects
  30 – 39 year olds: Control | Treatment
  40 – 49 year olds: Control | Treatment
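The randomized block design can be sketched as follows, using the same two age blocks. The subject list, the helper `randomized_block`, and the function `age_block` are assumptions for illustration only.

```python
import random

def randomized_block(subjects, block_of, seed=0):
    """Group subjects into blocks, then randomly split each block
    into a treatment group and a control group."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {}
    for label, members in blocks.items():
        rng.shuffle(members)  # randomize WITHIN the block
        half = len(members) // 2
        assignment[label] = {"treatment": members[:half], "control": members[half:]}
    return assignment

# Hypothetical subjects as (name, age) pairs.
ages = [31, 35, 38, 39, 33, 36, 41, 44, 47, 49, 42, 45]
subjects = [(f"s{i}", age) for i, age in enumerate(ages)]

def age_block(subject):
    _, age = subject
    return "30-39" if age < 40 else "40-49"

assignment = randomized_block(subjects, age_block)
for label, groups in assignment.items():
    print(label, groups)
```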
EXPERIMENTAL DESIGN: RANDOMIZATION
3. Matched-Pairs Design
- Subjects are paired according to a similarity
- Subjects may be paired based on age,
residency, etc.
- One receives one treatment, and the other
receives another treatment
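A matched-pairs assignment might look like the sketch below. Pairing by sorting on age and taking neighbors is one simple choice assumed here, not the only way to match; `matched_pairs` and the subject data are hypothetical.

```python
import random

def matched_pairs(subjects, key, seed=0):
    """Pair subjects that are adjacent after sorting by `key`, then
    randomly decide which member of each pair gets which treatment.
    With an odd number of subjects, the last one is left unpaired."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)
    pairs = []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # coin flip within the pair
        pairs.append({"treatment_a": pair[0], "treatment_b": pair[1]})
    return pairs

# Hypothetical subjects as (name, age) pairs.
ages = [22, 45, 23, 44, 31, 60, 30, 61]
subjects = [(f"s{i}", age) for i, age in enumerate(ages)]
pairs = matched_pairs(subjects, key=lambda s: s[1])
for pair in pairs:
    print(pair)
```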
EXPERIMENTAL DESIGN: REPLICATION
Replication – repetition of an experiment using a
large group of subjects
- More subjects, more value added to the result
of your experiment
- We’re always looking for a large sample size
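One way to see why more subjects help is to simulate many experiments at two different sizes and compare how much the results vary. This is a sketch under assumed conditions: responses are drawn uniformly from 0 to 100, and `spread_of_sample_means` is a hypothetical helper.

```python
import random
from statistics import mean, pstdev

def spread_of_sample_means(sample_size, repeats=200, seed=0):
    """Repeat a made-up experiment `repeats` times with `sample_size`
    subjects each, and return the standard deviation of the mean responses."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(repeats):
        responses = [rng.uniform(0, 100) for _ in range(sample_size)]
        sample_means.append(mean(responses))
    return pstdev(sample_means)

small = spread_of_sample_means(10)
large = spread_of_sample_means(1000)
print(f"Spread of results with 10 subjects:   {small:.2f}")
print(f"Spread of results with 1000 subjects: {large:.2f}")
```

The results from the larger groups cluster much more tightly, so a single large experiment is less likely to be far off by chance.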
SAMPLING TECHNIQUES
1. Census – count or measure of ENTIRE population
2. Sampling – count or measure of PART of a population
- Random Sample
- Simple Random Sample
- Stratified Sample
- Cluster Sample
- Systematic Sample
- Sampling Error – difference between the results of a sample
and those of the population
SAMPLING TECHNIQUES
Sampling Error – difference between the results of a
sample and those of the population
Biased Sample – one that is NOT representative of the
population from which it is drawn.
Example: A sample of 18 – 22 year old college students
would NOT be representative of the entire 18 – 22
year old population in the country.
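Sampling error and bias can be compared numerically. The sketch below invents a population of 18 to 22 year olds and a variable (weekly study hours); every value is made up, and the variable names are assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical population of 1000 people aged 18-22.
# The variable is weekly study hours; all values are made up.
students = [rng.uniform(10, 40) for _ in range(400)]
non_students = [rng.uniform(0, 5) for _ in range(600)]
population = students + non_students
population_mean = mean(population)

# Random sample drawn from the whole population.
random_sample = rng.sample(population, 100)
random_error = mean(random_sample) - population_mean

# Biased sample: only college students are surveyed.
biased_sample = rng.sample(students, 100)
biased_error = mean(biased_sample) - population_mean

print(f"Sampling error, random sample: {random_error:+.2f} hours")
print(f"Sampling error, biased sample: {biased_error:+.2f} hours")
```

The random sample misses the population mean by a small chance amount, while the students-only sample is off by a large amount in one direction.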
SAMPLING TECHNIQUES
Random Sample – every member of the population has
an equal chance of being selected
Simple Random Sample – every possible sample of the
same size has the same chance of being selected
USE OF RANDOM NUMBER GENERATORS!
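A simple random sample can be drawn with a random number generator. The sketch below uses Python's standard `random.sample`; the population of 100 numbered members is an assumption for illustration.

```python
import random

rng = random.Random(0)  # seeded so the example is repeatable

# Hypothetical population: members numbered 1 through 100.
population = list(range(1, 101))

# random.sample picks 10 distinct members; every possible group of 10
# has the same chance of being chosen.
srs = rng.sample(population, 10)
print(sorted(srs))
```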
SAMPLING TECHNIQUES
WHEN IT IS IMPORTANT FOR THE SAMPLE TO HAVE MEMBERS FROM
EACH SEGMENT OF THE POPULATION
Stratified Sample – members of the population are divided into two or
more subsets (strata), then a sample is randomly selected from
each stratum
**Ensures that each segment of the population is represented
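A stratified sample might be drawn as in the sketch below. The roster, the choice of grade level as the stratum, and the helper `stratified_sample` are all hypothetical.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Split the population into strata, then randomly select
    `per_stratum` members from EACH stratum."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {label: rng.sample(members, per_stratum)
            for label, members in strata.items()}

# Hypothetical roster: (student_id, grade) with 25 students per grade.
roster = [(f"id{grade}-{n}", grade) for grade in (9, 10, 11, 12) for n in range(25)]
sample = stratified_sample(roster, stratum_of=lambda s: s[1], per_stratum=5)
for grade, members in sample.items():
    print(grade, members)
```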
SAMPLING TECHNIQUES
WHEN THE POPULATION FALLS INTO NATURALLY OCCURRING
SUBGROUPS
CLUSTER SAMPLE – Divide the population into groups (clusters),
and select ALL of the members in one or more (but NOT ALL) of
the clusters.
**It is important that all clusters have similar characteristics
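A cluster sample can be sketched as follows. The homeroom clusters and the helper `cluster_sample` are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly choose `n_clusters` whole clusters and return their
    labels along with ALL members of the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for label in chosen for m in clusters[label]]
    return chosen, members

# Hypothetical clusters: six homerooms of 20 students each.
clusters = {
    f"room_{r}": [f"room_{r}_student_{n}" for n in range(1, 21)]
    for r in "ABCDEF"
}
chosen, members = cluster_sample(clusters, n_clusters=2)
print(chosen)
print(len(members), "students selected")
```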
SAMPLING TECHNIQUES: INSIGHT
For STRATIFIED SAMPLING, each of the strata contains
members with a certain characteristic.
For CLUSTER SAMPLING, the clusters are often geographic
groupings, and each cluster should consist of members
with ALL characteristics.
- Stratified – Some of the members of each group are used
- Cluster – All of the members of one or more groups are
used
SAMPLING TECHNIQUES
- Systematic Sample – a sample in which each
member of the population is assigned a number,
the members are ordered, and then sample
members are selected at regular intervals from a
randomly chosen starting number.
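A systematic sample can be sketched with list slicing. The population of 100 numbered members, the interval of 10, and the helper `systematic_sample` are assumptions for illustration; the starting point is picked at random within the first interval.

```python
import random

def systematic_sample(ordered_population, k, seed=0):
    """Pick a random starting position among the first k members,
    then take every k-th member after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # an index from 0 to k - 1
    return ordered_population[start::k]

# Hypothetical population: members numbered 1 through 100, already ordered.
population = list(range(1, 101))
sample = systematic_sample(population, k=10)
print(sample)
```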
SAMPLING TECHNIQUES
Convenience Sample – sample consists
only of available members of population
(not recommended)
ASSIGNMENT
Pg. 25 #1 - #14, #17 - #26 (identify sampling technique)
Pg. 27 #29- #30
HOMEWORK SELECTED ANSWERS
Section 1.1
5. False
6. True
7. True
8. False
9. False
10. True
11. Pop
12. Sam
13. Sam
14. Pop
15. Sam
16. Pop
21. Pop: all adults in US
Sam: 1000 surveyed
22. Pop: all infants in Italy
Sam: 33043 infants in
study
23. Pop: all households in
US
Sam: 1906 households
surveyed
24. Pop: all computer users
Sam: 496 students
surveyed
29. Statistic
30. Statistic
31. Parameter
32. Parameter
33. Statistic
34. Parameter
35. Statistic
36. Parameter
Section 1.2
1. N and O
2. O, I and R
3. False
4. False
5. False
6. False
7. Qualitative
8. Quantitative
9. Quantitative
10. Qualitative
11. Qualitative
12. Quantitative
13. Qualitative, O
14. Qualitative, N
15. Qualitative
16. Quantitative, R
17. Qualitative, O
18. Quantitative, R
19. O
20. R
21. N
22. R
23. I, N, R, O
24. I,N,I,R
Section 1.3
5. True
6. False
7. False
8. False
9. False
10. True
11. Perform an Experiment
12. Survey
13. Simulation
14. Census
17. SRS
18. Stratified
19. Convenience
20. Cluster
21. SRS
22. Systematic
23. Stratified
24. Convenience
25. Systematic
26. SRS
29. Census
30. Survey
