4. EXERCISE:
Identify whether each situation involves descriptive or inferential statistics.
a) The average age of the students in a statistics class is 21 years.
b) The chances of winning the California Lottery are one in twenty-two million.
c) There is a relationship between smoking cigarettes and getting emphysema.
d) From past figures, it is predicted that 39% of the registered voters in California will vote in the June primary.
5. Sources of Data
• Survey and Experiment
• Known to be the two fundamental kinds of
investigations.
• Retrospective Studies (Case-Control Studies)
• Gather data from selected cases and
controls to determine differences, if any, in
the exposure to a suspected factor.
• Prospective Studies (Cohort Studies)
• The researchers enroll a group of healthy
persons (a cohort) and follow them over a
certain period to determine the frequency
with which a disease develops.
6. Descriptive vs Analytical Surveys
• Retrospective surveys are
usually descriptive surveys
that provide estimates of a
population’s characteristics.
• Prospective surveys may be
descriptive or analytical.
Analytical surveys seek to
determine the degree of
association between a variable
and a factor in the population.
7. Populations
and Samples
Definition:
A population is a set of
persons (or objects)
having a common
observable
characteristic.
A sample is a subset of
a population.
Note: The way the sample is
selected, not its size,
determines whether we may
draw appropriate inferences
about the population.
8. EXERCISE:
Identify the
population and
the sample
a) A survey of 1353 American households found that
18% of the households own a computer.
Population: All American households
Sample: 1353 American households
b) A recent survey of 2625 elementary school
children found that 28% of the children could be
classified as obese.
Population:
Sample:
c) The average weight of every sixth person entering
the mall within a 3-hour period was 146 lb.
Population:
Sample:
10. Parameter vs
Statistic
• The symbols change as you move from a statistic to a
parameter. Greek symbols are used for parameters.
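As a rough illustration of the distinction, the sketch below computes a population mean (conventionally written with the Greek letter μ) and a sample mean (written x̄) using Python's standard library. The salary figures and the sample size of 4 are invented for the example and are not taken from any study.

```python
import random
import statistics

# Hypothetical population: salaries (in thousands) of all 12 employees of a small firm.
population = [31, 28, 35, 40, 29, 33, 38, 30, 36, 32, 34, 30]

# Parameter: computed from every member of the population (mu).
mu = statistics.mean(population)

# Statistic: computed from a sample only (x-bar); it changes from sample to sample.
rng = random.Random(0)              # fixed seed so the run is repeatable
sample = rng.sample(population, 4)
x_bar = statistics.mean(sample)

print(f"parameter mu = {mu}")
print(f"statistic x_bar = {x_bar} from sample {sample}")
```

Changing the seed changes x̄ but never μ, which is why a statistic is treated as an estimate of a fixed parameter.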
11. Exercise
Determine whether the numerical
value is a parameter or a statistic
(and explain):
a) A recent survey by the alumni of a
major university indicated that the
average salary of 10,000 of its
300,000 graduates was P125,000.
b) The average salary of all assembly-
line employees at a certain car
manufacturer is $33,000.
c) The average late fee for 360 credit
card holders was found to be
$56.75.
12. Qualitative or Quantitative Data
Qualitative Data
• Deals with descriptions.
• Data can be observed but not measured.
• Colors, textures, smells, tastes, appearance, beauty, etc.
Quantitative Data
• Deals with numbers.
• Data which can be measured.
• Length, height, area, volume, weight, speed, time, temperature, humidity, sound levels, cost, members, ages, etc.
13. EXERCISE
Determine whether the data are
qualitative or quantitative:
a) the colors of automobiles on a
used car lot
b) the numbers on the shirts of a
girls’ soccer team
c) the number of seats in a movie
theater
d) a list of house numbers on your
street
e) the ages of a sample of 350
employees of a large hospital.
14. Levels of
Measurement
• Variable - any characteristic of an
individual or entity. A variable can
take different values for different
individuals.
15. Levels of Measurement
EXAMPLES
• NOMINAL: genotype, blood type, zip code, gender, race, eye color, political party.
• ORDINAL: socioeconomic status (“low income”, “middle income”, “high income”), education level (“high school”, “BS”, “MS”, “PhD”), income level (“less than 50K”, “50K-100K”, “over 100K”), satisfaction rating (“extremely dislike”, “dislike”, “neutral”, “like”, “extremely like”).
• INTERVAL: temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850).
• RATIO: enzyme activity, dose amount, reaction rate, flow rate, concentration, pulse, weight, length, temperature in Kelvin (0.0 Kelvin really does mean “no heat”), survival time.
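The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below uses two made-up temperature readings (10 and 20 degrees Celsius) to show that differences carry over when the unit changes, while ratios are only meaningful on the Kelvin scale, which has a true zero.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading to Kelvin (a ratio-scale temperature)."""
    return c + 273.15

# Two made-up readings, in degrees Celsius.
t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the unit change.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios need a true zero: 20 C is not "twice as hot" as 10 C.
ratio_c = t2_c / t1_c
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(diff_c, round(diff_k, 2))     # same 10-degree gap on both scales
print(ratio_c, round(ratio_k, 3))   # 2.0 on Celsius, about 1.035 on Kelvin
```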
16. Exercise
Identify the data set’s level of measurement (nominal, ordinal, interval, ratio):
a) hair color of women on a high school tennis team
b) numbers on the shirts of a girls’ soccer team
c) ages of students in a statistics class
d) temperatures of 22 selected refrigerators
e) number of milligrams of tar in 28 cigarettes
f) number of pages in your statistics book
g) marital status of the faculty at the local community college
h) list of 1247 social security numbers
21. Types of Sampling
• Probability (Random) Samples
• Simple random sample
• Systematic random sample
• Stratified random sample
• Cluster sample
• Non-Probability Samples
• Convenience sample
• Purposive sample
• Quota
• Snowball
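The four probability designs listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a survey-grade implementation: the sampling frame of 100 students in four sections, the sample sizes, and the function names are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical sampling frame: 100 students spread evenly over sections A-D.
frame = [{"id": i, "section": "ABCD"[i % 4]} for i in range(100)]

def simple_random(frame, n):
    """Every subset of size n is equally likely."""
    return rng.sample(frame, n)

def systematic(frame, n):
    """Random start, then every k-th unit, with k = N // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified(frame, n_per_stratum, key="section"):
    """Independent simple random sample inside every stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    return [u for units in strata.values() for u in rng.sample(units, n_per_stratum)]

def cluster(frame, n_clusters, key="section"):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = set(rng.sample(sorted({u[key] for u in frame}), n_clusters))
    return [u for u in frame if u[key] in chosen]

print(len(simple_random(frame, 20)), len(systematic(frame, 20)),
      len(stratified(frame, 5)), len(cluster(frame, 2)))   # 20 20 20 50
```

Note the practical difference between the last two: the stratified sample takes a few units from every section, while the cluster sample takes every unit from a few sections.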
22. SAMPLING
• A sample is “a smaller (but hopefully representative)
collection of units from a population used to determine
truths about that population” (Field, 2005)
• Why sample?
• Resources (time, money) and workload
• Gives results with known accuracy that can be
calculated mathematically
• The sampling frame is the list from which the potential
respondents are drawn
• Registrar’s office
• Class rosters
• Must assess sampling frame errors
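To illustrate what "known accuracy that can be calculated mathematically" can look like, the sketch below computes an approximate margin of error for a sample proportion, using the household survey figures from the earlier exercise (18% of 1353 households). It assumes a simple random sample and the normal approximation with z = 1.96 (roughly 95% confidence); neither assumption is stated in the slides, and the function name is hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.18, 1353
moe = margin_of_error(p_hat, n)
print(f"{p_hat:.0%} +/- {moe:.1%}")   # 18% +/- 2.0%
```

Because the sample size sits under a square root, quadrupling n only halves the margin of error.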
23. SAMPLING……
• What is your population of interest?
• To whom do you want to generalize your
results?
• All doctors
• School children
• Indians
• Women aged 15-45 years
• Other
• Can you sample the entire population?
24. SAMPLING…….
• 3 factors that influence sample
representativeness
• Sampling procedure
• Sample size
• Participation (response)
• When might you sample the entire population?
• When your population is very small
• When you have extensive resources
• When you don’t expect a very high response
26. PROBABILITY SAMPLING
• A Probability Sampling scheme is one in which every
unit in the population has a chance (greater than zero)
of being selected in the sample, and this probability
can be accurately determined.
• When every element in the population has the
same probability of selection, this is known as an
'equal probability of selection' (EPS) design. Such
designs are also referred to as 'self-weighting'
because all sampled units are given the same weight.
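A small worked example may help connect selection probabilities to weights. The sketch below assumes simple random sampling within each group, so the selection probability is n/N and the weight is its reciprocal. The 25-of-100 case mirrors the third-grade example in the exercise slide; the urban and rural strata are hypothetical.

```python
from fractions import Fraction

def inclusion_probability(n_sampled, n_population):
    """Chance that any one unit is selected under simple random sampling."""
    return Fraction(n_sampled, n_population)

# EPS design: simple random sample of 25 children from a population of 100.
p = inclusion_probability(25, 100)
weight = 1 / p          # every sampled child represents the same number of children

# Non-EPS design: hypothetical strata sampled at different rates.
strata = {"urban": (60, 600), "rural": (60, 200)}   # (sampled, population size)
weights = {name: 1 / inclusion_probability(n, N) for name, (n, N) in strata.items()}

print(p, weight)
print(weights)
```

In the first design every unit carries the same weight, which is what "self-weighting" refers to. In the second, urban and rural respondents need different weights before their answers can be combined.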
27. NON-PROBABILITY SAMPLING
• Any sampling method where some elements of the population
have no chance of selection (these are sometimes
referred to as 'out of coverage'/'undercovered'), or
where the probability of selection can't be accurately
determined. It involves the selection of elements based on
assumptions regarding the population of interest, which
form the criteria for selection. Hence, because the
selection of elements is nonrandom, nonprobability
sampling does not allow the estimation of sampling errors.
• Example: We visit every household in a given street, and
interview the first person to answer the door. In any
household with more than one occupant, this is a
nonprobability sample, because some people are more
likely to answer the door (e.g. an unemployed person who
spends most of their time at home is more likely to answer
than an employed housemate who might be at work when
the interviewer calls) and it's not practical to calculate
these probabilities.
28. EXERCISE
Identify the sampling method used in each situation
• A sample of 2,000 was sought to estimate the average achievement in science of fifth graders in a city’s public
schools. The average fifth grade enrollment in the city’s elementary schools is 100 students. Thus, 20 schools were
randomly selected and within each of those schools all fifth graders were tested.
• A researcher has a population of 100 third grade children from a local school district from which a sample of 25
children is to be selected. Each child’s name is put on a list, and each child is assigned a number from 1 to 100. Then
the numbers 1 to 100 are written on separate pieces of paper and shuffled. Finally, the researcher picks 25 slips of
paper and the numbers on the paper determine the 25 participants.
• A sociologist conducts an opinion survey in a major city. Part of the research plan calls for describing and comparing
the opinions of four different ethnic groups: African Americans, Asian Americans, European Americans, and Native
Americans. For a total sample of 300, the researcher selects 75 participants from each of the four predetermined
subgroups.
• Instructors teaching research methods are interested in knowing what study techniques their students are utilizing.
Rather than assessing all students, the researchers randomly select 10 students from each of the sections to comprise
their sample.