  • Basic Definitions: Population
    • Population
      • The entire group to be studied
    • Census
      • A collection of data and information from the entire population
    • Parameter
      • A numeric measurement or calculation of census data
      • Quantifies an attribute of the population
  • Basic Definitions: Sample
    • Fundamental concept
      • A census may not be possible or practical
    • Sample
      • (Noun) A subset of a population that (we hope) represents the population
      • (Verb) The process of collecting data from the subset.
    • Statistic
      • A numeric measurement or calculation of sample data
      • Estimates the parameter of a population
  • Another Fundamental Concept
    • We are never 100% sure that our sample exactly represents the population
    • So a statistic is just an estimate
    • We will learn many techniques to deal with this uncertainty
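A minimal sketch of the difference between a parameter and a statistic. The population here (weekly TV hours for 1,000 students) is made-up data, and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: weekly TV hours for 1,000 students (made-up data).
rng = random.Random(42)
population = [rng.uniform(0, 30) for _ in range(1000)]

# Parameter: calculated from the whole population (a census).
parameter = statistics.mean(population)

# Statistic: calculated from a sample of 50 students.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)

# The statistic only estimates the parameter, so the two usually differ.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```

Changing the seed draws a different sample and gives a different statistic, while the parameter for a fixed population stays the same.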
  • Where do we get the data?
    • Census vs sample
    • Observations
      • “Watching” real activity and collecting data
      • Opinion polls
    • Experiments
      • Running the activity and measuring the results
      • Relatively easy to control
  • For Example
    • TV watching and test scores
    • Observation
      • Use a survey that asks your sampled students their TV watching habits and their test scores.
    • Experiment
      • Design varied TV-watching schedules for your samples
      • Design and/or administer a test to measure learning
    • Car crashworthiness and make
    • Observation
      • Collect accident data and auto repair data
    • Experiment
      • Deliberately crash cars and measure the results
  • Live Example
    • Movie popularity
    • Observation
    • Experiment
    • Cell Phone Reception
    • Observation
    • Experiment
  • Homework
    • Describe an experiment to gather data that tests each of the following claims.
      • Reading books improves school performance
      • Blondes have more fun
  • Variables
    • A variable is any characteristic that could affect an outcome being tested.
      • Variables have to be measurable
    • What characteristics affect SAT scores?
    • What characteristics affect car crashworthiness?
  • Varying and Controlling
    • In a statistics study, we test whether one variable really has an effect on the outcome.
    • We will vary the test variable
      • Change the value to see if the outcome also changes
    • To prevent confounding, we will control the other variables
      • Confounding: The effects of two or more variables cannot be distinguished
      • Control: Samples with similar values for the other variables may be grouped
  • C and A: How do you raise a smart kid?
    • Economics professor has correlated test scores with family characteristics
      • Educated parents
      • High socio-economic status
      • 30 year old mom
      • Books in home
      • English in the house
      • PTA participation
      • Birth weight
      • Adopted
  • Your Turn / Homework
    • Let's assume we are designing a study of car crashworthiness. Your assignment is to do the following.
    • List 6 variables of a car or driver that you feel affect crashworthiness.
    • Of these variables, pick one that you would like to test.
    • Using the control variables, describe three groups of cars and/or drivers you would create to test your variable.
  • Treatment
    • When running an experiment that tests a variable:
      • The sample will be split into groups
      • Each group will be administered one level of the variable
      • Who or what is assigned to each group is randomly determined.
    • In some experiments the test variable is all or none.
      • E.g., a drug
      • One group, the treatment group, receives all (called the treatment)
      • The other group, the control group, receives nothing or a pretend treatment called a placebo
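A minimal sketch of random assignment to a treatment group and a control group. The function name `assign_groups`, the subject labels, and the even split are assumptions for illustration; real experiments may use other designs.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)      # chance alone decides who lands in which group
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subjects, labeled only by number.
subjects = [f"subject_{i}" for i in range(1, 21)]
groups = assign_groups(subjects, seed=1)
print(groups["treatment"])
print(groups["control"])
```

Because the shuffle decides group membership, neither the experimenter nor the subjects choose who receives the treatment.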
  • Placebo Effect
    • The subjects, especially those in the control group, might think they are being given the treatment and start to act accordingly.
    • If the experiment is blinded, the subjects are not told whether they are receiving the real treatment or a placebo.
      • The subjects should also not be told the outcome
    • If the experiment is double-blinded, the people administering the experiment are also not told
  • Your turn/homework
    • You are charged with testing a new SAT prep course
    • Describe how the placebo effect might come into play in your experiment
    • Describe how you would counteract that effect
  • Sampling
    • Sampling: picking a subset of a population
    • The sample’s characteristics should reflect the population’s in the same proportions
    • E.g., our school’s demographic break-down is
               Frosh   Sophomore   Junior   Senior
      Male      13%       12%       12%      13%
      Female    13%       13%       11%      13%
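A minimal sketch of turning the breakdown above into per-group sample counts, so that the sample mirrors the school. The sample size of 200 and the function name `proportional_quota` are assumptions for illustration.

```python
# School breakdown from the slide, in percent of the whole school.
breakdown = {
    ("Male", "Frosh"): 13, ("Male", "Sophomore"): 12,
    ("Male", "Junior"): 12, ("Male", "Senior"): 13,
    ("Female", "Frosh"): 13, ("Female", "Sophomore"): 13,
    ("Female", "Junior"): 11, ("Female", "Senior"): 13,
}

def proportional_quota(breakdown, sample_size):
    """Number of students to draw from each group to match the population's proportions."""
    # Rounding can make the counts miss sample_size by a few for awkward sizes.
    return {group: round(sample_size * pct / 100) for group, pct in breakdown.items()}

quota = proportional_quota(breakdown, 200)
for group, count in quota.items():
    print(group, count)
```

With a sample of 200, a 13% group contributes 26 students and the 11% group contributes 22.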
  • Sample Scheme Characteristics
    • Random sample
      • Each member of the population has an equal chance to be selected
    • Simple random sample
      • Each subset of the population (of a given size) has an equal chance of being selected.
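A minimal sketch of why the two definitions differ, using a made-up population of four students and samples of size two. The "coin-flip" scheme is a hypothetical example: it gives every member an equal chance, yet it is not a simple random sample because most subsets can never be chosen.

```python
from fractions import Fraction
from itertools import combinations

population = ["Ann", "Bob", "Cal", "Dee"]  # hypothetical students
subsets = list(combinations(population, 2))  # all 6 possible samples of size 2

# Simple random sample: every subset of size 2 is equally likely.
srs = {subset: Fraction(1, len(subsets)) for subset in subsets}

# Coin-flip scheme: pick either the first pair or the last pair.
coin_flip = {("Ann", "Bob"): Fraction(1, 2), ("Cal", "Dee"): Fraction(1, 2)}

def member_chance(scheme, member):
    """Probability that a given member ends up in the sample."""
    return sum(p for subset, p in scheme.items() if member in subset)

for member in population:
    print(member, member_chance(srs, member), member_chance(coin_flip, member))
```

Both schemes give each student the same chance of selection, but only the first gives every pair, such as Ann and Cal, an equal chance.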
  • Sampling Strategies
    • Self-selected
      • Population members volunteer
      • E.g., Call-in phone lines
      • Easy to implement
      • Difficult to get a proportional sample
      • Susceptible to bias
    • Convenience sampling
      • Whoever happens by
      • E.g., Mall surveys
      • Also susceptible to bias
  • Sampling Strategies
    • Random sample
      • Each member of the population is selected at random
      • E.g., Generate random student IDs
    • Systematic sampling
      • Population is put into some order
      • Select some starting point, then select every nth individual in the population
      • The starting point and maybe the interval (n) are picked at random
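A minimal sketch of random and systematic sampling. The student IDs, the interval of 10, and the function name `systematic_sample` are assumptions for illustration.

```python
import random

def systematic_sample(ordered_population, interval, rng):
    """Pick a random starting point, then take every `interval`-th member."""
    start = rng.randrange(interval)  # random start within the first interval
    return ordered_population[start::interval]

rng = random.Random(7)
student_ids = list(range(1000, 1100))  # hypothetical IDs, already in order

# Random sample: 10 IDs drawn at random, without repeats.
random_ids = rng.sample(student_ids, 10)

# Systematic sample: every 10th ID after a random starting point.
systematic_ids = systematic_sample(student_ids, 10, rng)

print(sorted(random_ids))
print(systematic_ids)
```

The systematic sample is evenly spaced through the ordered list, while the random sample can bunch up anywhere.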
  • More Sampling Selection and Collection
    • Stratified sampling
      • Divide the population into groups.
        • Groups are determined by control variables
      • Randomly sample within each group
    • Cluster sample
      • Divide the population into clusters, randomly pick a cluster, then sample all (or most) members of the cluster
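A minimal sketch of stratified and cluster sampling. The student records, the choice of grade as the stratum and homeroom as the cluster, and all function names are assumptions for illustration.

```python
import random
from collections import defaultdict

def group_by(population, key):
    """Split the population into groups that share the same value for `key`."""
    groups = defaultdict(list)
    for member in population:
        groups[member[key]].append(member)
    return groups

def stratified_sample(population, key, per_group, rng):
    """Randomly sample `per_group` members from every group (stratum)."""
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, per_group))
    return sample

def cluster_sample(population, key, rng):
    """Randomly pick one cluster and take all of its members."""
    clusters = group_by(population, key)
    chosen = rng.choice(sorted(clusters))
    return clusters[chosen]

rng = random.Random(3)
grades = ["Frosh", "Sophomore", "Junior", "Senior"]
# Hypothetical school: 40 students, 10 per grade, spread over 5 homerooms.
students = [{"id": i, "grade": grade, "homeroom": f"room_{i % 5}"}
            for i, grade in enumerate(grades * 10)]

by_grade = stratified_sample(students, "grade", 3, rng)
one_room = cluster_sample(students, "homeroom", rng)
print(len(by_grade), len(one_room))
```

The stratified sample is guaranteed to include every grade, while the cluster sample depends entirely on which homeroom happens to be picked.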
  • Example: Student Opinion Poll
    • Self-selecting
    • Random sampling
    • Systematic sampling
    • Convenience sampling
    • Stratified sampling
    • Cluster sample
  • Example: Crashworthiness
    • Self-selecting
    • Random sample
    • Systematic sampling
    • Convenience sampling
    • Stratified sampling
    • Cluster sample
  • Bias
    • Sampling members of a population…
      • With a specific characteristic
      • That will give a specific outcome
      • “Rigging the game”
    • Selection and undercoverage bias
      • E.g., FOX news and health care
    • Non-response bias
      • Counting non-response as one answer
    • Voluntary response bias
      • Only people who feel strongly might respond to a survey.
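A minimal simulation sketch of voluntary response bias. The opinion scale, the population, and the response chances (50% for strong opinions, 5% for everyone else) are all made-up assumptions, not measured values.

```python
import random

rng = random.Random(0)

# Hypothetical population: 10,000 opinions from 1 (strongly against) to 5 (strongly for).
population = [rng.randint(1, 5) for _ in range(10_000)]

def responds(opinion):
    """Assumed behavior: people with strong opinions (1 or 5) respond far more often."""
    chance = 0.5 if opinion in (1, 5) else 0.05
    return rng.random() < chance

# Only the people who choose to respond end up in the survey results.
respondents = [opinion for opinion in population if responds(opinion)]

def strong_share(opinions):
    """Fraction of opinions that sit at either extreme of the scale."""
    return sum(1 for o in opinions if o in (1, 5)) / len(opinions)

print(f"strong opinions in population:  {strong_share(population):.0%}")
print(f"strong opinions in respondents: {strong_share(respondents):.0%}")
```

Under these assumptions, strong opinions are heavily over-represented among the respondents compared with the population they came from.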
  • More on Bias
    • If I want my test to support the claim that watching too much TV hurts SAT scores, how do I rig the sample?
    • If I want my test to support the claim that European cars are safer than Japanese cars, how do I rig the sample?