Chapter 3 part3-Toward Statistical Inference


Published in: Education

  1. INTRODUCTION TO STATISTICS & PROBABILITY. Chapter 3: Producing Data (Part 3). Dr. Nahid Sultana
  2. Chapter 3: Producing Data
     • Introduction
     • 3.1 Design of Experiments
     • 3.2 Sampling Design
     • 3.3 Toward Statistical Inference
     • 3.4 Ethics
  3. 3.3 Toward Statistical Inference
     • Parameters and Statistics
     • Sampling Variability
     • Sampling Distribution
     • Bias and Variability
     • Sampling from Large Populations
  4. Parameters and Statistics: Using samples to talk about populations
     A parameter is a number that describes some characteristic of the population. In statistical practice, the value of a parameter is not known because we cannot examine the entire population.
       Name                Symbol   Example
       Mean                µ        In a nationwide test, what is the average score?
       Proportion          p        What proportion of people choose chocolate as their favorite ice cream flavor?
     We answer such questions by studying a sample. A statistic is a number that describes some characteristic of a sample. The value of a statistic can be computed directly from the sample data.
       Name                Symbol   Example
       Sample mean         x̄        Mean of a sample of 100 test scores
       Sample proportion   p̂        Proportion of a sample of 100 people who choose chocolate as their favorite ice cream flavor
  5. Parameters and Statistics: Examples
     • Proportion of all students who attended the last home football game. Parameter, p
     • Proportion of registered voters who voted in November. Parameter, p
     • Mean height of a sample of NBA basketball players. Statistic, x̄
     • Mean SAT score of entering freshmen. Parameter, µ
     • Proportion of people who prefer Coke over Pepsi in a sample of mall shoppers. Statistic, p̂
     • Mean number of pepperoni slices on a 12-inch pizza from a sample of a certain brand of pepperoni pizzas. Statistic, x̄
  6. Statistical Estimation
     • The process of statistical inference involves using information from a sample to draw conclusions about a wider population.
     • Your estimate of the population is only as good as your sampling design.
     • Work hard to eliminate biases.
     • The result from your sample is only an estimate, and if you randomly sampled again you would probably get a somewhat different result.
     • A bigger sample is better: larger samples give less variable estimates.
  7. Sampling Variability
     • Each time we take a random sample from a population, we are likely to get a different set of individuals and calculate a different statistic. This is called sampling variability.
     • We ask, “What would happen if we took many samples?”
       • Take a large number of samples from the same population.
       • Calculate the sample mean or sample proportion for each sample.
       • Make a histogram of these values.
       • Examine the distribution displayed in the histogram for shape, center, and spread, as well as outliers or other deviations.
  8. Sampling Variability (continued)
     • The sampling distribution of a statistic is the distribution of that statistic for samples of a given size n taken from the same population.
     • The variability of a statistic is described by the spread of its sampling distribution. This spread depends on the sampling design and the sample size n, with larger sample sizes leading to lower variability.
  9. The results of many SRSs have a regular pattern.
     • [Figure 1: histogram] Here, we draw 1000 SRSs of size 100 from the same population. The population proportion is p = 0.60. The histogram shows the distribution of the 1000 sample proportions.
     • [Figure 2: histogram] The distribution of sample proportions for 1000 SRSs of size 2500 drawn from the same population as in the first figure. The two histograms have the same scale. The statistic from the larger sample is less variable.
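
The two histograms described on slide 9 can be approximated with a short simulation. The sketch below is an illustration and not part of the original slides. It assumes the population is large enough that each sampled individual can be treated as an independent draw with success probability p = 0.60; the function name `simulate_sample_proportions` and the seeds are hypothetical choices.

```python
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Return num_samples sample proportions, each computed from n draws.

    Assumption: the population is very large, so every draw is treated as
    an independent success/failure with success probability p.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Count how many of the n sampled individuals are "successes".
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# 1000 samples of size 100 and 1000 samples of size 2500, with p = 0.60.
small = simulate_sample_proportions(p=0.60, n=100, num_samples=1000, seed=1)
large = simulate_sample_proportions(p=0.60, n=2500, num_samples=1000, seed=2)

# Center and spread of each simulated sampling distribution.
mean_small = statistics.mean(small)
mean_large = statistics.mean(large)
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"n = 100:  mean = {mean_small:.4f}, spread = {spread_small:.4f}")
print(f"n = 2500: mean = {mean_large:.4f}, spread = {spread_large:.4f}")
```

Both sets of sample proportions should center near 0.60, while the spread for n = 2500 should come out at roughly one fifth of the spread for n = 100, since the spread shrinks in proportion to the square root of the sample size.
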
  10. Bias and Variability
      Both bias and variability describe what happens when we take many shots at the target.
      • Bias concerns the center of the sampling distribution. A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
      • The variability of a statistic is described by the spread of its sampling distribution. This spread is determined by the sampling design and the sample size n. Statistics from larger probability samples have smaller spreads.
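
For the sample proportion from an SRS, both ideas can be written down explicitly. The formulas below are standard results that the slides do not state; they assume an SRS of size n from a population much larger than the sample, and use the value p = 0.60 from slide 9 as a worked example.

```latex
% Unbiasedness: the sampling distribution of \hat{p} is centered at p.
% Variability: its standard deviation shrinks as n grows.
\[
  \mu_{\hat{p}} = p, \qquad
  \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}
\]
% Worked example with p = 0.60:
\[
  n = 100:\quad \sigma_{\hat{p}} = \sqrt{\frac{0.60 \times 0.40}{100}} \approx 0.049,
  \qquad
  n = 2500:\quad \sigma_{\hat{p}} = \sqrt{\frac{0.60 \times 0.40}{2500}} \approx 0.0098
\]
```

Multiplying the sample size by 25 divides the spread by 5, which is why the larger samples give the narrower histogram.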
  11. Managing Bias and Variability
      A good sampling scheme must have both small bias and small variability.
      • To reduce bias, use random sampling.
      • To reduce variability of a statistic from an SRS, use a larger sample.
      POPULATION SIZE DOESN’T MATTER: The variability of a statistic from a random sample does not depend on the size of the population, as long as the population is at least 100 times larger than the sample.
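
One way to see why population size barely matters is the finite population correction for an SRS drawn without replacement. This correction is a standard result that the slides do not mention, so the sketch below is only an illustration; the function name `sd_sample_proportion` and the example values (p = 0.60, n = 100) are assumptions chosen for the demonstration.

```python
import math


def sd_sample_proportion(p, n, population_size=None):
    """Standard deviation of the sample proportion for an SRS of size n.

    With population_size=None the population is treated as effectively
    infinite. Otherwise the finite population correction
    sqrt((N - n) / (N - 1)) is applied (sampling without replacement).
    """
    base = math.sqrt(p * (1 - p) / n)
    if population_size is None:
        return base
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return base * correction


p, n = 0.60, 100
large_population_sd = sd_sample_proportion(p, n)

# Populations that are 100, 10,000, and 1,000,000 times the sample size.
for population_size in (10_000, 1_000_000, 100_000_000):
    sd = sd_sample_proportion(p, n, population_size)
    ratio = sd / large_population_sd
    print(f"N = {population_size:>11,}: sd = {sd:.5f}, ratio = {ratio:.4f}")
```

Once the population is at least 100 times the sample size, the correction changes the spread by less than one percent, so the sample size n is what drives the variability.
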
  12. 3.4 Ethics
      • Institutional Review Boards
      • Informed Consent
      • Confidentiality
      • Clinical Trials
      • Behavioral and Social Science Experiments
  13. Institutional Review Boards
      • The organization that carries out the study must have an institutional review board that reviews all planned studies in advance in order to protect the subjects from possible harm.
      • The institutional review board:
        • reviews the plan of study
        • can require changes
        • reviews the consent form
        • monitors progress at least once a year
  14. Informed Consent
      • All subjects must give their informed consent before data are collected.
      • Subjects must be informed in advance about the nature of a study and any risk of harm it might bring.
      • Subjects must then consent in writing.
      • Who can’t give informed consent?
        • prison inmates
        • very young children
        • people with mental disorders
  15. Confidentiality
      • All individual data must be kept confidential. Only statistical summaries may be made public.
      • Confidentiality is not the same as anonymity. Anonymity means that subjects are anonymous: their names are not known even to the director of the study. Anonymity prevents follow-ups to improve non-response or to inform subjects of results.
      • Any breach of confidentiality is a serious violation of data ethics.
      • The best practice is to separate the identity of the subjects from the rest of the data immediately.
  16. Clinical Trials
      • Clinical trials study the effectiveness of medical treatments on actual patients; these treatments can harm as well as heal.
      • Points for discussion:
        • Randomized comparative experiments are the only way to see the true effects of new treatments.
        • Most benefits of clinical trials go to future patients. We must balance future benefits against present risks.
  17. Behavioral and Social Science Experiments
      • Many behavioral experiments rely on hiding the true purpose of the study.
      • Subjects would change their behavior if told in advance what investigators were looking for.
      • The “Ethical Principles” of the American Psychological Association require consent unless a study only observes behavior in a public space.