Chapter 1 Review 1.1 WHAT IS STATISTICS? 1.2 RANDOM SAMPLES 1.3 INTRODUCTION TO EXPERIMENTAL DESIGN
Page 4 Introduction Statistics is the study of how to collect, organize, analyze, and interpret numerical information from data. Generally the first thing one must do is gather data. In order to do that, we need to identify the individuals or objects to be included in our study and the characteristics that are of interest. Individuals are the people or objects included in the study. A variable is a characteristic of the individual to be measured or observed.
Variables The variables in a study can be quantitative or qualitative. A quantitative variable has a value or numerical measurement for which operations such as addition or averaging make sense. A qualitative variable describes an individual by placing the individual into a category or group.
Some VS All In population data, the data are from every individual of interest. Data from a specific population are fixed and complete. A population parameter is a numerical measure that describes an aspect of a population. In sample data, the data are from only some of the individuals of interest. Data from a sample may vary from sample to sample and are not complete. A sample statistic is a numerical measure that describes an aspect of a sample.
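A minimal sketch of the parameter/statistic distinction, assuming a made-up population of 1,000 ages (the data, the seed, and the sample size of 25 are illustrative assumptions, not from the text). The population mean is computed once and is fixed; the sample means generally differ from sample to sample.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of all 1,000 individuals of interest.
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every individual, so it is fixed and complete.
population_mean = statistics.mean(population)

# Statistic: computed from only some individuals, so it can vary by sample.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(3)]

print("population mean (parameter):", population_mean)
print("sample means (statistics):  ", sample_means)
```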
Page 7 Levels of Measurement Levels of measurement are another way to classify data; they indicate the type of arithmetic that is appropriate for the data. To categorize the level of measurement, take the highest level appropriate and consider which calculations are suitable for the data. Nominal: data that consist of names, labels, or categories. There are no implied criteria to order the data from largest to smallest. Ordinal: data that can be arranged in order, but differences between data values either cannot be determined or are meaningless. Interval: data that can be arranged in order, and differences between data values are meaningful. Ratio: data that can be arranged in order; both differences and ratios of data values are meaningful. Data at the ratio level have a true zero.
Page 13 Sampling Techniques One of the most important sampling techniques is the simple random sample. In a simple random sample, every sample of a specific size has an equal chance of being selected, and every individual of the population has an equal chance of being selected. To get a random sample, use a random-number table or a random-number generator.
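A short sketch of drawing a simple random sample with a random-number generator, assuming a hypothetical population of individuals numbered 1 to 100. Python's `random.sample` picks distinct members so that every subset of the requested size is equally likely.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: individuals numbered 1 through 100.
population = list(range(1, 101))

# Simple random sample of size 10: each 10-member subset is equally likely.
sample = rng.sample(population, 10)

print(sorted(sample))
```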
Page 15 Simulation A simulation is a numerical copy, imitation, or representation of a real-world phenomenon. Random-number tables can be used for simulations. Advantages (p 22): numerical and statistical simulations can fit real-world problems extremely well, and they allow a researcher to explore situations that might be very dangerous in real life.
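As one possible illustration, the sketch below simulates 600 rolls of a fair six-sided die with a random-number generator instead of a random-number table. The die, the number of rolls, and the seed are assumptions chosen for the example.

```python
import random
from collections import Counter

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Simulate 600 rolls of a fair die; each roll is a random integer from 1 to 6.
rolls = [rng.randint(1, 6) for _ in range(600)]

# Tally how often each face appeared.
counts = Counter(rolls)
for face in range(1, 7):
    print(face, counts[face])
```

With a fair die, each face should appear roughly 100 times, though the exact counts vary from run to run when the seed changes.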
Page 15 Sampling With or Without Replacement Sampling with replacement means that although a number (or individual) is selected for the sample, it is not removed from the population. Thus, the same number may be selected for the sample more than once. Simulations often use sampling with replacement. Sampling without replacement means that when a number (or individual) is selected for a sample it is not replaced in the population. If you need to sample without replacement, generate more items than you need for the sample and remove duplicate values.
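A minimal sketch of both approaches, assuming a hypothetical population numbered 1 to 20. The helper `sample_without_replacement` is an illustrative name, not from the text; it follows the slide's advice by generating extra draws and discarding duplicates until enough distinct values remain.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable
population = list(range(1, 21))  # hypothetical individuals numbered 1..20

# With replacement: a selected value stays in the population,
# so the same value may appear more than once.
with_replacement = rng.choices(population, k=8)

def sample_without_replacement(pop, n, rng):
    """Keep drawing and discard duplicates until n distinct values remain."""
    if n > len(set(pop)):
        raise ValueError("sample size exceeds the number of distinct members")
    chosen = []
    while len(chosen) < n:
        draw = rng.choice(pop)
        if draw not in chosen:  # remove duplicate values
            chosen.append(draw)
    return chosen

without_replacement = sample_without_replacement(population, 5, rng)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

In practice, `random.sample(population, 5)` gives a sample without replacement directly; the loop is shown only to mirror the discard-duplicates procedure.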
Page 16 Other Sampling Techniques Random Sample: Use a simple random sample from the entire population. In our text, assume (simple) random samples are used. Stratified Sampling: Divide the entire population into distinct subgroups, called strata, based on a specific characteristic such that all members of a stratum share the specific characteristic. Draw a random sample from each stratum.
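Stratified sampling might be sketched as follows, assuming a hypothetical population of 100 students with class year as the stratifying characteristic (the data and the choice of 3 per stratum are illustrative assumptions).

```python
import random
from collections import defaultdict

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical population: (id, class year) for 100 students.
years = ["freshman", "sophomore", "junior", "senior"]
population = list(enumerate(years * 25))

# Divide the population into strata that share the characteristic.
strata = defaultdict(list)
for person_id, year in population:
    strata[year].append(person_id)

# Draw a random sample from each stratum.
sample = {year: rng.sample(members, 3) for year, members in strata.items()}

for year, members in sample.items():
    print(year, members)
```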
Other Sampling Techniques Systematic Sampling: Number all members of the population sequentially. From a random starting point, include every kth member of the population in the sample. Advantage: easy to obtain. Disadvantage: will not work if a population is repetitive or cyclic in nature. Cluster Sampling: Divide the entire population into pre-existing segments or clusters. Make a random selection of clusters, and include every member of each selected cluster in the sample.
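A brief sketch of systematic and cluster sampling, assuming a hypothetical population numbered 1 to 100, k = 10, and clusters formed from consecutive groups of 10 standing in for pre-existing segments such as city blocks.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # members numbered sequentially

# Systematic sampling: random starting point, then every kth member.
k = 10
start = rng.randrange(k)  # random index among the first k members
systematic = population[start::k]

# Cluster sampling: hypothetical pre-existing clusters of 10 members each.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)  # random selection of clusters
# Include every member of each selected cluster.
cluster_sample = [member for cluster in chosen_clusters for member in cluster]

print("systematic:", systematic)
print("cluster:   ", cluster_sample)
```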
Other Sampling Techniques Multistage Sampling: Use a variety of sampling methods to create successively smaller groups at each stage; the final sample consists of clusters. Convenience Sampling: Create a sample by using data from population members that are readily available. Disadvantage: may be severely biased.
Page 21 Planning a Statistical Study
1. Identify the individuals or objects of interest.
2. Specify the variables and protocols for taking measurements or making observations.
3. Determine whether you will use the entire population or a sample. If using a sample, choose an appropriate method. In a census, measurements or observations from the entire population are used. Advantage: gives complete information about the population. Disadvantage: obtaining a census can be expensive and difficult.
4. Make a data collection plan. Choose a collection technique: observational study, experiment, or survey.
5. Collect the data.
6. Use descriptive statistics methods and make decisions using appropriate inferential statistics methods.
7. Note any concerns you have about your data collection and recommendations for future studies.
Page 22 Gathering Data In an observational study, observations and measurements of individuals are conducted in a way that doesn’t change the response or variable being measured.
Gathering Data In an experiment, a treatment is deliberately imposed on the individuals in order to observe a possible change in the response or variable being measured. Statistical experiments are commonly used to determine the effect of a treatment. The design needs to control for other possible causes of the effect. The Placebo Effect occurs when a subject receives no treatment but (incorrectly) believes he or she is in fact receiving the treatment and responds favorably.
Experiments cont… A completely randomized experiment is one in which a random process is used to assign each individual to one of the treatments; this accounts for the placebo effect. Randomly divide individuals into two groups: treatment and control. The treatment group receives the treatment. The control group receives a dummy or placebo treatment that is disguised to look like the real treatment. After the treatment cycle, compare the groups.
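The random assignment step can be sketched as below, assuming 20 hypothetical subjects labeled S01 to S20 and two equal-sized groups. Shuffling the list and splitting it in half is one simple random process; it is not the only valid one.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical subjects in the experiment.
subjects = [f"S{i:02d}" for i in range(1, 21)]

# Random process: shuffle a copy, then split it into two groups.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment = shuffled[:half]  # receives the real treatment
control = shuffled[half:]    # receives the placebo

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```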
Page 24 Experiments cont… It is difficult to control all the variables that might influence the response to a treatment. One way to control some of the variables is through blocking. A block is a group of individuals sharing some common features that might affect the treatment. In a randomized block experiment, individuals are first sorted into blocks, and then a random process is used to assign each individual in the block to one of the treatments.
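A minimal sketch of a randomized block assignment, assuming 20 hypothetical subjects and age group as the blocking feature (both are illustrative assumptions). Individuals are sorted into blocks first, and the random assignment then happens separately inside each block.

```python
import random

rng = random.Random(9)  # fixed seed so the sketch is repeatable

# Hypothetical subjects, each with a feature that might affect the treatment.
subjects = [
    (f"S{i:02d}", "under 40" if i % 2 else "40 and over")
    for i in range(1, 21)
]

# Step 1: sort individuals into blocks by the shared feature.
blocks = {}
for name, age_group in subjects:
    blocks.setdefault(age_group, []).append(name)

# Step 2: within each block, randomly assign individuals to a treatment.
assignment = {}
for age_group, members in blocks.items():
    shuffled = members[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    for name in shuffled[:half]:
        assignment[name] = "treatment"
    for name in shuffled[half:]:
        assignment[name] = "control"

for age_group, members in blocks.items():
    print(age_group, {name: assignment[name] for name in members})
```

Because the split is made inside each block, both groups end up with the same number of individuals from each age group.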
Experiments cont… Features of a good experimental design: There is a control group. Randomization is used to assign individuals to the two treatment groups (this helps to prevent bias in selecting members for each group). Replication of the experiment on many patients/individuals reduces the possibility that the results are due to chance. A double-blind experiment is one in which neither the individual nor the observer knows which group the individual is in, which helps control subtle bias that the observer may pass on to the individual.
Gathering Data A survey involves asking the individuals questions.