1. Inferential Statistics: include one or more of the inferential statistical procedures that you learned about in this module (that is, a t-test or ANOVA)
For this module of your SLP, you can use an article you used in a previous module's SLP paper, or choose a different one. The article must be no more than 5 years old, and it must include one or more of the inferential statistical procedures that you learned about in this module (that is, a t-test or ANOVA).

Begin by providing the reference for the article, in proper format. Write an introductory paragraph that includes a reminder of what your topic is. Introduce and briefly describe the study in one paragraph. Then identify the following:

– Null and alternative hypothesis
– Sampling procedures
– Independent and dependent variable(s)
– Alpha level
– Outcome (significant results, or fail to reject the null hypothesis)

What 2 questions would you like to ask the researcher about the results? If you were designing your own study about this topic, what would your independent variable be? What would your dependent variable be? What would you expect to find? For example: males will be more likely to exercise than females …something related to your own topic.

ASSIGNMENT
EXPECTATIONS: Please read before completing assignments.

– Copy the actual assignment from this page onto the cover page of your paper (do this for all papers in all courses).
– The assignment should be approximately 2 pages in length (double-spaced).
– Please use major sections corresponding to the major points of the assignment, and where appropriate use sub-sections (with headings).
– Remember to write in a scientific manner (try to avoid using the first person except when describing a relevant personal experience).
– Quoted material should not exceed 10% of the total paper (since the focus of these assignments is on independent thinking and critical analysis). Use your own words and build on the ideas of others.
– When material is copied verbatim from external sources, it MUST be properly cited. This means that material copied verbatim must be enclosed in quotes and the reference should be cited either within the text or with a footnote.
– Use of peer-reviewed articles is required. Use credible professional sources (for example, government agencies, nonprofit organizations, academic institutions, scholarly journals). Wikipedia is not acceptable.

WHAT IS STATISTICAL INFERENCE?

What is a FACT? It is something we
know through the direct evidence of our senses. When we do basically descriptive statistics,
we are directly counting and measuring. It is a fact that the mean height of a particular class of sixth grade boys is (let's say) 4'11" and it's a fact that the standard deviation of the heights in that class is 2.5 inches. We can verify this directly.

How does an INFERENCE differ from a fact? In making an inference we assume that something is true; we draw a conclusion based on evidence or signs that it has occurred or is true, not on a direct observation.

To differentiate with an example from "real life": I arrive at one of my live,
real-time Introductory Psychology classes looking very grim. I am carrying a stack of tests
with me. What kinds of conclusions are the students going to draw? Now, can they be absolutely sure about their conclusions? How about an alternative hypothesis? For example, on my way to school I rear-ended someone, ruined my little car and my insurance rates. Or maybe I just had a big fight with the man I am dating. It could be that the tests were very poor, or it could be that I am having a hard time in some other aspect of my life. There is no way to tell until we move from the inferential to the descriptive. And of course, I may not actually tell my students what is going on, if it isn't the test that is bumming me out. So, how can they really know for certain?

Inductive reasoning involves putting our evidence
together to support a claim about a phenomenon. Inductive reasoning is evaluated in terms
of its strength. One can provide a strong inductive argument and still be wrong (this is not
true of what we call deductive reasoning, but that is a story for a different day).

STATISTICAL INFERENCE involves using procedures based on samples, or bits of information, to make statements about some broader set of circumstances.

So in induction and statistical inference, we use little bits of evidence, information and clues to make statements about phenomena. We reason from the specific to the general; however, all conclusions we arrive at are potentially wrong. Therefore, we refer to them as CONDITIONAL CONCLUSIONS. We can't say that the conclusions are absolutely correct.

CONDITIONAL CONCLUSION: Any conclusion based on induction cannot be stated as being absolutely correct. There is always a probability that an inference is wrong.

What
we have been working up to over the past 6 weeks is the essential problem: We generally do research on some aspect of a population. What is a population? It's all those individuals, observations or measurements that share a common characteristic (which is presumably of interest to us).

We want to be able to generalize about some feature in a population, but we are unable to study that population in its entirety. How can we use the little bit of data we are able to gather to make an educated guess about this aspect of the population?

That's
a general definition that fits most instances. However, in the health and behavioral sciences we often deal with a hypothetical version of the population known as a TREATMENT POPULATION. I am calling it hypothetical because it may only exist while the experiment is being conducted.

The research we do is an attempt to describe the treatment population as if it did exist – as if our newly developed antidepressant were being given to all the depressed people in the larger population. It is very difficult to study a population exhaustively, so we depend on our sample to make inferences concerning the important characteristics, which we call PARAMETERS.

TREATMENT POPULATION: A hypothetical population that experimenters create for research purposes. The parameters of the treatment population are estimated using the results obtained from the subjects in the experimental group who are exposed to a specific independent (manipulated) variable.

We
are going to use the smaller pieces of information gleaned from our sample to make larger and more complicated inferences and inductions. This is what inferential statistics enables us to do.

INFERENTIAL PROCEDURES

The questions that we ask using inferential procedures are:

1. Does the behavior or feature we see in the sample represent the behavior of the population as a whole?
2. Is it legitimate for us to use the data from the sample to claim or conclude that there is a real, significant difference between two or more experimental conditions?

These questions represent two different types of inferential procedures, known as:

– Estimation of population parameters (point estimation and confidence intervals)
– Hypothesis testing

In this module we will emphasize hypothesis testing, as many health research studies of interest involve the use of statistical procedures to test hypotheses.

PART II

STATISTICAL HYPOTHESIS TESTING

To contrast with point/interval
estimation, in hypothesis testing we are not trying to directly estimate a parameter of the
population; rather, we are comparing our sample statistic with a known or
hypothesized population parameter. We want to see if there is any significant difference
between our sample result and what is known about the population in general.
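As a concrete sketch of this comparison, the Python snippet below uses invented numbers (the sample mean, population mean, standard deviation, and sample size are not from any study in this module) to compute how far a sample mean falls from a known population mean. It assumes the population standard deviation is known, which makes it a z-test rather than the t-tests covered later:

```python
import math

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Compare a sample mean against a known population mean.

    Returns the z statistic and the two-sided p-value, assuming the
    population standard deviation is known (a z-test, not a t-test).
    """
    se = pop_sd / math.sqrt(n)          # standard error of the mean
    z = (sample_mean - pop_mean) / se   # how many SEs the sample sits from mu
    # Two-sided p-value from the standard normal CDF (via the error function)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical numbers: a sample of 36 people with mean 103, against a
# known population mean of 100 and population standard deviation of 12.
z, p = one_sample_z_test(103, 100, 12, 36)
print(round(z, 2), round(p, 4))  # 1.5 0.1336
```

Here the sample sits 1.5 standard errors above the population mean, and a difference at least that large would arise by chance about 13% of the time, so it would not count as significant at the usual cutoffs discussed below.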
SIGNIFICANT does not mean "important" in quantitative reasoning (not necessarily, at any rate). Rather, it means "unlikely to be due to mere chance." There are two general types of hypothesis testing procedures:

– A result is compared to a known population average. An example would be the rate of cancer in people exposed to a certain hazardous chemical in their line of work compared to the national average.
– Samples from two or more treatment populations are compared: different dosage levels of a drug, or a drug, a non-pharmacological intervention, and no treatment.

In quantitative reasoning procedures, HYPOTHESIS TESTING is the comparison of sample results with some known or hypothesized population parameters.

TYPES OF HYPOTHESES

We all do hypothesis testing
every day of our lives as we navigate through various situations where we have to make
educated guesses about what is going on with people and situations at work and
socially/interpersonally. Earlier I used the example of my introductory psychology students
trying to spin hypotheses about why I was acting like such a grouch. The hypothesis testing that we do in research should be somewhat more rigorous and organized.

First, we make a general statement about the relationship between the independent and dependent variables, or the magnitude of the observation (the size of the effect of interest). This is our conceptual hypothesis.

Example: I am doing research to find out whether doing 1 hour of moderate aerobic exercise 4 times a week, as opposed to no regular aerobic exercise, is related to longer lifespan among women over the age of 65.

What is my independent variable? Exercise or no exercise. What is my dependent variable? Years of life.

Try this one: You are doing a study of the effects of a vitamin supplement on the energy levels of men who are recovering from heart bypass surgery. What is your independent variable? What is your dependent variable?

We use our conceptual hypothesis as the basis for a statistical hypothesis. This is a mathematical statement that can be shown to be supported or not supported through statistical procedures. In my study, there should be a difference in the mean lifespan of the women who exercise and the women who don't. Specifically, the women who do exercise should have a higher mean lifespan than those who do not. What is the statistical hypothesis in your study (the vitamin supplement study)?

It is customary for researchers to
state the hypothesis first in terms of NO SIGNIFICANT RESULTS. It is a way of keeping one's hopes for exciting results in check so that experimenter expectations don't unduly influence the results. This is called the NULL hypothesis. Here are links to visit to learn more about the NULL hypothesis:

Internet Glossary of Statistical Terms (2002). Retrieved Jan 1, 2012 from http://www.animatedsoftware.com/statglos/sgnullhy.htm
Null Hypothesis. Retrieved Jan 1, 2012 from http://davidmlane.com/hyperstat/A29337.html

To put this into statistical language, the null hypothesis is that the mean of the treatment group has approximately the same value as the mean of the control group. This is how it looks in statistical notation:

μ1 = μ2

In my study: "There is no difference (increase) in lifespan of women who do aerobic exercise on a regular basis compared to those who do not." What is
the NULL hypothesis in your study?

The result that represents a significant difference is called the ALTERNATIVE hypothesis. It can be written a few different ways, depending on whether there is any evidence that the difference may be in a particular direction (more or less):

μ1 < μ2; μ1 > μ2; μ1 ≠ μ2

There are two types of alternative statistical hypotheses.

Directional hypothesis: The direction of the difference between the two populations is explicitly stated. This is an alternative hypothesis that states the direction in which the population parameter in the experimental group differs from that in the control group. Usually we are interested in a direction for our significant difference. In my study, the alternative hypothesis is "Women who do aerobic exercise regularly will live longer than women who do not."

When we do not state the alternative hypothesis in terms of a specific direction for the effect or the difference, it is called a NON-DIRECTIONAL hypothesis. Think about it: if I just said "There is a difference in lifespan between women who exercise regularly and women who do not," it would seem to be quite significant if I found that the mean lifespan in the exercising group was fewer years. But this would not be a desirable outcome, given that I probably want to show that exercise helps people live longer.

What should you do with the vitamin study? Should you state your hypothesis in a directional or non-directional manner? Which would be the more meaningful alternative hypothesis for your purposes?

This is important to note: We MAY in fact find differences between the population means in examining our competing (null versus alternative) hypotheses.
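The practical consequence of the directional choice can be seen numerically. In this sketch the test statistic z = 1.80 is an invented value, not taken from any study mentioned here; the point is only how the same result fares under a one-tailed versus a two-tailed alternative:

```python
import math

def normal_cdf(x):
    # Standard normal cumulative distribution via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = 1.80  # invented test statistic, in the predicted direction

p_one_tailed = 1 - normal_cdf(z)             # directional: mu1 > mu2
p_two_tailed = 2 * (1 - normal_cdf(abs(z)))  # non-directional: mu1 != mu2

print(round(p_one_tailed, 3), round(p_two_tailed, 3))  # 0.036 0.072
```

At an alpha of .05, the same result is "significant" under the directional alternative but not under the non-directional one, which is exactly why the directional choice must be justified in advance rather than after looking at the data.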
However, one of the functions of the statistical tests will be to hold us to a set size of difference between the means of the two (or more) groups we are comparing. If the difference is not SIGNIFICANT according to the requirements of our procedures, we are compelled to assume that those differences are due to RANDOM SAMPLING ERROR.

It is also the custom that we set up our null and alternative hypotheses BEFORE we collect and analyze our data; this is termed "a priori". It is part of the ethics of doing research. So one needs to determine what the null and alternative hypotheses will be (conceptually and statistically) and then determine whether the alternative will be stated directionally or non-directionally. Whether one chooses a directional or non-directional expression will depend on the following:

– Is there strong evidence, before beginning, that the difference will be in a particular (positive or negative) direction? Just hoping it will be in a particular direction doesn't mean one should use a directional alternative hypothesis.
– If all that can be reasonably suspected is that the two population means will be different, it is more prudent to use a non-directional alternative hypothesis. It is tougher to obtain significant results under these conditions. This makes for more reliable results.

WHEN DO WE REJECT THE
NULL HYPOTHESIS?

Every time you and your friend get together and work on your latest exciting module, you go out for dinner together afterwards and discuss all the fascinating things that you have learned. You decide to make it your custom to flip a coin to see who will pick up the check. Your friend always provides the coin.

The first day your friend wins. But you are not especially paranoid, so you don't suspect anything yet. But then he/she wins three more times in a row. Your nerves and your budget are starting to suffer. And you know the odds of this happening are fairly low (the probability is .5 × .5 × .5 × .5 = .0625).
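The arithmetic in that parenthesis is just repeated multiplication of independent probabilities, which a few lines of Python make explicit:

```python
# Each flip is independent, so the chance of your friend winning k flips
# in a row with a fair coin is 0.5 multiplied by itself k times.
def p_run(k, p_win=0.5):
    return p_win ** k

print(p_run(4))   # 0.0625, the four straight losses above
print(p_run(10))  # 0.0009765625, i.e. about .1%
```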
And it continues for an additional six times, so the odds are becoming, as we say, astronomical that your friend is getting treated to a free dinner every week just because of random forces in the universe. What are you starting to suspect? (Your suspicion that your friend is supplying a loaded coin is your ALTERNATIVE HYPOTHESIS. What is your NULL?)

Here is a decision table for the possibilities you face (reality in columns, your decision in rows):

                                 REALITY:                   REALITY:
YOUR DECISION                    FRIEND IS CHEATING         FRIEND IS NOT CHEATING
You think friend is cheating     You are correct*           You get a black eye
                                 (statistical power)        ("alpha error")
You think friend is not          You are a sucker           You are correct**
cheating                         ("beta error")

*In this case, you will probably get a black eye also, but you will be ahead again financially.
**In this case, your friend is an expensive date.

So what is our null hypothesis? That the coin you guys are using is just fine, that no cheating is going on, that the coin is not loaded or two-headed. The 10 wins in a row that your friend has pulled off, the subsidized dinners, have been purely a matter of chance.

What is our alternative hypothesis? That the 10 wins are not a matter of chance; that your friend is an operator and is providing a coin that helps him/her be assured of winning the flip and getting free food. (The
probability of 10 wins in a row, according to my calculations, is .1%.)

Even so, you just can't KNOW the true proportion of correct guesses from your friend. You could start flipping the
coin in question and have your friend guess long into the night and never exhaust the
population of possible outcomes. There is an unlimited population of coin toss predictions.
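Even though the population of coin-toss outcomes can never be exhausted, it can be sampled. A small Monte Carlo sketch (the trial count and random seed are arbitrary choices for illustration) estimates how often a genuinely fair coin would hand your friend ten wins in a row:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def estimate_ten_in_a_row(trials=200_000):
    """Fraction of simulated evenings where a FAIR coin wins all 10 flips for your friend."""
    hits = 0
    for _ in range(trials):
        if all(random.random() < 0.5 for _ in range(10)):
            hits += 1
    return hits / trials

print(estimate_ten_in_a_row())  # close to the exact value 0.5**10 = 0.0009765625
```

The simulated rate hovers around one in a thousand, matching the exact calculation above without ever needing the full, infinite population of outcomes.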
But we can still do hypothesis testing even with these unknown values of our population
parameters.

Again, the wording of the null hypothesis is a way of using language to enforce a
certain scientific discipline. Everyone likes to make cool discoveries; the problem is when
we are looking for certain results we tend to find them (the problem of wishful thinking). By
constructing a null hypothesis, we encourage ourselves not to be too invested in our expectations. We also don't use the term "proven." We reject our null hypothesis if the results are significant. We "fail to reject" it if they are not.

If our results are not significant (that is, we fail to reject the null hypothesis), we have not closed the door on our hypothesis either. Failure to reject the null hypothesis DOES NOT automatically mean that the null hypothesis is correct. It may also mean that in this case we did not collect information sufficient to reject it. We expect in doing scientific work that we will have to repeat our procedures until we have a convincing and consistent set of results. If we do reject the null, we have not PROVED the alternative: our acceptance of the alternative hypothesis is conditional and provides the basis and justification for further research.

People do abuse the word significant. Any time research results are reported, in
academic or popular magazines, there is the potential for this abuse or misunderstanding of
statistical language. So when you read a statement such as "Researchers have found a significant relationship between eating alfalfa at every meal and weight loss," don't take it at face value. If the study is reported somewhere in full, check the tests and the significance levels set by the researchers. There is a big difference between significant results at an alpha level of .68 and a level of .01. The smaller the alpha level, the more meaningful the term "significant."

For the coin-flipping, freeloading friend, you set an alpha level of .001. You have reached this level of significance at the tenth toss. You reject the null hypothesis, take the black eye, and find someone else with whom to review quantitative reasoning over dinner.

THE TWO TYPES OF HYPOTHESIS TESTING ERROR

In science and in life, we are
always dealing with a large amount of annoying uncertainty. That is just how it goes. If we
waited to be certain in all situations before speaking or acting, we would never say or do
anything.

When we do our probabilistic testing of hypotheses, we can expect to make errors periodically. There are two types of errors that we make, and from which we hope to learn.

TYPE I OR ALPHA ERROR

We reject the null hypothesis when it is in reality true. (We accuse our friend of using a loaded coin when our friend was really playing fair.)

The alpha level not only sets the cutoff point at which we will reject the null hypothesis; it also sets the likelihood of committing this type of error. When we set an alpha level of .05, we are committing ourselves to a 5% chance of rejecting a true null hypothesis; at an alpha level of .01 it is 1%; at .001, it is .1%. The lower the alpha level, the less the chance of committing this type of error. But this certainty has a cost. The lower we set the alpha level, the greater the chance that we will commit a beta error and accept a false null hypothesis. We sacrifice what we call POWER in order to use a very low alpha level. This other competing type of error, which
we must try to balance against, is:

TYPE II OR BETA ERROR

We fail to reject the null hypothesis when it is in reality false. (We assume our friend is playing fair when in fact he or she is playing with a loaded coin.)

Beta is the probability of making this type of error. The implication of this type of error is that in our quest to be scrupulous about not accepting a chance or random effect as an actual systematic effect of interest, we ignore a possible effect of interest.

We need our tests and our significance levels to allow us a sufficient level of power while also carefully guarding against finding effects that are not there.
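The alpha/power trade-off can be made concrete with the coin example. Suppose the decision rule is "accuse your friend only after k wins in a row," and suppose (a made-up figure purely for illustration) that a loaded coin would win 80% of flips:

```python
# For the rule "reject the null after k straight wins":
#   alpha = P(k straight wins | fair coin)   = Type I error rate
#   power = P(k straight wins | loaded coin) = chance of catching the cheat
#   beta  = 1 - power                        = Type II error rate
P_LOADED = 0.8  # assumed win probability of the hypothetical loaded coin

for k in (4, 7, 10):
    alpha = 0.5 ** k
    power = P_LOADED ** k
    beta = 1 - power
    print(k, round(alpha, 4), round(power, 3), round(beta, 3))

# Raising k drives alpha down, but power falls with it and beta rises:
# the stricter we are about Type I error, the more Type II error we accept.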
Power is the potential for our research and tests to reject an actually false null hypothesis. If
we set our alpha levels so low that we have little or no chance of doing so, it is not a good
thing.

Think of alpha and beta as lying on a graph with a diagonal: as one goes up, the other goes down. The .05 significance level, in most cases, is regarded as the best compromise between alpha and beta errors, although significance of results at the .01 level is generally more highly prized in the world of research. Although this is usually a very good thing, let's say your research results show that the effect you were looking at would have occurred only 2% of the time by chance. At an alpha level of .01, you still fail to reject the null hypothesis even though there has been an effect from your treatment. You probably did not allow yourself sufficient power in setting the alpha level. You commit a beta error.

The alpha level, along with the number of subjects, is used in conjunction with statistical tables in order to set a "critical" value for our statistic ("t" for t-tests and "F" for ANOVA). If our testing procedure yields a t or F value higher than the critical value, we can reject the null. If our obtained t or F (obtained from the test) does not exceed the critical value, we FAIL TO REJECT the null (by custom we do not "accept" it, because nothing has been proven and a future research study may not replicate our non-significant results).

Because of the reciprocal nature of these types of error, we need to carefully consider the consequences of each before we set our alpha level.

Two Main Types of Research Errors

1. Random errors can be
minimized but cannot be avoided. For example, they may be related to sampling variability or measurement precision. Random errors can be determined and can be addressed using statistical analysis.

2. Systematic errors are also called bias. There are many causes of bias, including complex human factors. Because of this, systematic errors or bias must be considered when designing any research study in order to avoid false differences between observed and true values.

PART III

Main Categories of Research Bias

Selection biases, which
may result in the subjects in the sample not being representative of the population you
intend to study.

Measurement biases, which include issues related to how the outcome of interest was measured.

Intervention (exposure) biases, which involve differences in how the treatment or intervention was carried out, or how participants were exposed to the factor of interest.

Selection biases happen when two groups are compared but they are different in some way. Those differences may influence the outcome of the research. Examples include volunteer or referral bias and nonrespondent bias. By definition, nonequivalent group designs also introduce selection bias.

Volunteer or referral bias happens because people who volunteer to participate in a study (or who are referred to it) are often different from people who do not volunteer or are not referred to the study. This bias usually favors the treatment group, because volunteers tend to be more motivated and concerned about their health.

Non-respondent bias occurs when the people who do not respond to a survey differ in important ways from the people who do respond. This bias can work in either direction.

Measurement biases involve systematic error that can happen when researchers collect data. Some examples include instrument bias, insensitive measure bias, expectation bias, recall or memory bias, attention bias, and verification or work-up bias (UMDNJ, n.d.).

Instrument bias happens when calibration errors lead to inaccurate measurements being recorded. An example is an unbalanced scale being used to weigh people.

Insensitive measure bias happens when the measurement tool(s) used are not sensitive enough to detect what might be important differences in the variable of interest.

Expectation bias happens when masking or blinding is not carried out. This means the researchers know which group is the control and which is the intervention group. Observers may err in measuring data toward the expected outcome. This bias usually favors the intervention group.

Recall or memory bias can be a problem if the outcomes being measured require that subjects recall past events. People may recall positive events more than negative events. Some participants may be questioned differently or engaged in more conversation than others, which could improve their recollections more than others'.

Attention bias happens when people who know they are part of a study and are getting more attention give more favorable responses or perform better than people who are unaware of the study's intent.

Intervention (Exposure) Biases

Intervention or exposure biases include
contamination bias, co-intervention bias, timing bias(es), compliance bias, withdrawal bias,
and proficiency bias. This type of bias is most often associated with research that compares
groups.

Contamination bias happens when members of the 'control' group inadvertently receive the treatment or are exposed to the intervention. This can potentially minimize the difference in outcomes between the two groups.

Co-intervention bias occurs when some participants are receiving other interventions at the same time as the study treatment, but those other interventions are not accounted for.

Timing bias depends on the timing of the study. If an intervention is provided over a long period of time, maturation could be the cause of improvement. If treatment is of very short duration, there may not have been sufficient time for a noticeable effect.

Compliance bias occurs when participants differ in their levels of adherence to the planned intervention and this affects the study outcomes.

Withdrawal bias happens when people who drop out of the study differ significantly from people who continue to participate and complete the study.

Proficiency bias happens when the interventions or treatments are not applied equally to subjects. This may be due to skill or training differences among personnel and/or differences in resources or procedures used at different sites.

UMDNJ (n.d.). Major Sources of Bias in Research Studies. Retrieved from http://www.umdnj.edu/idsweb/shared/biases.htm

T-TESTS

A t-test allows
us to compare the means of two samples. The difference between the means needs to be large enough to achieve statistical significance.

Sometimes we are comparing the mean of a treatment group or other group of interest to an already well-established, known population mean (a one-sample t-test). Often we are comparing a treatment group to a control group, in which case we would use an independent two-sample t-test. Sometimes we are comparing the same group to itself – the classic "before and after" model. The two samples are the same sample observed or measured before and after the treatment. In this case we use a dependent or correlated t-test.

ANOVA (Analysis of Variance)

When we have
more than two means to compare, or different levels of treatment, a procedure called
Analysis of Variance is used. Analysis of variance uses a comparison of the amount of
dispersion between the groups to the amount of dispersion within the groups, instead of a
direct comparison of means, which is considered inappropriate with more than two groups.
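That between-versus-within comparison can be sketched directly. Here is a minimal hand computation of the F ratio on three invented groups (the scores are arbitrary, chosen only to make the arithmetic easy to follow):

```python
# A minimal one-way ANOVA sketch: the F ratio is the between-group
# mean square divided by the within-group mean square.
def one_way_anova_f(groups):
    k = len(groups)                     # number of groups
    n = sum(len(g) for g in groups)     # total observations
    grand = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    ms_between = ss_between / (k - 1)   # df between = k - 1
    ms_within = ss_within / (n - k)     # df within = n - k
    return ms_between / ms_within

groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]  # invented scores
print(round(one_way_anova_f(groups), 2))  # 27.0
```

The obtained F would then be compared against a critical F value, just as the discussion of critical values above describes for t and F statistics.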
The amount of variation between the groups is attributed to the treatment or change in
conditions between the groups; the amount of variation within the groups is attributed to
ERROR or random effects that would render our results meaningless. The ratio of between
group variation to within group variation needs to be large enough to achieve statistical
significance. If significant results are achieved, direct comparisons of the means can be done
using a variety of post-hoc or "after the fact" statistical tests.

REMEMBER WHAT IS IMPORTANT IN USING STATISTICAL TESTS: