2.
We have discussed validity related to tests and
other kinds of materials throughout your
program.
For our purposes, validity is related but means
a bit more.
Validity as a concept differs between internal
and external evidence.
In both clinical situations and published research
studies, there is an assumption that empirically
controlled measures are being taken.
Measures of patient preferences are still valid.
3.
Evidence from scientific research
Internal validity
extent to which empirical evidence provides a true or
accurate reflection of the patients, procedures, and
settings that were observed.
External validity: generalizability.
How much do the measures taken reflect how the
patient(s) behave in other settings?
How much do the findings of a study reflect the
population in general?
4.
Confounding/nuisance variables
Many! May just have to acknowledge these in the
results if you cannot control for them.
Subjective bias
Impossible to completely limit this.
Blinding, masking, or concealment
When evaluating a study, it is important to consider
whether opinions, expectations, or beliefs of
participants or observers could have influenced the
findings.
5.
The concept of whether the researchers or
participants are blind to the treatment.
Can be difficult in our types of treatments, but
perhaps the researcher could be blind to
characteristics about the participants who are
receiving treatments.
This should be assessed and thought about when
looking at treatment studies.
6.
Do the measures in the study provide a
valid reflection of performance?
Are there problems with norm referenced tests?
What are the issues with other types of measures in
the study?
In others’ research, is the outcome measure they use
to judge effectiveness appropriate?
7.
There are gold standard designs, such as the randomized
controlled trial (RCT).
Studies that randomize participants into treatment groups are
stronger designs than nonrandomized studies.
It is important to remember that no one study can tell us all we need
to know about a treatment.
The RCT is best for drawing causal inferences about average
treatment effects in groups of patients.
Maybe not the best for looking at the performance of
individuals, less typical types of settings and clients, and not
appropriate if looking at etiology or risk factors.
Is it appropriate to randomly assign participants to different
conditions or treatments?
The study's purpose, the nature of the investigation, and the
historical background should determine the type of research
design that is most appropriate.
8.
Experimental/Controlled vs. NonExperimental/Uncontrolled
Experimental studies are usually controlled, meaning that
they usually require a manipulation or some sort of control or
comparison group.
Experimental designs can be large groups, small
groups, or even single subjects.
Non-experimental studies generally are not as
controlled, but do have a systematic means of
gathering data and reporting results.
All things being equal, the more experimental and
controlled a study, the better the evidence.
9.
Prospective vs. retrospective
Prospective- the investigator plans the study, states a
hypothesis, identifies the kinds of participants and
procedures, and then gathers data. Generally,
experimental studies are prospective, and
nonexperimental studies can also be prospective.
Retrospective- Looking at data after a certain time
frame has passed.
Retrospective studies are ranked lower than
prospective studies.
10.
Nuisance variables: All studies should be
analyzed to look for factors aside from what is
being studied that could influence the results.
Statistical significance- Very important, but
statistics can often be deceiving.
11.
Subjective bias
Becoming very familiar with the client: observer drift.
Conflict of interest between the role of clinician and being
a neutral observer.
Can do a check by having a colleague (blinded observer)
view a sample of the participant, or gather data from
multiple sources.
Quality of measurement
Same concerns as with research.
Do the measurements make sense? Are they valid?
Are there multiple measurements?
What/who should be measured?
12.
Research designs
Single Subject Designs
Multiple Baseline Designs
Nuisance variables
We have discussed this a lot, and these variables
need to be observed, described and controlled to
some extent in our treatment.
13.
14.
15.
The researcher and the clinician can be one and the same,
but certainly we need to be careful.
From the book:
If a patient is being studied in the course of clinical practice
solely for the purpose of benefiting him or her, and if the
investigation does not impose added burdens on the patient,
and if the investigator does not impose added burdens on the
patient, and if the investigator does not intend to disseminate
the information (present or publish), then it is not research.
It does not mean that clinical endeavors are not research
based, in that there is a search for supporting evidence, a
design for gathering data, and some sort of sense of internal
and external validity.
16.
For research, we ask how applicable the results
of a study are to the population in general or to
a specific clinical case.
For our therapy, we ask how well the
behaviors being addressed in therapy carry
over to extra-clinical settings.
We must also consider to what extent the findings
can be replicated across studies, and also how
the client can replicate the behaviors in other
settings.
Editor's Notes
Alright, today we are going to cover the topic of validity of evidence, which corresponds with chapter 4 in the Dollaghan book.
So what is validity? Well, we’ve discussed validity related to tests and other kinds of materials throughout your program. As clinicians, we must be concerned with how likely it is that the test/tool/study accurately reflects what it is proposing to measure. This is one type of validity, but for our purposes, validity can mean a bit more.
We can gather evidence both internally and externally, as we will discuss. The concept of validity can shift or vary depending on whether you are considering internal or external evidence.
Regardless of whether we are gathering clinical data or conducting research, there is an assumption that the study is objective and controlled. This is one way we can assess the validity of clinical conclusions or research studies. Don’t forget that we can also take into account and measure our patients’ preferences as we conduct our studies, and that doing so doesn’t invalidate the results of clinical or research studies.
Let’s take a closer look at external evidence and issues of validity.
Let’s start by talking about external evidence. External evidence is considered to be evidence from scientific research, both basic and clinical.
For our purposes, there are two main types of validity: internal and external. Internal validity is the extent to which empirical evidence provides a true or accurate reflection of the patients, procedures, and settings that were observed. This is important for both clinical practice and research. To give you one example, we know that it is important to work on attitudinal and emotional reactions to stuttering with clients who stutter. So I may develop a therapy plan that addresses these types of goals very extensively. If I wanted to see how effective my client’s therapy has been in regard to developing more positive attitudes toward stuttering and communication, I would not use a test like the Stuttering Severity Instrument or percentage of words stuttered. Those measures capture the frequency and severity of stuttering, not attitudes, so there would be a mismatch between the types of things I was working on in therapy and the measure I used to assess progress.
On the other hand, external validity has to do with generalizability. That is, how much do the measures taken reflect how the patient(s) behave in other settings? How much do the findings of a study reflect the population in general? Let’s suppose that a study was conducted in which children with a phonological process were given experimental therapy at a university clinic. The clinicians are highly skilled and provide one-on-one therapy, each child receives three hours of therapy every day for three weeks, and the children are required to attend every session or they are excluded from the study. Now contrast the highly controlled setup of this study with how SLPs in schools usually are able to provide treatment. If you think about the nature of speech services in schools, you might be lucky to get 30 minutes with a child like this once a week, and the child may be in a group with several other children. You may not get to see the child that often due to absenteeism, special programming at the school, snow days, and a host of other issues. So how much external validity will this study have for the majority of children on your caseload who have this same phonological process? Probably not much!
Confounding or nuisance variables are just that—confounding and a nuisance. They may prevent you from being able to definitively state that your treatment caused the change in behavior that you measured. Some of these variables we can control for, but others we can’t. For example, if we were to conduct a research study on the effectiveness of a particular type of treatment for autism, we might be hard pressed to account for any therapies that the school was providing, or any alternative therapies that the parents were trying. But we could do our best to learn about these secondary treatments and account for them in our research findings.
Another issue in terms of internal validity is subjective bias. It’s no surprise that we tend to look for evidence that supports our personal point of view. We fail to notice what might be alternative points of view. It’s impossible to completely limit subjective bias, but we can try by doing things like blinding, masking, or concealment. It’s also important, when evaluating a study, to consider whether opinions, expectations, or beliefs of participants or observers could have influenced the findings.
Let’s talk more about one method of controlling subjective bias in research studies. Blinding refers to researchers not knowing whether a client is receiving a treatment, or participants not knowing whether they are receiving the treatment or not. Medical researchers do this all the time. They often do double-blind studies in which a placebo is administered to one group of patients but not another. Neither the patients nor the researchers are aware of who is taking the placebo and who is not. This is because if you as a researcher know that your client is receiving a specific drug, you may be more likely to record data and interpret results in a more positive manner than may be warranted.
In our field of study, it’s really difficult to conduct a blind study, in which the participants don’t know whether they are receiving treatment or not. But perhaps the researcher could be blind to characteristics of the participants who are receiving treatments. For example, if you are giving an experimental treatment to kids with ADHD and autism compared to just kids with autism, the researcher may not need to know which kids have ADHD.
And of course we want to be concerned with the quality of our measurements. Measurements help tell us if our treatment is effective or efficacious. So we have to ask ourselves if the measures used in the study provide a valid reflection of performance. We also need to know if there are problems with norm referenced tests we might use, or, in the case of studies we may be reading from other researchers, we need to know if the measure itself is appropriate as an outcome measure.
When we think about research designs and which ones we should use, we need to be aware that there are gold standard designs, such as the randomized controlled trial (RCT). While studies that randomize participants into treatment groups are stronger designs than nonrandomized studies, it is important to remember that no one study can tell us all we need to know about a treatment.
The RCT is best for drawing causal inferences about average treatment effects in groups of patients.
As such, it may not be the best design for looking at the performance of individuals or at less typical types of settings and clients, and it is not appropriate for looking at etiology or risk factors.
The study's purpose, the nature of the investigation, and the historical background should determine the type of research design that is most appropriate, so we need to ask ourselves if it is appropriate to randomly assign participants to different conditions or treatments.
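To make the idea of random assignment concrete, here is a minimal sketch in Python. It is only an illustration, not a procedure from the book: the function name randomize, the group labels, the participant IDs, and the seed are all assumptions made up for this example. Real trials typically use more careful allocation schemes and conceal the allocation from the people enrolling participants.

```python
import random


def randomize(participants, groups=("treatment", "control"), seed=None):
    """Shuffle participants, then deal them into groups in turn.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a fixed seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible.
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Ten made-up participant IDs: P01 ... P10.
participants = [f"P{n:02d}" for n in range(1, 11)]
assignment = randomize(participants, seed=42)

for group, members in assignment.items():
    print(group, sorted(members))
```

Because chance alone decides who lands in which group, the clinician's expectations cannot steer particular clients toward the treatment, which is what makes the comparison between groups fair on average.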
You may be wondering about the different types of research in terms of experimental or controlled designs versus non-experimental or uncontrolled designs. Experimental studies are usually controlled, meaning that they usually require a manipulation or some sort of control or comparison group. Experimental designs can be large groups, small groups, or even single subjects.
On the other hand, non-experimental studies are generally not as controlled, but do have a systematic means of gathering data and reporting results. All things being equal, the more experimental and controlled a study, the better the evidence. That said, the best design still depends on a number of issues, such as the type of question being asked and the treatment being studied.
We can also think about whether to conduct a prospective study, in which the investigator plans the study, states a hypothesis, identifies the kinds of participants and procedures, and then gathers data, or a retrospective study in which researchers look at the data after a certain time frame has passed. Because they have less control, retrospective studies are ranked lower on the evidence totem pole than are prospective studies.
As we design our own studies or evaluate those of others, we need to keep an eye out for nuisance variables, or other factors that could have caused the change in the outcome we are measuring. We also need to be aware of statistics and how they can be deceiving. We will revisit statistical significance and meaningful treatment effects in a future lecture.
So having talked about external evidence related to research studies, we can now talk about evidence that is specific to clinical practice, called internal evidence. We still need to be concerned about subjective bias, especially observer drift that comes from becoming very familiar with the client. An example of this is that after you have worked with a client with an articulation disorder for some time, you may become attuned to his/her speech, and it starts sounding more and more normal to you. Your measurements may become more and more positive, even if the client’s articulation isn’t really getting better. There can sometimes also be a conflict of interest between the role of clinician and being a neutral observer. To combat these things, we can ask a colleague (considered a blinded observer) to view a sample of the participant, or gather data from multiple sources.
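One simple way to run that colleague check is point-by-point agreement: both of you score the same sample, and you count how often your judgments match. The Python sketch below assumes hypothetical scores for ten productions (1 = judged correct, 0 = judged incorrect); the function name and the data are invented for illustration, and no particular cutoff for acceptable agreement is implied.

```python
def percent_agreement(clinician, observer):
    """Point-by-point agreement between two scorers, as a percentage."""
    if not clinician or len(clinician) != len(observer):
        raise ValueError("Both scorers must rate the same non-empty sample.")
    matches = sum(a == b for a, b in zip(clinician, observer))
    return 100.0 * matches / len(clinician)


# Hypothetical judgments of the same ten productions (1 = correct, 0 = incorrect).
clinician_scores = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
observer_scores = [1, 0, 1, 0, 1, 0, 1, 1, 0, 0]

agreement = percent_agreement(clinician_scores, observer_scores)
clinician_accuracy = 100.0 * sum(clinician_scores) / len(clinician_scores)
observer_accuracy = 100.0 * sum(observer_scores) / len(observer_scores)

print(f"Agreement: {agreement:.0f}%")
print(f"Clinician scored {clinician_accuracy:.0f}% correct; "
      f"blinded observer scored {observer_accuracy:.0f}% correct")
```

In this made-up sample the clinician credits the client with more correct productions than the blinded colleague does, which is the pattern you would expect to see if observer drift were creeping into the clinician's data.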
In terms of quality of measurement, we have the same concerns as with research. Do the measurements make sense? Are they valid? Are there multiple measurements? What/who should be measured?
Some research designs that can be easily practiced by clinical researchers are the single subject design and multiple baseline designs. You covered both of these designs in your summer research course, but I’ll go over the basics again here as a refresher on the next slides.
So here is an example of perhaps the simplest kind of single subject design, the ABA withdrawal design. Remember that with Single Subject Designs you are attempting to demonstrate a functional relationship between the target behavior and intervention by withdrawing and then reintroducing the treatment. Withdrawal serves as a form of control. If treatment was powerful enough to alter the participant’s behavior during the treatment phase, then removal (withdrawal) of the treatment should result in an extinction of the newly acquired behavior. Keep in mind that an ABAB design is stronger because it allows you to demonstrate one more time that the behavior is affected by treatment.
Single-subject designs focus on relatively large treatment effects for individual subjects rather than small differences between groups of subjects.
In other words, you can often simply look at a graph and determine whether the client has made progress with that treatment protocol.
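Alongside the graph, it can help to summarize each phase with a simple number. The Python sketch below uses hypothetical percent-correct scores for an ABA withdrawal design; the phase labels and the data are assumptions invented for illustration, and comparing phase means is just one rough way to describe the pattern, not a substitute for inspecting the graph.

```python
from statistics import mean

# Hypothetical percent-correct scores, four sessions per phase.
sessions = {
    "A1 (baseline)": [20, 25, 20, 15],
    "B (treatment)": [40, 55, 70, 75],
    "A2 (withdrawal)": [60, 45, 35, 20],
}

phase_means = {phase: mean(scores) for phase, scores in sessions.items()}

for phase, value in phase_means.items():
    print(f"{phase}: mean {value}% correct")

# A rise from A1 to B, followed by a drop from B to A2, is the pattern
# that suggests the behavior is tracking the treatment.
rose_with_treatment = phase_means["B (treatment)"] > phase_means["A1 (baseline)"]
fell_on_withdrawal = phase_means["A2 (withdrawal)"] < phase_means["B (treatment)"]
print("Pattern consistent with a treatment effect:",
      rose_with_treatment and fell_on_withdrawal)
```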
In Multiple baseline designs, as you see here, more than one potential treatment target is monitored during the baseline phase. In this picture, the treatment targets are /p/ and /s/. After the baseline phase, one target is treated while the other remains in baseline. On the assumption that maturation or recovery should affect both targets equally, finding that progress occurs on the treated target but not on the control target, in this case /s/, can provide credible evidence that the treatment was effective.
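The same logic can be sketched with numbers. In the Python example below, /p/ is the treated target and /s/ stays in baseline as the control target; all of the percent-correct scores, the three-session baseline, and the helper name phase_change are hypothetical values chosen for illustration.

```python
from statistics import mean

BASELINE_SESSIONS = 3  # assumed: the first three sessions are baseline for both targets

# Hypothetical percent-correct scores across six sessions.
accuracy = {
    "/p/ (treated)": [10, 15, 5, 40, 60, 80],
    "/s/ (control)": [10, 15, 5, 15, 10, 20],
}


def phase_change(scores, n_baseline):
    """Mean of the later sessions minus the mean of the baseline sessions."""
    return mean(scores[n_baseline:]) - mean(scores[:n_baseline])


changes = {target: phase_change(scores, BASELINE_SESSIONS)
           for target, scores in accuracy.items()}

for target, change in changes.items():
    print(f"{target}: change of {change} percentage points")
```

With these made-up numbers the treated target improves by 50 points while the control target moves by only 5, which is the kind of contrast that makes maturation or recovery an unlikely explanation for the gain.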
We also need to be careful in who conducts studies with clinical populations. Should it be a relatively neutral researcher, or the more heavily involved clinician? The person conducting the research study could be either, but we do need to be careful.
As your author says,
If a patient is being studied in the course of clinical practice solely for the purpose of benefiting him or her, and if the investigation does not impose added burdens on the patient, and if the investigator does not impose added burdens on the patient, and if the investigator does not intend to disseminate the information (present or publish), then it is not research.
This doesn’t mean that clinical endeavors are not research based, in that there is a search for supporting evidence, a design for gathering data, and some sort of sense of internal and external validity.
For research, we ask how applicable the results of a study are to the population in general or to a specific clinical case.
For our therapy, we ask how well the behaviors being addressed in therapy carry over to extra-clinical settings.
We must also consider to what extent the findings can be replicated across studies, and also how the client can replicate the behaviors in other settings.