Evaluation
19(3) 321–332
© The Author(s) 2013
Reprints and permissions:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1356389013497081
evi.sagepub.com
Validity and generalization in
future case study evaluations
Robert K. Yin
COSMOS Corporation, USA
Abstract
Validity and generalization continue to be challenging aspects
in designing and conducting case
study evaluations, especially when the number of cases being
studied is highly limited (even limited
to a single case). To address the challenge, this article
highlights current knowledge regarding
the use of: (1) rival explanations, triangulation, and logic
models in strengthening validity, and
(2) analytic generalization and the role of theory in seeking to
generalize from case studies.
To ground the discussion, the article cites specific practices and
examples from the existing
literature as well as from the six preceding articles assembled in
this special issue. Throughout,
the article emphasizes that current knowledge may still be
regarded as at an early stage of development, leaving room for more learning. The
article concludes by pointing to three
topics worthy of future methodological inquiry, including: (1)
examining the connection between
the way that initial evaluation questions are posed and the
selection of the appropriate evaluation
method in an ensuing evaluation, (2) the importance of
operationally defining the ‘complexity’ of
an intervention, and (3) raising awareness about case study
evaluation methods more generally.
Keywords
analytic generalization, initial evaluation questions,
intervention complexity, logic models, rival
explanations, role of theory, triangulation
Introduction
The classic case study consists of an in-depth inquiry into a
specific and complex phenomenon
(the ‘case’), set within its real-world context. To arrive at a
sound understanding of the case, a
case study should not be limited to the case in isolation but
should examine the likely interaction
between the case and its context. Technically, such an objective
adds to a common problem,
whether doing case study research (Yin, 2014) or case study
evaluation (Yin and Ridde, 2012):
the number of datapoints (each case being a single datapoint)
will be far outstripped by the num-
ber of variables under study − because of the complexity of the
case as well as the embracing of
Corresponding author:
Robert K. Yin, COSMOS Corporation, 3 Bethesda Metro
Center, Suite 700, Bethesda, MD 20814, USA.
Email: [email protected]
the contextual conditions. This situation is nearly impossible to
remedy, even if a modest number
of cases is included as part of the same (multiple-) case study.
As a result, the usual analytic
techniques based on having a large number of datapoints and a
small number of variables
(thereby permitting estimates of means and variances) are likely
to be irrelevant in doing case
study research.
For evaluations, the ability to address the complexity and
contextual conditions nevertheless
establishes case study methods as a viable alternative among the
other methodological choices,
such as survey, experimental, or economic research
(Stufflebeam and Shinkfield, 2007). The con-
ditions appear especially relevant in efforts to evaluate highly
broad and complex initiatives; for
example, systems reforms, service delivery integration,
community and economic development
projects, and international development (e.g. Yin and Davis,
2007). At the same time, doing case
study evaluations with acceptable and rigorous procedures must
rely on a state-of-the-art still in its
formative stages.
The May 2012 workshop on ‘Validity, Generalization, and
Learning’ provided an opportunity
for a variety of scholars to share their working knowledge and
to advance the state-of-the-art. Six
of the presentations became the other articles contained in this
journal issue.1 Together, the six
assembled articles form a basis for briefly reviewing the key
practices regarding validity and gen-
eralization when doing case study evaluations. The present
article tries to reinforce and also to
elaborate upon the six articles. The goal is to stimulate yet
newer contributions on all these impor-
tant methodological practices. Only in this manner will case
study evaluations continue to get
stronger. The article is organized according to a slight
adaptation of the main themes of the original
workshop: Strengthening validity; Seeking to generalize; and
Still more learning.
Strengthening validity
Case study evaluations may limit themselves to descriptive or
even exploratory objectives.
However, the greatest challenge arises when case study
evaluations fill an explanatory role. This
means: (a) documenting (and interpreting) a set of outcomes,
and then (b) trying to explain how
those outcomes came about. When adopting such an explanatory
objective, a case study evaluation
will in effect be examining causal relationships. The evaluation
thus squarely confronts issues of
internal validity.2 In addressing these issues, the small number of
cases in a case study − frequently only a single case − precludes the use of conventional
experimental designs. These
require the availability of a sufficiently large number of cases
that can in turn be divided into two
(or more) comparison groups. Instead, case study evaluations
must rely on other techniques.
One evaluative approach has been to conduct and document
direct observations of the events
and actions as they actually occur in a local setting as a critical
part of a case study’s data collection
(e.g. Erickson, 2012: 688; Maxwell, 2004, 2012; Miles and
Huberman, 1994: 132). The inquiry
can highlight the contextual role of the local settings and
accommodate if not feature the non-linear
and recursive flows of events (‘feedback loops that occur at
irregular times’ − Betts, 2013: 255) as
well as the possibility of entertaining multiple causes, both
proximal and distal. However, the ensu-
ing analysis remains highly qualitative and may not be very
convincing.
To improve on the precision of such an approach and to boost
confidence in the findings, two
of the six assembled articles (Befani, 2013; Byrne, 2013) offer
insights into a technique known
as qualitative comparative analysis (QCA), developed by
Charles Ragin (1987, 2000, 2009).
This technique captures within-case patterns or configurations
(Byrne, 2013: 224), consisting of
the combination of intervention and outcome conditions for
each particular case being studied.
The cross-case analysis then becomes the systematic
comparison of these within-case configurations or sets of intervention-outcome conditions.3
When a sufficiently large number of
cases is available, QCA can be ‘strong in testing, refining, and
validating findings’ (Befani,
2013: 280). As examples, Befani’s article discusses two
illustrations, having 17 and 11 cases,
respectively.
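To make the within-case/cross-case logic of QCA concrete, the minimal crisp-set sketch below builds a truth table from hypothetical cases. The condition names, cases, and outcomes are invented for illustration, and a real analysis would use dedicated QCA software (such as the R 'QCA' package) rather than toy code like this.

```python
# Minimal crisp-set QCA sketch. Cases, condition names, and outcomes
# here are hypothetical; this only illustrates the idea of comparing
# within-case configurations across cases.

# Each case is a within-case configuration: a set of binary
# intervention conditions plus the observed outcome.
cases = [
    {"training": 1, "funding": 1, "local_support": 1, "outcome": 1},
    {"training": 1, "funding": 0, "local_support": 1, "outcome": 1},
    {"training": 0, "funding": 1, "local_support": 0, "outcome": 0},
    {"training": 1, "funding": 1, "local_support": 0, "outcome": 0},
]

conditions = ["training", "funding", "local_support"]

def truth_table(cases):
    """Group cases by configuration and record which outcomes occur."""
    key = lambda c: tuple(c[k] for k in conditions)
    table = {}
    for case in cases:
        table.setdefault(key(case), set()).add(case["outcome"])
    return table

# Configurations consistently linked to a positive outcome across cases:
consistent = [cfg for cfg, outs in truth_table(cases).items() if outs == {1}]
print(consistent)  # [(1, 1, 1), (1, 0, 1)]
```

The cross-case step is simply the comparison of these configuration keys; with more cases, the same table would also reveal contradictory configurations (outcomes of both 0 and 1), which QCA treats as a signal to refine the conditions.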
The more advanced versions of QCA (Ragin, 2000) permit the
handling of 50 to 100 cases
(Befani, 2013: 281–82). However, such a capability, as well as
the QCA procedure more generally,
may in fact focus on cases rather than on the conduct of in-
depth case studies. Except when pre-
existing cases are already archivally available, a study covering
a large number of new in-depth
case studies is likely to be difficult to conduct, because of both
the elapsed time and the resources
needed by the study. QCA’s capability, therefore, may move in
the opposite direction from the
initial challenge of confronting validity with a small number of
cases, including the classic single-
case study. For such situations, the six assembled articles gave
less attention to three known prac-
tices, possibly because the practices remain underdeveloped.
Plausible, rival explanations
The role of examining plausible rival explanations has been
readily recognized in doing evalua-
tions (e.g. Maxwell, 2004: 257−60; Yin, 2000b). Appealing to
such rivals has formed a central part
of nearly all types of research in the social and physical
sciences (e.g. Campbell, 2014: xvii−xviii).
Although experimental designs may control for all rivals (but
without specifying any of them), the
number of plausible rivals competing with the main
hypothesized causal relationships in a case
study may be sufficiently limited that they can be studied
directly. Thus, as part of the same case
study, the procedure calls for a vigorous search for data related
to the rivals, as if trying to find
support for them (Patton, 2002: 553; Rosenbaum, 2002: 8−10).
If a vigorous search finds no such support, more
confidence can be placed in the
main hypothesized relationships. The degree of certainty will be
lower than that associated with an
experimental design but higher than if a case study had not
investigated any plausible rivals. As
noted in the field of education research, ‘the use of qualitative
methods . . . can be particularly help-
ful in ruling out alternative explanations . . . [and] can enable
stronger causal inferences’ (Shavelson
and Towne, 2002: 109). For a case study evaluation, the most
common rivals might be the exist-
ence of: an initiative similar to or overlapping with the
intervention being evaluated; a salient
policy shift not related to the intervention; or some other
identifiable influence in the contextual
environment.
However, beyond being identified as an integral and critical
part of doing an evaluation, the
operational procedure for making comparisons with plausible
rival explanations has received little
attention. Explicit procedures are needed to deal with how and
whether the acceptance or rejection
of rivals meets such benchmarks as being ‘acceptable,’ ‘weak,’
or ‘strong,’ or even how to distin-
guish between a plausible rival and a mere red herring. In
addition, the operational steps involved
in comparing the rival findings with those related to the main
hypothesis may be intricate and may
benefit from being represented as formal designs. To this
extent, the use of plausible rival explana-
tions remains an extremely promising but still underdeveloped
procedure for strengthening the
validity of case study evaluations.
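One way such an operational procedure might begin is with explicit bookkeeping of each rival and the evidence deliberately sought for it. The sketch below is a hypothetical illustration, not a procedure proposed here; the rival names echo those listed above, and the counts and summary rule are assumptions.

```python
# Hypothetical bookkeeping sketch for a rival-explanations search: for
# each rival, record how many evidence sources were deliberately
# examined and how many supported the rival. Names and counts are
# illustrative only.

rivals = {
    "overlapping initiative": {"evidence_sought": 4, "evidence_found": 0},
    "unrelated policy shift": {"evidence_sought": 3, "evidence_found": 0},
    "other contextual influence": {"evidence_sought": 2, "evidence_found": 1},
}

def surviving_rivals(rivals):
    """Rivals for which the deliberate search turned up any support --
    these weaken confidence in the main hypothesized explanation."""
    return [name for name, r in rivals.items() if r["evidence_found"] > 0]

print(surviving_rivals(rivals))  # ['other contextual influence']
```

Even this trivial tabulation makes the "vigorous search" auditable: a rival rejected after zero evidence sources would be visibly weaker grounds for confidence than one rejected after many.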
Triangulation
Triangulation presents a similar situation. The principle has
been long understood (e.g. Denzin, 1978;
Jick, 1979), with at least four types of triangulation being
possible: (1) data source triangulation, (2) analyst triangulation, (3) theory/perspective triangulation, and
(4) methods triangulation (Patton,
2002: 556−63). Of the four, the data source and methods types
in particular are likely to strengthen
the validity of a case study evaluation. Renewed interest in
mixed methods research has highlighted
the ways in which a methods triangulation can provide
increased confidence in the findings from a
study that has combined quantitative with qualitative methods
(e.g. Creswell and Plano Clark, 2007;
Teddlie and Tashakkori, 2009). (Vellema et al. [2013: Table 1]
briefly refer to their use of one of the
other kinds of triangulation: theory/perspective triangulation.)
Many case study evaluations, especially those focusing on broad
or complex interventions, can
involve a combination of two or more methods. When these
methods are purposely designed to
collect some overlapping data, the possibility for triangulation
certainly exists and, if the results are
convergent, greater confidence may be placed in the
evaluation’s overall findings. Similarly, con-
vergence over the examination of causal relationships will
strengthen the evaluation’s internal
validity.
At the same time, operational procedures for carrying out
triangulations have also received little
attention. No benchmarks exist to define when triangulation
might be considered ‘strong’ or ‘weak’
or ‘complete’ or ‘incomplete.’ Similarly, sufficient
triangulation might involve an intricate set of steps that need to be represented as formal designs. The
ultimate goal, as with making compari-
sons with plausible rival explanations, calls for a common
procedure that can be routinely adopted
and used by many if not all case study evaluations.
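As a rough illustration of what an operational convergence check might look like, the sketch below classifies a single finding by whether its data sources agree. The source names and the three-way classification are assumptions made for illustration; as noted above, no accepted benchmark of this kind yet exists.

```python
# Illustrative convergence check for data source triangulation.
# The classification rule ("convergent"/"partial"/"divergent") is a
# hypothetical benchmark, not an established standard.

def triangulation_status(findings):
    """findings maps data source -> whether that source supports the
    claim. 'convergent' when every source agrees in support,
    'divergent' when none do, 'partial' otherwise."""
    support = sum(1 for v in findings.values() if v)
    if support == len(findings):
        return "convergent"
    return "partial" if support else "divergent"

finding = {
    "interviews": True,    # site-visit interviews
    "documents": True,     # programme records
    "observations": True,  # direct observation
}
print(triangulation_status(finding))  # convergent
```

A formal design might extend this to grade partial agreement, weight sources by quality, or require convergence separately for each causal claim in the evaluation.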
Logic models
Case study evaluations frequently use logic models, initially to
express the theoretical causal rela-
tionships between an intervention and its outcomes, and then to
guide data collection on these same
topics. The collected data can be analyzed by comparing the
empirical findings with the initially
stipulated theoretical relationships, and a match between the
empirical and the theoretical adds to
the support for explaining how an intervention produced (or
not) its outcomes.
The practice of using logic models in evaluations has again
been understood for a lengthy
period of time (e.g. Wholey, 1979). Nevertheless, although the
practice of using logic models has
become quite common, little has occurred to sharpen their use
and strengthen their role.
For instance, a major shortcoming derives from the
coincidental graphic similarities between
logic models and flow charts. Both are usually expressed as a
sequence of boxes. In the case of the
logic model, the boxes represent the key steps or events within
an intervention and then between
the intervention and its outcomes. Graphically, the boxes are
then connected by arrows that identify
the links between and among the events. Unfortunately, most
evaluations collect data about the
boxes, but nearly no data about the arrows. Yet the arrows represent
the flow of transitional or causal
conditions, showing or explaining how one event (box) might
actually lead to another event (a
second box). One possible reason for this neglect is that
transitional data are irrelevant in flow
charts, which only represent the shifting from one task to
another, but without implying any causal
relationship. For logic models lacking any transitional data,
only a correlational analysis can be
conducted, reducing the causal value (and validity) of the entire
exercise. Future studies could
again investigate ways of improving the use of logic models.
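One way to make the neglected arrows visible is to represent the logic model as a directed graph in which evidence can be attached to transitions as well as to events. The sketch below is a hypothetical illustration; the event names are invented, and nothing here reflects an established logic-model tool.

```python
# Hypothetical sketch: a logic model as a directed graph whose arrows
# (transitions) can hold evidence, not just its boxes (events).

from dataclasses import dataclass, field

@dataclass
class LogicModel:
    boxes: dict = field(default_factory=dict)   # event -> evidence notes
    arrows: dict = field(default_factory=dict)  # (event, event) -> evidence notes

    def add_box(self, name):
        self.boxes.setdefault(name, [])

    def link(self, src, dst):
        self.add_box(src)
        self.add_box(dst)
        self.arrows.setdefault((src, dst), [])

    def undocumented_arrows(self):
        """Arrows for which no transitional data were collected --
        the gap highlighted above."""
        return [pair for pair, notes in self.arrows.items() if not notes]

model = LogicModel()
model.link("teacher training", "classroom practice change")
model.link("classroom practice change", "student outcomes")
model.boxes["teacher training"].append("attendance records")
# Only box-level data collected so far; both arrows remain undocumented.
print(model.undocumented_arrows())
```

An evaluation plan built on such a structure could require at least one data collection activity per arrow before fieldwork begins, turning the causal claims of the model into explicit evidentiary obligations.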
Summary
Case study evaluations need to continue to confront the
challenge of strengthening validity. Several
known methodological practices accept rather than avoid the
necessary underlying assumption that
the typical case study will only include a small number of cases:
checking for plausible, rival
explanations; triangulating data or methods; and using logic
models. These practices deserve
greater attention than they have attracted in the past. In each
situation, although the practices have
been recognized and used for many years, the preceding
paragraphs have suggested that room for
improvement still exists. Future methodological contributions
could therefore yield desirable
payoffs.
Seeking to generalize
Concerns in doing case study evaluations extend from issues of
validity to issues of generalization.
In international development, the generalizations form the basis
for transferring lessons from one
country to another as well as for ‘scaling-up’ a desirable
intervention within the same country. This
facet of the May 2012 workshop theme led the six assembled
articles to delve, in some cases quite
deeply, into generalization issues.
The widespread assumption, embraced by most of the articles as
well as the prevailing evalua-
tion literature, interprets case study generalization as an effort
to generalize from a small number
of cases to a larger population of cases (e.g. Byrne, 2013;
Ragin, 2009; Seawright and Gerring,
2008; and Woolcock, 2013). The common quest has been, first,
to establish a sufficiently precise
definition of the ‘case’ being studied (if not at the outset of a
case study at least by its conclusion),
and then to (retrospectively) define the broader population of
relevant cases. The process mimics
the conventional sampling procedure but can fail for two
reasons.
First, the difficulties of selecting the initial case(s) usually
mean that the case(s) being studied
do not represent a known, much less random sample from the
larger set of cases. An additional and
circular problem involves not fully understanding the case or
having sufficient data for selection
purposes to be able to define the potential population of cases;
but, without knowing the popula-
tion, not being able to define fully the nature of the sampled
case(s) to be studied.
Second, if a study genuinely takes advantage of the case study
method − that is, by probing a
case and its context in-depth − the study will likely only be able
to include a small number of cases.
In fact, the classic case study, as well as many case study
evaluations, is usually limited to only a
single case. The goal of understanding a case and its context,
potentially over a meaningful period
of time, is sufficiently engrossing that, even if thick description
(Geertz, 1973) is not the end result,
a case study will just not be able to cover more than a small
number of cases. The only way of
increasing the number of cases to some substantial level would
mean sacrificing the in-depth and
contextual nature of the insights inherent in using the case study
method in the first place.
Analytic generalization
Instead of pursuing the sample-to-population logic, analytic
generalization can serve as an appro-
priate logic for generalizing the findings from a case study (e.g.
Bromley, 1986: 290–1; Burawoy,
1991: 271–87; Donmoyer, 1990; Gomm et al., 2000; Mitchell,
1983; and Small, 2009).4 By ana-
lytic generalization is meant the extraction of a more abstract
level of ideas from a set of case study
findings − ideas that nevertheless can pertain to newer
situations other than the case(s) in the origi-
nal case study. For case study evaluations, the analytic
generalization should aim to apply to other
concrete situations and not just to contribute to abstract theory
building.
The desired analytic generalization also should go beyond
serving only as a ‘working hypothesis’
(e.g. Cronbach, 1975) − that is, one in need of further study
rather than being ready to be generalized
or applied to new situations. This shortcoming is not easily
overcome. However, carefully linking an
analytic generalization to the related research literature by
identifying overlaps as well as gaps will
help. Replication of the same findings by conducting a second
or third case study (e.g. Yin, 2014:
57−9) can strengthen the generalization even further.
Eventually, the ideal generalization may extend
not only to other ‘like’ cases but also ‘apply to many different
types of cases’ (Bennett, 2004: 50).
This manner of generalizing is not peculiar to doing case
studies but is in fact analogous to
the way that generalizations are made in doing experiments.
Thus, the selection and conduct of
an experiment derives from the goal of developing fresh data
about some initially hypothesized
conditions − or about discovering a totally new condition − but
not from being a sample of some
known, larger population of like experiments.5 Case study
research follows a similar motive
(Yin, 2014: 44).
One of the six assembled articles (Mookherji and LaFond, 2013)
demonstrated the development
of analytic generalizations in considerable detail. The study
examined the ‘initiatives and pro-
cesses [that were] actually “driving” the improvements in
routine immunization [projects in three
African countries]’ (Mookherji and LaFond, 2013: 288). A
critical analytic step occurred after the
data had been collected: the identification of the varied drivers
in each of the case studies, followed
by a cross-case synthesis of how the case-specific drivers fell
into six categories, each representing
one of six (conceptually) common drivers (see Table 1 of their
article).
Based on these and other cross-case findings, Mookherji and
LaFond formulated a comprehen-
sive framework depicting the flow of pre-conditions, contextual
conditions, and drivers (see Figure
4 of their article). The framework, now empirically derived,
explains how and why immunization
projects can succeed. In the authors’ view, it became the basis
for generalizing the results from their
evaluation to other districts in other African countries (p. 22).
(By inspecting the framework
closely, a reader might even speculate that the framework can
pertain to immunization projects
outside of that region − or even to the design of community
health initiatives more broadly.)
Mookherji and LaFond’s example shows how analytic
generalization offers improved ways of
generalizing from case study evaluations. An additional line of
thinking that builds on the impor-
tance of analytic generalization is described next.
The role of ‘theory’ in making analytic generalizations
Mookherji and LaFond rightfully regarded their framework as
expressing a theory of change (p.
23). One way to have further strengthened their framework
would have been to connect it to the
extant literature, which contains a considerable body of work on
the locally decentralized service
delivery conditions and the local partnering arrangements
central to their framework. The authors
might then have been able to discuss how their case study
contributed (or not) to new knowledge
about health interventions, and whether their findings were
limited to immunization projects or
could be applied to community health projects more generally.
In essence, the desired analytic generalization should present an
explanation of how and why
the initiative being evaluated produced results (or not) − or, for
non-evaluation studies, how and
why the studied events occurred (or not). In this latter regard,
two other examples are worth noting.
The first is Graham Allison’s well-known single-case study on
the Cuban missile crisis (Allison,
1971; Allison and Zelikow, 1999). The case study has for over
30 years been a best-seller in the
field of political science because of its analytic generalizations
and implications for a broad array
of international relationships.
The second example (also illustrating how a detailed single-case
study can be published in a
leading academic journal, even given its page-length
limitations) examined how the Croatian gov-
ernment represented the country’s past, present, and future in
the aftermath of the wars of Yugoslav
secession (Rivera, 2008). The wars had left a reputation-
damaging effect, threatening Croatia’s
highly valued tourist industry. The case study showed how, in
response, the government reframed
the country’s past by omitting the war in its representations of
national history, re-positioning the
country as more closely sharing a history and culture with its
Western European neighbors. The
explanation for these findings then drew from a prevailing
theoretical framework, in which the
author innovatively extended Erving Goffman’s well-regarded
work on stigma and the manage-
ment of ‘spoiled identity’ from the individual to the
institutional realm (Goffman, 1963). The
author concluded by claiming that the analytic generalization
had applicability to other situations
of collective memory and cultural sociology.
Summary
The preferred manner of generalizing from case studies and case
study evaluations is likely to take
the form of making an analytic or conceptual generalization,
rather than of reaching for a numeric
one. The desired generalization should present an explanation
for how an evaluated initiative pro-
duces its results (or not). The explanation can be regarded as a
theory of sorts − certainly more than
a set of isolated concepts − and therefore yield a better
understanding of an intervention and its
outcomes. Whether such an explanation is based on a theory
that emerged for the first time from a
case study or had been entertained in hypothetical form prior to
the conduct of the case study,
researchers need to connect the theory to the extant literature,
or alternatively, to use their findings
to explain the gaps and weaknesses in that literature. By doing
so, the generalizations from a single
case study can be interpreted with greater meaning and lead to a
desired cumulative knowledge.
Finally, replications of the original case study also help.
At the same time, the strongest empirical foundation for these
generalizations derives from the
close-up, in-depth study of a specific case in its real-world
context.6 Such a condition usually limits
the number of cases that can be studied. In turn, such a
limitation precludes applying the conven-
tional numeric, or sample-to-population generalizations when
doing case studies. If, in contrast, an
evaluation genuinely has an overarching goal of establishing or
estimating numeric relationships,
doing a case study evaluation might not be the preferred method
to satisfy such a goal.
Still more learning
The present article’s treatment of validity and generalization
suggests ways that case study evalu-
ations can gain from methodological studies yet to be done.
These studies need to focus on case
study practices to strengthen future case study evaluations. In
this sense, there is still more learning
to be done. Discussed next are three topics connected to validity
and generalization that represent
priorities for the desired methodological studies.
Noting carefully the nature of the initial evaluation questions
Perhaps the most important inquiry points to the very start of a
case study evaluation − its evalua-
tion questions. These questions have serious implications for
the remainder of the case study.
However, many case study evaluations may not be attending
carefully to the way that these ques-
tions are posed. How best to pose these questions, therefore,
should be a high priority for future
investigation. Such studies could be quite straightforward, for
example, conducting a meta-analysis
of completed evaluations, deliberately covering a variety of
forms of questions and types of evalu-
ation methods.
The studies might initially assume that the desired questions for
case study evaluations, as with
case study research more generally, should be cast as ‘how’ or
‘why’ questions (Yin, 2014: 10−11).
Such questions implicitly direct attention to events and actions
over time, including but not limited
to causal processes (and therefore not restricted to explanatory
case study evaluations but also
embracing descriptive ones). The strength of the subsequent
case study would be its ability to
examine the relevant events and actions in all their complexity,
even if re-creating a contemporary
period of time retrospectively. ‘How’ and ‘why’ questions, for
instance, highlighted the seven
questions posed in doing each of the three country case studies
in Mookherji and LaFond’s article
(2013: 289).
Unfortunately, many evaluations, including those dealing with
international development,
totally ignore ‘how’ and ‘why’ questions and start with ‘what’
or ‘to what extent’ questions. The
‘what’ questions seek to identify the specific conditions
associated with a successful (or not) inter-
vention. Moreover, these conditions are sometimes expressed as
single ‘present-absent’ variables,
even when a condition, such as decentralization, is entirely too
complex to be treated in this man-
ner. Nevertheless, note that − assuming the availability of
sufficient data − regressions, factor
analyses, and other quantitative models can readily support the
identification process. Furthermore,
the models can more than capably demonstrate the potency of a
targeted condition by controlling
for competing conditions or showing how sets of conditions
interact. Likewise, if properly
addressed, the ‘to what extent’ questions beg for a numeric, not
explanatory or even descriptive
response.
When the initial evaluation questions appear to favor methods
other than case studies, attempts
to conduct case study evaluations in spite of these questions
may lead to tough sledding for the
ensuing case study. First, validity questions may arise about the
sample of cases selected, the avail-
ability of counterfactual conditions, and the metrics used to
assess the ‘extent’ in the phrase ‘to
what “extent”.’ Most commonly, to address the ‘to what extent’
questions, a case study evaluation
will have to resort to the use of Likert scales and then query
respondents or analysts. Yet, such a
maneuver can raise even more uncertainties about the sample
and implicit biases of the respond-
ents or analysts who were queried.
By addressing the less preferred form of questions, however, the
greatest loss may be a case
study’s inability to arrive at any generalizations. For instance,
the ‘what’ questions may lead to
no particular theoretical framework other than a correlative one,
making analytic generaliza-
tions difficult. Depending upon the number of cases, numeric
generalizations about the fre-
quency or combination of the ‘whats’ may be tenuous from any
conventional quantitative
standpoint.
Overall, future inquiries should aim to yield a better
understanding of how an evaluation’s
initial questions can imply certain preferences in selecting the
methods for an evaluation. An
important hypothesis to be entertained is that the form of these
questions dictates whether a case
study (or other evaluation method) should be used in the first
place (e.g. Shavelson and Towne,
2002: 99−108).
Extending this challenge into a slightly more controversial
realm, a somewhat more compli-
cated situation surfaces when evaluations are initially driven by
the ‘realist’ framework of ques-
tions − ‘what works for whom, when, where, and why?’
(Woolcock, 2013: 245; Betts, 2013: 256).
This common framework, appearing in many evaluations and
evaluation programs (international
and otherwise), leads to the impression that a short or at least
manageable list of conditions can
eventually be identified. Moreover, the ‘whom, when, where,
and why’ portion of the framework
leaves the impression that the responses will identify a set of
constraining and enabling conditions
related to generalizing to other situations.
However, the complexity of an intervention and its context may
yield such a large number
of conditions, not to speak of their distinctiveness or
uniqueness, that they cannot be itemized
in any practical way. Even if successfully itemized, the likely
analytic tool may again be a
correlative one, not a case study. Thus, future studies should
deliberately examine the implica-
tions of using the evaluation questions deriving from a realist
framework − at a minimum
examining whether a useful procedure might be for a new study
to speculate about the kind
and length of the likely items before deciding whether to
proceed with a case study or some
alternative method.
Revisiting the ‘complexity’ of interventions
A second priority topic covers the presumed complexity of an
intervention and how it appears to
influence the choice between case study evaluations and other
evaluation methods. Many evalua-
tions, as well as the present article, portray ‘complexity’ as an
important feature justifying the use
of case studies. The usual context for making this choice is a
comparison to experiments, which in
their classic form mainly focus on the relationship between a
single cause and a single effect at a
time (Befani, 2013: 270; Byrne, 2013: 220). However, instead
of relying on a comparison with
experiments, a better justification for proceeding with a case
study evaluation might require a
sharper definition of what makes an intervention complex.
Some interventions may consist of a number of components that have complex relationships. These types of interventions and this type of complexity may nevertheless be highly amenable to methods other than case studies (e.g. an economic-based study of a housing intervention). Simply stipulating that complex interventions warrant the use of the case study method might appear to be naive if not offensive to analysts familiar with the alternative methods, which in fact can cover certain kinds of complexity quite well (again, regression models, structural equation models, and the like come readily to mind).
Instead, the desired future studies should explicitly define the conditions associated with the 'complexity' of the interventions that appear to favor case studies. Several of the six articles in this issue have begun to define these conditions, and future methodological work could usefully build on this foundation. For instance, an initially relevant characteristic of complexity can involve interventions having multiple causes and effects. Moreover, the intervention may be 'quite distal from the outcomes and impacts of interest' (Mookherji and LaFond, 2013: 285). Complexity also may mean understanding interventions in their totality, not 'in terms of their components' (Byrne, 2013: 218).
Finally, Woolcock suggests that interventions can vary according to their causal density: those having a high causal density might trigger a case study evaluation (Woolcock, 2013: 237–39). According to Woolcock, density reflects four conditions: (1) the number of required person-to-person transactions; (2) the amount of discretion exercised by front-line implementing agents; (3) the pressure on the agents to respond to distracting conditions; and (4) whether the agents' solutions come from a known menu or must be innovated. In contrast, interventions with low causal densities may be physical development projects having known technological solutions, such as building roads, providing proper sanitation and electricity, building schools, and administering vaccinations (Andrews et al., 2012) – and for these, other evaluation methods may be entirely appropriate.
In summary, future studies should examine the importance of describing the actual features associated with labeling an intervention as 'complex,' rather than relying on the label alone.
Making the awareness of case study evaluation methods a higher priority
A third priority topic sits at a higher plane than the first two – and may be more difficult to pursue. Although case study evaluation methods have advanced over the years, progress has been slow (e.g. Yin, 2000a). Some key topics, such as triangulation and the use of rival explanations, as previously discussed in this article, still appear to be underdeveloped and await further investigation and elaboration in order to become potent routines.
One possible explanation for the lack of progress is that articles whose main concerns deal with case study evaluations paradoxically begin with a fairly elaborate discussion of non-case study methods, such as the experimental method. The effect of these lengthy and occasionally apologetic discussions may be to displace a systematic and more thorough canvassing of the potentially relevant case study methods. The desired canvassing would heighten awareness of why some case study practices, but not others, are to be employed in a planned evaluation. As an example, an initial discussion of rival explanations might cite the relevant literature, show how rivals had been incorporated (or not) in previous studies, and then indicate how rivals are to be used (or not) in the design of the planned evaluation. Rival explanations were mentioned only once in the six assembled articles (see Vellema et al., 2013).
Taking analytic generalization as a second example, the creation of some typology of analytic generalizations, along with the operational procedures for deriving each type, would represent a greater advance than has been experienced during the past couple of decades. For example, Halkier (2011) suggests three forms of analytic generalization and offers procedures for examining them in empirical studies: (1) ideal-typologizing; (2) category zooming (depth on a single point); and (3) positioning (the reflection of multiple voices and discourse). Again, if an upcoming case study evaluation were initially to discuss the previous use of analytic generalization, even as a candidate but then rejected practice, the study still could be building important methodological lessons.
In summary, a more systematic canvassing should concentrate on case study methods. These could include rivals, analytic generalization, and other practices not even touched upon in the present article (e.g. case selection, the distinction between proximal and distal causes, the mixture of case study and other methods in the same evaluation, yet other ways of generalizing, or parsing contextual conditions rather than leaving them as the amorphous entity they now are). Only in this way might newer contributions emerge, accelerating progress in strengthening future case study evaluations. Now that would be some kind of learning.
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Notes
1. The present article is not intended to be a review of any sort of the assembled articles, nor did the present author attend the May 2012 workshop.
2. The brevity of this article precludes discussing a related type of validity – construct validity (e.g. Yin, 2014: 46–7).
3. Whether using QCA or not, the sequence of the within-case analysis preceding the between-case analysis – rather than starting an analysis by estimating the cross-case averages for specific variables – is critical for preserving the integrity of the individual cases in properly doing any multiple-case study (Yin, 2014: 164–7).
4. The brevity of this article precludes discussing potentially related kinds of generalizing, such as case-to-case transferability, whose strength depends on the similarity of the sending and receiving contexts (Lincoln and Guba, 1985: 297).
5. Regarding this contrast with a sample–population mode of generalizing from experiments, whether research experiments should admit to involving a well-defined sample of human subjects and therefore be limited to only the fuller population of similar people, rather than standing for 'the norm for all human beings' (Prescott, 2002: 38), has been the topic of continuing debate in psychology. The debate started because of the over-reliance on college sophomores serving as subjects in behavioral research, now augmented by the realization that most subjects have been white males from industrialized countries (Henrich et al., 2010).
6. Ethnographic methods are usually associated with the desire to study phenomena in a real-world, up-close, and in-depth manner (e.g. Emerson, 2001). However, many ethnographies shy away from developing the theoretical insights and ideas needed to make analytic generalizations. The predilections of this kind of ethnography should therefore be considered carefully before adopting the ethnographic method to do the fieldwork in a case study evaluation.
References
Allison GT (1971) Essence of Decision: Explaining the Cuban Missile Crisis. Boston, MA: Little, Brown.
Allison GT and Zelikow P (1999) Essence of Decision: Explaining the Cuban Missile Crisis, 2nd edn. New York: Addison Wesley Longman.
Andrews M, Pritchett L and Woolcock M (2012) Escaping capability traps through problem-driven iterative adaptation (PDIA). Working Paper 299. Washington, DC: Center for Global Development.
Befani B (2013) Between complexity and generalization: addressing evaluation challenges with QCA. Evaluation 19(3): 269–83.
Bennett A (2004) Testing theories and explaining cases. In: Ragin CC, Nagel J and White P (eds), Workshop on Scientific Foundations of Qualitative Research. Arlington, VA: National Science Foundation, 49–51.
Betts J (2013) Aid effectiveness and governance reforms: applying realist principles to a complex synthesis across varied cases. Evaluation 19(3): 249–68.
Bromley DB (1986) The Case-Study Method in Psychology and Related Disciplines. Chichester: Wiley.
Burawoy M (1991) The extended case method. In: Burawoy M et al. (eds), Ethnography Unbound: Power and Resistance in the Modern Metropolis. Berkeley, CA: University of California Press, 271–87.
Byrne D (2013) Evaluating complex social interventions in a complex world. Evaluation 19(3): 217–28.
Campbell DT (2014) Foreword. In: Yin RK, Case Study Research: Design and Methods. Thousand Oaks, CA: SAGE, xvii–xviii.
Creswell JW and Plano Clark VL (2007) Designing and Conducting Mixed Methods Research. Thousand Oaks, CA: SAGE.
Cronbach LJ (1975) Beyond the two disciplines of scientific psychology. American Psychologist 30: 116–27.
Denzin NK (1978) The Research Act: A Theoretical Introduction to Sociological Methods, 2nd edn. New York: McGraw-Hill.
Donmoyer R (1990) Generalizability and the single-case study. In: Eisner EW and Peshkin A (eds), Qualitative Inquiry in Education: The Continuing Debate. New York: Teachers College, 175–200.
Emerson RM (ed.) (2001) Contemporary Field Research: Perspectives and Formulations, 2nd edn. Prospect Heights, IL: Waveland Press.
Erickson F (2012) Comments on causality in qualitative inquiry. Qualitative Inquiry 18: 686–8.
Geertz C (1973) The Interpretation of Cultures. New York: Basic Books.
Goffman E (1963) Stigma: Notes on the Management of Spoiled Identity. New York: Prentice-Hall.
Gomm R, Hammersley M and Foster P (2000) Case study and generalization. In: Gomm R, Hammersley M and Foster P (eds), Case Study Method. London: SAGE, 98–115.
Halkier B (2011) Methodological practicalities in analytic generalization. Qualitative Inquiry 17: 787–97.
Henrich J, Heine SJ and Norenzayan A (2010) The weirdest people in the world? Behavioral and Brain Sciences 33: 61–83.
Jick TD (1979) Mixing qualitative and quantitative methods: triangulation in action. Administrative Science Quarterly 24: 602–11.
Lincoln YS and Guba E (1985) Naturalistic Inquiry. Thousand Oaks, CA: SAGE.
Maxwell JA (2004) Using qualitative methods for causal explanation. Field Methods 16: 243–64.
Maxwell JA (2012) The importance of qualitative research for causal explanation in education. Qualitative Inquiry 18: 655–61.
Miles M and Huberman M (1994) Qualitative Data Analysis: A Sourcebook for New Methods. Thousand Oaks, CA: SAGE.
Mitchell JC (1983) Case and situation analysis. Sociological Review 31: 187–211.
Mookherji S and LaFond A (2013) Strategies to maximize generalization from multiple case studies: lessons from the Africa Routine Immunization System Essentials (ARISE) project. Evaluation 19(3): 284–303.
Patton M (2002) Qualitative Research and Evaluation Methods, 3rd edn. Thousand Oaks, CA: SAGE.
Prescott HM (2002) Using the student body: college and university students as research subjects in the United States during the twentieth century. Journal of the History of Medicine 57: 3–38.
Ragin CC (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley, CA: University of California Press.
Ragin CC (2000) Fuzzy Set Social Science. Chicago, IL: University of Chicago Press.
Ragin CC (2009) Reflections on casing and case-oriented research. In: Byrne D and Ragin CC (eds), The Sage Handbook of Case-based Methods. London: SAGE, 522–34.
Rivera LA (2008) Managing 'spoiled' national identity: war, tourism, and memory in Croatia. American Sociological Review 73: 613–34.
Rosenbaum PR (2002) Observational Studies, 2nd edn. New York: Springer.
Seawright J and Gerring J (2008) Case selection techniques in case study research: a menu of qualitative and quantitative options. Political Research Quarterly 61: 294–308.
Shavelson RJ and Towne L (eds) (2002) Scientific Research in Education. Washington, DC: National Academy Press.
Small ML (2009) 'How many cases do I need?' On science and the logic of case selection in field-based research. Ethnography 10: 5–38.
Stufflebeam DL and Shinkfield AJ (2007) Evaluation Theory, Models, and Applications. San Francisco, CA: Jossey-Bass.
Teddlie C and Tashakkori A (2009) Foundations of Mixed Methods Research: Integrating Quantitative and Qualitative Approaches in the Social and Behavioral Sciences. Thousand Oaks, CA: SAGE.
Vellema S, Ton G, de Roo N and van Wijk J (2013) Value chains, partnerships and development: using case studies to refine programme theories. Evaluation 19(3): 304–20.
Wholey J (1979) Evaluation: Performance and Promise. Washington, DC: The Urban Institute.
Woolcock M (2013) Using case studies to explore the external validity of 'complex' development interventions. Evaluation 19(3): 229–48.
Yin RK (2000a) Case study evaluations: a decade of progress? In: Stufflebeam DL, Madaus GF and Kelleghan T (eds), Evaluation Models: Viewpoints on Educational and Human Services Evaluation, 2nd edn. Boston, MA: Kluwer, 185–93.
Yin RK (2000b) Rival explanations as an alternative to 'reforms as experiments'. In: Bickman L (ed.), Validity & Social Experimentation: Donald Campbell's Legacy. Thousand Oaks, CA: SAGE, 239–66.
Yin RK (2014) Case Study Research: Design and Methods, 5th edn. Thousand Oaks, CA: SAGE.
Yin RK and Davis D (2007) Adding new dimensions to case study evaluations: the case of evaluating comprehensive reforms. New Directions for Program Evaluation: Informing Federal Policies for Evaluation Methodology 113: 75–93.
Yin RK and Ridde V (2012) Théorie et pratiques des études de cas en évaluation de programmes. In: Ridde V and Dagenais C (eds), Approches et pratiques en évaluation de programmes, 2nd edn. Montreal: University of Montreal Press, Chapter 10.
Robert K. Yin is President of the COSMOS Corporation and has consulted extensively on the use of case study evaluations for many clients, including the United Nations Development Programme and The World Bank. He has published extensively on case study methods: the 3rd edition of Applications of Case Study Research was published in 2012, and the 5th edition of Case Study Research: Design and Methods has just been published with a 2014 copyright date.
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
between the case and its context. Technically, such an objective adds to a common problem, whether doing case study research (Yin, 2014) or case study evaluation (Yin and Ridde, 2012): the number of datapoints (each case being a single datapoint) will be far outstripped by the number of variables under study – because of the complexity of the case as well as the embracing of the contextual conditions. This situation is nearly impossible to remedy, even if a modest number of cases is included as part of the same (multiple-) case study. As a result, the usual analytic techniques based on having a large number of datapoints and a small number of variables (thereby permitting estimates of means and variances) are likely to be irrelevant in doing case study research.

Corresponding author: Robert K. Yin, COSMOS Corporation, 3 Bethesda Metro Center, Suite 700, Bethesda, MD 20814, USA. Email: [email protected]

For evaluations, the ability to address the complexity and contextual conditions nevertheless establishes case study methods as a viable alternative among the other methodological choices, such as survey, experimental, or economic research (Stufflebeam and Shinkfield, 2007). The conditions appear especially relevant in efforts to evaluate highly broad and complex initiatives; for example, systems reforms, service delivery integration, community and economic development projects, and international development (e.g. Yin and Davis, 2007). At the same time, doing case
study evaluations with acceptable and rigorous procedures must rely on a state-of-the-art still in its formative stages.

The May 2012 workshop on ‘Validity, Generalization, and Learning’ provided an opportunity for a variety of scholars to share their working knowledge and to advance the state-of-the-art. Six of the presentations became the other articles contained in this journal issue.1 Together, the six assembled articles form a basis for briefly reviewing the key practices regarding validity and generalization when doing case study evaluations. The present article tries to reinforce and also to elaborate upon the six articles. The goal is to stimulate yet newer contributions on all these important methodological practices. Only in this manner will case study evaluations continue to get stronger. The article is organized according to a slight adaptation of the main themes of the original workshop: Strengthening validity; Seeking to generalize; and Still more learning.

Strengthening validity

Case study evaluations may limit themselves to descriptive or even exploratory objectives. However, the greatest challenge arises when case study evaluations fill an explanatory role. This means: (a) documenting (and interpreting) a set of outcomes, and then (b) trying to explain how those outcomes came about. When adopting such an explanatory objective, a case study evaluation will in effect be examining causal relationships. The evaluation thus squarely confronts issues of internal validity.2 To address these issues, the small number of
cases in a case study – frequently involving only a single case – precludes the use of conventional experimental designs. These require the availability of a sufficiently large number of cases that can in turn be divided into two (or more) comparison groups. Instead, case study evaluations must rely on other techniques.

One evaluative approach has been to conduct and document direct observations of the events and actions as they actually occur in a local setting as a critical part of a case study’s data collection (e.g. Erickson, 2012: 688; Maxwell, 2004, 2012; Miles and Huberman, 1994: 132). The inquiry can highlight the contextual role of the local settings and accommodate if not feature the non-linear and recursive flows of events (‘feedback loops that occur at irregular times’ – Betts, 2013: 255) as well as the possibility of entertaining multiple causes, both proximal and distal. However, the ensuing analysis remains highly qualitative and may not be very convincing.

To improve on the precision of such an approach and to boost confidence in the findings, two of the six assembled articles (Befani, 2013; Byrne, 2013) offer insights into a technique known as qualitative comparative analysis (QCA), developed by Charles Ragin (1987, 2000, 2009). This technique captures within-case patterns or configurations (Byrne, 2013: 224), consisting of the combination of intervention and outcome conditions for each particular case being studied. The cross-case analysis then becomes the systematic comparison of these within-case
configurations or sets of intervention-outcome conditions.3 When a sufficiently large number of cases is available, QCA can be ‘strong in testing, refining, and validating findings’ (Befani, 2013: 280). As examples, Befani’s article discusses two illustrations, having 17 and 11 cases, respectively. The more advanced versions of QCA (Ragin, 2000) permit the handling of 50 to 100 cases (Befani, 2013: 281–82).

However, such a capability, as well as the QCA procedure more generally, may in fact focus on cases rather than on the conduct of in-depth case studies. Except when pre-existing cases are already archivally available, a study covering a large number of new in-depth case studies is likely to be difficult to conduct, because of both the elapsed time and the resources needed by the study. QCA’s capability, therefore, may move in the opposite direction from the initial challenge of confronting validity with a small number of cases, including the classic single-case study. For such situations, the six assembled articles gave less attention to three known practices, possibly because the practices remain underdeveloped.

Plausible, rival explanations

The role of examining plausible rival explanations has been readily recognized in doing evaluations (e.g. Maxwell, 2004: 257–60; Yin, 2000b). Appealing to
such rivals has formed a central part of nearly all types of research in the social and physical sciences (e.g. Campbell, 2014: xvii–xviii). Although experimental designs may control for all rivals (but without specifying any of them), the number of plausible rivals competing with the main hypothesized causal relationships in a case study may be sufficiently limited that they can be studied directly. Thus, as part of the same case study, the procedure calls for a vigorous search for data related to the rivals, as if trying to find support for them (Patton, 2002: 553; Rosenbaum, 2002: 8–10).

Given a vigorous search, but finding no such support, more confidence can be placed in the main hypothesized relationships. The degree of certainty will be lower than that associated with an experimental design but higher than if a case study had not investigated any plausible rivals. As noted in the field of education research, ‘the use of qualitative methods . . . can be particularly helpful in ruling out alternative explanations . . . [and] can enable stronger causal inferences’ (Shavelson and Towne, 2002: 109). For a case study evaluation, the most common rivals might be the existence of: an initiative similar to or overlapping with the intervention being evaluated; a salient policy shift not related to the intervention; or some other identifiable influence in the contextual environment.

However, beyond being identified as an integral and critical part of doing an evaluation, the operational procedure for making comparisons with plausible rival explanations has received little attention. Explicit procedures are needed to deal with how and
whether the acceptance or rejection of rivals meets such benchmarks as being ‘acceptable,’ ‘weak,’ or ‘strong,’ or even how to distinguish between a plausible rival and a mere red herring. In addition, the operational steps involved in comparing the rival findings with those related to the main hypothesis may be intricate and may benefit from being represented as formal designs. To this extent, the use of plausible rival explanations remains an extremely promising but still underdeveloped procedure for strengthening the validity of case study evaluations.

Triangulation

Triangulation presents a similar situation. The principle has been long understood (e.g. Denzin, 1978; Jick, 1979), with at least four types of triangulation being possible: (1) data source triangulation, (2)
  • 9. other kinds of triangulation: theory/perspective triangulation.) Many case study evaluations, especially those focusing on broad or complex interventions, can involve a combination of two or more methods. When these methods are purposely designed to collect some overlapping data, the possibility for triangulation certainly exists and, if the results are convergent, greater confidence may be placed in the evaluation’s overall findings. Similarly, con- vergence over the examination of causal relationships will strengthen the evaluation’s internal validity. At the same time, operational procedures for carrying out triangulations also have received little attention. No benchmarks exist to define when triangulation might be considered ‘strong’ or ‘weak’ or ‘complete’ or ‘incomplete.’ Similarly, sufficient triangulation might involve an intricate number of steps that need to be represented as formal designs. The ultimate goal, as with making compari- sons with plausible rival explanations, calls for a common procedure that can be routinely adopted and used by many if not all case study evaluations. Logic models Case study evaluations frequently use logic models, initially to express the theoretical causal rela- tionships between an intervention and its outcomes, and then to guide data collection on these same topics. The collected data can be analyzed by comparing the empirical findings with the initially stipulated theoretical relationships, and a match between the empirical and the theoretical adds to
  • 10. the support for explaining how an intervention produced (or not) its outcomes. The practice of using logic models in evaluations has again been understood for a lengthy period of time (e.g. Wholey, 1979). Nevertheless, although the practice of using logic models has become quite common, little has occurred to sharpen their use and strengthen their role. For instance, a major shortcoming derives from the coincidentally graphic similarities between logic models and flow charts. Both are usually expressed as a sequence of boxes. In the case of the logic model, the boxes represent the key steps or events within an intervention and then between the intervention and its outcomes. Graphically, the boxes are then connected by arrows that identify the links between and among the events. Unfortunately, most evaluations collect data about the boxes, but nearly no data about the arrows. Yet they represent the flow of transitional or causal conditions, showing or explaining how one event (box) might actually lead to another event (a second box). One possible reason for such negligence is that transitional data are irrelevant in flow charts, which only represent the shifting from one task to another, but without implying any causal relationship. For logic models not having any transitional data, only a correlational analysis can be conducted, reducing the causal value (and validity) of the entire exercise. Future studies could again investigate ways of improving the use of logic models. Summary
Case study evaluations need to continue to confront the challenge of strengthening validity. Several known methodological practices accept rather than avoid the necessary underlying assumption that
  • 12. tion literature, interprets case study generalization as an effort to generalize from a small number of cases to a larger population of cases (e.g. Byrne, 2013; Ragin, 2009; Seawright and Gerring, 2008; and Woolcock, 2013). The common quest has been, first, to establish a sufficiently precise definition of the ‘case’ being studied (if not at the outset of a case study at least by its conclusion), and then to (retrospectively) define the broader population of relevant cases. The process mimics the conventional sampling procedure but can fail for two reasons. First, the difficulties of selecting the initial case(s) usually mean that the case(s) being studied do not represent a known, much less random sample from the larger set of cases. An additional and circular problem involves not fully understanding the case or having sufficient data for selection purposes to be able to define the potential population of cases; but, without knowing the popula- tion, not being able to define fully the nature of the sampled case(s) to be studied. Second, if a study genuinely takes advantage of the case study method − that is, by probing a case and its context in-depth − the study will likely only be able to include a small number of cases. In fact, the classic case study, as well as many case study evaluations, is usually limited to only a single case. The goal of understanding a case and its context, potentially over a meaningful period of time, is sufficiently engrossing that, even if thick description (Geertz, 1973) is not the end result, a case study will just not be able to cover more than a small number of cases. The only way of
  • 13. increasing the number of cases to some substantial level would mean sacrificing the in-depth and contextual nature of the insights inherent in using the case study method in the first place. Analytic generalization Instead of pursuing the sample-to-population logic, analytic generalization can serve as an appro- priate logic for generalizing the findings from a case study (e.g. Bromley, 1986: 290–1; Burawoy, 1991: 271–87; Donmoyer, 1990; Gomm et al., 2000; Mitchell, 1983; and Small, 2009).4 By ana- lytic generalization is meant the extraction of a more abstract level of ideas from a set of case study findings − ideas that nevertheless can pertain to newer situations other than the case(s) in the origi- nal case study. For case study evaluations, the analytic generalization should aim to apply to other concrete situations and not just to contribute to abstract theory building. The desired analytic generalization also should go beyond serving only as a ‘working hypothesis’ (e.g. Cronbach, 1975) − that is, one in need of further study rather than being ready to be generalized or applied to new situations. This shortcoming is not easily overcome. However, carefully linking an 326 Evaluation 19(3) analytic generalization to the related research literature by identifying overlaps as well as gaps will help. Replication of the same findings by conducting a second
  • 14. or third case study (e.g. Yin, 2014: 57−9) can strengthen the generalization even further. Eventually, the ideal generalization may extend not only to other ‘like’ cases but also ‘apply to many different types of cases’ (Bennett, 2004: 50). This manner of generalizing is not peculiar to doing case studies but is in fact analogous to the way that generalizations are made in doing experiments. Thus, the selection and conduct of an experiment derives from the goal of developing fresh data about some initially hypothesized conditions − or about discovering a totally new condition − but not from being a sample of some known, larger population of like experiments.5 Case study research follows a similar motive (Yin, 2014: 44). One of the six assembled articles (Mookherji and LaFond, 2013) demonstrated the development of analytic generalizations in considerable detail. The study examined the ‘initiatives and pro- cesses [that were] actually “driving” the improvements in routine immunization [projects in three African countries]’ (Mookherji and LaFond, 2013: 288). A critical analytic step occurred after the data had been collected: the identification of the varied drivers in each of the case studies, followed by a cross-case synthesis of how the case-specific drivers fell into six categories, each representing one of six (conceptually) common drivers (see Table 1 of their article). Based on these and other cross-case findings, Mookherji and LaFond formulated a comprehen- sive framework depicting the flow of pre-conditions, contextual
  • 15. conditions, and drivers (see Figure 4 of their article). The framework, now empirically derived, explains how and why immunization projects can succeed. In the authors’ view, it became the basis for generalizing the results from their evaluation to other districts in other African countries (p. 22). (By inspecting the framework closely, a reader might even speculate that the framework can pertain to immunization projects outside of that region − or even to the design of community health initiatives more broadly.) Mookherji and LaFond’s example shows how analytic generalization offers improved ways of generalizing from case study evaluations. An additional line of thinking that builds on the impor- tance of analytic generalization is described next. The role of ‘theory’ in making analytic generalizations Mookherji and LaFond rightfully regarded their framework as expressing a theory of change (p. 23). One way to have further strengthened their framework would have been to connect it to the extant literature, which contains a considerable body of work on the locally decentralized service delivery conditions and the local partnering arrangements central to their framework. The authors might then have been able to discuss how their case study contributed (or not) to new knowledge about health interventions, and whether their findings were limited to immunization projects or could be applied to community health projects more generally. In essence, the desired analytic generalization should present an explanation of how and why
  • 16. the initiative being evaluated produced results (or not) − or, for non-evaluation studies, how and why the studied events occurred (or not). In this latter regard, two other examples are worth noting. The first is Graham Allison’s well-known single-case study on the Cuban missile crisis (Allison, 1971; Allison and Zelikow, 1999). The case study has for over 30 years been a best-seller in the field of political science because of its analytic generalizations and implications for a broad array of international relationships. The second example (also illustrating how a detailed single-case study can be published in a leading academic journal, even given its page-length limitations) examined how the Croatian gov- ernment represented the country’s past, present, and future in the aftermath of the wars of Yugoslav Yin: Validity and generalization in future case study evaluations 327 secession (Rivera, 2008). The wars had left a reputation- damaging effect, threatening Croatia’s highly valued tourist industry. The case study showed how, in response, the government reframed the country’s past by omitting the war in its representations of national history, re-positioning the country as more closely sharing a history and culture with its Western European neighbors. The explanation for these findings then drew from a prevailing theoretical framework, in which the author innovatively extended Erving Goffman’s well-regarded work on stigma and the manage-
  • 17. ment of ‘spoiled identity’ from the individual to the institutional realm (Goffman, 1963). The author concluded by claiming that the analytic generalization had applicability to other situations of collective memory and cultural sociology. Summary The preferred manner of generalizing from case studies and case study evaluations is likely to take the form of making an analytic or conceptual generalization, rather than of reaching for a numeric one. The desired generalization should present an explanation for how an evaluated initiative pro- duces its results (or not). The explanation can be regarded as a theory of sorts − certainly more than a set of isolated concepts − and therefore yield a better understanding of an intervention and its outcomes. Whether such an explanation is based on a theory that emerged for the first time from a case study or had been entertained in hypothetical form prior to the conduct of the case study, researchers need to connect the theory to the extant literature, or alternatively, to use their findings to explain the gaps and weaknesses in that literature. By doing so, the generalizations from a single case study can be interpreted with greater meaning and lead to a desired cumulative knowledge. Finally, replications of the original case study also help. At the same time, the strongest empirical foundation for these generalizations derives from the close-up, in-depth study of a specific case in its real-world context.6 Such a condition usually limits the number of cases that can be studied. In turn, such a limitation precludes applying the conven-
  • 18. tional numeric, or sample-to-population generalizations when doing case studies. If, in contrast, an evaluation genuinely has an overarching goal of establishing or estimating numeric relationships, doing a case study evaluation might not be the preferred method to satisfy such a goal. Still more learning The present article’s treatment of validity and generalization suggests ways that case study evalu- ations can gain from methodological studies yet to be done. These studies need to focus on case study practices to strengthen future case study evaluations. In this sense, there is still more learning to be done. Discussed next are three topics connected to validity and generalization that represent priorities for the desired methodological studies. Noting carefully the nature of the initial evaluation questions Perhaps the most important inquiry points to the very start of a case study evaluation − its evalua- tion questions. These questions have serious implications for the remainder of the case study. However, many case study evaluations may not be attending carefully to the way that these ques- tions are posed. How best to pose these questions, therefore, should be a high priority for future investigation. Such studies could be quite straightforward, for example, conducting a meta-analysis of completed evaluations, deliberately covering a variety of forms of questions and types of evalu- ation methods.
  • 19. 328 Evaluation 19(3) The studies might initially assume that the desired questions for case study evaluations, as with case study research more generally, should be cast as ‘how’ or ‘why’ questions (Yin, 2014: 10−11). Such questions implicitly direct attention to events and actions over time, including but not limited to causal processes (and therefore not restricted to explanatory case study evaluations but also embracing descriptive ones). The strength of the subsequent case study would be its ability to examine the relevant events and actions in all their complexity, even if re-creating a contemporary period of time retrospectively. ‘How’ and ‘why’ questions, for instance, highlighted the seven questions posed in doing each of the three country case studies in Mookherji and LaFond’s article (2013: 289). Unfortunately, many evaluations, including those dealing with international development, totally ignore ‘how’ and ‘why’ questions and start with ‘what’ or ‘to what extent’ questions. The ‘what’ questions seek to identify the specific conditions associated with a successful (or not) inter- vention. Moreover, these conditions are sometimes expressed as single ‘present-absent’ variables, even when a condition, such as decentralization, is entirely too complex to be treated in this man- ner. Nevertheless, note that − assuming the availability of sufficient data − regressions, factor analyses, and other quantitative models can readily support the identification process. Furthermore, the models can more than capably demonstrate the potency of a
  • 20. targeted condition by controlling for competing conditions or showing how sets of conditions interact. Likewise, if properly addressed, the ‘to what extent’ questions beg for a numeric, not explanatory or even descriptive response. When the initial evaluation questions appear to favor methods other than case studies, attempts to conduct case study evaluations in spite of these questions may lead to tough sledding for the ensuing case study. First, validity questions may arise about the sample of cases selected, the avail- ability of counterfactual conditions, and the metrics used to assess the ‘extent’ in the phrase ‘to what “extent”.’ Most commonly, to address the ‘to what extent’ questions, a case study evaluation will have to resort to the use of Likert scales and then query respondents or analysts. Yet, such a maneuver can raise even more uncertainties about the sample and implicit biases of the respond- ents or analysts who were queried. By addressing the less preferred form of questions, however, the greatest loss may be a case study’s inability to arrive at any generalizations. For instance, the ‘what’ questions may lead to no particular theoretical framework other than a correlative one, making analytic generaliza- tions difficult. Depending upon the number of cases, numeric generalizations about the fre- quency or combination of the ‘whats’ may be tenuous from any conventional quantitative standpoint. Overall, future inquiries should aim to yield a better
  • 21. understanding of how an evaluation’s initial questions can imply certain preferences in selecting the methods for an evaluation. An important hypothesis to be entertained is that the form of these questions dictates whether a case study (or other evaluation method) should be used in the first place (e.g. Shavelson and Towne, 2002: 99−108). Extending this challenge into a slightly more controversial realm, a somewhat more compli- cated situation surfaces when evaluations are initially driven by the ‘realist’ framework of ques- tions − ‘what works for whom, when, where, and why?’ (Woolcock, 2013: 245; Betts, 2013: 256). This common framework, appearing in many evaluations and evaluation programs (international and otherwise), leads to the impression that a short or at least manageable list of conditions can eventually be identified. Moreover, the ‘whom, when, where, and why’ portion of the framework leaves the impression that the responses will identify a set of constraining and enabling conditions related to generalizing to other situations. Yin: Validity and generalization in future case study evaluations 329 However, the complexity of an intervention and its context may yield such a large number of conditions, not to speak of their distinctiveness or uniqueness, that they cannot be itemized in any practical way. Even if successfully itemized, the likely analytic tool may again be a
correlative one, not a case study. Thus, future studies should deliberately examine the implications of using the evaluation questions deriving from a realist framework − at a minimum examining whether a useful procedure might be for a new study to speculate about the kind and length of the likely items before deciding whether to proceed with a case study or some alternative method.

Revisiting the ‘complexity’ of interventions

A second priority topic covers the presumed complexity of an intervention and how it appears to influence the choice between case study evaluations and other evaluation methods. Many evaluations, as well as the present article, portray ‘complexity’ as an important feature justifying the use of case studies. The usual context for making this choice is a comparison to experiments, which in their classic form mainly focus on the relationship between a single cause and a single effect at a time (Befani, 2013: 270; Byrne, 2013: 220). However, instead of relying on a comparison with experiments, a better justification for proceeding with a case study evaluation might require a sharper definition of what makes an intervention complex. Some interventions may consist of a number of components that have complex relationships. These types of interventions and this type of complexity may nevertheless be highly amenable to methods other than case studies (e.g. an economic-based study of a housing intervention). Simply stipulating that complex interventions warrant the use of the case study method might appear to be
naive if not offensive to analysts familiar with the alternative methods, which in fact can cover certain kinds of complexity quite well (again, regression models, structural equation models, and the like come readily to mind). Instead, the desired future studies should explicitly define the conditions associated with the ‘complexity’ of the interventions that appear to favor case studies. Several of the six articles in this issue have begun to define these conditions, and future methodological work could usefully build on this foundation. For instance, an initially relevant characteristic of complexity can involve interventions having multiple causes and effects. Moreover, the intervention may be ‘quite distal from the outcomes and impacts of interest’ (Mookherji and LaFond, 2013: 285). Complexity also may mean understanding interventions in their totality, not ‘in terms of their components’ (Byrne, 2013: 218). Finally, Woolcock suggests that interventions can vary according to their causal density: those having a high causal density might trigger a case study evaluation (Woolcock, 2013: 237−39). According to Woolcock, density reflects four conditions: (1) the number of required person-to-person transactions, (2) the amount of discretion exercised by front-line implementing agents, (3) the pressure on the agents to respond to distracting conditions, and (4) whether the agents’ solutions come from a known menu or whether the agents need to innovate. In contrast, interventions with low causal densities may be physical development projects having known technological solutions,
such as building roads, providing proper sanitation and electricity, building schools, and administering vaccinations (Andrews et al., 2012) − and for which other evaluation methods may be entirely appropriate. In summary, future studies should examine the importance of describing the actual features associated with the labeling of an intervention as ‘complex’, rather than relying on the use of the label alone.

Making the awareness of case study evaluation methods a higher priority

A third priority topic sits at a higher plane than the first two − and may be more difficult to pursue. Although case study evaluation methods have advanced over the years, progress has been slow (e.g. Yin, 2000a). Some key topics such as triangulation and the use of rival explanations, as previously discussed in this article, still appear to be underdeveloped and await further investigation and elaboration in order to become potent routines. One possible explanation for the lack of progress is that articles whose main concerns deal with case study evaluations paradoxically begin with a fairly elaborate discussion of non-case study methods, such as the experimental method. The effect of these lengthy and occasionally apologetic discussions may be to displace a systematic and more thorough
canvassing of the potentially relevant case study methods. The desired canvassing would increase awareness by justifying why some case study practices but not others are to be employed in a planned evaluation. As an example, an initial discussion on rival explanations might cite the relevant literature, show how rivals had been incorporated (or not) in previous studies, and then indicate how rivals are to be used (or not) in the design of the planned evaluation. Rival explanations were mentioned only once in the six assembled articles (see Vellema et al., 2013). Taking analytic generalization as a second example, the creation of some typology of analytic generalizations, along with the operational procedures for deriving each type, would represent a greater advance than has been experienced during the past couple of decades. For example, Halkier (2011) suggests three forms of analytic generalization and offers procedures for examining them in empirical studies: (1) ideal-typologizing, (2) category zooming (depth on a single point), and (3) positioning (the reflection of multiple voices and discourse). Again, if an upcoming case study evaluation were initially to discuss the previous use of analytic generalization, even as a candidate but then rejected practice, the study still could be building important methodological lessons. In summary, a more systematic canvassing should concentrate on case study methods. These could include rivals, analytic generalization, and other practices not even touched upon in the present article (e.g. case selection, the distinction between proximal and distal causes, the mixture of
case study and other methods in the same evaluation, yet other ways of generalizing, or parsing contextual conditions rather than leaving them as an amorphous entity as they now are). Only in this way might newer contributions emerge, accelerating progress in strengthening future case study evaluations. Now that would be some kind of learning.

Funding

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

Notes

1. The present article is not intended to be a review of any sort of the assembled articles, nor did the present author attend the May 2012 workshop.
2. The brevity of this article precludes discussing a related type of validity − construct validity (e.g. Yin, 2014: 46−7).
3. Whether using QCA or not, the sequence of the within-case analysis preceding the between-case analysis − rather than starting an analysis by estimating the cross-case averages for specific variables − is critical for preserving the integrity of the individual cases in properly doing any multiple-case study (Yin, 2014: 164−7).
4. The brevity of this article precludes discussing potentially related kinds of generalizing, such as case-to-case transferability, whose strength depends on the similarity of the sending and receiving contexts (Lincoln and Guba, 1985: 297).
5. Regarding this contrast with a sample-population mode of generalizing from experiments, whether research experiments should admit to involving a well-defined sample of human subjects, and therefore be limited to generalizing only to the fuller population of similar people rather than standing for ‘the norm for all human beings’ (Prescott, 2002: 38), has been the topic of continuing debate in psychology. The debate started because of the over-reliance on college sophomores serving as subjects in behavioral research, now augmented by the realization that most subjects have been white males from industrialized countries (Henrich et al., 2010).
6. Ethnographic methods are usually associated with the desire to study phenomena in a real-world, up-close, and in-depth manner (e.g. Emerson, 2001). However, many ethnographies shy away from developing the theoretical insights and ideas needed to make analytic generalizations. The predilections of this kind of ethnography should therefore be considered carefully before adopting the ethnographic method to do the fieldwork in a case study evaluation.

References

Allison GT (1971) Essence of Decision: Explaining the Cuban Missile Crisis. Boston, MA: Little, Brown.
Allison GT and Zelikow P (1999) Essence of Decision: Explaining the Cuban Missile Crisis, 2nd edn. New
York: Addison Wesley Longman.
Andrews M, Pritchett L and Woolcock M (2012) Escaping capability traps through problem-driven iterative adaptation (PDIA). Working Paper 299. Washington, DC: Center for Global Development.
Befani B (2013) Between complexity and generalization: addressing evaluation challenges with QCA. Evaluation 19(3): 269–83.
Bennett A (2004) Testing theories and explaining cases. In: Ragin CC, Nagel J and White P (eds), Workshop on Scientific Foundations of Qualitative Research. Arlington, VA: National Science Foundation, 49−51.
Betts J (2013) Aid effectiveness and governance reforms: applying realist principles to a complex synthesis across varied cases. Evaluation 19(3): 249–68.
Bromley DB (1986) The Case-Study Method in Psychology and Related Disciplines. Chichester: Wiley.
Burawoy M (1991) The extended case method. In: Burawoy M et al. (eds), Ethnography Unbound: Power and Resistance in the Modern Metropolis. Berkeley: University of California Press, 271−87.
Byrne D (2013) Evaluating complex social interventions in a complex world. Evaluation 19(3): 217–28.
Campbell DT (2014) Foreword. In: Yin RK, Case Study Research: Design and Methods. Thousand Oaks, CA: SAGE, xvii−xviii.
Creswell JW and Plano Clark VL (2007) Designing and Conducting Mixed Methods Research. Thousand
Oaks, CA: SAGE.
Cronbach LJ (1975) Beyond the two disciplines of scientific psychology. American Psychologist 30: 116–27.
Denzin NK (1978) The Research Act: A Theoretical Introduction to Sociological Methods, 2nd edn. New York: McGraw-Hill.
Donmoyer R (1990) Generalizability and the single-case study. In: Eisner EW and Peshkin A (eds), Qualitative Inquiry in Education: The Continuing Debate. New York: Teachers College, 175−200.
Emerson RM (ed.) (2001) Contemporary Field Research: Perspectives and Formulations, 2nd edn. Prospect Heights, IL: Waveland Press.
Erickson F (2012) Comments on causality in qualitative inquiry. Qualitative Inquiry 18: 686−8.
Geertz C (1973) The Interpretation of Cultures. New York: Basic Books.
Goffman E (1963) Stigma: Notes on the Management of Spoiled Identity. New York: Prentice-Hall.
Gomm R, Hammersley M and Foster P (2000) Case study and generalization. In: Gomm R, Hammersley M and Foster P (eds), Case Study Method. London: SAGE, 98−115.
Halkier B (2011) Methodological practicalities in analytic generalization. Qualitative Inquiry 17: 787−97.
Henrich J, Heine SJ and Norenzayan A (2010) The weirdest people in the world? Behavioral and Brain Sciences 33: 61–83.
Jick TD (1979) Mixing qualitative and quantitative methods: triangulation in action. Administrative Science
Quarterly 24: 602−11.
Lincoln YS and Guba E (1985) Naturalistic Inquiry. Thousand Oaks, CA: SAGE.
Maxwell JA (2004) Using qualitative methods for causal explanation. Field Methods 16: 243−64.
Maxwell JA (2012) The importance of qualitative research for causal explanation in education. Qualitative Inquiry 18: 655−61.
Miles M and Huberman M (1994) Qualitative Data Analysis: A Sourcebook for New Methods. Thousand Oaks, CA: SAGE.
Mitchell JC (1983) Case and situation analysis. Sociological Review 31: 187–211.
Mookherji S and LaFond A (2013) Strategies to maximize generalization from multiple case studies: lessons from the Africa routine immunization system essentials (ARISE) project. Evaluation 19(3): 284–303.
Patton M (2002) Qualitative Research and Evaluation Methods, 3rd edn. Thousand Oaks, CA: SAGE.
Prescott HM (2002) Using the student body: college and university students as research subjects in the United States during the twentieth century. Journal of the History of Medicine 57: 3–38.
Ragin CC (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley, CA: University of California Press.
Ragin CC (2000) Fuzzy Set Social Science. Chicago: University of Chicago Press.
Ragin CC (2009) Reflections on casing and case-oriented research. In: Byrne D and Ragin CC (eds), The Sage Handbook of Case-based Methods. London: SAGE, 522−34.
Rivera LA (2008) Managing ‘spoiled’ national identity: war, tourism, and memory in Croatia. American Sociological Review 73: 613−34.
Rosenbaum PR (2002) Observational Studies, 2nd edn. New York: Springer.
Seawright J and Gerring J (2008) Case selection techniques in case study research: a menu of qualitative and quantitative options. Political Research Quarterly 61: 294−308.
Shavelson RJ and Towne L (eds) (2002) Scientific Research in Education. Washington, DC: National Academy Press.
Small ML (2009) ‘How many cases do I need?’ On science and the logic of case selection in field-based research. Ethnography 10: 5–38.
Stufflebeam DL and Shinkfield AJ (2007) Evaluation Theory, Models, and Applications. San Francisco, CA: Jossey-Bass.
Teddlie C and Tashakkori A (2009) Foundations of Mixed Methods Research: Integrating Quantitative and Qualitative Approaches in the Social and Behavioral Sciences. Thousand Oaks, CA: SAGE.
Vellema S, Ton G, de Roo N and van Wijk J (2013) Value chains, partnerships and development: using case
studies to refine programme theories. Evaluation 19(3): 304–20.
Wholey J (1979) Evaluation: Performance and Promise. Washington, DC: The Urban Institute.
Woolcock M (2013) Using case studies to explore the external validity of ‘complex’ development interventions. Evaluation 19(3): 229–48.
Yin RK (2000a) Case study evaluations: a decade of progress? In: Stufflebeam DL, Madaus GF and Kelleghan T (eds), Evaluation Models: Viewpoints on Educational and Human Services Evaluation, 2nd edn. Boston, MA: Kluwer, 185–93.
Yin RK (2000b) Rival explanations as an alternative to ‘reforms as experiments’. In: Bickman L (ed.), Validity & Social Experimentation: Donald Campbell’s Legacy. Thousand Oaks, CA: SAGE, 239−66.
Yin RK and Davis D (2007) Adding new dimensions to case study evaluations: the case of evaluating comprehensive reforms. New Directions for Program Evaluation: Informing Federal Policies for Evaluation Methodology 113: 75−93.
Yin RK and Ridde V (2012) Théorie et pratiques des études de cas en évaluation de programmes. In: Ridde V and Dagenais C (eds), Approches et pratiques en évaluation de programmes, 2nd edn. Montreal: University of Montreal Press, Chapter 10.
Yin RK (2014) Case Study Research: Design and Methods, 5th edn. Thousand Oaks, CA: SAGE.

Robert K. Yin is President of the COSMOS Corporation and has consulted extensively on the use of case
study evaluations for many clients, including the United Nations Development Programme and The World Bank. He has published extensively on case study methods: the 3rd edition of Applications of Case Study Research was published in 2012, and the 5th edition of Case Study Research: Design and Methods has just been published with a 2014 copyright date.