The use of LISREL in validating marketing constructs *
Jan-Benedict E.M. Steenkamp
and Hans C.M. van Trijp
Department of Marketing and Marketing Research, Wageningen
University, Hollandseweg I, 6706 KN Wageningen, Netherlands
An expository overview of the use of LISREL in validating
marketing constructs is presented and its advantages over the
“traditional approaches” are demonstrated. LISREL’s contribution
in all phases of construct validation is discussed. It is
shown that LISREL has much to offer in purifying the measure
by testing the unidimensionality of the measurement instru-
ment, and in cross-validation to investigate the convergent
validity (within a method) and reliability. LISREL allows a
rigorous assessment of the stability of the construct and its
measurement instrument, and it is a powerful methodology for
assessing convergent validity across methods, discriminant
validity, and nomological validity. The use of LISREL in con-
struct validation is empirically illustrated by the analysis of
data concerning consumers’ variety seeking tendency with re-
spect to foods.
1. Introduction
The validity of constructs is a necessary
condition for theory development and testing
and, therefore, construct validity lies at the
very heart of scientific progress in marketing.
Construct validity is the degree to which a
construct achieves empirical and theoretical
* This paper was written while Professor Steenkamp was on
sabbatical leave at the Department of Marketing, The Penn-
sylvania State University, supported by a fellowship from
the Niels Stensen Foundation. The research was partially
supported by a grant from ECOZOEK to Mr van Trijp for the
project “Variety Seeking in Consumer Behavior”. The
authors thank Professors Richard P. Bagozzi (University of
Michigan), Hans Baumgartner (The Pennsylvania State Uni-
versity), and James C. Anderson (Northwestern University),
and two anonymous reviewers for their valuable comments
on earlier drafts of the paper. The usual disclaimers apply.
meaning (Bagozzi, 1980; Peter, 1981). In the
literature, the following criteria have been
proposed that should be satisfied for con-
struct validity to be achieved: (1) unidimen-
sionality, (2) within-method convergent valid-
ity, (3) reliability, (4) stability, (5) across-
method convergent validity and discriminant
validity, and (6) nomological validity (Nunn-
ally, 1978; Churchill, 1979; Bagozzi, 1980,
1981a,b; Judd et al., 1986; Gerbing and
Anderson, 1988). There is an implied
hierarchy among the construct validation
criteria from the easiest to attain to the more
complex and difficult to satisfy. Earlier
criteria should be satisfied before going to
later criteria. Convergent validity is men-
tioned twice, first for multiple applications of
the same method, and subsequently for in-
struments using dissimilar methods (cf.
Bagozzi, 1981a; Phillips and Bagozzi, 1986).
In order to draw meaningful conclusions
about the convergent validity across methods,
the convergent validity of a set of measures
using the same method should first be established. 1
Many researchers still use rather straightforward
techniques such as coefficient α, exploratory
factor analysis, and bivariate correlations
to assess the criteria for construct
validity (see, e.g., Seymour and Lessne, 1984;
Zaichkowsky, 1985; Parasuraman et al., 1988).
While these traditional techniques are val-
uable, the emergence of covariance structure
Intern. J. of Research in Marketing 8 (1991) 283-299
North-Holland
1 A case in point is “key informant” research where multiple
informants (serving as methods) rate multiple items. It only
makes sense to assess the extent to which the key informants
agree when it has first been established that the set of items
the informant has rated converge to a common construct
(Phillips and Bagozzi, 1986).
0167-8116/91/$03.50 © 1991 - Elsevier Science Publishers B.V. All rights reserved
models and the widespread availability of
accompanying computer programs such as
LISREL (Jöreskog and Sörbom, 1988) provide
the researcher with a powerful tool for more
detailed assessment and refinement of the
construct validity of marketing measurement
instruments. Basically, there are two primary
advantages of LISREL over the traditional
methods, pertaining to the construct’s em-
pirical and theoretical meaning. 2 First, it
provides a test of the theoretical structure of
the measurement instrument, i.e., the rela-
tionship of the construct with its measures.
Second, the relationships between the con-
struct and other constructs can be tested
without the bias that measurement error in-
troduces. Both advantages are relevant for
theory building in marketing science as well
as in applied settings where unbiased esti-
mates of the measure’s reliability, stability,
and validity are also of great importance.
In the last decade, a considerable number
of studies have been published, dealing with
the use of LISREL in construct validation.
These studies usually concentrate on one is-
sue, are often highly technical, and have ap-
peared scattered through the psychological,
sociological, marketing, educational, and eco-
nomic literature. Thus, the literature is rather
fragmented. The purpose of this paper is to
present an overview of the use of LISREL in
validating marketing constructs and to il-
lustrate its application empirically. The ad-
vantages of the “LISREL approach” over the
“traditional approaches” will be discussed. It
should be noted beforehand that the analysis
of multitrait-multimethod (MTMM) matrices
with LISREL will not be explored in great
2 In this paper, “LISREL” and “covariance structure model”
are used interchangeably. Although other programs exist for
parameterizing covariance structure models (e.g., EQS, Bentler,
1985), and some limited tests in confirmatory factor
analysis can be conducted with computer packages such as
SAS or SPSSX, LISREL appears to be by far the most widely
adopted program in marketing and the social and managerial
sciences.
detail, and no empirical illustration is pro-
vided. This would increase the length of the
paper substantially while excellent expository
papers on this subject have recently appeared
in the marketing literature (Lastovicka et al.,
1990; Bagozzi and Yi, 1991).
The following section briefly discusses the
method employed in the empirical study to
illustrate the role of LISREL in construct validation.
The empirical illustration concerns the
development of a measurement instrument
for consumers’ variety seeking tendency with
respect to foods. In Sections 3 through 7, the
use of LISREL for assessing various aspects of
construct validity is reviewed. In these sec-
tions, first the traditional approach is briefly
described, next it is shown how LISREL yields
additional insights, and finally its application
is demonstrated empirically. Section 8 con-
tains the conclusions.
2. Method
2.1. Variety seeking tendency
The issue of variety seeking has recently
gained more widespread attention among
marketers (see, e.g., McAlister and Pessemier,
1982; Givon, 1984, 1985; Lattin and McAlis-
ter, 1985; Kahn et al., 1986). In studying
variety seeking in the consumption context,
the distinction between the trait of con-
sumers’ variety seeking tendency and variety
seeking behavior is of great importance (cf.
Midgley and Dowling, 1978). Variety seeking
behavior refers to those observable aspects of
consumer behavior that are motivated by the
desire for change itself and reflects the ob-
servation that consumers derive utility from
the consumption of many different items per
se, in addition to the characteristics levels
provided by these items (Givon, 1984;
Wierenga, 1984).
The underlying trait of consumers’ variety
seeking tendency has been related to a more
global concept, known as Optimal Stimula-
tion Level (OSL) to explain why consumers
derive utility from variation per se (see, e.g.,
Berlyne, 1960, 1963; Fiske and Maddi, 1961;
Raju, 1980). The concept of OSL states that
individuals have an idiosyncratic intermediate
level of stimulation that is most preferred by
them. Several measurement instruments have
been proposed for this personality character-
istic, Zuckerman’s (1979) Sensation Seeking
Scale being the best known among them. Be-
havior aimed at modifying stimulation from
the environment into correspondence with the
optimal level is termed exploratory behavior.
Variation in behavior is one of the sources
that contributes to the actual level of stimula-
tion consumers are experiencing (Fiske and
Maddi, 1961). As such, variety seeking behav-
ior is conceived of as a specific manifestation
of exploratory behavior (see, e.g., Raju, 1980)
in that it may serve as a means of bringing a
suboptimal level of actual stimulation into
closer correspondence with the optimal level.
Utility derived from variety seeking behavior
is due to the behavior’s contribution to the
actual level of stimulation.
Consumers’ variety seeking tendency is related
to OSL but differs from it in that it only
refers to stimulation regulation through varied
product consumption. Variety seeking tend-
ency is not regarded as a generalized per-
sonality trait such as OSL, but as a domain-
specific concept. A consumer might be a
variety seeker with respect to foods but not
with respect to vacations. This view on variety
seeking tendency accords with current psy-
chological literature suggesting that attitudes,
values, and personality variables should be
operationalized within a limited domain in
order to be related to specific behavior (Ajzen
and Fishbein, 1980; Verhallen and Pieters,
1984).
The use of LISREL in the validation of
marketing constructs will be empirically il-
lustrated on data collected for developing a
measurement instrument for variety seeking
within the domain of foods. Foods is one of
the product categories for which variety seek-
ing may be especially important (Hoyer and
Ridgway, 1984). We define the construct of
variety seeking tendency with respect to foods
as “the motivational factor that aims at pro-
viding variation in stimulation through varied
food consumption, irrespective of the instru-
mental or functional value of the product
alternatives”. It is hypothesized to be a one-
dimensional construct that influences varia-
tion in food consumption behavior. It is out-
side the scope of this paper to present a
comprehensive treatment of the literature on
variety seeking tendency and related con-
structs (see, e.g., van Trijp, 1989). In this
paper, the discussion of the empirical results
is focused on LISREL-related issues and not on
variety seeking tendency.
2.2. Analysis
Details of the data collection will be dis-
cussed in subsequent sections of the paper.
LISREL 7 (Jöreskog and Sörbom, 1988) was
used to analyze the data, where the analyses
were conducted on the covariance matrix of
the variables. The maximum likelihood esti-
mation procedure was used, unless otherwise
indicated. ML parameter estimates are rather
robust against moderate violations of the
multivariate normality assumption, provided
that the sample size exceeds about 100
(Boomsma, 1982; Gerbing and Anderson,
1985), but this is not the case for the overall
χ² test statistic and the asymptotic standard
errors (Browne, 1982, 1984a).
Browne (1984a) proposed an estimation
procedure which is asymptotically insensitive
to the distribution of the observations. How-
ever, the computational requirements of this
procedure, called Weighted Least Squares
(WLS) in LISREL 7, are formidable and a large
sample size is required to estimate the
asymptotic covariance matrix accurately.
Jöreskog and Sörbom (1986) propose the following
rule of thumb: the sample size should
be at least 200 if q < 12, and at least 1.5q(q + 1)
if q > 12, where q is the number of
items. 3 When the sample size does not meet
this requirement, as will often be the case in
construct validation studies, ML is to be preferred
to WLS (Jöreskog and Sörbom, 1988).
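The rule of thumb above can be written as a small helper. This is a minimal sketch: the function names `min_sample_size_wls` and `prefer_ml` are hypothetical, and handling q = 12 with the formula branch is an assumption, since the rule as stated covers only q < 12 and q > 12.

```python
def min_sample_size_wls(q: int) -> int:
    """Minimum sample size suggested for WLS with q items (rule of thumb)."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    if q < 12:
        return 200
    # Assumption: q == 12 is handled by the formula branch as well.
    # q * (q + 1) is always even, so 1.5 * q * (q + 1) is an integer.
    return 3 * q * (q + 1) // 2


def prefer_ml(n: int, q: int) -> bool:
    """True when the sample is too small for WLS, so ML is to be preferred."""
    return n < min_sample_size_wls(q)
```

For instance, with 11 items and 151 respondents the required minimum is 200, so `prefer_ml(151, 11)` is true and ML would be used.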
Another approach is to use ML, and to
correct the overall χ² value for deviations
from multivariate normality with respect to
kurtosis. The χ² value tends to be overestimated
when the data are leptokurtically distributed,
and underestimated when the data
are platykurtic. A corrected χ² value can be
obtained by dividing the χ², as estimated
with ML, by the multivariate coefficient of
relative kurtosis (Browne, 1984a). 4 This coefficient
can be computed with PRELIS (Jöreskog
and Sörbom, 1986). 5 If the kurtosis is
substantial, estimates of the standard errors
should be approached with caution, and t-values
should be considerably greater than
|2.0| before it can be concluded with confidence
that a coefficient is significant.
In the present empirical illustration, the
multivariate coefficient of relative kurtosis
was computed for each set of data, but the
basic conclusions remained unaltered. Therefore,
the uncorrected χ² values are reported
in this paper, unless indicated otherwise.
3 In this paper, “item” and “measure” are used interchangea-
bly to denote any type of response that is employed as an
operational measure of the construct.
4 This correction is based on the assumption of a multivariate
elliptical distribution of which the multivariate normal dis-
tribution is a special case. The proposed correction has been
found to perform well, even when the distribution of the
data diverges substantially from an elliptical distribution
(Browne, 1984a).
5 The formula for the multivariate coefficient of relative
kurtosis is:
$$\eta = \sum_{r=1}^{N} \{(x_r - \bar{x})' W^{-1} (x_r - \bar{x})\}^2 / \{N q (q + 2)\},$$
where x_r is a q × 1 vector of values for subject r, x̄ is the
corresponding q × 1 vector of sample means, q is the number
of items, W = ((N − 1)/N)S, S is the sample covariance
matrix, and N is the sample size (Browne, 1984a).
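The coefficient of relative kurtosis and the resulting χ² correction can be sketched in plain Python. This is an illustrative sketch, not the PRELIS implementation: the function names are hypothetical, the data layout (one row per subject, one column per item) is an assumption, and the small Gauss-Jordan solver stands in for a proper linear algebra routine.

```python
def solve(a, b):
    """Solve a y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def relative_kurtosis(data):
    """Multivariate coefficient of relative kurtosis (rows = subjects)."""
    n, q = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(q)]
    dev = [[row[j] - mean[j] for j in range(q)] for row in data]
    # W = ((N - 1)/N) S, i.e. the covariance matrix with divisor N.
    w = [[sum(d[j] * d[k] for d in dev) / n for k in range(q)]
         for j in range(q)]
    total = 0.0
    for d in dev:
        y = solve(w, d)                      # y = W^{-1} (x_r - mean)
        total += sum(di * yi for di, yi in zip(d, y)) ** 2
    return total / (n * q * (q + 2))


def corrected_chi2(chi2_ml, kurtosis):
    """Divide the ML chi-square by the coefficient of relative kurtosis."""
    return chi2_ml / kurtosis
```

With the figures reported later in the paper for the cross-validation sample, `corrected_chi2(53.18, 1.50)` gives about 35.45; the paper reports 35.48, and the small gap is presumably due to rounding of the kurtosis coefficient.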
3. Unidimensionality
Unidimensionality can be defined as the
existence of one construct underlying a set of
items and has been recognized as “one of the
most critical and basic assumptions of mea-
surement theory” (Hattie, 1985, p. 139). 6 The
point of departure is a pool of items, based
on review of the literature, focus group dis-
cussions, etc., that all purportedly relate to
the construct under investigation. These items
are administered to subjects for scoring, and
subsequently the pool of items is purified,
using item-total correlations, corrected for the
item in question, in order to obtain a unidi-
mensional measurement instrument. When the
construct is hypothesized to consist of several
subconstructs (or dimensions), item-total correlations
should be computed for each subconstruct
separately. 7 The dimensionality of
the reduced set of items (if applicable: per
subconstruct) may be explored with explora-
tory factor analysis. The number of items can
be further reduced by selecting only high
6 If a construct is hypothesized to consist of several subconstructs
or factors, unidimensionality refers to each of the
factors separately, i.e., each item is related only to one
subconstruct. The construct itself is multidimensional at the
first-order level.
7 One reviewer noted that by using the item-total correlation
criterion one could omit important items that for some
reason become suppressed and include irrelevant items that
somehow become inflated, as reflected in the correlations.
Further, meaningful items might be eliminated when (1) the
construct consists of several subfactors, (2) some subfactors
have only one or two items and these subfactors are weakly
correlated with the other subfactors, and (3) the researcher is
not aware of what the subfactors might be. When the
researcher is not certain about what the subfactors might be
it is prudent to apply exploratory factor analysis to the total
pool of items in order to investigate whether meaningful
subfactors are present in the data before computing item-
total correlations. One should guard, however, against re-
taining factors that have no substantive interpretation. These
issues emphasize that it is crucial to carefully delineate the
domain of the construct and to assess its operational
meaningfulness prior to the empirical investigation of a
measurement instrument. Admittedly, this is a subjective
undertaking and probably represents the weakest link in the
construct validation process (cf. Crocker and Algina, 1986).
loading items. Depending on the results, one
or more scales are usually constructed as an
unweighted sum of the scores on these high
loading items. When coefficient α of this
scale(s) is adequate, it is concluded that a
reliable, unidimensional measurement instrument
is obtained. However, coefficient α is a
measure of reliability and cannot be used to
infer unidimensionality (see Hattie, 1985, for
an extensive discussion).
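The traditional purification steps described above (corrected item-total correlations followed by coefficient α) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the data are laid out with one row per subject and one column per item, and the 0.60 cutoff is the one used in the empirical illustration of this paper, not a general rule.

```python
from math import sqrt
from statistics import variance


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(data):
    """Correlate each item with the sum of the remaining items."""
    q = len(data[0])
    result = []
    for j in range(q):
        item = [row[j] for row in data]
        rest = [sum(row) - row[j] for row in data]  # total without item j
        result.append(pearson(item, rest))
    return result


def select_items(data, cutoff=0.60):
    """Indices of items whose corrected item-total correlation exceeds cutoff."""
    return [j for j, r in enumerate(corrected_item_total(data)) if r > cutoff]


def coefficient_alpha(data):
    """Coefficient alpha of the unweighted sum of the items."""
    k = len(data[0])
    item_var = sum(variance(col) for col in zip(*data))
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - item_var / total_var)
```

As the text stresses, a high value from `coefficient_alpha` says nothing about whether the selected items are unidimensional; that requires a test of the factor structure.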
3.1. The use of LISREL in assessing unidimen-
sionality
LISREL can add to the results of the traditional
techniques by testing the unidimensionality
of a scale, and, if necessary, refining
the factor structure found in exploratory fac-
tor analysis in order to ensure unidimen-
sionality. Although LISREL could also be ap-
plied to the complete pool of data, the use of
item-total correlations and exploratory factor
analysis to reduce the set of items and to
provide preliminary scales that can subse-
quently be tested and refined with LISREL
makes the LISREL analysis more manageable
(Gerbing and Anderson, 1988). The confirmatory
factor analysis model on which the
analyses are based is
$$x = \Lambda \xi + \delta, \qquad (1)$$
where x is the q × 1 vector of the n sets of
observed variables (items), ξ is the n × 1 vector
of the underlying factors, Λ is the q × n
matrix of regression coefficients relating the
items to the underlying factors, and δ is the
q × 1 vector of error terms of the items. Restrictions
are placed on various parameters
for theory testing (and for identification purposes).
A common situation is when n = 1,
i.e., the construct is hypothesized to consist
only of a single factor. Unidimensionality implies
that in equation (1) Λ has only one
column and n = 1 (or, in case the construct is
multidimensional, and unidimensionality refers
to each subconstruct, only one element
per row of the matrix Λ is different from
zero). 8 The overall fit of the model provides
the necessary and sufficient information to
determine whether a set of items is unidimen-
sional (Kumar and Dillon, 1987).
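To make concrete what the overall fit is testing, the following sketch computes the covariance matrix implied by a one-factor model in which each item equals its loading times the factor plus an error term. It assumes uncorrelated errors that are also uncorrelated with the factor; the function name and the example values are hypothetical.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Covariance matrix implied by a one-factor model.

    Off-diagonal: loading_i * loading_j * factor variance.
    Diagonal:     loading_i ** 2 * factor variance + error variance_i.
    """
    q = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(q)]
             for i in range(q)]
    for i in range(q):
        sigma[i][i] += error_variances[i]
    return sigma


# Hypothetical two-item example.
example = implied_covariance([1.0, 0.5], 2.0, [0.5, 0.25])
```

A unidimensional set of items is one whose observed covariance matrix is reproduced acceptably by a matrix of this form; the fit statistics summarize the discrepancy between the two.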
If LISREL shows a bad fit of the model,
respecification is necessary. Examination of
standardized residuals can assist the re-
searcher in identifying items that cause the
unacceptable fit of the initial model. Standardized
residuals are the residuals from the
observed and reproduced covariance matrix
divided by their asymptotic standard errors,
and values exceeding |2.58| indicate misspecification
(Jöreskog and Sörbom, 1988).
The pattern of standardized residuals is infor-
mative for respecification. When a subset of
items has large negative standardized residu-
als (representing overfitting) with the other
items pertaining to the same factor, and large
positive residuals among each other (repre-
senting underfitting), this suggests that the
subset constitutes a separate factor. Further,
an item that is related to the wrong factor will
usually have large negative residuals with the
other items of that factor and large positive
residuals with the items of the “correct” fac-
tor. When an item has many large standar-
dized residuals (in absolute value) but no
clear pattern emerges, it appears best to delete
the item. However, standardized residuals
should be used with caution. They are calcu-
lated under the assumption of multivariate
normality, and will be biased when the
data violate this assumption. With large sam-
ples, standardized residuals could become sig-
8 When the construct consists of four or more subconstructs,
higher-order unidimensionality of the construct can be tested
by performing a second-order confirmatory factor analysis
on the covariances among the subconstructs.
nificant, simply because of the power of the
test. 9
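The screening of standardized residuals can be sketched as below. The code only counts, per item, how many standardized residuals exceed 2.58 in absolute value; interpreting the sign pattern remains a matter of judgment as described above. The function names are hypothetical, and reading only the lower triangle (excluding the diagonal) of a square matrix is an assumption about the input layout.

```python
THRESHOLD = 2.58


def large_residuals(std_resid, threshold=THRESHOLD):
    """Return (i, j, value) for each lower-triangle residual beyond threshold."""
    q = len(std_resid)
    return [(i, j, std_resid[i][j])
            for i in range(q) for j in range(i)
            if abs(std_resid[i][j]) > threshold]


def count_per_item(std_resid, threshold=THRESHOLD):
    """Number of large standardized residuals in which each item is involved."""
    counts = [0] * len(std_resid)
    for i, j, _ in large_residuals(std_resid, threshold):
        counts[i] += 1
        counts[j] += 1
    return counts


# Hypothetical residuals for three items (lower triangle filled in).
resid = [[0.0, 0.0, 0.0],
         [1.0, 0.0, 0.0],
         [3.1, -2.9, 0.0]]
```

In this hypothetical matrix the third item is involved in both large residuals, which would make it the first candidate for closer inspection.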
3.2. Application: assessing the unidimensionality
of the VST-scale
Preliminary item tryouts resulted in a set of
42 items all purportedly representing the un-
derlying construct of consumers’ variety seek-
ing tendency with respect to foods (hereafter
called VST). The items were personally administered
to a random sample of 159 female
purchasers of foods, living in two large cities.
In all phases of measurement development,
the items were scored on five-point labeled
Likert scales ranging from “completely dis-
agree” ( = 1) to “completely agree” ( = 5).
Seventeen items were selected on the basis
of a corrected item-total correlation exceed-
ing 0.60. Principal axis factoring of these 17
items yielded the following eigenvalues for
the first three factors: 9.57, 1.24, and 0.95.
The results strongly suggested a single un-
derlying factor as the first factor explained
more than 50% of the variation in the data, 16
items had loadings exceeding 0.60 on this
factor (the 17th item had a loading of 0.58),
and the plot of eigenvalues showed a distinct
scree at two factors (Cattell, 1966). The num-
ber of items was further reduced by selecting
the 11 items that had a factor loading exceed-
ing 0.70 on the first factor (cf. Armor, 1974).
Purification of the measure usually stops here.
Coefficient α would be computed (0.95 in
this case) and one would conclude that a
reliable and unidimensional 11-item measure
of VST is obtained. However, contrary to
LISREL, these techniques do not test the factor
structure of the construct. It will be shown
that the 11-item scale is not unidimensional,
despite its high reliability.
9 We thank a reviewer for bringing our attention to this point.
LISREL was applied to the covariance matrix
of the 11 items to test the unidimensionality
of the scale. The fit of the (one-construct)
model was unacceptable: χ²(44) =
167.49 (p < 0.001), χ²/df ratio = 3.81, goodness-of-fit
index (GFI) = 0.82, Tucker-Lewis
index (TLI) = 0.89. 10 The standardized residuals
indicated that the problems were caused
by three items. Nine standardized residuals,
all involving one or two of these three items,
were greater than |2.58|, and most of the
other standardized residuals involving these
items were also considerable. Further, no consistent
pattern emerged (e.g., high negative
standardized residuals with the other eight
items and high positive values among themselves).
These findings suggest that these items
are not unidimensional with respect to the
other items, but also do not constitute a separate
factor. The measurement model was
respecified by eliminating the three items.
The resulting 8-item model yielded a good fit,
indicating that a unidimensional measure for
VST was obtained: χ²(20) = 27.06 (p = 0.13),
χ²/df ratio = 1.35, GFI = 0.96, TLI = 0.99. 11
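The degrees of freedom and the ratio-based fit measures reported above can be checked with a few lines of code. This is a sketch with hypothetical function names. The degrees-of-freedom formula assumes a one-factor model with uncorrelated errors, and the Tucker-Lewis index is written in its usual form relative to a null (independence) model; the null-model χ² in the example is a made-up value, since the paper does not report it.

```python
def one_factor_df(q):
    """Degrees of freedom of a one-factor model for q items."""
    # q(q + 1)/2 distinct variances and covariances, minus q loadings and
    # q error variances (the factor's scale is fixed for identification).
    return q * (q + 1) // 2 - 2 * q


def chi2_df_ratio(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def tucker_lewis(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis index relative to a null model."""
    null_ratio = chi2_null / df_null
    return (null_ratio - chi2_model / df_model) / (null_ratio - 1.0)


# Hypothetical null-model chi-square of 1000 on 8 * 7 / 2 = 28 df.
tli_example = tucker_lewis(27.06, 20, 1000.0, 28)
```

`one_factor_df(11)` and `one_factor_df(8)` return 44 and 20, matching the degrees of freedom of the 11-item and 8-item models, and the ratios 167.49/44 and 27.06/20 round to the reported 3.81 and 1.35.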
4. Cross-validation, within-method convergent
validity, and reliability
Cross-validation of the unidimensional
measurement instrument on new data is re-
commended because there is the possibility
10 The Tucker-Lewis index (Tucker and Lewis, 1973) is less
frequently applied in marketing than the Bentler-Bonett
incremental index of fit (BBI) (Bentler and Bonett, 1980).
However, TLI is relatively independent from sample size and
incorporates a penalty function against “overfitting”, both
of which are not the case for BBI (Marsh et al., 1988;
McDonald and Marsh, 1990).
11 The eight items constituting the VST-scale are: (1) “When I
eat out I like to try the most unusual items, even if I am not
sure I would like them”; (2) “While preparing food or
snacks, I like to try out new recipes”; (3) “I think it is fun to
try out food items one is not familiar with”; (4) “I am eager
to know what kind of foods people from other countries
eat”; (5) “I like to eat exotic foods”; (6) “Items on the menu
that I am unfamiliar with make me curious”; (7) “I prefer to
eat food products I am used to” (reversed); and (8) “I am
curious about food products I am not familiar with”.
that one has capitalized on chance (Cudeck
and Browne, 1983). These data may also be
used to test the within-method convergent
validity and reliability of the measurement
instrument. Churchill (1979) implicitly sug-
gests that the new data should only be used to
cross-validate the reliability of the scale. It is
thus assumed that a set of items having a high
reliability also possesses within-method con-
vergent validity, as no separate investigation
of convergent validity is proposed. However,
a set of items can be reliable without exhibit-
ing within-method convergent validity. By in-
creasing the number of items, one can nearly
always increase reliability, even when these
items have very low and nonsignificant factor
regression coefficients. Gerbing and Ander-
son (1988) showed empirically that a mea-
surement instrument that has unacceptable
within-method convergent validity may still
be highly reliable. Thus, within-method con-
vergent validity should be achieved before
reliability is estimated (see also Bagozzi,
1981b; Bagozzi et al., 1990). 12
4.1. The use of LISREL in assessing within-
method convergent validity and reliability
A weak condition for convergent validity is
that the factor regression coefficient on a
particular item is statistically significant. A
stronger condition is that the factor regres-
sion coefficient is substantial. Hildebrandt
(1987) suggests with respect to the latter crite-
rion that the correlation between the item and
the construct should exceed 0.50. These con-
ditions should be evaluated, provided that a
12 Within-method convergent validity deals with the extent to
which multiple applications of the same method are in
agreement, while reliability is the degree to which measures
are free from random error. Thus, as one reviewer pointed
out, the two concepts are closely related. Fulfillment of the
criteria of convergent validity virtually guarantees that the
measurement instrument is reliable. However, Gerbing and
Anderson (1988) have shown that the reverse need not be
the case.
third requirement of convergent validity is
met, i.e., that the overall fit of the model is
acceptable.
Subjects’ scores on the construct are usu-
ally estimated as a weighted or unweighted
linear composite of the item scores. The reliability
of the composite, ρ_ξ, is given by (Jöreskog,
1971a; Alwin and Jackson, 1979)
$$\rho_\xi = 1 / \{1 + 1 / (\lambda' \Theta^{-1} \lambda \, \mathrm{Var}(\xi))\}, \qquad (2)$$
where λ is the p × 1 vector of factor regression
coefficients, Θ is a p × p diagonal matrix
with error variances in the diagonal, and p
refers to the items pertaining to the same
construct (p ≤ q). The most reliable composite
has item weights proportional to
λ_i/Var(δ_i) (i = 1, 2, ..., p). It is common
practice in marketing to use the unweighted
sum of the item scores as estimate of the
construct scores, but this will only maximize
construct reliability in case of equal lambda’s
and error variances (implying that each item
reflects the underlying construct to the same
extent and has the same amount of measurement
error). In that case, equation (2) can be
reduced to the formula for coefficient α. When
these conditions are not met, coefficient α
underestimates construct reliability, but the
extent of underestimation is usually slight
(Armor, 1974).
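The relation between the reliability of the optimally weighted composite in equation (2) and that of the unweighted sum can be illustrated numerically. The sketch below assumes a one-factor model with uncorrelated errors; the function names and the loadings and error variances in the usage note are hypothetical, and the unit-weight function gives the model-based reliability of the unweighted sum, which is related to but not the same as the sample coefficient α.

```python
def optimal_composite_reliability(loadings, error_variances, factor_variance=1.0):
    """Reliability of the most reliable composite, as in equation (2)."""
    # With a diagonal error matrix the quadratic form is a simple sum.
    info = sum(l * l / e for l, e in zip(loadings, error_variances))
    return 1.0 / (1.0 + 1.0 / (info * factor_variance))


def unit_weight_reliability(loadings, error_variances, factor_variance=1.0):
    """Model-based reliability of the unweighted sum of the items."""
    true_var = sum(loadings) ** 2 * factor_variance
    return true_var / (true_var + sum(error_variances))


def optimal_weights(loadings, error_variances):
    """Item weights proportional to loading / error variance."""
    return [l / e for l, e in zip(loadings, error_variances)]
```

With equal loadings and error variances, for example four items with loading 0.8 and error variance 0.36, both functions return the same value, so unit weighting loses nothing. With unequal values such as loadings (0.9, 0.5) and error variances (0.19, 0.75), the optimally weighted composite is the more reliable of the two.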
4.2. Application: cross-validation and assessment
of within-method convergent validity
and reliability of the VST-scale
The eight-item measure for VST together
with the three items that were deleted in the
previous phase were administered to a new
random sample of 151 female purchasers of
foods, living in a large metropolitan area. The
a priori hypothesis that the three “offending”
items should be deleted was confirmed. The
fit of the 11-item model was again unacceptable:
χ²(44) = 240.96 (p < 0.001), χ²/df ratio
= 5.48, GFI = 0.76, TLI = 0.76, while the fit of
the eight-item model was good: χ²(20) =
53.18 (p < 0.001), χ²/df ratio = 2.66, GFI =
0.92, TLI = 0.93.
As might be expected, the fit indices were
somewhat lower than in the previous phase
but the amount of shrinkage is small. The
value of GFI compares favorably with simula-
tion results (Anderson and Gerbing, 1984)
and TLI is above 0.90 (Bentler and Bonett,
1980). The data tended to be leptokurtic, the
multivariate coefficient of relative kurtosis
being 1.50. The corrected χ²(20) is 35.48 (p
= 0.02), and the corrected χ²/df ratio is 1.77.
Confidence in the model is further enhanced
by an analysis in which the meaning of the
construct is kept invariant by constraining the
λ’s in the cross-validation sample to be equal
to the λ’s found in the first sample (cf. Cudeck
and Browne, 1983). The following values for
the indices of fit were obtained: χ²(27) =
69.75 (p < 0.001), χ²/df ratio = 2.58, GFI =
0.90, TLI = 0.93, corrected χ²(27) = 46.53 (p
= 0.012), corrected χ²/df ratio = 1.72. The
difference in χ² between the unconstrained
and constrained model after correction for
kurtosis was a nonsignificant 11.05 (df = 7,
p = 0.132).
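The difference test between the constrained and unconstrained model can be reproduced with a short sketch. The function names are hypothetical. The upper-tail probability is computed from the closed-form series for a chi-square variate with integer degrees of freedom, which avoids any dependency outside the standard library; a statistics package would normally be used instead.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(chi / sqrt(2)) + sqrt(2 / pi) * exp(-x / 2) * total


def chi2_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Difference in chi-square, in df, and the associated p-value."""
    d_chi2 = chi2_restricted - chi2_free
    d_df = df_restricted - df_free
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


# Kurtosis-corrected values reported above for the two models.
d_chi2, d_df, p_value = chi2_difference(46.53, 27, 35.48, 20)
```

The difference is 11.05 on 7 degrees of freedom, and the resulting p-value lies above 0.10, in line with the nonsignificant result reported in the text.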
Thus, it can be concluded that the eight-
item instrument is acceptably unidimensional.
Within-method convergent validity is also
achieved as the overall fit of the model was
good, all factor regression coefficients were
highly significant (p < 0.001), and the correlation
of each item with the construct exceeded
0.50. The reliability of the construct is
an adequate 0.92, using equation (2) (coefficient
α = 0.90).
5. Stability
Probably the most widely used method for
assessing the stability of a construct is to
compute the correlation between the (un-
weighted) composite scores from two admin-
istrations of the same test to the same sub-
jects. However, random measurement error
attenuates the test-retest correlation, while on
the other hand systematic error, due to mem-
ory effects and other possible causes, may
inflate the stability estimate. Sources of insta-
bility cannot be located with the test-retest
correlation method as it does not distinguish
between the construct and its measurement
instrument.
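The attenuating effect of random measurement error on a test-retest correlation can be illustrated with the classical correction for attenuation. This is a textbook formula given here only as an illustration; it is not the LISREL procedure discussed in the next section, it ignores systematic error, and the function name and the observed correlation in the usage note are hypothetical.

```python
from math import sqrt


def disattenuate(observed_r, reliability_t1, reliability_t2):
    """Classical correction of a correlation for unreliability."""
    if not (0 < reliability_t1 <= 1 and 0 < reliability_t2 <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return observed_r / sqrt(reliability_t1 * reliability_t2)
```

For example, a hypothetical observed test-retest correlation of 0.70 between composites with reliability 0.92 at both occasions corresponds to a corrected correlation of about 0.76.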
5.1. The use of LISREL in stability assessment
LISREL enables the researcher to estimate
the stability coefficient, i.e., the correlation
between the scores on the construct at two
points in time, corrected for random and sys-
tematic error in the items, and to test a num-
ber of aspects of the factorial invariance of
the construct through a sequence of tests (Al-
win and Jackson, 1981; Marsh and Hocevar,
1985). Testing of aspects of factorial invari-
ance yields insight into possible causes of
construct instability.
As an important preliminary step, the re-
searcher should test for the existence of sys-
tematic error in the items by comparing the
fit of the unrestricted factor model, that al-
lows for correlated errors between the same
items over time, with the fit of the restricted
model that does not allow for correlated errors.
When the difference in χ² is significant,
subsequent analyses should be performed on
the correlated-errors-model. Otherwise one
can proceed with the model of no correlations
between errors. Subsequent analyses take the
form of testing the difference in fit of a
sequence of four nested models.
First, one should test for invariance of the
factor regression coefficients. This test is of
central importance to stability assessment be-
cause it tests the stability of the meaning of
the construct. If the fundamental relation-
ships between the measures and the construct
are not stable, it can hardly be argued that
the construct is stable (Alwin and Jackson,
1981). Second, test for the simultaneous in-
variance of factor regression coefficients and
error variances. Third, test for the simulta-
neous invariance of factor regression coeffi-
cients, error variances, and variances of the
latent variables.¹³ This amounts to testing
whether the reliability of the individual items
and the reliability of the construct are in-
variant over time. Fourth, test for the simul-
taneous invariance of factor regression coeffi-
cients, error variances, variances of the latent
variables, and for a stability coefficient equal
to unity. Jöreskog (1971b) suggests that one
should support earlier steps before going to
later steps. In each step, the model in ques-
tion is tested against the model of the im-
mediately preceding step.
5.2. Application: assessment of the stability of
VST
To examine the stability of VST, the eight
items were administered to 96 students during
class time. Two weeks later the eight items
were administered to the same subjects, again
during class time. Thirty-two students were
lost to attrition, leaving 59 students for stabil-
ity assessment. While this number is not large,
and a longer period between test and retest
might be preferable, the data are sufficient
for illustrative purposes. LISREL was applied
to the covariance matrix of the 16 items. The
model contained two scores for VST, one for t1
and one for t2. Due to the small number of
observations, the models were estimated using
Unweighted Least Squares (ULS) (Jöreskog
and Sörbom, 1988). The asymptotic properties
of ML only apply with large numbers, and
ML is not robust in small samples (Boomsma,
1982; Gerbing and Anderson, 1985). No distributional
assumptions are necessary with
ULS, and ULS estimates seem to be more robust
against small sample sizes.
The fit of the model that allows for the
existence of systematic error was significantly
better than the fit of the model with uncorrelated
errors across time (χ²(8) = 42.62, p <
0.001). Thus, the hypothesis that item errors
are independent from t1 to t2 should be rejected.
In this case, the upward bias in the
stability coefficient due to systematic error
was small (0.03), but may be more substantial
in other studies. The other analyses were conducted
with the correlated-errors-model. The
hypothesis that factor regression coefficients
are invariant could not be rejected (χ²(7) =
7.71, p = 0.37). This means that the meaning
of VST is stable. Neither the hypothesis of
invariance of the factor regression coefficients
as well as the error variances, nor the hypothesis
of invariance of regression coefficients,
error variances, and factor variances
could be rejected (χ²(8) = 3.84, p = 0.87, and
χ²(1) = 0.96, p = 0.34, respectively). Thus,
the reliability of the individual items and the
reliability of the construct are invariant over
time. The estimated reliability of the construct
for this sample of students was 0.92,
which is the same as reported earlier for the
cross-validation sample. The final hypothesis
stating that, in addition to the other aspects
mentioned earlier, the stability coefficient
equals unity was rejected (χ²(1) = 9.55, p <
0.01). The best estimate of VST’s stability,
using the model with correlated errors, and
lambdas, error variances, and factor variances
invariant, was a high 0.93.¹⁴
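The sequence of nested tests can be sketched in a few lines of code. The example below is an illustration, not the LISREL procedure itself: it takes the χ² differences and degrees of freedom reported above as given, recomputes the p-values with a closed-form χ² tail probability for integer degrees of freedom, and stops at the first rejected restriction. The function names, the step labels, and the 0.05 significance level are assumptions made for this sketch; recomputed p-values may differ slightly from the reported ones because the inputs are rounded.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer powers.
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def first_rejected(steps, alpha=0.05):
    """Return the label of the first step whose chi-square difference is significant, else None."""
    for label, chi2_diff, df_diff in steps:
        if chi2_sf(chi2_diff, df_diff) < alpha:
            return label
    return None


# (label, chi-square difference, df difference) as reported in the text above.
STEPS = [
    ("invariant factor regression coefficients", 7.71, 7),
    ("invariant error variances", 3.84, 8),
    ("invariant factor variances", 0.96, 1),
    ("stability coefficient equal to unity", 9.55, 1),
]

for label, chi2_diff, df_diff in STEPS:
    print(f"{label}: p = {chi2_sf(chi2_diff, df_diff):.3f}")
print("first rejected:", first_rejected(STEPS))
```

Each step is tested against the model of the immediately preceding step, so only the incremental χ² and degrees of freedom are needed.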
The simple test-retest correlation between
the composite scores was 0.81, which underestimates
the “true” stability of VST. In this
case the attenuation, due to random measurement
error, exceeds the upward bias in the
test-retest correlation, due to systematic error
in the items.
¹³ When the construct consists of more than one factor, one
should not only test for invariance in variances between
corresponding factors over time but also for invariance in
covariances between factors.
¹⁴ The fit indices also indicate that the model has a satisfactory
fit: χ²(111) = 149.86, p < 0.01; χ²/df = 1.35; GFI = 0.98;
TLI = 0.98. The χ² value, corrected for kurtosis, was 137.49
(p = 0.05).
6. Across-method convergent validity and dis-
criminant validity
Across-method convergent validity may be
investigated with the multitrait-multimethod
(MTMM) matrix (Campbell and Fiske, 1959),
where each trait or construct is measured by
two or more (maximally) different methods.
Assessment of convergent validity across
methods is highly desirable since high conver-
gent validity within a method may be due to
all items converging on a method factor in-
stead of loading on the construct in question.
The MTMM matrix can also be used for explor-
ing discriminant validity, although a more
limited investigation of this criterion can also
be carried out when the various constructs are
operationalized with a single method only.
Campbell and Fiske (1959) developed four
criteria that should be met to achieve conver-
gent validity (across methods) and discrimi-
nant validity in a MTMM framework. These
criteria are based on the pattern of correla-
tions between the scores on the various meth-
ods for the constructs, have been extensively
discussed in the marketing literature (see, e.g.,
Bagozzi, 1980; Lastovicka et al., 1990), and
will not be repeated here. Several problems
with the Campbell-Fiske criteria may be
noted. There is a lack of quantitative specifi-
cation as to what constitutes satisfactory re-
sults. Item variance cannot be separated into
the contributions of constructs, methods, and
error. The criteria are based on observed cor-
relations, and hence it is assumed that each
measurement instrument is equally reliable. It
is implicitly assumed that there are no corre-
lations between method factors and construct
factors, that all constructs are influenced to
the same extent by method factors, and that
method factors are uncorrelated (Marsh and
Hocevar, 1983; Schmitt and Stults, 1986).
6.1. The use of LISREL in assessing across-
method convergent validity and discrimi-
nant validity
MTMM matrices can also be analyzed with
LISREL. The scores on each observed variable
can be considered to be a function of three
components: a trait or construct component,
a method component, and a random error
component. Constructs and methods are rep-
resented by latent variables, and the (sum-
mated) scores on the different measurement
instruments, each purporting to measure a
construct with a certain method, are used as
indicators.
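As a sketch of this decomposition, and assuming (beyond what is stated above) standardized measures, standardized factors, and trait and method factors that are uncorrelated with each other and with the error term, the score on the instrument that measures construct t with method m can be written as follows. The symbols $\lambda^{T}_{tm}$, $\lambda^{M}_{tm}$, $T_t$, $M_m$, and $\varepsilon_{tm}$ are notation introduced here for illustration only.

```latex
\begin{align*}
x_{tm} &= \lambda^{T}_{tm}\, T_t + \lambda^{M}_{tm}\, M_m + \varepsilon_{tm}, \\
\operatorname{Var}(x_{tm}) &= \bigl(\lambda^{T}_{tm}\bigr)^{2}
  + \bigl(\lambda^{M}_{tm}\bigr)^{2}
  + \operatorname{Var}(\varepsilon_{tm}) = 1 .
\end{align*}
```

Under these assumptions the squared loadings give the proportions of variance in a measure that are due to the construct and to the method, with the remainder attributed to random error.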
The most widely applied LISREL approach
to the analysis of MTMM matrices is the con-
firmatory factor analysis model (CFA) (Marsh
and Hocevar, 1983; Widaman, 1985). Wida-
man (1985) outlined a procedure, involving
four nested models, to test hypotheses about
construct and method factors. Briefly, model
1 (null model) explains the variance in the
measures by random error; model 2 (con-
struct-only model) explains the variance in
the measures by the construct factors and
random error; model 3 (method-only model)
explains the variance in the measures by the
method factors and random error; and model
4 (construct-method model) explains the vari-
ance in the measures by construct factors,
method factors, and random error. χ² difference
tests are used to test for the presence
of construct and/or method factors. If con-
struct factors are present, model 2 should
have a significantly better fit than model 1,
and model 4 should have a significantly bet-
ter fit than model 3. If method factors are
present, model 3 should have a significantly
better fit than model 1, and model 4 should
have a significantly better fit than model 2. If
the test procedure indicates the presence of
construct and/or method variance in the data,
their magnitude in each measure can be
estimated using the factor regression coeffi-
cients (Widaman, 1985).
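The four comparisons can be laid out mechanically. The sketch below assumes the four models have already been estimated and only computes the χ² differences and the corresponding differences in degrees of freedom; the function name, the dictionary keys, and the fit statistics are hypothetical and were invented solely to exercise the code.

```python
from typing import Dict, Tuple

Fit = Tuple[float, int]  # (chi-square, degrees of freedom)


def widaman_comparisons(fits: Dict[str, Fit]) -> Dict[str, Tuple[float, int]]:
    """Chi-square differences for the nested comparisons described above.

    Expected keys of `fits`: "null" (model 1), "construct" (model 2),
    "method" (model 3), and "construct_method" (model 4).
    """
    pairs = {
        "construct factors: model 2 vs 1": ("null", "construct"),
        "construct factors: model 4 vs 3": ("method", "construct_method"),
        "method factors: model 3 vs 1": ("null", "method"),
        "method factors: model 4 vs 2": ("construct", "construct_method"),
    }
    out = {}
    for label, (restricted, general) in pairs.items():
        chi2_r, df_r = fits[restricted]
        chi2_g, df_g = fits[general]
        # The more restricted model has the larger chi-square and more df.
        out[label] = (round(chi2_r - chi2_g, 2), df_r - df_g)
    return out


# Hypothetical fit statistics (not taken from any study).
example = {
    "null": (900.0, 36),
    "construct": (300.0, 24),
    "method": (500.0, 27),
    "construct_method": (40.0, 12),
}

for label, (chi2_diff, df_diff) in widaman_comparisons(example).items():
    print(f"{label}: chi-square difference = {chi2_diff}, df = {df_diff}")
```

Each difference would then be referred to a χ² distribution with the stated degrees of freedom.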
The criteria for within-method convergent
validity can also be employed for assessing
the convergent validity across methods. Dis-
criminant validity is achieved when the corre-
lation among constructs differs significantly
from unity or when the χ² difference test
indicates that two constructs are not perfectly
correlated. The CFA model avoids the prob-
lems of the traditional Campbell-Fiske ap-
proach. It allows for a test of convergent and
discriminant validity. It provides overall in-
dices of fit and tests of hypotheses about
construct and method factors. The use of
latent variables solves the problem of dif-
ferential reliability among the measurement
instruments. The variance of the measure-
ment instruments can be partitioned into the
contributions of constructs, methods, and er-
ror.
The CFA model assumes that the variation
in measures is a linear combination of con-
structs, methods, and error. However, method
factors may interact in a multiplicative way
with construct factors. Campbell and O’Con-
nell (1967) presented evidence that the higher
the basic relationship between two constructs,
the more that relationship is increased when
the method is shared. They also presented
evidence that when two constructs are basi-
cally independent their correlation is zero,
even when the constructs are measured by the
same method. These data can be explained by
a multiplicative model but not by an additive
model. Lastovicka et al. (1990) pointed out
that when the underlying structure of the
MTMM data is multiplicative, the CFA model
fails to provide valid insights into the conver-
gent and discriminant validity of the mea-
sures.
Browne (1984b) developed the Direct
Product Model (DPM) to deal with multiplica-
tive effects of constructs and methods. The
DPM was recently introduced in the marketing
literature by Lastovicka et al. (1990). They
discussed and illustrated how convergent and
discriminant validity is assessed with the DPM
and showed empirically that the DPM could be
used to test the convergent and discriminant
validity of two data sets for which the
CFA model even failed to converge. Bagozzi
and Yi (1990, 1991) applied the CFA model
and the DPM to fifteen previously published
MTMM matrices (and one new data set) and
found support for the DPM in five data sets.
Wothke and Browne (1990) have shown that
the DPM can be reformulated as a second-order
confirmatory factor analysis model, allowing
researchers to estimate the DPM with LISREL.¹⁵
Sample LISREL 7 program specifications for
the DPM are given by Bagozzi and Yi (1990,
1991).
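The multiplicative structure can be made concrete with a small numerical sketch. It assumes that the core of the direct product model is that the correlation between two error-free measures equals the product of the corresponding trait correlation and method correlation, so that the full matrix is a Kronecker product of a method matrix and a trait matrix. The correlation values are hypothetical, scale constants and unique variances are ignored, and the helper names are not taken from any of the cited programs.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


# Hypothetical correlations among two methods and among three traits.
P_METHOD = [[1.0, 0.6],
            [0.6, 1.0]]
P_TRAIT = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

N_TRAITS = len(P_TRAIT)
SIGMA = kron(P_METHOD, P_TRAIT)


def corr(method_a, trait_a, method_b, trait_b):
    """Correlation between (method_a, trait_a) and (method_b, trait_b)."""
    return SIGMA[method_a * N_TRAITS + trait_a][method_b * N_TRAITS + trait_b]


print("related traits, different methods:", corr(0, 0, 1, 1))
print("related traits, shared method:", corr(0, 0, 0, 1))
print("independent traits, shared method:", corr(0, 0, 0, 2))
```

In this toy matrix, sharing a method raises the correlation between related traits, whereas the correlation between independent traits stays at zero even when the method is shared, which is the pattern an additive model cannot reproduce.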
Thus, method effects are hypothesized to
be constant in the CFA model, while they are
hypothesized to vary with the level of con-
struct correlations in the DPM. Often, the re-
searcher will not have a clear hypothesis about
the structure of the method effects, and for
this reason Lastovicka et al. (1990) and
Bagozzi and Yi (1990, 1991) recommend that
both models be tested in the analysis of MTMM
data. The results reported in these three papers
indicate that it is unlikely that both models
would fit a particular MTMM data set.
Despite its great potential, the MTMM de-
sign is not often used in marketing (Cote and
Buckley, 1987; Bagozzi and Yi, 1991), a main
reason being that it is difficult and costly to
develop multiple methods. It is more common
to use only a single measurement method.
Although the MTMM approach cannot be used
in this situation, LISREL can still be used in
assessing discriminant validity between constructs
measured by the same method.¹⁶
¹⁵ We thank a reviewer for bringing our attention to this point.
¹⁶ For unambiguous results, the constructs should be measured
by the same method. When each construct is measured by a
different method, it is unclear whether discriminant validity
is due to differences in constructs, differences in methods, or
both.
6.2. Application: assessment of the discrimi-
nant validity of VST
The 8-item scale for VST, a Dutch version
of Zuckerman’s (1979) Sensation Seeking
Scale (SSS; Feij and Van Zuilen, 1984), purporting
to measure Optimal Stimulation Level, and
four measures for variation in consumption
behavior with respect to foods were personally
administered to 191 male and female
purchasers of food living in five small and
medium sized cities. All 53 items of SSS were
measured on five-point Likert scales. SSS consists
of four subscales, “Thrill and Adventure
Seeking”, “Experience Seeking”, “Boredom
Susceptibility”, and “Disinhibition Seeking”.
Following Feij and Van Zuilen (1984), the
ratings of the items on each subscale were
averaged and these average ratings on the
four subscales were used as indicators of SSS.
Variation in consumption behavior with re-
spect to foods (VARBEH) is measured by four
measures, based on self-reports: number of
different types of fresh fruits that the subject
consumes at least four times a year, the num-
ber of types of sandwich fillings the subject
consumes at least once a month, a coefficient
of entropy (Theil and Finke, 1983) based on
the number and share of different “bases”
(carbohydrate deliverers) consumed with hot
meals during the last seven days, and a simi-
lar entropy measure for vegetables consumed
with hot meals during the last week. These
measures capture essential characteristics of
Dutch food consumption behavior. Data on
VARBEH were collected to assess the nomo-
logical validity of VST (see below).
These data do not allow us to illustrate the
analysis of MTMM matrices for VST, but they
do allow us to illustrate testing of discriminant
validity with LISREL. Discriminant validity
of VST was investigated with respect to SSS.
The goodness-of-fit indices indicated a reasonable
fit for the two construct model:
χ²(53) = 144.48 (p < 0.001), χ²/df ratio =
2.73, GFI = 0.90, TLI = 0.88 (cf. Anderson and
Gerbing, 1984). The data contained considerable
measurement error (for example, the
variance due to VST was only 51.4% versus
60.1% in the cross-validation sample), which
is probably due to the fact that the interview
took a long time (on average about 45
minutes). The correlation between VST and SSS
was 0.41 (p < 0.001), which is significantly
lower than unity (χ²(1) = 108.88, p < 0.001).
Thus, discriminant validity between VST and
SSS is supported. The same conclusion would
be reached, using the simple correlation between
the raw composite scores on VST and
SSS (r = 0.35, which is significantly lower than
unity, p < 0.001), but this correlation is attenuated
due to measurement error, thus
overstating the degree of discriminant validity
achieved.
7. Nomological validity
Nomological validity is assessed by testing
the relationships with other constructs in a
nomological net, usually with correlation or
regression analysis (see, e.g., Westbrook, 1980;
Peter, 1981; Comer, 1984; Ruekert and
Churchill, 1984). Correlation/regression analysis
does not allow formal testing of the
nomological net and does not eliminate the
biasing effect of measurement error from the
estimates of the relations between the constructs.
7.1. The use of LISREL in assessing nomological
validity
The overall indices of fit of LISREL are
usually heavily influenced by the goodness-of-fit
of the measurement part of the model,
and to a far lesser extent by the goodness-of-fit
of the structural relations part. This is
because in most situations by far the greater
proportion of the parameters to be estimated
belongs to the measurement part (Mulaik et
al., 1989). However, the goodness-of-fit of the
structural relations part is of central concern
in assessing nomological validity. Recently,
Anderson and Gerbing (1988) have proposed
a sequential testing procedure involving a
series of five nested structural models that
deals with this problem. The models are:
(1) M_s, the saturated structural model in
which all structural parameters that can possibly
be specified are estimated;
(2) M_n, the null structural model constraining
all structural parameters to zero;
(3) M_t, the theoretical model (nomological
net) under investigation;
(4) M_c, the next most likely constrained
alternative to M_t;
(5) M_u, the next most likely unconstrained
alternative to M_t.
The researcher first conducts a pseudo χ²
test in which the χ² for M_s (the smallest
possible value for a structural model) is tested
with the number of degrees of freedom of M_n
(the largest possible value for a structural
model). When this χ² is significant, no structural
model will give an acceptable fit, and the
problem is centered in the measurement part
of the model, which may be respecified. Next,
the difference in χ² between M_t and M_s is
tested. This test provides “an asymptotic independent
assessment of the theoretical
model’s explanation of the relations of the
estimated constructs to one another. In other
words, one can make an asymptotically independent
test of the nomological validity”
(Anderson and Gerbing, 1988, p. 419). When
the difference in χ² is not significant, M_t is
tested against M_c. When this difference is
significant, and the difference in χ² between
M_t and M_u is not significant, M_t is accepted
and hence nomological validity is supported.
In other instances, one might accept either
M_c or M_u, or respecify the model, depending on
the results of the χ² difference tests. In the
latter case, the analysis becomes increasingly
exploratory (see Anderson and Gerbing, 1988,
for further details).
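The decision logic described in this paragraph can be written down compactly. The sketch below encodes only the steps as summarized here, not the full decision tree of Anderson and Gerbing (1988); the function name, the 0.05 significance level, and the wording of the returned notes are assumptions made for illustration.

```python
def sequential_decision(p_pseudo, p_t_vs_s, p_c_vs_t, p_t_vs_u, alpha=0.05):
    """Summarize the sequential chi-square difference tests as a list of notes.

    p_pseudo : p-value of the pseudo chi-square test
    p_t_vs_s : p-value of the difference between the theoretical and saturated models
    p_c_vs_t : p-value of the difference between the constrained and theoretical models
    p_t_vs_u : p-value of the difference between the theoretical and unconstrained models
    """
    notes = []
    if p_pseudo < alpha:
        notes.append("pseudo chi-square significant: inspect the measurement part")
    if p_t_vs_s >= alpha and p_c_vs_t < alpha and p_t_vs_u >= alpha:
        notes.append("accept M_t: nomological validity supported")
    else:
        notes.append("consider M_c, M_u, or respecification")
    return notes


# Hypothetical p-values, chosen only to exercise the function.
print(sequential_decision(0.20, 0.30, 0.001, 0.40))
```

The function deliberately reports a significant pseudo χ² as a note rather than stopping, so that the structural comparisons can still be inspected.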
The sequential testing procedure relies entirely
on differences in χ², which are influenced
by sample size. Anderson and Gerbing
(1988) therefore recommend that an incremental
fit index be used in conjunction
with the formal statistical tests to assess
whether the differences in fit between rival
structural models are not only significant but
also substantial. The noncentralized relative
normed fit index might be used for this purpose
(Mulaik et al., 1989; McDonald and
Marsh, 1990):
$$
\mathrm{NRNF} = \frac{(\chi^2_n - df_n) - (\chi^2_j - df_j)}{(\chi^2_n - df_n) - (\chi^2_s - df_s) - (df_j - df_s)}, \qquad (3)
$$
where $\chi^2_n$, $\chi^2_s$, and $\chi^2_j$ are the fit of M_n, M_s,
and the model under investigation (M_t, M_c,
M_u, or a respecified version of M_c or M_u),
respectively, and $df_n$, $df_j$, and $df_s$ are the
degrees of freedom of the null model, the
model under investigation, and of the
saturated model, respectively. When NRNF is
high, the structural part of the model is sup-
ported, although the traditional goodness-of-
fit indices for the complete model may indi-
cate mediocre fit (or worse). Differences in
NRNF can be used to compare different for-
mulations of the nomological net.
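A small computational sketch may help in applying the index. It reads equation (3) as the ratio of (χ²_n − df_n) − (χ²_j − df_j) to (χ²_n − df_n) − (χ²_s − df_s) − (df_j − df_s), which is an interpretation adopted for this example. The fit statistics for the saturated, theoretical, and constrained models are those implied by the VST application in section 7.2 (the degrees of freedom of the saturated model and the χ² of the constrained model are derived from the reported difference tests). The χ² of the null structural model is not reported there, so the value 307.8 used below is a placeholder chosen for illustration only.

```python
def nrnf(chi2_n, df_n, chi2_j, df_j, chi2_s, df_s):
    """Noncentralized relative normed fit index, as equation (3) is read in this sketch.

    n: null structural model, j: model under investigation, s: saturated model.
    """
    nc_n = chi2_n - df_n  # noncentrality estimate of the null model
    nc_j = chi2_j - df_j  # noncentrality estimate of the model under investigation
    nc_s = chi2_s - df_s  # noncentrality estimate of the saturated model
    return (nc_n - nc_j) / (nc_n - nc_s - (df_j - df_s))


CHI2_NULL, DF_NULL = 307.8, 104   # chi-square is a placeholder, not a reported value
CHI2_SAT, DF_SAT = 249.87, 101    # df derived from the reported difference test

MODELS = {
    "constrained": (285.12, 103),  # chi-square derived as 252.35 + 32.77
    "theoretical": (252.35, 102),
    "saturated": (CHI2_SAT, DF_SAT),
}

for name, (chi2_j, df_j) in MODELS.items():
    value = nrnf(CHI2_NULL, DF_NULL, chi2_j, df_j, CHI2_SAT, DF_SAT)
    print(f"{name}: NRNF = {value:.2f}")
```

By construction the index equals 0 for the null structural model and 1 for the saturated model, so intermediate values indicate how much of the possible improvement in the structural part a given model achieves.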
7.2. Application: assessment of nomological
validity of VST
A nomological net consisting of the constructs
VST, SSS, and VARBEH was developed.
It was hypothesized that VST affects VARBEH
and is correlated with SSS, and that SSS has no
significant effect on VARBEH, given findings in
the literature that generalized personality
constructs are usually not related to specific
behavior (Ajzen and Fishbein, 1980; Verhal-
len and Pieters, 1984). Although the nomo-
logical net is relatively simple, it is sufficient
to illustrate the basic principles of LISREL in
nomological validity assessment.
The pseudo χ² value was significant
(χ²(104) = 249.87, p < 0.001), indicating
problems in the measurement model which
were largely due to the measurement error in
the data. The sequential test procedure, however,
allows the assessment of the nomological
validity despite the significant pseudo χ²
value. The difference in χ² between the hypothesized
model, M_t, and the saturated
model, M_s, was not significant (χ²(1) = 2.48,
p = 0.12). Further, M_t had a significantly
better fit than M_c, i.e., the model in which the
effect of VST on VARBEH is constrained to be
zero (χ²(1) = 32.77, p < 0.001), and the difference
in fit between M_t and M_u (which in
this case is the same as M_s) was not significant.
The findings of the χ² difference tests
were supported by the values of NRNF: 0.41
(M_c), 0.99 (M_t), and 1.00 (M_u). The explanation
of the structural part of the model improved
dramatically by specifying a relationship
between VST and VARBEH, but there was
little room for further improvement by specifying
a causal effect of SSS on VARBEH. Note
that these conclusions are not apparent from
the usual goodness-of-fit indices, which are
predominantly affected by the fit of the measurement
model (e.g., GFI was already a rather
high 0.83 for M_n).
Thus, although the overall fit of the hypothesized
model was not so good [χ²(102) =
252.35 (p < 0.001), χ²/df ratio = 2.47, GFI =
0.86, TLI = 0.84], the nomological validity of
VST was supported by the sequential χ² difference
tests, the NRNF value of the model,
and by the differences in NRNF values with
the next most likely unconstrained and constrained
model. The effect of VST on VARBEH
is large and highly significant, the path coefficient
being 0.63 (p < 0.001). As was found
earlier, VST and SSS are positively correlated
(r = 0.42, p < 0.001).
The simple correlations of the four measures
of variation in consumption with VST
and SSS also supported the nomological validity
of VST. VST’s correlations were significantly
higher than SSS’s for three out of four
measures (p < 0.05, one-sided, after Fisher
r-to-z transformation). For the number of
sandwich fillings, SSS had a slightly higher
correlation, but the difference was not significant.
However, the measurement part and the
structural part of the model are not separated,
and the nomological net is not formally
tested. These drawbacks are solved in
the LISREL approach.
8. Conclusions
This paper presented an expository over-
view of the use of LISREL in validating
marketing constructs. Its purpose was not to
imply that “traditional” techniques are not
valuable. Preliminary screening of items can
be carried out with item-total correlations
and exploratory factor analysis, and simple
correlations provide valuable information in
other phases of construct validation as well.
There are also some advantages of the “traditional”
approaches relative to LISREL. For example,
in the “traditional” approaches it is
very easy to obtain observation-by-observation
composite scores, while with LISREL,
scores on the construct are indeterminate. No
specialized computer programs are required
for the “traditional” approaches. It should
also be noted that LISREL and the “traditional”
approaches may not always yield substantially
different conclusions.
However, as has been shown and il-
lustrated in the paper, LISREL provides the
researcher with a more powerful and versatile
tool for a detailed and critical examination of
various aspects of construct validity. In puri-
fying the measure, LISREL allows for testing
the unidimensionality of the measurement instrument.
In cross-validation, LISREL's contribution
is in the investigation of the convergent
validity (within-a-method) and in obtaining
more accurate estimates of reliability.
LISREL allows in-depth assessment of the sta-
bility of the construct and its measurement
instrument through a hierarchical series of
tests, placing increasingly stringent restric-
tions on the model. Thus, insight is obtained
into the possible causes of instability, and
stability estimates are not biased by random
and systematic measurement error. LISREL is
also a powerful methodology for analyzing
discriminant validity and across-method con-
vergent validity and for testing nomological
nets.
References
Ajzen, I. and M. Fishbein, 1980. Understanding attitudes and
predicting social behavior. Englewood Cliffs, NJ: Prentice-
Hall.
Alwin, D.F., 1974. Approaches to the interpretation of rela-
tionships in the multitrait-multimethod matrix. In: H.L.
Costner (ed.), Sociological methodology 1973-1974, 79-
105. San Francisco, CA: Jossey-Bass.
Alwin, D.F. and D.J. Jackson, 1979. Measurement models for
response errors in surveys: Issues and applications. In:
K.F. Schuessler (ed.), Sociological methodology 1980, 69-
119. San Francisco, CA: Jossey-Bass.
Alwin, D.F. and D.J. Jackson, 1981. Applications of simulta-
neous factor analysis to issues of factorial invariance. In:
D.J. Jackson and E.F. Borgatta (eds.), Factor analysis and
measurement in sociological research: A multi-dimensional
perspective, 249-279. Beverly Hills, CA: Sage.
Anderson, J.C. and D.W. Gerbing, 1984. The effect of sam-
pling error on convergence, improper solutions, and good-
ness-of-fit indices for maximum likelihood confirmatory
factor analysis. Psychometrika 49, 155-173.
Anderson, J.C. and D.W. Gerbing, 1988. Structural equation
modeling in practice: A review and recommended two-step
approach. Psychological Bulletin 103, 411-423.
Armor, D.J., 1974. Theta reliability and factor scaling. In: H.L.
Costner (ed.), Sociological methodology 1973-1974, 17-50.
San Francisco, CA: Jossey-Bass.
Bagozzi, R.P., 1980. Causal models in marketing. New York:
Wiley.
Bagozzi, R.P., 1981a. Evaluating structural equation models
with unobservable variables and measurement error: A
comment. Journal of Marketing Research 18, 375-381.
Bagozzi, R.P., 1981b. An examination of the validity of two
models of attitude. Multivariate Behavioral Research 16,
323-359.
Bagozzi, R.P., F.D. Davis and P.R. Warshaw, 1990. Develop-
ment and test of a theory of technological learning and
usage. Working Paper, University of Michigan.
Bagozzi, R.P. and Y. Yi, 1990. Assessing method variance in
multitrait-multimethod matrices: The case of self-reported
affect and perceptions at work. Journal of Applied Psy-
chology 75, 547-560.
Bagozzi, R.P. and Y. Yi, 1991. On the analysis of multitrait-
multimethod matrices in consumer research. Journal of
Consumer Research (in press).
Bentler, P.M., 1985. Theory and implementation of EQS: A
structural equation program. Los Angeles, CA: BMDP
Statistical Software.
Bentler, P.M. and D.G. Bonett, 1980. Significance tests and
goodness of fit in the analysis of covariance structures.
Psychological Bulletin 88, 588-606.
Berlyne, D.E., 1960. Conflict, arousal, and curiosity. New
York: McGraw-Hill.
Berlyne, D.E., 1963. Motivational problems raised by explora-
tory and epistemic behavior. In: S. Koch (ed.), Psychology:
A study of a science Vol. 5, 284-364. New York, NY:
McGraw-Hill.
Boomsma, A., 1982. The robustness of LISREL against small
samples in factor analysis models. In: K.G. Jöreskog and
H. Wold (eds.), Systems under indirect observation, Part I,
149-173. Amsterdam: North-Holland.
Browne, M.W., 1982. Covariance structures. In: D.M. Hawkins
(ed.), Topics in applied multivariate analysis, 72-141. Cam-
bridge, UK: Cambridge University Press.
Browne, M.W., 1984a. Asymptotically distribution-free meth-
ods for the analysis of covariance structures. British Journal
of Mathematical and Statistical Psychology 37, 62-83.
Browne, M.W., 1984b. The decomposition of multitrait-multi-
method matrices. British Journal of Mathematical and Stat-
istical Psychology 37, 1-21.
Campbell, D.T. and D.W. Fiske, 1959. Convergent and dis-
criminant validation by the multitrait-multimethod matrix.
Psychological Bulletin 56, 81-105.
Campbell, D.T. and E.J. O’Connell, 1967. Method factors in
multitrait-multimethod matrices: Multiplicative rather than
additive? Multivariate Behavioral Research 2, 409-426.
Cattell, R.B., 1966. The scree test for the number of factors.
Multivariate Behavioral Research 1, 245-276.
Churchill, G.A., Jr., 1979. A paradigm for developing better
measures of marketing constructs. Journal of Marketing
Research 16, 64-73.
Comer, J.M., 1984. A psychometric assessment of a measure of
sales representatives’ power perceptions. Journal of Market-
ing Research 21, 221-225.
Cote, J.A. and R. Buckley, 1987. Estimating trait, method, and
error variance: Generalizing across 70 construct validation
studies. Journal of Marketing Research 24, 315-318.
Crocker, L. and J. Algina, 1986. Introduction to classical and
modern test theory. New York, NY: Holt, Rinehart &
Winston.
Feij, J.A. and R.W. Van Zuilen, 1984. SBL handleiding span-
ningsbehoeftelijst. Lisse: Swets & Zeitlinger.
Fiske, D.W. and S.R. Maddi, 1961. Functions of varied experi-
ence. Homewood, IL: Dorsey Press.
Gerbing, D.W. and J.C. Anderson, 1985. The effects of sam-
pling error and model characteristics on parameter estima-
tion for maximum likelihood confirmatory factor analysis.
Multivariate Behavioral Research 20, 255-271.
Gerbing, D.W. and J.C. Anderson, 1988. An updated paradigm
for scale development incorporating unidimensionality and
its assessment. Journal of Marketing Research 25, 186-192.
Givon, M., 1984. Variety seeking through brand switching.
Marketing Science 3, 1-22.
Givon, M., 1985. Variety seeking, market partitioning and
segmentation. International Journal of Research in Market-
ing 2, 117-127.
Hattie, J.A., 1985. Methodology review: Assessing unidimen-
sionality of tests and items. Applied Psychological Mea-
surement 9, 139-164.
Hildebrandt, L., 1987. Consumer retail satisfaction in rural
areas: A reanalysis of survey data. Journal of Economic
Psychology 8, 19-42.
Hoyer, W.D. and N.M. Ridgway, 1984. Variety seeking as an
explanation for exploratory purchase behavior: A theoreti-
cal model. In: T.C. Kinnear (ed.), Advances in consumer
research, Vol. 11, 114-119. Provo, UT: Association for
Consumer Research.
Jöreskog, K.G., 1971a. Statistical analysis of sets of congeneric
tests. Psychometrika 36, 109-133.
Jöreskog, K.G., 1971b. Simultaneous factor analysis in several
populations. Psychometrika 36, 409-426.
Jöreskog, K.G., 1979. Analyzing psychological data by structural
analysis of covariance matrices. In: J. Magidson (ed.),
Advances in factor analysis and structural equation models,
45-100. Cambridge, MA: Abt Books.
Jöreskog, K.G. and D. Sörbom, 1986. PRELIS: A program for
multivariate data screening and data summarization.
Mooresville, IL: Scientific Software.
Jöreskog, K.G. and D. Sörbom, 1988. LISREL 7: A guide to
the program and applications. Chicago, IL: SPSS Inc.
Judd, C.M., R. Jessor and J.E. Donovan, 1986. Structural
equation models and personality research. Journal of Personality
54, 149-198.
Kahn, B.E., M.U. Kalwani and D.G. Morrison, 1986. Measur-
ing variety-seeking and reinforcement behaviors using panel
data. Journal of Marketing Research 23, 89-100.
Kumar, A. and W.R. Dillon, 1987. Some further remarks on
measurement-structure interaction and the unidimensionality
of constructs. Journal of Marketing Research 24,
438-444.
Lastovicka, J.L., J.P. Murry, Jr. and E.A. Joachimsthaler, 1990.
Evaluating the measurement validity of lifestyle typologies
with qualitative measures and multiplicative factoring.
Journal of Marketing Research 27, 11-23.
Lattin, J.M. and L. McAlister, 1985. Using a variety seeking
model to identify substitute and complementary relation-
ships among competing products. Journal of Marketing
Research 22, 330-339.
Marsh, H.W., J.R. Balla and R.P. McDonald, 1988. Goodness-
of-fit indices in confirmatory factor analysis: The effect of
sample size. Psychological Bulletin 103, 391-410.
Marsh, H.W. and D. Hocevar, 1983. Confirmatory factor anal-
ysis of multitrait-multimethod matrices. Journal of Educa-
tional Measurement 20, 231-248.
Marsh, H.W. and D. Hocevar, 1985. The application of con-
firmatory factor analysis to the study of self-concept: First
and higher order factor structures and their invariance
across age groups. Psychological Bulletin 97, 562-582.
McAlister, L. and E.A. Pessemier, 1982. Variety seeking behav-
ior: An interdisciplinary review. Journal of Consumer Re-
search 9, 311-322.
McDonald, R.P. and H.W. Marsh, 1990. Choosing a multi-
variate model: Noncentrality and goodness of fit. Psycho-
logical Bulletin 107, 247-255.
Midgley, D.F. and G.R. Dowling, 1978. Innovativeness: The
concept and its measurement. Journal of Consumer Re-
search 4, 229-242.
Mulaik, S.A., L.R. James, J. Van Alstine, N. Bennett, S. Lind
and C.D. Stilwell, 1989. Evaluation of goodness-of-fit indices
for structural equation models. Psychological Bulletin
105, 430-445.
Nunnally, J.C., 1978. Psychometric theory, 2nd ed. New York:
McGraw-Hill.
Parasuraman, A., V.A. Zeithaml and L.L. Berry, 1988. SERVQ-
UAL: A multiple-item scale for measuring consumer percep-
tions of service quality. Journal of Retailing 64, 12-40.
Peter, J.P., 1981. Construct validity: A review of basic issues
and marketing practices. Journal of Marketing Research 18,
133-145.
Peter, J.P. and G.A. Churchill, Jr., 1986. Relationships among
research design choices and psychometric properties of
rating scales: A meta-analysis. Journal of Marketing
Research 23, 1-10.
Phillips, L.W. and R.P. Bagozzi, 1986. On measuring organiza-
tional properties of distribution channels: Methodological
issues in the use of key informants. In: J.N. Sheth (ed.),
Research in Marketing, Vol. 8, 313-369. Greenwich, CT:
JAI Press.
Raju, P.S., 1980. Optimum stimulation level: Its relationship to
personality, demographics, and exploratory behavior. Jour-
nal of Consumer Research 7, 272-282.
Ruekert, R.W. and G.A. Churchill, Jr., 1984. Reliability and
validity of alternative measures of channel member satis-
faction. Journal of Marketing Research 21, 226-233.
Schmitt, N. and D.M. Stults, 1986. Methodology review: Analysis
of multitrait-multimethod matrices. Applied Psychological
Measurement 10, 1-22.
Seymour, D. and G. Lessne, 1984. Spousal conflict arousal:
Scale development. Journal of Consumer Research 11,
810-821.
Theil, H. and R. Finke, 1983. The consumer’s demand for
diversity. European Economic Review 23, 395-400.
van Trijp, H.C.M., 1989. Variety seeking in consumer behav-
ior: A review. Working Paper, Wageningen University.
Tucker, L.R. and C. Lewis, 1973. A reliability coefficient for
maximum likelihood factor analysis. Psychometrika 38, 1-10.
Verhallen, T.M.M. and R.G.M. Pieters, 1984. Attitude theory
and behavioral costs. Journal of Economic Psychology 5,
223-249.
Westbrook, R.A., 1980. A rating scale for measuring product/
service satisfaction. Journal of Marketing 44, 68-72.
Widaman, K.F., 1985. Hierarchically nested covariance structure
models for multitrait-multimethod data. Applied Psychological
Measurement 9, 1-26.
Wierenga, B., 1984. Empirical test of the Lancaster characteris-
tics model. International Journal of Research in Marketing
1, 263-293.
Wothke, W. and M.W. Browne, 1990. The direct product
model for the MTMM matrix parameterised as a second
order factor analysis model. Psychometrika 55, 255-262.
Zaichkowsky, J.L., 1985. Measuring the involvement construct.
Journal of Consumer Research 12, 341-352.
Zuckerman, M., 1979. Sensation seeking: Beyond the optimal
level of arousal. Hillsdale, NJ: Lawrence Erlbaum.