Reliability, validity, generalizability and the use of multi-item scales


  • 1. Reliability, validity, generalizability and the use of multi-item scales. Edward Shiu (Dept of Marketing). Valid? Generalizable?
  • 2. Multi-item scales
  • 3. How to use a questionnaire from published work
    – Appendix with items
    – Methodology section
  • 4. Existing multi-item scales
    – Used by many
    – Reliability and validity may be known
    – Good starting block
    – Basis to compare / contrast results
  • 5. Development of a Multi-item Scale (doing it the HARD way!! See Malhotra & Birks, 2007)
    1. Develop theory
    2. Generate initial pool of items: theory, secondary data, and qualitative research
    3. Select a reduced set of items based on qualitative judgment
    4. Collect data from a large pretest sample
    5. Statistical analysis
    6. Develop purified scale
    7. Collect more data from a different sample
    8. Evaluate scale reliability, validity, and generalizability
    9. Final scale
  • 6. Example of Scale Development
    – See Richins & Dawson (1992) "A Consumer Values Orientation for Materialism and its Measurement: Scale Development and Validation," Journal of Consumer Research, 19 (December), 303-316.
    – Materialism scale (7 items): Marketing Scales Handbook (Vol IV) p. 352.
      1. It is important to me to have really nice things.
      2. I would like to be rich enough to buy anything I want.
      3. I'd be happier if I could afford to buy more things.
      4. ......
    – Note, published scales are not always perfect!!!
  • 7. Scale Evaluation (see Malhotra & Birks, 2007)
    – Reliability: test/retest, alternative forms, internal consistency
    – Validity: content, criterion, construct (convergent, discriminant, nomological)
    – Generalizability
  • 8. Reliability & Validity
    – Reliability: the extent to which a measuring procedure yields consistent results on repeated administrations of the scale
    – Validity: the degree to which a measuring procedure accurately reflects, assesses, or captures the specific concept that the researcher is attempting to measure
    – Reliable does not imply Valid
  • 9. Reliability
    – Internal consistency reliability: DO THE ITEMS IN THE SCALE GEL WELL TOGETHER?
    – Split-half reliability: the items on the scale are divided into two halves and the resulting half scores are correlated
    – Cronbach alpha (α)
      – average of all possible 'split-half' correlation coefficients resulting from different ways of splitting the scale items
      – value varies from 0 to 1
      – α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p. 358)
      – Note: alpha tends to increase with an increase in the number of items in the scale
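The α described above can be computed directly from its standard formula, α = k/(k−1) × (1 − Σ item variances / variance of total score). A minimal sketch in Python/NumPy (the slides use SPSS; the data here is hypothetical):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) data matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # sample variance per item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Two perfectly consistent items give the maximum alpha
data = np.array([[1, 1], [2, 2], [3, 3]], dtype=float)
print(cronbach_alpha(data))  # → 1.0
```

Because the items in the toy matrix move in lockstep, the scale is perfectly internally consistent; real data will fall below 1.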
  • 10. Reliability (continued)
    – Test-retest reliability
      – identical scale items administered at two different times to the same set of respondents
      – assess (via correlation) whether respondents give similar answers
    – Alternative-forms reliability
      – two equivalent forms of the scale are constructed
      – the same respondents are measured at two different times, with a different form being used each time
      – assess (via correlation) whether respondents give similar answers
      – Note: hardly ever practical
  • 11. Construct Validity
    – Construct validity is evidenced if we can establish convergent validity, discriminant validity and nomological validity
    – Convergent validity: extent to which the scale correlates positively with other measures of the same construct
    – Discriminant validity: extent to which the scale does not correlate with other conceptually distinct constructs
    – Nomological validity: extent to which the scale correlates in theoretically predicted ways with other distinct but related constructs
    – Also read Malhotra & Birks, 2007, pp. 358-359 on content (or face) validity and criterion (concurrent & predictive) validity
  • 12. Generalizability
    – Refers to the extent to which you can generalise from your specific observations to beyond your limited study, situation, items used, method of administration, context.....
    – Hardly ever possible!!!
  • 13. Fun time
    – Now onto the data (COCB.sav)!!!!!!
    – Read my forthcoming JBR article for background on COCB and the scale
    – 1st, SPSS and Cronbach alpha
    – Next, Amos and CFA
    – Followed by Excel to calculate composite/construct reliability and AVE, as well as establish discriminant validity
  • 14. Cronbach alpha (α)
    – SPSS (Analyze… Scale… Reliability Analysis)
    – α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p. 358)
    – α > 0.7 indicates satisfactory internal consistency reliability (Nunnally & Bernstein, 1994)
    Ref: Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York: McGraw-Hill.
  • 15. SPSS output for α
    – Alpha value for the dimension Credibility = 0.894 > 0.7, hence satisfactory
  • 16. SPSS further output for α
    – We note that the alpha value for the Credibility dimension would increase (from 0.894 to 0.902) if item cred4 were removed.
    – However, unless the improvement is dramatic AND there are separate reasons (e.g. similar findings from other studies), we should leave the item as part of the dimension.
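The "Alpha if Item Deleted" diagnostic that SPSS reports can be reproduced by recomputing α with each item dropped in turn. A minimal, self-contained Python sketch on hypothetical data (not the COCB.sav values):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                            / items.sum(axis=1).var(ddof=1))

def alpha_if_item_deleted(items: np.ndarray) -> list:
    """Recompute alpha with each item removed in turn, mirroring the
    'Alpha if Item Deleted' column in SPSS's Reliability Analysis output."""
    return [cronbach_alpha(np.delete(items, i, axis=1))
            for i in range(items.shape[1])]

# Hypothetical 3-item scale; the third item is noisy relative to the first two
data = np.array([[1, 1, 3],
                 [2, 2, 1],
                 [3, 3, 2],
                 [4, 4, 4]], dtype=float)
print(alpha_if_item_deleted(data))  # alpha rises when the noisy item is dropped
```

As the slide cautions, an item whose deletion would raise α (like cred4 here) should still be retained unless the improvement is dramatic and corroborated elsewhere.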
  • 17. Limitations of Cronbach alpha
    – We should employ multiple measures of reliability: Cronbach alpha, composite/construct reliability (CR) & average variance extracted (AVE)
    – Alpha and CR values are often very similar, but AVEs can vary much more from alpha values
    – AVEs are also used to assess construct discriminant validity
  • 18. Composite/Construct Reliability
    – CR = (sum of standardized loadings)² / [(sum of standardized loadings)² + (sum of indicator measurement errors)]
    – AVE = average variance extracted = variance extracted = [sum of (standardized loadings squared)] / {[sum of (standardized loadings squared)] + (sum of indicator measurement errors)}
    – Note: recommended thresholds are CR > 0.6 & AVE > 0.5; if met, construct internal consistency is evidenced (Fornell & Larcker, 1981).
    Ref: Fornell, Claes and David G. Larcker (1981). "Evaluating Structural Equation Models with Unobservable Variables and Measurement Error," Journal of Marketing Research, 18(1, February): 39-50.
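The slides compute CR and AVE in Excel; the same formulas can be sketched in Python. One assumption is flagged in the code: for standardized indicators, each measurement error variance is taken as 1 − λ², a common convention. The loadings are hypothetical, not from the COCB data.

```python
import numpy as np

def composite_reliability(loadings) -> float:
    """CR = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances].
    Error variance per standardized indicator is assumed to be 1 - loading^2."""
    lam = np.asarray(loadings, dtype=float)
    errors = 1.0 - lam**2
    return lam.sum()**2 / (lam.sum()**2 + errors.sum())

def average_variance_extracted(loadings) -> float:
    """AVE = sum(loading^2) / [sum(loading^2) + sum of error variances]."""
    lam = np.asarray(loadings, dtype=float)
    errors = 1.0 - lam**2
    return (lam**2).sum() / ((lam**2).sum() + errors.sum())

# Hypothetical standardized loadings for a 3-indicator construct
loadings = [0.8, 0.7, 0.9]
print(round(composite_reliability(loadings), 3))       # CR ≈ 0.845 > 0.6
print(round(average_variance_extracted(loadings), 3))  # AVE ≈ 0.647 > 0.5
```

With these loadings both recommended thresholds (CR > 0.6, AVE > 0.5) are met, so construct internal consistency would be evidenced per Fornell & Larcker (1981).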
  • 19. Discriminant validity
    – Discriminant validity is assessed by comparing the shared variance (squared correlation) between each pair of constructs against the minimum of the AVEs for these two constructs.
    – If, for each possible pair of constructs, the observed shared variance is lower than the minimum of their AVEs, then discriminant validity is evidenced (Fornell and Larcker, 1981).
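The pairwise comparison above can be sketched as follows; the construct correlation matrix, AVE values, and construct names are hypothetical.

```python
from itertools import combinations

def fornell_larcker(corr, aves, names):
    """Fornell-Larcker criterion: for every pair of constructs, the shared
    variance (squared correlation) must fall below the smaller of their AVEs."""
    results = {}
    for i, j in combinations(range(len(aves)), 2):
        shared = corr[i][j] ** 2
        results[(names[i], names[j])] = shared < min(aves[i], aves[j])
    return results

# Hypothetical two-construct example: r = 0.5 gives shared variance 0.25,
# which is below both AVEs, so discriminant validity holds for this pair.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
print(fornell_larcker(corr, [0.65, 0.58], ["Cred", "Loyalty"]))
# → {('Cred', 'Loyalty'): True}
```

A `False` for any pair would mean the two constructs share more variance with each other than with their own indicators, undermining discriminant validity.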
  • 20. Amos (Analysis of Moment Structures)
    [Amos path diagram. Rectangles = observed variables (the SPSS variables loy1-loy3, comm1-comm2, cred1-cred4, bene1-bene3, ave_SSI, ave_POC, ave_Voice, ave_wom, ave_BAoSF, ave_DoRA, ave_Flex, ave_PiFA); ellipses = unobserved variables; e1 to e24 = error variances (uniquenesses); Loyalty, Comm, Cred, Bene, COCB = latent (unobserved) factors]
  • 21. CFA and goodness of fit
    – See Hair et al.'s book
    – E.g., the CFA resulted in an acceptable overall fit (GFI = .90, CFI = .94, TLI = .92, RMSEA = .068, and χ² = 524.64, df = 160, p < .001). All indicators load significantly (p < .001) and substantively (standardized coefficients > .5) on their respective constructs, thus providing evidence of convergent validity.
  • 22. Refs
    – Baumgartner H, Homburg C. (1996). "Applications of structural equation modeling in marketing and consumer research: a review," International Journal of Research in Marketing, 13(2): 139-161.
    – Churchill, Gilbert A., Jr. (1979). "A Paradigm for Developing Better Measures of Marketing Constructs," Journal of Marketing Research, 16(1, February): 64-73.
    – Fornell, Claes and David G. Larcker (1981). "Evaluating Structural Equation Models with Unobservable Variables and Measurement Error," Journal of Marketing Research, 18(1, February): 39-50.
    – Hair, Joseph F., Jr., Rolph E. Anderson, Ronald L. Tatham, and William C. Black (1998), Multivariate Data Analysis. 5th ed. Englewood Cliffs, NJ: Prentice Hall.
    – Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York: McGraw-Hill.