Metric Calibration of Psychological Instruments (Dissertation Senate presentation)

PowerPoint slides of my dissertation project, presented to the Senate at The University of Western Ontario on June 22, 2011 in London, Ontario, Canada (Etienne P. LeBel, etiennelebel.com)

Abstract: Inspired by the history of the development of instruments in the physical sciences, and by past psychology giants, the following dissertation aimed to advance basic psychological science by investigating the metric calibration of psychological instruments. The over-arching goal of the dissertation was to demonstrate that it is both useful and feasible to calibrate the metric of psychological instruments so as to render their metrics non-arbitrary. Concerning utility, a conceptual analysis was executed delineating four categories of proposed benefits of non-arbitrary metrics including (a) help in the interpretation of data, (b) facilitation of construct validity research, (c) contribution to theory development, and (d) facilitation of general accumulation of knowledge. With respect to feasibility, the metric calibration approach was successfully applied to instruments of seven distinct constructs commonly studied in psychology, across three empirical demonstration studies and re-analyses of other researchers’ data. Extending past research, metric calibration was achieved in these empirical demonstration studies by finding empirical linkages between scores of the measures and specifically configured theoretically-relevant behaviors argued to reflect particular locations (i.e., ranges) of the relevant underlying psychological dimension. More generally, such configured behaviors can serve as common reference points to calibrate the scores of different instruments, rendering the metric of those instruments non-arbitrary.
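
A minimal sketch of what such an empirical linkage could look like, assuming a logistic model that relates instrument scores to a binary reference behavior. The slope (B = 1.20) and the score tied to a 50% chance of the behavior (3.8) are taken from the Study 1 slides in the transcript; the intercept is derived from those two numbers and is an assumption, as are the function names.

```python
import math

B = 1.20                   # logistic slope reported for NFC (Study 1 slides)
MIDPOINT = 3.8             # NFC score linked to a 50% chance of the behavior
INTERCEPT = -B * MIDPOINT  # assumption: chosen so that p(MIDPOINT) = 0.5


def prob_behavior(score: float) -> float:
    """Probability of the reference behavior at a given instrument score."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + B * score)))


def score_at(prob: float) -> float:
    """Instrument score at which the behavior occurs with probability prob."""
    return (math.log(prob / (1.0 - prob)) - INTERCEPT) / B


# Calibrated scores for 25%, 50%, and 75% chances of the reference behavior.
for p in (0.25, 0.50, 0.75):
    print(f"{p:.0%} -> {score_at(p):.2f}")
```

Under these assumptions the calibrated scores come out at roughly 2.88, 3.80, and 4.72 on the 1-to-5 NFC scale, which is one way a score can be read as a location on the underlying dimension rather than only relative to other scores.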

LeBel, Etienne, "The Utility and Feasibility of Metric Calibration for Basic Psychological Research" (2011). Electronic Thesis and Dissertation Repository. 174.
https://ir.lib.uwo.ca/etd/174

Published in: Science

  1. 1. The Utility and Feasibility of Metric Calibration for Basic Psychological Research Etienne LeBel The University of Western Ontario
  2. 2. Inspirational Quotations • John Tukey: “…being so disinterested in our variables that we do not care about their units can hardly be desirable” (Tukey, 1969, p. 89). • Jacob Cohen: "...psychologists have to start respecting the units they work with, or develop measurement units they can respect enough so that researchers can agree to use them" (Cohen, 1994, p. 1001).
  3. 3. Over-arching Goal • Demonstrate that it is both useful and feasible to calibrate the metric of instruments in basic psychological research
  4. 4. Outline • Definitions and basic concepts • Metric calibration strategies • Past metric calibration research • Utility of Metric Calibration • Feasibility: 3 Empirical demonstration studies • Limitations and Future Directions
  5. 5. Definitions and Basic Concepts • Metric: unit of measurement used to quantify the amount of something • E.g., Celsius metric (°C) • Fridge range = -10 to +50 °C • Freezer range = -50 to +70 °C
  6. 6. Definitions and Basic Concepts • Metric: unit of measurement used to quantify the amount of something • E.g., Beck's Depression Inventory • Metric = 0 to 63 (BDI; Beck & Steer, 1987) • E.g., Self-report Depression Scale • Metric = 25 to 100 (SDS; Zung, 1965)
  7. 7. Definitions and Basic Concepts • Arbitrary metric: • Scores not inherently meaningful, other than relative interpretation • Formally: Unknown where a given score locates an individual on the underlying psychological dimension (Blanton & Jaccard, 2006a, 2006b) [Figure labels: SDS, BDI]
  8. 8. [Figure: Thermometers A, B, and C (labels: cooking thermometer, indoor thermometer, outdoor thermometer) aligned on an underlying dimension via freezing and boiling reference points; by analogy, the SDS, BDI, and MDI aligned on an underlying dimension via a behavioral reference point]
  9. 9. Main Strategies of Metric Calibration • Strategy 1 • Mapping scores to qualitatively distinct behaviors • Strategy 2 • Mapping scores to gradation of behaviors (Blanton & Jaccard, 2006a, 2006b; Sechrest et al., 1996) • Strategy 3 • Experimental approach • Manipulate construct to extreme levels
  10. 10. Metric Calibration Strategy 1 • Map scores to qualitatively distinct theoretically-relevant behaviors [Figure: probability of suicide attempt (y-axis, 0 to 0.8) against Beck Depression Inventory scores (x-axis, 0 to 40), with a behavioral reference point marked on the underlying dimension]
  11. 11. Metric Calibration Strategy 2 • Map scores to gradations of theoretically-relevant behaviors [Figure: reference points at 2 hrs/day, 8 hrs/day, and 12 hrs/day marked on the underlying dimension]
  12. 12. Ideal Characteristics of Behavioral Reference Points • Theoretically-relevant • Interpretationally clear (e.g., 1 or 0; hrs/day) • Objective • Unambiguous construct-wise • Also, theoretically-configured context
  13. 13. Past Metric Calibration Research • Specific areas of applied psychology: • Clinical psychology (Kazdin, 1999, 2001; Harman et al., 2001; Sechrest et al., 1996) • Sport psychology (Andersen, McCullagh, & Wilson, 2007) • Forensic psychology (Pirelli et al., 2011; Hanson, 2009; Hanson et al., in press) • Arbitrary metrics in psychology (Blanton & Jaccard, 2006a, 2006b)
  14. 14. Utility of Metric Calibration 1. Help in the interpretation of data a. Enhance interpretability of statistical effects b. Facilitate extraction of more information from data patterns c. Help overcome limitations of NHST 2. Facilitate construct validity research a. Help shed brighter light on psychological constructs b. Help with conceptual challenges (e.g., construct definition) c. Benchmark for detecting problems/improving measures
  15. 15. Utility of Metric Calibration 3. Contribute to theoretical development a. Facilitate theoretical debates involving absolute claims b. Allow more precise theorizing via enhanced scientific language c. Preliminary platform for quantitative testing of theories (Meehl, 1978) 4. Facilitate general accumulation of knowledge a. Calibration findings valuable information in their own right b. Guiding framework for cataloguing magnitude of psychological effects c. Facilitate phenomenon-based research (Rozin, 2001)
  16. 16. Feasibility of Metric Calibration • Empirical demonstration studies • Study 1: Need for cognition (NFC), task persistence (TP), conscientiousness • Study 2: Self-enhancement • Study 3: Risk-taking
  17. 17. Study 1: NFC and TP • Participants • 94 UWO introductory psychology undergraduates • 69 females, 25 males (age = 18.5, SD = 2.2) • Procedure & Materials • Need for cognition measure • Task persistence measure • Word association decision task • Anagram Persistence task • Demographics & Debriefing questions
  18. 18. Study 1: Materials • Need for cognition (NFC) • Tendency to engage in cognitively effortful activities and enjoy thinking in its own right (Cacioppo & Petty, 1982) • 18-item scale (Cacioppo, Petty, & Kao, 1984) • E.g. item: “I find satisfaction in deliberating hard for long hours.” • E.g. item: “Thinking is not my idea of fun” (R) 1 = Extremely Uncharacteristic 2 = Somewhat Uncharacteristic 3 = Uncertain 4 = Somewhat Characteristic 5 = Extremely Characteristic
  19. 19. Study 1: Materials • NFC behavioral reference point • Cognitively effortful (vs. simpler) Remote Associates Test (RAT) (Mednick & Mednick, 1967)
  20. 20. Study 1: Materials • Task persistence • Tendency to persist in an effortful behavior or frustration-inducing activity (Steinberg et al., 2007) • 2-item self-report measure (Steinberg et al., 2007) • Item 1: “I will keep trying the same thing over again even when I have not had success the first time” • Item 2: “I will often continue to work on something, even after other people have given up.” 1 = Very untrue, not at all like me 2 = Somewhat untrue or not like me 3 = Somewhat true or like me 4 = Very true, very much like me
  21. 21. Study 1: Materials • Task persistence behavioral reference point • Anagram persistence task (Brandon et al., 2003; Quinn et al., 1996)
  22. 22. Study 1: Results: NFC Wald’s χ2 = 9.71, B = 1.20, odds ratio (OR) = 3.33, p < .002 [Figure: NFC scale (1 to 5) aligned to the behavioral reference point on the underlying dimension; Task 1: 62%, Task 2: 38%]
  23. 23. Study 1: Results: Task Persistence Linear: B = 0.18, β = r = .15, p < .15 Cubic model: F(3, 90) = 2.00, p < .10
  24. 24. Study 1: Discussion • Enhance MMR analyses • Re-analysis of O’Hara et al. (2009) • Conventional +/- 1 SD approach vs. using calibrated values: 25% NFC behavior, 50% NFC behavior (NFC scores centered on 3.8), 75% NFC behavior
  25. 25. Study 2 Demonstration • Self-enhancement measures • Background context • Pan-cultural self-enhancement debate (Sedikides et al., 2003; Heine, 2005)
  26. 26. Study 2 • Participants • 97 UWO introductory psychology undergraduates • 50 females, 47 males (age = 18.9, SD = 1.3) • Procedure & Materials • 2 self-enhancement measures • Filler task (RAT) • Over-claiming technique • Balanced Inventory of Desirable Responding • Demographics & Debriefing questions
  27. 27. Study 2: Materials • Self-enhancement • Tendency to view characteristics of oneself in an overly positive manner (Hogan & Nicholson, 1988) • Better-than-average judgments (Alicke et al., 1995; Gaertner et al., 2008) • Rate extent to which each listed trait describes yourself relative to the average Western student of your own age and gender POSITIVE: dependable intelligent considerate observant polite respectful cooperative reliable friendly creative NEGATIVE: gullible disobedient snobbish lazy disrespectful mean unforgiving vain uncivil unpleasant 1 = Much worse than the average university student of my age and gender 4 = As well as the average university student of my age and gender 7 = Much better than the average university student of my age and gender
  28. 28. Study 2: Materials • Self-enhancement behavioral reference point • Over-claiming technique variant (OCT; Paulhus et al., 2003) • 150 items (10 categories of 15 items) • 3 non-existent items (foils) per category; 30 foils total • Behavioral index: # of foils claimed as familiar PLEASE INDICATE FOR EACH ITEM WHETHER YOU ARE FAMILIAR WITH THE ITEM OR NOT, BY CLICKING THE APPROPRIATE RESPONSE OPTION: 0 = Never heard of it 1 = Familiar with it
  29. 29. Study 2: Results Linear: B = 1.88, β = r = .29, p < .004 Cubic model: F(3, 94) = 5.91, p < .004
  30. 30. Study 3 Demonstration • Risk-taking measures • Demonstrate metric calibration for: • Measures capturing state-like constructs • Behavioral measures
  31. 31. Study 3 • Participants • 99 individuals from UWO campus • Compensated $5 + earnings in BART task • 39 females, 58 males, 2 non-specified (age = 24.5, SD = 5.5) • Procedure & Materials • Balloon Analogue Risk Task (BART) • Columbia Card Task (CCT) • Risky gambles Lottery task • Two self-report risk-taking measures • Demographics & Debriefing questions
  32. 32. Study 3: Materials • Risk-taking • Behavior involving possibility of gains but with potential negative consequences (Ben-Zur & Zeidner, 2009; Lejuez et al., 2002) • Balloon Analogue Risk Task (BART) (Lejuez et al., 2002) • Ps inflate 30 simulated balloons onscreen • Each balloon pump worth 1 cent • If balloon explodes, money is lost for that trial • Scoring: mean # of pumps (non-exploding trials)
  33. 33. Study 3: Materials • Columbia Card Task (CCT) – hot version (Figner et al., 2009) • Ps sequentially turn over cards in 4 x 8 array • Accumulate as many points as possible • Can continue unless loss card turned
  34. 34. Study 3: Materials • Behavioral reference points • Risky gambles in lottery risk task (Hsee & Weber, 1999) • If Option B selected, experimenter would actually flip a coin • Risky gambles on lotteries with larger sure bets reflective of higher risk-taking reference point • Option B (same in all five lotteries): Flip a coin. Receive $10 if heads, receive $0 if tails. • Option A by lottery: Lottery 1 = $6 for certain; Lottery 2 = $2 for certain; Lottery 3 = $8 for certain; Lottery 4 = $5 for certain; Lottery 5 = $4 for certain
  35. 35. Study 3: Results: BART Wald’s χ2 = 4.85, B = .03, odds ratio (OR) = 1.03, p < .03
  36. 36. Study 3: Results: CCT $4 safe bet: Wald’s χ2 = 3.24, B = .08, odds ratio (OR) = 1.08, p < .07 $6 safe bet: Wald’s χ2 = 5.78, B = .30, odds ratio (OR) = 1.35, p < .02
  37. 37. Study 3: Discussion • BART & CCT calibrated to common $4 reference point • Implication: • Enhanced interpretation of data patterns • Proposed benefit 1. b) extraction of more information [Figure: BART scale (0 to 30) and CCT scale (0 to 90) aligned to the $4 reference point on the underlying dimension]
  38. 38. Limitations & Caveats • Small sample sizes • Consensus re: reference points
  39. 39. Future Directions • Richer behavioral reference points • E.g., EAR (Mehl et al., 2002) • E.g., Eye-tracking • Experimental approach • Capture behavioral manifestations beyond naturally-occurring levels • Item Response Theory approach (Lord, 1980) • Model distinct and ordered behavioral reference points
  40. 40. END • Thanks to all who have helped: • Conceptual: • Bertram, Kurt, Chris, Paul, Yang • Data collection: • Scott Leith • Assigning Cohen (1994): • Lorne
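
The results slides report a logistic slope B next to each odds ratio (OR). As a rough reading aid, the sketch below recomputes OR = exp(B) from the slopes given in the transcript; the dictionary keys and the function name are illustrative, and small gaps between recomputed and reported values are expected because the slopes are rounded on the slides.

```python
import math

# (B, reported OR) pairs copied from the results slides.
REPORTED = {
    "Study 1, NFC": (1.20, 3.33),
    "Study 3, BART": (0.03, 1.03),
    "Study 3, CCT, $4 safe bet": (0.08, 1.08),
    "Study 3, CCT, $6 safe bet": (0.30, 1.35),
}


def odds_ratio(b: float) -> float:
    """Multiplicative change in the odds of the reference behavior per 1-unit score increase."""
    return math.exp(b)


for label, (b, reported_or) in REPORTED.items():
    print(f"{label}: exp({b:.2f}) = {odds_ratio(b):.2f} (reported OR = {reported_or:.2f})")
```

Read this way, a one-point increase in NFC multiplies the odds of the reference behavior by about 3.3, whereas one additional BART pump multiplies the odds of the risky gamble by about 1.03.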
