The importance of construct validity in designing serious games
Brock Dubbels
Overview of Talk
• Great ideas
  – we want them to get better
  – Why video games?
• You mean we have to show evidence?
  – The Vegas Effect
• What can be done
  – Methods, validity, and return on investment
• Great ideas pt 2
  – we know how to help
The market
• Actual market in 2009
  – 52 Billion
  – Growth actually closer to 25%
• 2014 Projection
  – 86 Billion
  – Projected growth of 9.4%
• Serious games projection
  – 400 to 500 Million
(PricewaterhouseCoopers, 2007)
Get on the bandwagon?
From a consumer perspective, serious games should be judged on how they deliver learning outcomes, not on what they generate in sales in the marketplace.
• Games, by their very nature, assess, measure, and evaluate.
  – They can be very much like tools used in psychological assessments and evaluations.
Informative Assessment
• An informative assessment stresses meaningful, timely, and continuous feedback about learning during learning.
• It is intended to provide the problems in contexts where the learner can learn through trial and error, with feedback based upon the criteria for competence.
• Research findings from over 4,000 studies indicate that informative assessment has the most significant impact on achievement (Wiliam, 2007).
The Vegas Effect
Should everything that happens in games stay in games?
It is not enough to invoke games and play.
Serious games should provide evidence that they delivered.
This should be quantifiable in performance metrics.
Surface (face) Validity
• Games are often built on this.
  – It looks like it measures what it is supposed to measure.
  – It appears to be a good project.
  – Does game-like delivery cut costs?
Construct Validity
• When we claim construct validity, we are essentially claiming that our observed pattern (how things operate in reality) corresponds with our theoretical pattern (how we think the world works).
• There are four main ways for assessing construct validity:
  – Internal validity is related to what actually happens in a study.
    • Has the independent variable really had an effect on the dependent variable?
    • Or was the effect on the dependent variable caused by some other confounding variable?
  – Convergent and discriminant validity
  – External validity refers to whether the findings of a study really can be generalized beyond the present study.
    • Can we use the same measures in the game, and use them in the work environment?
    • Will the measures we use in the work environment be effectively modeled for in-game learning that transfers?
  – We can break external validity down into two types:
    • Population validity, which refers to the extent to which the findings can be generalized to other populations of people.
    • Ecological validity, which refers to the extent to which the findings can be generalized beyond the present situation.
Pre-Design Methods
• Elicitation – do you really need a game?
• Cognitive Ethnography
• Identification of
  – quality of life and cost variables
  – theoretic processes, & documentation variables for kiosk.
• Fidelity
• Assessment, measurement, and evaluation methods.
Tension in workflow
• Software Design
  – Typically based upon an economic consideration.
    • How will this solve a problem?
    • What are the first steps in production?
  – The focus is on stages of production:
    • Business Partner Relations, Function, Behavior, Structure, & Non-Function (qualities).
• Research Design
  – Typically based upon answering a testable question.
    • How will this solve a problem?
    • How do I know this?
  – The focus is on method and hypothesis testing:
    • Construct validity, reliability, and probability.
Nomological Networks
• This is an attempt to provide better assurance of construct validity. To do this, the researcher should provide a theoretical framework for what is being measured, an empirical framework for how it is to be measured, and specification of the linkage between these two frameworks.
  – (Cronbach & Meehl, 1955)
Multi Trait Multi Method (MTMM)
Construct   Method
ADL         Kiosk
CRB         Rubric
SDT         Inventories
Campbell and Fiske (1959)
Great explanation here: http://luna.cas.usf.edu/~mbrannic/files/pmet/mtmm1a.htm
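A minimal sketch of how an MTMM comparison might be computed, assuming each construct is scored by more than one method (for instance, in the game and in the workplace). The scores, the method labels, and the helper names `pearson` and `mtmm_summary` are invented for illustration and are not data or tooling from the project. Convergent validity shows up as high correlations between different methods measuring the same construct; discriminant validity shows up as low correlations between different constructs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mtmm_summary(scores):
    """scores maps (construct, method) -> per-person scores.

    Returns (mean convergent r, mean discriminant r):
    convergent   = same construct, different method
    discriminant = different constructs (same or different method)
    """
    convergent, discriminant = [], []
    for (k1, v1), (k2, v2) in combinations(scores.items(), 2):
        r = pearson(v1, v2)
        if k1[0] == k2[0]:
            convergent.append(r)
        else:
            discriminant.append(r)
    return (sum(convergent) / len(convergent),
            sum(discriminant) / len(discriminant))


# Invented scores for six hypothetical nursing assistants.
scores = {
    ("ADL", "game"):      [1, 2, 3, 4, 5, 6],
    ("ADL", "kiosk"):     [1, 3, 2, 4, 6, 5],
    ("SDT", "game"):      [4, 1, 6, 2, 5, 3],
    ("SDT", "inventory"): [4, 2, 5, 1, 6, 3],
}

convergent, discriminant = mtmm_summary(scores)
print(f"mean convergent r = {convergent:.2f}")
print(f"mean discriminant r = {discriminant:.2f}")
```

With real data, a full MTMM matrix would also report reliabilities on the diagonal and separate the same-method from the different-method discriminant correlations; this sketch collapses them into one average for brevity.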
Cognitive Ethnography Design/Usability Perspectives for MTMM validity measure
The state of long term care
• 55% white, 35% Black, 10% Hispanic
• Most workers are economically disadvantaged.
• Low levels of educational attainment.
• Physically and emotionally demanding work, but often among the lowest paid in the service industry.
• Viewed as an unpleasant occupation: primarily a maid service taking care of incontinent, cognitively unaware old people.
• 45% attrition rate in first 90 days. Some reports show 100% turnover a year.
• Great shortage: with shortages come reductions in quality of care.
• Expected growth rate of 85% with Baby Boomers retiring.
• Regulation tends to emphasize entry training, with limited attention to continued career growth or development.
• Supervisors with “good people skills” and promotion of worker autonomy are the most important predictors of higher job satisfaction and lower turnover rates.
  » From “Who Will Care for Us?” US Dept. Health & Human Services
The User Story
• Functionally capable, but not skilled.
  – Soft skills
  – Documentation skills
• High school education +/-1
• Like popular culture.
• A bit irreverent about the job, but this is a coffee-break coping mechanism.
• Limited care load while training.
• Enjoy popular culture – soaps, drama, etc.
Hypothesis
• Will improved people skills and increased worker autonomy reduce attrition through improving the perceived quality of life?
  – Will improved perceived quality of life (PQoL):
    • Increase well-being in residents and nursing assistants?
    • Reduce pain management?
    • Reduce catastrophic care?
    • Improve confidence and accuracy in information gathering and reporting?
Theoretical perspective
• Improve soft skills and documentation.
  • Quality of Life Measures
    – How these are affected through: Presence, Constructive Conversation, Active Listening
      » These are used for game mechanics and coding dialog.
• Reducing attrition, improving Perceived Quality of Life (PQoL), and improving documentation will reduce costs and allow for more hires and better wages.
  – Reduce: pain meds, attendant care, catastrophic care.
• Documentation
  – Reduce elicitation from medical staff
  – Improve medical staff objective knowledge on daily living skills
Analysis Tools
• In order to measure whether the game does what it was designed to do:
  – Analysis criteria must exist inside and outside of the game for evaluation.
  – Same underlying measures from game
    • Tools from ADL, SDT, Complex Relationship Building
• Inside:
  – Scoring system with weighted dialog
  – Story content equated to kiosk input
  – Play aloud
• Outside:
  – Observational scoring tool for preceptors
  – Survey for self-report
  – Resident survey
  – Kiosk
  – Care plans
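The in-game "scoring system weighted dialog" could be sketched roughly as follows. This is an illustration under stated assumptions, not the scoring used in the actual game: the dialog option names and weights in `DIALOG_WEIGHTS` are hypothetical, and only the three construct labels (Presence, Constructive Conversation, Active Listening) come from the slides. The idea is that every dialog choice is tagged with a construct, so in-game totals line up with the same constructs that the outside tools measure.

```python
from collections import defaultdict

# Hypothetical dialog options: option id -> (construct tag, weight).
# Positive weights reward a choice; negative weights penalize it.
DIALOG_WEIGHTS = {
    "ask_open_question":   ("Constructive Conversation", 2.0),
    "repeat_back_concern": ("Active Listening", 2.0),
    "sit_at_eye_level":    ("Presence", 1.5),
    "change_subject":      ("Active Listening", -1.0),
    "rush_to_next_task":   ("Presence", -1.5),
}


def score_session(choices):
    """Sum the weights of a player's dialog choices per construct."""
    totals = defaultdict(float)
    for choice in choices:
        construct, weight = DIALOG_WEIGHTS[choice]
        totals[construct] += weight
    return dict(totals)


# One invented play session.
session_scores = score_session(
    ["ask_open_question", "repeat_back_concern", "change_subject"]
)
print(session_scores)
```

Keeping the construct tag next to each weight means per-construct totals can be compared directly with preceptor rubric scores or survey subscales for the same construct.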
Four questions
1. Can I take any credit for any changes that have happened in an individual's learning?
2. Does this have a connection to my instructional activities?
3. Do these instructional activities equate to a return on investment?
4. How do I know this?
Theory of interaction
• The central cog in Figure 6, Psychological Needs, is modeled from Self-Determination Theory (Deci & Ryan, 2000).
• The base measure, or bottom cog, came from the Activities of Daily Living (Roper, Logan, & Tierney, 1980; 2000) and is hypothesized to be influenced through interpersonal relations.
• The interpersonal relations were modeled from operationalization of Complex Relationship Building (Bulechek, Butcher, & Dochterman, 2008).
SDT
• Self-Determination Theory
  – SDT is a macro theory, and it is concerned with supporting our natural or intrinsic tendencies to behave in effective and healthy ways.
  – The key game play element here was as a larger category for scoring criteria and providing accessible terms for training.
  – In the work environment, these inventories are used to provide an opportunity to create external, environmental, and population validity.
• Basic Psychological Needs Scale
  – General
    • 21 items
    • (Baard, Deci, & Ryan, 2004)
  – Relationships
    • 9 items
    • (La Guardia, Ryan, Couchman, & Deci, 2000)
  – Work
    • 21 items
    • (Deci, Ryan, Gagne, Leone, Usunov, & Kornazheva, 2001)
ADL
• Activities of Daily Living
  – The facility had already identified 8 items for identification in their kiosk software.
  – The key game play element here was modeling the facility kiosks in the game and scoring the resident interaction scenarios with how the CNAs document their observations.
  – In the work environment, the kiosks are already used to collect data, and this provides an opportunity to create external, environmental, and population validity and provide ROI analysis for care plans.
• The term “activities of daily living” refers to a set of common, everyday tasks, performance of which is required for personal self-care and independent living. The most often used measure of functional ability is the Katz Activities of Daily Living Scale (Katz et al., 1963; Katz, 1983).
  – Wiener, Hanley, Clark, & Van Nostrand (1990, p. 1)
Complex Relationship Building
• Complex Relationship Building
  – Identified from the Nursing Interventions Classification (NIC) (Bulechek, Butcher, & Dochterman, 2008)
  – The key game play element was modeling interactions with the residents and providing an optimal path for interaction.
  – Although this care giving practice is supposed to take an hour, the CNA must choose how to apportion that hour.
  – In the work environment, these activities are used to provide language and action for continuous improvement. They are part of rubrics for observation, and a further opportunity to create external, environmental, and population validity.
• A nursing intervention from the Nursing Interventions Classification (NIC), defined as establishing a therapeutic relationship with a patient to promote insight and behavioral change.
• NIC identifies a 1 hour intervention.
• There are 31 activities identified.
  – These activities represent the focus of the game design and scoring process.
  – 3 operationalized processes were taught to mediate this:
    • Constructive Conversation
    • Presence
    • Active Listening
Design Decisions
• FPH – the first person healer
• Flash / with database
  – Web-host as well as installed with data upload.
• Time serves as game element
  – Functional task selection vs. Interpersonal Communication
• Dialog driven
  – Dialog supported by video cut scenes with voice narration
• Mini games such as room clean up
• Reward system (blue stars)
• Preceptor / Optimal Path / Debrief
• Interface with task log, resident information, clock, pause.
• Use real people’s faces as avatars
• Increase engagement through subtle but tasteful irreverence.
Outcome Analysis
• General Linear Model
• Quality of Life Variables
  – Operationalized in soft skills and PQoL construct
    • Presence, Active Listening, and Constructive Conversation
• Longitudinal study
  – Pre, Game Play, Post
    • Compare performance in: surveys, objective observer data, game play, non-game play controls, self-report.
    • Game play – construct sub-level scoring, i.e. number of residents, rewards, optimal path decisions.
  – Institutional data pre / post
    • Compare catastrophic care, pain meds, independence, attrition
  – Use game play, survey, and observational tools as covariates.
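One simple member of the general linear model family that fits a pre / post design with a non-game-play control group is an ANCOVA-style regression: post score predicted from an intercept, the pre score as a covariate, and a 0/1 indicator for game play. The sketch below assumes that model form, which the slides do not specify, and uses invented, noise-free data so the fitted coefficients are easy to check. The helper `ols` is a plain least-squares solver written for this illustration, not part of any project tooling.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented data: pre-test score and whether the person played the game.
pre = [10, 12, 14, 16, 11, 13, 15, 17]
played = [0, 0, 0, 0, 1, 1, 1, 1]
# Post scores generated as 1 + 0.5 * pre + 2 * played (no noise),
# so the "game effect" to recover is exactly 2.
post = [1 + 0.5 * p + 2 * g for p, g in zip(pre, played)]

# Design matrix columns: intercept, pre-score covariate, game indicator.
X = [[1.0, float(p), float(g)] for p, g in zip(pre, played)]
intercept, pre_slope, game_effect = ols(X, post)
print(f"intercept={intercept:.2f} pre_slope={pre_slope:.2f} "
      f"game_effect={game_effect:.2f}")
```

In a real analysis the outcome would be noisy, so a statistics package that also reports standard errors and tests would be the natural choice; further covariates such as survey or observational scores would simply be extra columns in `X`.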
Take home
• Can you pose a testable question (a hypothesis)?
  – Tension between design process and measurement
    • Needs – behavior, function, non-function, structure.
    • Construct validity – are you measuring what you think you are measuring? Theoretically? Conceptually?
    • Assessments, measures, & evaluations
• Mixed Methods approaches such as cognitive ethnography can provide an opportunity to create a nomological network.
  – MTMM provides an analysis tool that can be constructed to identify convergent and discriminant validity.
• Spend time understanding the sample population
  – Beliefs, likes, skills, & abilities.
  – Irreverence increases engagement, but reduces happiness of business partner.
• Usability testing should align with construct
• Again, emphasis on validity
  – Without it, there is no capability for ROI analysis