Volume XI, Number 2, Summer 2005
Issue Topic: Evaluation Methodology
Theory & Practice
Evaluation Theory or What Are Evaluation Methods for?

Mel Mark, professor of psychology at the Pennsylvania State University and president-elect of the American Evaluation Association, discusses why theory is important to evaluation practice.

Hey, this issue of The Evaluation Exchange focuses on methods, including recent methodological developments. What’s this piece on evaluation theory doing here? Was there some kind of mix-up?

No, it’s not a mistake. Although evaluation theory [1] serves several purposes, perhaps it functions most importantly as a guide to practice. Learning the latest methodological advance (whether it’s some new statistical adjustment for selection bias or the most recent technique to facilitate stakeholder dialogue) without knowing the relevant theory is a bit like learning what to do without knowing why or when.

What you risk is the equivalent of becoming really skilled at tuning your car’s engine without thinking about whether your transportation needs involve going across town, overseas, or to the top of a skyscraper. Will Shadish, Tom Cook, and Laura Leviton make the same point using a military metaphor: “Evaluation theories are like military strategies and tactics; methods are like military weapons and logistics,” they say. “The good commander needs to know strategy and tactics to deploy weapons properly or to organize logistics in different situations. The good evaluator needs theories for the same reasons in choosing and employing methods.” [2]

The reasons to learn about evaluation theory go beyond the strategy/tactic or why/how distinction, however. Evaluation theory does more than help us make good judgments about what kind of methods to use, under what circumstances, and toward what forms of evaluation influence.

First, evaluation theories are a way of consolidating lessons learned, that is, of synthesizing prior experience. Carol Weiss’ work can help evaluators develop a more sophisticated and nuanced understanding of the way organizations make decisions and may be influenced by evaluation findings. [3] Theories enable us to learn from the experience of others (as the saying goes, we don’t live long enough to learn everything from our own mistakes). George Madaus, Michael Scriven, and Daniel Stufflebeam had this function of evaluation theory in mind when they said that evaluators who are unknowledgeable about theory are “doomed to repeat past mistakes and, equally debilitating, will fail to sustain and build on past successes.” [4]

Second, comparing evaluation theories is a useful way of identifying and better understanding the key areas of debate within the field. Comparative study of evaluation theory likewise helps crystallize what the unsettled issues are for practice. When we read the classic exchange between Michael Patton and Carol Weiss, [5] for example, we learn about very different perspectives on what evaluation use can or should look like.

A third reason for studying evaluation theory is that theory should be an important part of our identities as evaluators, both individually and collectively. If we think of ourselves in terms of our methodological skills, what is it that differentiates us from many other people with equal (or even superior) methodological expertise? Evaluation theory. Evaluation theory, as Will Shadish said in his presidential address to the American Evaluation Association, is “who we are.” [6] But people come to evaluation through quite varied pathways, many of which don’t involve explicit training in evaluation.
That there are myriad pathways into evaluation is, of course, a source of great strength for the field, bringing diversity of skills, opinions, knowledge sets, and so on.

Despite the positive consequences of the various ways that people enter the field, this diversity also reinforces the importance of studying evaluation theories. Methods are important, but, again, they need to be chosen in the service of some larger end. Theory helps us figure out where an evaluation should be going and why, and, not trivially, what it is to be an evaluator.

Of course, knowing something about evaluation theory doesn’t mean that choices about methods can be made automatically. Indeed, lauded evaluation theorist Ernie House notes that while theorists typically have a high profile, “practitioners lament that [theorists’] ideas are far too theoretical, too impractical. Practitioners have to do the project work tomorrow, not jawbone fruitlessly forever.” [7] Especially for newer theoretical work, the translation into practice may not be clear, and sometimes not even feasible. Even evaluation theory that has withstood the test of time doesn’t automatically translate into some cookbook or paint-by-number approach to evaluation practice.

More knowledge about evaluation theory can, especially at first, actually make methods choices harder. Why? Because many evaluation theories take quite different stances about what kind of uses evaluation should focus on, and about how evaluation should be done to achieve those uses.
For example, to think about Donald Campbell [8] as an evaluation theorist is to highlight (a) the possibility of major choice points in the road, such as decisions about whether or not to implement some new program; (b) the way decisions about such things often depend largely on the program’s potential effects (e.g., does universal pre-K lead to better school readiness and other desirable outcomes?); and (c) the benefits of either randomized experiments or the best-available quasi-experimental data for assessing program effects.

In contrast, when we think about Joseph Wholey [9] as an evaluation theorist, we focus on a very different way that evaluation can contribute: through developing performance-measurement systems that program administrators can use to improve their ongoing decision making. These performance-measurement systems can help managers identify problem areas and also provide them with good-enough feedback about the apparent consequences of decisions.

Choosing among these and the many other perspectives available in evaluation theories may seem daunting, especially at first. But it’s better to learn to face the choices than to have them made implicitly by some accident of one’s methodological training. In addition, theories themselves can help in the choosing. Some evaluation theories have huge “default options.” These theories may not exactly say “one size fits all,” but they certainly suggest that one size fits darn near all. Indeed, one of the dangers for those starting to learn about evaluation theory is becoming a true believer in one of the first theories they encounter. When this happens, the new disciple may act like his or her preferred theory fits all circumstances. Perhaps the most effective antidote to this problem is to be sure to learn about several evaluation theories
that take fairly different stances. Metaphorically, we probably need to be multilingual: No single evaluation theory should be “spoken” in all the varying contexts we will encounter.

However, most, if not all, evaluation theories are contingent; that is, they prescribe (or at least are open to) quite different approaches under different circumstances. As it turns out, there even exist theories that suggest very different bases for contingent decision making. Put differently, there are theories that differ significantly on reasons for deciding to use one evaluation design and not another.

These lead us to think about different “drivers” of contingent decision making. For example, Michael Patton’s well-known Utilization-Focused Evaluation tells us to be contingent based on intended use by intended users. Almost any method may be appropriate, if it is likely to help intended users make the intended use. [10] Alternatively, in a recent book, Huey-Tsyh Chen joins others who suggest that the choices made in evaluation should be driven by program stage. [11] Evaluation purposes and methods for a new program, according to Chen, would typically be different from those for a mature program. Gary Henry, George Julnes, and I [12] have suggested that choices among alternative evaluation purposes and methods should be driven by a kind of analytic assessment of each one’s likely contribution to social betterment. [13]

It can help to be familiar with any one of these fundamentally contingent evaluation theories. And, as is true of evaluation theories in general, one or another may fit better, depending on the specific context. Nevertheless, the ideal would probably be to be multilingual even in terms of these contingent evaluation theories.
For instance, sometimes intended use may be the perfect driver of contingent decision making, but in other cases decision making may be so distributed across multiple parties that it isn’t feasible to identify specific intended users: Even the contingencies are contingent.

Evaluation theories are an aid to thoughtful judgment, not a dispensation from it. But as an aid to thoughtful choices about methods, evaluation theories are indispensable.

[1] Although meaningful distinctions could perhaps be made, here I am treating evaluation theory as equivalent to evaluation model and to the way the term evaluation approach is sometimes used.
[2] Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage.
[3] For a recent overview, see Weiss, C. H. (2004). Rooting for evaluation: A Cliff Notes version of my work. In M. C. Alkin (Ed.), Evaluation roots: Tracing theorists’ views and influences (pp. 12–65). Thousand Oaks, CA: Sage.
[4] Madaus, G. F., Scriven, M., & Stufflebeam, D. L. (1983). Evaluation models. Boston: Kluwer-Nijhoff.
[5] See the papers by Weiss and Patton in Vol. 9, No. 1 (1988) of Evaluation Practice, reprinted in M. Alkin (Ed.). (1990). Debates on evaluation. Newbury Park, CA: Sage.
[6] Shadish, W. (1998). Presidential address: Evaluation theory is who we are. American Journal of Evaluation, 19(1), 1–19.
[7] House, E. R. (2003). Evaluation theory. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of educational evaluation (pp. 9–14). Boston: Kluwer Academic.
[8] For an overview, see the chapter on Campbell in the book cited in footnote 2.
[9] See, e.g., Wholey, J. S. (2003). Improving performance and accountability: Responding to emerging management challenges. In S. I. Donaldson & M. Scriven (Eds.), Evaluating social programs and problems: Visions for the new millennium. Mahwah, NJ: Lawrence Erlbaum.
[10] Patton, M. Q. (1997). Utilization-focused evaluation: The new century text.
Thousand Oaks, CA: Sage.
[11] Chen, H.-T. (2005). Practical program evaluation: Assessing and improving planning, implementation, and effectiveness. Thousand Oaks, CA: Sage.
[12] Mark, M. M., Henry, G. T., & Julnes, G. (2000). Evaluation: An integrated framework for understanding, guiding, and improving policies and programs. San Francisco, CA: Jossey-Bass.
[13] Each of these three theories also addresses other factors that help drive contingent thinking about evaluation. In addition, at least in certain places, there is more overlap among these models than this brief summary suggests. Nevertheless, they do in general focus on different drivers of contingent decision making, as noted.

Mel Mark, Ph.D.
Professor of Psychology
The Pennsylvania State University
Department of Psychology
407 Moore Hall
University Park, PA 16802.
Tel: 814-863-1755
Email: firstname.lastname@example.org

learning and training evaluation theory

Donald L Kirkpatrick's training evaluation model - the four levels of learning evaluation

also below - HRD performance evaluation guide

Donald L Kirkpatrick, Professor Emeritus, University Of Wisconsin (where he achieved his BBA, MBA and PhD), first published his ideas in 1959, in a series of articles in the Journal of American Society of Training Directors. The articles were subsequently included in Kirkpatrick's book Evaluating Training Programs (originally published in 1994; now in its 3rd edition - Berrett-Koehler Publishers).

Donald Kirkpatrick was president of the American Society for Training and Development (ASTD) in 1975. Kirkpatrick has written several other significant books about training and evaluation, more recently with his similarly inclined son James, and has consulted with some of the world's largest corporations.

Donald Kirkpatrick's 1994 book Evaluating Training Programs defined his originally published ideas of 1959, thereby further increasing awareness of them, so that his theory has now become arguably the
most widely used and popular model for the evaluation of training and learning. Kirkpatrick's four-level model is now considered an industry standard across the HR and training communities.

More recently Don Kirkpatrick formed his own company, Kirkpatrick Partners, whose website provides information about their services and methods, etc.

kirkpatrick's four levels of evaluation model

The four levels of Kirkpatrick's evaluation model essentially measure:

• reaction of student - what they thought and felt about the training
• learning - the resulting increase in knowledge or capability
• behaviour - extent of behaviour and capability improvement and implementation/application
• results - the effects on the business or environment resulting from the trainee's performance

All these measures are recommended for full and meaningful evaluation of learning in organizations, although their application broadly increases in complexity, and usually cost, through the levels from level 1-4.

kirkpatrick's four levels of training evaluation

This grid illustrates the basic Kirkpatrick structure at a glance. The second grid, beneath this one, is the same thing with more detail. For each level and evaluation type (what is measured) it gives the evaluation description and characteristics, examples of evaluation tools and methods, and relevance and practicability.

level 1 - Reaction
• evaluation description and characteristics: Reaction evaluation is how the delegates felt about the training or learning experience.
• examples of evaluation tools and methods: Happy sheets, feedback forms. Verbal reaction, post-training surveys or questionnaires.
• relevance and practicability: Quick and very easy to obtain. Not expensive to gather or to analyse.

level 2 - Learning
• evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge - before and after.
• examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can also be used.
• relevance and practicability: Relatively simple to set up; clear-cut for quantifiable skills. Less easy for complex learning.

level 3 - Behaviour
• evaluation description and characteristics: Behaviour evaluation is the extent of applied learning back on the job - implementation.
• examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change.
• relevance and practicability: Measurement of behaviour change typically requires cooperation and skill of line-managers.

level 4 - Results
• evaluation description and characteristics: Results evaluation is the effect on the business or environment by the trainee.
• examples of evaluation tools and methods: Measures are already in place via normal management systems and reporting - the challenge is to relate to the trainee.
• relevance and practicability: Individually not difficult; unlike whole organisation. Process must attribute clear accountabilities.

kirkpatrick's four levels of training evaluation in detail

This grid illustrates the Kirkpatrick structure in detail, and particularly the modern-day interpretation of the Kirkpatrick learning evaluation model, usage, implications, and examples of tools and methods. This diagram is the same format as the one above but with more detail and explanation:

1. Reaction
• evaluation description and characteristics: Reaction evaluation is how the delegates felt, and their personal reactions to the training or learning experience, for example: Did the trainees like and enjoy the training? Did they consider the training relevant? Was it a good use of their time? Did they like the venue, the style, timing, domestics, etc? Level of participation. Ease and comfort of experience. Level of effort required to make the most of the learning. Perceived practicability and potential for applying the learning.
• examples of evaluation tools and methods: Typically happy sheets. Feedback forms based on subjective personal reaction to the training experience. Verbal reaction which can be noted and analysed. Post-training surveys or questionnaires. Online evaluation or grading by delegates. Subsequent verbal or written reports given by delegates to managers back at their jobs.
• relevance and practicability: Can be done immediately the training ends. Very easy to obtain reaction feedback. Feedback is not expensive to gather or to analyse for groups. Important to know that people were not upset or disappointed. Important that people give a positive impression when relating their experience to others who might be deciding whether to experience same.

2. Learning
• evaluation description and characteristics: Learning evaluation is the measurement of the increase in knowledge or intellectual capability from before to after the learning experience: Did the trainees learn what was intended to be taught? Did the trainee experience what was intended for them to experience? What is the extent of advancement or change in the trainees after the training, in the direction or area that was intended?
• examples of evaluation tools and methods: Typically assessments or tests before and after the training. Interview or observation can be used before and after although this is time-consuming and can be inconsistent. Methods of assessment need to be closely related to the aims of the learning. Measurement and analysis is possible and easy on a group scale. Reliable, clear scoring and measurements need to be established, so as to limit the risk of inconsistent assessment. Hard-copy, electronic, online or interview style assessments are all possible.
• relevance and practicability: Relatively simple to set up, but more investment and thought required than reaction evaluation. Highly relevant and clear-cut for certain training such as quantifiable or technical skills. Less easy for more complex learning such as attitudinal development, which is famously difficult to assess. Cost escalates if systems are poorly designed, which increases work required to measure and analyse.

3. Behaviour
• evaluation description and characteristics: Behaviour evaluation is the extent to which the trainees applied the learning and changed their behaviour, and this can be immediately and several months after the training, depending on the situation: Did the trainees put their learning into effect when back on the job? Were the relevant skills and knowledge used? Was there noticeable and measurable change in the activity and performance of the trainees when back in their roles? Was the change in behaviour and new level of knowledge sustained? Would the trainee be able to transfer their learning to another person? Is the trainee aware of their change in behaviour, knowledge, skill level?
• examples of evaluation tools and methods: Observation and interview over time are required to assess change, relevance of change, and sustainability of change. Arbitrary snapshot assessments are not reliable because people change in different ways at different times. Assessments need to be subtle and ongoing, and then transferred to a suitable analysis tool. Assessments need to be designed to reduce subjective judgement of the observer or interviewer, which is a variable factor that can affect reliability and consistency of measurements. The opinion of the trainee, which is a relevant indicator, is also subjective and unreliable, and so needs to be measured in a consistent defined way. 360-degree feedback is a useful method and need not be used before training, because respondents can make a judgement as to change after training, and this can be analysed for groups of respondents and trainees. Assessments can be designed around relevant performance scenarios, and specific key performance indicators or criteria. Online and electronic assessments are more difficult to incorporate - assessments tend to be more successful when integrated within existing management and coaching protocols. Self-assessment can be useful, using carefully designed criteria and measurements.
• relevance and practicability: Measurement of behaviour change is less easy to quantify and interpret than reaction and learning evaluation. Simple quick response systems unlikely to be adequate. Cooperation and skill of observers, typically line-managers, are important factors, and difficult to control. Management and analysis of ongoing subtle assessments are difficult, and virtually impossible without a well-designed system from the beginning. Evaluation of implementation and application is an extremely important assessment - there is little point in a good reaction and good increase in capability if nothing changes back in the job, therefore evaluation in this area is vital, albeit challenging. Behaviour change evaluation is possible given good support and involvement from line managers or trainees, so it is helpful to involve them from the start, and to identify benefits for them, which links to the level 4 evaluation below.

4. Results
• evaluation description and characteristics: Results evaluation is the effect on the business or environment resulting from the improved performance of the trainee - it is the acid test. Measures would typically be business or organisational key performance indicators, such as: volumes, values, percentages, timescales, return on investment, and other quantifiable aspects of organisational performance, for instance: numbers of complaints, staff turnover, attrition, failures, wastage, non-compliance, quality ratings, achievement of standards and accreditations, growth, retention, etc.
• examples of evaluation tools and methods: It is possible that many of these measures are already in place via normal management systems and reporting. The challenge is to identify which of them relate, and how, to the trainee's input and influence. Therefore it is important to identify and agree accountability and relevance with the trainee at the start of the training, so they understand what is to be measured. This process overlays normal good management practice - it simply needs linking to the training input. Failure to link to training input type and timing will greatly reduce the ease by which results can be attributed to the training. For senior people particularly, annual appraisals and ongoing agreement of key business objectives are integral to measuring business results derived from training.
• relevance and practicability: Individually, results evaluation is not particularly difficult; across an entire organisation it becomes very much more challenging, not least because of the reliance on line-management, and the frequency and scale of changing structures, responsibilities and roles, which complicates the process of attributing clear accountability. Also, external factors greatly affect organisational and business performance, which cloud the true cause of good or poor results.
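
As a rough illustration of the level 2 (Learning) measurement in the grid above, the "before and after" comparison can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names, the 0-100 scoring scale and the sample scores are all hypothetical, and a simple difference in test scores is just one possible way to express the increase in knowledge, not a method prescribed by Kirkpatrick's model.

```python
from statistics import mean


def learning_gains(pre_scores, post_scores):
    """Return each trainee's gain (post-test score minus pre-test score).

    Both lists are assumed to be in the same trainee order.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post score lists must be the same length")
    return [post - pre for pre, post in zip(pre_scores, post_scores)]


def summarise_gains(pre_scores, post_scores):
    """Summarise a group's before/after assessment results."""
    gains = learning_gains(pre_scores, post_scores)
    return {
        "trainees": len(gains),
        "mean_pre": mean(pre_scores),
        "mean_post": mean(post_scores),
        "mean_gain": mean(gains),
        # Number of trainees whose score went up after the training.
        "improved": sum(1 for gain in gains if gain > 0),
    }


# Hypothetical test scores (out of 100) for four trainees.
pre = [40, 55, 60, 35]
post = [70, 65, 60, 50]
print(summarise_gains(pre, post))
```

A summary like this only says that scores changed; as the grid notes, the assessment itself still has to be closely related to the aims of the learning and scored consistently for the comparison to mean anything.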

Since Kirkpatrick established his original model, other theorists (for example Jack Phillips), and indeed Kirkpatrick himself, have referred to a possible fifth level, namely ROI (Return On Investment). In my view ROI can easily be included in Kirkpatrick's original fourth level, Results. A separate fifth level is therefore arguably only relevant if the assessment of Return On Investment might otherwise be ignored or forgotten when referring simply to the Results level.

Learning evaluation is a widely researched area. This is understandable since the subject is fundamental to the existence and performance of education around the world, not least universities, which of course contain most of the researchers and writers.

While Kirkpatrick's model is not the only one of its type, for most industrial and commercial applications it suffices; indeed most organisations would be absolutely thrilled if their training and learning evaluation, and thereby their ongoing people-development, were planned and managed according to Kirkpatrick's model.

For reference, should you be keen to look at more ideas, there are many to choose from...

• Jack Phillips' Five Level ROI Model
• Daniel Stufflebeam's CIPP Model (Context, Input, Process, Product)
• Robert Stake's Responsive Evaluation Model
• Robert Stake's Congruence-Contingency Model
• Kaufman's Five Levels of Evaluation
• CIRO (Context, Input, Reaction, Outcome)
• PERT (Program Evaluation and Review Technique)
• Alkin's UCLA Model
• Michael Scriven's Goal-Free Evaluation Approach
• Provus's Discrepancy Model
• Eisner's Connoisseurship Evaluation Models
• Illuminative Evaluation Model
• Portraiture Model
• and also the American Evaluation Association

Also look at Leslie Rae's excellent Training Evaluation and tools available on this site, which, given Leslie's experience and knowledge, will save you the job of researching and designing your own tools.

evaluation of HRD function performance

If you are responsible for HR functions and services to internal and/or external customers, you might find it useful to go beyond Kirkpatrick's evaluation of training and learning, and also to evaluate satisfaction among staff/customers with the HR department's overall performance. The parameters for such an evaluation ultimately depend on what your HR function is responsible for - in other words, evaluate according to expectations.

Like anything else, evaluating customer satisfaction must first begin with a clear appreciation of (internal) customers' expectations. Expectations - agreed, stated, published or otherwise - provide the basis for evaluating all types of customer satisfaction.

If people have expectations which go beyond the HR department's stated and actual responsibilities, then the matter must be pursued because it will almost certainly offer an opportunity to add value to HR's activities, and to add value and competitive advantage to your organisation as a whole. In this fast changing world, HR is increasingly the department which is most likely to see and respond to new opportunities for the support and development of your people - so respond, understand, and do what you can to meet new demands when you see them.

If you are keen to know how well the HR department is meeting people's expectations, a questionnaire and/or some group discussions will shed light on the situation.

Here are some example questions. Effectively you should be asking people to say how well the HR or HRD department has done the following:

• helped me to identify, understand and prioritise my personal development needs and wishes, in terms of: skills, knowledge, experience and attitude (or personal well-being, or emotional maturity, or mood, or mind-set, or any other suitable term meaning mental approach, which people will respond to)
• helped me to understand my own preferred learning style and learning methods for acquiring new skills, knowledge and attitudinal capabilities
• helped me to identify and obtain effective learning and development that suits my preferred style and circumstances
• helped me to measure my development, and for the measurement to be clear to my boss and others in the organisation who should know about my capabilities
• provided tools and systems to encourage and facilitate my personal development
• and particularly helped to optimise the relationship between me and my boss relating to assisting my own personal development and well-being
• provided a working environment that protects me from discrimination and harassment of any sort
• provided the opportunity for me to voice my grievances if I have any (in private, to a suitably trained person in the company whom I trust) and then, if I so wish, for proper consideration and response to be given to them by the company
• provided the opportunity for me to receive counselling and advice in the event that I need private and supportive help of this type, again from a suitably trained person in the company whom I trust
• ensured that disciplinary processes are clear and fair, and include the right of appeal
• ensured that recruitment and promotion of staff are managed fairly and transparently
• ensured that systems and activities exist to keep all staff informed of company plans, performance, etc. (as normally included in a Team Briefing system)
• (if you dare...) ensured that people are paid and rewarded fairly in relation to other company employees, and separately, paid and rewarded fairly when compared to market norms (your CEO will not like this question, but if you have a problem in this area it's best to know about it...)
• (and for managers) helped me to ensure the development needs of my staff are identified and supported

This is not an exhaustive list - just some examples. Many of the examples contain elements which, in a typical large company, should be broken down to create more and smaller questions about more specific aspects of HR support and services.

If you work in HR, or run an HR department, and consider that some of these issues and expectations fall outside your remit, then consider who else is responsible for them.

I repeat, in this fast changing world, HR is increasingly the department which is most likely to see and respond to new opportunities for the support and development of your people - so respond, understand, and do what you can to meet new demands when you see them. In doing so you will add value to your people and your organisation - and your department.

Informing Practice Using Evaluation Models and Theories

Instructor: Dr. Melvin M. Mark, Professor of Psychology at the Pennsylvania State University

Description: Evaluators who are not aware of the contemporary and historical aspects of the profession "are doomed to repeat past mistakes and, equally debilitating, will fail to sustain and build on past successes." Madaus, Scriven and Stufflebeam (1983).

"Evaluation theories are like military strategy and tactics; methods are like military weapons and logistics. The good commander needs to know strategy and tactics to deploy weapons properly or to organize logistics in different situations. The good evaluator needs theories for the same reasons in choosing and deploying methods." Shadish, Cook and Leviton (1991).

These quotes from Madaus et al. (1983) and Shadish et al. (1991) provide the perfect rationale for why the serious evaluator should be concerned with models and theories of evaluation. The primary purpose of this class is to overview major streams of evaluation theories (or models), and to consider their implications for practice. Topics include: (1) why evaluation theories matter, (2) frameworks for classifying different theories, (3) in-depth examination of 4-6 major theories, (4) identification of key issues on which evaluation theories and models differ, (5) benefits and risks of relying heavily on any one theory, and (6) tools and skills that can help you in picking and choosing from different theoretical perspectives in planning an evaluation in a specific context. The overarching theme will be practice implications, that is, what difference it would make for practice to follow one theory or some other.

Theories to be discussed will be ones that have had a significant impact on the evaluation field, that offer perspectives with major implications for practice, and that represent important and different streams of theory and practice.
Case examples from the past will be used to illustrate key aspects of each theory's approach to practice.

Participants will be asked to use the theories to question their own and others' practices, and to consider what characteristics of evaluations will help increase their potential for use. Each participant will receive Marvin Alkin's text, Evaluation Roots (Sage, 2004) and other materials.

The instructor's assumption will be that most people attending the session have some general familiarity with the work of a few evaluation theorists, but that most will not themselves be scholars of evaluation theory. At the same time, the course should be useful, whatever one's level of familiarity with evaluation theory.

Applied Measurement for Evaluation

Instructor: Dr. Ann Doucette, TEI Director, Research Professor, Columbian College of Arts and Sciences, George Washington University

Description: Successful evaluation depends on our ability to generate evidence attesting to the feasibility, relevance and/or effectiveness of the interventions, services, or products we study. While theory guides our designs and how we organize our work, it is measurement that provides the evidence we use in making judgments about the quality of what we evaluate. Measurement, whether it results from self-report survey, interview/focus groups, observation, document review, or administrative data, must be systematic, replicable, interpretable, reliable, and valid. While hard sciences such as physics and engineering have
advanced precise and accurate measurement (i.e., weight, length, mass, volume), the measurement used in evaluation studies is often imprecise and characterized by considerable error. The quality of the inferences made in evaluation studies is directly related to the quality of the measurement on which we base our judgments. Judgments that interventions are ineffective may be flawed, reflecting measures that are imprecise and not sensitive to the characteristics we chose to evaluate. Evaluation attempts to compensate for imprecise measurement with increasingly sophisticated statistical procedures to manipulate data. The emphasis on statistical analysis all too often obscures the important characteristics of the measures we choose. This class content will cover these topics:

• Assessing measurement precision: examining the precision of measures in relationship to the degree of accuracy that is needed for what is being evaluated. Issues to be addressed include: measurement/item bias, the sensitivity of measures in terms of developmental and cultural issues, scientific soundness (reliability, validity, error, etc.), and the ability of the measure to detect change over time.
• Quantification: Measurement is essentially assigning numbers to what is observed (direct and inferential). Decisions about how we quantify observations and the implications these decisions have for using the data resulting from the measures, as well as for the objectivity and certainty we bring to the judgment made in our evaluations, will be examined. This section of the course will focus on the quality of response options and coding categories: Do response options/coding categories segment the respondent sample in meaningful and useful ways?
• Issues and Considerations - using existing measures versus developing your own measures: What to look for and how to assess whether existing measures are suitable for your evaluation project will be examined.
Issues associated with the development and use of new measures will be addressed in terms of how to establish sound psychometric properties, and what cautionary statements should accompany the interpretation of evaluation findings using these new measures.
• Criteria for choosing measures: assessing the adequacy of measures in terms of the characteristics of measurement - choosing measures that fit your evaluation theory and evaluation focus (exploration, needs assessment, level of implementation, process, impact and outcome). Measurement feasibility, practicability and relevance will be examined. Various measurement techniques will be examined in terms of precision and adequacy, as well as the implications of using screening, broad-range, and peaked tests.
• Error - influences on measurement precision: The characteristics of various measurement techniques, assessment conditions (setting, respondent interest, etc.), and evaluator characteristics will be addressed.

Participants will be provided with a copy of the text: Measurement Theory in Action (Case Studies and Exercises) by Shultz, K.S. and D.J. Whitney (Sage, 2004).