Kirkpatrick's Four Levels of Evaluation

Assessing training effectiveness often entails using the four-level model developed by Donald Kirkpatrick (1994). According to this model, evaluation should always begin with level one and then, as time and budget allow, move sequentially through levels two, three, and four. Information from each prior level serves as a base for the next level's evaluation. Thus, each successive level represents a more precise measure of the effectiveness of the training program, but at the same time requires a more rigorous and time-consuming analysis.

[Figure caption: In Kirkpatrick's four-level model, each successive evaluation level is built on information provided by the lower level.]

Level 1 Evaluation - Reactions

Just as the word implies, evaluation at this level measures how participants in a training program react to it. It attempts to answer questions regarding the participants' perceptions: Did they like it? Was the material relevant to their work? This type of evaluation is often called a "smile sheet." According to Kirkpatrick, every program should at least be evaluated at this level to provide for the improvement of a training program. In addition, the participants' reactions have important consequences for learning (level two). Although a positive reaction does not guarantee learning, a negative reaction almost certainly reduces its possibility.

Level 2 Evaluation - Learning

Assessing at this level moves the evaluation beyond learner satisfaction and attempts to assess the extent to which students have advanced in skills, knowledge, or attitude. Measurement at this level is more difficult and laborious than at level one. Methods range from formal and informal testing to team assessment and self-assessment. If possible, participants take the test or assessment before the training (pretest) and after training (post-test) to determine the amount of learning that has occurred.
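The pretest and post-test comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Kirkpatrick's model: the function name `learning_gains`, the participant names, and the scores are all hypothetical, and "learning" is taken to be the simple difference between the two scores.

```python
from statistics import mean


def learning_gains(pretest, posttest):
    """Return each participant's gain (post-test score minus pretest score).

    Both arguments are dicts mapping a participant name to a score.
    Raises ValueError if a participant is missing from either test.
    """
    unmatched = set(pretest) ^ set(posttest)
    if unmatched:
        raise ValueError(f"participants without both scores: {sorted(unmatched)}")
    return {name: posttest[name] - pretest[name] for name in pretest}


# Hypothetical percent-correct scores, for illustration only.
pretest = {"ana": 55, "ben": 70, "chi": 40}
posttest = {"ana": 80, "ben": 75, "chi": 65}

gains = learning_gains(pretest, posttest)
average_gain = mean(gains.values())

print(gains)
print(round(average_gain, 1))
```

A positive average gain suggests that learning occurred, but it does not by itself show that the training caused it. That is why the pretest matters: without it there is no baseline to compare against.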
[Figure caption: To assess the amount of learning that has occurred due to a training program, level two evaluations often use tests conducted before training (pretest) and after training (post-test).]

Level 3 Evaluation - Transfer

This level measures the transfer that has occurred in learners' behavior due to the training program. Evaluating at this level attempts to answer the question: Are the newly acquired skills, knowledge, or attitude being used in the everyday environment of the learner? For many trainers this level represents the truest assessment of a program's effectiveness. However, measuring at this level is difficult, as it is often impossible to predict when the change in behavior will occur, and thus requires important decisions in terms of when to evaluate, how often to evaluate, and how to evaluate.

Level 4 Evaluation - Results
Frequently thought of as the bottom line, this level measures the success of the program in terms that managers and executives can understand: increased production, improved quality, decreased costs, reduced frequency of accidents, increased sales, and even higher profits or return on investment. From a business and organizational perspective, this is the overall reason for a training program, yet level four results are not typically addressed. Results are difficult to determine in financial terms and hard to link directly with training.

[Figure caption: Level four evaluation attempts to assess training in terms of business results. In this case, sales transactions improved steadily after training for sales staff occurred in April 1997.]

Methods for Long-Term Evaluation
• Send post-training surveys
• Offer ongoing, sequenced training and coaching over a period of time
• Conduct follow-up needs assessment
• Check metrics (e.g., scrap, re-work, errors, etc.) to measure if participants achieved training objectives
• Interview trainees and their managers, or their customer groups (e.g., patients, other departmental staff)

Summary of the Kirkpatrick Model

Donald Kirkpatrick first proposed this four-pronged approach to evaluating training programs in his 1959 doctoral dissertation.

Since then, it has become so widely used that trainers can typically talk about it in shorthand and understand the reference. For example, when one trainer says to another, "What are you doing about level IV?" the other knows that the first trainer wants to understand how the second evaluates the impact of training.

Level, Name, and Issues Assessed at this Level

Level I - Reaction
Assesses participants' initial reactions to a course. This, in turn, offers insights into participants' satisfaction with a course, a perception of value. Trainers usually assess this through a survey, often called a "smiley sheet."
Occasionally, trainers use focus groups and similar methods to receive more specific comments (called qualitative feedback) on the courses. According to the TRAINING magazine annual industry survey, almost 100 percent of all trainers perform "Level I" evaluation.

Level II - Learning
Assesses the amount of information that participants learned. Trainers usually assess this with a criterion-referenced test. The criteria are objectives for the course: statements developed before a course is developed that explicitly state the skills that participants should be able to perform after taking a course. Because the objectives are the requirements for the course, a Level II evaluation assesses conformance to requirements, or quality.

Level III - Transfer
Assesses the amount of material that participants actually use in everyday work 6 weeks to 6 months (perhaps longer) after taking the course. This assessment is based on the objectives of

1. Reactions. "Reaction may best be defined as how well the trainees liked a particular training program." Reactions are typically measured at the end of training -- at Point 3. However, that is a summative or end-of-course assessment, and reactions are also measured during the training, even if only informally in terms of the instructor's perceptions.

2. Learning. "What principles, facts, and techniques were understood and absorbed by the conferees?" What the trainees know or can do can be measured during and at the end of training but, in order to say that this knowledge or skill resulted from the training, the trainees' entering knowledge or skill levels must also be known or measured. Evaluating learning, then, requires measurements at Points 1, 2 and 3 -- before, during and after training.

3. Behavior. Changes in on-the-job behavior. Clearly, any evaluation of changes in on-the-job behavior must occur in the workplace itself -- at Point 4. It should be kept in mind, however, that behavior changes are acquired in training and then transfer (or don't transfer) to the workplace. It is deemed useful, therefore, to assess behavior changes at the end of training and in the workplace.

4. Results. "Reduction of costs; reduction of turnover and absenteeism; reduction of grievances; increase in quality and quantity or production; or improved morale which, it is hoped, will lead to some of the previously stated results." These factors are also measurable in the workplace -- at Point 4.
I use a simple form. Side 1 has the name of the employee, name of reporting officer, department name, title, date and venue of the training, and the costs of the course, accommodation and travel. Below this is space for justifying the training, both by the employee and by the reporting officer. Side 2 lists the objectives along with how success for each will be measured. Below this is space for notes on an after-training meeting and details of an action plan to take the lessons of the training back to the workplace: actions, resources needed, responsible person for each action, and deadline. I also use this form to track the financial costs, to audit the training request and delivery process, and to hold reporting officers and employees to account and see how they are progressing.

Why Measure Training Effectiveness?

Measuring the effectiveness of training programs consumes valuable time and resources. As we know all too well, these things are in short supply in organizations today. Why should we bother?

Many training programs fail to deliver the expected organizational benefits. Having a well-structured measuring system in place can help you determine where the problem lies. On a positive note, being able to demonstrate a real and significant benefit to your organization from the training you provide can help you gain more resources from important decision-makers.

Consider also that the business environment is not standing still. Your competitors, technology, legislation and regulations are constantly changing. What was a successful program yesterday may not be a cost-effective program tomorrow. Being able to measure results will help you adapt to such changing circumstances.

The Kirkpatrick Model
The most well-known and widely used model for measuring the effectiveness of training programs was developed by Donald Kirkpatrick in the late 1950s. It has since been adapted and modified by a number of writers; however, the basic structure has stood the test of time well. The basic structure of Kirkpatrick's four-level model is shown here.

Figure 1 - Kirkpatrick Model for Evaluating Effectiveness of Training Programs

Level 4 - Results: What organizational benefits resulted from the training?
Level 3 - Behavior: To what extent did participants change their behavior back in the workplace as a result of the training?
Level 2 - Learning: To what extent did participants improve knowledge and skills and change attitudes as a result of the training?
Level 1 - Reaction: How did participants react to the program?

An evaluation at each level answers whether a fundamental requirement of the training program was met. It's not that conducting an evaluation at one level is more important than at another. All levels of evaluation are important. In fact, the Kirkpatrick model explains the usefulness of performing evaluations at each level. Each level provides a diagnostic checkpoint for problems at the succeeding level. So, if participants did not learn (Level 2), participant reactions gathered at Level 1 (Reaction) will reveal the barriers to learning. Moving up to the next level, if participants did not use the skills once back in the workplace (Level 3), perhaps they did not learn the required skills in the first place (Level 2).

The difficulty and cost of conducting an evaluation increase as you move up the levels. So, you will need to consider carefully what levels of evaluation you will conduct for which programs. You may decide to conduct Level 1 evaluations (Reaction) for all programs, Level 2 evaluations (Learning) for "hard-skills" programs only, Level 3 evaluations (Behavior) for strategic programs only and Level 4 evaluations (Results) for programs costing over $50,000.
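The example policy above (Level 1 for everything, Level 2 for hard-skills programs, Level 3 for strategic programs, Level 4 above a cost threshold) can be written down as a small rule. The following is a hedged sketch only: the `Program` class, its field names, and the two sample programs are hypothetical, and your own organization's criteria may well differ.

```python
from dataclasses import dataclass


@dataclass
class Program:
    """Hypothetical description of a training program."""
    name: str
    hard_skills: bool   # teaches concrete, testable skills
    strategic: bool     # tied to a strategic organizational goal
    cost: float         # total program cost in dollars


def evaluation_levels(program, cost_threshold=50_000):
    """Return the Kirkpatrick levels to evaluate under the example policy."""
    levels = [1]                      # Level 1 (Reaction) for all programs
    if program.hard_skills:
        levels.append(2)              # Level 2 (Learning)
    if program.strategic:
        levels.append(3)              # Level 3 (Behavior)
    if program.cost > cost_threshold:
        levels.append(4)              # Level 4 (Results)
    return levels


# Hypothetical programs, for illustration only.
forklift = Program("Forklift safety", hard_skills=True, strategic=False, cost=12_000)
leadership = Program("Leadership academy", hard_skills=False, strategic=True, cost=80_000)

print(forklift.name, evaluation_levels(forklift))
print(leadership.name, evaluation_levels(leadership))
```

Writing the policy down this way makes the criteria explicit and easy to revisit when budgets or priorities change.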
Above all else, before starting an evaluation, be crystal clear about your purpose in conducting the evaluation.

Using the Kirkpatrick Model

How do you conduct an evaluation? Here is a quick guide to some appropriate information sources for each level.

Level 1 (Reaction)
• completed participant feedback questionnaire
• informal comments from participants
• focus group sessions with participants
Level 2 (Learning)
• pre- and post-test scores
• on-the-job assessments
• supervisor reports

Level 3 (Behavior)
• completed self-assessment questionnaire
• on-the-job observation
• reports from customers, peers and participant's manager

Level 4 (Results)
• financial reports
• quality inspections
• interview with sales manager

When considering what sources of data you will use for your evaluation, think about the cost and time involved in collecting the data. Balance this against the accuracy of the source and the accuracy you actually need. Will existing sources suffice or will you need to collect new information?

Think broadly about where you can get information. Sources include:
• hardcopy and online quantitative reports
• production and job records
• interviews with participants, managers, peers, customers, suppliers and regulators
• checklists and tests
• direct observation
• questionnaires, self-rating and multi-rating
• focus group sessions

Once you have completed your evaluation, distribute it to the people who need to read it. In deciding on your distribution list, refer to your previously stated reasons for conducting the evaluation. And of course, if there were lessons learned from the evaluation on how to make your training more effective, act on them!

Business Performance Pty Ltd products can help you plan and implement your training evaluation project.

Our Training Management Template Pack contains ready-to-go checklists, interview and questionnaire forms for conducting Level 1 (Reaction) and Level 3 (Behavior) evaluations.

Our eBook From Training to Enhanced Workplace Performance contains guidance, checklists, interview and questionnaire forms for conducting Level 3 (Behavior) evaluations.

A T+D classic: How to Start an Objective Evaluation of Your Training Program
T+D, May, 2004 by Donald L. Kirkpatrick

This excerpt is part of a larger article with the same title that originally appeared in the May-June 1956 issue of the
Journal of the American Society of Training Directors, a predecessor to T+D. The article heralded Kirkpatrick's now classic four-level evaluation model.

Most training men agree that it is important to evaluate training programs. They also feel that the evaluation should be done by objective means. However, the typical training man uses evaluation sheets or comment sheets as the sole measure of the effectiveness of his programs. He realizes he should do more, but he just doesn't know how to begin an objective evaluation.

According to Raymond Katzell, a well-known authority in this field, the evaluation of a training program falls into a hierarchy of steps that can be briefly stated as follows:

Step One. To determine how the trainees feel about the program.
Step Two. To determine how much the trainees learn in the form of increased knowledge and understanding.
Step Three. To measure the changes in the on-the-job behavior of the trainees.
Step Four. To determine the effects of these behavioral changes on objective criteria such as production, turnover, absenteeism, and waste.

In climbing this ladder of evaluation, most trainers have completed the first step. Typically, the training director asks the trainees to fill out evaluation sheets at the end of the program. The questions asked most frequently are:
* How do you rate the program?
* What subject did you like best?
* What subject did you like least?
* What did you learn that you can use on the job?
* What subjects would you like to have discussed at future programs?

Usually the trainees are not asked to sign their names, for fear they will not give an honest reaction.

This kind of subjective evaluation is important. It gives a good indication of how the trainees reacted to the program. If they react favorably, the trainer can justifiably pat himself on the back and say, "I gave them a program they liked."
But he can't rightfully claim that the training program accomplished the objective, unless his objective was to give them a program they liked.

The immediate objective of any training course can be stated in terms of the desired knowledge and understanding that the program is trying to impart to the trainees. It is this stage of evaluation that should be undertaken as the second step. It is much more difficult than step one and, therefore, is not undertaken by many trainers.

Among the possible methods for determining whether increased knowledge and understanding have taken place, the best one seems to be the "before and after" paper and pencil test. If the scores on the posttest are significantly higher than on the pretest, the course can be deemed effective.

In determining the effectiveness of the training, it is important to note that the paper and pencil test or inventory must cover the principles and facts that are discussed in the course. If the trainer can find a test that covers this material, he can use it. If he cannot find a suitable one, he must construct his own inventory. Some of the inventories that are available are: How Supervise? by File and Remmers; Supervisory Inventory by Wesley Osterberg; and the Supervisory Inventory on Human Relations constructed by this writer.

So far, then, it has been stated that a before and after test can be used to determine whether or not increased knowledge and understanding have taken place, and that the inventory should cover the course content. In order to determine whether or not an available test is suitable, a trainer must examine his course outline and list the principles and facts he is trying to teach. A comparison of test items with these objectives will reveal whether or not the test can be used.
Because the construction of a test involves such factors as the choice of items, the wording of the items, the number and type of possible responses, and the sequence of items, it is far better to use an available inventory if it covers most of the course content.

Having selected or constructed a test, the trainer should consider some "Dos" for administering it:
* Give the pretest at the start of the first class and the posttest at the close of the last session. This will minimize the influence of factors apart from the training course.
* Have the trainee sign both the pretest and posttest. Then, the increased knowledge and understanding can be computed for each individual.

In instructing the trainees before they take the pretest:
* Tell them it is a before and after procedure.
* Explain the purpose of the test.
* Encourage them to answer truthfully by assuring them that their scores will have no effect on their pay or status in the company.
* Tell them to answer every question even if they have to guess. (This will be taken into account in the statistical analysis of scores.)
* Encourage them to take their time in taking the test. This will help to motivate them to read each item carefully.
In analyzing the test results, there are two kinds of evaluations to be made:
* Was the entire course effective as shown by gains from pretest to posttest scores for all trainees?
* What specific facts and principles were learned as shown by changes from pretest to posttest for each item?

Summary

Training men agree that it is advisable to evaluate training courses as objectively as possible. Typically, their evaluation consists of subjective comment sheets that are completed by the trainees at the end of the course. Provided that these are properly administered, these evaluation sheets give a valid measure of trainee reaction to the program. However, they do not give any evidence of benefits derived.

The first step in objectively evaluating the effectiveness of a training course is to determine whether or not the desired facts and principles were learned by the trainees. This can be done by:
* Using a suitable paper and pencil test.
* Testing the trainees before and after the program.
* Determining the overall effectiveness of the course by comparing pretest and posttest scores for each trainee.
* Determining which specific facts and principles were learned by analyzing the changes on each test item from pretest to posttest.

The purpose of this article is to suggest a specific technique for beginning an objective evaluation of a training program. Further efforts should be undertaken by every training man to follow up this kind of evaluation by attempting to measure the trainee change in behavior that occurs as a result of participation in the program.
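The two kinds of analysis named above, overall gain and item-by-item change, can be sketched in Python. The article does not name a specific statistical test, so the paired t-statistic used here is an assumption chosen for illustration, as are the function names and the tiny data set of four trainees and three test items (1 = correct, 0 = incorrect).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre_scores, post_scores):
    """Paired t-statistic for the gain from pretest to posttest.

    Assumes each trainee signed both tests, so scores can be matched
    by position. Undefined (division by zero) if every gain is identical.
    """
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(gains) / (stdev(gains) / sqrt(len(gains)))


def item_changes(pre_answers, post_answers):
    """Change in the proportion of trainees answering each item correctly."""
    n_trainees = len(pre_answers)
    n_items = len(pre_answers[0])
    changes = []
    for item in range(n_items):
        pre_rate = sum(row[item] for row in pre_answers) / n_trainees
        post_rate = sum(row[item] for row in post_answers) / n_trainees
        changes.append(post_rate - pre_rate)
    return changes


# Hypothetical answer sheets: one row per trainee, one column per item.
pre_answers = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 0, 1]]
post_answers = [[1, 1, 0], [1, 1, 1], [1, 1, 0], [1, 0, 1]]

pre_totals = [sum(row) for row in pre_answers]
post_totals = [sum(row) for row in post_answers]

t_stat = paired_t_statistic(pre_totals, post_totals)
changes = item_changes(pre_answers, post_answers)

print(t_stat)
print(changes)
```

To judge whether the overall gain is significant, the t-statistic would be compared with a critical value from the t distribution with n - 1 degrees of freedom, where n is the number of trainees. In the item-level output, an item with no change points to a fact or principle that the course did not get across.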