Donald Kirkpatrick wrote in the preface of his book, Evaluating Training Programs, that he’s not sure where he got the idea for his four-level evaluation model, but that the concept was originally developed during his Ph.D. dissertation research at the University of Wisconsin-Madison in 1952. In 1959, he wrote a series of four articles titled “Techniques for Evaluating Training Programs,” which was published in the journal Training and Development. The articles described four levels of evaluation, which Kirkpatrick initially referred to as “steps.” In a 1996 article, he wrote, “Someone, I don’t know who, referred to the steps as ‘levels.’ The next thing I knew, articles and books were referring to the four levels as the Kirkpatrick model.” Kirkpatrick developed his model to clarify the concept of evaluation in four levels: reactions, learning, behavior, and results. It’s also been suggested that he wished to motivate training directors to see the importance of evaluation and to increase their efforts to evaluate their training programs. Today, Kirkpatrick’s Four-Level Model is primarily used to evaluate traditional instructor-led training programs in a summative manner.
Dialog - A Level One Evaluation measures how participants react to a training program. After the program, participants complete a questionnaire, often called a “smile sheet,” that asks general questions about the program. The questions are based on specific, observable, and measurable events. Responses are recorded on a Likert scale that ranges from 1 (extremely dissatisfied) to 5 (extremely satisfied). “Kirkpatrick felt that if the experience was positive, there would be a better chance for learning to occur, and if learning occurs, there is a better chance that behavior would also improve.” This is much like eating out at a restaurant: if you enjoy the experience and the meal, you are more likely to go back, which is a change in behavior.
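As a rough illustration of how smile-sheet responses on the 1-to-5 scale might be summarized, here is a minimal Python sketch. The response values, the function name `summarize_likert`, and the choice to count ratings of 4 or 5 as "satisfied" are all assumptions made for this example; none of them come from Kirkpatrick's model.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical responses to one questionnaire item, coded
# 1 (extremely dissatisfied) to 5 (extremely satisfied).
responses = [5, 4, 4, 3, 5, 2, 4, 5, 4, 3]


def summarize_likert(scores, low=1, high=5):
    """Summarize one Likert item: counts per scale point, mean, median,
    and the share of respondents who chose one of the top two points."""
    if not scores:
        raise ValueError("no responses to summarize")
    if any(s < low or s > high for s in scores):
        raise ValueError("score outside the Likert range")
    counts = Counter(scores)
    # Include scale points nobody chose so the distribution is complete.
    distribution = {point: counts.get(point, 0) for point in range(low, high + 1)}
    # Assumption: the top two scale points count as "satisfied".
    satisfied_share = sum(1 for s in scores if s >= high - 1) / len(scores)
    return {
        "distribution": distribution,
        "mean": mean(scores),
        "median": median(scores),
        "satisfied_share": satisfied_share,
    }


summary = summarize_likert(responses)
print(summary)
```

Reporting the full distribution alongside the mean is a deliberate choice here: Likert data is ordinal, so an average alone can hide a split between very satisfied and very dissatisfied participants.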
Dialog - A Level Two Evaluation focuses on learning: the extent to which participants have increased their skills, knowledge, or desired attitudes. Participants are tested before the training program (pretest) and after it has been completed (posttest). Participants who receive the training form the experimental group, and those who do not receive it form the control group. Using a control group is important because it helps provide evidence on whether a change is due to the training. Generally speaking, a difference between the control group and the experimental group suggests that learning has occurred because of the training program. The process is valid if the test items are closely matched to the actual objectives and if the instructional strategies meet the program objectives.
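The pretest/posttest comparison with a control group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are invented, the helper `mean_gain` is hypothetical, and comparing average gains is one simple way to read the data rather than a method prescribed by Kirkpatrick.

```python
from statistics import mean


def mean_gain(pretest, posttest):
    """Average per-participant gain (posttest minus pretest)."""
    if len(pretest) != len(posttest):
        raise ValueError("each participant needs both a pretest and a posttest score")
    return mean([post - pre for pre, post in zip(pretest, posttest)])


# Hypothetical test scores (percent correct), one entry per participant.
experimental_pre = [52, 60, 48, 55]
experimental_post = [78, 85, 70, 80]
control_pre = [50, 58, 49, 57]
control_post = [54, 60, 50, 62]

experimental_gain = mean_gain(experimental_pre, experimental_post)
control_gain = mean_gain(control_pre, control_post)

# The gain beyond what the untrained group achieved is the part that
# can more plausibly be attributed to the training.
gain_difference = experimental_gain - control_gain

print(f"Experimental group gain: {experimental_gain}")
print(f"Control group gain: {control_gain}")
print(f"Difference in gains: {gain_difference}")
```

With groups this small, a difference in gains is suggestive at best; a real evaluation would also need enough participants and a significance test before drawing conclusions.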
Dialog - Level 3 of Kirkpatrick's model measures behavior. In other words, it measures whether the skills, knowledge, and attitudes addressed in the training have transferred to real-world situations. An ideal success story would look like this: training on a certain skill is delivered, and six months later, use of the new skill can be documented on the job. There are several ways to evaluate the transfer of behavior, including specific performance measures, direct observations, interviews, and questionnaires. Because this evaluation occurs after the training is delivered, the data may be harder to obtain. It may be treated as an afterthought: the training has already been delivered, so why spend additional money? In addition, data collection can take considerable time, and an evaluation that is not properly developed may yield low-quality data. Finally, since people are expected to transfer the learning to the workplace, any situation in which they know what the "right" answer is, or what they should be doing in front of a supervisor, may skew the data and give a false sense of success.
Dialog - The top level of Kirkpatrick's model evaluates the overall results of the training with respect to what the organization cares about: things such as sales, profits, complaint reduction, and so on. As you can imagine, this is very difficult to evaluate. So many factors, both internal and external, go into the bottom line that claiming the training was SOLELY responsible for a change is a bold statement. To back it up, it is necessary to document changes. Kirkpatrick recommends establishing baseline data before the training occurs and tracking that data for quite some time following the training.
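To make the baseline idea concrete, the following Python sketch compares a tracked metric before and after training. The metric (monthly customer complaints), the numbers, and the function name `change_from_baseline` are assumptions chosen for illustration only.

```python
from statistics import mean


def change_from_baseline(baseline, followup):
    """Return the absolute and percentage change of the follow-up
    average relative to the baseline average."""
    baseline_avg = mean(baseline)
    followup_avg = mean(followup)
    if baseline_avg == 0:
        raise ValueError("baseline average is zero; percentage change is undefined")
    absolute_change = followup_avg - baseline_avg
    percent_change = 100 * absolute_change / baseline_avg
    return absolute_change, percent_change


# Hypothetical monthly complaint counts.
complaints_before_training = [120, 115, 125, 120]  # baseline period
complaints_after_training = [110, 100, 95, 95]     # follow-up period

absolute_change, percent_change = change_from_baseline(
    complaints_before_training, complaints_after_training
)
print(f"Change in average monthly complaints: {absolute_change} ({percent_change:.1f}%)")
```

A drop like this documents that a change occurred; on its own it does not show that the training caused it, since other internal and external factors may have shifted over the same period.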
Dialog - Kirkpatrick’s Four Levels of Evaluation is very concrete and easy to understand. It is a well-established model and is broadly used in industrial and other professional settings. Its broad appeal allows for a common language among evaluators and their clients. It has been credited with helping practitioners promote evaluation within their organizations. It is notable that Kirkpatrick’s model has been the foundation of other evaluation models. Kaufman and Keller’s Levels and Phillips’ ROI model were both developed as reactions to, or variations on, the original Kirkpatrick Four-Level Model.
Dialog - Kirkpatrick’s model has been criticized for its simplicity and for the lack of a demonstrated causal relationship between the four levels. Many scholars argue that the levels are merely a classification system and that completing one level will in no way lead to the next. Additionally, Levels 1 and 2 are subject to bias. If the evaluator asks the wrong questions at this stage, the conclusions and the resulting performance improvement actions will be misguided. This model is often touted in professional settings, but when it is put into practice, only Levels 1 and 2 are implemented. This occurs, in part, because data for Levels 3 and 4 is so difficult to obtain. However, without Levels 3 and 4 the evaluation is far from complete and is perhaps missing its most important elements. Finally, this model deals only with training interventions, thereby ignoring other performance improvement interventions.
Here’s some biographical information on Kirkpatrick. At 85 years of age, he serves as professor emeritus at his alma mater, the University of Wisconsin-Madison. He has also held positions with the American Society for Public Administration and the American Society for Training & Development. Finally, take note of the photo of Kirkpatrick. A copy of his book, Evaluating Training Programs, stands next to him. In his 1996 Training and Development article, he mentioned that a colleague urged him in 1993 to write a book describing his model. One year later, he published this book. It’s currently in its third edition, translated into several languages, and co-written with his son, James.
Kirkpatrick's Four Levels Of Evaluation Model
Kirkpatrick's Four Levels of Evaluation Model IT 7150 Sara Kacin Joseph Palmisano Jason Siko
Background of Model <ul><li>Originated with Ph.D. dissertation research in 1952 </li></ul><ul><li>Published in four-article series titled “Techniques for Evaluating Training Programs” in 1959 </li></ul><ul><li>Developed to clarify evaluation concept in four levels: reactions, learning, behavior, and results </li></ul><ul><li>Primarily used to evaluate traditional instructor-led training programs </li></ul><ul><li>Sources: Dick, W., & Johnson, R. B. (2007). Evaluation in instructional design: The impact of Kirkpatrick’s four-level model. In R. A. Reiser & J. V. Dempsey (Eds.), Trends and issues in instructional design and technology (2nd ed., pp. 94-103). Upper Saddle River, NJ: Pearson Education. </li></ul><ul><li>Kirkpatrick, D. L. (1996). Great ideas revisited. Training & Development, 50(1), 54-59. </li></ul><ul><li>Kirkpatrick, D. L., & Kirkpatrick, J. D. (2006). Evaluating training programs: The four levels (3rd ed.). San Francisco: Berrett-Koehler. </li></ul>
Level 1 – Reactions <ul><li>Measures how participants react to a training program </li></ul><ul><li>This type of questionnaire is often called a “Smile Sheet” </li></ul><ul><li>Data is collected and processed using a Likert scale </li></ul><ul><li>Kirkpatrick's emphasis on customer satisfaction </li></ul><ul><li>Source: Guerra-López, I. (2008). Performance evaluation: Proven approaches for improving program and organizational performance. San Francisco: Jossey-Bass. </li></ul>
Level 2 – Learning <ul><li>Measures the extent to which participants have increased their skills, knowledge, or desired attitudes </li></ul><ul><li>Pretest – Participants are tested before the program </li></ul><ul><li>Posttest – Participants are tested after training is complete </li></ul><ul><li>Experimental Group – A group that receives the training </li></ul><ul><li>Control Group – A group that does not receive the training </li></ul><ul><li>Validity – Looks at how closely matched the test items are to the actual objectives </li></ul>
Level 3 – Behavior <ul><li>Measures whether the training is being used on the job </li></ul><ul><li>If training was successful, new skills should appear on the job </li></ul><ul><li>Data – Performance measures, observations, interviews, and questionnaires </li></ul><ul><li>Data becomes harder to obtain </li></ul><ul><li>- Additional time and money </li></ul><ul><li>… and more difficult to trust, e.g., Hawthorne effect </li></ul>Sources: Cennamo, K., & Kalk, D. (2005). Real world instructional design. Belmont, CA: Thomson Wadsworth Publishing. Dick, W., Carey, L., & Carey, J. O. (2005). The systematic design of instruction (6th ed.). Boston: Allyn and Bacon. Guerra-López, I. (2008). Performance evaluation: Proven approaches for improving program and organizational performance. San Francisco: Jossey-Bass.
Level 4 – Results <ul><li>Measures the effect on what the organization cares about: the BOTTOM LINE! </li></ul><ul><li>- Sales, productivity, profits </li></ul><ul><li>Very difficult to assess </li></ul><ul><li>- but necessary to document </li></ul><ul><li>Important to establish baseline data in order to document change </li></ul>
Strengths of Model <ul><li>Easily understood within and outside of the field </li></ul><ul><li>Well-established and utilized throughout industrial and other professional environments </li></ul><ul><li>Has been used as basis for other evaluation models including Kaufman and Keller’s Levels and Phillips’ ROI Model </li></ul>Sources: Galloway, D. L. (2005). Evaluating distance delivery and e-learning: Is Kirkpatrick’s model relevant? Performance Improvement, 44(4), 21-27. Holton, E. (1996). The flawed four-level evaluation model. Human Resource Development Quarterly, 7(1), 5-21. Kaufman, R., & Keller, J. M. (1994). Levels of evaluation: Beyond Kirkpatrick. Human Resource Development Quarterly, 5(4), 371-380.
Limitations of Model <ul><li>Too simplistic </li></ul><ul><li>Causal relationship between levels has not been proven </li></ul><ul><li>Levels 1 and 2 are subject to bias, which may lead to erroneous conclusions </li></ul><ul><li>Many organizations implement only Levels 1 and 2, thereby ignoring learning transfer which is arguably the most important outcome </li></ul><ul><li>Levels of evaluation should be expanded beyond training to include performance improvement interventions </li></ul>
Donald Kirkpatrick <ul><li>Born March 15, 1924, in Richland Center, WI </li></ul><ul><li>Education: University of Wisconsin-Madison, B.B.A., 1948, M.B.A., 1949, Ph.D., 1954 </li></ul><ul><li>Memberships: ASPA, ASTD (president, 1975) </li></ul><ul><li>Career status: Professor emeritus, University of Wisconsin-Madison. </li></ul><ul><li>Consultant to business and government. </li></ul><ul><li>Publications: Numerous including Evaluating Training Programs: The Four Levels, 2006 (first edition, 1994) </li></ul><ul><li>Source: Contemporary Authors Online (2009). Donald Kirkpatrick. Retrieved September 14, 2009, from http://go.galegroup.com/ps/start.do?p=LitRG&u=litedi </li></ul> Photo credit: Unknown