Testing and Evaluation Techniques – Chapter 7
Florida State Fire College, Ocala, Florida

Testing and Evaluation
You had the opportunity to cover part of this chapter in Fire Service Course Delivery.

Terminal Objective
The student will be able to construct, administer, and evaluate an assessment instrument.

Enabling Objectives
The student shall be able to:
- Define the four levels of evaluation
- Differentiate between summative and formative evaluation
- Define the different kinds of tests
- Discuss the difference among the various types of tests
- List various sources for tests
Four Levels of Evaluation
- Level I – reaction
- Level II – learning
- Level III – transfer
- Level IV – business results

Reaction
How did the student react to the class?
- Evaluation form
- Focus groups (rarely used)
- Required by most academic institutions

Learning
What has the student learned?
- Oral examination
- Written test
- Skills assessment

Transfer
How much did the student retain?
- Measured six weeks to six months post coursework – on the job
- Based upon tests, observations, surveys, and interviews
- Not regularly done within the fire service
Business Results
An assessment of the financial impact of the training or the return on investment.
- Six months to two years post coursework
- Most difficult to measure
- Training courses typically do not have explicit business objectives (e.g., a reduction in accidents)
- Methodology for assessment is not yet refined
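To make the business-results idea concrete, here is a minimal sketch of the basic return-on-investment arithmetic sometimes used at this level. The dollar figures and the idea of counting accident-reduction savings are illustrative assumptions, not figures from this course.

    # Hypothetical Level IV (business results) calculation.
    # All dollar amounts below are invented placeholders.
    training_cost = 12_000.0   # cost to develop and deliver the course (assumed)
    benefit = 18_000.0         # e.g., estimated savings from fewer accidents (assumed)

    roi_percent = (benefit - training_cost) / training_cost * 100
    print(f"Return on investment: {roi_percent:.0f}%")   # Return on investment: 50%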
Do we ever get a final evaluation? In teaching you cannot see the fruit of a day's work.  It is invisible and remains so, maybe for twenty years.  ~Jacques Barzun
Evaluation
“An evaluation is a process of making a value judgment based upon one or more sources.”
Evaluation processes examine two components:
- Instruction from the teacher
- Performance of the student on objectives
Purpose of Evaluation
- Provides feedback to students
- Provides students gratification and motivation
- Measures the effectiveness of the instructor
- Measures the effectiveness of the program in meeting objectives
Formative Evaluation
- Ongoing evaluation to change or adapt the program
- Compares the objectives to the testing strategy
- Occurs during development of course or test
- Pilot testing is formative evaluation

Summative Evaluation
- Typically performed at end of program
- Provides students feedback on mastery of subject
- Provides feedback on effectiveness of teaching strategy
- Summative tools: course evaluation forms; final exams (written and practical)
Formal Evaluation – “the test”
- Did the student attain the course objectives?
- Gives a grade
- Adds stress to the student
- If required to pass, it should not be the first testing of the material
Informal Evaluation
- Provides student feedback
- With or without a recorded grade
- Helps ID weaknesses and strengths
- Use caution when presenting
Tests
The instructor’s last chance to emphasize the important information the student needs to retain:
- Written
- Practical
- Oral

Written Tests
- Multiple choice
- True/False
- Matching
- Completion or fill in the blank
- Essay
Guidelines for Written Tests
- Be sure questions relate to the objectives
- Allow an appropriate amount of time: simple multiple-choice questions allow about one minute per question; scenarios and essays require more time
- Give clear, complete directions
- Be sure the test has proper grammar and punctuation
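The one-minute-per-question guideline lends itself to a quick time estimate when assembling an exam. The sketch below is only an illustration; the question counts and the per-essay allowance are assumptions.

    # Rough exam-length estimate based on the timing guideline above.
    mc_questions = 50        # number of multiple-choice items (assumed)
    essay_questions = 2      # number of essay items (assumed)
    minutes_per_mc = 1       # roughly one minute per simple multiple-choice question
    minutes_per_essay = 15   # essays need more time; 15 minutes is an assumption

    total_minutes = mc_questions * minutes_per_mc + essay_questions * minutes_per_essay
    print(f"Allow roughly {total_minutes} minutes for this exam")   # roughly 80 minutes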
Multiple Choice Tests
- Common method for formal and informal evaluation
- Normally used for national and state certification examinations
- Easy to grade objectively
- Test construction miscues: reusing information from previous questions, negatively worded stems, a fill-in-the-blank in the middle of the stem, and using “all of the above” or “none of the above”
True/False Tests
- Limited to two answers, no gray area
- Difficult to construct in a positive voice
- Avoid “always” or “never” statements
- Useful tool as a study guide

Matching Tests
- Work best with definitions and terms
- Difficult to design
- Be cautious of multiple matches
- Test directions must be clear
Completion Tests (fill in the blank)
- Statements must be clear as to the intent of the question
- Need to be grammatically correct
- Be aware of the size of the blank
- Avoid having the blank at the beginning of the sentence
Essay Tests
- May require a long or short answer
- Time consuming and difficult to grade
- Recommended to grade in a group format
- A rubric is a useful grading tool
- Handwritten exams must allow sufficient time
- Watch for the “shotgun approach”: writing as much information as possible in hopes of hitting the target
What is a Rubric?
http://rubistar.4teachers.org/index.php?screen=WhatIs&module=Rubistar
Heidi Goodrich, a rubrics expert, defines a rubric as "a scoring tool that lists the criteria for a piece of work or 'what counts.'" So a rubric for a multimedia project will list the things the student must have included to receive a certain score or rating. Rubrics help the student figure out how their project will be evaluated. Goodrich quotes a student who said he didn't much care for rubrics because "if you get something wrong, your teacher can prove you knew what you were supposed to do."
Generally, rubrics specify the level of performance expected for several levels of quality. These levels of quality may be written as different ratings (e.g., Excellent, Good, Needs Improvement) or as numerical scores (e.g., 4, 3, 2, 1), which are then added up to form a total score that is associated with a grade (e.g., A, B, C, etc.). Many rubrics also specify the level of assistance (e.g., Independently, With Minimal Adult Help, With Extensive Adult Help) for each quality rating.
Rubrics can help students and teachers define "quality." Rubrics can also help students judge and revise their own work before handing in their assignments.
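As a small illustration of the numeric-rubric idea described above, the sketch below totals ratings on a 4-3-2-1 scale and maps the total to a letter grade. The criteria names, ratings, and grade cutoffs are invented examples, not part of any official rubric.

    # Illustrative rubric scoring: each criterion rated 4 (Excellent) down to 1 (Needs Improvement).
    # Criteria, ratings, and grade cutoffs are hypothetical.
    ratings = {"content": 4, "organization": 3, "mechanics": 3}

    total = sum(ratings.values())        # 10
    possible = 4 * len(ratings)          # 12
    percent = total / possible * 100     # about 83%

    if percent >= 90:
        grade = "A"
    elif percent >= 80:
        grade = "B"
    elif percent >= 70:
        grade = "C"
    else:
        grade = "Below expectations"

    print(f"Score {total}/{possible} ({percent:.0f}%) -> grade {grade}")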
Oral Exams
Require verbal answers by students.
Advantages:
- Evaluate the quick reaction of the student
- Assess the student’s thought process
Disadvantages:
- Limited number of students examined at one time
- Difficult to standardize
- Time consuming and labor intensive
- Unexpected distractions
- Unfair emphasis on repeated mistakes
Project Assignments
- Get students working outside the class
- In groups, help develop people skills
Negatives:
- Hard to standardize
- Potential plagiarism
- May measure only the end product and not consider the process
Practical Exams
- Situational: demonstration of a skill in the context of a scenario (e.g., the FF II exam for forcible entry)
- Rote: demonstration of the steps of performing a skill (e.g., the FF II exam for donning SCBA)

Practical Skills Evaluation
- Rote testing evaluates mechanical skills with no real-world stresses
- A situational skills test asks the student to think through a situation
Simple Skill Evaluation
- Define the required skill
- Determine the efficiency expected
- The student needs to know what is expected
- Skills checklists are helpful
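Since the slide recommends skills checklists, here is a minimal sketch of one way to record a checklist result. The step descriptions are made-up examples of SCBA-donning steps, not an official skill sheet.

    # Hypothetical skills checklist for a rote skill evaluation (example steps only).
    checklist = [
        ("Checks cylinder pressure", True),
        ("Dons harness and tightens straps", True),
        ("Activates PASS device", False),
        ("Completes skill within the time limit", True),
    ]

    completed = sum(1 for _, done in checklist if done)
    print(f"{completed} of {len(checklist)} steps completed")
    print("PASS" if completed == len(checklist) else "RETEST")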
Performance Evaluations
Determine and define the expected outcome:
- Is performance or decision making more important than the situation?
- How stressful a situation is the student prepared to handle?
- SAFETY

Performance Evaluations
Determine the standards to be evaluated. The situation should:
- Represent the desired outcome
- Use realistic scenarios
- Use a realistic environment
- Be realistic in the real world
- SAFETY

Performance Evaluations
- Keep the situation in perspective
- Remember safety
- Remember legal ramifications
- List all activities to be completed
Reliability
- Give multiple tests and compare the results; the closer the scores, the more reliable the test
Four questions:
- Does it measure consistently on different occasions?
- Is there any influence from the environment?
- Is there any difference between administrators?
- Does it discriminate against anyone?
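One very simple way to act on "give multiple tests and compare the results" is to look at how far apart each student's scores are across two administrations. The sketch below uses invented scores, and the average absolute difference is just one possible consistency measure.

    # Compare scores from two administrations of the same test (made-up data).
    first_attempt = [78, 85, 92, 70, 88]
    second_attempt = [80, 83, 90, 74, 87]

    differences = [abs(a - b) for a, b in zip(first_attempt, second_attempt)]
    average_difference = sum(differences) / len(differences)

    # The smaller the average difference, the more consistently the test measures.
    print(f"Average score difference: {average_difference:.1f} points")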
Test Resources
Written examination resources:
- NFPA
- Publishers’ test banks
- Fire textbooks
- Fire textbook instructor guides
- Textbooks of practice certification examinations
- Online and computer-based practice certification tests
- Fire Internet sites

Test Resources
Practical examination resources:
- NFPA
- Fire Internet sites
- Fire textbooks
- Fire continuing education programs

Test Resources
Oral examination resources:
- NFPA
- Fire Internet sites
Conclusion As part of the learning process, the instructor needs a mechanism to evaluate the student’s learning and identify whether or not the student is achieving the objectives and goals of instruction.
Conclusion
- Evaluation: the four levels; summative and formative
- Tests: kinds of tests; differences among the various types; sources for testing materials
Multiple choice tests should avoid what?
- “Which choice is not correct”
- “All of the choices are correct”
- “None of the choices are correct”
Maybe I just broke the rule!

Which type of test is most commonly used for state or national certifications?
- True/False
- Matching
- Multiple choice
- Fill in the blank

Unexpected distractions would likely be a disadvantage in what type of testing?
- Rote
- Situational
- Written
- Oral

Of the four levels of evaluation, which one would likely look at job performance 6 months to a year down the road?
- Reaction
- Business results
- Learning
- Transfer


Editor's Notes

  • #6 Levels established by educator Donald Kirkpatrick. Take a look at http://www.businessballs.com/kirkpatricklearningevaluationmodel.htm and http://www.science.ulster.ac.uk/caa/presentation/kirkpatrick/tsld014.htm
  • #7 Typically, this is done through the use of an evaluation form. For discussion, “when is the best time to have an evaluation form completed”? Some think the very last thing in a class. Unfortunately, so often, the students are in a big hurry to get done and out the door that the evaluations are really not honest.
  • #8 Exams, tests, and skill assessments help an instructor evaluate whether what they have taught has actually been absorbed by the students and whether the students are able to adequately recall it.
  • #9 Six weeks to six months down the road. Typically, we just don’t do follow-up like this. How often have you said that the true evaluation of your course is the safe retirement of the student and possibly even those they have supervised? This should be a good slide for discussion. Can you think of an easy (or even reasonable) way of evaluating how much a student retains six weeks to six months post class?
  • #10 How did the department benefit from the training? Was it a train-the-trainer scenario? Did the student share the information? How did the student utilize the training for the benefit of the organization?
  • #11 Is the true evaluation completed with a safe retirement of the teacher, the student and all the students of the students? This slide is simply for consideration and discussion. Another thought: A teacher affects eternity; he can never tell where his influence stops.  ~Henry Brooks Adams
  • #14 Compares overall goals, objectives, and content to the student performance. Think about the first day of a Course Delivery class and how the students handle the introduction process. Look at the progress about mid-way and then again at the end.
  • #19 An activity should take place during the discussion of testing. Why not simply take a chapter from this course and ask one group to write a specified number of test questions (with the possible exception of essay questions) and share them with the class? It serves as a good course review, you can evaluate the questions, and, not to mention, you may use them for future classes. When writing any test questions, use caution with local vernacular wording.
  • #21 “All of the above” responses can be and have been successfully challenged: a student sees that choice “A” fits the answer and never reads the rest of the question, yet choice “A” is actually a correct answer. “None of the above” questions tend to trick or cause misreading of questions that suddenly change from looking for positive to looking for negative responses.
  • #22 Hopefully, true/false tests do not become a flip of the coin. Just for FYI, I wrote a 20 question T/F quiz once. The first 19 responses were true (maybe it was false) and the last one was false. I thought the class was going to lynch me.
  • #23 Question: Do you provide more possible answers than there are matches?
  • #24 Don’t forget the term “synonym.” Does your response have to be exact, or would a word with the same meaning be acceptable? You need to know that ahead of grading.
  • #26 Suggest the General Writing Rubric (or one you like better) be handed out as an example. Copy in your “misc” folder.
  • #28 As an instructor, you need to be watching and evaluating the group projects. If you don’t, it is possible that a small number of the group will actually be doing the work while the others are skating.
  • #29 Rote learning is a learning technique which avoids understanding of a subject and instead focuses on memorization. The major practice involved in rote learning is learning by repetition. The idea is that one will be able to quickly recall the meaning of the material the more one repeats it. Although it has been criticized by some schools of thought, rote learning is commonly used in the areas of mathematics, music, and religion. (Wikipedia) In practical settings this might be an excellent way of doing things. We have learned that “practice makes perfect.” It is likely less beneficial in classroom settings.
  • #30 Situational skills testing can give a more true to life evaluation.
  • #31 Student is being evaluated on using his bunker gear as flotation until he is able to properly remove it all. Check list would be difficult since they are in the water. Students were adequately briefed on what was expected of them prior to their “leap of faith”; stepping off the side of the pool in full bunker gear and air pack.
  • #32 Photo from Extreme Extrication class at Great Florida Fire School 2007.
  • #34 Do not forget, if there is a degree of potential hazard, have a safety officer.
  • #43 Some argument, obviously, could be made for any of these. Oral, however, would have the most negative consequences related to unexpected distractions.
  • #44 Your choice may have been transfer; however, that looks at a much closer time frame of 6 weeks to 6 months.