Designing a Sales Representative Certification Program Through Validated Measurement and Comprehensive Assessment Greg Sapnar, Bristol-Myers Squibb: Practitioner’s Perspective Steven Just, Pedagogue Solutions: Psychometric & Legal Testing Perspective Tuesday, June 19th, 2:00-3:30, Hollywood, Florida, Diplomat 4
Designing a Sales Representative Certification Program Log into Channel 17: Press and release the “GO” button. While the light is flashing red and green, enter the 2-digit channel code (17). After the second digit is entered, press and release the “GO” button. The light will flash green to confirm that it is programmed.
What type of person are you? One who… Makes things happen. Watches things happen. Wonders what happened.
Learning Objectives At the completion of this workshop, participants will be able to: State the definitions of the four different types of certification Apply job certification to the certification of sales representatives State the competitive advantages of having a "certified" sales force. Design a sales representative certification process Implement a sales representative certification process
Certification ? Do you Certify?
Do you certify your representatives on their job required knowledge? Yes No I think so I don’t know
Do you have positive or negative consequences related to test results? Yes No I think so I don’t know
Four Types of Certification
Job Certification Job certification focuses on job requirements and must be supported by documented evidence of job relevance Relevance determined by job analysis Relevance determined by SME consensus
Certification should be viewed as a program rather than an activity
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer Evaluation Deliver  Assessments Deliver
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Drivers and Business Cases to Support Certification of Pharmaceutical Representatives
Drivers: Demand for high standards/ethics in healthcare; Need to differentiate in the industry; External forces making demands (customers, government, poor press of the pharmaceutical industry)
Business Case: Support the corporate mission/high standards of healthcare; Develop qualified talent; Meet customers’ expectations/access; Prepare for the rising tide of licensing discussions; Raise perceptions of pharmaceutical representatives
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Drivers and Business Cases to Support Certification of Pharmaceutical Representatives
Requirements: Skills; Knowledge areas; Education; Industry compliance; Other?
Standards: Minimum competency performance criteria; Minimum knowledge requirements; Minimum education requirements
Knowledge Measurement Strategy Provide clear evidence we are delivering the knowledge and skills we need to achieve our company mission Ensure business alignment through our Certification Governance Council Implement process which creates valid and reliable assessments Design for defensibility
Assessment Process & Resources Requirements
Knowledge Assessment: Establish objectives → Validate objectives → Create test questions → Validate test (½-day meeting) → Establish cut score → Implement test → Report outcomes → Analyze & interpret outcomes
Role-Play Assessment: Establish objectives → Validate objectives → Create role-play scenarios & checklist → Validate role-play scenarios & checklist (full-day meeting) → Establish cut scores for role play → Train raters & establish rater reliability for role play → Implement role-play evaluation → Report outcomes → Analyze & interpret outcomes
The above process illustrates basic requirements for test development. Rigor may be increased if needed to support higher levels of decision making.
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Elements of a Certification/Assessment Strategy Governance and administration Define terms Legal issues Remediation and consequences Make expectations explicit and public Determine methods of testing Establish assessment frequency Assessment security Job competency analysis Create fair, valid and reliable assessments Determine cut (passing) scores Recertification Program evaluation/Item analysis
Governance and Administration
Document Process All stages of the test validation process must be documented because certification testing has the potential to precipitate internal and external disputes. Documentation of the entire test development process is essential. You may have a perfectly valid measurement tool, but if you do not have documentation to show how you ensured that validity, you have no legal defense of the test. Documentation should be complete and time-relevant and may include: Special forms to match each stage in the process; Summarization of each stage in the process
Defining Terms
Certification Development Framework Assessment Types: Knowledge-based: Knows terms, rules, principles, concepts, and procedures. Skill-based: Can apply the terms, rules, principles, concepts, and procedures under controlled conditions, such as in a simulation. Performance-based: Can apply the terms, rules, principles, concepts, and procedures consistently under real working conditions.
Common Terms Assessment Quiz Test Exam Evaluation Pretest Post-test Formative Assessment Summative Assessment Diagnostic Assessment Rubric Performance Assessment Self-assessment High Stakes Assessment Certification
Assessment A systematic process for obtaining results in order to describe what students know or can do. An ongoing process aimed at measuring and improving student learning. Assessments can take the form of a quiz, test, exam, or evaluation.
Quiz A low-stakes diagnostic assessment in which the results are only to be used for self- or group-diagnosis and prescription.
Test A medium-stakes formative assessment designed to inform both the learner and (optionally) the instructor of the learner’s level of knowledge at an intermediate point in the instructional process. There are no long-term consequences for failure. Short-term consequences may include required remediation before proceeding with a learning activity.
Exam A high-stakes summative assessment at the completion of a learning experience for which there are consequences for failure. Results of exams are made available to the learner’s direct supervisor and appropriate training department personnel. Exam results may have career-impacting consequences.
Evaluation An assessment that measures, compares, and judges  For example: Role play evaluations Smile sheets Evaluation of a training program Level 3 and 4 evaluations
Legal Issues
How do High Stakes Tests Differ From Other Types of Tests? Always a summative assessment Higher level of scrutiny More rigorous development methodology Potential legal consequences
Legal Jeopardy Individual Group Record-keeping requirements
Individual Legal Issues We live in a litigious society Ensure that your hiring/promotion/dismissal decisions are based on sound science Ensure that your record keeping is 100% accurate
Group Legal Issues Title VII of the Civil Rights Act of 1964 (as amended in 1991) prohibits basing employment decisions on race, gender, ethnicity, religion, or national origin This has been interpreted to require that an employer’s selection procedures not result in  disparate impact  against any group unless the procedure is demonstrated to be “valid and consistent with business necessity.”
Group Legal Issues Selection procedures that result in adverse impact are presumed to be discriminatory Once plaintiffs establish adverse impact, burden shifts to employer to demonstrate validity of process
Record Keeping: 21 CFR Part 11 Fully auditable Electronic signatures (equivalent to a paper signature; statement at signature time clarifies purpose) Legally defensible data Fully versioned results
Remediation and Consequences
Do you have a formal system of remediation for students who fail a test? Yes No Unsure
Remediation Must have a well-thought-out remediation plan. Should involve: Trainer(s), District Manager. Provide a fixed number of multiple attempts to demonstrate mastery.
How many attempts do you allow for passing a test? 1 2 3 4 5 A number greater than 5 As many as needed
Consequences There must be consistent and increasing consequences for failure At each “failure” you may involve higher levels of corporate management Usually the final step is to involve HR
Global Considerations
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Job Competency Analysis
Analyze Job Content The most important part of the validation process is ensuring that the test items match the job, called content validation. Content validation is a process that formally determines and reflects the judgments of experts regarding the content or competencies assessed by the test. Subject matter experts for the content need to be identified. In a formal process, the subject matter experts identify and list the tasks that must be performed to do the job successfully.
Establish Content Validity of Objectives The relevant tasks identified in step two are converted into instructional objectives, if the test is being developed in conjunction with a curriculum plan. Subject Matter Experts (SMEs) must review the objectives and record their concurrence that the objectives match job competencies.
Create Items Test items are created to match each relevant objective. Cognitive items, e.g., multiple-choice questions, are created to assess knowledge competencies. Rating instruments, such as checklists, are created to measure whether skills are being demonstrated appropriately.
Knowledge-based Assessments: Four Keys to Developing Valid Questions Questions must be properly constructed Questions must be content-validated by placement within a structure of learning objectives Questions must be written at the proper cognitive level by categorization within Bloom’s Taxonomy Thorough post-hoc statistical evaluation must be performed
Skills Assessments: Four Keys to Valid Measurement Rater training Inter-rater reliability Create a scoring rubric Create behaviorally anchored rating scales (BARS)
Creating Fair, Valid and Reliable Assessments
Knowledge Assessments
What is Validity?
Validity Construct Validity Face Validity Predictive Validity Content Validity
Construct Validity Are you measuring what you think you are measuring?
Face Validity Will your exam appear fair to the test takers?
Predictive Validity A quantitative measure of how well a test predicts some form of measurable behavior.
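As an illustrative sketch (not part of the original slides), predictive validity is often estimated by correlating test scores with a later measure of on-the-job performance; the data and variable names below are hypothetical.

# Minimal sketch: estimating predictive validity as the Pearson correlation
# between certification exam scores and a later performance measure.
# All data below are hypothetical, for illustration only.

from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

exam_scores = [72, 85, 90, 65, 88, 94, 70, 81]              # certification exam (%)
field_ratings = [3.1, 4.0, 4.4, 2.8, 4.1, 4.6, 3.0, 3.7]    # later manager ratings

print(f"Predictive validity estimate: r = {pearson_r(exam_scores, field_ratings):.2f}")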
Content Validity The adequacy with which a domain of content is tested Not a quantitative measure Flows from valid learning objectives, attention to Bloom’s taxonomy, and properly constructed questions Must ensure a “sensible” method of testing
Pilot the Test
Conduct Initial Test Pilot Piloting a test has two purposes: To find major flaws in the test or the testing system To begin to establish the statistical validation of the test At least 30 people should be involved in the initial pilot. The more critical the test, the larger the number of  test-takers to be included in the pilot group.
Perform Item Analysis on Pilot Item analysis looks at each test item to see how it functions as a satisfactory measure in the test. The data most corporate test designers need to collect for cognitive tests come from three measures: Difficulty index - the percentage of the test takers who answered a particular question correctly. Distractor pattern - looks at the selection of individual distractors to uncover patterns in how participants choose or do not choose them; e.g., if a particular distractor is never chosen, it is too easily disregarded and should be replaced with one more plausible. Point-biserial - the correlation between performance on an item and total test score, which shows how well the item discriminates between high-scoring and low-scoring test takers; computer support is needed.
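A minimal sketch of these three measures, assuming pilot responses are stored as each test taker's chosen option per item alongside an answer key (the data layout and names are hypothetical):

# Minimal sketch of item analysis on pilot data: difficulty index,
# distractor pattern, and point-biserial discrimination.
# Data layout and names are hypothetical illustrations.

from math import sqrt
from collections import Counter

answer_key = ["B", "D", "A"]                       # correct option per item
responses = [                                      # one row per pilot test taker
    ["B", "D", "A"], ["B", "C", "A"], ["A", "D", "A"],
    ["B", "D", "C"], ["B", "B", "A"], ["C", "D", "A"],
]

# Score matrix: 1 if correct, 0 otherwise; total score per test taker
scores = [[int(r[i] == answer_key[i]) for i in range(len(answer_key))] for r in responses]
totals = [sum(row) for row in scores]

def point_biserial(item_scores, totals):
    """Correlation between a dichotomous item (0/1) and total test score."""
    n = len(totals)
    mean_t = sum(totals) / n
    mean_i = sum(item_scores) / n
    cov = sum((i - mean_i) * (t - mean_t) for i, t in zip(item_scores, totals))
    sd_i = sqrt(sum((i - mean_i) ** 2 for i in item_scores))
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in totals))
    return cov / (sd_i * sd_t) if sd_i and sd_t else float("nan")

for item in range(len(answer_key)):
    item_scores = [row[item] for row in scores]
    difficulty = sum(item_scores) / len(item_scores)     # fraction answering correctly
    distractors = Counter(r[item] for r in responses)    # how often each option was chosen
    rpb = point_biserial(item_scores, totals)
    print(f"Item {item + 1}: difficulty={difficulty:.0%}, "
          f"distractor pattern={dict(distractors)}, point-biserial={rpb:.2f}")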
What is Test Reliability?
Reliability Consistency over time Consistency across forms  Consistency among items Consistency among evaluators
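The slides do not name a statistic for "consistency among items"; Cronbach's alpha is one common choice, sketched below with hypothetical item-score data.

# Minimal sketch: Cronbach's alpha as one measure of "consistency among items".
# The choice of statistic and the data below are illustrative assumptions.

def cronbach_alpha(item_scores):
    """item_scores: list of rows, one per test taker, one column per item."""
    n_items = len(item_scores[0])

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([row[i] for row in item_scores]) for i in range(n_items)]
    total_var = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

data = [  # hypothetical 0/1 item scores: five test takers, four items
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
print(f"Cronbach's alpha = {cronbach_alpha(data):.2f}")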
Setting Passing Scores
Setting Passing Scores for Criterion-referenced Tests A criterion-referenced test is one in which scores are judged against a pre-set “mastery” level
What is your passing test score? <80% 80% 85% 90% >90% Varies from test to test
Who sets your passing test score? I do Upper management Training management Therapeutic area I haven’t a clue who sets it
Setting Cut Scores:  The Three Most Common Methods   The Higher Authority Method: “Our Vice President said it should be  90”  The Committee Method: “90 seems about right” The Received Wisdom Method: “I don’t know how or when it got set, but it’s always been 90”
Angoff Method Identify judges who are familiar with the competency covered by the test. For each item on the test, each judge estimates the probability that a minimally competent person would get it right. Sum the probabilities for each judge. Average the judges’ totals to set the cut score.
Angoff Method: Example
Item:      1     2     3     4     5    Total  Percent
Judge 1   .75   .80   .75   .90   .95   4.15    83%
Judge 2   .80   .90   .75   .90   .75   4.10    82%
Judge 3   .85  1.00   .90   .80   .85   4.40    88%
Averaging the totals for each judge: Cut Score = 84%
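A minimal sketch of the same arithmetic in code, using the judge-by-item probabilities from the example above:

# Minimal sketch of the Angoff calculation shown above: each judge estimates,
# per item, the probability that a minimally competent person answers correctly;
# the cut score is the average of the judges' mean estimates.

judge_estimates = {
    "Judge 1": [0.75, 0.80, 0.75, 0.90, 0.95],
    "Judge 2": [0.80, 0.90, 0.75, 0.90, 0.75],
    "Judge 3": [0.85, 1.00, 0.90, 0.80, 0.85],
}

judge_percents = []
for judge, probs in judge_estimates.items():
    total = sum(probs)                      # e.g. Judge 1: 4.15
    percent = total / len(probs) * 100      # e.g. Judge 1: 83%
    judge_percents.append(percent)
    print(f"{judge}: total={total:.2f}, percent={percent:.0f}%")

cut_score = sum(judge_percents) / len(judge_percents)
print(f"Angoff cut score = {cut_score:.0f}%")   # 84% for this example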
Performance Testing
Creating Valid Performance Tests Create a scoring rubric Create Behaviorally Anchored Rating Scales (BARS) Train raters Determine Inter-rater reliability
Scoring Rubric Accurate performance assessment requires a scoring model for the behaviors being assessed. Typically displayed as a table with the performance criteria being judged down the left and the ratings across the top.
Scoring Rubric:  Example
The Descriptive Behaviors The judgment criteria that go into the boxes of the rubric These are the behaviors you expect the evaluatee to display for this judgment criteria
Descriptive Behaviors:  Example
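The example slides above were images and are not reproduced here; as a purely illustrative sketch (the criteria, levels, and descriptors below are hypothetical, not the presenters' rubric), a rubric with behaviorally anchored descriptors can be represented as a simple data structure and scored by lookup.

# Minimal sketch (illustrative only): a scoring rubric as a data structure,
# with descriptive behaviors anchoring each rating level.

rubric = {
    "Opens the call": {
        1: "Does not state a call objective",
        2: "States a generic objective unrelated to this customer",
        3: "States a clear, customer-specific call objective",
    },
    "Handles objections": {
        1: "Ignores or dismisses the objection",
        2: "Acknowledges the objection but responds vaguely",
        3: "Acknowledges, clarifies, and responds with approved resources",
    },
}

def score_role_play(ratings):
    """ratings: dict mapping each criterion to the rating assigned by the rater."""
    max_per_criterion = 3
    total = sum(ratings[criterion] for criterion in rubric)
    return total / (len(rubric) * max_per_criterion)

observed = {"Opens the call": 3, "Handles objections": 2}   # one rater's judgments
print(f"Role-play score: {score_role_play(observed):.0%}")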
Rater Training Effective rater training should ensure: Thorough knowledge of scoring standards (validity) Consistency of scores (reliability) Neutrality (fairness)
Inter-rater Reliability The best way to ensure this is to have a properly developed scoring rubric and effective rater training. The standard measure for inter-rater reliability is the kappa statistic, which approximates an intra-class correlation.
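A minimal sketch of kappa for two raters making pass/fail judgments on the same set of role plays (Cohen's kappa is used here as a common form of the statistic; the ratings are hypothetical):

# Minimal sketch: Cohen's kappa for two raters scoring the same role plays.
# Ratings below are hypothetical; kappa corrects observed agreement for chance.

from collections import Counter

rater_a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]

n = len(rater_a)
observed_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement from each rater's marginal category frequencies
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
categories = set(rater_a) | set(rater_b)
chance_agreement = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

kappa = (observed_agreement - chance_agreement) / (1 - chance_agreement)
print(f"Observed agreement = {observed_agreement:.2f}, kappa = {kappa:.2f}")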
Recertification
Do You Retest Knowledge Periodically? Yes No I think so I don’t know
Ebbinghaus Curve of Forgetting
Ebbinghaus Curve of Forgetting [figure: retention plotted against time since learning]
Re-certification Re-certification applies to credentials that have a time limit. It usually involves re-training and re-assessment.
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Some Early Results
Secure, timed vs. unsecure, untimed Eight participants took both the unvalidated, less secure exam and the validated exam. Participants have been given alpha designations for the pilot. The traditional pass score is determined arbitrarily; the Angoff pass score is determined by SME consensus. Even though the Angoff pass score is 10% lower than the traditional pass score, 2 out of 8 participants would not have passed the Track 1 final exam, even though the validated exam was focused on required knowledge as determined by SMEs. Traditional pass score: 90%. Angoff pass score: 80%. [Chart: uncontrolled vs. controlled results]
Certification Development Framework The Driver Business Case Requirements Standards Governance & Administration Re-certification & Maintenance Global Considerations Develop  Assessments Groundwork Design Program Policy and  Procedures Develop Evaluate Evaluation Deliver  Assessments Deliver Reference:  Hale, Judith (2000). Performance-Based Certification: How to Design a Valid, Defensible, Cost-Effective Program. San Francisco: Jossey-Bass/Pfeiffer
Analyzing Results of the Tests Do you need to know how well your questions discriminate between weak and strong students? Point-biserial correlation Do you need to know what percent of students answered each question correctly? Difficulty level Do you need to know where students have misinformation? Choice distribution Do you need to know where the group has strengths and weaknesses? Score by learning objective/topic
Analyzing Results of the Program Do your test results show that your sales reps are meeting the standards you have defined? Percent passing Is the program perceived as credible in the organization? Validity Are your results consistent over time? Reliability
Finally… Document  your process Take the time to  construct  good assessments Take the time to  validate  your assessments Take the time to  set  defensible passing scores Pilot  the assessments Set the  right expectations  for the learners Analyze  results and  revise  assessments as necessary Recertify  periodically
Questions? Gregory Sapnar Bristol-Myers Squibb [email_address] 609-897-4307 Steven B. Just Ed.D. Pedagogue Solutions [email_address] 609-921-7585 x12
