Improving Testing with Key Strength Analysis
Have you ever wondered whether some distractors were just a little too close to being a right answer? Have you wished for a way to decide whether an item's keyed answer meets your standards? What about items that were published with the wrong answer key?
If you have ever asked yourself these questions, be sure to watch our webinar, presented as part of the Caveon Webinar Series on September 18, 2013. You will learn a new evaluation method that will help you feel confident about your key strength.
The webinar will discuss the underlying concepts, the theory, and applications for the method Caveon has been using since 2011. The method uses classical item statistics, so it can be used for all assessments that can be analyzed using p-values and point-biserial correlations. As such, we believe it to be a valuable enhancement to other commonly-used item analyses.
Best Practices for the Academic User: Maximizing the Impact of Your Instituti... (Qualtrics)
To view the on-demand webinar for this presentation see the following link: https://success.qualtrics.com/academic-best-practices-watch.html
Qualtrics has changed the landscape for colleges and universities, introducing many features to help academic decision makers run more successful surveys.
Join Qualtrics and Jag Patel, Associate Director of Institutional Research at MIT, as we share best practices and tips for academic users.
Testing industry veterans John Fremer and Steve Addicott of Caveon are joined by Lou Woodruff, past president of the National College Testing Association to share their "lessons learned" from several of this summer's biggest testing conferences. For more information, please go to www.caveon.com
Surveys that work: training course for Rosenfeld Media, day 3 (Caroline Jarrett)
Surveys seem easy: anyone can throw together a few questions, send them out, and hope that they are rewarded with a decent response. But we’ve all seen examples of poorly conceived surveys that couldn’t possibly deliver real insights for the organisation that sponsored them.
This highly participative three-session training - arranged by Rosenfeld Media as part of its Virtual Training with UX Industry Leaders programme - takes you through the whole process of creating an effective survey, from defining a goal through analysis of data and creating a presentation.
These slides come from day 3 of the course: responses and reports.
Psychometrics 101: Know What Your Assessment Data is Telling You (ExamSoft)
Presented by Eric Ermie, Executive Director of Sales, ExamSoft Worldwide, Inc.
Keep it? Throw it out? Content/teaching issue? Bad question? Too easy? Too hard? What the heck? More than likely you have asked some or all of these questions at one point or another when trying to understand the performance of questions on an assessment. With differing opinions on how to interpret the statistics provided, how do you know what all this data is trying to tell you? Join us for a webinar on the fundamentals of item analysis, how the data are derived, and the different ways they can be interpreted. This presentation will cover how to put data into a useful context that will allow you to draw your own conclusions about what the data mean, how you should apply them, and why rules that others use for their specific situations may not apply to yours.
Cognitive, personality and behavioural predictors of academic success in a la... (Blackboard APAC)
In recent years there has been growing interest in the use of e-learning tools that are able to adapt to suit the ability levels, needs, or preferences of individual learners. In this project we aim to test the utility of an adaptive e-learning study tool within the context of a large undergraduate Psychology course (approximately 700 students). The study tool and a number of associated summative tests are hosted on the course’s Blackboard Learning Management System. Pilot data indicates that students that use the tool perform significantly better on the summative tests compared to non-users (t[683] = 4.35, p <0.001). We examine the relationship in the context of 1) learning analytics data that can be obtained via Blackboard, and 2) a number of known psychological predictors of academic success.
Delivered at Innovate and Educate: Teaching and Learning Conference by Blackboard. 24 -27 August 2015 in Adelaide, Australia.
Chapter 6: Writing Objective Test Items
1) What is an objective test item?
2) Examples of objective test items
a) True or False
• Advantages & Disadvantages
• Suggestions for writing true or false test items
b) Matching Type
• Advantages & Disadvantages
• Suggestions for writing matching type test items
c) Multiple Choice
• Advantages & Disadvantages
• Suggestions for writing multiple choice test items
d) Completion Test
• Advantages & Disadvantages
• Suggestions for writing completion test items
3) Guidelines for writing test items
Caveon Webinar Series - A Guide to Online Protection Strategies - March 28, ... (Caveon Test Security)
Join Executive Web Patrol Managers, Cary Straw and Jen Baldwin, as we explore the systems, methods and steps you need to successfully protect and extend the life of your high stakes certification, licensure, and state assessment exams from online threats.
Some of the questions we will answer include:
• Which processes should I implement to decrease the chance of my content appearing online?
• Where are the best places to use online security resources?
• Where do I look next if I found a threat, and where are the threats likely to spread?
• What are proactive steps I can take to protect my exams online?
• Who should be in my protection hierarchy?
• Am I "safe" after I've found a threat, and have had it removed?
Caveon Webinar Series - Five Things You Can Do Now to Protect Your Assessment... (Caveon Test Security)
Test season is approaching quickly! Maintaining the security and validity of assessment results is critical to support federal accountability and peer review requirements.
Kick off testing season with this year's first Caveon Webinar, "Five Things You Can Do Right Now to Protect Your Assessment Programs."
This webinar will focus on:
• Test security threats & risk analysis
• Creating test security policies and procedures
• Planning and implementing on-site monitoring
• Reviewing anomalous test results
• Managing incident reports
Join the webinar to learn more, and you'll be off to a strong start in protecting your tests, your results, and your reputation.
If you missed the first three sessions, you can still view them. And, if you can't attend on January 17, go ahead and register anyway and we will send you the recording and slides after the session.
The Do's and Don'ts of Administering High Stakes Tests in Schools (Caveon Test Security)
There is a great deal of advice available about giving high stakes tests securely in school settings. States run annual training sessions and provide test administration manuals. Major vendors serving schools provide training and guidelines of varying types. Sometimes the different sources disagree and the emphases vary by the nature of the helping agency. What is a test administrator to do?
This webinar focuses on administering tests in schools and identifies ten "best practices" that apply to all high stakes testing. The content is drawn from careful analyses of current testing practices by states, districts, and testing vendors.
To be an effective test administrator, you will need to read the background materials about each testing program and attend any training that is provided. If you also follow the guidelines presented in this webinar, you will be in a very good position to promote fairness and validity in each of the programs for which you share responsibility.
In this webinar, you will learn:
* Ten Best Practices that apply to all high stakes testing
* What is required to be an effective test administrator
* How to promote fairness and validity in your testing programs
Sponsored by the National Association of Assessment Directors and Caveon Consulting Services, Caveon Test Security
Caveon Webinar Series - The Art of Test Security - Know Thy Enemy - November ... (Caveon Test Security)
As Sun Tzu famously said... "If you know your enemy as you know yourself, you need not fear 100 battles." On the battlefield of security -- whether home security, airport security, or test security -- the first step to success is knowing the threats.
Are you worried about tests being stolen and shared online? Or test takers cheating by being coached by an expert? If so, the steps to successfully protecting your test and triumphing over these fears include:
• conducting a risk assessment
• determining (and ranking) which threats pose the greatest risk
• strategizing how to render those threats impotent
• determining the right combination of prevention, detection and deterrence tactics for your program
This webinar will teach you to conquer the steps in this test security process. Join Caveon CEO David Foster to learn how to analyze and rank the threats that are specific to your program. You will also discover the three solutions necessary to counter any and all of these threats.
Caveon Webinar Series - Four Steps to Effective Investigations in School Dis... (Caveon Test Security)
Now that spring test administrations are almost over, K-12 districts and schools can breathe a sigh of relief. Weeks of vigilance have paid off with a smooth, incident-free test administration. Not your district? You’re not alone. No matter the extent of planning, training, and oversight, there are always unforeseen events that result in testing irregularities. Most will be straightforward and covered by standard policies and procedures. But some incidents may set off your internal alarms. By themselves, these reports are only single data points and need to be explored to determine the larger context and what really happened. This webinar will provide information on:
How to develop a plan for responding to test irregularity reports and;
How to carry out investigations if additional information is needed.
The session is free, and will only last 30 minutes. Space is limited, so register today! We look forward to seeing you on May 18th!
If you missed the first two sessions, you can still view them. And, if you can't attend on May 18, go ahead and register anyway and we will automatically send you the recording and slides after the session.
Caveon Webinar Series - On-site Monitoring in Districts 0317 (Caveon Test Security)
Are you sure that school leaders and educators are following your state and local assessment policies and procedures during the administration of assessments?
On-site monitoring of assessment administrations at schools and in classrooms is an effective quality assurance measure that:
• ensures compliance with standardized policies and procedures
• helps identify the greatest areas of vulnerability in your assessment administration processes
• creates opportunities to improve training, and
• clarifies messaging about assessments for school leaders and educators.
Finally, LEA-sponsored monitoring demonstrates a strong commitment to the integrity of assessments and the important decisions made based upon assessment results.
By attending this webinar, you will gain exposure to:
1) the goals and purposes of monitoring,
2) best-practice monitoring activities during assessment administrations,
3) evaluating data from monitoring reports,
4) potential outcomes from monitoring and
5) first steps in implementing a monitoring program.
Caveon Test Security, the industry leader in providing security solutions for protecting high-stakes, K-12 assessments, is pleased to announce the first webinar in a series of 3, focused on test security challenges faced specifically by districts.
Session #1: Avoiding A School District Test Cheating Scandal:
A Tale of Two Cities
January 25, 2017, 12:00 p.m. ET
As a number of U.S. school districts have learned, mishandling of cheating incidents on tests, particularly state assessments, can have very negative and pervasive effects. This webinar reviews two examples of actual test cheating situations in school districts, contrasts how they were handled, and lays out practical and "battle-tested" strategies for avoiding and, if necessary, coping with test cheating events. Having a strong security plan and acting wisely and decisively when you see signs of trouble can be a very productive approach. This webinar will give you tools to manage a test cheating incident if you have a suspected or confirmed report of cheating.
Caveon Webinar Series - Discrete Option Multiple Choice: A Revolution in Te... (Caveon Test Security)
High-stakes testing faces major changes due to the use of computers and other technology in test administration. Some such changes include new test designs (such as computerized adaptive testing), proctoring tests online, and even administering tests on tablets and smartphones to improve test taker convenience. One of the most important changes is innovative new item types that better measure important skills. The Discrete Option Multiple Choice item type, or DOMC, is one of these ground-breaking new item types.
The DOMC item has the potential to revolutionize testing. It brings significant benefits in security, quality of measurement, fairness, test development, and test administration.
Caveon Webinar Series - Test Cheaters Say the Darnedest Things! - 072016 (Caveon Test Security)
You won't believe what's actually happened in the world of testing!
What goes on in the mind of a would-be test cheater? While cheating is a serious offense, some test takers go to great (and sometimes comical) lengths to try to gain an unfair advantage and achieve a successful testing outcome.
Join us as we look at some of the most memorable proctor/test taker cheating encounters. Our special guest, Jarret Dyer, of the College of DuPage Testing Center, has created a compilation of test proctor stories from testing centers around the United States and across the globe. Jarret will share his 'best of' stories, while Caveon's John Fremer will discuss the consequences of not following the right test security processes and procedures. You don't want to miss this fun, yet informative session! To listen to the recording that goes along with these slides, go to https://youtu.be/r-CCaDf7NEk
Caveon Webinar Series - The Test Security Framework - Why Different Tests Nee... (Caveon Test Security)
The need for global workforce skills credentials continues to grow. At the same time, the global workforce is shrinking. It is imperative that skill recognition be accurate and the level of test security be appropriate for the skills being assessed. The Security subcommittee of the new Workforce Skills Credentialing division of ATP created a new test security framework that will provide guidance to testing organizations when selecting the level of security needed for their assessments.
Join our guest presenters, Rachel Schoenig and Jennifer Geraets of ACT, as they discuss the challenge of identifying global workforce skills and how this new test security framework will help to align the expectations of those involved with workforce credentialing (e.g., test publishers, examinees, and employers). Rachel and Jennifer will also provide a call to action, requesting your comments on this new framework.
Caveon Webinar Series - Conducting Test Security Investigations in School Di... (Caveon Test Security)
In the coming weeks, schools all over the country will be administering standardized exams to millions of students. And inevitably, test security incidents will arise, many of which may directly impact test score validity. Is your team prepared to answer the following tough questions?
• What will you do if you find yourself in a position of having to respond to an incident or breach in your state or district?
• What process will you follow?
• What is your incident escalation plan?
• How will you communicate with internal and external stakeholders?
• Most importantly, how will you discover the truth of what did or did not occur, and its impact on test scores?
Join Caveon’s test security experts for an important, hour-long webinar to help you understand the steps to take when challenging situations arise. We will share:
• Recent experiences other districts have had with possible cheating, and what they have done to resolve their concerns
• Information and tools for you to arm yourself before an issue arises, and to help you be better equipped to deal effectively and efficiently
• Essential tips you need to know when invoking a Security Incident Response Plan, and further conducting a security investigation
Caveon Webinar Series - Creating Your Test Security Game Plan - March 2016 (Caveon Test Security)
History has shown that as stakes rise for testing programs, so do threats to the program's test result validity. There are stories in the media almost daily about high-stakes programs suffering at the hands of those intent on obtaining the content for disingenuous purposes. Having a game plan in place before a threat or validity issue occurs is vital. This month's webinar will focus on key steps your organization can take to maximize your protection from test fraud, and stay one step ahead of the game.
Caveon Webinar Series - Mastering the US DOE Test Security Requirements Janua... (Caveon Test Security)
The U.S. Department of Education recently issued the Peer Review of State Assessment Systems, which includes a required "Critical Element" on Test Security. To fulfill this requirement, States must submit documentation of policies and procedures in four categories of test security: prevention, detection, remediation, and investigation.
It is up to each State to determine which steps to implement and what evidence to submit to prove they have met each of these requirements. Evidence could, and should, include a myriad of test security measures ranging from Security Handbooks and annual proctor training, to data forensics and web monitoring procedures (and everything in between).
Caveon can help guide you through this complicated process. In the upcoming session, our test security experts will unpack the requirements of this section of the Peer Review process. The goal is to help you form a road map moving forward, provide information on the best practices for protecting your assessments, and outline resources to streamline the process.
Caveon Webinar Series - Will the Real Cloned Item Please Stand Up? (Caveon Test Security)
Join us for this month's webinar on the ins and outs of developing item clones. While many of us are aware of the benefits cloning can provide, such as expanding an item bank, lengthening the shelf life of an exam, or deterring and detecting cheating, questions remain regarding the best practices for implementation. Secure exam development experts will address the question, "How do we know, during development, when an item has been sufficiently altered, making it a 'real clone' and not just an 'imitator' of a clone?" The answer isn't as clear cut as it would seem.
Additional topics will include:
• General information on cloning
• Lessons learned from the field
• Creative ideas for streamlining cloning processes
This webinar will help assessment and program managers be better positioned to put on their cloning lab coats and reap the rewards of this best practice in test security.
Caveon Webinar Series - Lessons Learned at the 2015 National Conference on S... (Caveon Test Security)
The National Conference on Student Assessment (NCSA) was held last month in San Diego, and Caveon was there. This month's webinar will focus on lessons learned at the conference regarding test security, and what's happening in the state assessment arena in terms of test security right now.
Caveon's Steve Addicott and Jamie Mulkey will be joined by special guest Walt Drane, State Assessment Director, Mississippi Department of Education. The panelists will summarize the test security trends and strategies that they drew from the conference, and also share key points from sessions they presented.
Caveon Webinar Series - Learning and Teaching Best Practices in Test Security... (Caveon Test Security)
Test security has been emerging as a cohesive discipline for the past ten years. There are no college courses that teach test security. And, even if there were, many practitioners don't have time to take those classes. How do you stay abreast of current developments? How do you train your staff in latest best practices if you don't know about them? Are there resources out there, and how do you find them?
In this webinar, Caveon will host several special guest practitioners from various industries. These test security veterans have had to answer these very questions. They will address how continuing education will help you improve test security in your organization.
Caveon Webinar Series: Improving Testing with Key Strength Analysis
1. Upcoming Caveon Events
• Caveon Webinar Series: Next session, October 16 – The Good and Bad of Online Proctoring, Part 2
• EATP – September 25-27 in St. Julian's, Malta.
– Caveon's John Fremer and Steve Addicott presenting: What are we Accountable For? Security Standards and Resources for High Stakes Testing Programs
– Steve Addicott hosting an ignite session: Leveraging Social Media to Connect with International Test Candidates
• The 2nd Annual Statistical Detection of Potential Test Fraud Conference
– October 17-19, 2013, Madison, Wisconsin
– Caveon's Dennis Maynes and Cindy Butler will be presenting three sessions
• Handbook of Test Security – Now Available. We will share a discount code at the end of this session.
2. Caveon Online
• Caveon Security Insights Blog
– http://www.caveon.com/blog/
• twitter
– Follow @Caveon
• LinkedIn
– Caveon Company Page
– "Caveon Test Security" Group
• Please contribute!
• Facebook
– Will you be our "friend"?
– "Like" us!
www.caveon.com
3. Improving Testing with Key Strength Analysis
Caveon Webinar Series, September 18, 2013

Dennis Maynes, Chief Scientist, Caveon Test Security
Dan Allen, Psychometrician, Western Governors University
Marcus Scott, Data Forensics Scientist, Caveon Test Security
Barbara Foster, Psychometrician, American Board of Obstetrics and Gynecology
4. Agenda for Today
• Review classical item analysis
• Introduce Key Strength Analysis
• Derive Key Strength Analysis
• Observations by Dan Allen and Barbara Foster
• Conclusions and Q&A
5. Review Classical Item Analysis
• Statistics
– P-value
– Point-biserial correlation
• Typical rules
– Low p-values (hard items)
– High p-values (easy items)
– Low point-biserial correlations (low discriminations)
• Easy to understand and implement
• Good at flagging poor items
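These two statistics are simple to compute from scored response data. Below is a minimal sketch in Python, assuming item scores coded 0/1 and precomputed total scores; the function names and sample data are illustrative, not part of Caveon's tooling.

```python
# Minimal classical item statistics; assumes 0/1 item scores and numeric
# total scores. Names and data are illustrative only.
import math

def p_value(item_scores):
    """Proportion of test takers answering the item correctly (item difficulty)."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total test score."""
    n = len(item_scores)
    mx = sum(item_scores) / n
    my = sum(total_scores) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(item_scores, total_scores)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in item_scores) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in total_scores) / n)
    return cov / (sx * sy)

# Six test takers: their 0/1 scores on one item and their total test scores
item = [1, 1, 1, 0, 0, 1]
totals = [9, 8, 7, 3, 2, 8]
```

An easy item has a high p-value; an item that discriminates well has a clearly positive point-biserial, as in the sample data above.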
6. Introduce Key Strength Analysis
• Why Key Strength Analysis?
– Model uses information from all items
– Answer choices for same item are compared
– Provides possible reasons for poor performance
• High performing test takers (knowledgeable students)
– Typically report problems with the answer key
– Usually choose the correct answer
• Most frequently selected choice
– Is usually correct for easy items
– Is not necessarily correct for hard items
7. Capabilities of Key Strength Analysis
• Built upon classical item analysis
– Point-biserial correlations discriminate between high and low performers
– P-values detect hard/easy items
• Typical problems with items
– Mis-keyed items
– Weakly keyed items
– Ambiguously keyed items
• Use probabilities to make inferences about item performance
8. Modify Point-Biserial Correlation
1. Exclude the item score from the test score
• Places all answer choices on "the same playing field"
• Allows correct and incorrect answers to be compared using "what if"
2. Compute point-biserial correlations
• For the correct answer and
• For the distractors
3. Scale the point-biserial appropriately
• We call this statistic z*
• Use z* to compute the probability of each choice (A, B, etc.) being a key -- this is the "key strength"
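The three steps can be sketched as follows. This is an illustrative reading of the procedure, not Caveon's implementation: the rest-score exclusion and per-choice point-biserials follow steps 1 and 2 directly, but the deck does not specify the exact scaling behind z*, so the sqrt(n) factor in step 3 is only a plausible stand-in.

```python
# Sketch of the modified point-biserial: one z* per answer choice.
# The sqrt(n) scaling is an assumed placeholder for the unspecified scaling.
import math

def choice_point_biserials(choices, scored, item, n_items):
    """choices[i][j]: option chosen by taker i on item j; scored[i][j]: 0/1 score."""
    n = len(choices)
    # Step 1: exclude the item's own score from each taker's total ("rest" score)
    rest = [sum(scored[i][j] for j in range(n_items) if j != item)
            for i in range(n)]
    my = sum(rest) / n
    sy = math.sqrt(sum((y - my) ** 2 for y in rest) / n)
    z_star = {}
    for option in sorted(set(row[item] for row in choices)):
        x = [1.0 if row[item] == option else 0.0 for row in choices]
        mx = sum(x) / n
        sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
        if sx == 0 or sy == 0:
            continue  # option never or always chosen: correlation undefined
        cov = sum((v - mx) * (y - my) for v, y in zip(x, rest)) / n
        r = cov / (sx * sy)                # Step 2: per-choice point-biserial
        z_star[option] = r * math.sqrt(n)  # Step 3: scale (assumed form)
    return z_star

# Tiny demo: 4 test takers, 3 items; examine item 0
choices = [['A', 'B', 'A'], ['A', 'A', 'B'], ['B', 'B', 'B'], ['A', 'A', 'A']]
scored = [[1, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]]
z_by_choice = choice_point_biserials(choices, scored, item=0, n_items=3)
```

Excluding the item's own score keeps the keyed answer from getting an automatic boost, so the correct answer and the distractors are compared on the same footing.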
14. Approximation Theory
• By the Central Limit Theorem, z* is approximately normal.
• The probability function should be monotonically increasing, which requires equal variances.

[Figure: densities of z* from -10 to 10 for right and wrong answers, each with its fitted normal curve (series: Right, Right Normal, Wrong, Wrong Normal)]
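Under those assumptions, p(choice is a key | z*) follows from Bayes' rule applied to two equal-variance normal densities, which gives a monotonically increasing, logistic-shaped curve. The means, common variance, and prior below are illustrative placeholders, not values from the webinar.

```python
# Posterior probability that a choice is a key, given its z*, under the
# equal-variance normal model. All parameter values here are assumptions.
import math

def key_probability(z, mu_right=2.0, mu_wrong=-2.0, sigma=1.0, prior=0.25):
    """Bayes rule with two equal-variance normals; monotone increasing in z."""
    def normal_pdf(x, mu):
        return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    num = prior * normal_pdf(z, mu_right)              # keyed ("right") component
    den = num + (1 - prior) * normal_pdf(z, mu_wrong)  # plus distractor component
    return num / den
```

With unequal variances the two densities would cross twice and the curve would no longer be monotone, which is why the equal-variance requirement matters.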
16. Analysis of Distractors
• Compute key strength (KS) for all responses
• Low KS – probability less than 50%
• High KS – probability 50% or more
Answer \ Distractors | Low KS       | High KS
Low KS               | Weakly keyed | Potential mis-key
High KS              | Normal       | Ambiguously keyed
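The 2x2 classification can be sketched directly from the table. This is an illustrative reading using the 50% cut from the slides; treating the distractor side as "high" whenever any distractor reaches the cut is an assumption.

```python
# Classify an item from the key strength (KS) of its keyed answer and its
# distractors, using the 50% cut; "any distractor high" is an assumed reading.
def classify_item(answer_ks, distractor_ks_list):
    answer_high = answer_ks >= 0.5
    distractor_high = any(ks >= 0.5 for ks in distractor_ks_list)
    if answer_high and distractor_high:
        return "ambiguously keyed"
    if answer_high:
        return "normal"
    if distractor_high:
        return "potential mis-key"
    return "weakly keyed"
```

Applied to the examples that follow, this reproduces their labels: an answer near 0.99 with all distractors near 0 comes out normal, while an answer at 0.32 comes out weakly keyed.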
17. Example I – Good Key
[Figure: key strength curve p(choice is a key | z*) versus z*, with arrows marking choices A, B, C, and D; the answer key arrow is colored gold]

Response   z*     Probability
A          3.25   0.99
B          0.25   0.06
C         -2.75   0
D         -2.4    0
18. Example II – Potential Mis-key
[Figure: key strength curve p(choice is a key | z*) versus z*, with arrows marking choices A, B, C, and D; the answer key arrow is colored gold]

Response   z*     Probability
A          3.25   0.99
B          0.25   0.06
C         -2.75   0
D         -2.4    0
19. Example III – Weak Key
[Figure: key strength curve p(choice is a key | z*) versus z*, with arrows marking choices A, B, C, and D; the answer key arrow is colored gold]

Response   z*     Probability
A          1.0    0.32
B          0.25   0.06
C         -3      0
D         -2.5    0
20. Example IV – Ambiguous Key
[Figure: key strength curve p(choice is a key | z*) versus z*, with arrows marking choices A, B, C, and D; the answer key arrow is colored gold]

Response   z*     Probability
A          3.75   0.99
B          2.25   0.9
C         -3      0
D         -2.5    0
21. Validation – Answer Key Estimation
• Assume the key is not known
• Check accuracy of estimated answer key
• Algorithm:
– Start with most frequent response as initial guess
– Revise key using probabilities until no more changes
• For 12 different exams
– Key estimation accuracy varied from 81% to 99%
– Cannot infer multiple keys
– Cannot guess key when there are no correct responses
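The estimation loop described above might be sketched as follows. This is an assumption-laden stand-in: it revises the key by re-picking, for each item, the option most positively correlated with total score (a point-biserial), whereas the actual method revises the key using z* probabilities:

```python
import numpy as np

def estimate_key(responses, n_options, max_iter=20):
    """Sketch of answer-key estimation without a known key.
    responses: (n_examinees, n_items) array of chosen option indices."""
    responses = np.asarray(responses)
    n_examinees, n_items = responses.shape
    # Initial guess: the most frequent response on each item.
    key = np.array([np.bincount(responses[:, j], minlength=n_options).argmax()
                    for j in range(n_items)])
    for _ in range(max_iter):
        # Score everyone against the current key guess.
        scores = (responses == key).sum(axis=1).astype(float)
        new_key = key.copy()
        for j in range(n_items):
            best_opt, best_r = key[j], -np.inf
            for opt in range(n_options):
                x = (responses[:, j] == opt).astype(float)
                if x.std() == 0 or scores.std() == 0:
                    continue  # option never/always chosen: cannot infer
                # Point-biserial of choosing `opt` with total score.
                r = np.corrcoef(x, scores)[0, 1]
                if r > best_r:
                    best_opt, best_r = opt, r
            new_key[j] = best_opt
        if np.array_equal(new_key, key):
            break             # converged: no more changes to the key
        key = new_key
    return key
```

As the slide notes, this approach cannot infer multiple keys per item, and it fails when an option is never chosen, since the correlation is then undefined.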
22. Summary of Validation Study
• Accuracy improves with item quality
• Accuracy is affected by sample size & test length

Exam Name | N | Forms | Form Length | Items | Non-scored Items | Accuracy | Observations
A | 2,966 | 2 | 180 | 307 | 0 | 99.2% |
B | 337 | 2 | 107 | 214 | 0 | 85.5% |
C | 337 | 1 | 230 | 230 | 0 | 90.9% |
D | 1,815 | 1 | 204 | 204 | 7 | 92.1% | Some association with "deleted" items
E | 1,408 | 1 | 199 | 199 | 1 | 96.0% |
F | 46,356 | 2 | 240 | 480 | 0 | 96.0% |
G | 44,104 | 2 | 120 | 240 | 0 | 95.8% |
H | 25,448 | 2 | 60 | 120 | 0 | 93.3% |
I | 121 | 3 | 165 | 417 | 43 | 81.0% | Strong association with "field test" items
J | 1,071 | 8 | 52 & 61 | 391 | 0 | 80.5% | 85.2% (English-only)
K | 2,033 | 8 | 68, 76 & 77 | 510 | 0 | 85.9% |
L | 6,473 | 21 | 250 | 1,050 | 850 | 85.7% | All errors except one were on non-scored items
23. Reason for Answer Key Estimation
• If a group of test takers has stolen the test and worked
out their own answer key, it is likely some answers will
be wrong.
• Answer key estimation can find the errors committed by
test thieves.
25. Example Item: Ambiguous Key
Which is a property of all X?
A. They contain Y.
B. They have property Z.
C. * They do not contain Y.
D. They have property W.
Looking at the item text, we see that the flag is likely
caused by the rival options A and C. SME feedback
suggests the item is too text-specific.
26. Example Item: Ambiguous Key
Which is a component of X?
A. * Real anticipated expense
B. Time spent
C. Liquid assets
D. Quality
In this case, students of high ability were often
selecting C instead of A. SME feedback suggests the
deleted word may have been steering students away
from that option.
27. Example Item: Weak Key
Select 3 possible causes of X
A. *Obesity
B. Contaminated drinking water
C. *Unhealthy diet
D. *Genetic factors
E. Lack of exercise
High-performing students were picking C and D correctly, but
were as likely to pick E as they were to pick A. SME feedback
suggested that E may be a reasonable answer to the question.
The revision made A, C, and E all incorrect answers, so that
D would remain the sole answer.
28. Example Item: Potential Mis-key
Which is a sound accounting principle?
A. X
B. Not X
C. *Y
D. Z
Nearly all students selected distractor B (Not X). This
item was not mis-keyed. It seems most likely that this
concept was not covered sufficiently in the text and/or
other learning resources, leaving students to rely on
guessing strategies rather than content knowledge.
30. The American Board of
Obstetrics and Gynecology
2013 Certifying Exam
• 180 scored items
• Five sets of 40 field test items
31. • Potential mis-keys flagged by Caveon
– 8 identified among the scored items (4%)
– 22 identified among the field test items (11%)
The lower proportion among the scored items is not
surprising, since those items have already been field
tested and some may have been previously used.
The American Board of Obstetrics and Gynecology
32. • Result of the SME review of the flagged scored
items:
– 4 of the 8 (50%) were found to have problems.
These problems were a combination of ambiguous
wording, new information published just prior to
the exam, recent changes in guidelines, or just a
very difficult item. These items were deleted from
the exam prior to scoring.
33. • Result of the SME review of the flagged field
test items:
– 15 of the 22 (68%) were found to have problems.
These problems were mostly a combination of
ambiguous wording, responses too closely related,
and changes in the field.
34. Our Standard Methods vs. the z* Method
• Our Standard Methods flagged 27 field test items (13.5%)
• The z* Method flagged 22 field test items (11.0%)
• 8 items (4%) were flagged by both
35. Our Standard Methods vs. the z* Method
• Our Standard Methods: 27 field test items flagged (13.5%); 13 had problems
• The z* Method: 22 field test items flagged (11.0%); 15 had problems
• Of the 8 items (4%) flagged by both, 5 had problems
36. • Conclusion
This new method is detecting differences that are
not detected by our current methods. These
differences do not appear to be strictly keying
errors; they involve other important problem
areas as well.
37. Conclusions
• Item analysis helps ensure
– Unidimensionality
– Desired item performance
• Key Strength Analysis enhances classical item analysis
– Uses information from all items
– Compares answer choices for same item
• Can detect structural flaws in items
• Can suggest the actual key when the item is mis-keyed
– Suggests possible reasons for poor performance
• Future research
– Investigate thresholds for Key Strength Analysis
– Simulate item problems to measure ability to detect
– Evaluate performance when assumptions fail
39. HANDBOOK OF TEST SECURITY
• Editors - James Wollack & John Fremer
• Published March 2013
• Preventing, Detecting, and Investigating Cheating
• Testing in Many Domains
– Certification/Licensure
– Clinical
– Educational
– Industrial/Organizational
• Don’t forget to order your copy at www.routledge.com
– http://bit.ly/HandbookTS (Case Sensitive)
– Save 20% - Enter discount code: HYJ82
40. THANK YOU!
- Follow Caveon on Twitter @caveon
- Check out our blog: www.caveon.com/blog
- LinkedIn Group – "Caveon Test Security"

Dennis Maynes, Chief Scientist, Caveon Test Security
Dan Allen, Psychometrician, Western Governors University
Marcus Scott, Data Forensics Scientist, Caveon Test Security
Barbara Foster, Psychometrician, American Board of Obstetrics and Gynecology