From Dr. David Foster, Caveon CEO
Have you ever felt the angst, doubt, and concern that comes from using current methods for setting cutscores? Well, I have, and that's why I am presenting this month's session of the Caveon Webinar Series.
This month's webinar presents a promising new method for helping to make pass/fail decisions. Borrowed from Cognitive Science, Information Integration Theory (IIT) is a quantitative method for comparing human rater judgments. It is a method that adds a scientific foundation to the way we determine who's qualified and at what level.
Standard setting using IIT is based on well-established, researched principles that explain and predict how we combine information in our brains in order to form consistent judgments. Since setting cutscores today is all about rater judgments, these methods should provide us with a quantitative basis for better establishing and evaluating the outcomes of our cutscore setting efforts.
By attending this informative session, you'll have the chance to:
• Participate in an actual "hands-on" (or more appropriately "brains-on") live pilot test of the methodology
• Learn the advantages of cut score setting using IIT
• Discover how the method may help in other routine psychometric analysis tasks that involve judgment (e.g., gender bias and content alignment reviews)
• Better understand the concepts behind using this new method for setting cutscores
• Use a software tool built on this methodology for calculating cut scores on your next test
Big shadow test
The Big-Shadow-Test method solves a large simultaneous test-assembly problem as a sequence of smaller simultaneous problems.
Shadow tests are not regular tests; their items are always returned to the pool. They are assembled only to balance the selection of items between current and future tests. Their presence neutralizes the greedy character inherent in sequential test-assembly methods: they prevent the best items from being assigned only to the earlier tests and keep the later test-assembly problems feasible.
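A toy sketch of the balancing idea follows. A real shadow-test method formulates these selections as mixed-integer programs; here the item pool, the information values, and the round-robin split are all invented for illustration.

```python
# Hypothetical pool: item identifier -> information value.
pool = {"i1": 2.0, "i2": 1.9, "i3": 1.8, "i4": 1.7, "i5": 1.0, "i6": 0.9}

def greedy(pool, n_tests=2, length=2):
    """Sequential greedy assembly: each test takes the best remaining items."""
    remaining = dict(pool)
    tests = []
    for _ in range(n_tests):
        picked = sorted(remaining, key=remaining.get, reverse=True)[:length]
        tests.append(picked)
        for item in picked:
            del remaining[item]
    return tests

def with_shadow(pool, n_tests=2, length=2):
    """Select items for ALL tests at once (current test plus shadow test),
    then distribute them so each test gets a comparable share."""
    best = sorted(pool, key=pool.get, reverse=True)[: n_tests * length]
    return [best[k::n_tests] for k in range(n_tests)]

info = lambda test: sum(pool[i] for i in test)

g1, g2 = greedy(pool)
s1, s2 = with_shadow(pool)
print(round(info(g1) - info(g2), 2))  # greedy gives the first test all the best items
print(round(info(s1) - info(s2), 2))  # shadow-balanced split narrows the gap
```

Under these assumed values, greedy assembly leaves a 0.4 information gap between the two tests, while the shadow-balanced split halves it.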
Using Mastery Manager to Inform Instruction | Erin Abruzzo
The presentation is an overview of how teachers at Harlem High School are using Mastery Manager (a data collection tool) to inform their classroom instruction.
Online Tests: Can we do them better? | Bopelo Boitshwarelo, Jyoti Vemuri, Han... | Blackboard APAC
The use of e-assessment methods to facilitate and evaluate learning is a growing trend in the higher education space. In particular, the use of online tests has increased rapidly concomitant with the expansion of digital technologies for teaching purposes. Online tests, in the context of this presentation, refer to computer-assisted assessment where deployment and marking are automated, typically involving objective question types such as multiple choice questions (MCQs), true/false questions, matching questions, and predetermined short answer questions. The growing sophistication of Learning Management Systems (LMSs) such as Blackboard provides an increasing capacity for different types of online tests to be deployed, administered and marked efficiently. Additionally, most major textbook publishers and authors in certain disciplines provide online question banks that can easily integrate with LMSs, meaning less time is spent on creating tests from scratch.
With these trends in mind, questions arise around the efficacy of online tests in higher education.
In this presentation we will share findings of a study investigating practices around online tests. First, we will explore what the literature reveals about the role of online tests in higher education and particularly how online tests are used to lead to student learning through formative assessment processes and feedback practices. Secondly, the presentation will review the practices around online tests at the Charles Darwin University Business School and discuss emerging issues. Thirdly, the presentation will distil some preliminary guiding principles around designing, developing, administering and reviewing online tests for effective learning and assessment. Finally, ongoing and further research by the team on the topic of online tests will be highlighted.
Individual Assignment Progistics-Solutions Inc. – The Critical Parts Network.docx | jaggernaoma
Individual Assignment: Progistics-Solutions Inc. – The Critical Parts Network
1. Give two reasons why you think the performance of the Toronto depot is better than that of the Scarborough depot. (2 marks each)
Reason #1
Reason #2
2. Improvements
a. Describe three initiatives that can be done at the Critical Parts Network to improve inventory turns without having a negative impact on customer service levels. (2 marks each)
b. Describe what the benefit would be of each initiative. (1 mark each)
c. Explain what role Xerox would have to play in each initiative. (1 mark each)
Initiative | Description | Benefit | Xerox's Role
1 | | |
2 | | |
3 | | |
3. What are the costs and benefits of reducing the cut-off point for filling from the technician’s trunks? For calculation purposes, assume inventory carrying costs are 25% the value of the inventory. (4 marks)
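For the carrying-cost side of question 3, the arithmetic can be sketched as follows. The trunk inventory value and the reduction achieved are placeholders; the case materials supply the real figures, and only the 25% carrying rate comes from the assignment.

```python
# Hypothetical figures for illustration only.
trunk_inventory_value = 400_000.0  # assumed value of parts held in technician trunks
carrying_rate = 0.25               # given: carrying cost is 25% of inventory value per year

# Suppose reducing the fill cut-off frees 30% of trunk stock (assumed).
reduction = 0.30
inventory_freed = trunk_inventory_value * reduction
annual_carrying_savings = inventory_freed * carrying_rate
print(annual_carrying_savings)  # 30000.0 per year under these assumptions
```

Any benefit estimate like this would then be weighed against the cost side: extra emergency shipments and any impact on service levels.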
4. If you were in the position of Gary Parkinson, what are the top three recommendations you would make to Jim Eckler, and why? (Recommendation: 1 mark; Justification: 1 mark)
Recommendation | Justification
1 |
2 |
3 |
NUR 520: Nursing Theory and Research II | 03/07/2016
Nursing Theory and Research II
Course Description
Three hours per week. Prerequisite: NUR 510. The focus of this course is on the research process and critical examination of research designs. Exploration of data analysis and interpretation will be emphasized. This course prepares students to use evidence-based practices in their specialized area of professional nursing.
Course Objectives
Following successful completion of course work, the student will be able to:
• Examine common ethical dilemmas related to research in nursing and other health care disciplines and the ways in which these dilemmas impact patient care
• Pose specific research question(s) and/or hypotheses relevant to a significant problem in nursing practice
• Compare and contrast qualitative and quantitative research methods and their appropriateness for researching a specific problem in nursing practice
• Determine appropriate sampling methods and the sample size needed for researching a specific nursing problem
• Compare measurement strategies used in nursing research and select the appropriate strategy for the data collection tool used in a specific research-based project
• Analyze various data collection methods and their relevancy for answering specific research questions in nursing
• Analyze quantitative and qualitative research data
• Develop a data analysis plan based on data collection tools and statistics relevant to answering the research question
• Employ leadership/management strategies in determining the feasibility of conducting research on a specific problem in nursing
• Develop strategies for incorporating evidence into nursing practice
• Develop a research proposal for a study in your area of nursing specialization
Topical Outline
• Nursing Research at the Graduate Level
• Research Objec.
Can you measure whether the content in your eLearning system provides an enriching and engaging experience for your learners? If you can't answer this important question, you're not alone. Organizations struggle with the complex task of analyzing data to identify opportunities to improve learner engagement with their content. It's worth the effort to find out: courses and related resources that are less valuable than intended can depress interest and attendance rates, leading to poor learning outcomes. There are many ways to measure and analyze course engagement data in your LMS. These insights enable managers to identify and prioritize changes to learning programs and step up their engagement game.
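As one minimal illustration of such a measurement, a per-course completion rate can be computed from an LMS event log. The event records, field names, and metric choice below are all invented for the sketch; a real LMS export would look different.

```python
from collections import defaultdict

# Hypothetical LMS event log: (learner, course, event) tuples.
events = [
    ("ana",   "onboarding", "enrolled"),
    ("ana",   "onboarding", "completed"),
    ("ben",   "onboarding", "enrolled"),
    ("carla", "security",   "enrolled"),
    ("carla", "security",   "completed"),
]

enrolled = defaultdict(set)
completed = defaultdict(set)
for learner, course, event in events:
    if event == "enrolled":
        enrolled[course].add(learner)
    elif event == "completed":
        completed[course].add(learner)

# Completion rate per course: one simple engagement signal managers can rank by.
completion_rate = {
    course: len(completed[course]) / len(enrolled[course])
    for course in enrolled
}
print(completion_rate)  # {'onboarding': 0.5, 'security': 1.0}
```

Low-completion courses like the hypothetical "onboarding" above are the ones to prioritize for review.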
There's more to learning evaluation than surveys and smile sheets. In this recent webinar, Andrew Downes laid down practical, straightforward advice on how to take your learning evaluation further and measure whether your learning programs are having the impact they were designed to achieve.
Here's the slides!
Caveon Webinar Series - A Guide to Online Protection Strategies - March 28, ... | Caveon Test Security
Join Executive Web Patrol Managers Cary Straw and Jen Baldwin as we explore the systems, methods, and steps you need to protect your high-stakes certification, licensure, and state assessment exams from online threats and extend their life.
Some of the questions we will answer include:
• Which processes should I implement to decrease the chance of my content appearing online?
• Where are the best places to use online security resources?
• Where do I look next if I found a threat, and where are the threats likely to spread?
• What are proactive steps I can take to protect my exams online?
• Who should be in my protection hierarchy?
• Am I "safe" after I've found a threat, and have had it removed?
Caveon Webinar Series - Five Things You Can Do Now to Protect Your Assessment... | Caveon Test Security
Test season is approaching quickly! Maintaining the security and validity of assessment results is critical to support federal accountability and peer review requirements.
Kick off testing season with this year's first Caveon Webinar, "Five Things You Can Do Right Now to Protect Your Assessment Programs."
This webinar will focus on:
• Test security threats & risk analysis
• Creating test security policies and procedures
• Planning and implementing on-site monitoring
• Reviewing anomalous test results
• Managing incident reports
Join the webinar to learn more, and you'll be off to a strong start in protecting your tests, your results, and your reputation.
If you missed the first three sessions, you can still view them. And, if you can't attend on January 17, go ahead and register anyway and we will send you the recording and slides after the session.
The Do's and Don'ts of Administering High Stakes Tests in Schools Final 121217 | Caveon Test Security
There is a great deal of advice available about giving high stakes tests securely in school settings. States run annual training sessions and provide test administration manuals. Major vendors serving schools provide training and guidelines of varying types. Sometimes the different sources disagree and the emphases vary by the nature of the helping agency. What is a test administrator to do?
This webinar focuses on administering tests in schools and identifies ten "best practices" that apply to all high stakes testing. The content is drawn from careful analyses of current testing practices by states, districts, and testing vendors.
To be an effective test administrator, you will need to read the background materials about each testing program and attend any training that is provided. If you also follow the guidelines presented in this webinar, you will be in a very good position to promote fairness and validity in each of the programs for which you share responsibility.
In this webinar, you will learn:
* Ten Best Practices that apply to all high stakes testing
* What is required to be an effective test administrator
* How to promote fairness and validity in your testing programs
Sponsored by the National Association of Assessment Directors and Caveon Consulting Services, Caveon Test Security
Caveon Webinar Series - The Art of Test Security - Know Thy Enemy - November ... | Caveon Test Security
As Sun Tzu famously said, "If you know your enemy as you know yourself, you need not fear 100 battles." On the battlefield of security, whether home security, airport security, or test security, the first step to success is knowing the threats.
Are you worried about tests being stolen and shared online? Or test takers cheating by being coached by an expert? If so, the steps to successfully protecting your test and triumphing over these fears include:
• conducting a risk assessment
• determining (and ranking) which threats pose the greatest risk
• strategizing how to render those threats impotent
• determining the right combination of prevention, detection and deterrence tactics for your program
This webinar will teach you to conquer the steps in this test security process. Join Caveon CEO David Foster to learn how to analyze and rank the threats that are specific to your program. You will also discover the three solutions necessary to counter any and all of these threats.
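The risk-assessment and ranking steps above are often operationalized with a simple likelihood-times-impact score. The threat names and the 1-5 scores below are hypothetical placeholders, not Caveon's actual ratings.

```python
# Hypothetical risk register: threat -> (likelihood 1-5, impact 1-5).
threats = {
    "item theft / online exposure": (4, 5),
    "proxy test taking":            (3, 4),
    "answer copying":               (4, 2),
}

# Rank threats by a simple likelihood x impact score (one common convention).
ranked = sorted(
    threats.items(),
    key=lambda kv: kv[1][0] * kv[1][1],
    reverse=True,
)
for name, (likelihood, impact) in ranked:
    print(f"{name}: risk score {likelihood * impact}")
```

The highest-scoring threats are the ones to address first with the prevention, detection, and deterrence tactics the webinar describes.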
Caveon Webinar Series - Four Steps to Effective Investigations in School Dis... | Caveon Test Security
Now that spring test administrations are almost over, K-12 districts and schools can breathe a sigh of relief. Weeks of vigilance have paid off with a smooth, incident-free test administration. Not your district? You’re not alone. No matter the extent of planning, training, and oversight, there are always unforeseen events that result in testing irregularities. Most will be straightforward and covered by standard policies and procedures. But some incidents may set off your internal alarms. By themselves, these reports are only single data points and need to be explored to determine the larger context and what really happened. This webinar will provide information on:
• How to develop a plan for responding to test irregularity reports, and
• How to carry out investigations if additional information is needed.
The session is free, and will only last 30 minutes. Space is limited, so register today! We look forward to seeing you on May 18th!
If you missed the first two sessions, you can still view them. And, if you can't attend on May 18, go ahead and register anyway and we will automatically send you the recording and slides after the session.
Caveon Webinar Series - On-site Monitoring in Districts 0317 | Caveon Test Security
Are you sure that school leaders and educators are following your state and local assessment policies and procedures during the administration of assessments?
On-site monitoring of assessment administrations at schools and in classrooms is an effective quality assurance measure that:
• ensures compliance with standardized policies and procedures
• helps identify the greatest areas of vulnerability in your assessment administration processes
• creates opportunities to improve training, and
• clarifies messaging about assessments for school leaders and educators.
Finally, LEA-sponsored monitoring demonstrates a strong commitment to the integrity of assessments and the important decisions made based upon assessment results.
By attending this webinar, you will gain exposure to:
1) the goals and purposes of monitoring,
2) best-practice monitoring activities during assessment administrations,
3) evaluating data from monitoring reports,
4) potential outcomes from monitoring and
5) first steps in implementing a monitoring program.
Caveon Test Security, the industry leader in providing security solutions for protecting high-stakes, K-12 assessments, is pleased to announce the first webinar in a series of 3, focused on test security challenges faced specifically by districts.
Session #1: Avoiding A School District Test Cheating Scandal:
A Tale of Two Cities
January 25, 2017, 12:00 p.m. ET
As a number of U.S. school districts have learned, mishandling of cheating incidents on tests, particularly state assessments, can have very negative and pervasive effects. This webinar reviews two examples of actual test cheating situations in school districts, contrasts how they were handled, and lays out practical and "battle-tested" strategies for avoiding and, if necessary, coping with test cheating events. Having a strong security plan and acting wisely and decisively when you see signs of trouble can be a very productive approach. This webinar will give you tools to manage a test cheating incident if you have a suspected or confirmed report of cheating.
Caveon Webinar Series - Discrete Option Multiple Choice: A Revolution in Te... | Caveon Test Security
High-stakes testing faces major changes due to the use of computers and other technology in test administration. Some such changes include new test designs (such as computerized adaptive testing), proctoring tests online, and even administering tests on tablets and smartphones to improve test taker convenience. One of the most important changes is innovative new item types that better measure important skills. The Discrete Option Multiple Choice item type, or DOMC, is one of these ground-breaking new item types.
The DOMC item has the potential to revolutionize testing. It brings significant benefits in security, quality of measurement, fairness, test development, and test administration.
Caveon Webinar Series - Test Cheaters Say the Darnedest Things! - 072016 | Caveon Test Security
You won't believe what's actually happened in the world of testing!
What goes on in the mind of a would-be test cheater? While cheating is a serious offense, some test takers go to great (and sometimes comical) lengths to try to gain an unfair advantage and achieve a successful testing outcome.
Join us as we look at some of the most memorable proctor/test taker cheating encounters. Our special guest, Jarret Dyer, of the College of DuPage Testing Center, has created a compilation of test proctor stories from testing centers around the United States and across the globe. Jarret will share his 'best of' stories, while Caveon's John Fremer will discuss the consequences of not following the right test security processes and procedures. You don't want to miss this fun, yet informative session! To listen to the recording that goes along with these slides, go to https://youtu.be/r-CCaDf7NEk
Caveon Webinar Series - The Test Security Framework: Why Different Tests Nee... | Caveon Test Security
The need for global workforce skills credentials continues to grow. At the same time, the global workforce is shrinking. It is imperative that skill recognition be accurate and the level of test security be appropriate for the skills being assessed. The Security subcommittee of the new Workforce Skills Credentialing division of ATP created a new test security framework that will provide guidance to testing organizations when selecting the level of security needed for their assessments.
Join our guest presenters, Rachel Schoenig and Jennifer Geraets of ACT, as they discuss the challenge of identifying global workforce skills and how this new test security framework will help to align the expectations of those involved with workforce credentialing (e.g., test publishers, examinees, and employers). Rachel and Jennifer will also provide a call to action, requesting your comments on this new framework.
Caveon Webinar Series - Conducting Test Security Investigations in School Di... | Caveon Test Security
In the coming weeks, schools all over the country will be administering standardized exams to millions of students. And inevitably, test security incidents will arise, many of which may directly impact test score validity. Is your team prepared to answer the following tough questions?
• What will you do if you find yourself in a position of having to respond to an incident or breach in your state or district?
• What process will you follow?
• What is your incident escalation plan?
• How will you communicate with internal and external stakeholders?
• Most importantly, how will you discover the truth of what did or did not occur, and its impact on test scores?
Join Caveon’s test security experts for an important, hour-long webinar to help you understand the steps to take when challenging situations arise. We will share:
• Recent experiences other districts have had with possible cheating, and what they have done to resolve their concerns
• Information and tools for you to arm yourself before an issue arises, and to help you be better equipped to deal effectively and efficiently
• Essential tips you need to know when invoking a Security Incident Response Plan, and further conducting a security investigation
Caveon Webinar Series - Creating Your Test Security Game Plan - March 2016 | Caveon Test Security
History has shown that as stakes rise for testing programs, so do threats to the validity of the program's test results. There are stories in the media almost daily about high-stakes programs suffering at the hands of those intent on obtaining test content for illicit purposes. Having a game plan in place before a threat or validity issue occurs is vital. This month's webinar will focus on key steps your organization can take to maximize your protection from test fraud and stay one step ahead of the game.
Caveon Webinar Series - Mastering the US DOE Test Security Requirements Janua... | Caveon Test Security
The U.S. Department of Education recently issued the Peer Review of State Assessment Systems, which includes a required "Critical Element" on Test Security. To fulfill this requirement, States must submit documentation of policies and procedures in four categories of test security: prevention, detection, remediation, and investigation.
It is up to each State to determine which steps to implement and what evidence to submit to prove they have met each of these requirements. Evidence could, and should, include a myriad of test security measures ranging from Security Handbooks and annual proctor training, to data forensics and web monitoring procedures (and everything in between).
Caveon can help guide you through this complicated process. In the upcoming session, our test security experts will unpack the requirements of this section of the Peer Review process. The goal is to help you form a road map moving forward, provide information on the best practices for protecting your assessments, and outline resources to streamline the process.
Caveon Webinar Series - Will the Real Cloned Item Please Stand Up? final | Caveon Test Security
Join us for this month's webinar on the ins and outs of developing item clones. While many of us are aware of the benefits cloning can provide, such as expanding an item bank, lengthening the shelf life of an exam, or deterring and detecting cheating, questions remain regarding the best practices for implementation. Secure exam development experts will address the question, "How do we know, during development, when an item has been sufficiently altered, making it a 'real clone' and not just an 'imitator' of a clone?" The answer isn't as clear-cut as it would seem.
Additional topics will include:
• General information on cloning
• Lessons learned from the field
• Creative ideas for streamlining cloning processes
This webinar will help assessment and program managers be better positioned to put on their cloning lab coats and reap the rewards of this best practice in test security.
Caveon Webinar Series - Lessons Learned at the 2015 National Conference on S... | Caveon Test Security
The National Conference on Student Assessment (NCSA) was held last month in San Diego, and Caveon was there. This month's webinar will focus on lessons learned at the conference regarding test security, and what's happening in the state assessment arena in terms of test security right now.
Caveon's Steve Addicott and Jamie Mulkey will be joined by special guest Walt Drane, State Assessment Director, Mississippi Department of Education. The panelists will summarize the test security trends and strategies that they drew from the conference, and also share key points from sessions they presented.
Caveon Webinar Series - Learning and Teaching Best Practices in Test Security... | Caveon Test Security
Test security has been emerging as a cohesive discipline for the past ten years. There are no college courses that teach test security. And, even if there were, many practitioners don't have time to take those classes. How do you stay abreast of current developments? How do you train your staff in latest best practices if you don't know about them? Are there resources out there, and how do you find them?
In this webinar, Caveon will host several special guest practitioners from various industries. These test security veterans have had to answer these very questions. They will address how continuing education will help you improve test security in your organization.
The French Revolution, which began in 1789, was a period of radical social and political upheaval in France. It marked the decline of absolute monarchies, the rise of secular and democratic republics, and the eventual rise of Napoleon Bonaparte. This revolutionary period is crucial in understanding the transition from feudalism to modernity in Europe.
For more information, visit www.vavaclasses.com
2024.06.01 Introducing a competency framework for language learning materials ... | Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
This is a presentation by Dada Robert in a Your Skill Boost masterclass organised by the Excellence Foundation for South Sudan (EFSS) on Saturday, the 25th and Sunday, the 26th of May 2024.
He discussed the concept of quality improvement, emphasizing its applicability to various aspects of life, including personal, project, and program improvements. He defined quality as doing the right thing at the right time in the right way to achieve the best possible results and discussed the concept of the "gap" between what we know and what we do, and how this gap represents the areas we need to improve. He explained the scientific approach to quality improvement, which involves systematic performance analysis, testing and learning, and implementing change ideas. He also highlighted the importance of client focus and a team approach to quality improvement.
Students, digital devices and success - Andreas Schleicher - 27 May 2024.pptx | EduSkills OECD
Andreas Schleicher presents at the OECD webinar ‘Digital devices in schools: detrimental distraction or secret to success?’ on 27 May 2024. The presentation was based on findings from PISA 2022 results and the webinar helped launch the PISA in Focus ‘Managing screen time: How to protect and equip students against distraction’ https://www.oecd-ilibrary.org/education/managing-screen-time_7c225af4-en and the OECD Education Policy Perspective ‘Students, digital devices and success’ can be found here - https://oe.cd/il/5yV
Instructions for Submissions through G-Classroom.pptx | Jheel Barad
This presentation provides a briefing on how to upload submissions and documents in Google Classroom. It was prepared as part of an orientation for new Sainik School in-service teacher trainees. As a training officer, my goal is to ensure that you are comfortable and proficient with this essential tool for managing assignments and fostering student engagement.
Operation "Blue Star" is the only event in the history of independent India where the state went to war with its own people. Even after about 40 years, it is not clear whether it was the culmination of the state's anger toward the people of the region, a political game of power, or the start of a dictatorial chapter in the democratic setup.
The people of Punjab felt alienated from the mainstream due to the denial of their just demands during a long democratic struggle since independence. As has happened all over the world, this led to a militant struggle with great loss of life among military, police, and civilian personnel. The killing of Indira Gandhi and the massacre of innocent Sikhs in Delhi and other Indian cities were also associated with this movement.
Model Attribute Check Company Auto Property | Celine George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
Palestine last event orientationfvgnh.pptx | RaedMohamed3
An EFL lesson about the current events in Palestine. It is intended to be for intermediate students who wish to increase their listening skills through a short lesson in power point.
Caveon Webinar Series - Standard Setting for the 21st Century: Using Information Integration Theory to Produce Cut Scores - August 2014
1. Caveon Webinar Series: Applying Information Integration Theory to Setting Cutscores and Other Tasks
David Foster, CEO, Caveon
August 20, 2014
2. My Personal Issues with Current Cutscore Methods
1. There are too many methods/variations, perhaps hundreds. Why is that?
2. The cutscore point seems almost pre-determined.
3. The methods try to direct and conform judgments (e.g., adding item statistics).
4. There is no check on the consistency and quality of the judgments made.
5. The rating task is difficult to do.
6. There is a lack of confidence in the cutscore.
3. So, What Is the Point?
• Why propose another method of setting cutscores?
– To perhaps solve many of the issues above
– For added value: IIT can apply to other "judgment" tasks in testing
• Introducing Information Integration Theory, or IIT, borrowed from the Cognitive Sciences
– 50+ years of theoretical and scientific support
4. Reference Material
• Contributions to Information Integration Theory, Volume I: Cognition. Edited by Norman H. Anderson (2009).
• Foundations of Information Integration Theory by Norman H. Anderson (1981).
• Methods of Information Integration Theory by Norman H. Anderson (1982).
5. IIT: How Is Information Integrated?
[Slide image: 3 fruits and 2 dips]
6. Poll
Of the 6 combinations given, which is your most preferred pairing of a dip and a fruit?
• Chocolate and strawberry
• Chocolate and apple slice
• Chocolate and orange slice
• Caramel and strawberry
• Caramel and apple slice
• Caramel and orange slice
7. Poll Results
• From the poll data:
– There are differences in your top choice, which is normal for food preference ratings
– MORE IMPORTANTLY, you were able to combine or integrate the information quickly, imagine the taste of the combinations, rate the combinations, and make your top pick
8. Much of What We Do Is Integrating
Information and Making Judgments
• Choosing a vacation place
• Buying a car
• Leaving a job for a better one
• Choosing a mate
• Voting
• Picking foods to eat
• …and everything else we do
We are constantly integrating various pieces of
information, then judging, rating, and eventually
deciding and acting based on the integrated
value.
How we do the cognitive part of
these tasks is explained by IIT.
9. Schematic of IIT
[Schematic of the IIT process (source: Wikipedia); the internal valuation and integration stages are not directly observable.]
Basic Cognitive Algebra Models:
ADDITIVE AND MULTIPLICATIVE
10. Cognitive Algebra:
ADDITIVE MODEL Examples
• Individuals are adding the stimuli before
judging
• Produces parallelism when charted
Statesmanship rated after
reading two biographical
paragraphs
Cookie size evaluated by
5-year-olds given length
and width
11. Cognitive Algebra:
MULTIPLICATIVE MODEL Examples
• Individuals are multiplying the stimuli
before judging
• Produces linear fan when charted
Value of a lottery ticket
given odds of winning
and value of the ticket
Rating of likeableness
given adjective and
adverb
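Both integration rules can be illustrated with a quick sketch (all stimulus values here are invented for illustration, not taken from any study): under an additive rule the gap between any two rows is the same at every column, which charts as parallel lines, while under a multiplicative rule the gap grows with the column value, which charts as a linear fan.

```python
# Sketch of the two cognitive-algebra models (hypothetical stimulus values).
# Rows: one factor (e.g., adverb intensity); columns: the other (e.g., adjective value).

row_vals = [1, 2, 3]
col_vals = [2, 4, 6]

additive = [[r + c for c in col_vals] for r in row_vals]
multiplicative = [[r * c for c in col_vals] for r in row_vals]

# Additive rule: the gap between any two rows is constant across columns
# -> parallel lines when charted.
gaps_add = [additive[1][j] - additive[0][j] for j in range(3)]
print(gaps_add)  # [1, 1, 1] -- constant

# Multiplicative rule: the gap widens as the column value grows
# -> a "linear fan" when charted.
gaps_mul = [multiplicative[1][j] - multiplicative[0][j] for j in range(3)]
print(gaps_mul)  # [2, 4, 6] -- diverging
```

The constant-versus-diverging gaps are exactly the visual test used on the preceding slides: parallelism signals an additive model, a fan signals a multiplicative one.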
12. Not Just Humans
Research I conducted in 1976 using pigeons
Information integrated:
Type of food
Amount of work to obtain the food
14. Mid-Webinar Summary of IIT Benefits
for Judgment Tasks in Testing
• Easy visual evaluation of overall ratings
and individual raters
• Better understanding of the judgment
process
• Production of results (e.g., item difficulty
ratings) on interval-level scales
• Quantitative comparison of performance
levels
• Practical benefits: Quicker, easier, less
expensive
15. Item Judgment Exercise
You were asked to go to a Caveon site
and provide a rating of the difficulty of
3 math questions for students who had
completed the 2nd and 10th grades.
Information that was integrated:
A. Test item content (3 items)
B. Student performance level (2 grade
levels)
19. Evaluation of Individual Raters
Here are the results for Rater #21,
who either didn’t try, didn’t
understand the task, or simply
answered randomly.
His results were removed from the
analysis.
21. ANOVA Results for IIT Data
Factors                         F Score   Probability
Items                           208.48    6.70 × 10^-35
Proficiency Levels (Grades)     483.97    4.71 × 10^-26
Items × Proficiency Levels      26.93     6.21 × 10^-10
Confirms the multiplicative model
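It is the significant Items × Proficiency interaction that confirms the multiplicative model over the additive one. A minimal sketch of the interaction residual the ANOVA tests (the cell means below are toy numbers built as item value × grade value, not the webinar's data):

```python
# Toy cell means for 3 items x 2 proficiency levels (hypothetical, not the study data).
# Under a purely additive model, every interaction residual
# (cell - row mean - column mean + grand mean) is zero;
# a multiplicative model leaves nonzero residuals, which is what a
# significant Items x Proficiency interaction in the ANOVA detects.

cells = [[2.0, 6.0],    # item 1 rated at 2nd and 10th grade
         [3.0, 9.0],    # item 2
         [4.0, 12.0]]   # item 3 (each row = item value x grade value)

n_rows, n_cols = len(cells), len(cells[0])
grand = sum(sum(row) for row in cells) / (n_rows * n_cols)
row_means = [sum(row) / n_cols for row in cells]
col_means = [sum(cells[i][j] for i in range(n_rows)) / n_rows
             for j in range(n_cols)]

residuals = [[cells[i][j] - row_means[i] - col_means[j] + grand
              for j in range(n_cols)]
             for i in range(n_rows)]

# Nonzero residuals flag a multiplicative (interactive) integration rule.
print(residuals)  # [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]]
```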
24. So, What Can We Do with
These Results?
Whether the model is ADDITIVE or
MULTIPLICATIVE, interpreting the results is the
same:
1. A model is confirmed.
2. Raters performed the task consistently and
properly.
3. Marginal means of item ratings can be used
as difficulty estimates on an interval scale.
4. Marginal means of performance level ratings
can be used for setting cutscores or other
purposes.
25. How to Set a Cutscore using IIT
At this point, the process is not very
different from what occurs with other
methods.
It is always a challenge to get from
ratings or judgment data to a
corresponding value on the score
scale.
26. Use Mean Ratings of Items for Each
Proficiency Level
• 2nd Grade = 4.95
– Average Difficulty Rating of 15.05
– Subtract from 20 to reverse the scale
• 10th Grade = 15.47
– Average Difficulty Rating of 4.53
– Subtract from 20 to reverse the scale
Remember that these are
cutscores based on the IIT
rating scale of 0 - 20
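The arithmetic above can be sketched directly, using the slide's own numbers; the 0-20 difficulty scale is reversed so that a higher cutscore corresponds to a higher proficiency level:

```python
# Reverse the 0-20 IIT difficulty scale to get an IIT-scale cutscore per level.
SCALE_MAX = 20

def iit_cutscore(mean_difficulty_rating):
    """Convert a proficiency level's mean item-difficulty rating
    to a cutscore on the reversed 0-20 IIT scale."""
    return SCALE_MAX - mean_difficulty_rating

print(round(iit_cutscore(15.05), 2))  # 2nd grade  -> 4.95
print(round(iit_cutscore(4.53), 2))   # 10th grade -> 15.47
```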
27. Graphical Display of IIT Cutscores
Cutscore for 2nd Grade = 4.95 (20 - avg rating of 15.05)
Cutscore for 10th Grade = 15.47 (20 - avg rating of 4.53)
28. One Conceptual Process for Converting
IIT Ratings to a Score Scale
For a particular IIT ratings-based cutscore,
how many items (or what % of items) have
IIT difficulty ratings below that IIT cutscore?
That number (or %) becomes an equivalent
cutscore on the score scale.
There will likely need to be some adjustments
for error.
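The conceptual conversion just described amounts to a simple count against the item pool. A sketch (the 100-item pool is invented for illustration, as it is on the slides that follow):

```python
import random

# Hypothetical pool of 100 mean item-difficulty ratings on the 0-20 IIT scale
# (invented for illustration; the webinar's pilot used only 3 real items).
random.seed(7)
item_ratings = [random.uniform(0, 20) for _ in range(100)]

def score_scale_cutscore(iit_cut, ratings):
    """Count items whose IIT difficulty rating falls below the IIT cutscore.

    On a 100-item test scored number-correct, that count (equivalently, the
    percent of items) serves as the cutscore on the score scale; some
    adjustment for error would still be needed.
    """
    return sum(1 for r in ratings if r < iit_cut)

print(score_scale_cutscore(4.95, item_ratings))   # 2nd-grade cut: items easier than 4.95
print(score_scale_cutscore(15.47, item_ratings))  # 10th-grade cut: items easier than 15.47
```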
29. Converting IIT Ratings to Score Scale:
Number of Items
[Chart: cumulative frequency distribution of mean ratings for a hypothetical pool of 100 items (instead of only 3). The 10th-grade IIT cutscore of 15.47 intersects the curve at about 80 items.]
30. Converting IIT Ratings to Score Scale:
Number of Items
[Chart: the cumulative frequency distribution of the 100 hypothetical items; the 2nd-grade IIT cutscore of 4.95 intersects the curve at about 7 items.]
31. Other Applications of IIT in Testing
• Besides determining cutscores, where
else do we require ratings or
judgments?
– Item accuracy reviews
– Essay scoring
– Bias reviews (gender, race, age, etc.)
– Item quality (e.g., alignment with
objectives)
– Others?
32. Thank you!
Dr. David Foster
CEO, Caveon Test Security
David.foster@caveon.com
Follow Caveon on Twitter @caveon
Check out our blog: www.caveon.com/blog
LinkedIn Group: “Caveon Test Security”
Editor's Notes
What about other title? Standard Setting for the 21st Century: Using Information Integration Theory to Produce Cut Scores
I will use the acronym IIT throughout the presentation.
These are my personal observations and concerns.
A lot of tinkering and customization goes on; attempts to fine tune.
We have joked about this often.
Sometimes we give statistics that take the place of having to actually review or judge the item.
To be fair, there are some rater reliability statistics, and I’m sure that some raters have been dismissed for various reasons.
Deciding how many individuals from a particular performance level will answer a question correctly is not easy to do.
These problems, and others, have given me a lingering concern about how close the actual cutscore is to some “true” point.
I had already complained about the number of methods.
IIT is new; it hasn’t been tried out in the kind of judgments used for cutscore determination.
IIT may have the foundation that other methods have lacked; what can it hurt to “borrow” a solid method from Cognitive Science?
If successful at helping us to set cutscores, perhaps it can be useful in other areas where we use human judgments.
To understand the basic principle behind IIT, let’s each consider this example.
It is common to combine fruits and dips. We’ve all likely tasted chocolate strawberries or caramel apples. This example illustrates how we combine the values of the fruits with the values we place on the dips to create a unique experience.
IF TIME: Imagine that one of the dips were “motor oil”, how would we rate it combined with these fruits or any fruit?
We are going to have a poll question that asks you to pick your most preferred combination. Because of a technology constraint I can only offer you 5 of the 6 combinations. I dropped the caramel/orange slice combination. Sorry about that.
I expect that had you ranked or rated them all, we would see even greater differences. This is normal for food preferences in humans.
Some combinations were not popular and were obviously rated lower.
There may be some agreement among raters as well.
So how much of what we do in life requires this kind of “integration”?
Perhaps ALL of what we do. Every decision occurs in a context.
So, how does IIT provide a route to understanding our judgments?
Here is a general schematic of the process.
Stimuli (S) in the real world are valued by us due to experience, training, etc., (s). Knowing how this part of the process works is not important for understanding IIT.
Example: How we value a strawberry is personal and comes from experience.
Next comes the Integration function: How do we combine the values of individual stimuli together to get an overall value or I? More on this in a minute.
The amount of I will lead to a response (or choice), which we then perform (R).
STEVE: click at this point to bring up the Cognitive Algebra graphic/text.
There are many ways we can integrate information, and two of the most common ones are:
CLICK: We can add them together, or
CLICK: We multiply them together
Let me give some examples of each.
LEFT: Adults evaluated presidents on statesmanship based on paragraphs of biographical information. (Positive + Negative)
RIGHT: Five-year-old children judged value of a cookie given height and width of the cookie. Should have been multiplicative rule, but was additive instead. (Height + Width, not Height x Width)
CLICK. If additive models are used, then the chart will show parallel straight lines. Here is one example where adults rated Presidential statesmanship.
CLICK. A second example uses 5-year-olds as subjects and had them rate the size of the cookie when given the height and width. Surprisingly, they used the additive model where the multiplicative model would have been more accurate.
LEFT: Adults judged value of lottery ticket given odds of winning and amount (odds X amount)
RIGHT: Adults judged likeableness of a person described by an adverb-adjective phrase (adjective X adverb)
Multiplicative: (Type of Food X Work Schedule)
All pigeons showed similar linear fan results (all were using the multiplicative model).
Each had different preferences for foods, similar to humans, AND demonstrated that preference consistently across work schedules.
POINT: Individual results are meaningful and can be evaluated.
Support for last point: no need to travel; no need for meetings; no discussion of items; no supplying of additional item data
Will likely double or triple the number of ratings to be provided, but still takes less time
IMPORTANT to REMEMBER: In IIT, integrated items are usually presented randomly.
47 as of yesterday had completed the rating of the items. Thank you and I hope you had very little difficulty completing the ratings.
XX as of today.
It’s amazing how many of you were able to do it properly without very good instructions and with new technology. My hat is off to you.
Here are a few individual ratings.
I removed several participants from the analysis for:
Incomplete data
Unusual data (one is shown)
This was an easy task to do and illustrates the need to provide proper instructions more than anything else.
You can see the multiplicative effect in the data.
I had expected an additive model, so either the effect is real, or I introduced some artificial effects:
Strange trio of items
Lousy/brief instructions
Use of non- or almost subject matter experts
We could be seeing a little influence of floor or ceiling effects based on how I set the study up.
Notice how small the probabilities are. Both the main effects and the interaction effects are significant. The significant interaction effect confirms the multiplicative model.
Before moving on to what we can do with these results, I want to show you two examples of programs that used the IIT method to rate items. I don’t have the statistical results, but I do have the graphs.
Certification Test
Data presented at ATP in 2013
Scale is reversed
Shows consistency of ratings using IIT. Shows additive model
Lower-level Nursing exam
Data presented at ATP in 2013
Scale is reversed
Shows the consistency of being able to rate the same items at different proficiency levels. Shows additive model
We completed a “fairly” successful IIT rating study. Now what can we do with the results?
Here is one possible way. There are surely others.
Some “art” and “logic” are applied.
These are not cutscores on the score scale, which could be number correct, a percent correct, or some other scale.
We need to transform these as well as we can.
Here is one way to do it.
Cutscore for 2nd grade is the means of the item ratings for that grade.
We can show this graphically.
I’ve expanded the number of items.
Impossible to illustrate with only 3 items
So, I invented 97 more items
We intend to use the same test for both 2nd and 10th graders.
Conceptual Method
Conceptually, how many items on the test have overall difficulty ratings lower than the ratings-based cutscore?
Create a cumulative frequency distribution (number of items with marginal means at each rating level)
Take Rating cut score vertically to intersect the distribution line
Draw a line horizontally until you have intersected the ordinate.
If you have a range in your rating cutscore you can apply that range as well.
Some of you likely can come up with a more exact method.