Do you have concerns about test security and cheating? Here is a very brief primer on statistical detection of test fraud and related issues. It includes some suggestions for deeper resources. You are also welcome to check out the resources at www.assess.com.
Do you need dissertation or PhD thesis help with things surrounding your committees, defense or Vita, or publishing? These are the Doctoral EndGames and this set of slides was produced for DoctoralNet's first public hangout on 15 Sept 2013
I consider whether we as testers can be too closed-minded in our attitudes; whether there are schools of thought or approaches that, even if we care deeply about context, we are very unlikely even to consider; and whether we sometimes favour our reputation over giving ourselves the chance to do the best job that we can.
From CEWT#2, http://cewtblog.blogspot.co.uk/2016/02/cewt-2-abstracts.html
Test Fest: Catching up on Your Usability Testing Backlog, by Sarah Joy Arnold
Presentation at North Carolina Librarians' Association Biennial 2017 in Winston-Salem, NC. Part of "So Many Users, Not Enough Time: Large Scale Usability Testing Methods" with Chad Haefele and Scott Goldstein.
This presentation will discuss the process that Appalachian State University Libraries used to measure and test website usability during its recent redesign and migration to a new Drupal theme. It will emphasize how we recruited a large number of users and how large sample sizes promote better design decisions. While web usability research is well known for its flexibility in needing only about a dozen users to discover most problems, robust data-driven decisions are best supported by datasets large enough to yield statistically meaningful results. Attendees will learn techniques for surveying and testing more users without greatly compromising the richness of data collected.
At UNC-Chapel Hill, the User Experience and Assessment department regularly runs usability tests to inform our decision-making and prioritize our users' perspective as we make changes. But there are more things to test than there are hours in the day. Our projects have a variety of stakeholders who are very interested in improving their services, and we found ourselves with a long list of tests we wanted to run. To catch up, we adapted Harvard Libraries' Test Fest model: five tests run simultaneously, with five participants rotating through the set of tests. Over a span of two hours, we completed 25 individual usability tests. In this one event, we caught up on much of our testing backlog. This session will outline how we planned and executed Test Fest, how we recruited participants, and what we learned from using this approach. We'll also discuss our methodologies and briefly look at the results of each test.
How to do qualitative analysis: In theory and practice, by Heather Ford
These slides are from a recent workshop for Honours students and researchers at UTS's School of Communication. Not pictured are the examples from my own research that I used to illustrate concepts. Hopefully I will be able to make a prettier version soon.
These slides are specific PhD thesis help for a talk I gave at Dublin City University on 15 May 2014. They should be helpful for anyone in a European context about to submit their final thesis before the viva.
EuroSTAR Software Testing Conference 2012 presentation on Curing Our Binary Disease by Rekard Edgren.
See more at: http://conference.eurostarsoftwaretesting.com/past-presentations/
Survey Methodology and Questionnaire Design Theory Part I, by Qualtrics
Do you know what's going on in your respondents' heads as they take your survey? How can you design your questionnaire to collect better data? Understanding the answers to these questions can help you design surveys that collect high quality insights you can depend on.
Dave Vannette, principal research scientist at Qualtrics, shares his best hacks for designing surveys that will help you get quality data. In this presentation, Dave also highlights what your respondents are thinking when they take your surveys, and how your survey design can affect the responses you collect.
CUE Forum presented at JALT 2008 (Tokyo, Japan). Gives an overview of research design issues for Second Language Acquisition. For further details, visit jaltcue-sig.org
In this talk, we’ll look at the process of designing a research methodology. Is it better to stick to the safety of the lab, or to broaden our horizons? And how can we convince colleagues and stakeholders to buy into the decision? We’ll introduce a set of principles and a thinking tool to help you weigh up and justify your approach.
Intimidated by conducting your own usability study? This session will give you the tools you need to conduct effective usability tests whether your participants are in the room or in a different country. The session includes practical techniques to successfully plan, prepare, and conduct your test and activities to help you become more confident with the entire process of usability testing. Finally, you’ll get tips on how to get the most useful results from your study.
Participants will also learn about:
Testing protocols
Types of usability testing and required vs. optional resources
Recruiting and scheduling usability tests
Non-disclosure and consent forms and their purposes
Pilot testing
Techniques for interacting with test participants
Current usability testing issues of interest (e.g. testing internationally, moderated vs. un-moderated, etc.)
Presentation given at the CASE Communications, Marketing & Technology Conference in Boston on April 15, 2009.
Learn the tools of the trade for do-it-yourself research for little or no money. This session will teach you how to conduct focus groups, surveys, usability tests and more.
Good items are the basic building blocks of any good test or assessment. This presentation covers best practices in developing high-quality items for better psychometrics.
Ever wonder how a cutscore is set on a certification/licensure test? This is a very brief intro to the topic of standard setting, that is, how cutscores (passing points) are set on credentialing exams using scientifically-backed research and rigorous psychometrics. Some approaches include the modified-Angoff, Bookmark, and Contrasting Groups. Visit www.assess.com to learn more.
More Related Content
Similar to Statistical detection of test fraud (data forensics) - where do I start?
This is the third of a series of powerpoints presented at a CAT/IRT workshop at the University of Brasilia in 2012. It provides an introduction to item response theory (IRT), discussing advanced topics like linking & equating, scaling, differential item functioning, polytomous models, and dimensionality. Learn more at www.assess.com.
Using Item Response Theory to Improve Assessment, by Nathan Thompson
This is the second of a series of powerpoints presented at a CAT/IRT workshop at the University of Brasilia in 2012. It provides a discussion on how IRT is applied to developing better assessments, including item and test information functions, standard error of measurement, and use of Xcalibre. Learn more at www.assess.com.
This is the first of a series of powerpoints presented at a CAT/IRT workshop at the University of Brasilia in 2012. It provides an introduction to item response theory (IRT), tying it to classical test theory and describing some of the major IRT models. Learn more at www.assess.com.
Leveraging Tech Enhanced Items without Sacrificing Psychometrics, by Nathan Thompson
This presentation discusses tech-enhanced items (TEIs), an important innovation in educational assessment, and how they do not always provide better measurement. Charles County Public Schools (MD) discuss PARCC assessments in their district and how they leverage the FastTest platform to accurately assess student growth. Visit www.assess.com to learn more.
So, you've heard about adaptive testing and are wondering what it takes to develop a valid one? This presentation is made for you. It outlines a 5-step process, starting with feasibility studies and business case evaluation. More info at www.assess.com and http://pareonline.net/getvn.asp?v=16&n=1.
Introduction to Computerized Adaptive Testing (CAT), by Nathan Thompson
These slides are from a short workshop I taught at the 2015 Conference for the International Association for Computerized Adaptive Testing (IACAT, www.iacat.org). Interested in CAT? I'd love to hear from you on LinkedIn, or visit www.assess.com to learn more.
2. Welcome!
These are some of the lessons I have learned while diving into the field.
Overview of the topic
Discuss resources
Save time and effort for anyone starting out
The purpose is NOT to be a full workshop on data forensics
3. Outline
History
Where do I start learning?
Resources
What are threats to test security?
How do I start deterring?
Deterrent solutions like weblock and remote proctoring
How do I start detecting?
Intro to data forensics
Software for detection
4. History
The literature dates back to before 1950
Many collusion indices
Most were descriptive or completely ad hoc
Notable exception: Frary, Tideman, and Watts (1977) – G2
The modern era started when Wollack adapted G2 to IRT
Other analyses have far less literature
6. Resources
In the past, if you wanted to learn:
1. Read all the original articles
2. Read reviews
• Bliss (2012) Covington Award – 25 indices
• Khalid, Mehmood, & Rehman (2011) – 20 indices
• Cizek 1997 book: good but little attention to forensics
• You still need all the originals.
UNTIL…
8. Overview of Security Threats
Major sources of issues:
Brain dump makers (harvesting)
Brain dump takers (preknowledge)
Specific location problems
Examinee collusion
Receiving help (teacher, proctor, outside)
Proxy testing
What is your list?
9. Harvesting
What: Steal your content and make it public
Why: Often (but not always) to make money
How: Memorization or images; brain dump sites
Deter: CAT/LOFT
Detect: Unusual responses & latencies; brain dump comparisons; Trojan Horses
Minimize: Frequent republishing
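The "Trojan Horse" detection idea above can be sketched in a few lines. The premise: items are deliberately seeded into leaked brain-dump content with wrong keyed answers, so an examinee who reproduces those answers at a high rate has likely studied the dump. Everything here (item IDs, responses, the 0.75 threshold) is invented for illustration; this is not the presenter's actual procedure.

```python
# Hypothetical sketch: flag examinees who match "Trojan Horse" answers.
# Trojan items carry deliberately wrong leaked keys, so a high match rate
# suggests brain-dump use. All data below is invented.

trojan_key = {"T1": "B", "T2": "D", "T3": "A", "T4": "C"}  # leaked (wrong) answers

responses = {
    "exam001": {"T1": "B", "T2": "D", "T3": "A", "T4": "C"},  # matches all 4
    "exam002": {"T1": "A", "T2": "D", "T3": "C", "T4": "B"},  # matches only 1
}

def trojan_match_rate(resp, key):
    """Proportion of Trojan items on which the response equals the leaked key."""
    hits = sum(1 for item, ans in key.items() if resp.get(item) == ans)
    return hits / len(key)

# Flag anyone matching most of the leaked keys; the cutoff is a policy choice.
flagged = {eid for eid, resp in responses.items()
           if trojan_match_rate(resp, trojan_key) >= 0.75}
```

In practice the match rate would feed into a broader review process rather than trigger an automatic sanction.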
10. Preknowledge
What: Knowing the questions and answers
Why: Easy pass
How: Brain dump sites (used to be word of mouth)
Deter: CAT/LOFT
Detect: High score, low time; brain dump comparisons; Trojan Horses
Minimize: Frequent republishing
11. Examinee Collusion
What: Copying
Why: More items correct
How: Individual or group effort
Deter: CAT/LOFT, multiple forms, proctors
Detect: Collusion indices, group rollups
Minimize: CAT/LOFT, multiple forms
12. Receiving help
What: Teacher, proctor, or outside aid
Why: More items correct; often benefits the aider
How: Individual or group effort
Deter: CAT/LOFT, multiple forms, proctors
Detect: Collusion indices, group rollups, erasure
Minimize: CAT/LOFT, multiple forms, TEIs, performance tests
14. Many options
User roles in test development
Limit access to test content during delivery
Verify identity of examinee
Test window date/time
Test location (IP addresses)
Lockdown browser
Proctor/Examinee authentication
Biometrics for ID
Proctor training
17. It’s a Hypothesis Test!
First step: Identify the threats you are worried about and how you think they would present themselves in the data
18. It’s a Hypothesis Test!
Independent variables:
Test centers/locations
Countries
Training programs
Test forms
Individuals
19. It’s a Hypothesis Test!
Dependent variables:
Item response or test time
Item statistics
Test statistics (mean/SD, pass rate)
Person statistics (intra-individual)
Collusion indices
20. It’s a Hypothesis Test!
If you aim at nothing, that’s exactly what you’ll hit.
21. It’s a Hypothesis Test!
Example: Teachers helping kids
Item statistics different than other teachers
Collusion indices
Relatively high scores with relatively short time – bivariate plot?
Item latencies different than other teachers
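One simple way to operationalize "item statistics different than other teachers" is a leave-one-out comparison: compute each teacher's class mean and ask how many standard deviations it sits from the distribution of the other teachers' means. The sketch below uses invented class means and an illustrative z > 3 cutoff; real programs would condition on student ability and use more robust statistics.

```python
# Sketch (invented data): compare each teacher's class mean score against
# the distribution of all OTHER teachers' class means (leave-one-out z).
from statistics import mean, stdev

class_means = {"teacherA": 61.0, "teacherB": 63.5, "teacherC": 59.8,
               "teacherD": 88.2, "teacherE": 62.1}  # hypothetical means

def peer_z(target, groups):
    """z-score of one group's mean relative to its peers (target excluded)."""
    peers = [v for k, v in groups.items() if k != target]
    return (groups[target] - mean(peers)) / stdev(peers)

# z > 3 is an illustrative review threshold, not an established standard.
suspect = [t for t in class_means if peer_z(t, class_means) > 3.0]
```

Excluding the target from its own peer group matters: a single extreme class otherwise inflates the reference mean and variance and masks itself.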
22. It’s a Hypothesis Test!
Example: Brain dump users
Collusion indices
Responses on Trojan Horses
Relatively high scores with relatively short time
Item latencies
Group level not likely (could be at any test center)
23. Step 2: Determine your analysis
Time
High score, low time: Preknowledge or aid
Low score, high time: Harvester
Response patterns
Person fit
Score gains
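The two time-based flags ("high score, low time" and "low score, high time") amount to looking at opposite corners of a score-by-time bivariate plot. A minimal sketch, with invented score/time data and an illustrative ±1 z cutoff:

```python
# Minimal sketch of the time-based flags, using invented data.
# z-score each examinee's total score and total time against the group:
# high score + low time suggests preknowledge/aid; low score + high time
# suggests a harvester lingering to memorize items.
from statistics import mean, stdev

records = {            # examinee: (total score, total minutes) -- hypothetical
    "e1": (72, 55), "e2": (68, 60), "e3": (75, 58),
    "e4": (95, 20),   # very high score, very fast
    "e5": (40, 115),  # low score, very slow
    "e6": (70, 62),
}

scores = {k: v[0] for k, v in records.items()}
times = {k: v[1] for k, v in records.items()}

def zmap(d):
    """z-score every value in a dict against the group mean/SD."""
    m, s = mean(d.values()), stdev(d.values())
    return {k: (v - m) / s for k, v in d.items()}

zs, zt = zmap(scores), zmap(times)
# +/-1 SD is purely illustrative; operational cutoffs need calibration.
preknowledge = [k for k in records if zs[k] > 1 and zt[k] < -1]
harvesters = [k for k in records if zs[k] < -1 and zt[k] > 1]
```

With real data you would plot the two z-scores and inspect the corners rather than rely on a hard cutoff, since outliers inflate the group SD.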
24. Options for Detection
Intra-Individual
• Time/RTE (CBT only)
• Response patterns
• Score gains
• Person fit
Inter-Individual
• Collusion indices
• Erasure (paper only; also group level)
Group
• Roll-up of intra and inter
• Descriptive statistics
25. More on Collusion Indices
How is collusion quantified? Consider a 100-item test…
Error similarity – we both had 10 errors:
Same items?
Same responses on those items?
Response similarity:
We gave the same response on 50 items? 90?
Some indices are standardized/probabilistic (good)
Some are descriptive or non-probabilistic (bad)
Can vary in direction (one/two)
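To make the "probabilistic vs. descriptive" distinction concrete, here is a deliberately oversimplified probabilistic check: count identical responses between two examinees, then ask how surprising that count would be if agreements were independent coin flips with some chance-agreement probability. This is NOT one of the named indices (omega, G2, Zjk, etc.), which model the agreement probability far more carefully (e.g., via IRT); the independence assumption and the 0.4 agreement rate below are illustrative only.

```python
# Toy probabilistic response-similarity check (invented data, naive model).
# If any two examinees agree on an item with probability p independently,
# the number of identical responses is Binomial(n, p); a tiny upper-tail
# probability flags the pair for closer review.
from math import comb

def matches(resp_a, resp_b):
    """Count items answered identically by two examinees."""
    return sum(a == b for a, b in zip(resp_a, resp_b))

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

a = "ABCDA" * 20                  # 100 hypothetical responses
b = "ABCDA" * 18 + "BBCDB" * 2    # nearly identical response string
k = matches(a, b)
p_value = binom_upper_tail(k, len(a), 0.4)  # 0.4 = assumed chance agreement
```

The major weakness, flagged on the next slide, is the confound with ability: two strong examinees agree often simply because they both answer correctly, so serious indices condition the agreement probability on ability rather than fixing a single p.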
26. More on Collusion Indices
There are issues to consider when comparing:
ESA only looks at errors, ignoring the rest of the data
Major confound with ability: two examinees who each score 99/100 will get flagged as collusion! Therefore it is important to condition on ability
Some indices have no theoretical basis whatsoever
27. More about collusion
Indices can be grouped by type (probabilistic, descriptive, or ad hoc) and by what they compare:
Error similarity: B&B, EIC, EEIC, HH, HHJ
Response similarity: Wollack’s Omega, Wesolowsky Zjk, Frary et al. G2, RIC
28. More resources
ITC Guidelines on the Security of Tests, Examinations, and Other Assessments
TILSA Test Security Guidebook
Conference presentations/workshops (harder to find)
29. Software
Next step: Find software that meets your needs
Scrutiny!
S-check
R packages (CopyDetect)
SIFT
Integrity
Caveon
IRT software like IRTPRO or Xcalibre
30. Epilogue: Then what?
Define a pathway for investigation and actions
Joy Matthews-Lopez and Paul Jones
31. Examples (if time)
500 certification candidates
Gr4 Math (locations)
Check on teachers and schools; there is incentive to help students
These are the two longest reviews I found, and they have massive drawbacks…
Bliss (2012) Covington Award – 25 indices regurgitated in Appendix with little/no explanation
Khalid, Mehmood, & Rehman (2011) – State 20 indices but don’t even define them all (predatory journal!)
Cizek had notation errors that threw me off
Examples: Time: Flag an examinee for having a very low test time or average item time. Response Time Ratio is a statistic to quantify this.
Response pattern: Flag an examinee for answering one option >50% of the time. In this case, they probably gave up or didn’t care and just answered “C” over and over…
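The "answered one option more than 50% of the time" flag described above is easy to compute: find the modal option and its share of all responses. The response strings and the 0.5 threshold below are invented for illustration.

```python
# Sketch of the response-pattern flag: an examinee who chose one option
# more than half the time may have disengaged ("C, C, C, ..."). Data invented.
from collections import Counter

def modal_option_share(responses):
    """Share of items on which the most-frequently-used option was chosen."""
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)

gave_up = "C" * 35 + "ABCD" * 5      # 55 answers, dominated by "C"
engaged = "ABCD" * 5                 # 20 answers, evenly spread

flag_gave_up = modal_option_share(gave_up) > 0.5
flag_engaged = modal_option_share(engaged) > 0.5
```

A refinement would look at long uninterrupted runs of the same option rather than the overall share, since a disengaged examinee often answers honestly early and gives up late.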
Score gains: Your score doubled since the last time you took the test. Not likely!
Person fit: Why are you getting tough items right but easy items wrong?
Collusion: A number of indices that quantify, for any given pair of examinees, whether their responses were unusually similar.
Erasure: Evaluating proportion of changes that are wrong-to-right vs. right-to-wrong.
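The erasure comparison above (wrong-to-right vs. right-to-wrong changes) reduces to classifying each detected erasure against the key. The erasure tuples and the 0.7 review threshold are invented; operational programs compare each sheet's WTR count to a modeled expectation, not a fixed share.

```python
# Sketch of the erasure analysis (paper-and-pencil only): classify each
# erasure as wrong-to-right (WTR) or right-to-wrong (RTW) and flag sheets
# dominated by WTR changes. Data is invented.

# Each erasure: (original answer, final answer, correct key)
erasures = [("B", "C", "C"), ("A", "D", "D"), ("B", "A", "A"),
            ("C", "D", "D"), ("A", "B", "C")]  # last one is wrong-to-wrong

wtr = sum(1 for orig, final, key in erasures if orig != key and final == key)
rtw = sum(1 for orig, final, key in erasures if orig == key and final != key)
wtr_share = wtr / len(erasures)

flagged = wtr_share > 0.7  # illustrative threshold, not an industry standard
```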
Roll-up: What percent of examinees at each location/group were flagged for intra/inter issues? For example, 90% of a location gets flagged for collusion.
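The roll-up described above is a per-location aggregation of individual flags. A minimal sketch with invented locations, examinees, and an illustrative 80% cutoff:

```python
# Sketch of a group "roll-up": the share of examinees flagged by any
# individual or pairwise check, aggregated per location. Data is invented.
from collections import defaultdict

flags = [  # (location, examinee, flagged by any intra/inter check)
    ("site_1", "e1", False), ("site_1", "e2", False), ("site_1", "e3", True),
    ("site_2", "e4", True), ("site_2", "e5", True), ("site_2", "e6", True),
    ("site_2", "e7", True), ("site_2", "e8", False),
]

totals, hits = defaultdict(int), defaultdict(int)
for loc, _, flagged in flags:
    totals[loc] += 1
    hits[loc] += flagged  # bool counts as 0/1

flag_rate = {loc: hits[loc] / totals[loc] for loc in totals}
suspect_sites = [loc for loc, r in flag_rate.items() if r >= 0.8]
```

The value of rolling up is that individually weak signals become strong in aggregate: one flagged examinee may be a false positive, but 80% of a site is not.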
Other stats: Some locations have high average scores but low average test times, or unusually high pass rates.
As Paul Irwin said yesterday, if you don’t have policies and procedures, don’t be testing!