The study presented in this paper deals with copying answers
in MOOCs. Our findings show that a significant fraction of
the certificate earners in the course that we studied have used
what we call harvesting accounts to find correct answers that
they later submitted in their main account, the account for
which they earned a certificate. In total, 2.5% of the users
who earned a certificate in the course obtained the majority
of their points by using this method, and 10% of them used
it to some extent. This paper has two main goals. The first is
to define the phenomenon and demonstrate its severity. The
second is characterizing key factors within the course that affect
it, and suggesting possible remedies that are likely to decrease
the amount of cheating. The immediate implication of
this study is to MOOCs. However, we believe that the results
generalize beyond MOOCs, since this strategy can be used in
any learning environments that do not identify all registrants.
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
Using Multiple Accounts for Harvesting Solutions in MOOCs
1. LOGO
Using Multiple Accounts for
Harvesting Solutions in
MOOCs
José A. Ruipérez Valienteb,c,* @JoseARuiperez
Giora Alexandrona,*
Zhongzhou Chena
David E. Pritcharda
Contact: {jruipere, giora, zchen22, dpritch}@mit.edu
a Massachusetts Institute of Technology
b Universidad Carlos III de Madrid
c IMDEA Networks Institute
* Equal contribution to this work
2. Overview
Identify students who use fake accounts for harvesting solutions that they
later submit in their main account.
CAMEO: Copying Answers using Multiple Existence Online (Northcutt, Ho, and Chuang, 2015)
Main goals:
Study the amount of CAMEO, the motivation for using it, and how it can be reduced
Course studied: MITx 8.MReV Introductory Physics on edX
Some of the results:
~3% of the certificated students used this method in more than 50% of their correct
submissions
More cheating on high stake questions
Less cheating on randomized questions and when feedback is limited
Importance of the research:
Threat to the value of the certificates
Interfere with educational research
Closely related work:
Related to academic dishonesty (students break edX’s code of Honor) and gaming the
system (exploit properties of system)
CAMEO in MITx/HarvardX courses (Northcutt, Ho, and Chuang, 2015)
Online homework tutor system (Palazzo, Lee, Warnakulasooriya, and Pritchard, 2010)
2Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
3. Methodology
3Using Multiple Accounts for Harvesting Solutions in MOOCs
delay
Harvester
Master
Question 1 Question 2
Harvester
Master
What we detect: Cheating using multiple accounts
Harvesting account is used to collect correct solutions
(using ‘show answer’ or exhaustive search)
Master account submits these solutions
Methodology: Educational data mining on
tracking logs
Algorithm:
1. Collect all events (harvester, master, q) with the
properties:
• Harvester gets the solution for q, then master submits it
• Harvester and master share IP group
• Delay between events < 24h
2. Apply criteria:
• Master never behaves as harvester, and vice versa
• Harvester works for others:
– Most of submissions are actually used by the
master
– Does not earn a certificate
• Master harvests at least 10 questions
3. Remove events of users who don’t adhere to these
criteria
We made a manual verification of a sample
Identified events have a unique delay distribution
@JoseARuiperez
4. Findings: Amount of CAMEO
Population of the analysis:
Students who completed at least 5% of the
questions (1581 students)
502 certificate earners
4Using Multiple Accounts for Harvesting Solutions in MOOCs
# master
accounts
# harvester
accounts
#harvested
answers
All students 99 (6.3%) 112 19602 (3%)
Certificate
earners
52
(10.3%)
63 12396 (3%)
Non-
Certificatees
47
(4.35%)
49 7206 (3%)
2% of the c”e obtained
70% of correct answers
using CAMEO
@JoseARuiperez
5. Findings: Characteristics of CAMEO events
Harvesting technique: 53.5%
using ‘show answer’ and 46.5%
using exhaustive search
Harvesting precedes first
master answer in 91% of the
cases
Delay between the harvesting
event and the submission in
the master account:
90% events below 5 min
Median 27 seconds
Mode 5 seconds
5Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
6. Findings: Cheaters vs.
other students
6Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
Density distribution. Performance
per type of account.
Scatterplot. Response
time vs. performance
82%
61%43%
ANOVA (F=72.43, p < 0.001)
7. Findings: Certificate vs.
non-certificate earners
7Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
Comparison of cheating over time
Non-certificate earners tend to cheat more Cheat for a certificate
Drop in chapter 9+10 due to certification peak
Different behaviors depending the student
85% of certificates
were earned within
this chunk!
8. Findings: Reduction methods
Limiting feedback reduces harvesting
Our findings suggest that questions with delayed
feedback are cheated 3x less than the rest
Randomization reduces harvesting
Decreases harvested answers about 2x
Statistically significant per student and per
problem
8Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
9. Discussion and conclusions
9Using Multiple Accounts for Harvesting Solutions in MOOCs
Motivation? Certificate!
Master’s do not try to solve the problem first (no learning?)
High-stake questions more likely to be cheated
Change behavior after receiving certificate
Non-certificate earners start with the intention of ‘cheat for certificate’
Problematic issue
10% students harvested around 1% of their solutions
Relative large number of students cheating in introductory physics
It might increase if MOOC certificates become significant for industry
Implications
Decrease confidence in the assessment Affect the perceived value
certificates
Can interfere with MOOC research
Can apply to other online learning environments as well
@JoseARuiperez
10. Remedies
Problems using randomized variables
Delaying (or removing) feedback on high-stake questions
Limitations
Internal validity: No definite evidence that what we detect is CAMEO,
or a tagged sample (‘training set’).
• Too high? (4x more CAMEO users than Northcutt et al. found in our course)
• Too low? (very strict criteria; findings are low comparing to general literature)
Generalizability: We studied only one course. How typical is it?
Future research
More courses broadening the understanding of the phenomenon
Detection without relying on IP
Run-time detection [by the platform itself]
Other ways of copying
10Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
Discussion and conclusions
11. LOGO
José A. Ruipérez Valienteb,c,* @JoseARuiperez
Giora Alexandrona,*
Zhongzhou Chena
David E. Pritcharda
Contact: {jruipere, giora, zchen22, dpritch}@mit.edu
a Massachusetts Institute of Technology
b Universidad Carlos III de Madrid
c IMDEA Networks Institute
* Equal contribution to this work
Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
12. Harvester Master
Appendix 1: Example pattern
Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
13. Using Multiple Accounts for Harvesting Solutions in MOOCs @JoseARuiperez
Appendix 2: Example of individual profiles