A Data-driven Method for the
Detection of Close Submitters in
Online Learning Environments
José A. Ruipérez Valiente a,b – @JoseARuiperez
Srećko Joksimović c – @s_joksimovic
Vitomir Kovanović c – @vkovanovic
Dragan Gašević c – @dgasevic
Pedro J. Muñoz Merino a – @pedmume
Carlos Delgado Kloos a – @cdkloos
a Universidad Carlos III de Madrid
b IMDEA Networks Institute
c The University of Edinburgh
WWW’17, Perth
Overview
• Detect pairs or groups of accounts that always submit their assignments very close in time
• Main goals:
  - Design and develop a general algorithm to detect these accounts
  - Apply it to our specific case study with Massive Open Online Course (MOOC) data
  - Analyze and discuss the results in different directions
• Related to:
  - Emerging groups and collaboration in MOOCs (surveys and social activity)
  - Enrolling in a MOOC with friends improves completion rate [Brooks et al., 2015], and learners enjoy watching videos in groups [Li et al., 2014]
  - Copying Answers using Multiple Existence Online (CAMEO) [Ruipérez-Valiente et al., 2016; Northcutt et al., 2016; Alexandron et al., 2016]
  - Academic dishonesty (breaking the honor code) and gaming the system (exploiting system properties)
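Basic problem description
Let N be the number of accounts and M the number of assignments. Then:
\[ sp_i = \begin{bmatrix} sp_{i,1} & sp_{i,2} & \cdots & sp_{i,M} \end{bmatrix}, \quad i \in \{1, \dots, N\} \]
where $sp_{i,j}$ is the submission timestamp of student i for assignment j. Then we define SP as:
\[ SP = \begin{bmatrix} sp_1 \\ sp_2 \\ \vdots \\ sp_N \end{bmatrix} = \begin{bmatrix} sp_{1,1} & sp_{1,2} & sp_{1,3} & \cdots & sp_{1,M} \\ sp_{2,1} & sp_{2,2} & sp_{2,3} & \cdots & sp_{2,M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ sp_{N,1} & sp_{N,2} & sp_{N,3} & \cdots & sp_{N,M} \end{bmatrix} \]
Then we can define a distance matrix DS where $ds_{i,j} = diss(sp_i, sp_j)$:
\[ DS = \begin{bmatrix} ds_{1,1} & ds_{1,2} & ds_{1,3} & \cdots & ds_{1,N} \\ ds_{2,1} & ds_{2,2} & ds_{2,3} & \cdots & ds_{2,N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ ds_{N,1} & ds_{N,2} & ds_{N,3} & \cdots & ds_{N,N} \end{bmatrix} \]
Note:
• Matrix is symmetric and hollow
• High complexity $O(N^2 \cdot d)$
• Keep set D of unique distances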
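The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy timestamps (in hours), and the stand-in dissimilarity are all assumptions, and "unique distances" is read here as one value per unordered pair of accounts.

```python
from itertools import combinations


def distance_matrix(sp, diss):
    """Build the N x N matrix DS with ds[i][j] = diss(sp[i], sp[j])."""
    n = len(sp)
    ds = [[0.0] * n for _ in range(n)]  # hollow: the diagonal stays at 0
    for i, j in combinations(range(n), 2):
        d = diss(sp[i], sp[j])
        ds[i][j] = d
        ds[j][i] = d  # symmetric: compute each pair only once
    return ds


def pair_distances(ds):
    """Keep one distance per unordered pair (the upper triangle of DS)."""
    n = len(ds)
    return {(i, j): ds[i][j] for i, j in combinations(range(n), 2)}


def mean_abs_diff(a, b):
    """Stand-in dissimilarity (hypothetical choice for this example)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy data: 3 accounts, 3 assignments, timestamps in hours (made up).
sp = [
    [1.0, 25.0, 49.0],
    [1.5, 25.5, 49.5],
    [10.0, 30.0, 60.0],
]
ds = distance_matrix(sp, mean_abs_diff)
d_set = pair_distances(ds)
```

Because the matrix is symmetric and hollow, only N(N-1)/2 of the N² entries need to be computed, which is what the loop over `combinations` does.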
Problem operationalization
• Assignments
  - Keep only graded quizzes and the last submission to each quiz
• Course accounts
  - Keep those accounts that submitted all graded quizzes
• Dissimilarity measure
  - Mean Absolute Deviation (MAD)
  - Mean Squared Deviation (MSD)
\[ diss_{MAD}(sp_i, sp_j) = \frac{1}{M} \sum_{k=1}^{M} \left| sp_{i,k} - sp_{j,k} \right| \]
\[ diss_{MSD}(sp_i, sp_j) = \frac{1}{M} \sum_{k=1}^{M} \left( sp_{i,k} - sp_{j,k} \right)^2 \]
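As a worked example of the two measures, the sketch below computes MAD and MSD for two hypothetical accounts. The function names and the timestamps are assumptions made for illustration; timestamps are taken to be in hours, so MAD comes out in hours and MSD in squared hours.

```python
def diss_mad(sp_i, sp_j):
    """Mean Absolute Deviation between two submission-time vectors."""
    if len(sp_i) != len(sp_j):
        raise ValueError("both accounts need one timestamp per assignment")
    return sum(abs(a - b) for a, b in zip(sp_i, sp_j)) / len(sp_i)


def diss_msd(sp_i, sp_j):
    """Mean Squared Deviation between two submission-time vectors."""
    if len(sp_i) != len(sp_j):
        raise ValueError("both accounts need one timestamp per assignment")
    return sum((a - b) ** 2 for a, b in zip(sp_i, sp_j)) / len(sp_i)


# Two made-up accounts with M = 3 assignments (hours since course start).
account_a = [10.0, 34.0, 58.0]
account_b = [10.5, 33.0, 58.25]

mad = diss_mad(account_a, account_b)  # (0.5 + 1.0 + 0.25) / 3
msd = diss_msd(account_a, account_b)  # (0.25 + 1.0 + 0.0625) / 3
```

MSD penalizes a single large gap more heavily than MAD does, so the two measures can rank the same pair differently.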
Case study
Two MOOCs on Coursera by the University of Edinburgh
• Introduction to Philosophy (PHIL)
  - One graded quiz per week, 6-12 questions per quiz
  - 7 weeks
  - 2359 accounts submitted all assignments
• Music Theory (MUSIC)
  - One graded quiz per week, 10-14 questions per week
  - 5 weeks
  - 5159 accounts submitted all assignments
• Example of notation: $DS_{mus}^{MAD}$ (the distance matrix for MUSIC computed with MAD)
Distances overview and distribution
• Compute set D for both courses and both dissimilarity measures
• $D_{mus}^{MAD}$ and $D_{mus}^{MSD}$: 13,305,061 distances each (i.e., (5,159 × 5,158)/2)
• $D_{phil}^{MAD}$ and $D_{phil}^{MSD}$: 2,781,261 distances each (i.e., (2,359 × 2,358)/2)
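The sizes of the distance sets follow directly from the number of unordered account pairs, N(N-1)/2. A short check (the function name is an assumption for this example):

```python
from math import comb


def n_pairs(n_accounts):
    """Number of unordered account pairs, i.e. the size of set D."""
    return n_accounts * (n_accounts - 1) // 2


music_pairs = n_pairs(5159)  # accounts that submitted all MUSIC quizzes
phil_pairs = n_pairs(2359)   # accounts that submitted all PHIL quizzes
```

Here `music_pairs` is 13,305,061 and `phil_pairs` is 2,781,261, matching the counts on the slide; `math.comb(n, 2)` gives the same values.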
Identifying close submitters
We follow these steps:
• Select an initial threshold by ‘common sense’: MAD = 30 minutes
• Compute the quantile that this value represents: 4.81e-6 for MUSIC and 5.76e-6 for PHIL
• Based on that initial threshold, test different quantiles and select one of them
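The second step, mapping a threshold to the quantile it represents, can be sketched as an empirical CDF lookup. The slides do not say which quantile definition was used, so the "fraction of distances at or below the threshold" rule here is an assumption, as are the function name and the toy distances.

```python
from bisect import bisect_right


def quantile_of_threshold(distances, threshold):
    """Fraction of pairwise distances that are <= threshold."""
    ordered = sorted(distances)
    return bisect_right(ordered, threshold) / len(ordered)


# Made-up MAD distances in hours for ten account pairs.
toy_mad = [0.2, 0.4, 1.0, 3.0, 8.0, 12.0, 20.0, 30.0, 45.0, 60.0]

# A 30-minute threshold is 0.5 h; two of the ten toy distances fall below it.
q = quantile_of_threshold(toy_mad, 0.5)
```

On real course data the same lookup would be run over the millions of pairwise distances, which is why the resulting quantiles are so small.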
Close submitter pairs by quantile

Quantile  Measure         MUSIC      PHIL
6e-6      Account pairs   78         17
          MAD threshold   0.61 h     0.57 h
          MSD threshold   0.51 h²    0.51 h²
1e-5      Account pairs   132        28
          MAD threshold   0.9 h      1.25 h
          MSD threshold   1.15 h²    1.98 h²
5e-5      Account pairs   664        140
          MAD threshold   2.9 h      4.98 h
          MSD threshold   10.94 h²   38.13 h²
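Going in the other direction, from a chosen quantile to a distance threshold and a list of flagged pairs, might look like the sketch below. The rank rule (`ceil(q * |D|)`), the function names, and the toy distances are assumptions; the slides do not state the exact quantile convention, so counts produced this way need not match the table exactly.

```python
import math


def threshold_at_quantile(pair_dist, q):
    """Distance value at quantile q of the pairwise distances."""
    ordered = sorted(pair_dist.values())
    rank = max(1, math.ceil(q * len(ordered)))  # assumed rank convention
    return ordered[rank - 1]


def close_pairs(pair_dist, q):
    """Account pairs whose distance is at or below the quantile threshold."""
    limit = threshold_at_quantile(pair_dist, q)
    return sorted(pair for pair, d in pair_dist.items() if d <= limit)


# Made-up distances (hours) for the six pairs among four accounts.
toy = {
    (0, 1): 0.3, (0, 2): 5.0, (0, 3): 7.0,
    (1, 2): 4.0, (1, 3): 0.4, (2, 3): 9.0,
}
limit = threshold_at_quantile(toy, 0.5)
flagged = close_pairs(toy, 0.5)
```

As a rough sanity check against the table, 6e-6 × 13,305,061 ≈ 79.8, which is close to the 78 MUSIC pairs reported at that quantile.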
Identifying couples and communities
Based on the identified pairs of ‘close submitters’:
• Build a graph in which accounts are nodes, connected by an undirected edge for each identified pair
• MUSIC: 99 different accounts, 30 couples
• PHIL: 26 different accounts, 11 couples
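One way to turn the list of pairs into groups is to take the connected components of the undirected graph. The sketch below uses a small union-find for that. It is an illustration under assumptions: the slides do not say how groups were extracted, and treating a component of exactly two accounts as a "couple" and a larger one as a "community" is this example's reading, not a stated definition. The account labels are made up.

```python
from collections import defaultdict


def connected_components(pairs):
    """Group accounts linked by close-submitter edges (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


# Hypothetical close-submitter pairs.
pairs = [("a", "b"), ("c", "d"), ("d", "e"), ("f", "g")]
groups = connected_components(pairs)
couples = [g for g in groups if len(g) == 2]       # assumed definition
communities = [g for g in groups if len(g) > 2]    # assumed definition
```

In this toy input, accounts c, d, and e form one community because d is paired with both c and e, while a-b and f-g remain couples.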
Examining differences: ‘close submitters’ vs. others
Selected variables:
• FinalGrade: the final numeric course grade (between 0 and 100)
• GotCertificate: Boolean variable indicating whether the account earned a certificate
• SubmissionCount: number of submissions
• ActiveDaysCount: number of active days
• DistinctVideoCount: number of videos accessed or downloaded
• DistinctThreadCount: number of discussion topics accessed
• MANOVA is significant for both courses and for both certificate and non-certificate earners
• All independent t-tests are significant too
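For readers unfamiliar with the per-variable comparison, the sketch below computes an independent two-sample t statistic with the standard library. The slides do not say which t-test variant was used; Welch's unequal-variance form is an assumption here, and the grade values are invented for illustration. Turning the statistic into a p-value requires the t distribution's CDF (available in statistics packages), which is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and degrees of freedom for two samples."""
    vx = variance(x) / len(x)  # squared standard error of each mean
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Invented FinalGrade values for the two groups.
close_submitters = [90, 95, 85, 100]
other_accounts = [60, 70, 65, 75]

t_stat, dof = welch_t(close_submitters, other_accounts)
```

The same function would be applied to each selected variable in turn, comparing ‘close submitters’ with the remaining accounts.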
Discussion and conclusions
• ‘Close submitters’ are a population statistically different from the rest of the accounts
• What are they actually doing?
• Is it good or bad for learning achievement?
• Implications for learning, research, and certificate value
Future work
• Clustering based on their indicators → assess different associations
• Couple and community analysis → roles, good or bad for learning, etc.
• Algorithm improvements → more robust, different criteria
• Bigger longitudinal study with more MOOCs to increase generalizability
• Other settings → e.g., online on-campus courses for credit
