This document discusses using big data from educational technology to continuously improve student models and instructional methods. It describes how data from intelligent tutoring systems and online courses is stored in the DataShop repository. Researchers use data-driven techniques like human-machine model discovery to iteratively improve cognitive models of student knowledge. Better models lead to enhanced tutoring systems that reduce student learning time and improve outcomes. The presentation advocates developing shared data infrastructure and analytics tools to integrate diverse data sources and enable discoveries across research silos.
6. Pittsburgh Science of Learning Center (PSLC)
• Created to bridge the Chasm between science & practice
– Low success rate (<10%) of randomized field trials
• LearnLab = a socio-technical bridge between lab psychology & schools
– E-science of learning & education
– Social processes for research-practice engagement
• Purpose: Leverage cognitive theory and computational modeling to identify the conditions that cause robust student learning
7. DataShop
• Central Repository
– Secure place to store & access research data
– Supports various kinds of research
  • Primary analysis of study data
  • Exploratory analysis of course data
  • Secondary analysis of any data set
• Analysis & Reporting Tools
– Focus on student-tutor interaction data
– Data Export
  • Tab delimited tables you can open with your favorite spreadsheet program or statistical package
  • Web services for direct access
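As a minimal sketch of working with a tab-delimited export, the snippet below parses tab-separated text with Python's standard csv module and counts rows (student actions) per student. The column names (Student, Step, Outcome) and the sample rows are assumptions made for illustration; an actual DataShop export may use different headers and many more columns.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for an exported file. The column names
# are assumptions; a real export may label and order its columns differently.
SAMPLE_EXPORT = (
    "Student\tStep\tOutcome\n"
    "s1\tfind-area\tcorrect\n"
    "s1\tfind-radius\tincorrect\n"
    "s2\tfind-area\tcorrect\n"
)


def count_actions_per_student(tsv_text):
    """Return a Counter mapping each student id to its number of rows."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row["Student"] for row in reader)


counts = count_actions_per_student(SAMPLE_EXPORT)
```

For a file on disk, the same reader can be given an open file handle (opened with newline="") instead of the io.StringIO wrapper.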
8. How big is DataShop?

Domain       Files  Papers  Datasets  Student Actions  Students  Student Hours
Language        64      14       123       13,369,151    15,125         58,533
Math           283      83       417      140,002,452   170,403        317,008
Science        113      18       251       29,787,644    64,569         90,547
Other info      85      22       187       24,746,456    51,765        100,617
Unspecified    120       0       458       40,923,767    53,713        107,393
Total          665     137     1,436      248,829,470   355,575        674,100

As of July 2017
10. DataShop Terminology
• KC: Knowledge component
– also known as a skill/concept/fact
– a piece of information that can be used to accomplish tasks
– tagged at the step level
• KC Model:
– a computational cognitive model or skill model
– a mapping between correct steps and knowledge components
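One way to picture a KC model as "a mapping between correct steps and knowledge components" is a plain dictionary from step names to KC names, which can also be written as a binary step-by-KC matrix (commonly called a Q matrix). The sketch below is illustrative only: the step and KC names are hypothetical and are not taken from any DataShop dataset.

```python
# Hypothetical step and KC names, for illustration only.
KC_MODEL = {
    "circle-area-step": ["circle-area"],
    "square-area-step": ["square-area"],
    "ring-area-step": ["circle-area", "subtract"],
}


def to_q_matrix(kc_model):
    """Convert a step -> KCs mapping into (steps, kcs, binary matrix).

    Row i, column j of the matrix is 1 when step i is tagged with KC j.
    """
    steps = sorted(kc_model)
    kcs = sorted({kc for tagged in kc_model.values() for kc in tagged})
    matrix = [[1 if kc in kc_model[step] else 0 for kc in kcs] for step in steps]
    return steps, kcs, matrix


steps, kcs, q = to_q_matrix(KC_MODEL)
```

Here the "ring-area-step" row has two 1s, showing that a single step can require more than one knowledge component.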
21. A good KC model produces a learning curve
Without decomposition, using just a single “Geometry” skill, there is no smooth learning curve.
But with 12 skills for geometry area, there is a smooth learning curve.
(Rise in error rate because poorer students get assigned more problems)
Is this the correct or “best” model?
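A learning curve of this kind plots error rate against the number of prior opportunities a student has had to practice a given KC. The sketch below shows one simple way to compute it, assuming the observations are already in chronological order and tagged with a KC; the function name, the tuple layout, and the sample data are hypothetical.

```python
from collections import defaultdict


def learning_curve(observations):
    """Compute error rate per opportunity number.

    observations: list of (student, kc, is_correct) tuples in chronological
    order. Returns {opportunity_number: error_rate}, where opportunity 1 is
    a student's first attempt at a KC.
    """
    seen = defaultdict(int)    # (student, kc) -> opportunities so far
    errors = defaultdict(int)  # opportunity number -> error count
    totals = defaultdict(int)  # opportunity number -> observation count
    for student, kc, is_correct in observations:
        seen[(student, kc)] += 1
        opportunity = seen[(student, kc)]
        totals[opportunity] += 1
        if not is_correct:
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}


# Made-up observations for two students practicing one KC.
SAMPLE = [
    ("s1", "area", False), ("s2", "area", False),
    ("s1", "area", True), ("s2", "area", False),
    ("s1", "area", True), ("s2", "area", True),
]
curve = learning_curve(SAMPLE)
```

Changing the KC tagging changes which attempts count as the same opportunity, which is why different KC models can yield curves of different smoothness from the same data.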
33. LFA – Model Search Process
Automates the process of hypothesizing alternative cognitive models & testing them against data
• Search algorithm guided by a heuristic: BIC
• Start from an existing cog model (Q matrix)
[Figure: search tree rooted at the Original Model (BIC = 4328). Branch labels: Split by Embed, Split by Backward, Split by Initial. Node labels (BIC values): 4301, 4312, 4320, 4320, 4322, 4313, 4322, 4322, 4324, 4325, and 4248. Annotations: "50+" and "15 expansions later".]
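The idea of a search guided by BIC can be sketched as a generic best-first search that always expands the lowest-scoring candidate. This is not the LFA implementation: the expand and score callbacks are hypothetical, the toy "models" are just sets of applied splits, and the toy scores are made up. The bic function uses the standard definition, k ln(n) - 2 ln(L), where lower values are better.

```python
import heapq
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def best_first_search(start, expand, score, max_expansions):
    """Generic best-first search that expands the lowest-scoring model first.

    expand(model) -> iterable of candidate models (hypothetical callback)
    score(model)  -> number, e.g. the BIC of the model fitted to data
    Models must be hashable. Returns (best_model, best_score).
    """
    best_model, best_score = start, score(start)
    # The counter breaks ties so that models themselves are never compared.
    frontier = [(best_score, 0, start)]
    visited = {start}
    counter = 1
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, model = heapq.heappop(frontier)
        for child in expand(model):
            if child in visited:
                continue
            visited.add(child)
            child_score = score(child)
            heapq.heappush(frontier, (child_score, counter, child))
            counter += 1
            if child_score < best_score:
                best_model, best_score = child, child_score
    return best_model, best_score


# Toy example: a "model" is a frozenset of applied splits; scores are made up.
TOY_SCORES = {
    frozenset(): 100.0,
    frozenset({"a"}): 90.0,
    frozenset({"b"}): 95.0,
    frozenset({"a", "b"}): 85.0,
}


def toy_expand(model):
    """Return every model reachable by applying one more split."""
    return [model | {s} for s in ("a", "b") if s not in model]


best, best_score = best_first_search(
    frozenset(), toy_expand, TOY_SCORES.__getitem__, max_expansions=10
)
```

In a real setting, score would fit a statistical model to student data under the candidate KC model and return its BIC, which is typically the expensive part of the search.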
38. Better Models found in all of the Datasets

Dataset            Content area      RMSE                              Median learning slope (logit)
                                     Orig in-use  Best-hand  Best-LFA  Best-hand  Best-LFA
Geometry 9697      Geometry area     0.4129       0.4033     0.4011    0.07       0.11
Hampton 0506       Geometry area     NA           0.4022     0.4012    0.03       0.04
Cog Discovery      Geometry area     NA           0.3250     0.3244    0.16       0.16
DFA-318            Story problems    0.4461       0.4407     0.4405    0.07       0.17
DFA-318-main       Story problems    0.4376       0.4287     0.4266    0.09       0.17
Digital game       Fractions         0.4442       0.4396     0.4346    0.17       0.14
Self-explanation   Equation solving  NA           0.4014     0.3927    0.01       0.04
IWT 1              English articles  0.4262       0.4110     0.4068    0.10       0.12
IWT 2              English articles  0.3854       0.3854     0.3806    0.12       0.16
IWT 3              English articles  0.3970       0.3965     0.3903    0.05       0.15
Statistics-Fall09  Statistics        0.3648       0.3527     0.3353    **         0.09

(Koedinger, McLaughlin, and Stamper, 2012)
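For reference, RMSE for a student model is typically the root-mean-square difference between the predicted probability of a correct response and the observed 0/1 outcome. The sketch below shows that calculation only; the slide does not state how the reported values were estimated (for example, whether on held-out data), and the function name and sample inputs are hypothetical.

```python
import math


def rmse(predicted_probs, outcomes):
    """Root-mean-square error between predicted P(correct) and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return math.sqrt(sum(squared) / len(squared))


# A model that always predicts 0.5 is off by 0.5 on every observation.
value = rmse([0.5, 0.5], [1, 0])
```

Lower RMSE means the model's predictions sit closer to what students actually did, which is the sense in which the table compares the in-use, hand-built, and LFA-discovered models.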