John Stamper - WESST Keynote - Continuous Improvement of Educational Technology through Discoveries with Big Data

Continuous Improvement of Educational Technology
through Discoveries with Big Data
John Stamper
Human-Computer Interaction Institute
Carnegie Mellon University
WESST 7/20/2017

The Classroom of the Future
Which picture represents the
“Classroom of the Future”?
2

3
The answer is both!
Depends of how much money you have...
… but maybe not what you think…

4
Rich vs. Poor
– Poor kids will be forced to rely on “cheap”
technology
– Rich kids will have access to “expensive” teachers
We are seeing this today!
– Waldorf school in Silicon Valley – no technology
– NGLC Wave III Grants
– MOOCs
– Growth of adaptive technology companies
– … and more…

5
What does this mean?
My view is that we cannot stop this, I believe we
must accept that economics will force this route.
We should focus on improving learning technology
• New ways to improve teacher-student access
• Add more adaptive features to learning software
And this can’t be done with experiments on one class
at a time. We need data, BIG DATA!

Pittsburgh Science of Learning Center (PSLC)
• Created to bridge the Chasm between science &
practice
– Low success rate (<10%) of randomized field trials
• LearnLab = a socio-technical bridge between lab
psychology & schools
– E-science of learning & education
– Social processes for research-practice engagement
• Purpose: Leverage cognitive theory and computational
modeling to identify the conditions that cause robust student
learning
6

• Central Repository
– Secure place to store & access research data
– Supports various kinds of research
• Primary analysis of study data
• Exploratory analysis of course data
• Secondary analysis of any data set
• Analysis & Reporting Tools
– Focus on student-tutor interaction data
– Data Export
• Tab delimited tables you can open with your favorite
spreadsheet program or statistical package
• Web services for direct access
DataShop
7

How big is DataShop?
8
Domain Files Papers Datasets Student Actions Students Student Hours
Language 64 14 123 13,369,151 15,125 58,533
Math 283 83 417 140,002,452 170,403 317,008
Science 113 18 251 29,787,644 64,569 90,547
Other info 85 22 187 24,746,456 51,765 100,617
Unspecified 120 0 458 40,923,767 53,713 107,393
Total 665 137 1,436 248,829,470 355,575 674,100
As of July 2017

What kinds of data?
• By domain based on studies from the Learn Labs
• Data from intelligent tutors
• Data from online instruction
• Data from games
The data is fine grained at a transaction level!

• KC: Knowledge component
– also known as a skill/concept/fact
– a piece of information that can be used to
accomplish tasks
– tagged at the step level
• KC Model:
– a computational cognitive model or skill model
– a mapping between correct steps and knowledge
components
DataShop Terminology
10

Getting the KC Model Right!
The KC model drives instruction in adaptive learning
– Problem and topic sequence
– Instructional messages
– Tracking student knowledge
11

What makes a good KC Model?
• A correct expert model is one that is consistent with
student behavior
• Predicts task difficulty
• Predicts transfer between instruction and test
The model should fit the data!
12

Good KC Model = Good Learning Curve
• An empirical basis for determining when a KC model
is good
• Accurate predictions of student task performance &
learning transfer
– Repeated practice on tasks involving the same skill
should reduce the error rate on those tasks
A declining error rate learning curve should emerge
13

Traditionally Cognitive Task Analysis
But CTA interview methods have some issues…
– Extremely human driven
– Highly subjective
– Leads to differing results from different analysts
And these human discovered models are often
wrong!
16

If human centered intuitive design is not
the answer…
How should student models be designed?
They shouldn’t!
Student models should be discovered not designed!
17

Solution – Use Data
– Today we have lots of log data from edtech
– We can harness this data to validate and improve
existing student models
18

19
Human-Machine Student Model Discovery
(Stamper & Koedinger, 2011)
DataShop provides easy interface to add and modify
student models and ranks models
19

Human-Machine Student Model Discovery
3 strategies for discovering improvements to the
student model
– Lack of smooth learning curves
– No apparent learning
– Problems with unexpected error rates
20

A good KC model
produces a learning
curve
Without decomposition, using
just a single “Geometry” skill,
Is this the correct or “best”
model?
no smooth learning curve.
a smooth learning curve.
But with 12 skills for
geometry area,
(Rise in error rate because
poorer students get
assigned more problems)

Inspect curves for individual knowledge
components (KCs)
Some do not =>
Opportunity to
improve model!
Many curves show a
reasonable decline
22

These strategies suggest an improvement
– Hypothesized there were additional skills involved
in some of the compose by addition problems
– A new student model (better AIC/BIC values)
suggests the splitting the skill.
24

What does a better fit really mean?
The new model should be better at driving
instruction to the students!
Redesign of the technology can be guided by the
findings of the new model.
25

Redesign based on Discovered Model
Our discovery suggested changes needed to be
made to the tutor
– Re-sequencing – put problems requiring fewer
skills first
– Knowledge Tracing – adding new skills
– Creating new tasks – new problems
– Changing instructional messages, feedback or
hints
26

Closing the Loop, (Koedinger, Stamper, McLauglin, 2013)
We implemented a new version of the Carnegie
Learning Cognitive Tutor in Geometry
– Knowledge Tracing – added new skills for
decomposing combined shapes
– Created new tasks – new problems isolating the
new skills
– Changing instructional messages, feedback or
hints
27

Results
• Significantly less time to mastery (25% less time)
though more time on critical decomposition skills
• Better posttest performance on composition skills
indicating better learning of decomposition skills
28

29
DataShop Categorization
Low and Flat
No Learning
Still High
Too Little Data
Good

Can a data-driven process be automated
and brought to scale?
Yes!
• Combine Cognitive Science, Psychometrics,
Machine Learning …
• Collect a rich body of data
• Develop new model discovery algorithms,
visualizations, & on-line collaboration support
30

Discovering cognitive models from data
• Abstract from a computational symbolic cognitive model
to a statistical cognitive model
• For each task label the knowledge components that are
required:
Item | Skill Add Sub Mul
2*8 0 0 1
2*8 – 3 0 1 1
2*8 - 30 0 1 1
3+2*8 1 0 1
Original “Q matrix” Other possible “learning factors”
Item | Skill Deal with
negative
Order
of Ops
…
2*8 0 0
2*8 – 3 0 0
2*8 - 30 1 0
3+2*8 0 1

Logistic Regression Model of Student
Performance & Learning
“Additive Factor Model” (AFM) (cf., Draney, Pirolli, Wilson, 1995)
• Evaluate with BIC, AIC, cross validation to reduce over-fit
32

LFA –Model Search Process
Original
Model
BIC = 4328
4301 4312
4320
43204322
Split by Embed Split by Backward Split by Initial
43134322
4248
50+
4322 43244325
15 expansions later
Automates the process of
hypothesizing alternative cognitive
models & testing them against data
• Search algorithm guided
by a heuristic: BIC
• Start from an existing
cog model (Q matrix)
33

Cognitive Model Leaderboard for
Geometry Area Data Set
Some models are machine generated
(based on human-generated learning factors)
Some models are human generated

35
Dataset Content area
RMSE Median Learning
slope (logit)
Orig
in-use
Best-hand Best-LFA Best-hand Best-LFA
Geometry 9697 Geometry area 0.4129 0.4033 0.4011 0.07 0.11
Hampton 0506 Geometry area NA 0.4022 0.4012 0.03 0.04
Cog Discovery Geometry area NA 0.3250 0.3244 0.16 0.16
DFA-318 Story problems 0.4461 0.4407 0.4405 0.07 0.17
DFA-318-main Story problems 0.4376 0.4287 0.4266 0.09 0.17
Digital game Fractions 0.4442 0.4396 0.4346 0.17 0.14
Self-explanation Equation solving NA 0.4014 0.3927 0.01 0.04
IWT 1 English articles 0.4262 0.4110 0.4068 0.10 0.12
Statistics-Fall09 Statistics 0.3648 0.3527 0.3353 ** 0.09
Better Models found in all of the Datasets
(Koedinger, McLaughlin, and Stamper, 2012)

Better Models – What does it mean?
l The new model fits better
l Should be better at driving instruction
l The models should be human interpretable!
36

LFA –Model Search Process
Original
Model
BIC = 4328
4301 4312
4320
43204322
Split by Embed Split by Backward Split by Initial
43134322
4248
50+
4322 43244325
15 expansions later
Automates the process of
hypothesizing alternative cognitive
models & testing them against data
• Search algorithm guided
by a heuristic: BIC
• Start from an existing
cog model (Q matrix)
37

38
Dataset Content area
RMSE Median Learning
slope (logit)
Orig
in-use
Best-hand Best-LFA Best-hand Best-LFA
Geometry 9697 Geometry area 0.4129 0.4033 0.4011 0.07 0.11
Hampton 0506 Geometry area NA 0.4022 0.4012 0.03 0.04
Cog Discovery Geometry area NA 0.3250 0.3244 0.16 0.16
DFA-318 Story problems 0.4461 0.4407 0.4405 0.07 0.17
DFA-318-main Story problems 0.4376 0.4287 0.4266 0.09 0.17
Digital game Fractions 0.4442 0.4396 0.4346 0.17 0.14
Self-explanation Equation solving NA 0.4014 0.3927 0.01 0.04
Statistics-Fall09 Statistics 0.3648 0.3527 0.3353 ** 0.09
Better Models found in all of the Datasets
(Koedinger, McLaughlin, and Stamper, 2012)

What’s next?
l Learning Linkages: Integrating data streams of multiple
modalities and timescales.
• How to link multiple streams of data
• What predicts what
l Data-Driven Methods to Improve Student Learning from
Online Courses.
• Applying previous methods to online courses
• Tools for MOOCs
39

LearnSphere
“a community software infrastructure around
the analysis of educational data that supports
sharing and collaboration across the wide
variety of educational data”
40

Data Silos → Data Integration
Many paradigms of data-
driven education
research differ in
● data types
● time scale
● research goals
Disciplinary silos are
fostered by differences
Data infrastructure for
analytics across these
Ultimate goal: Produce
discoveries not
possible within current
silos
43

Distributed deployment and
storage
Central LearnSphere portal
Multiple “myLearnSphere”
installations
● Individual installations
curate their own data or
replicate to central
repository
● Outside researchers can
identify existing datasets
through metadata
provided by local
versions
44

Learning Analytics Workflow Authoring
Environment
45

Take Aways
• The amount of data coming from educational
technology is growing exponentially (Big Data
is here in Education)
• Students are going to rely more and more on
technology, so improving the learning in
edtech systems is critical
• Human-Centered, Data-Driven approaches are
most likely to be the ones that succeed in
actionable improvements in edtech
46

http://dev.stamper.org
john@stamper.org
47

John Stamper - WESST Keynote - Continuous Improvement of Educational Technology through Discoveries with Big Data

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (10)

Similar to John Stamper - WESST Keynote - Continuous Improvement of Educational Technology through Discoveries with Big Data

Similar to John Stamper - WESST Keynote - Continuous Improvement of Educational Technology through Discoveries with Big Data (20)

Recently uploaded

Recently uploaded (20)

John Stamper - WESST Keynote - Continuous Improvement of Educational Technology through Discoveries with Big Data