Probabilistic Graphical
Models as Predictive
Feedback for Students
Presenter: Mary Loftus
PhD Supervisor: Prof Michael Madden
1
... is the measurement, collection, analysis and
reporting of data about learners and their
contexts, for purposes of understanding and
optimizing learning and the environments in
which it occurs.
Learning Analytics
– at the intersection of Psychology, Data Mining, Visualization and the Learning Sciences
2
Learning Analytics – The Story So Far
● Predicting student outcomes
● Identifying ‘at-risk students’
● Personalisation of student learning
● Multi-modal analytics – analyses
of audio, video, location data
● Discourse and writing analytics
● Measuring ‘student engagement’
& disengagement
• But little work has been done from the student perspective (Kitto, 2015)
3
Central to education’s purpose is “the coming into presence of unique individual beings”
Education “spaces might open up for uniqueness to come into the world”
“Subjectification”
– Biesta, G. J. J. (2015). Good Education in an Age of Measurement: Ethics, Politics, Democracy
4
Ethics, Student Vulnerability & Agency
Data is Political
We must find ways to:
• decrease student vulnerability
• increase student agency
• empower students as participants in
learning analytics
• move students away from quantified
data objects to qualified and qualifying
selves
Institutional Policy exemplar from
University of Edinburgh
5
“In light of increasing concerns about
surveillance, higher education
institutions (HEIs) cannot afford a simple
paternalistic approach to student data”
Prinsloo & Slade (2016)
Ethics, GDPR & Personal Data
• This study received ethical clearance from NUI Galway’s Research Ethics
Committee in 2017
• The GDPR details a set of six principles that must be adhered to when
processing personal data.
Personal data must be:
• processed lawfully, fairly and in a transparent manner.
• collected for specified, explicit and legitimate purposes and only used for these purposes.
• adequate, relevant and limited to what is necessary. In other words, only data that is required for
the explicit purpose detailed above should be gathered and stored.
• accurate and up-to-date.
• stored for no longer than is necessary.
• processed in a secure manner that protects against unauthorised processing, loss and accidental
destruction or damage.
6
Input → Output
Machine Learning algorithms are often a ‘Black Box’
7
danah boyd (2017)
The problem with
contemporary data analytics
is that we’re often
categorizing people
without providing human
readable descriptors.
Students (and citizens)
need to be able to ‘see
into’ algorithms that
impact their
opportunities and
quality of life.
Research Goals
• Ethical: an approach to data that would empower students
• Facilitate Agency: moving students away from quantified data objects to qualified and qualifying selves
• New Ways of Seeing: help students to see new aspects of themselves and their world
• Whitebox Learning: algorithms that allow students to ‘see into’ their learning
• Self-regulation: help students set and track goals
• Metacognition: provide opportunities for understanding their own learning
8
Bayesian Approaches
• Thomas Bayes – wanted to prove the existence of God - 1761
• Richard Price – developed & published Bayes’ work - 1763
• Mary Wollstonecraft – mentored by Price - 1792
• Laplace improved on the initial Bayes Rule – which used prior
evidence and new evidence to calculate probabilities - 1812
• Alan Turing used Bayesian methods in his top-secret work at
Bletchley Park to decipher enemy encrypted messages - 1945
• Judea Pearl – Probabilistic Graphical Models – 1985
• Daphne Koller - Probabilistic Graphical Models - 2009
• McGrayne, S. B. (2011) The Theory that Would Not Die: How
Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian
Submarines, and Emerged Triumphant from Two Centuries of
Controversy
9
Bayesian Networks
• Random variables
• Conditional dependencies
• Directed acyclic graph
Madden et al (2008)
Bayesian Networks
• Bayesian Networks
provide visual,
accessible,
interactive, white-
box models
• Data Mind-Maps with
a side of Probability
• Genie – great starter
tool from
BayesFusion
– free for academic use
11
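The three ingredients above can be made concrete in a few lines: a Bayesian network's structure is a directed acyclic graph over the random variables. As a rough sketch (the variable names and edges below are hypothetical, not the study's actual model), the structure can be held in an adjacency dict and checked for cycles:

```python
# A Bayesian network's structure is a directed acyclic graph (DAG) over the
# random variables. Hypothetical course variables and edges (illustrative,
# not the study's actual model):
edges = {
    "Attendance": ["QuizResult", "FinalResult"],
    "GithubUse": ["FinalResult"],
    "QuizResult": ["FinalResult"],
    "FinalResult": [],
}

def is_acyclic(graph):
    """Depth-first three-colour check for directed cycles."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on the stack / done

    colour = {v: WHITE for v in graph}

    def dfs(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and not dfs(w)):
                return False               # back-edge found: a cycle
        colour[v] = BLACK
        return True

    return all(dfs(v) for v in graph if colour[v] == WHITE)

print(is_acyclic(edges))  # True: a valid BN structure
```

Tools such as Genie enforce this acyclicity constraint automatically when a model is drawn.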
Research Questions
1. Can a Learning Analytics system, using machine
learning, modelling and classifiers, provide an
interface for students to engage in metacognitive
activities, allowing them to see hidden aspects of
their own learning, and support the student’s own
development goals?
2. Can we optimise machine learning for learning
data? Can we identify the particular
characteristics and the factors affecting the
distributions of these datasets? Can we manually
add extra knowledge to counteract the dearth of
data?
1. Student Actions
2. Bayesian Models & Open Learning Models
3. Student Reflection
4. Metacognition
12
Data Exploration
13
Hand-built BNs
• Initially, I hand-built a Bayesian
network using the Genie tool,
specifying the conditional
probabilities I expected to see...
• Modelled student data on:
• Attendance
• Github Use
• Trello Use
• Quiz Result
• Final Result
14
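To make "hand-building" concrete: a hand-built network amounts to writing down the conditional probability tables yourself. A minimal sketch with made-up numbers (not the study's actual estimates), using just Attendance and Final Result:

```python
# Hand-specified conditional probability table: P(FinalResult | Attendance).
# The numbers are illustrative stand-ins, not the study's actual estimates.
cpt_result = {
    "Excellent": {"Pass": 0.90, "Fail": 0.10},
    "Good":      {"Pass": 0.70, "Fail": 0.30},
    "Poor":      {"Pass": 0.40, "Fail": 0.60},
}

# Prior over Attendance (also illustrative).
p_attendance = {"Excellent": 0.5, "Good": 0.3, "Poor": 0.2}

# Marginalise out Attendance to get the prior probability of passing.
p_pass = sum(p_attendance[a] * cpt_result[a]["Pass"] for a in p_attendance)
print(round(p_pass, 2))  # 0.74
```

A full hand-built network is the same idea repeated: one table per node, conditioned on its parents in the graph.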
Naive Bayes Network
• Then learned BNs from data
• These examples allow
students to ‘play’ with
course variables and see
correlations with course
performance….
• ‘If my Attendance is
excellent, what’s the
probability of passing the
module?’
15
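A naive Bayes query like the one above can be sketched in plain Python. The records below are invented for illustration (the real dataset is not shown here); the classifier scores each class by multiplying the class prior with per-feature likelihoods estimated from counts:

```python
# Hypothetical training records (Attendance, GithubUse, FinalResult) --
# illustrative only, not the study's data.
data = [
    ("Excellent", "High", "Pass"), ("Excellent", "High", "Pass"),
    ("Excellent", "Low",  "Pass"), ("Good",      "High", "Pass"),
    ("Good",      "Low",  "Fail"), ("Poor",      "Low",  "Fail"),
    ("Poor",      "High", "Fail"), ("Good",      "High", "Pass"),
]

def naive_bayes_score(attendance, github, result):
    """Unnormalised P(result) * P(attendance|result) * P(github|result)."""
    rows = [r for r in data if r[2] == result]
    p_result = len(rows) / len(data)
    p_att = sum(1 for r in rows if r[0] == attendance) / len(rows)
    p_git = sum(1 for r in rows if r[1] == github) / len(rows)
    return p_result * p_att * p_git

def predict(attendance, github):
    scores = {c: naive_bayes_score(attendance, github, c)
              for c in ("Pass", "Fail")}
    return max(scores, key=scores.get)

print(predict("Excellent", "High"))  # Pass
```

The "naive" assumption is visible in the score function: the features are treated as independent given the class, so their likelihoods simply multiply.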
Bayes Rule
17
P(Pass | ExcellentAtt)
= P(ExcellentAtt | Pass) × P(Pass) / P(ExcellentAtt)
= (.358025 × .82) / .31
P(Pass | ExcellentAtt) ≈ .95
18
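Plugging the slide's numbers straight into Bayes' rule (the exact quotient is ≈ 0.947):

```python
# Bayes' rule with the values from the slide.
p_att_given_pass = 0.358025   # P(ExcellentAtt | Pass)
p_pass = 0.82                 # P(Pass)
p_att = 0.31                  # P(ExcellentAtt)

p_pass_given_att = p_att_given_pass * p_pass / p_att
print(round(p_pass_given_att, 2))  # 0.95
```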
Validation – are the models any good?
Validation method                 Modified TAN Model   TAN Model
Accuracy (test only)              0.87 (56/64)         0.91 (58/64)
k-fold cross-validation (k=10)    0.75 (48/64)         0.83 (53/64)
k-fold cross-validation (k=3)     0.75 (48/64)         0.84 (54/64)
Leave-one-out                     0.78 (50/64)         0.82 (53/64)
19
• Results of Validation within the Genie environment
Validation – are the models any good?
• Comparison analyses were also
conducted with scikit-learn in Python:
20
Algorithm                Accuracy on Test Dataset   Accuracy on Held-Out Data
Naive Bayes              100%                       90%
KNN                      93%                        75%
SVC                      93%                        75%
Decision Tree (CART)     86%                        95%
Linear Regression        92%                        75%
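The k-fold figures above can be read as: the 64 instances are partitioned into k folds, each fold serving once as the held-out test set. A minimal sketch of the fold-index split (illustrative, not Genie's or scikit-learn's internal procedure):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

folds = k_fold_indices(64, 10)
# Each of the 64 instances lands in exactly one fold.
print(sum(len(f) for f in folds))  # 64
```

With only 64 instances, the spread between test-only accuracy and cross-validated accuracy in the tables is expected: each fold leaves very little training data.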
But what’s in this for students?
• What would they model?
21
Example nodes a student might model: Writing a Paper, Reading, Writing, Analysis, Exercise, Free Time, Location, Creativity
Pedagogical implications & possibilities?
• Could we support students to:
• Update a model like this?
• Add nodes?
• Interpret this model?
• See their learning reflected?
• Reflect on insights generated?
• Set goals accordingly?
22
Bloom’s Taxonomy (1956): Know → Understand → Apply → Analyse → Synthesise → Evaluate → Create
● Data Literacy as a tool for students
● Could students build their own models to analyse data in their
coursework?
● What data literacy skills would they require?
● Is there a pedagogy of Data Literacy that we need to develop?
• Work started on this at the LAK18 Hackathon
23
● Data Literacy as a tool for citizenship
• Would these activities provide
insights to students on:
• How algorithmic models are built?
• Why their data is useful &
valuable?
• How decisions are made about
them as customers, citizens,
employees?
• How algorithms can be biased?
• How data is owned?
• How elections can be swayed?
24
Freire said Education is learning to read the
world around us...
Critical study correlates with
teaching that is equally critical,
which necessarily demands a
critical way of comprehending
and of realizing the reading of
the word and that of the world,
the reading of text and of
context
(Paulo Freire, 1968)
We want education to be...
Data Informed
... Not Data Driven
(Sarah Moore, 2016)
25
Quantitative Ethnography
• David Williamson Shaffer
• Can we bring Quantitative and
Qualitative approaches together?
• http://www.quantitativeethnography.org/
26
Rilke said ‘love the questions’...
‘The Book of Why’
Judea Pearl &
Dana Mackenzie
27
28
Love the Questions...
• Judea Pearl says data is dumb...
• Correlation is not causality – so how do
we model causality?
29
“If I could sum up the message of this book in
one pithy phrase, it would be that you are
smarter than your data”
“There is no better way to understand ourselves
than by emulating ourselves”
Let’s ask the Students...
• Qualitative Study
• Can we show Students a higher level, interconnected view of their
learning activity?
• Invite students to become actively involved in model construction?
• Invite students to actively participate in their own data story?
• Develop numerate approaches to self-reflection and feedback
30
And let’s teach Students to ask great questions
Publications
• Loftus M., Madden MG. (Mar 2018). Probabilistic Graphical Models as Personalised Feedback -
Learning Analytics Knowledge (LAK) – Sydney, Australia https://sites.google.com/view/pfeedback-
lak18
• Loftus M., Madden MG. (Oct 2017). Ways of Seeing Learning with Data - Learning Analytics for
Learners. Critical Learning Analytics Seminar, Society for Research into Higher Education
(SRHE), London https://www.srhe.ac.uk/events/details.asp?EID=322
• Loftus M., Madden MG. (Sep 2017). Ways of Seeing Student Learning – With Machine Learning
and Learner Models. ACM WomENcourage. https://womencourage.acm.org/wp-
content/uploads/2017/02/womENcourage_2017_paper_79.pdf?189db0
• Loftus M., Madden MG. (Jun 2017). Ways of Seeing Student Learning & Metacognition with
Machine Learning and Learning Models. EDTECH – TEL in an Age of Supercomplexity
http://programme.exordo.com/edtech2017/delegates/presentation/137/
32
Timeline
33
2017: Literature Review • Research Questions • Ethical Approval
2018: Data Modelling • Data Gathering • Qualitative Research
2019: Data Gathering • Share improved models with students • Assess impact • Write up
34
Thank you
• Mary Loftus
• @marloft
• mary.loftus@nuigalway.ie

Editor's Notes

  • #7 https://www.teachingandlearning.ie/wp-content/uploads/2018/05/NF-GDPR-Insight-25052018F.pdf