Applications of Multivariate
Techniques to Measure Content
Structure with Multidimensional IRT
Quinn N Lathrop
Advanced Computing & Data Science Lab
What do we do in the Advanced
Computing & Data Science Lab?
Prototyping
● Adaptive learning capabilities
● Authoring and tagging tools with machine
intelligence
● Exploration of new technology
Research and Development
● Direct support of a product or prototype
● Exploratory research into new capabilities
Why
Data Context and IRT
Online learning systems provide students with instruction, homework, and summative
tests for an entire course.
The Book is organized hierarchically into Chapters and Sections. This organization is
called the Table of Contents (TOC). Each item belongs to a single Section.
Traditional (unidimensional) IRT models can work well at both
● Local-level: Section or Chapter specific models
● Book-level: placing the entire course on a single latent trait
We need to turn to multidimensional models to help understand the relationships
between the structures in the book.
Explore the use of multivariate
and multidimensional techniques
to make inferences about the
structure of content in online
learning systems
How
Multivariate Section-Level IRT
1. IRT model specified at section level
○ Can be any specification: 1PL, 2PL, 3PL, polytomous, etc.
2. Section-level covariance matrix
○ Jointly estimate all section-level IRT models
○ Simple structure: all items within a section load only on the section-level
latent trait, and all section-level latent traits can freely covary
3. Secondary analysis of covariance matrix
○ Plug the covariance matrix into any EFA, PCA, or SEM
○ EFA to explore book structure
○ SEM to verify TOC structures or “aggregate” covariance up the TOC
A Few Equations
1. IRT model specified at section level
2. Section-level covariance matrix
3. Secondary analysis of covariance matrix
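In standard notation, the three steps could be written roughly as follows. The symbols are assumptions chosen for illustration rather than the author's own: s(j) is the section containing item j, S is the number of sections, and the factor model in step 3 is only one possible secondary analysis.

```latex
% Step 1: a 3PL item response function within section s(j).
% Setting c_j = 0 gives the 2PL; additionally setting a_j = 1 gives the 1PL.
P(x_{ij} = 1 \mid \theta_{i,s(j)}) = c_j + (1 - c_j)\,
  \frac{1}{1 + \exp\{-a_j(\theta_{i,s(j)} - b_j)\}}

% Step 2: section-level traits are jointly normal and free to covary.
\boldsymbol{\theta}_i = (\theta_{i1}, \dots, \theta_{iS})^\top
  \sim N(\mathbf{0}, \boldsymbol{\Sigma})

% Step 3: a secondary model (here a factor model) is fit to the estimate of Sigma.
\hat{\boldsymbol{\Sigma}} \approx
  \boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}^\top + \boldsymbol{\Psi}
```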
1. Section-level IRT
All the usual unidimensional
psychometric results are available
• ability
• difficulty
• discrimination
• guessing
• ...
Now we have psychometric results, but only within the context of each item’s
section. Next, we look at the covariance between all sections.
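As a minimal sketch of what a section-level model computes, the function below evaluates a 3PL item response function. The function name and the parameter values are illustrative assumptions, not taken from the deck.

```python
import math

def irt_prob(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under a 3PL model.

    theta: student ability on the section-level latent trait
    a: discrimination, b: difficulty, c: guessing (lower asymptote)
    With c=0 this reduces to the 2PL; with c=0 and a=1, to the 1PL.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# A student whose ability equals the item difficulty (illustrative values):
p_2pl = irt_prob(theta=0.5, a=1.2, b=0.5)         # halfway point under the 2PL
p_3pl = irt_prob(theta=0.5, a=1.2, b=0.5, c=0.2)  # guessing raises the floor
```

When ability equals difficulty, the 2PL gives a probability of one half, and adding a guessing parameter moves that point up toward the midpoint between c and 1.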
2. Section-level covariance matrix
[Table: item-by-section loading pattern. Rows are Items 1 to 16, columns are Sections 1 to 4, and each item has a single X marking the one section it loads on.]
2. Section-level covariance matrix

Multidimensional
            Section 1   Section 2   Section 3   Section 4
Section 1     1.000       0.836       0.855       0.456
Section 2     0.836       1.000       0.919       0.684
Section 3     0.855       0.919       1.000       0.413
Section 4     0.456       0.684       0.413       1.000

Independent
            Section 1   Section 2   Section 3   Section 4
Section 1       1           0           0           0
Section 2       0           1           0           0
Section 3       0           0           1           0
Section 4       0           0           0           1

Dependent
            Section 1   Section 2   Section 3   Section 4
Section 1       1           1           1           1
Section 2       1           1           1           1
Section 3       1           1           1           1
Section 4       1           1           1           1
3. Secondary analysis of covariance matrix
Book-level IRT model
Section      Book-level difficulty   % Correct
Section 1          -0.098               76%
Section 2          -0.543               83%
Section 3           0.296               68%
Section 4           0.146               71%

[Path diagram: Sections 1 to 4 loading on factors F1 and F2, with path values 0.8, 0.8, 1.0, 0.9, and 0.5]

Objective-Level Multivariate IRT
            Section 1   Section 2   Section 3   Section 4
Section 1     1.000       0.836       0.855       0.456
Section 2     0.836       1.000       0.919       0.684
Section 3     0.855       0.919       1.000       0.413
Section 4     0.456       0.684       0.413       1.000
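To illustrate what "plugging the matrix into" a secondary analysis can look like, here is a small sketch that extracts the first principal component of the four-section matrix above by power iteration. It is a stand-in for a full PCA or EFA routine, not the analysis used in the deck, and the function name is hypothetical.

```python
# Section-level correlation matrix from the slide above.
R = [
    [1.000, 0.836, 0.855, 0.456],
    [0.836, 1.000, 0.919, 0.684],
    [0.855, 0.919, 1.000, 0.413],
    [0.456, 0.684, 0.413, 1.000],
]

def first_component(matrix, iterations=500):
    """Return (eigenvalue, unit eigenvector) of the dominant component
    using power iteration; adequate for a small symmetric matrix."""
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue for the converged unit vector.
    Rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * Rv[i] for i in range(n))
    return eigenvalue, v

eigenvalue, loadings = first_component(R)
share = eigenvalue / len(R)  # proportion of total variance on component 1
print(f"first eigenvalue = {eigenvalue:.3f}, share of variance = {share:.1%}")
```

A dominant first component with a visibly smaller weight for Section 4 would be consistent with the matrix, where Section 4 has the lowest correlations with the other sections.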
What
Data and Scale
Not your usual data...
● 5,000 items
● Item cloning: each item can have up to thousands of instantiations
● Instructor-controlled learning aids, scoring policies, and settings
● A semester can have 20,000 students and 6,000,000 responses
● As a person-by-item response matrix, that’s 95% missing data
● Missingness is due to an ensemble of effects
○ Instructor customizations
○ Variety of courses and institutions using the same book
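The sparsity figure can be checked with quick arithmetic. This sketch assumes every student could in principle see every one of the 5,000 items, which is a simplification.

```python
items = 5_000
students = 20_000
responses = 6_000_000

cells = items * students          # possible person-by-item cells
missing = 1 - responses / cells   # fraction of cells with no response
print(f"{missing:.0%} of the response matrix is empty")
```

Under that assumption the matrix is about 94% empty, in line with the roughly 95% quoted above.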
Table of Contents and Multidimensional Models
● Books can have 10 to 30 Chapters
● Each Chapter has about 4 to 8 Sections
So if we want to do what we said...
…that implies a 40- to 250-dimensional model.
How do we do that?
Algorithm
How do we estimate high dimensional models?
Pairwise.
● The problem grows quadratically, not exponentially.
● Instead of fitting one 40-dimensional model with, for example, 10^40 latent evaluation points,
we fit (40^2 - 40)/2 = 780 two-dimensional models, each with 10^2 latent evaluation points.
● Pairwise models are easily parallelized, are CPU-limited, and work on chunks of the data, so the
method scales with appropriate computational resources.
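The pairwise idea can be sketched as below. The estimator here is an assumption made for illustration: a Pearson correlation of observed section scores stands in for the two-dimensional IRT fit, and the names `n_pairs`, `pearson`, and `pairwise_matrix` are hypothetical. The point is the structure: each pair is an independent job that uses only the students observed in both sections.

```python
from itertools import combinations

def n_pairs(dims):
    """Number of 2-dimensional models needed for `dims` sections."""
    return dims * (dims - 1) // 2

def pearson(xs, ys):
    """Stand-in estimator: Pearson correlation of two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def pairwise_matrix(scores, estimate_pair=pearson):
    """Assemble a symmetric matrix from independent pairwise fits.

    scores: dict mapping section name -> {student_id: score}. Students
    missing from a section are simply absent, so each pair uses only
    the students observed in both sections.
    """
    names = sorted(scores)
    matrix = {a: {b: 1.0 if a == b else None for b in names} for a in names}
    for a, b in combinations(names, 2):  # each pair is an independent job
        shared = sorted(scores[a].keys() & scores[b].keys())
        r = estimate_pair([scores[a][s] for s in shared],
                          [scores[b][s] for s in shared])
        matrix[a][b] = matrix[b][a] = r
    return matrix

# Cost comparison from the slide: one joint fit vs. many 2-D fits.
joint_points = 10 ** 40
pairwise_points = n_pairs(40) * 10 ** 2  # 780 models x 100 points each

# Toy data, made up for illustration only.
toy = {
    "S1": {"u1": 1.0, "u2": 2.0, "u3": 3.0, "u4": 4.0},
    "S2": {"u1": 2.0, "u2": 4.0, "u3": 6.0},
    "S3": {"u2": 3.0, "u3": 2.0, "u4": 1.0},
}
corr = pairwise_matrix(toy)
```

Because no pair depends on any other, the loop body could be handed to a pool of worker processes without changing the result.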
The Obvious Criticism
● Secondary (post-hoc) analysis of the covariance matrix does not correctly account
for standard errors.
● It would be better to estimate the secondary model jointly with the covariance
matrix itself.
○ ...but the pairwise estimation is the hard part, requiring significant
computational resources and time. Once we have that, the secondary
analysis is trivial.
○ ...and there are larger threats to the inference and standard errors
(non-ignorable missing data, student growth over time, etc.).
○ ...even so, the value of the results justifies the method's use.
Example 1: Comparing
TOC to Exploratory
Factor Analysis
Example 2: Impose the
TOC with Structural
Equation Modeling
Can the Chapters Explain the Covariance of Sections?
Intro =~ 1_1
Ch1 =~ 2_1 + 2_2 + 2_3 + 2_4 + 2_5 + 2_6 + 2_7 + 2_8
Ch2 =~ 3_1 + 3_2 + 3_3 + 3_4 + 3_5 + 3_6 + 3_7
Ch3 =~ 4_1 + 4_2 + 4_3 + 4_4 + 4_5 + 4_6
Ch4 =~ 5_1 + 5_2 + 5_3 + 5_4 + 5_5
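A specification like the one above could be generated directly from the TOC. The sketch below is one way to do that; the function name and the dictionary layout are assumptions, and the output is only the model text, not a fitted model.

```python
def toc_to_model(toc):
    """Render a TOC as lavaan-style measurement lines.

    toc: dict mapping a chapter label to its list of section names,
    in the order they should appear.
    """
    return "\n".join(
        f"{chapter} =~ {' + '.join(sections)}"
        for chapter, sections in toc.items()
    )

# Section names follow the slide's <chapter>_<section> pattern.
toc = {
    "Intro": ["1_1"],
    "Ch1": [f"2_{k}" for k in range(1, 9)],
    "Ch2": [f"3_{k}" for k in range(1, 8)],
    "Ch3": [f"4_{k}" for k in range(1, 7)],
    "Ch4": [f"5_{k}" for k in range(1, 6)],
}
model = toc_to_model(toc)
print(model)
```

Generating the text from the TOC keeps the imposed structure in sync with the book when sections are added or moved.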
[Figure: chapter-level factors Ch4, Ch3, Ch2, Ch1, and Intro]
Screen Book for Areas for
Expert Review
● Flag areas where the data do not match
expectations
● Can be thought of as taking the TOC as
the expert domain model and then
validating that model with the data
● Target human and expert review at the
areas most likely to need it
[Figure: sections grouped under Ch A, Ch B, and Ch C, with one section marked "Odd Section"]
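One simple way to screen for an odd section is to compare its average correlation with its own chapter against its average correlation with other chapters. The sketch below uses that heuristic, which is a simplification and not the SEM-based check described in the deck. It reuses the four-section correlation matrix from earlier, and the chapter grouping is hypothetical.

```python
def flag_odd_sections(corr, chapter_of):
    """Flag sections that correlate more with another chapter than their own.

    corr: dict of dicts of section-by-section correlations
    chapter_of: dict mapping each section to its TOC chapter
    """
    flagged = []
    for s in corr:
        by_chapter = {}
        for t in corr:
            if t != s:
                by_chapter.setdefault(chapter_of[t], []).append(corr[s][t])
        means = {ch: sum(v) / len(v) for ch, v in by_chapter.items()}
        own = means.get(chapter_of[s])
        others = [m for ch, m in means.items() if ch != chapter_of[s]]
        if own is not None and others and max(others) > own:
            flagged.append(s)
    return flagged

names = ["Section 1", "Section 2", "Section 3", "Section 4"]
values = [
    [1.000, 0.836, 0.855, 0.456],
    [0.836, 1.000, 0.919, 0.684],
    [0.855, 0.919, 1.000, 0.413],
    [0.456, 0.684, 0.413, 1.000],
]
corr = {a: dict(zip(names, row)) for a, row in zip(names, values)}

# Hypothetical TOC grouping, chosen only to illustrate the check.
chapter_of = {"Section 1": "Ch A", "Section 2": "Ch A",
              "Section 3": "Ch B", "Section 4": "Ch B"}
flagged = flag_odd_sections(corr, chapter_of)
```

With this grouping, Sections 3 and 4 are flagged because each correlates more strongly with the Ch A sections than with the other Ch B section, which is the kind of mismatch that would be routed to expert review.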
Example 3: Where can
this take us?
Goals of Psychometric Models
Primary goal is measuring latent traits
Secondary goal involves inferences about content
Test-level inferences
● Dimensionality analysis
● Linking and equating
● Validity studies
Item-level inferences
● Item parameter filtering
● Differential item functioning
● Item fit
