OLI Skills Map, Predictive Models, and Reports – Findings and Recommendations
Kay Christensen,
under the supervision of Dawn Zimmaro
Summary
The Open Learning Initiative (OLI) platform created at Carnegie Mellon University is a leader in
predictive learning modeling and learning analytics. Despite its leadership on these fronts, the
platform has many limitations and leaves considerable room for human error; addressing them
could support more advanced learning analytics. We examined course material and data from the
OLI Statistics Spring 2011 course in particular when generating the following findings about
OLI’s skills map, predictive modeling, and reports.
Skills Map
An integral start to the learning inference process is the creation of a skills map, or Q-matrix.
The skills map is a spreadsheet authored by domain experts—CMU faculty in the relevant
subject—that identifies students’ learning objectives, the skills that compose those objectives,
and the specific questions designed to test said skills.
OLI’s learning inference model is strongly linked to the skills map, as students’ transactions with
specific questions are linked back to the skills identified with those questions; those looking at
learning curves are examining data about skills that have been linked by human experts to the
questions that measure them.
Authoring process
Authoring a skills map for OLI’s Statistics course required gathering multiple (precise number
unknown) faculty domain experts, who relied upon their experience, presumably in both
subject matter and statistics teaching, to create the map collectively. Faculty members working on
the map were given instructions that prescribe the following (also see instructions in Appendix A):
 Generate the learning objectives associated with said statistics course.
 Generate the skills required to meet these objectives.
 Assign each learning objective at least one skill with which it is associated. Assign
multiple skills to a learning objective if that learning objective is associated with multiple
skills. Learning objectives that do not receive a skill will not be included in the map.
 Match authored assessment questions—“Learn By Doing”, “Did I Get This?”, and
checkpoints—to the skills required to solve them. Assign multiple skills to the question if
it requires more than one skill to solve. Questions that cannot be matched to a skill will
not be included in the course.
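The resulting map can be pictured as a simple question-to-skills lookup. The sketch below is a minimal Python illustration of that structure; all objective, skill, and question IDs are hypothetical, not taken from the actual course.

```python
# A minimal sketch of a skills map (Q-matrix) as a question-to-skills
# mapping. All IDs below are hypothetical placeholders.
skills_map = {
    # question_id: set of skill IDs required to solve it
    "q_histogram_1": {"read_histogram"},
    "q_boxplot_3": {"interpret_boxplot", "compare_distributions"},
}

objectives = {
    # learning_objective_id: the skills that compose it
    "summarize_distribution": {"read_histogram", "interpret_boxplot"},
}

def skills_for_question(qid):
    """Return the skills a question measures, or an empty set if unmapped."""
    return skills_map.get(qid, set())
```

Questions left unmapped simply return an empty set, mirroring the instruction that unmatched questions drop out of the course.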
Findings
Given that the skills map is the cornerstone of the learning inference process, it is critical to
achieve both consistency and accuracy when creating it. Unfortunately, manual authoring—
especially by multiple contributors—invites inconsistency; likewise, the instructions these
authors received may have contributed to problematic results.
Assuming that the domain experts followed the instructions, they created a list of learning
objectives and a list of more fine-grained skills separately, then attempted to link them together.
A more systematic process would have them start with a learning objective and work
downward to identify the skills it requires demonstrating. Working in this manner
would give authors more freedom to declare that the same skill is needed in more than one
learning objective, and the learning inference model may be more accurate as a result.
Additionally, the vast majority (96.5%) of course questions across Units 1-5 are mapped to just
one skill in the skills map, and only one question is linked to more than two skills. It is unlikely
that such a high percentage of the assessment questions in the course should measure a single
skill. Thus, there are potentially hundreds of questions that should contribute to the data sets for
particular skills that currently do not.
Similarly, it can be argued that the skills map entirely lacks skills that many questions require but
that are not specific to statistics. For instance, some questions appear in word problem format,
others in graphical format, and still others in simple equation form; the word problems require
reading comprehension and visual tracking skills that other kinds of problems do not (see
Appendix B for examples). Tracking reading comprehension skill progress in the learning
inference process provides few benefits in a statistics course, but including reading
comprehension as a skill on relevant question formats does more accurately describe the
student’s transaction with them. Perhaps a sudden dip in performance when we would expect an
increase is actually due to changes in assessment types rather than erosion of learning.
Recommendations
Many of the potential confounds in the current skills map result from a manual,
subjective process. The following recommendations are designed to improve existing skills maps
and the process used for future courses.
1. To flesh out the skills associated with the current OLI Statistics course questions, run
frequencies of question keywords, such as “histogram” or “r-value.” Then associate these
keywords with the skills their questions are intended to demonstrate according to the
current skills map. Finally, have one domain expert check each question for agreement
with the results. This will likely result in a skills map in which a higher percentage of
questions are associated with multiple skills, thus providing the learning inference model
with a more accurate reference.
2. If using multiple domain experts, test inter-rater reliability in mapping questions to skills.
3. If using multiple domain experts, reserve one faculty member exclusively for editing
submitted portions. This person’s role is to ensure consistency in question labeling and
question-skill association.
4. Provide instructions that ask domain experts to create skills from the initial learning
objectives. Have faculty members congregate to justify skills that are associated with
multiple learning objectives.
5. Author course questions after creating the skills map, rather than mapping skills and then
matching them to already-authored questions.
6. In line with Ken Koedinger’s research associated with the PSLC DataShop, automate
skills map creation using a skill model algorithm.
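Recommendation 1 above can be prototyped with a few lines of Python. The question texts and keyword list below are hypothetical stand-ins for the actual course content; the real frequencies would come from the authored question pool.

```python
from collections import Counter

# Hypothetical question texts keyed by question ID.
questions = {
    "q1": "Estimate the r-value from the scatterplot below.",
    "q2": "Which histogram shows a right-skewed distribution?",
    "q3": "Compare the medians shown in the two histograms.",
}
# Illustrative statistics keywords to count.
keywords = ["histogram", "r-value", "median", "boxplot"]

def keyword_frequencies(questions, keywords):
    """Count how many question texts mention each keyword (substring match)."""
    counts = Counter()
    for text in questions.values():
        lowered = text.lower()
        for kw in keywords:
            if kw in lowered:
                counts[kw] += 1
    return counts

freq = keyword_frequencies(questions, keywords)
```

A domain expert would then review each question whose keywords suggest skills the current map omits.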
Assessment Types
Findings
There are multiple assessment types employed even within each question format (Learn By
Doing, Did I Get This?, and checkpoints). Some assessments consist of multiple similarly-
themed multiple choice questions in a row; others, particularly in checkpoints, put a question
testing one skill between questions testing disparate skills, which is arguably more difficult given
the necessary transitions between different subjects. Likewise, some assessments contain a series
of the same question and answer choices, associated with different graphs; here, since each
answer choice corresponds to one graph, the student gains a greater probability of correctness in
later questions because they have seen what answer choices have already been used. And even
questions that are multiple choice differ in number of possible answer choices: 3, 4, or 5. See
Appendix B for some examples of various types of assessment formats. Such variation in
assessment format is problematic when all types roll up into the same learning model, as many of
these types assess a different type of learning, and each one has a distinct probability of
correctness.
Recommendations
1. Using the current OLI Statistics Spring 2011 course material, count only the higher-stakes
and most uniform question formats in the learning model data. For instance, use only
checkpoint questions.
2. Alter course assessments to fit one standard model (e.g. five-choice multiple-choice
questions). In particular, promote consistency in assessment formats among the three
major question types of Learn By Doing, Did I Get This?, and checkpoints.
3. For future courses, and given the budget, use a consulting psychometrician to verify that
the assessment types are correlated to measurable learning outcomes according to the
domain. Or have the psychometrician author the assessment questions.
4. For future courses, and if faculty members create course assessments, promote consistent
assessment creation among the faculty with guidelines on allowable question formats and
sequences per question type.
Predictive Model
OLI employs a two-state Hidden Markov Model as its learning prediction model. An HMM uses
students’ transactions with assessment questions as evidence of either having learned or not
having learned skills in the course. Its basic assumptions are that
 students either possess or lack prior knowledge of each course skill (hence the
“two-state” nature of the model), with the statistician setting the initial estimate of prior knowledge
 knowledge can be gained as a result of interactions with course material
 knowledge, once learned, is not forgotten
 correct answers are evidence of skill knowledge
 incorrect answers are evidence of lack of knowledge
 hints should be treated as incorrect answers
 only the student’s first interaction with each question should be modeled, since they
usually have multiple chances to revise answers
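The assumptions above imply a preprocessing pass over raw transactions before any modeling. A minimal sketch, assuming a simple dict-based transaction record (the field names are hypothetical):

```python
# Keep only each student's first transaction per question, and score
# hints as incorrect, per the modeling assumptions listed above.
def first_attempts(transactions):
    """transactions: list of dicts with 'student', 'question', and 'action'
    ('correct', 'incorrect', or 'hint'), in chronological order."""
    seen = set()
    evidence = []
    for t in transactions:
        key = (t["student"], t["question"])
        if key in seen:
            continue  # only the first interaction with a question is modeled
        seen.add(key)
        correct = t["action"] == "correct"  # hints count as incorrect
        evidence.append({"student": t["student"],
                         "question": t["question"],
                         "correct": correct})
    return evidence
```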
OLI’s Hidden Markov Model uses the following values:
 p: probability that the student already possesses the skill being measured. The naïve
model begins with a p-value of .7 on a 0.0-1.0 scale.
 gamma0: probability that the student “learns” the skill (i.e. can demonstrate it correctly
the next time it is presented) as a result of getting an assessment question correct. The
naïve model begins with a gamma0-value of .7 on a 0.0-1.0 scale.
 gamma1: probability that the student, having not already possessed the skill, “learns” the
skill as a result of getting an assessment question incorrect. The naïve model begins with
a gamma1-value of .7 on a 0.0-1.0 scale.
 lambda0: probability that the HMM is in State 1, that is, the assumption that the learner
does not possess prior knowledge of skills before interacting with any content or
activities. The naïve model begins with a lambda0-value of 1.
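The parameter descriptions above suggest one possible per-observation update rule, sketched below. OLI’s actual production formula is not documented here, so treat this as an illustrative reading of the parameters, not the real implementation:

```python
# Illustrative update of the knowledge estimate after one first-attempt
# observation, consistent with the gamma definitions above. The exact
# OLI update rule is assumed, not documented.
def update_p(p, correct, gamma0=0.7, gamma1=0.7):
    """Advance the probability that the student knows the skill.
    Knowledge, once gained, is never lost (p only increases)."""
    gamma = gamma0 if correct else gamma1
    # The "unlearned" probability mass transitions to "learned" with prob gamma.
    return p + (1.0 - p) * gamma
```

With the naïve values (p = .7, gamma0 = .7), one correct first attempt moves the estimate to .91, which illustrates how quickly the naïve model converges toward mastery.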
The predictive model ends up producing numeric values on a scale from 0 to 3 for each student
in the course, with the following levels and color representation:
Gray: 0.0 - 1.4 – lacks enough data to make a decision
Red: 1.5 - 2.0 – skill is unlearned
Yellow: 2.0 - 2.4 – skill is somewhat mastered
Green: 2.5 - 3.0 – skill is mastered
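The score-to-color mapping can be sketched as below. Note that the ranges in the source overlap at 2.0 and leave a gap between 2.4 and 2.5, so this sketch picks one consistent boundary reading rather than the documented behavior:

```python
# Map the 0-3 mastery score to the report colors listed above.
# Boundary handling at 2.0 and between 2.4 and 2.5 is ambiguous in the
# source; this sketch resolves it with half-open intervals.
def mastery_color(score):
    if score < 1.5:
        return "gray"    # lacks enough data to make a decision
    if score < 2.0:
        return "red"     # skill is unlearned
    if score < 2.5:
        return "yellow"  # skill is somewhat mastered
    return "green"       # skill is mastered
```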
Findings
It’s clear that the evaluation metrics for “mastery” of a specific skill are too lenient: mastery
proportions do not change significantly across populations (for instance, four-year university
degree versus two-year degree or community college) in the way that would be expected.
Additionally, the majority of skills display as “mastered” regardless of variation in students’
engagement with the course material. This is primarily because the mastery metrics depend on
the proportion of skill observations, and are therefore easier to achieve when there are fewer
observations associated with the skill.
In addition, attrition during the course may contribute to impoverished data for the model to use
in later sections. Units 1 and 2 of the OLI Statistics Spring 2011 course data, for instance, show
many observations for some skills, whereas Units 3 and 4 tend to show just two observations
per skill. This late dearth of data means that the ends of learning curves may be operating on as few
as two students’ data.
Training the predictive model on student data also appears to be problematic. One would expect
the p, gamma0, and gamma1 values to change from the naïve model values once the model
is tuned to students’ actual transactions. However, only 22 of 114 (19.3%) skills in the Statistics
Spring 2011 course demonstrated changed values after tuning. Interestingly, nearly all of the
skills that showed changed values appeared in Units 1 and/or 2 of the course, whereas the
majority of skills showing no change appeared only in Units 3 and 4, where we also see far fewer
observations per skill. See Appendix B for examples of questions with changing and unchanging
gamma values.
Recommendations
1. Lower the initial p-value of the naïve HMM. The p-value should take into account that
learners are likely to have some exposure to statistics content, but that statistics is an
esoteric subject for many; the current p-value of .7 is surprisingly high. As a point of
reference, Ryan Baker’s “Big Data in Education” course at Coursera demonstrates p-
values of between .28 and .4 for example HMMs.
2. Require more than two observations of a skill before including its data in learning
curves. While doing so will be problematic in cases of mass attrition, it will increase the
model’s precision.
3. Train several types of naïve model (two-state HMM, BKT extension that accounts for
guesses and slips, IRT) on existing data to evaluate whether there are significant changes
in performance. This would allow some swapping of different models, perhaps depending
upon the course’s subject matter.
Post Hoc Reports
Recommendations
Though reports are the domain of data visualization experts, I have a few recommendations as a
result of working with the model data available.
1. Given the amount of information that is possible to show, teacher dashboard reports
should strongly emphasize graphical displays rather than text.
2. Likewise, as much data as possible should be hidden from users until they deliberately
seek it.
3. Show common attrition points rather than attrition rates. It is more helpful for
interventions to know the common points at which learners attrite than simply the rate at
which they do so.
4. Show performance data by question format (e.g. Learn By Doing, Did I Get This?, and
checkpoint).
Glossary
Learning objective: Something the learner should be able to demonstrate after using the course software;
a result of her learning. Also a part of the skills map, and defined by one or more specific skills.
Observation: The compilation of multiple transactions with a single opportunity. For example, if a
learner clicks on three answers before getting a question correct, the three transactions all roll up into one
observation of the skill.
Opportunity: An opportunity to demonstrate mastery of a skill through an assessment question; a single
appearance of a skill in an assessment question. For instance, “estimating the r-value” might be tested in
five questions, yielding five opportunities.
Q-matrix: see “Skills map”
Skill: A concept or ability required to answer an assessment question, and the smallest unit under a
learning objective.
Skills map (see also “Q-matrix”): A map that delineates the learning objectives of a course, the skills
associated with each learning objective, and the course assessment questions that utilize those skills. Used
by the predictive model.
Transaction: A single interaction between the learner and the course software. Example: an incorrect
answer to an assessment question.
Appendix A
Preparing for the Learning Dashboard
Tracking Skills
Capture learning outcomes and skills as you design and build new instruction. This is part of
good design. Mapping learning objectives to activities helps you to check alignment (Do the
practice and assessment opportunities align with the stated outcomes?) and coverage (Are there
enough practice opportunities available for each outcome?).
Design Documents
As course authors and reviewers create activities and workbook pages freeform, label pages,
sections, and activities as you go, before this document is split for conversion.
When activities are taken out, formalize notation to standard <!--LO:LO_ID--> and <!--Skill:
Skill_ID--> in the activity documents.
For the remaining Workbook page XML that will go through the Google Docs Converter,
formalize the notations for learning objectives and skills so they will be converted correctly.
Google Docs
The Google Docs to XML converter is able to generate learning objective references. Include a
heading with the text “OBJECTIVE: ” followed by the ID of the learning objective you would
like to reference. For example, OBJECTIVE: identify_median would generate <objref
idref="identify_median"/> in the resulting workbook page. The converter is not able to check
whether the learning objective you reference exists. Build the learning objectives’ XML first and
reference the objective IDs carefully in your Google document.
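The heading-to-XML substitution described above can be sketched as a one-line transformation; the real converter’s behavior beyond this substitution is assumed:

```python
import re

# Sketch of the OBJECTIVE heading conversion described above. Anything
# the actual converter does beyond this substitution is not modeled.
def convert_objective_heading(line):
    m = re.match(r'OBJECTIVE:\s*(\S+)', line.strip())
    if m:
        return '<objref idref="%s"/>' % m.group(1)
    return line  # non-matching lines pass through unchanged
```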
OLI Author
Enter the ID of the desired learning objective in the “Learning Objective” field. Add free form text
following the ID if helpful. For example, “explain_relevance_ld: Explain the relevance of the
Learning Dashboard to course design.” The tool saves this information as a comment in the
XML document. Remember to update the skill model spreadsheet as you write questions.
Assessment XML
Record relevant learning objectives and skills as comments in the XML. This will allow tools to
read and process this information in the future. The labels also serve as a reminder of the
intended learning goal when you return to edit the questions again later.
Place the comment inside the XML element the learning objective or skill is associated
with. The comment should identify the label as a learning objective or skill, include a reference
to its ID, and optionally a text description. Structure the information in the format below. The
OLI Author tool adds learning objective comments in the same format.
<!--LO:lo_id: Free form Learning Objective text.-->
<!--SKILL:skill_id: Free form Skill text.-->
Remember to update the skill model spreadsheet as you write questions.
Example 1: Associate a learning objective with a question.
<question id="q1">
<!--LO:label_circulatory_system: Locate the organs of the circulatory system and state
their function. -->
<body>...</body>
</question>
Example 2: Associate a skill with a question part.
<question id="q2">
<body>...</body>
<multiple_choice>...</multiple_choice>
<part id="q2p1">
<!--SKILL:identify_heart: Identify the location of the heart. -->
<response>...</response>
</part>
</question>
Skill Models
A skill model maps learning objectives to skills and skills to practice. It also defines the
parameters used to estimate learning within the Learning Dashboard.
The learning objectives shown within the Learning Dashboard match those shown on the module
page. Both present all of the learning objectives appearing within the module in the order they
first occur.
The model breaks high-level objectives down to their component skills. The Learning
Dashboard requires at least one skill per learning objective. For fine-grained objectives, create a
skill which represents the objective in its entirety. Learning objectives are visible to students;
skills are not.
A practice opportunity is an opportunity for a student to demonstrate a skill. To inform learning
estimates, a practice opportunity must have correct and incorrect states and report outcomes to
the Learning Dashboard. Assessment questions, whether inline or part of a checkpoint or quiz,
report to the Learning Dashboard. Submit and compare questions do not because they are not
evaluated by the system as correct or incorrect. Instructor graded questions report to the
Learning Dashboard once scored by the instructor.
Complex lab environments (e.g. StatTutor, Proof Lab, Causality Lab) do not report outcomes to
the Learning Dashboard by default. These applications require special setup and often they must
be specially instrumented to report the required data.
Mappings from learning objectives to skills to practice are many to many. Each learning
objective may involve one or more skills. Each skill may relate to one or more learning
objectives. Each skill may be associated with zero or more practice opportunities. Each practice
opportunity may involve zero or more skills.
A skill model is represented as a Google Spreadsheet. The spreadsheet contains three
worksheets.
 The first worksheet defines the skills present in the model. Each skill must have a unique
identifier and a title. The title is used as the label in the Learning Dashboard. The
remaining columns contain parameters used to tune learning estimates and are data
driven. Omit these values as you build your model.
 The second worksheet relates practice opportunities to skills. Each practice opportunity
is entered as a row. Practice opportunities are identified by resource, problem, and step
identifiers. The resource identifier is the ID of the learning activity that provides the
practice. For question pools, use the ID of the resource referencing the pool. The
problem identifier (optional) is the ID of a question or problem within the resource. The
step identifier (optional) is the ID of a question part or step within the problem. When
the step identifier is omitted, the skills in the mapping apply to all steps of the
problem. When the problem identifier is omitted, the skills in the mapping apply to all
problems and all steps of the learning activity. Enter skills by ID in the remaining
columns.
 The third worksheet relates learning objectives to skills. Learning objectives are defined
in a separate XML file and referenced in the model by ID. A low opportunity objective is
one for which there is not enough practice to generate a reliable learning estimate. The
minimum practice, low cutoff, and high cutoff define the thresholds between gray/red,
red/orange, and orange/green learning estimates. Omit these values as you build your
model. Enter skills by ID in the remaining columns.
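The omission rules for the second worksheet (a blank step ID applies the row's skills to all steps; a blank problem ID applies them to all problems and steps) can be sketched as a small resolver. The row layout and all IDs here are hypothetical:

```python
# Hypothetical practice-opportunity rows: (resource_id, problem_id,
# step_id, skills). None stands in for a blank spreadsheet cell.
rows = [
    ("activity1", None, None, {"skill_a"}),    # applies to the whole activity
    ("activity1", "q1", None, {"skill_b"}),    # applies to all steps of q1
    ("activity1", "q1", "q1p2", {"skill_c"}),  # applies to one step only
]

def skills_for(resource, problem, step):
    """Collect every skill whose row matches this resource/problem/step,
    honoring the blank-cell wildcard rules described above."""
    skills = set()
    for res, prob, st, row_skills in rows:
        if res != resource:
            continue
        if prob is not None and prob != problem:
            continue
        if st is not None and st != step:
            continue
        skills |= row_skills
    return skills
```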
Map practice opportunities at the time they are created. Do not wait until later. It is much easier
to create the skill model as you go along.
The skill model references many different identifiers. It is important to enter these values
correctly. Take a moment to double check each ID as you enter it. This will save time
later. Likewise, if you rename a resource, problem, or step in the content, update the skill model
immediately. The core software team can assist you in checking your model for errors. When
the model is imported the system provides a list of errors.
At least two practice opportunities (i.e. tagged questions or steps) are required to create a
learning estimate at the learning objective level. If there are fewer than two practice
opportunities, students will not be able to work past the “gray” state. Target at least three
practice opportunities per learning objective and two or more per skill or sub-objective. If
insufficient practice is available, mark the LO as a low opportunity LO in the spreadsheet.
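The minimum-practice check above could be automated along these lines; the data layout is hypothetical, and the count is taken across an objective's skills:

```python
# Flag learning objectives with too little tagged practice to generate
# a reliable learning estimate. Data structures are hypothetical.
def low_opportunity_los(lo_to_skills, skill_to_opportunities, minimum=2):
    """Return LOs whose total practice opportunities fall below `minimum`."""
    flagged = []
    for lo, skills in lo_to_skills.items():
        count = sum(len(skill_to_opportunities.get(s, [])) for s in skills)
        if count < minimum:
            flagged.append(lo)
    return flagged
```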
Creating a New Skill Model
1. Make a copy of the skill model in Google Docs. Rename the document to match the
content package ID and version.
2. Enter the content package ID and version in cells A1 and B1 of the Problems and LOs
worksheets.
3. Share the document with John, Renee, and Marsha.
Adding Learning Objectives
1. Open the skill model spreadsheet to the LOs worksheet.
2. Open the learning objectives XML file.
3. For each learning objective, create a new entry in the spreadsheet. Copy and paste its
identifier into column A.
4. If you have identified skills related to the objective, add corresponding entries to the
Skills worksheet. Reference the skills in columns F through J. Enter the ID of one skill
in each column. Add more columns if needed.
Remember, each learning objective must have at least one skill.
Adding Skills to the Model
1. Open the skill model spreadsheet to the Skills worksheet.
2. For each skill, provide a unique identifier in column A and a title in column B.
3. Map the skill to one or more learning objectives in the LOs worksheet. Reference skills
by ID in columns F through J.
In order to appear in the Learning Dashboard, each skill must be mapped to at least one learning
objective.
Mapping a Practice Opportunity
1. Open the skill model spreadsheet to the Problems worksheet.
2. Enter the resource ID of the learning activity in column A. The resource ID is the
identifier of the XML file. If the XML file is named andrew_printing_practice1.xml then its ID is
“andrew_printing_practice1”.
3. Unless the same set of skills applies to all questions, enter a problem identifier in column
B. For assessments, the problem identifier is the question ID. For example, the problem
identifier for <question id="q1"> is “q1”.
4. If the problem or question has multiple steps or parts with different sets of skills, enter
the applicable step identifier in column C. For assessments, use the part ID. For example,
the step identifier for <part id="q3p1"> is “q3p1”.
5. In the remaining columns, reference applicable skills by their ID. Enter one skill ID per
cell. Skill identifiers must match those defined in the Skills worksheet.
Appendix B
Example Questions for Skills with Changed vs. Unchanged Gamma Values
Kay Christensen
Gamma Values Changed
The following questions from OLI Statistics Spring 2011 course material exemplify skills (listed
in bold) for which gamma0 and/or gamma1 values changed from those in the naïve model after
model tuning.
Estimating r (estimater)
gamma0 = 0.85, gamma1 = 1.00
One “Learn By Doing” question, 3 answer choices.
Unit 1, Mod 2, Linear Relationships 1, Q1
5 consecutive “Did I Get This” questions, then 6 additional “Did I Get This” questions. Answer
choices remain the same, so the chance of estimating the correct r-value on the first attempt
should increase as the student progresses through the questions and receives feedback.
Unit 1, Mod 2, Linear Relationships 3, Q1
Unit 1, Mod 2, Linear Relationships 3, Q2
Unit 1, Mod 2, Linear Relationships 3, Q3
Unit 1, Mod 2, Linear Relationships 3, Q4
Unit 1, Mod 2, Linear Relationships 3, Q5
Unit 1, Mod 2, Linear Relationships 3, Extra Problems Q1
Unit 2, Mod 2, Checkpoint 2, Q2
Unit 2, Mod 2, Checkpoint 2, Q3
Unit 2, Mod 2, Checkpoint 2, Q4
Unit 2, Mod 2, Checkpoint 2, Q5
Unit 2, Mod 2, Checkpoint 2, Q6
Interpreting boxplot (interpboxplot)
gamma0 = 0.35, gamma1 = 0.50
3 “Did I Get This” questions associated with the same boxplot, followed by 3 extra “Did I Get
This” questions associated with the same boxplot.
Unit 1, Mod 1, Boxplot 2, Q1-3
Unit 1, Mod 1, Boxplot 2, Extra Problems Q1-3
Interpreting Categorical Display (interpcatchart)
gamma0 = 0.20, gamma1 = 0.50
Two “Learn By Doing” questions with various answer choice structures: 4-option multiple
choice, 4 drop downs, and 3-option multiple choice.
Unit 1, Mod 1, One Categorical Variable 2, Q1
Unit 1, Mod 1, One Categorical Variable 2, Q2
Unit 1, Mod 1, One Categorical Variable 3, Q1
Unit 1, Mod 1, Checkpoint 2, Q5
Unit 1, Mod 1, Checkpoint 2, Q6
Mean v. Median (meanvsmedian)
gamma0 = 0.50, gamma1 = 0.50
Three consecutive 3-option multiple-choice questions.
Unit 1, Mod 1, Measures of Center 2, Q1
Unit 1, Mod 1, Measures of Center 2, Q2
Unit 1, Mod 1, Measures of Center 2, Q3
Unit 2, Mod 1, Checkpoint 2, Q3
Unit 2, Mod 1, Checkpoint 2, Q4
Survey Design (surveydesign)
gamma0 = 0.30, gamma1 = 0.50
Four consecutive “Did I Get This” questions with 3-option multiple-choice answers.
Unit 2, Mod 2, Sample Surveys 2, Q1
Unit 2, Mod 2, Sample Surveys 2, Q2
Unit 2, Mod 2, Sample Surveys 2, Q3
Unit 2, Mod 2, Sample Surveys 2, Q4
Unit 3, Mod 1, Checkpoint 1, Q1
Unit 3, Mod 1, Checkpoint 1, Q5
Unit 3, Mod 2, Checkpoint 2, Q1
Unit 3, Mod 2, Checkpoint 2, Q2
Unit 3, Mod 2, Checkpoint 2, Q3
Unit 3, Mod 2, Checkpoint 2, Q4
Gamma Values Unchanged
The following questions from OLI Statistics Spring 2011 course material exemplify skills (listed
in bold) for which gamma0 and gamma1 values did not change from those in the naïve model
after model tuning.
Binomial Parameters (binparams)
Only 2 questions, one of which is a checkpoint question.
Unit 3, Mod 3, Binomial Random Variables 1, Q5
Definition of Conditional Probability (condprobdef)
Four- and three-option multiple-choice questions, all concerning the same few graphs.
Unit 3, Mod 7, Conditional Probability 2, Q1-4
Unit 3, Mod 7, Independence 1, Q1-3
Unit 3, Mod 7, Independence 2, Q2-3
Unit 4, Mod 2, Checkpoint 1, Q1
Unit 4, Mod 2, Checkpoint 1, Q2
Unit 4, Mod 2, Checkpoint 1, Q3
Unit 4, Mod 2, Checkpoint 1, Q4
Unit 4, Mod 2, Checkpoint 1, Q5
Unit 4, Mod 2, Checkpoint 2, Q2
Unit 4, Mod 2, Checkpoint 2, Q3
Unit 4, Mod 2, Checkpoint 2, Q4
Unit 4, Mod 2, Checkpoint 2, Q5
Using distribution to find probability (disttofindprob)
Unit 3, Mod 8, Probability Distribution 5, LBD Q1
Unit 3, Mod 8, Probability Distribution 5, DIGT Q1-3
Unit 4, Mod 2, Checkpoint 2, Q1
Interpreting t-test
Unit 4, Mod 12, Hypothesis Testing for the Population Proportion 4, Q1-4
Unit 4, Mod 13, Two Independent Samples 3, Q1
Unit 5, Mod 2, ANOVA Checkpoint 1, Q4
Unit 5, Mod 2, Checkpoint 1, Q2
Unit 5, Mod 2, Checkpoint 1, Q3
Unit 5, Mod 2, Checkpoint 1, Q4
Unit 5, Mod 2, Checkpoint 1, Q5
Unit 5, Mod 2, Checkpoint 1, Q6
Unit 5, Mod 2, Checkpoint 1, Q7
Unit 5, Mod 2, Checkpoint 1, Q8
Unit 5, Mod 2, Checkpoint 1, Q9
Computing z-score (zscorecompute)
Unit 3, Mod 8, Introduction to Normal Random Variables 3, LBD Q1
Unit 3, Mod 8, Introduction to Normal Random Variables 3, DIGT Q1
Unit 4, Mod 12, Hypothesis Testing for the Population Mean 2, DIGT4 Q1
Unit 4, Mod 3, Checkpoint 4, Q3

 

Viewers also liked

MENA Rail Pioneer Innovative Showcase
MENA Rail Pioneer Innovative ShowcaseMENA Rail Pioneer Innovative Showcase
MENA Rail Pioneer Innovative ShowcaseIbrahim Al-Hudhaif
 
incarcare baterii de Elevatoare zona sortare
incarcare baterii de Elevatoare zona sortareincarcare baterii de Elevatoare zona sortare
incarcare baterii de Elevatoare zona sortareJakab Zsolt
 
BINARY CLOCKSpecs
BINARY CLOCKSpecsBINARY CLOCKSpecs
BINARY CLOCKSpecsJakab Zsolt
 
Abner y najera
Abner y najeraAbner y najera
Abner y najeracarlos gc
 
Manual de serviço xlx350 r 00x6b-kv2-603 cilindro
Manual de serviço xlx350 r   00x6b-kv2-603 cilindroManual de serviço xlx350 r   00x6b-kv2-603 cilindro
Manual de serviço xlx350 r 00x6b-kv2-603 cilindroThiago Huari
 
Gestion des associés dans la crm
Gestion des associés dans la crmGestion des associés dans la crm
Gestion des associés dans la crmGaétan Hameau
 
Cultural council mid term report
Cultural council mid term reportCultural council mid term report
Cultural council mid term reportSameer Mathur
 
мовні ігри
мовні ігримовні ігри
мовні ігриelen elen
 
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...Valley Bible Fellowship
 
HPV Vaccination & Ca. Cervix Screening Update Dr. Sharda Jain Dr. Jyoti A...
HPV Vaccination & Ca. Cervix Screening Update  Dr. Sharda Jain Dr. Jyoti A...HPV Vaccination & Ca. Cervix Screening Update  Dr. Sharda Jain Dr. Jyoti A...
HPV Vaccination & Ca. Cervix Screening Update Dr. Sharda Jain Dr. Jyoti A...Lifecare Centre
 
Manual de serviço xlx350 r suplemen
Manual de serviço xlx350 r suplemenManual de serviço xlx350 r suplemen
Manual de serviço xlx350 r suplemenThiago Huari
 
effective use of consultant
effective use of consultanteffective use of consultant
effective use of consultantTushar Dholakia
 

Viewers also liked (20)

MENA Rail Pioneer Innovative Showcase
MENA Rail Pioneer Innovative ShowcaseMENA Rail Pioneer Innovative Showcase
MENA Rail Pioneer Innovative Showcase
 
incarcare baterii de Elevatoare zona sortare
incarcare baterii de Elevatoare zona sortareincarcare baterii de Elevatoare zona sortare
incarcare baterii de Elevatoare zona sortare
 
Tribus urbanes
Tribus urbanesTribus urbanes
Tribus urbanes
 
epilogue_first
epilogue_firstepilogue_first
epilogue_first
 
Smith middle school week 5
Smith middle school week 5Smith middle school week 5
Smith middle school week 5
 
BINARY CLOCKSpecs
BINARY CLOCKSpecsBINARY CLOCKSpecs
BINARY CLOCKSpecs
 
Touch Down!
Touch Down!Touch Down!
Touch Down!
 
Abner y najera
Abner y najeraAbner y najera
Abner y najera
 
M,Mosaad (CV)
M,Mosaad (CV)M,Mosaad (CV)
M,Mosaad (CV)
 
CV (US)
CV (US)CV (US)
CV (US)
 
Manual de serviço xlx350 r 00x6b-kv2-603 cilindro
Manual de serviço xlx350 r   00x6b-kv2-603 cilindroManual de serviço xlx350 r   00x6b-kv2-603 cilindro
Manual de serviço xlx350 r 00x6b-kv2-603 cilindro
 
Gestion des associés dans la crm
Gestion des associés dans la crmGestion des associés dans la crm
Gestion des associés dans la crm
 
Cultural council mid term report
Cultural council mid term reportCultural council mid term report
Cultural council mid term report
 
мовні ігри
мовні ігримовні ігри
мовні ігри
 
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...
1 Corinthians 4, Stewards, time, treasures, and talents; Christ’s servant; Ju...
 
Educación en méxico
Educación en méxicoEducación en méxico
Educación en méxico
 
HPV Vaccination & Ca. Cervix Screening Update Dr. Sharda Jain Dr. Jyoti A...
HPV Vaccination & Ca. Cervix Screening Update  Dr. Sharda Jain Dr. Jyoti A...HPV Vaccination & Ca. Cervix Screening Update  Dr. Sharda Jain Dr. Jyoti A...
HPV Vaccination & Ca. Cervix Screening Update Dr. Sharda Jain Dr. Jyoti A...
 
Manual de serviço xlx350 r suplemen
Manual de serviço xlx350 r suplemenManual de serviço xlx350 r suplemen
Manual de serviço xlx350 r suplemen
 
Nashik photos
Nashik photosNashik photos
Nashik photos
 
effective use of consultant
effective use of consultanteffective use of consultant
effective use of consultant
 

Similar to OLI Analysis - Internship write up

Bloom's Taxonomy
Bloom's TaxonomyBloom's Taxonomy
Bloom's Taxonomybutest
 
Use of various online platforms to conduct examination.pptx
Use of various online platforms to conduct examination.pptxUse of various online platforms to conduct examination.pptx
Use of various online platforms to conduct examination.pptxDr. Chetan Bhatt
 
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docxfelicidaddinwoodie
 
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...Adaptive Testing, Learning Progressions, and Students with Disabilities - May...
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...Peter Hofman
 
14Using Gagne’s Nine Events in improving Computer-Based Ma.docx
14Using Gagne’s Nine Events in improving Computer-Based Ma.docx14Using Gagne’s Nine Events in improving Computer-Based Ma.docx
14Using Gagne’s Nine Events in improving Computer-Based Ma.docxhyacinthshackley2629
 
Providing the Spark for CCSS
Providing the Spark for CCSSProviding the Spark for CCSS
Providing the Spark for CCSSKristen Wheat
 
A Mastery Learning Approach To Engineering Homework Assignments
A Mastery Learning Approach To Engineering Homework AssignmentsA Mastery Learning Approach To Engineering Homework Assignments
A Mastery Learning Approach To Engineering Homework AssignmentsJoe Andelija
 
Automated Assessment And Marking Of Spreadsheet Concepts
Automated Assessment And Marking Of Spreadsheet ConceptsAutomated Assessment And Marking Of Spreadsheet Concepts
Automated Assessment And Marking Of Spreadsheet ConceptsMonica Gero
 
Group6 concepmaps
Group6 concepmapsGroup6 concepmaps
Group6 concepmapsFALE - UFMG
 
ADMINISTRATION SCORING AND REPORTING.pdf
ADMINISTRATION  SCORING AND REPORTING.pdfADMINISTRATION  SCORING AND REPORTING.pdf
ADMINISTRATION SCORING AND REPORTING.pdfOM VERMA
 
Coherence cas cop assess share
Coherence cas cop assess shareCoherence cas cop assess share
Coherence cas cop assess shareEdAdvance
 
Collecting Information Please respond to the followingUsi.docx
Collecting Information Please respond to the followingUsi.docxCollecting Information Please respond to the followingUsi.docx
Collecting Information Please respond to the followingUsi.docxmary772
 
Applying Peer-Review For Programming Assignments
Applying Peer-Review For Programming AssignmentsApplying Peer-Review For Programming Assignments
Applying Peer-Review For Programming AssignmentsStephen Faucher
 
Testing &amp; examiner guide 2018 teacher's hand out oued semar a lgiers
Testing &amp; examiner guide 2018  teacher's hand out  oued semar a lgiersTesting &amp; examiner guide 2018  teacher's hand out  oued semar a lgiers
Testing &amp; examiner guide 2018 teacher's hand out oued semar a lgiersMr Bounab Samir
 
Assessments in the Online Classrooms.PDF
Assessments in the Online Classrooms.PDFAssessments in the Online Classrooms.PDF
Assessments in the Online Classrooms.PDFAbdelmoneim Abusin
 
Testing teacher's hand testing &amp; examiner guide 2018
Testing teacher's hand testing &amp; examiner guide 2018Testing teacher's hand testing &amp; examiner guide 2018
Testing teacher's hand testing &amp; examiner guide 2018Mr Bounab Samir
 
Developing assessment instruments
Developing assessment instrumentsDeveloping assessment instruments
Developing assessment instrumentsJCrawford62
 

Similar to OLI Analysis - Internship write up (20)

Bloom's Taxonomy
Bloom's TaxonomyBloom's Taxonomy
Bloom's Taxonomy
 
Use of various online platforms to conduct examination.pptx
Use of various online platforms to conduct examination.pptxUse of various online platforms to conduct examination.pptx
Use of various online platforms to conduct examination.pptx
 
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx
1DavisP-EL-7003-82DavisP-EL-7003-8Develop Engaging Learn.docx
 
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...Adaptive Testing, Learning Progressions, and Students with Disabilities - May...
Adaptive Testing, Learning Progressions, and Students with Disabilities - May...
 
14Using Gagne’s Nine Events in improving Computer-Based Ma.docx
14Using Gagne’s Nine Events in improving Computer-Based Ma.docx14Using Gagne’s Nine Events in improving Computer-Based Ma.docx
14Using Gagne’s Nine Events in improving Computer-Based Ma.docx
 
Providing the Spark for CCSS
Providing the Spark for CCSSProviding the Spark for CCSS
Providing the Spark for CCSS
 
A Mastery Learning Approach To Engineering Homework Assignments
A Mastery Learning Approach To Engineering Homework AssignmentsA Mastery Learning Approach To Engineering Homework Assignments
A Mastery Learning Approach To Engineering Homework Assignments
 
Automated Assessment And Marking Of Spreadsheet Concepts
Automated Assessment And Marking Of Spreadsheet ConceptsAutomated Assessment And Marking Of Spreadsheet Concepts
Automated Assessment And Marking Of Spreadsheet Concepts
 
Product
ProductProduct
Product
 
Group6 concepmaps
Group6 concepmapsGroup6 concepmaps
Group6 concepmaps
 
ADMINISTRATION SCORING AND REPORTING.pdf
ADMINISTRATION  SCORING AND REPORTING.pdfADMINISTRATION  SCORING AND REPORTING.pdf
ADMINISTRATION SCORING AND REPORTING.pdf
 
Coherence cas cop assess share
Coherence cas cop assess shareCoherence cas cop assess share
Coherence cas cop assess share
 
Ets caveat emptor
Ets caveat emptorEts caveat emptor
Ets caveat emptor
 
Collecting Information Please respond to the followingUsi.docx
Collecting Information Please respond to the followingUsi.docxCollecting Information Please respond to the followingUsi.docx
Collecting Information Please respond to the followingUsi.docx
 
Applying Peer-Review For Programming Assignments
Applying Peer-Review For Programming AssignmentsApplying Peer-Review For Programming Assignments
Applying Peer-Review For Programming Assignments
 
Testing &amp; examiner guide 2018 teacher's hand out oued semar a lgiers
Testing &amp; examiner guide 2018  teacher's hand out  oued semar a lgiersTesting &amp; examiner guide 2018  teacher's hand out  oued semar a lgiers
Testing &amp; examiner guide 2018 teacher's hand out oued semar a lgiers
 
Assessments in the Online Classrooms.PDF
Assessments in the Online Classrooms.PDFAssessments in the Online Classrooms.PDF
Assessments in the Online Classrooms.PDF
 
Testing teacher's hand testing &amp; examiner guide 2018
Testing teacher's hand testing &amp; examiner guide 2018Testing teacher's hand testing &amp; examiner guide 2018
Testing teacher's hand testing &amp; examiner guide 2018
 
Developing assessment instruments
Developing assessment instrumentsDeveloping assessment instruments
Developing assessment instruments
 
Test Rubrics
Test RubricsTest Rubrics
Test Rubrics
 

OLI Analysis - Internship write up

OLI Skills Map, Predictive Models, and Reports – Findings and Recommendations

Kay Christensen, under the supervision of Dawn Zimmaro

Summary

The Open Learning Initiative (OLI) platform created at Carnegie Mellon University is a leader in predictive learning modeling and learning analytics. Despite its leadership on these fronts, it has many limitations and much room for human error that, if overcome, could contribute to more advanced learning analytics constructs. We examined course material and data from the OLI Statistics Spring 2011 course in particular when generating the following findings about OLI’s skills map, predictive modeling, and reports.

Skills Map

An integral start to the learning inference process is the creation of a skills map, or Q-matrix. The skills map is a spreadsheet authored by domain experts—CMU faculty in the relevant subject—that identifies students’ learning objectives, the skills that compose those objectives, and the specific questions designed to test those skills. OLI’s learning inference model is strongly linked to the skills map, as students’ transactions with specific questions are linked back to the skills identified with those questions; those looking at learning curves are examining data about skills that have been linked by human experts to the questions that measure them.
Authoring process

Authoring a skills map for OLI’s Statistics course required gathering multiple (precise number unknown) faculty domain experts who relied upon their experience, presumably in both subject matter and statistics teaching, to create the map collectively. Faculty members working on the map were given instructions that prescribe the following (also see instructions in Appendix A):

- Generate the learning objectives associated with the statistics course.
- Generate the skills required to meet these objectives.
- Assign each learning objective at least one skill with which it is associated. Assign multiple skills to a learning objective if that learning objective is associated with multiple skills. Learning objectives that do not receive a skill will not be included in the map.
- Match authored assessment questions—“Learn By Doing”, “Did I Get This?”, and checkpoints—to the skills required to solve them. Assign multiple skills to a question if it requires more than one skill to solve. Questions that cannot be matched to a skill will not be included in the course.

Findings

Given that the skills map is the cornerstone of the learning inference process, it is critical to achieve both consistency and accuracy when creating it. Unfortunately, human authoring—especially by multiple authors—leads to inconsistency; likewise, the instructions these authors received may have contributed to problematic results.

Assuming that the domain experts followed the instructions, they created a list of learning objectives and a list of more fine-grained skills separately, then attempted to link them together. A more systematic process would have them start with a learning objective and work downward to identify the skills that objective requires. Working in this manner would give authors more freedom to declare that the same skill is needed in more than one learning objective; the learning inference model may be more accurate as a result.

Additionally, the vast majority (96.5%) of course questions across Units 1-5 are mapped to just one skill in the skills map, and only one question is linked to more than two skills. It is unlikely that such a high percentage of the assessment questions in the course should measure a single skill. Thus, there are potentially hundreds of questions that should contribute to the data sets for particular skills but currently do not.

Similarly, it can be argued that the skills map entirely lacks skills that many questions require but that are not specific to statistics. For instance, some questions appear in word problem format, others in graphical format, and still others in simple equation form; the word problems require reading comprehension and visual tracking skills that other kinds of problems do not (see Appendix B for examples). Tracking reading comprehension progress in the learning inference process provides few benefits in a statistics course, but including reading comprehension as a skill on relevant question formats does more accurately describe the student’s transactions with them. Perhaps a sudden dip in performance when we would expect an increase is actually due to changes in assessment types rather than erosion of learning.
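The single-skill imbalance described above is easy to quantify with a quick audit of the question-to-skill mapping. The sketch below is illustrative only: the question IDs and skill names are invented, and a real audit would load the authored skills-map spreadsheet rather than a hard-coded dictionary.

```python
# Sketch: auditing a skills map (Q-matrix) for single-skill questions.
# All question IDs and skill names below are hypothetical.
q_matrix = {
    "q1": ["identify_median"],
    "q2": ["identify_median", "read_histogram"],
    "q3": ["estimate_r_value"],
    "q4": ["read_histogram"],
}

# Count questions mapped to exactly one skill.
single_skill = sum(1 for skills in q_matrix.values() if len(skills) == 1)
fraction = single_skill / len(q_matrix)
print(f"{single_skill}/{len(q_matrix)} questions map to exactly one skill ({fraction:.1%})")
```

Run against the real Statistics map, a check like this would reproduce the 96.5% figure reported above.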
Recommendations

Many of the possible sources of error in the current skills map result from employing a manual, subjective process. The following recommendations are designed to improve existing skills maps and the process used on future courses.

1. To flesh out the skills associated with the current OLI Statistics course questions, run frequencies of question keywords, such as “histogram” or “r-value.” Then associate these keywords with the skills their questions are intended to demonstrate according to the current skills map. Finally, have one domain expert check each question for agreement with the results. This will likely result in a skills map in which a higher percentage of questions are associated with multiple skills, thus providing the learning inference model with a more accurate reference.
2. If using multiple domain experts, test inter-rater reliability in mapping questions to skills.
3. If using multiple domain experts, reserve one faculty member exclusively for editing submitted portions. This person’s role is to ensure consistency in question labeling and question-skill association.
4. Provide instructions that ask domain experts to create skills from the initial learning objectives. Have faculty members meet to justify skills that are associated with multiple learning objectives.
5. Author course questions after creating the skills map, rather than mapping skills and then matching them to already-authored questions.
6. In line with Ken Koedinger’s research associated with the PSLC DataShop, automate skills map creation using a skill model algorithm.
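Recommendation 1 can be sketched as a small text-mining pass: count domain keyword frequencies in question text, then flag questions whose text mentions a keyword tied to a skill the current map does not credit. The question text, keywords, and skill names here are invented for illustration; a real pass would use the course’s question bank and current map.

```python
import re
from collections import Counter

# Hypothetical keyword-to-skill dictionary and question bank.
KEYWORD_TO_SKILL = {"histogram": "read_histogram", "r-value": "estimate_r_value"}

questions = {
    "q1": "Using the histogram below, estimate the r-value of the data.",
    "q2": "Which histogram shows a right-skewed distribution?",
}
mapped_skills = {"q1": {"estimate_r_value"}, "q2": {"read_histogram"}}

keyword_counts = Counter()
for qid, text in questions.items():
    for kw, skill in KEYWORD_TO_SKILL.items():
        if re.search(re.escape(kw), text, re.IGNORECASE):
            keyword_counts[kw] += 1
            # Flag candidate missing skill associations for expert review.
            if skill not in mapped_skills[qid]:
                print(f"{qid}: mentions {kw!r} but is not mapped to {skill!r}")

print(keyword_counts)
```

The flagged questions would then go to the single domain expert described in the recommendation for confirmation rather than being relabeled automatically.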
Assessment Types

Findings

There are multiple assessment types employed even within each question format (Learn By Doing, Did I Get This?, and checkpoints). Some assessments consist of several similarly themed multiple choice questions in a row; others, particularly in checkpoints, place a question testing one skill between questions testing disparate skills, which is arguably more difficult given the necessary transitions between subjects. Likewise, some assessments contain a series of questions with the same answer choices, each associated with a different graph; here, since each answer choice corresponds to one graph, the student gains a greater probability of correctness on later questions because they have seen which answer choices have already been used. Even among multiple choice questions, the number of possible answer choices varies: 3, 4, or 5. See Appendix B for examples of the various assessment formats.

Such variation in assessment format is problematic when all types roll up into the same learning model, as many of these types assess a different type of learning, and each one has a distinct probability of correctness.

Recommendations

1. Using the current OLI Statistics Spring 2011 course material, count only the higher-stakes and most uniform question formats in the learning model data. For instance, use only checkpoint questions.
2. Alter course assessments to fit one standard model (e.g., five-choice multiple choice questions). In particular, promote consistency in assessment formats among the three major question types of Learn By Doing, Did I Get This?, and checkpoints.
3. For future courses, and given the budget, use a consulting psychometrician to verify that the assessment types are correlated with measurable learning outcomes in the domain. Or have the psychometrician author the assessment questions.
4. For future courses, if faculty members create course assessments, promote consistent assessment creation among the faculty with guidelines on allowable question formats and sequences per question type.

Predictive Model

OLI employs a two-state Hidden Markov Model (HMM) as its learning prediction model. An HMM uses students’ transactions with assessment questions as evidence of either having learned or not having learned skills in the course. Its basic assumptions are that:

- it is the statistician’s decision whether students likely enter with prior knowledge or with no prior knowledge of course skills (hence the “two-state” nature of the model)
- knowledge can be gained as a result of interactions with course material
- knowledge, once learned, is not forgotten
- correct answers are evidence of skill knowledge
- incorrect answers are evidence of lack of knowledge
- hints should be treated as incorrect answers
- only the student’s first interaction with each question should be modeled, since students usually have multiple chances to revise answers
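The observation-coding assumptions above (first attempt only, hints scored as incorrect) amount to a small preprocessing step over the transaction log. The log format below is hypothetical; it is a sketch of the coding rules, not OLI’s actual pipeline.

```python
def first_attempts(transactions):
    """transactions: list of (student, question, action) tuples in time order,
    where action is one of "correct", "incorrect", "hint".
    Returns {(student, question): 1 or 0} keeping only each first attempt."""
    observed = {}
    for student, question, action in transactions:
        key = (student, question)
        if key in observed:            # only the first interaction counts
            continue
        observed[key] = 1 if action == "correct" else 0   # hint -> incorrect
    return observed

log = [
    ("s1", "q1", "hint"),     # hint request: coded incorrect
    ("s1", "q1", "correct"),  # revision after the hint: ignored
    ("s2", "q1", "correct"),
]
print(first_attempts(log))    # {('s1', 'q1'): 0, ('s2', 'q1'): 1}
```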
OLI’s Hidden Markov Model uses the following values:

- p: probability that the student already possesses the skill being measured. The naïve model begins with a p-value of .7 on a 0.0-1.0 scale.
- gamma0: probability that the student “learns” the skill (i.e., can demonstrate it correctly the next time it is presented) as a result of getting an assessment question correct. The naïve model begins with a gamma0-value of .7 on a 0.0-1.0 scale.
- gamma1: probability that the student, having not already possessed the skill, “learns” the skill as a result of getting an assessment question incorrect. The naïve model begins with a gamma1-value of .7 on a 0.0-1.0 scale.
- lambda0: probability that the HMM is in State 1, that is, the assumption that the learner does not possess prior knowledge of skills before interacting with any content or activities. The naïve model begins with a lambda0-value of 1.

The predictive model produces a numeric value on a scale from 0 to 3 for each student in the course, with the following levels and color representations:

- Gray: 0 - 1.4 – lacks enough data to make a decision
- Red: 1.5 - 2.0 – skill is unlearned
- Yellow: 2.0 - 2.4 – skill is somewhat mastered
- Green: 2.5+ – skill is mastered
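A simplified, knowledge-tracing-style reading of these parameters is sketched below. This illustrates only the update direction implied by gamma0 and gamma1 and one plausible bucketing of the 0-3 score into the report colors; OLI’s production update equations are not specified here, and the threshold boundaries are interpreted to close the small gaps in the published ranges.

```python
def update_p_known(p_known: float, correct: bool,
                   gamma0: float = 0.7, gamma1: float = 0.7) -> float:
    """After an observation, a student who had not yet learned the skill
    learns it with probability gamma0 (after a correct answer) or
    gamma1 (after an incorrect one). Simplified illustration only."""
    learn_rate = gamma0 if correct else gamma1
    return p_known + (1 - p_known) * learn_rate

def mastery_color(score: float) -> str:
    """One reading of the 0-3 score buckets listed above."""
    if score < 1.5:
        return "gray"    # not enough data
    if score < 2.0:
        return "red"     # unlearned
    if score < 2.5:
        return "yellow"  # somewhat mastered
    return "green"       # mastered

p = 0.7   # naive prior
for obs in (True, False, True):
    p = update_p_known(p, obs)
print(f"p(known) after 3 observations: {p:.3f}")
print(mastery_color(2.6))
```

Note how quickly p(known) saturates with gamma values of .7, which foreshadows the mastery-inflation findings below.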
Findings

The evaluation metrics for “mastery” of a specific skill appear to be too lenient: mastery proportions do not change significantly across populations (for instance, four-year university degree versus two-year degree or community college) in the way that would be expected. Additionally, the majority of skills display as “mastered” regardless of variation in students’ engagement with the course material. This is primarily because the mastery metrics depend on the proportion of skill observations, and are therefore easier to achieve when there are fewer observations associated with the skill.

In addition, attrition during the course may contribute to impoverished data for the model to use in later sections. Units 1 and 2 of the OLI Statistics Spring 2011 course data, for instance, show a wealth of observations for some skills, whereas Units 3 and 4 tend to show just two observations per skill. This late dearth of data means that the ends of learning curves may be operating on as few as two students’ data.

Training the predictive model on student data also appears to be problematic. One would expect the p-, gamma0-, and gamma1-values to change from the naïve model values once the model is tuned to students’ actual transactions. However, only 22 of 114 (19.3%) skills in the Statistics Spring 2011 course demonstrated changed values after tuning. Interestingly, nearly all of the skills that showed changed values appeared in Units 1 and/or 2 of the course, whereas the majority of skills showing no change appeared only in Units 3 and 4, where we also see far fewer observations per skill. See Appendix B for examples of questions with changing and unchanging gamma values.
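Flagging skills whose parameters never moved off the naïve starting values (p = gamma0 = gamma1 = .7) is a simple comparison against the tuned parameter table. The table below is invented for illustration; a real check would read the 114-skill parameter output from the tuning run.

```python
# Naive starting values shared by every skill before tuning.
NAIVE = {"p": 0.7, "gamma0": 0.7, "gamma1": 0.7}

# Hypothetical tuned parameters for two skills.
tuned = {
    "read_histogram":   {"p": 0.55, "gamma0": 0.62, "gamma1": 0.31},
    "estimate_r_value": {"p": 0.70, "gamma0": 0.70, "gamma1": 0.70},
}

unchanged = [skill for skill, params in tuned.items() if params == NAIVE]
print(f"{len(unchanged)}/{len(tuned)} skills unchanged after tuning: {unchanged}")
```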
Recommendations

1. Lower the initial p-value of the naïve HMM. The p-value should take into account that learners are likely to have some exposure to statistics content, but that statistics is an esoteric subject for many; the current p-value of .7 is surprisingly high. As a point of reference, Ryan Baker’s “Big Data in Education” course at Coursera demonstrates p-values of between .28 and .4 for example HMMs.
2. Require more than two observations of a skill before including its data in learning curves. While doing so will be problematic in cases of mass attrition, it will also increase the model’s precision.
3. Train several types of naïve model (two-state HMM, a BKT extension that accounts for guesses and slips, IRT) on existing data to evaluate whether there are significant changes in performance. This would allow swapping in different models, perhaps depending upon the course’s subject matter.

Post Hoc Reports

Recommendations

Though reports are the domain of data visualization experts, I have a few recommendations as a result of working with the model data available.

1. Given the amount of information that is possible to show, teacher dashboard reports should strongly emphasize graphical displays rather than text.
2. Likewise, as much data as possible should be hidden from users until they deliberately seek it.
3. Show common attrition points rather than attrition rates. It is more helpful for interventions to know the common points at which learners attrite than simply the rate at which they do so.
4. Show performance data by question format (e.g., Learn By Doing, Did I Get This?, and checkpoint).
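Surfacing common attrition points, as recommended above, amounts to aggregating each learner’s last-reached page. The per-student data below is hypothetical; a real report would derive it from the transaction log.

```python
from collections import Counter

# Hypothetical "last page reached" per student before they stopped.
last_page = {
    "s1": "unit3_checkpoint",
    "s2": "unit3_checkpoint",
    "s3": "unit1_lbd2",
    "s4": "unit4_quiz",
}

attrition_points = Counter(last_page.values())
for page, n in attrition_points.most_common():
    print(f"{page}: {n} learner(s) stopped here")
```

A dashboard built on this aggregation tells an instructor where to intervene, not merely how many learners left.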
Glossary

Learning objective: Something the learner should be able to demonstrate after using the course software; a result of her learning. Also a part of the skills map, and defined by one or more specific skills.

Observation: The compilation of multiple transactions with a single opportunity. For example, if a learner clicks on three answers before getting a question correct, the three transactions all roll up into one observation of the skill.

Opportunity: An opportunity to demonstrate mastery of a skill through an assessment question; a single appearance of a skill in an assessment question. For instance, “estimating the r-value” might be tested in five questions, yielding five opportunities.

Q-matrix: See “Skills map.”

Skill: A concept or ability required to answer an assessment question, and the smallest unit under a learning objective.

Skills map (see also “Q-matrix”): A map that delineates the learning objectives of a course, the skills associated with each learning objective, and the course assessment questions that utilize those skills. Used by the predictive model.

Transaction: A single interaction between the learner and the course software. Example: an incorrect answer to an assessment question.
Appendix A

Preparing for the Learning Dashboard

Tracking Skills

Capture learning outcomes and skills as you design and build new instruction. This is part of good design. Mapping learning objectives to activities helps you to check alignment (Do the practice and assessment opportunities align with the stated outcomes?) and coverage (Are there enough practice opportunities available for each outcome?).

Design Documents

As course authors and reviewers create activities and workbook pages freeform, label pages, sections, and activities as we go and before this document is split for conversion. When activities are taken out, formalize the notation to the standard <!--LO:LO_ID--> and <!--Skill: Skill_ID--> in the activity documents. For the remaining workbook page XML that will go through the Google Docs converter, formalize the notations for learning objectives and skills so they will be converted correctly.

Google Docs

The Google Docs to XML converter is able to generate learning objective references. Include a heading with the text “OBJECTIVE: ” followed by the ID of the learning objective you would like to reference. For example, OBJECTIVE: identify_median would generate <objref idref="identify_median"/> in the resulting workbook page. The converter is not able to check whether a learning objective you reference exists. Build the learning objectives’ XML first and reference the objective IDs carefully in your Google document.

OLI Author
Enter the ID of the desired learning objective in the “Learning Objective” field. Add free-form text following the ID if helpful. For example, “explain_relevance_ld: Explain the relevance of the Learning Dashboard to course design.” The tool saves this information as a comment in the XML document. Remember to update the skill model spreadsheet as you write questions.

Assessment XML

Record relevant learning objectives and skills as comments in the XML. This will allow tools to read and process this information in the future. The labels also serve as a reminder of the intended learning goal when you return to edit the questions later. Place the comment inside the XML element the learning objective or skill is associated with. The comment should identify the label as a learning objective or skill, include a reference to its ID, and optionally a text description. Structure the information in the format below. The OLI Author tool adds learning objective comments in the same format.

<!--LO:lo_id: Free form Learning Objective text.-->
<!--SKILL:skill_id: Free form Skill text.-->

Remember to update the skill model spreadsheet as you write questions.

Example 1: Associate a learning objective with a question.

<question id="q1">
  <!--LO:label_circulatory_system: Locate the organs of the circulatory system and state their function. -->
  <body>

Example 2: Associate a skill with a question part.

<question id="q2">
  <body>...</body>
  <multiple_choice>...</multiple_choice>
  <part id="q2p1">
    <!--SKILL:identify_heart: Identify the location of the heart. -->
    <response>

Skill Models

A skill model maps learning objectives to skills and skills to practice. It also defines the parameters used to estimate learning within the Learning Dashboard. The learning objectives shown within the Learning Dashboard match those shown on the module page. Both present all of the learning objectives appearing within the module in the order they first occur. The model breaks high-level objectives down into their component skills. The Learning Dashboard requires at least one skill per learning objective. For fine-grained objectives, create a skill which represents the objective in its entirety. Learning objectives are visible to students; skills are not.

A practice opportunity is an opportunity for a student to demonstrate a skill. To inform learning estimates, a practice opportunity must have correct and incorrect states and report outcomes to the Learning Dashboard. Assessment questions, whether inline or as part of a checkpoint or quiz, report to the Learning Dashboard. Submit-and-compare questions do not, because they are not evaluated by the system as correct or incorrect. Instructor-graded questions report to the Learning Dashboard once scored by the instructor. Complex lab environments (e.g., StatTutor, Proof Lab, Causality Lab) do not report outcomes to the Learning Dashboard by default. These applications require special setup and often must be specially instrumented to report the required data.

Mappings from learning objectives to skills to practice are many-to-many. Each learning objective may involve one or more skills. Each skill may relate to one or more learning objectives. Each skill may be associated with zero or more practice opportunities. Each practice opportunity may involve zero or more skills.

A skill model is represented as a Google Spreadsheet. The spreadsheet contains three worksheets.
 The first worksheet defines the skills present in the model. Each skill must have a unique identifier and a title. The title is used as the label in the Learning Dashboard. The remaining columns contain parameters used to tune learning estimates and are data driven. Omit these values as you build your model.
 The second worksheet relates practice opportunities to skills. Each practice opportunity is entered as a row. Practice opportunities are identified by resource, problem, and step identifiers. The resource identifier is the ID of the learning activity that provides the practice; for question pools, use the ID of the resource referencing the pool. The problem identifier (optional) is the ID of a question or problem within the resource. The step identifier (optional) is the ID of a question part or step within the problem. When the step identifier is omitted, the skills in the mapping apply to all steps of the problem. When the problem identifier is omitted, the skills in the mapping apply to all problems and all steps of the learning activity. Enter skills by ID in the remaining columns.
 The third worksheet relates learning objectives to skills. Learning objectives are defined in a separate XML file and referenced in the model by ID. A low-opportunity objective is one for which there is not enough practice to generate a reliable learning estimate. The minimum practice, low cutoff, and high cutoff values define the thresholds between the gray/red, red/orange, and orange/green learning estimates. Omit these values as you build your model. Enter skills by ID in the remaining columns.

Map practice opportunities at the time they are created; do not wait until later. It is much easier to create the skill model as you go along. The skill model references many different identifiers, and it is important to enter these values correctly. Take a moment to double-check each ID as you enter it; this will save time later.
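The omission rules for the second worksheet (a missing step ID applies the mapping to all steps of the problem; a missing problem ID applies it to the whole learning activity) amount to a most-specific-first lookup. As a minimal sketch, assuming hypothetical resource and skill IDs and representing each worksheet row as a `(resource, problem, step)` key (the actual OLI implementation is not shown in this document):

```python
def skills_for_step(problem_rows, resource, problem, step):
    """Resolve the skills for a step by trying the most specific mapping first:
    (resource, problem, step), then (resource, problem, None) for all steps of
    the problem, then (resource, None, None) for the whole learning activity."""
    for key in [(resource, problem, step),
                (resource, problem, None),
                (resource, None, None)]:
        if key in problem_rows:
            return problem_rows[key]
    return []

# Hypothetical Problems-worksheet rows: (resource ID, problem ID, step ID) -> skill IDs.
rows = {
    ("circ_quiz1", None, None): ["read_diagram"],      # applies to all problems and steps
    ("circ_quiz1", "q2", "q2p1"): ["identify_heart"],  # applies to one step only
}
```

With these rows, a step such as `("circ_quiz1", "q2", "q2p2")` that has no specific mapping falls back to the activity-wide `read_diagram` mapping.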
Likewise, if you rename a resource, problem, or step in the content, update the skill model immediately. The core software team can assist you in checking your model for errors: when the model is imported, the system provides a list of errors.

At least two practice opportunities (i.e., tagged questions or steps) are required to create a learning estimate at the learning objective level. If there are fewer than two practice opportunities, students will not be able to move past the “gray” state. Target three or more practice opportunities per learning objective and two or more per skill or sub-objective. If insufficient practice is available, mark the LO as a low-opportunity LO in the spreadsheet.

Creating a New Skill Model
1. Make a copy of the skill model in Google Docs. Rename the document to match the content package ID and version.
2. Enter the content package ID and version in cells A1 and B1 of the Problems and LOs worksheets.
3. Share the document with John, Renee, and Marsha.

Adding Learning Objectives
1. Open the skill model spreadsheet to the LOs worksheet.
2. Open the learning objectives XML file.
3. For each learning objective, create a new entry in the spreadsheet. Copy and paste its identifier into column A.
4. If you have identified skills related to the objective, add corresponding entries to the Skills worksheet. Reference the skills in columns F through J, entering the ID of one skill in each column. Add more columns if needed. Remember, each learning objective must have at least one skill.

Adding Skills to the Model
1. Open the skill model spreadsheet to the Skills worksheet.
2. For each skill, provide a unique identifier in column A and a title in column B.
3. Map the skill to one or more learning objectives in the LOs worksheet. Reference skills by ID in columns F through J. In order to appear in the Learning Dashboard, each skill must be mapped to at least one learning objective.

Mapping a Practice Opportunity
1. Open the skill model spreadsheet to the Problems worksheet.
2. Enter the resource ID of the learning activity in column A. The resource ID is the identifier of the XML file; if the XML file is named andrew_printing_practice1.xml, then its ID is “andrew_printing_practice1”.
3. Unless the same set of skills applies to all questions, enter a problem identifier in column B. For assessments, the problem identifier is the question ID. For example, the problem identifier for <question id="q1"> is “q1”.
4. If the problem or question has multiple steps or parts with different sets of skills, enter the applicable step identifier in column C. For assessments, use the part ID. For example, the step identifier for <part id="q3p1"> is “q3p1”.
5. In the remaining columns, reference applicable skills by their ID. Enter one skill ID per cell.
Skill identifiers must match those defined in the Skills worksheet.
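Because IDs are entered by hand in several places, a simple cross-check can catch mismatches before the model is imported. The sketch below is a hedged approximation, not the actual OLI import check, and all IDs in it are hypothetical:

```python
def check_skill_ids(skill_defs, lo_rows, problem_rows):
    """Return error strings for skill IDs referenced in the LOs or Problems
    worksheets but not defined in the Skills worksheet, and for learning
    objectives with no skills (each LO requires at least one)."""
    defined = set(skill_defs)
    errors = []
    for lo_id, skills in lo_rows.items():
        if not skills:
            errors.append(f"LO '{lo_id}' has no skills (at least one is required)")
        for s in skills:
            if s not in defined:
                errors.append(f"LO '{lo_id}' references unknown skill '{s}'")
    for key, skills in problem_rows.items():
        for s in skills:
            if s not in defined:
                errors.append(f"practice {key} references unknown skill '{s}'")
    return errors

# Hypothetical model fragments: the misspelled/undefined skill ID is reported.
errors = check_skill_ids(
    skill_defs={"identify_heart": "Identify the location of the heart."},
    lo_rows={"label_circulatory_system": ["identify_heart", "identify_arteries"]},
    problem_rows={("circ_quiz1", "q1", None): ["identify_heart"]},
)
```

Running a check like this after each editing session keeps ID drift from accumulating until import time.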
Appendix B
Example Questions for Skills with Changed vs. Unchanged Gamma Values
Kay Christensen

Gamma Values Changed

The following questions from OLI Statistics Spring 2011 course material exemplify skills (listed in bold) for which gamma0 and/or gamma1 values changed from those in the naïve model after model tuning.

Estimating r (estimater): gamma0 = 0.85, gamma1 = 1.00
One “Learn By Doing” question, 3 answer choices:
Unit 1, Mod 2, Linear Relationships 1, Q1
Then 5 consecutive “Did I Get This” questions, followed by 6 additional “Did I Get This” questions. The answer choices remain the same, so the chance of estimating the correct r-value on the first attempt should increase as the student progresses through the questions and receives feedback.
Unit 1, Mod 2, Linear Relationships 3, Q1
Unit 1, Mod 2, Linear Relationships 3, Q2
Unit 1, Mod 2, Linear Relationships 3, Q3
Unit 1, Mod 2, Linear Relationships 3, Q4
Unit 1, Mod 2, Linear Relationships 3, Q5
Unit 1, Mod 2, Linear Relationships 3, Extra Problems Q1
Unit 2, Mod 2, Checkpoint 2, Q3
Unit 2, Mod 2, Checkpoint 2, Q4
Interpreting Boxplot (interpboxplot): gamma0 = 0.35, gamma1 = 0.50
3 “Did I Get This” questions associated with the same boxplot, followed by 3 extra “Did I Get This” questions associated with the same boxplot.
Unit 1, Mod 1, Boxplot 2, Q1-3
Unit 1, Mod 1, Boxplot 2, Extra Problems Q1-3
Interpreting Categorical Display (interpcatchart): gamma0 = 0.20, gamma1 = 0.50
Two “Learn By Doing” questions with various answer-choice structures: 4-option multiple choice, 4 drop-downs, and 3-option multiple choice.
Unit 1, Mod 1, One Categorical Variable 2, Q1
Unit 1, Mod 1, One Categorical Variable 2, Q2
Unit 1, Mod 1, One Categorical Variable 3, Q1
Unit 1, Mod 1, Checkpoint 2, Q5
Unit 1, Mod 1, Checkpoint 2, Q6
Mean vs. Median (meanvsmedian): gamma0 = 0.50, gamma1 = 0.50
Three consecutive 3-option multiple-choice questions.
Unit 1, Mod 1, Measures of Center 2, Q1
Unit 1, Mod 1, Measures of Center 2, Q2
Unit 1, Mod 1, Measures of Center 2, Q3
Survey Design (surveydesign): gamma0 = 0.30, gamma1 = 0.50
Four consecutive “Did I Get This” questions with 3-option multiple-choice answers.
Unit 2, Mod 2, Sample Surveys 2, Q1
Unit 2, Mod 2, Sample Surveys 2, Q2
Unit 2, Mod 2, Sample Surveys 2, Q3
Unit 2, Mod 2, Sample Surveys 2, Q4
Unit 3, Mod 1, Checkpoint 1, Q1
Unit 3, Mod 1, Checkpoint 1, Q5
Unit 3, Mod 2, Checkpoint 2, Q1
Unit 3, Mod 2, Checkpoint 2, Q2
Unit 3, Mod 2, Checkpoint 2, Q3
Unit 3, Mod 2, Checkpoint 2, Q4
Gamma Values Unchanged

The following questions from OLI Statistics Spring 2011 course material exemplify skills (listed in bold) for which gamma0 and gamma1 values did not change from those in the naïve model after model tuning.

Binomial Parameters (binparams)
Only 2 questions, one of which is a checkpoint question.
Unit 3, Mod 3, Binomial Random Variables 1, Q5
Definition of Conditional Probability (condprobdef)
Four- and three-option multiple-choice questions, all regarding the same few graphs.
Unit 3, Mod 7, Conditional Probability 2, Q1-4
Unit 3, Mod 7, Independence 1, Q1-3
Unit 3, Mod 7, Independence 2, Q2-3
Unit 4, Mod 2, Checkpoint 1, Q2
Unit 4, Mod 2, Checkpoint 1, Q3
Unit 4, Mod 2, Checkpoint 1, Q4
Unit 4, Mod 2, Checkpoint 2, Q3
Unit 4, Mod 2, Checkpoint 2, Q4
Using Distribution to Find Probability (disttofindprob)
Unit 3, Mod 8, Probability Distribution 5, LBD Q1
Unit 3, Mod 8, Probability Distribution 5, DIGT Q1-3
Unit 4, Mod 2, Checkpoint 2, Q1
Interpreting t-test
Unit 4, Mod 12, Hypothesis Testing for the Population Proportion 4, Q1-4
Unit 4, Mod 13, Two Independent Samples 3, Q1
Unit 5, Mod 2, ANOVA Checkpoint 1, Q4
Unit 5, Mod 2, Checkpoint 1, Q4
Unit 5, Mod 2, Checkpoint 1, Q5
Unit 5, Mod 2, Checkpoint 1, Q8
Unit 5, Mod 2, Checkpoint 1, Q9
Computing z-score (zscorecompute)
Unit 3, Mod 8, Introduction to Normal Random Variables 3, LBD Q1
Unit 3, Mod 8, Introduction to Normal Random Variables 3, DIGT Q1
Unit 4, Mod 12, Hypothesis Testing for the Population Mean 2, DIGT4 Q1
Unit 4, Mod 3, Checkpoint 4, Q3