Presentation of a short paper to the CSPRED (computer supported peer review) workshop at ITS 2010.
Full title:
Democratizing Authoring through Computer-Supported Review
A proposal for two peer cycles
4.
- Small groups with much effort per person
- Distributed development more scalable
- Students can learn through critical use
- Domain experts have blind spots
- Domain novices have myopia
- Goal: use computing to match strengths and weaknesses
8. Issues to address
- Management issue: keeping track of multiple distinct artifacts in a taxonomy, along with their relationships, versions, sources, utility and use
- Author vulnerability issue: authors may use their contributions in other contexts (such as their exams) and can't risk them being revealed to learners
- Student vulnerability issue: students may be misinformed by poor contributions or develop false confidence through seeing question answers or solutions prematurely
14. Conclusion
- Linking separated peer communities could lead to leveraging strengths and attenuating weaknesses
- Opportunity for exploring computer supports
- Thanks to Ilya Goldin for the conversation that prompted this work
- Support from the PIER graduate training grant from the Department of Education (#R305B040063)
- Updates at http://OpenEducationResearch.org
Editor's Notes
Hi. I’m Turadg Aleahmad and my office is across the way there, at the Human Computer Interaction Institute. I’ve been trying to design systems for crowd-sourcing educational materials at a large scale.
Why would we want to crowd-source educational materials? A major motivation is personalizing the learning experience. ITS systems do this, in certain limited ways, and they’re very expensive to produce. They’re also built by small, highly coordinated groups of skilled experts, which makes them hard to scale. Can we distribute the work to scale for everyone? And while we’re at it, can we better engage learners in a knowledge community? Maybe help their metacognitive skills? But who should make what? Well, experts can have blind spots, forgetting just how clueless they were as learners. And learners don’t know what they don’t know. So what can we design to help this situation?
Wikipedia is the poster child for democratized authoring, but it doesn’t work very well for pedagogy. We can examine it to illuminate some of the issues to consider. In Wikipedia each article is canonical: there is one text for each topic, and disputes are resolved through referencing real sources. If you put non-reference material into Wikipedia, you get a note to move it to Wikiversity or Wikibooks.
The paucity of content on the Wikiversity site suggests there may be a mismatch.
Wikibooks actually has some pedagogical content, but it still suffers from some key issues. There’s no room for practice or assessment; there’s no doing in Wikipedia at all, and the lack of interactions for practice limits the types of materials. There’s no family history (like “this item started as a copy of this other one”) and no way to talk about specific versions. Everything is open to everyone. That works great in a lot of cases, but it prevents contributions of sensitive materials. Students can see everything authors can, including contributions that can not only waste their time but lead to misconceptions or harmful confusion. Students can read through shallowly and think they’ve learned when they haven’t; there’s no assessment or real check on their understanding.
I categorize these into three issues. [read them] How can we address these? Workshops are an opportunity to explore new ideas, so I’m going to propose a new model for “peer review”.
To address these threats to the author and student constituencies, let’s separate them. Here’s how that might work in, say, a corpus of worked example problems. Suppose an author creates version 1 of a worked example, writing a problem statement and solution. Another author thinks it’s pretty good and rates it as ready, after correcting some formatting. Then it’s presented to a student who gets through it but asks for a better explanation of one of the steps. Another student says it’s a good problem but sounds really contrived. A third student just does it without comment. Another student proposes an edit to make the problem relevant to something they’ve been learning in another subject. The author gets all the feedback and the proposed edit, reflects on it, and creates version 3 to accommodate it. Another author, who is more experienced, takes a look now and gives feedback on these changes, pointing to some research on student misconceptions on this skill. Another author tries improving it by acting on that feedback to make version 4. Here the students have learned the domain content and only gotten materials that were productive for them. They didn’t have to do any review, but they were permitted to give feedback if they wanted. The authors know the domain, but they’ve learned pedagogical skills and pedagogical content knowledge from their peers and from the feedback from students.
Here’s the flow from the perspective of a particular contribution. First it’s generated by authors; then authors and computer support qualify it for use by students. Students learn from the material and can give feedback to the authors. Authors combine this feedback, and perhaps student interaction data, to improve the resource and continue the cycle.
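To make that cycle concrete, here is a minimal sketch in Python of how a contribution might be represented as it moves from authoring, through qualification, to use by students and revision. The class, field, and method names are illustrative assumptions, not part of the actual system.

```python
# A minimal sketch of the contribution cycle described above.
# All class, field, and method names are illustrative assumptions.
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    AUTHORED = auto()    # generated by an author
    QUALIFIED = auto()   # rated ready by author peers and/or the rating model
    IN_USE = auto()      # presented to students for practice


@dataclass
class Contribution:
    content: str
    version: int = 1
    status: Status = Status.AUTHORED
    feedback: list[str] = field(default_factory=list)

    def qualify(self) -> None:
        """Author peers (or computer support) mark the item ready for students."""
        self.status = Status.QUALIFIED

    def add_student_feedback(self, comment: str) -> None:
        """Students may optionally leave comments, which are routed to authors."""
        self.feedback.append(comment)

    def revise(self, new_content: str) -> "Contribution":
        """An author folds the feedback into a new version, restarting the cycle."""
        return Contribution(content=new_content, version=self.version + 1)
```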
To help keep up the quality of contributions, authors should demonstrate requisite content knowledge in a domain (e.g. algebra) or subdomain (e.g. quadratic equations). They don’t need to demonstrate pedagogical knowledge or pedagogical content knowledge (as defined by Shulman [6]) because that’s what the system will help the author learn. Students will learn the content knowledge but, unlike in common “democratic” peer production without such role distinctions, will be spared the pedagogical knowledge and pedagogical content knowledge. Thus the “peer review” among the authors benefits the students, without requiring students to reciprocate by spending cognitive load on the skills of pedagogy.
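As a rough sketch of how that role distinction could be enforced, the hypothetical checks below accept contributions only from authors who have demonstrated content knowledge in the relevant subdomain and show students only items already qualified for use; the function names and the string statuses are assumptions.

```python
# A hedged sketch of the role distinction above; names are illustrative only.
def can_author(demonstrated_subdomains: set[str], subdomain: str) -> bool:
    """Accept contributions only in subdomains (e.g. 'quadratic equations')
    where the author has demonstrated content knowledge; no pedagogical
    knowledge is required, since the system helps authors learn that."""
    return subdomain in demonstrated_subdomains


def visible_to_student(item_status: str) -> bool:
    """Students see only qualified materials, sparing them unvetted pedagogy."""
    return item_status in ("QUALIFIED", "IN_USE")
```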
In a poster yesterday at Educational Data Mining I presented a model for automatically rating worked math solutions that works about as well as human experts. This can help address the student vulnerability issue by automatically withholding contributions of insufficient quality from reaching students. It can also allow good contributions to be used immediately, before any author peers have taken the time to rate them. Once items reach students, they can learn by practicing with them and optionally add questions or feedback for the authors, or helpful comments for their peers. These comments should lead to better contributions on the author side and thus better future learning for students. For authors, the model can provide immediate feedback on their contributions. In building the automated rating model I found that the predictors of quality fell into four categories: attention to the instructions of the task (e.g. removing boilerplate text), commitment to the community (e.g. revealing their email address), effort (e.g. length, number of edits), and use of domain-specific terms. The data covers only worked example math solutions for the Pythagorean Theorem, but these features are more general and could be expected to transfer to other skills and material types. Each of these can be quickly calculated and translated into feedback to the author, e.g. “Rating: Fair. To improve this please add more details in the steps.” Providing this feedback could allow authors to game the system, but there’s not much incentive to do so. Additionally, the automatic rating model is trained on human ratings, so as the authors add their ratings to items the system would update its model.
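The sketch below illustrates the kind of feature extraction and feedback message generation described above. The four feature categories come from the text; the specific heuristics, thresholds, and function names are assumptions, not the actual model.

```python
# A hedged sketch of feature extraction for the four predictor categories
# and of turning a predicted rating into author feedback. Heuristics and
# the boilerplate placeholder string are illustrative assumptions.
def rating_features(text: str, num_edits: int, email_revealed: bool,
                    domain_terms: set[str]) -> dict[str, float]:
    """Compute rough indicators in the four predictor categories."""
    words = text.lower().split()
    return {
        # Attention to the task instructions, e.g. boilerplate text removed
        # ("[insert solution here]" is a hypothetical template placeholder)
        "boilerplate_removed": float("[insert solution here]" not in text.lower()),
        # Commitment to the community, e.g. author revealed an email address
        "email_revealed": float(email_revealed),
        # Effort, e.g. length of the contribution and number of edits
        "length": float(len(words)),
        "num_edits": float(num_edits),
        # Use of domain-specific terms, e.g. "hypotenuse", "square root"
        "domain_term_count": float(sum(w in domain_terms for w in words)),
    }


def feedback_message(predicted_rating: float) -> str:
    """Translate a predicted rating into immediate feedback for the author."""
    if predicted_rating < 0.5:
        return "Rating: Fair. To improve this please add more details in the steps."
    return "Rating: Good. Ready for review by your peers."
```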
A caveat to the automated evaluation is that it takes some training data. To see how performance is affected by the number of instances, I trained on random subsets from the corpus of sizes 8, 16, 32, 64, 128 and 250. Twenty subsets of each size were created by random sampling with replacement. Performance increased dramatically from sample sizes of 8 (mean r = .21) to 32 (mean r = .54) instances. At about 64 instances you can expect a correlation of r = 0.6, which should be worthwhile. (In particular, the model is better at telling bad from good than good from great.)
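For readers who want to reproduce this kind of learning-curve check, here is a minimal sketch under the stated setup (random subsets sampled with replacement, correlation of predictions with held-out human ratings). The model choice and data loading are placeholders, not the actual experiment code.

```python
# A minimal learning-curve sketch: train a stand-in rating model on random
# subsets of increasing size and correlate its predictions with held-out
# human ratings. X is a feature matrix, y the human ratings (both assumed).
import numpy as np
from sklearn.linear_model import LinearRegression


def learning_curve(X, y, sizes=(8, 16, 32, 64, 128, 250), n_subsets=20, seed=0):
    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        rs = []
        for _ in range(n_subsets):
            idx = rng.choice(len(X), size=n, replace=True)   # sample with replacement
            test = np.setdiff1d(np.arange(len(X)), idx)      # hold out the rest
            model = LinearRegression().fit(X[idx], y[idx])
            pred = model.predict(X[test])
            rs.append(np.corrcoef(pred, y[test])[0, 1])      # agreement with human ratings
        results[n] = float(np.mean(rs))                      # e.g. ~.21 at n=8, ~.54 at n=32
    return results
```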