These slides are for a presentation at ISSOTL 2016 (the International Society for the Scholarship of Teaching and Learning). Please direct any questions to me via the contact information provided at the end!
Christina Hendricks, Professor of Teaching at the University of British Columbia-Vancouver
Dose-response curve for peer feedback on writing: A pilot study
1. Tracking a Dose-Response Curve for Peer Feedback on Writing: A Pilot Study
PI: Christina Hendricks
Co-PI: Jeremy Biesanz
University of British Columbia-Vancouver
Funded by the UBC Institute for the Scholarship of Teaching and Learning SoTL Seed Fund
ISSOTL, October 2016
Slides licensed CC-BY 4.0
2. Literature on peer feedback
• Receiving peer feedback improves writing (Paulus, 1999; Cho & Schunn, 2007; Cho & MacArthur, 2010; Crossman & Kite, 2012)
• Giving peer feedback improves writing (Cho & Cho, 2011; Li, Liu & Steckelberg, 2010)
3. GAPS:
Most studies look at revisions to a single essay, not changes across different essays.
[Diagram: peer feedback (PFB) between successive drafts of one essay (Draft 1 → Draft 2 → Draft 3), contrasted with peer feedback between different essays (Essay 1, Essay 2, Essay 3, Essay 4, … Essay n)]
Few studies look at a “dose-response curve”.
4. Pilot study research questions
1. How do students use peer comments given and received for improving different essays, rather than drafts of the same essay?
2. Are students more likely to use peer comments given and received for improving their writing after more than one or two peer feedback sessions? How many sessions are optimal?
3. Does the quality of peer comments improve over time?
5. Arts One (http://artsone.arts.ubc.ca)
• Interdisciplinary, full-year course for first-years
• 18 credits (English, History, Philosophy)
• Students write 10-12 essays (1500-2000 words)
• Peer feedback tutorials every week (4 students)
[Images: Toni Morrison (Wikimedia Commons, licensed CC BY-SA 2.0); Osamu Tezuka (public domain, Wikimedia Commons); Jane Austen (public domain, Wikimedia Commons); Friedrich Nietzsche (public domain, Wikimedia Commons)]
6. Data for pilot study, 2013-2014
• 10 essays by 12 participants (n=120)
• Comments by 3 peers on essays (n=1218)
• Comments by instructor (n=3291)
• All coded with the same rubric
7. Coding rubric
Categories (plus subcategories, for 11 options):
• Strength of argument
• Organization
• Insight
• Style & Mechanics
Numerical value:
• 1: Significant problem
• 2: Moderate problem
• 3: Positive comment/praise
E.g., STREV 2: could use more textual evidence to support your claims
Change for future
8. Inter-coder reliability
Student comments (n=141):
• Fleiss' Kappa: all categories 0.61 (moderate); most-used categories 0.8 (excellent)
• Intra-class correlation: 0.96 (excellent)
Essays (n=120): 0.71 (adequate)
3 coders:
• Daniel Munro & Kosta Prodanovic (undergrads, former Arts One students)
• Jessica Stewart (author, editor)
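As a hedged illustration of how these two reliability statistics can be computed (not the study's actual analysis code; the ratings below are fabricated toy data, and the use of statsmodels and pingouin is my assumption, not the study's):

import numpy as np
import pandas as pd
import pingouin as pg
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Toy data: 6 comments, each coded 1-3 by the 3 coders.
ratings = np.array([
    [2, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [3, 3, 2],
    [2, 2, 2],
    [3, 3, 3],
])

# Fleiss' kappa: chance-corrected agreement on the categorical codes
# (see the notes at the end of these slides for interpretation).
table, _ = aggregate_raters(ratings)  # rows = comments, columns = counts per code
print("Fleiss' kappa:", fleiss_kappa(table))

# Intra-class correlation: reliability of the numeric ratings across coders.
long = pd.DataFrame({
    "item": np.repeat(range(len(ratings)), ratings.shape[1]),
    "rater": list(range(ratings.shape[1])) * len(ratings),
    "score": ratings.ravel(),
})
print(pg.intraclass_corr(data=long, targets="item", raters="rater", ratings="score"))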
17. Cross-lagged panel design with auto-regressive structure
[Diagram: boxes for Essay Quality at Time 1 and Time 2 and for Comments at Time 1 and Time 2, continuing … N; the paths between them are labelled A-E]
Looking at time 1 to time 2, then time 2 to time 3… one single time lag.
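As a minimal sketch of what a single-lag cross-lagged regression with an autoregressive term looks like in practice (the slides do not say what software the study used; the statsmodels approach, the column names, and the toy data below are all my assumptions):

import pandas as pd
import statsmodels.formula.api as smf

# Invented toy data: one row per student, with essay quality and the number of
# peer comments of a given code at two adjacent time points.
df = pd.DataFrame({
    "quality_t1":  [2.4, 2.8, 2.1, 3.0, 2.6, 2.2],
    "quality_t2":  [2.6, 2.9, 2.3, 3.1, 2.5, 2.4],
    "comments_t1": [3, 1, 4, 0, 2, 3],
    "comments_t2": [2, 1, 3, 1, 2, 2],
})

# Cross-lagged path: do comments at time 1 predict essay quality at time 2,
# over and above the autoregressive path (quality at time 1)?
cross = smf.ols("quality_t2 ~ quality_t1 + comments_t1", data=df).fit()
print(cross.params)

# The mirror-image path: does essay quality at time 1 predict comments at
# time 2, controlling for comments at time 1?
mirror = smf.ols("comments_t2 ~ comments_t1 + quality_t1", data=df).fit()
print(mirror.params)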
18. Path A: Student comments
[Same cross-lagged diagram, with Path A highlighted]
Significant relationships:
• Ratings of 2 in Insight (-.53*)
• Ratings of 3 in Organization (.13*)
*p < .05, **p < .01, ***p < .001, ****p < .0001, *****p < .00001
19. Path A: Instructor comments
[Same cross-lagged diagram, with Path A highlighted]
Significant relationships:
• Ratings of 1 in Strength (-.12*) & Organization (-.23**)
• Ratings of 2 in Strength (-.06*) & Style (-.08*)
• Ratings of 3 in Strength (.11*), Insight (.35*), Style (.15*)
*p < .05, **p < .01, ***p < .001, ****p < .0001, *****p < .00001
20. Path C: Student comments
[Same cross-lagged diagram, with Path C highlighted]
Significant relationships:
• Comments rated 2 in Strength (.22*) & Style (.33**)
• Comments rated 3 in Style (.31*)
*p < .05, **p < .01, ***p < .001, ****p < .0001, *****p < .00001
21. Path C: Instructor comments
[Same cross-lagged diagram, with Path C highlighted]
Significant effects:
• Ratings of 3 in Strength (.34**) and Style (.30**)
*p < .05, **p < .01, ***p < .001, ****p < .0001, *****p < .00001
22. Path D: Student & instructor comments
[Same cross-lagged diagram, with Path D highlighted]
Significant relationship ONLY when student & instructor comments are combined, and only for comments rated 1 (all categories combined): (.05, p = .06)
23. Path D: Two time lags
[Diagram: Comments at Times 1 and 2 feeding into Essay Quality at Time 3, continuing … N]
No significant relationships between comments at times 1 and 2 and essay quality at time 3, for any comments or categories.
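Extending the same hypothetical sketch to this two-lag version of Path D (again invented toy data and column names, not the study's code):

import pandas as pd
import statsmodels.formula.api as smf

# Invented toy data, now spanning three time points.
df = pd.DataFrame({
    "quality_t2":  [2.6, 2.9, 2.3, 3.1, 2.5, 2.4],
    "quality_t3":  [2.7, 2.8, 2.5, 3.0, 2.6, 2.5],
    "comments_t1": [3, 1, 4, 0, 2, 3],
    "comments_t2": [2, 1, 3, 1, 2, 2],
})

# Two-lag Path D: do comments at times 1 and 2 jointly predict essay quality
# at time 3, controlling for quality at time 2?
two_lag = smf.ols("quality_t3 ~ quality_t2 + comments_t1 + comments_t2", data=df).fit()
print(two_lag.params)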
24. Research question 1
How do students use peer comments given and received for improving different essays, rather than drafts of the same essay?
o Very little significant evidence of relationships in Path D
o No difference between comments given & received
25. Research question 2
Are students more likely to use peer comments given and received for improving their writing after more than one or two peer feedback sessions? How many sessions are optimal?
o No evidence that there is any change over time in Path D
o No difference between comments given or received
26. Research question 3
Does the quality of peer comments improve over time?
o No evidence of change over time in Path A
[Same cross-lagged diagram as above]
27. Some conclusions
Pilot study: feasible for a larger sample? Yes, if:
o instructors code essay quality rather than separate coders
o comments can be collected easily
28. References
• Cho, K., & MacArthur, C. (2010). Student revision with peer and expert reviewing. Learning and Instruction, 20, 328-338.
• Cho, Y. H., & Cho, K. (2011). Peer reviewers learn from giving comments. Instructional Science, 39, 629-643.
• Cho, K., & Schunn, C. D. (2007). Scaffolded writing and rewriting in the discipline: A web-based reciprocal peer review system. Computers & Education, 48, 409-426.
• Crossman, J. M., & Kite, S. L. (2012). Facilitating improved writing among students through directed peer review. Active Learning in Higher Education, 13, 219-229.
• Li, L., Liu, X., & Steckelberg, A. L. (2010). Assessor or assessee: How student learning improves by giving and receiving peer feedback. British Journal of Educational Technology, 41(3), 525-536.
• Paulus, T. M. (1999). The effect of peer and teacher feedback on student writing. Journal of Second Language Writing, 8, 265-289.
29. Thank you!
Christina Hendricks
University of British Columbia-Vancouver
Website: http://blogs.ubc.ca/christinahendricks
Blog: http://blogs.ubc.ca/chendricks
Twitter: @clhendricksbc
Slides licensed CC-BY 4.0
Editor's Notes
Number of “1” comments total: 239 out of over 4000
1’s by students: 35
1’s by instructor: 204
How much agreement do we observe relative to how much we would expect to see by chance?
-- takes into account the frequency of each type of code occurring in the data
-- some codes are more frequent, so you'd expect those to have more apparent agreement
-- Kappa ranges from -1 to +1: 0 is the amount of agreement we'd expect by chance; -1 is complete disagreement; 0.6 is moderate agreement; 0.8 is substantial
-- Kappa includes just the category
-- Many of the most-used categories have agreement in the 0.8 range
Reliability on degree: intra-class correlation (ICC) of 0.96
-- to what extent is the average across the three raters reliable? That is, how well does the average of the numbers each rater gave correlate with the average from everyone who could possibly do this rating? At 0.96, we get no benefit from adding more raters.
-- the average rating is 2.5
-- 1's are pretty infrequent
-- people mostly agree on whether a comment is a 2 or a 3 (40% are 2's, 60% are 3's)
These numbers are linear trend over time, not autoregressive
Path A: the number of 2 comments for "insight" is related to a lower quality mark for insight; for every #2 comment in insight that students give, essay quality drops by 0.53 on the quality scale.
What this says, basically, is that the coders' ratings of essay quality are pretty similar to the instructor's comments on essay quality, in these categories at least. So the instructor's comments are tracking instructor ratings of quality, and that's pretty similar to coder ratings of quality.
Path C: for #2 comments on style and strength, there is a significant relationship: students are likely to get more of those comments in these categories on the second essay.
This could just be saying that students tend to give the same sorts of comments to the same people, but also that things aren't changing that much from one essay to another.
But see notes: there are some significant effects in C in instructor comments of 3 in strength and style.
I think the above numbers are actually for Path B, not Path C.
If the relationship is positive (b = .06, not negative), then your paper improves the next time: the more number-1 comments you have, the better your score is on the next essay.