TRACKING A DOSE-RESPONSE 
CURVE IN PEER FEEDBACK ON 
WRITING: A WORK IN PROGRESS 
BC Campus Symposium, Nov. 14, 2014 
Christina Hendricks 
Sr. Instructor, Philosophy 
University of British Columbia-Vancouver 
Slides licensed CC-BY 4.0
My first SoTL project! 
• Co-investigator: Dr. Jeremy Biesanz, 
Psychology, UBC-Vancouver 
• Funding: SoTL Seed Fund, Institute for the 
Scholarship of Teaching and Learning, UBC 
• This is a work in progress: we are currently 
analyzing data from the pilot study (2013-2014) and 
looking for feedback before a larger study (2015-2016)
SoTL literature on peer feedback 
• Peer feedback improves writing (Paulus, 1999; 
Cho & Schunn, 2007; Cho & MacArthur, 2010; Crossman 
& Kite, 2012) 
• Writing improves from giving peer feedback 
(Cho & Cho, 2011; Li, Liu & Steckelberg, 2010) 
• Gaps in the literature: 
• More peer feedback sessions -> increased 
implementation of feedback in writing? 
• Do comments on one paper transfer to later papers 
(rather than just revisions of same paper)?
http://artsone.arts.ubc.ca 
Interdisciplinary, team-taught, full year course 
for first-year students; 18 credits (6 each in first-year 
English, History, Philosophy) 
Writing intensive: Students write 10-12 essays 
(approx. 1500-2000 words each) 
Weekly structure: 
• Lecture once per week (100 students) 
• Seminars twice per week (20 students) 
• Tutorials once per week (4 students plus instructor; 
instructor does 5 of these per week)
Research questions 
1. Do later essays improve on the dimensions in feedback 
received as well as those in feedback given by students? 
One more than the other? 
2. Do later essays improve on the dimensions in peer 
feedback given and/or received even when instructor 
comments don’t agree with these? 
3. Are students more likely to implement peer comments for 
later essays after a few sessions or do they do so right 
away? 
4. Does the quality of peer comments improve over time (as 
compared to instructor comments and/or raters’ 
evaluation of essays)?
Data 
• 10 essays by each participant (13 in pilot 
study 2013-2014) 
• Comments by each student in a small group 
(4 students) on peers’ essays (at least 2 per 
essay) 
• Comments by instructor on each essay 
• All essays and comments coded according to 
a common rubric
Rubric 
4 categories: 
• Strength of argument 
• Organization 
• Insight 
• Style/mechanics 
Subcategories in each, plus degree (1-3) 
-- total of 11 subcategories
Complications, difficulties 
• Gathering written comments by peers 
• 2013-2014: wiki 
• 2014-2015: piloting sidebar comments on 
website 
• Tutorial discussions each week 
• Written comments sometimes given before the 
tutorial discussion, sometimes after 
• How to incorporate oral discussion of essays?
Where we are right now 
• Research assistants (UBC undergrads): 
• Jessica Wallace (Psychology, author/editor) 
• Daniel Munro (Philosophy, former Arts One) 
• Kosta Prodanovic (English, former Arts One) 
• Refined coding rubric: added, subtracted, 
condensed dimensions according to peer comments 
• Split student comments into single meaning 
units 
• Achieved inter-coder reliability on student 
comments
Inter-coder reliability on student comments 
(approx. 2000 total; percentages for 242 coded comments / the last 70 of those) 
• All 3 coders agree on degree (1-3), regardless of category: 90% / 87% 
• All 3 agree on category & final decision (after meeting) is the same: 56% / 67% 
• 2 or 3 agree on category & final decision is the same: 82% / 93% 
• 2 agree on category & final decision is different: 12% / 7%
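
A minimal sketch (not from the study; column names and toy data are hypothetical) of how raw agreement percentages like these can be tallied once each comment carries three coders' category and degree codes:

```python
import pandas as pd

# Hypothetical example: three coders' category and degree codes for each
# peer comment (column names and values are illustrative, not the study's data).
df = pd.DataFrame({
    "cat_a": ["argument", "style", "insight", "organization"],
    "cat_b": ["argument", "style", "organization", "organization"],
    "cat_c": ["argument", "insight", "organization", "style"],
    "deg_a": [3, 2, 2, 3],
    "deg_b": [3, 2, 2, 2],
    "deg_c": [3, 2, 3, 2],
})

# All three coders give the same degree (1-3), regardless of category.
all_degree_agree = (df["deg_a"].eq(df["deg_b"]) & df["deg_b"].eq(df["deg_c"])).mean()

# All three coders assign the same category.
all_cat_agree = (df["cat_a"].eq(df["cat_b"]) & df["cat_b"].eq(df["cat_c"])).mean()

# At least two of the three coders assign the same category.
two_or_more = (
    df["cat_a"].eq(df["cat_b"]) | df["cat_b"].eq(df["cat_c"]) | df["cat_a"].eq(df["cat_c"])
).mean()

print(f"3 agree on degree: {all_degree_agree:.0%}")
print(f"3 agree on category: {all_cat_agree:.0%}")
print(f"2+ agree on category: {two_or_more:.0%}")
```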
Inter-coder reliability on student 
comments: Fleiss’ Kappa 
• Average for 141 comments: 0.61 (moderate 
agreement) 
• For the most frequently used categories: 0.8 
(substantial agreement) 
• Agreement on degree (1-3), measured by intraclass 
correlation: 0.96 
Now: if two raters agree on category & degree, 
that's the final decision; otherwise the coders meet and 
discuss
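
A minimal sketch of how Fleiss' kappa can be computed with statsmodels, assuming a ratings matrix with one row per comment and one column per coder; the 11-category coding and the random data are illustrative, not the study's pipeline:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: one row per comment, one column per coder,
# values are rubric category codes (0-10 for 11 subcategories).
rng = np.random.default_rng(0)
ratings = rng.integers(0, 11, size=(141, 3))

# aggregate_raters turns (comments x coders) codes into the
# (comments x categories) count table that Fleiss' kappa expects.
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table)
print(f"Fleiss' kappa: {kappa:.2f}")
```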
Coding yet to be done 
• Instructor comments on all essays 
• To isolate comments given only by peers 
• To measure improvement in student comment 
quality over time 
• Coding essays on the categories and 
degrees on the rubric used for comments 
• To measure improvement in essays over time
Analyses to be done 
Cross-lagged panel design with auto-regressive 
structure 
[Diagram: essay quality (E1, E2, E3) and peer comments (E1, E2, E3) across time, with auto-regressive paths within each series and cross-lagged paths between them]
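
A full cross-lagged panel model is typically estimated as a structural equation model; the simplified sketch below, with hypothetical variable names and synthetic data, only illustrates the auto-regressive and cross-lagged structure using two lagged regressions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per student per essay, with coded
# essay quality and coded comment quality (all names and values illustrative).
rng = np.random.default_rng(1)
n_students, n_essays = 80, 10
df = pd.DataFrame({
    "student": np.repeat(np.arange(n_students), n_essays),
    "essay_num": np.tile(np.arange(1, n_essays + 1), n_students),
    "quality": rng.normal(70, 10, n_students * n_essays),
    "comments": rng.normal(2.5, 0.5, n_students * n_essays),
})

# Lag each series within student to get the previous essay's values.
df = df.sort_values(["student", "essay_num"])
df["quality_lag"] = df.groupby("student")["quality"].shift(1)
df["comments_lag"] = df.groupby("student")["comments"].shift(1)
lagged = df.dropna(subset=["quality_lag", "comments_lag"])

# Auto-regressive + cross-lagged paths as two regressions: does earlier
# comment quality predict later essay quality (and vice versa), over and
# above each variable's own earlier value?
quality_model = smf.ols("quality ~ quality_lag + comments_lag", data=lagged).fit()
comments_model = smf.ols("comments ~ comments_lag + quality_lag", data=lagged).fit()
print(quality_model.params)
print(comments_model.params)
```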
Timeline 
• April 2015: Finish coding all essays and 
comments 
• May-June 2015: Do statistical analyses to 
address research questions 
• July-August 2015: Refine the design for a 
larger study to start Sept. 2015, recruit 
other Arts One instructors to join the study
References 
• Cho, K., & MacArthur, C. (2010). Student revision with peer and 
expert reviewing. Learning and Instruction, 20, 328-338. 
• Cho, Y. H., & Cho, K. (2011). Peer reviewers learn from giving 
comments. Instructional Science, 39, 629-643. 
• Cho, K., & Schunn, C. D. (2007). Scaffolded writing and rewriting in 
the discipline: A web-based reciprocal peer review system. 
Computers & Education, 48, 409-426. 
• Crossman, J. M., & Kite, S. L. (2012). Facilitating improved writing 
among students through directed peer review. Active Learning in 
Higher Education, 13, 219-229. 
• Li, L., Liu, X., & Steckelberg, A. L. (2010). Assessor or assessee: How 
student learning improves by giving and receiving peer feedback. 
British Journal of Educational Technology, 41(3), 525-536. 
• Paulus, T. M. (1999). The effect of peer and teacher feedback on 
student writing. Journal of Second Language Writing, 8, 265-289.
Thank you! 
Christina Hendricks 
Website: 
http://blogs.ubc.ca/christinahendricks 
Blog: http://blogs.ubc.ca/chendricks 
Twitter: @clhendricksbc 
Slides available: 
Slides licensed CC-BY 4.0

Editor's Notes

  • #9 Too much to record and transcribe all tutorials. 2013-2014: asked students to pick two things they got out of tutorials that they think are important. 2014-2015: not doing this.
  • #10 Refined coding rubric: added, subtracted, and condensed dimensions according to student comments; added examples of each sub-dimension. Single meaning units: several comments could have been given more than one code, so each was split so that it carried a single code, which makes the analysis cleaner. Inter-coder reliability.
  • #11 Out of 242 peer comments: all 3 coders agree on value (1-3), regardless of dimension: 90%; 2 or 3 coders agree on dimension and the final decision (after meeting) is the same: 82%; 2 coders agree on dimension but the final decision is different: 12%; 3 coders agree on dimension and the final decision is the same: 56%. The second column of the table reports the same figures for just the last set of 70 comments.
  • #12 Fleiss' kappa asks how much agreement we observe relative to how much we would expect by chance. It takes into account how frequently each type of code occurs in the data: some codes are more frequent, so you would expect them to show more apparent agreement. Kappa ranges from -1 to +1, where 0 is the agreement expected by chance and -1 is complete disagreement; 0.6 counts as moderate agreement and 0.8 as substantial. Kappa here covers just the category, and many of the most frequently used categories have agreement in the 0.8 range. Reliability on degree is an intraclass correlation (ICC) of 0.96: to what extent is the average across the three raters reliable, i.e., how well does the average of the numbers each rater gave correlate with the average any other set of raters would give? We get no benefit from adding more raters. The average degree is 2.5; 1s are pretty infrequent, and raters mostly agree on whether a comment is a 2 or a 3 (about 40% are 2s, 60% are 3s).
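
For the average-rater ICC described in note #12, a minimal sketch assuming long-format ratings with one row per comment-coder pair; the pingouin library is one implementation chosen here for illustration, and the data are synthetic, not the study's:

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Hypothetical degree ratings (1-3) for each comment by three coders.
rng = np.random.default_rng(2)
n_comments = 141
base = rng.integers(2, 4, n_comments)            # mostly 2s and 3s, as in the notes
noise = rng.integers(-1, 2, n_comments * 3)      # occasional disagreement
degree = np.clip(np.repeat(base, 3) + noise, 1, 3)
df = pd.DataFrame({
    "comment": np.repeat(np.arange(n_comments), 3),
    "coder": np.tile(["A", "B", "C"], n_comments),
    "degree": degree,
})

# The ICC2k / ICC3k rows describe the reliability of the *average* across
# the k raters, which is the quantity the note describes (0.96 in the study).
icc = pg.intraclass_corr(data=df, targets="comment", raters="coder", ratings="degree")
print(icc[["Type", "ICC"]])
```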