Ofqual's reliability of results programme: presentation delivered by Dennis Opposs at the CIEA conference.


Ofqual's reliability of results programme

  1. Ofqual’s reliability of results programme. Dennis Opposs (Office of the Qualifications and Examinations Regulator)
  2. A health check <ul><li>“As the regulator of qualifications in England I believe that it is essential for all of us to understand better the reliability of assessments in our national systems […] I can, therefore, tell you today that Ofqual will undertake an in-depth programme of work – call it a health check – on the reliability of tests, examinations and teacher assessments, in this country.” </li></ul><ul><li>Kathleen Tattersall, 16 May 2008. Ofqual Launch Event, National Motorcycle Museum, Solihull. </li></ul>
  3. Myths of perfection? <ul><li>“There’s a broad expectation that assessment should be absolutely perfect and accurate, that a mark of 50 is a mark of 50, regardless of who marks, the time at which it is marked and so on. We need to explore whether that sort of expectation is well founded...” </li></ul><ul><li>Kathleen Tattersall, 14 May 2008. Interview with Tim Ross of PA. </li></ul>
  4. Educational assessment error <ul><li>What factors contribute to educational assessment error? </li></ul><ul><li>You … if you don’t present sufficient evidence </li></ul><ul><li>Instruments … if they’re poorly calibrated, or have design limitations </li></ul><ul><li>Your assessors … if they misinterpret the evidence before them </li></ul>
  5. Student misclassification <ul><li>“… it is likely that the proportion of students awarded a level higher or lower than they should be because of the unreliability of the tests is at least 30% at key stage 2” </li></ul><ul><li>Wiliam, D. (2001). Level best? London: ATL. </li></ul><ul><li>“Professors Black, Gardner and Wiliam argued […] that up to 30% of candidates in any public examination in the UK will receive the wrong level or grade” </li></ul><ul><li>House of Commons Children, Schools and Families Committee. (2008). Testing and Assessment. Third Report of Session 2007–08, Volume I. HC 169-I. London: TSO. </li></ul>
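The scale of misclassification claimed above follows from basic measurement theory: even a fairly reliable test places many candidates close enough to a grade boundary that random error tips them into the wrong band. This can be illustrated with a small stdlib-only simulation. It is a hypothetical sketch, not a study from the programme: the reliability value, the normal score distribution and the boundary positions are all assumptions chosen for illustration.

```python
import random

def misclassification_rate(reliability=0.9, n_students=100_000, seed=1):
    """Share of simulated students whose observed grade differs from the
    grade their true score would earn, on a seven-grade scale.

    Assumes true scores ~ N(0, 1) and observed = true + error, with the
    error variance chosen so that var(true) / var(observed) = reliability.
    """
    rng = random.Random(seed)
    error_sd = (1.0 / reliability - 1.0) ** 0.5  # var(error) = (1 - r) / r
    # Six equally spaced boundaries give seven grades.
    cuts = [-1.25, -0.75, -0.25, 0.25, 0.75, 1.25]
    grade = lambda score: sum(score > c for c in cuts)
    wrong = 0
    for _ in range(n_students):
        true = rng.gauss(0.0, 1.0)
        observed = true + rng.gauss(0.0, error_sd)
        if grade(observed) != grade(true):
            wrong += 1
    return wrong / n_students

print(f"misclassified: {misclassification_rate():.1%}")
```

Under these assumptions the simulated rate comfortably clears the "at least 30%" floor quoted above, and it falls as reliability rises: high reliability at the score level does not translate into high accuracy at the grade level, because misclassification is driven by the candidates sitting near boundaries.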
  6. Programme objectives <ul><li>To generate evidence of reliability of results from a number of major national tests, examinations and qualifications </li></ul><ul><li>To stimulate, capture and synthesise technical debate on the interpretation of reliability evidence generated from reliability studies </li></ul><ul><li>To investigate how results and the associated errors are reported and communicated </li></ul><ul><li>To explore public understanding of, and attitudes towards, assessment inconsistency </li></ul>
  7. Programme objectives <ul><li>To stimulate national debate on the significance of the reliability evidence generated by this programme and other studies </li></ul><ul><li>To help improve public understanding of the concept of reliability </li></ul><ul><li>To develop Ofqual policy on reliability of results </li></ul>
  8. Our programme of work <ul><li>Strand 1 </li></ul><ul><li>Generating evidence on reliability </li></ul><ul><li>Strand 2 </li></ul><ul><li>Interpreting and communicating evidence on reliability </li></ul><ul><li>Strand 3 </li></ul><ul><li>Exploring public understanding of reliability and developing Ofqual policy on reliability </li></ul>
  9. Strand 1 – Generating evidence <ul><li>Synthesising pre-existing evidence </li></ul><ul><li>Literature reviews </li></ul><ul><li>Generating new evidence </li></ul><ul><li>Monitoring existing practices </li></ul><ul><li>Experimental studies </li></ul>
  10. Specifications to support experimental studies <ul><li>Literature reviews: sources of unreliability; procedures used to produce reliability measures; how to report results and associated errors; how to interpret and evaluate reliability evidence </li></ul><ul><li>Partial estimates of reliability: estimating reliability in relative isolation </li></ul><ul><li>Overall estimates of reliability: all major factors considered </li></ul>
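One of the standard procedures that literature reviews of "reliability measures" routinely cover is the internal-consistency coefficient Cronbach's alpha. The sketch below is illustrative only, not a procedure specified by the programme, and the item data in the example are invented.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a score table: one row per candidate, one
    column per item.

    alpha = k/(k-1) * (1 - sum of item variances / variance of totals),
    using sample variances throughout.
    """
    k = len(scores[0])  # number of items
    def var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = sum(var([row[i] for row in scores]) for i in range(k))
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_vars / total_var)

# Two items that always agree: perfect internal consistency.
print(cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]))  # 1.0
```

Alpha is a "partial" estimate in the sense used above: it captures inconsistency between items on a single occasion, but says nothing about marker variability or occasion-to-occasion variation, which is why overall estimates must combine several such sources.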
  11. Our Technical Advisory Group
  12. Strand 2 – Interpreting and communicating evidence <ul><li>How do we interpret our findings? </li></ul><ul><li>How do we communicate our findings? </li></ul>
  13. Our communication challenge <ul><li>“Results on a six or seven point grading scale are accurate to about one grade either side of that awarded.” </li></ul><ul><li>Schools Council. (1980). Focus on examinations. Pamphlet 5. London: Schools Council. </li></ul>
  14. Strand 3 – Developing policy <ul><li>Exploring public understanding of, and attitudes towards, assessment error </li></ul><ul><li>Stimulating national debate on the significance of the reliability evidence generated by the programme </li></ul><ul><li>Developing Ofqual’s policy on reliability </li></ul>
  15. Some questions for discussion <ul><li>What are your experiences of exam error? </li></ul><ul><li>Would you be surprised if a substantial proportion of students were misclassified? </li></ul><ul><li>How much error is too much error? </li></ul><ul><li>Is it right to apportion blame when students are misclassified (if so, to whom and why)? </li></ul><ul><li>What about the different causes of error? </li></ul><ul><li>Does misclassification undermine the effective use of results (why, to what extent)? </li></ul>
