Assessment Assignment: Bath MA International Education


This is an assignment I completed for the Assessment unit of the University of Bath's MA in International Education programme.

It is shared here to allow me to embed it onto my professional reflective blog at

Downloads have been disabled.

Published in: Education

Stephen Taylor
Assessment

Survival of the Fittest for Purpose?
Exploring reliability and validity in criterion-related assessment of the IB Middle Years Programme sciences as it moves into the Next Chapter.

Stephen Taylor
MA International Education
University of Bath
(@IBiologyStephen)

This assignment was submitted as part of my MA coursework in February 2012. It is uploaded here (with permission) to be included as part of my professional development and reflective portfolio at

Introduction

The International Baccalaureate’s Middle Years Programme (MYP) is going through an exciting period of reinvention. Dubbed “MYP: The Next Chapter,” this programme overhaul will affect all MYP teachers, students and school leaders over the coming years (IB, 2011a). The Next Chapter breaks from the usual curriculum review cycle, which runs on a per-subject group basis, and will result in new subject guides, assessment criteria and practices being published for every subject simultaneously. The programme is due to be officially launched in 2014, and subject and assessment reviews and trials are currently ongoing in schools around the globe.

In this essay, I will explore key changes proposed under the Next Chapter and their implications, in terms of validity and reliability, for assessment of the sciences. I will attempt to evaluate these proposals and make recommendations for teachers and the IB on steps that may make for a smoother transition from principles into practice.

Structure and assessment of the IB Middle Years Programme

The IB MYP is a rapidly growing educational framework for middle school-aged students (11-16 years of age). With its roots as a ‘pre-IB’ programme in Africa in the 1980s, it has developed into a four or five-year programme, acting not only as a precursor to the Diploma Programme (its original intended purpose), but also as an interface with the Primary Years Programme (Nicolson & Hannah, 2010).
The holistic nature of the programme is intended to develop both concepts and skills in its learners, developing not only knowledge and understanding of the eight subject groups, but also allowing students to become versed in the learning skills required to be successful in the IB Diploma, university and beyond (Nicolson & Hannah, 2010).

Figure 1: The current MYP model. Taken from A History of the Middle Years Programme (Appendix) (Nicolson & Hannah, 2010)

The core of the MYP is similar in nature to that of the Diploma Programme, with the IB’s Learner Profile focusing on the desired attributes of learners. The five Areas of interaction form contexts for learning within
the curriculum. Community and service is analogous to the Creativity, action and service component of the Diploma Programme. Approaches to learning, another Area of interaction, highlights the development of study and research skills (IB, 2009) and also allows for some introduction to the Theory of knowledge component of the Diploma Programme (Nicolson & Hannah, 2010). A culminating, student-directed task, the Personal Project, aims to facilitate student exploration in a similar way to the Diploma Programme’s Extended Essay.

Growth and development in the MYP: Why the Next Chapter?

The number of schools running the MYP has grown from 407 in 2007 to 729 worldwide (IB, 2011a, p.4). This rapid growth in the programme could be due to a number of factors, such as a greater demand for international education in developed and developing nations and an increasing ‘brand recognition’ of the International Baccalaureate in the education sector. The International Baccalaureate Organisation has three regions. The Americas (IBA) encompasses the USA, Canada and South America and in recent years has been the fastest-growing market, with over 71% of IB schools running the MYP (IB, 2011a, p.6). Growth is slower but steady in the IB’s other two regions, Africa, Europe and the Middle East (IBAEM), and Asia-Pacific (IBAP).

Despite this growth in the MYP, the proportion of schools choosing to moderate their assessment is decreasing: from 38.08% (155/407 schools) of June-session schools registering candidates for moderation in 2007 to just 22.91% (167/729 schools) in 2011 (IB, 2011a).
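
The proportions quoted above follow directly from the school counts, and can be checked with a short calculation. The sketch below is illustrative only: the function name `moderation_share` is a hypothetical helper, and the only inputs are the counts quoted from IB (2011a).

```python
def moderation_share(moderated: int, total: int) -> float:
    """Percentage of schools registering candidates for moderation, to 2 d.p."""
    return round(100 * moderated / total, 2)

# School counts as quoted in the text from IB (2011a).
share_2007 = moderation_share(155, 407)
share_2011 = moderation_share(167, 729)

# Absolute change in the number of moderating schools between the two years.
extra_schools = 167 - 155

print(share_2007)     # 38.08
print(share_2011)     # 22.91
print(extra_schools)  # 12
```

Twelve more schools were moderated in 2011 than in 2007, yet the share of moderated schools fell by roughly fifteen percentage points because the total number of schools grew much faster.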
Although in real terms the 2011 figure represents a small increase in the number of schools choosing to have their assessments moderated, the falling proportion does raise questions about the reliability of the grades given to students in the majority of schools. As part of the five-year programme evaluation process, schools which do not have their grades formally moderated are required to submit some samples of assessed final-year work for monitoring, a version of moderation which provides feedback on assessment without affecting grades awarded (IB, 2010a, p.49).

This low uptake of moderation and potential loophole in quality control leaves the MYP in an interesting position in terms of reliability, recognition and competition. Globally it is growing and becoming the choice of international schools and local schools aiming to ‘internationalise’ their learning. The IB Diploma is a well-established programme
internationally, with a current tally of 2,313 schools offering the programme (IB, 2012). However, of these schools, just 212 offer the MYP preceding the Diploma Programme (IB, 2012). Of course, many of the DP-only schools will be similar to sixth-form colleges with an exclusively 16-19 student body, but there is still some shortfall relative to its leading competitor, the IGCSE.

Boasting over 9,000 schools enrolled internationally (CIE, 2011), the Cambridge International GCSE is often found as the ‘pre-IB’ qualification in international schools that offer the Diploma but not the MYP. The IGCSE is closely based on England’s GCSE, developed in 1988 as a broader style of assessment for Key Stage 4 in the UK than the incumbent O-Levels system (Bishop et al., 1999). Originally the GCSE, like the MYP, was intended to go beyond selection and summative assessment of content, to also “embrace the broader notion of assessment, which includes the following:
  • a system which tests a balance of knowledge, understanding and skills; this system employs different types of assessment within the courses of study which reflects a variety of styles of teaching and learning;
  • challenging the range of abilities of pupils at the end of key stage 4;
  • being relevant to everyday life.” (Bishop et al., 1999)

In their paper Users’ perceptions of the GCSE, Bishop, Black, Martin and Thompson (1999) conclude that “it must be recognized that the [GCSE] examination cannot perform concurrently all functions that users are claiming for it.” These sentiments could well apply to the MYP in its current form: philosophically sound and in tune with the needs of international education,
but with a wide range of goals, assessment methods and low moderation uptake, somewhat vulnerable in terms of validity and reliability. As Hayden and Thompson (2011, p.17) conclude: “[for some] …the absence of external examination leading to an externally-awarded certificate at age 16 is anathema.”

While discussions continue over UK schools moving away from the GCSE and over OFQUAL’s questions about standards following recent revisions (Morrison, 2009), the IB are working on their next incarnation of the MYP: The Next Chapter. Fundamentally, The Next Chapter involves more streamlined, structured and potentially more valid and reliable assessment of student learning.

One justification for the move to the Next Chapter and its associated modes of assessment is for the MYP to gain accreditation in many of the countries in which it is being implemented. In a recent email exchange, Malcolm Nicolson, the Head of MYP Programme Development, stated, “We will be looking at accreditation standards globally – so looking at USA, Australia, Canada, Netherlands, Germany, Japan and many others. The UK is one of the countries [we] will aim to satisfy.” In order to satisfy the UK, the MYP must adhere to the assessment principles laid out by OFQUAL, the same body which currently accredits the IGCSE, GCSE and A-Level qualifications, along with the IB’s own Diploma Programme.

Condition E4.2 of the OFQUAL document General Conditions of Recognition states that:

“…In designing such an assessment, an awarding organization must […] ensure that the assessment is: fit for purpose, […] allows each Learner to generate evidence which can be Authenticated, [and which] allows Assessors to be able to differentiate accurately and consistently between a range of attainments of Learners.” (OFQUAL, 2011, p.44)

Inherently, the recognition sought by the MYP is an issue of validity and reliability in assessment. Here we can try to evaluate validity and reliability in the current model of the MYP and look at some of the key proposals for change under the Next Chapter.

What makes for valid and reliable assessment?

When teaching my own science classes I often ask my students two questions when they are designing, carrying out and evaluating lab work and processing their results.
The first is “how do you know your method is allowing you to address your research question?” The second is “how do you know you can rely on your results?”

Moss et al. (2006) state that educational assessment “should be able to support [educators] in developing interpretations, decisions and actions that enhance students’ learning.” Validity “refers to the soundness of [those] decisions, interpretations or actions” (Moss et al., 2006). Wynne Harlen (2007) defines validity as “how well what is assessed corresponds with the behaviour or learning outcome that it is intended should be assessed; this is often referred to as construct validity.” Harlen clarifies that the “important requirement is that the
assessment concerns all aspects – and only those aspects – of students’ achievement relevant to a particular purpose” (Harlen, 2007).

Validity has been traditionally broken into three domains. Content validity “demonstrates how well the test samples the class situations or subject matter about which conclusions are to be drawn” (Moss et al., 2006). Criterion-related validity compares test scores with “one or more external variables considered to provide a direct measure of the characteristic or behavior in question” (Moss et al., 2006). Construct validity can be described as “a more indirect method of validation” (Moss et al., 2006). Harlen elucidates construct validity as being “based on an integration of any evidence that bears on the interpretation or meaning of the test scores – including content- and criterion-related evidence – which are thus subsumed as part of construct validity.” On the other hand, Messick (1995) describes construct validity as being “not a property of the test or assessment as such, but rather of the meaning of the test scores.” He goes further, arguing that construct validity can be broken into six sub-domains: “content, substantive, structural, generalizability, external, and consequential aspects of construct validity.”

In classroom practice and assessment in the MYP, we are most concerned about ‘what to assess’ and ‘how to assess’. A third fundamental aspect of evaluating the usefulness of an assessment model or tools is reliability.
This is described by Harlen (2007) in Criteria for evaluating systems for student assessment as being “the extent to which results are of acceptable consistency for a particular use,” or, more commonly, “the extent to which the assessment, if repeated, would give the same result.”

Harlen also makes the distinction between tools used for formative assessment and summative assessment. Formative assessment (assessment for learning) has the intended purpose “of helping learning and teaching.” Summative assessment information (assessment of learning) “is required for the purpose of keeping records of the progress of individual students, reporting to parents and students at regular intervals, passing information to other teachers on transfer from class to class or in guiding decisions about subjects for further study” (Harlen, 2007).

Formative assessment plays a vital role in the classroom, though in this case I will be looking at the assessment of the MYP models through the lens of final-year, summative assessment.

When evaluating assessment in the MYP, we should ask three key questions:
  • Does this mode of assessment allow us to assess the content we intend to assess – does it have content validity?
  • Does this mode of assessment allow us to assess the skills or attributes we intend to assess – does it have criterion-related validity?
  • Does this mode of assessment provide us with reliable and verifiable assessment data – is it reliable?

If we can answer these three questions in the affirmative, we could conclude that the assessment is indeed ‘fit for purpose’.

Assessment in the Middle Years Programme: methods and challenges

Assessment of student achievement in both the current MYP and the Next Chapter derives from shared foundations in educational assessment theory. These are well documented in the IB’s publication MYP: From principles into practice (2008), and include criterion-related assessment, the best-fit approach, the value of formative assessment, fitness for purpose of assessment tools, feedback and grade determination.
Assessment in the MYP has a set of aims, quoted below from MYP: From principles into practice (IB, 2008):

“Assessment in the MYP aims to:
  • support and encourage student learning by providing feedback on the learning process
  • inform, enhance and improve the teaching process
  • promote positive student attitudes towards learning
  • promote a deep understanding of subject content by supporting students in their inquiries set in real world contexts using the areas of interaction
  • promote the development of higher-order cognitive skills by providing rigorous final objectives that value these skills
  • reflect the international-mindedness of the programme by allowing for assessments to be set in a variety of cultural and linguistic contexts
  • support the holistic nature of the programme by including in its model principles that take account of the development of the whole student”
(IB, 2008, p.41)

These aims are supplemented with further subject-specific aims and objectives. Within each of the subjects there are multiple criteria, each with its own aims and objectives. Appendix 2 lists the aims and objectives of the sciences. With eight subject groups in total we can see that one obstacle to validity in MYP assessment lies in the sheer volume of content and objectives that are to be assessed. It is good practice to assess through ‘multiple measures’, with an IB stipulation of at least two data points per criterion per year (IB, 2010a, p.54). In reality, that plays out in schools as being two data points per reporting period (commonly a semester). There are six assessed criteria in the sciences, four in other subjects. As a consequence, students face a minimum of eight summative assessments per semester in some subjects and twelve in others. This is an incredible load on teachers and students and, at the high-school level, can push teachers into poor assessment practices – cramming content ‘in preparation for the Diploma’, assigning assessed tasks as homework or simply missing out valuable steps such as exploration, drafting and peer or self-assessment. As a result, the size of the MYP, paired with the significant backwash effect of Diploma Programme preparation, could be having a negative impact not just on content validity but likely also on reliability.

Grading and reporting

Overall grades on students’ progress in the eight academic subject groups are reported on a 1-7 scale. A full set of descriptors of these grades is included in Appendix 2.
These 1-7 scores are determined against a set of published grade boundaries for each of the eight subject groups. A best-fit approach is used to determine the score for each of the subject’s assessment criteria. These scores are then added up and grade boundaries are applied. The positioning of these boundaries is an example of norm-referencing to some extent in the MYP. It is a point at which an essentially descriptive, criterion-referenced system is used to produce a single numerical score – and in the sciences it does not quite add up. A student who scores 4 in all criteria falls one point short of an overall grade of 5 – the grade which best represents his achievement when the descriptors are compared to one another. This suggests an issue with criterion validity, but could be remedied with a normative decision to move the boundary.
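
The boundary effect described above can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions: the function name `overall_grade` and the table `GRADE_BOUNDARIES` are hypothetical, and only the split between a total of 24 (grade 4) and 25 (grade 5) is implied by the text. The remaining cut-offs are illustrative placeholders and should be checked against the IB's published boundaries before being relied upon.

```python
# Each of the six sciences criteria is marked 0-6, so totals run from 0 to 36.
# ASSUMPTION: only the 24/25 split between grades 4 and 5 comes from the text;
# every other cut-off below is an illustrative placeholder.
GRADE_BOUNDARIES = [  # (overall grade, minimum criterion total)
    (1, 0), (2, 6), (3, 12), (4, 19), (5, 25), (6, 29), (7, 33),
]

def overall_grade(criterion_levels):
    """Sum the criterion levels and apply the boundaries to get a 1-7 grade."""
    total = sum(criterion_levels)
    grade = 1
    for candidate, minimum in GRADE_BOUNDARIES:
        if total >= minimum:
            grade = candidate  # keep the highest grade whose minimum is met
    return total, grade

# A student awarded level 4 on all six criteria.
print(overall_grade([4, 4, 4, 4, 4, 4]))  # (24, 4)

# One extra level on a single criterion is enough to cross the boundary.
print(overall_grade([4, 4, 4, 4, 4, 5]))  # (25, 5)
```

Under these assumptions, moving the minimum for grade 5 from 25 to 24 is the kind of normative decision that would remove the mismatch.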

In the current model of the MYP, “all the work of students is internally assessed by teachers. There is no formal examination structure, no system of external assessment and the IB does not provide MYP exams.” (IB, 2010a, p.52). The MYP in its current guise is commonly described as a framework for teaching and assessment, and is not intended to be a curriculum or replacement for standardized testing. The MYP Coordinator’s Handbook (IB, 2010a) goes on to say that “external examinations provided by other organisations are unlikely to address the MYP subject-specific objectives” (IB, 2010a, p.52).

As mentioned before, the low uptake of formal moderation by schools raises concerns about the reliability of grades awarded in MYP assessment. It is a requirement that schools with multiple teachers per section moderate internally, though there is little in the way of quality control to ensure that this takes place until the school’s five-year evaluation visit.

Criterion-related assessment in the MYP

Assessment of student achievement in the eight subject areas and the personal project of the MYP is entirely criterion-related, using a best-fit approach (IB, 2008, p.40). This is derived from previous practice in criterion-referenced assessment. Although the two approaches are similar, and often confused by teachers and administrators, there are subtle differences between them. To fully understand the impacts of the Next Chapter and the continuing role of criterion-related assessment, we must first understand these key modes of assessment.
Norm-referenced assessment of student achievement does not overtly exist in the MYP and is not generally accepted practice in the MYP classroom. Norms are traditionally used to rank learners in terms of their perceived achievement in a test or assessment battery. Norm-referencing “places groups of students into predetermined bands of achievements. Students compete for limited numbers of grades within these bands which range between fail and excellence.” (Dunn et al., 2002) In its most traditional sense, norm-referencing measures students only against others and is not necessarily a good measure of content mastery (OConnor, 2011, pp.79-80). Norm-referenced grading is, in essence, a competitive pursuit and not in the interests of all students – especially those who struggle to succeed.
This may be appropriate in a competitive environment, but it does not suit the inclusive nature of the IB programmes.

Criterion-referenced achievement “is not dependent on how well others in the cohort have performed, but on how well the individual student has performed as measured against specific criteria and standards.” (Dunn et al., 2002). It is an assessment idea which has been in use since the 1960s, although it wasn’t until the early 1970s that academics such as Hambleton & Novick (1973) joined up key ideas in theory and practice. They state that common to all previous definitions of criterion-referenced assessment is the view that “the definition of a well-specified content domain and the development of procedures for generating appropriate samples of test items are important.” (Hambleton & Novick, 1973)

Having said this, it could be argued, as David F. Lohman quotes, that “behind every criterion lurks a norm” (Lohman, 2009). In assessment of learners in the MYP we aim to measure them against pre-determined performance outcomes – criterion descriptors – but how are these outcomes decided? This is where, to a greater extent, we find the norm: hiding in plain sight as the command terms of an achievement-level descriptor!

Assessment in a criterion-referenced system raises more challenges in terms of construct validity than traditional norm-referenced tests, as described by Edward Haertal in 1985:

“When tests are used only to rank examinees, validity can be established by simple correlations of test scores with criteria.
Criterion-referenced interpretations, using test performance […] require new approaches to test validation.”

Essentially here we see the importance of command terms come to the fore – the language or action-verbs used in assessment tasks and descriptors:

“This methodology begins with the description of the achievement construct in psychological and behavioral terms. The psychological description of the achievement construct is an account of the knowledge and skills it entails.” (Haertal, 1985)

The command terms are a defined set of action verbs which have been categorized in accordance with the ideas of Bloom’s taxonomy to represent a hierarchy of desired achievement constructs. The example rubric below, for the sciences criterion C: Knowledge and understanding, demonstrates this:

Table 1: Criterion C: Knowledge & understanding (current), taken from the MYP Science Guide (IB, 2010b)

Level   Descriptor
0       The student does not meet any of the descriptors below.
1-2     The student recalls some scientific ideas, concepts and/or processes. The student applies scientific understanding to solve simple problems.
3-4     The student describes scientific ideas, concepts and/or processes. The student applies scientific understanding to solve complex problems in familiar situations. The student analyses scientific information by identifying parts, relationships or causes.
5-6     The student uses scientific ideas, concepts and/or processes correctly to construct scientific explanations. The student applies scientific understanding to solve complex problems including those in unfamiliar situations. The student analyses and evaluates scientific information and makes judgments supported by scientific understanding.

The descriptors ‘recall’ and ‘describe’ are in line with the lower end of Bloom’s taxonomy – the knowledge domain. However, ‘construct’ and ‘analyse’ appear at the higher end. Because assessment focuses on these skills and knowledge outcomes, the normative aspect of assessment is present in the grade-level descriptors.

This generates another issue with content and criterion validity in the current MYP model. At the moment, these command terms are fully defined and published in a document entitled ‘Command terms in the MYP’ (IB, 2010c).
However, they are not present in all subject guides and the usage of those that are present may not be consistent between subjects. A lack of coherence between classrooms may lead into issues of criterion-related validity, especially for students and teachers who teach across disciplines and see command terms used in different ways.

Criterion-related assessment in the MYP differs from criterion-referenced assessment in a subtle but important way. Criterion-referenced assessment is often used to assess mastery of skills and content. Criterion-related assessment uses a best-fit approach to assign grades to students: “When assessing a student’s work, teachers should read the descriptors (starting with level 0) until they reach a descriptor that describes an achievement level that the work being assessed has not attained.” (IB, 2010a, p.25) In practice, this allows for a
teacher to judge a student’s work based on the most appropriate combination of descriptors as outlined in the rubric. The best-fit approach also covers assigning final grades. Averages and percentages are not acceptable practice – instead one must look at the recent trend in a student’s work towards a given criterion. For this reason it is important that there are multiple measures for each criterion per reporting period. A clarification of the IB’s position on best-fit grading is included in Appendix 3.

This best-fit approach to assessment is a strength of the MYP in terms of criterion-related validity as it focuses on the student’s ability to achieve in relation to a set of pre-determined, published performance descriptors. With the best-fit approach, teachers are best placed to assess a student’s work for what they have achieved, rather than what they have not (which is a feature of pure criterion-referenced assessment). It is reliable as it is based on multiple measures and evidence of trends in student achievement. However, for the system to work effectively, there need to be multiple measures of each criterion – which regularly proves a challenge in a subject with six criteria. In some classes, a ‘race to assess’ can impact both reliability and validity.

In a recent study in Sweden, grade inflation was observed in a criterion-referenced assessment system (Wikström, 2005). Wikström found in her study over six years that grades had been increasing in the criterion-referenced system and was able to exclude factors relating to authentic improved achievements, strategic course selection and selective exclusion of low-achievers.
What remained was a lowering of standards, with a more notable change in the Arts and the smallest in English and Mathematics, subjects calibrated against national tests. In a typical MYP classroom, assessment is in the hands of the teacher and therefore prone to positive grading or an individual’s interpretation of the criteria. Under the current system, which includes attitudinal grades, the effect of grade inflation may be more pronounced, having a negative impact on the reliability of grades awarded.

In the sciences it could be argued that half of a student’s current grade comes not from the ‘hard science’ of knowledge and lab investigative skills but from a more social-sciences and language leaning towards One World, Communication in science and Attitudes in science. This raises a concern over content validity – is a student scoring well because she is good at
science or is it because what is being assessed is not science? It also raises a more serious question of reliability and appropriateness when part of a grade is devoted to attitudinal or behavioural evidence – which can be subjective, is hard to track and does not give a measure of a student’s genuine achievements in science (OConnor, 2011, pp.16-20).

Validity and reliability in science assessment in MYP: The Next Chapter

So does the Next Chapter address the issues in validity and reliability that are present in the current model, and how does this impact the sciences? To get a better picture of some of these proposed changes (which are currently being implemented in selected pilot schools), please refer to Appendices 4-7, which include: summary changes to the aims of the sciences; summary changes to assessment in the sciences; comparison of old vs new assessment criteria; and comparison of grade level descriptors for the knowledge-related criterion.

Criterion-related validity in the MYP sciences

Paring back the aims, assessed criteria and descriptors of the sciences is likely to have a positive effect on criterion-related validity. Through a clearer, shorter and better-defined set of aims and objectives, the task of assessing whether a student has met these goals will be more manageable and potentially more reliable.

Cutting the sciences criteria from six to four will also likely have a number of positive impacts on validity and reliability.
The removal of the behavioural Attitudes in science criterion will allow for more reliable assessment of a student’s actual achievements against the science aims and objectives, with a reduced risk of subjective contamination. With the best practice of multiple measures, four criteria are easier to handle than six. This should give more opportunities for meaningful assessment of each criterion. A study addressing the impact of removing these attitudinal criteria on overall student achievement would be interesting. One might hypothesise that overall 1-7 scores will decrease as the ‘safety nets’ of Communication in science and Attitudes in science are removed from the conceptually weaker students.

Finally, an increased programme-wide focus on the command terms, with common definitions, should serve to make the language of assessment easier for all to understand
and lead to more criterion-related reliability. Wordy descriptors with multiple command terms should be replaced with more concise descriptors, giving a focus for assessment of the criterion. With a more manageable task in hand, students should be able to identify performance elements which will allow them to access higher grades.

Content validity in the MYP sciences

The MYP is described as a framework for assessment and learning and not an exhaustive curriculum. This allows scope for schools to set their own levels of content validity, such as by meeting state science content standards. However, this can be a challenge for schools where there is no parallel set of standards and can make the feed-in role of the MYP to the DP difficult. In the Next Chapter, clearer guidelines for content in the form of significant concepts and perhaps even online support content should allow teachers to plan units of work which can be assessed with greater content validity.

Testing knowledge in the MYP sciences

Under the Next Chapter, the key proposal that the Using knowledge criterion “must only be assessed through tests or exams” (IB, 2011) is, to me, one of the most interesting changes to be put forth in the MYP sciences. It represents a move to an assessment of knowledge that at face value may seem more ‘old-fashioned’ and less suited to differentiation to students’ needs than the current system. The working sciences guide allows for assessment of Knowledge and understanding through a diversity of modes, including case studies and response to articles or datasets (IB, 2010a, p.31).
As long as testing is used well, the new system will allow for greater reliability in the data produced (free from the potential contamination by other students’ ideas that can occur in the current system). It may also have a positive impact on consequential validity as students move into the DP and preparation for a final exam marked on grade boundaries, making up 76% of their summative assessment.

Arguably the move to stipulate testing or exams as a method of assessment of Using knowledge is one to ensure greater reliability of assessment. In practice, this will pose significant challenges for teachers, and these will need to be given consideration in professional development from the IB. As Sylvia Green notes,
“…The links between the level descriptions and [the national] test mark schemes are not so transparent. Different elements within structured questions may address different levels and content, even different domains within the subject, therefore it may be difficult to classify some questions as ‘at a particular level’. In such circumstances standard setting is done by determining ‘thresholds’ in total test scores, initially by judgmental means and subsequently using statistical equating to support judgments.” (Green, 2002)

Test design is a complex business and designing tests that work in a criterion-related situation is a challenge. Because testing is a traditional mode of assessment that gives the perception of rigour and ‘academia’, it will take a concerted effort to change the approach of stakeholders in assessment and to reinforce the criterion-related approach.

Conclusions & Recommendations

A great deal of thought and scholarship lies behind the Next Chapter and its implications for assessment in the sciences. Removal of attitudinal criteria, clearly defined command terms, more concise achievement-level descriptors and a narrower set of acceptable assessment tools should serve to enhance reliability of assessment. Emphasis on the aims of the sciences and the proposed production of pre-populated online unit planner tools may make some headway in validity of what is being assessed. However, it will take considerable work on the part of the IB, school leaders and teachers to translate the Next Chapter into effective classroom action.
Professional development of all teachers must play a central role in ensuring that assessment in the Next Chapter makes a successful translation from paper to practice. With over 900 schools practising the MYP, it must not be assumed that the teachers in each classroom and the administrators in each office are clued-in to current educational philosophy and practices. It is already a programme authorization and evaluation requirement that teachers attend MYP workshops for programme delivery and development. Online and in-school workshops, as well as the Online Curriculum Centre (OCC), exist as tools for professional development and are becoming stronger.
The IB should take the opportunity to capitalize on its own developments and opportunities by making explicit discussion of validity and reliability in assessment practices a part of these resources. Outreach could take place through the OCC, video or article resources.

Clear exemplars, such as those generally found in teachers’ support material, must be made widely available and readily accessible if they are to be put to good use. This is of particular importance to testing – perhaps the one criterion which represents the biggest change for science teachers in their methods.

Finally, there is a need to allow teachers to support the effective development of their students’ assessment practices. With the removal of attitudinal grading, and its consequential boost to validity and reliability, comes an increased likelihood of a student failing. It must be emphasized through all professional development modes, handbooks and other available media that effective, criterion-related formative assessment plays a crucial role in development:

“There is a body of firm evidence that formative assessment is an essential component of classroom work and that its development can raise standards of achievement.” (Black & Wiliam, 2010)

With some excitement, but also trepidation, I look forward to the Next Chapter. Early signs look positive that it will become more reliable and valid in its assessment: it will evolve into a form that shows greater fitness for purpose.

Acknowledgements

Thank you to Malcolm Nicolson, Head of the Middle Years Programme, and Sean Rankin, Head of Curriculum and Assessment for the Sciences, for their input and willingness to answer questions by email.
Thanks also to Sue Martin for her guidance and mentoring during the summer school and by email since.
References

Bishop, K., Bullock, K., Martin, S. & Thompson, J., 1999. Users’ perceptions of the GCSE. Educational Research, 41(1), pp.35-49.
Black, P. & Wiliam, D., 2010. Kappan Classic: Inside the Black Box: Raising Standards Through Classroom Assessment. The Phi Delta Kappan, 92(1), pp.81-90.
CIE, 2011. Cambridge IGCSE Brochure (pdf). [Online] Available at: [Accessed 4 January 2012].
Dunn, L., Parry, S. & Morgan, C., 2002. Seeking quality in criterion referenced assessment. [Online] Available at: [Accessed 20 February 2012].
Green, S., 2002. Criterion referenced assessment as a guide to learning - the importance of progression and reliability. [Presentation, available online at:] Johannesburg Available at: [Accessed 13 February 2012].
Haertal, E., 1985. Construct Validity and Criterion-Referenced Testing. Review of Educational Research, 55(1), pp.23-46.
Hambleton, R.K. & Novick, M.R., 1973. Toward an integration of theory and method for criterion-referenced tests. Journal of Educational Measurement, 10(3), pp.159-70.
Harlen, W., 2007. Criteria for evaluating systems for student assessment. Studies in Educational Evaluation, 33(1), pp.15-28.
IB, 2008. MYP: From principles to practice [Note: Password protected]. Cardiff, UK: International Baccalaureate Organisation. Available at: [password protected] [Accessed 18 October 2011].
IB, 2009. The Middle Years Programme: A basis for practice (pdf). Cardiff, UK: International Baccalaureate Organisation. Available at: [password protected] [Accessed 4 January 2012].
IB, 2010a. MYP Coordinator’s Handbook (pdf). Cardiff, UK: International Baccalaureate Organisation. Available at: [password protected] [Accessed 4 January 2012].
IB, 2010b. MYP: Sciences guide. For use from January 2011. Cardiff, UK: International Baccalaureate Organisation. Available at: [password protected] [Accessed 30 January 2011].
IB, 2010c. Command terms in the MYP. Cardiff, UK: International Baccalaureate Organisation. Available at: [password protected] [Accessed 30 January 2011].
IB, 2011a. Development Report: MYP Sciences guide (pdf). [Online] Available at: [password protected] [Accessed 5 November 2011].
IB, 2011b. MYP Statistical Bulletin, June 2011 moderation session (pdf) [Note: password protected]. [Online] Available at: [password protected] [Accessed 12 February 2012].
IB, 2011c. MYP: the next chapter. Project report October 2011. [Online] Available at: [password protected] [Accessed 25 November 2011].
IB, 2012. IB Fast Facts. [Online] Available at: [password protected] [Accessed 20 February 2012].
Lohman, D.F., 2009. The Contextual Assessment of Talent. In Vantassel-Baska, J. Leading Change in Gifted Education: The Festschrift of Dr. Joyce Vantassel-Baska. Accessed online at ed. Waco, Texas, USA: Prufrock Press. pp.229-41.
Messick, S., 1995. Validity of Psychological Assessment: Validation of Inferences From Persons’ Responses and Performances as Scientific Inquiry Into Score Meaning. American Psychologist, 50(9), pp.741-749.
Morrison, N., 2009. GCSE your time is up. [Online] Available at: [Accessed 12 February 2012].
Moss, P., Girard, B. & Haniford, L., 2006. Validity in educational assessment. Review of Research in Education, 30(1), pp.109-62.
Nicolson, M. & Hannah, L., 2010. History of the Middle Years Programme (pdf). [Online] Available at: [Accessed 14 February 2012].
Nicolson, Malcolm. Personal email correspondences. January 16-February 26 2012.
O’Connor, K., 2011. A repair kit for grading: 15 fixes for broken grades. 2nd ed. Boston: Pearson Education.
OFQUAL, 2011. General Conditions of Recognition. [Online] Available at: [Accessed 12 February 2012].
Rankin, Sean. Personal email correspondences regarding sciences assessment. January 16-February 28 2012.
Thompson, J. & Hayden, M., 2011. The Middle Years Programme. In Thompson, J. & Hayden, M. Taking the MYP forward. Melton, UK: John Catt Educational. pp.13-18.
Wikström, C., 2005. Grade stability in a criterion-referenced grading system: a Swedish example. Assessment in Education, 12(2), pp.125-44.
Appendices

Appendix 1: Aims and objectives of the MYP sciences. Taken from the science subject guide (IB, 2010a)

Aims

The aims of any MYP subject and of the personal project state in a general way what the teacher may expect to teach or do, and what the student may expect to experience or learn. In addition, they suggest how the student may be changed by the learning experience.

The aims of the teaching and study of MYP sciences are to encourage and enable students to:
1. develop curiosity, interest and enjoyment towards science and its methods of inquiry
2. acquire scientific knowledge and understanding
3. communicate scientific ideas, arguments and practical experiences effectively in a variety of ways
4. develop experimental and investigative skills to design and carry out scientific investigations and to evaluate evidence to draw a conclusion
5. develop critical, creative and inquiring minds that pose questions, solve problems, construct explanations, judge arguments and make informed decisions in scientific and other contexts
6. develop awareness of the possibilities and limitations of science and appreciate that scientific knowledge is evolving through collaborative activity locally and internationally
7. appreciate the relationship between science and technology and their role in society
8. develop awareness of the moral, ethical, social, economic, political, cultural and environmental implications of the practice and use of science and technology
9. observe safety rules and practices to ensure a safe working environment during scientific activities
10. engender an awareness of the need for and the value of effective collaboration during scientific activities.
Objectives

The objectives of any MYP subject and of the personal project state the specific targets that are set for learning in the subject. They define what the student will be able to accomplish as a result of studying the subject. These objectives relate directly to the assessment criteria found in the “Sciences assessment criteria” section.

A One world
This objective refers to enabling students to gain a better understanding of the role of science in society. Students should be aware that science is a global endeavour and that its development and applications can have consequences for our lives. One world should provide students with the opportunity to critically assess the implications of scientific developments and their applications to local and/or global issues.
At the end of the course, students should be able to:
• explain the ways in which science is applied and used to address specific problems or issues
• discuss the effectiveness of science and its application in solving problems or issues
• discuss and evaluate the moral, ethical, social, economic, political, cultural and environmental implications of the use of science and its application in solving specific problems or issues.

B Communication in science
This objective refers to enabling students to become competent and confident when communicating information in science. Students should be able to use scientific language correctly and a variety of communication modes and formats as appropriate. Students should be aware of the importance of acknowledging and appropriately referencing the work of others when communicating in science.
At the end of the course, students should be able to:
• use scientific language correctly
• use appropriate communication modes such as verbal (oral, written), visual (graphic, symbolic) and communication formats (laboratory reports, essays, presentations) to effectively communicate theories, ideas and findings in science
• acknowledge the work of others and the sources of information used by appropriately documenting them using a recognized referencing system.

C Knowledge and understanding of science
This objective refers to enabling students to understand scientific knowledge (facts, ideas, concepts, processes, laws, principles, models and theories) and to apply it to construct scientific explanations, solve problems and formulate scientifically supported arguments.
At the end of the course, students should be able to:
• recall scientific knowledge and use scientific understanding to construct scientific explanations
• apply scientific knowledge and understanding to solve problems set in familiar and unfamiliar situations
• critically analyse and evaluate information to make judgments supported by scientific understanding.

D Scientific inquiry
While the scientific method may take on a wide variety of approaches, it is the emphasis on experimental work that characterizes MYP scientific inquiry. This objective refers to enabling students to develop intellectual and practical skills to design and carry out scientific investigations independently and to evaluate the experimental design (method).
At the end of the course, students should be able to:
• state a focused problem or research question to be tested by a scientific investigation
• formulate a testable hypothesis and explain it using scientific reasoning
• design and carry out scientific investigations that include variables and controls, material and/or equipment needed, a method to be followed and the way in which the data is to be collected and processed
• evaluate the validity and reliability of the method
• judge the validity of a hypothesis based on the outcome of the investigation
• suggest improvements to the method or further inquiry, when relevant.

E Processing data
This objective refers to enabling students to collect, process and interpret sufficient qualitative and/or quantitative data to draw appropriate conclusions. Students are expected to develop analytical thinking skills to interpret data and judge the reliability of the data.
At the end of the course, students should be able to:
• collect and record data using units of measurement as and when appropriate
• organize, transform and present data using numerical and visual forms
• analyse and interpret data
• draw conclusions consistent with the data and supported by scientific reasoning.

F Attitudes in science
This objective refers to encouraging students to develop safe, responsible and collaborative working practices in practical science.
During the course, students should be able to:
• work safely and use material and equipment competently
• work responsibly with regards to the living and non-living environment
• work effectively as individuals and as part of a group by collaborating with others.
Appendix 2: Grade-level descriptors in the Middle Years Programme

Grade 1: Minimal achievement in terms of the objectives.
Grade 2: Very limited achievement against all the objectives. The student has difficulty in understanding the required knowledge and skills and is unable to apply them fully in normal situations, even with support.
Grade 3: Limited achievement against most of the objectives, or clear difficulties in some areas. The student demonstrates a limited understanding of the required knowledge and skills and is only able to apply them fully in normal situations with support.
Grade 4: A good general understanding of the required knowledge and skills, and the ability to apply them effectively in normal situations. There is occasional evidence of the skills of analysis, synthesis and evaluation.
Grade 5: A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them in a variety of situations. The student generally shows evidence of analysis, synthesis and evaluation where appropriate and occasionally demonstrates originality and insight.
Grade 6: A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them in a wide variety of situations. Consistent evidence of analysis, synthesis and evaluation is shown where appropriate. The student generally demonstrates originality and insight.
Grade 7: A consistent and thorough understanding of the required knowledge and skills, and the ability to apply them almost faultlessly in a wide variety of situations. Consistent evidence of analysis, synthesis and evaluation is shown where appropriate. The student consistently demonstrates originality and insight and always produces work of high quality.

Taken from the MYP Coordinator’s Handbook (IB, 2010a, pp.59-60)

Appendix 3: The best-fit approach (clarification from the IB)

“The descriptors for each criterion are hierarchical. When assessing a student’s work, teachers should read the descriptors (starting with level 0) until they reach a descriptor that describes an achievement level that the work being assessed has not attained. The work is therefore best described by the preceding descriptor.

Where it is not clearly evident which level descriptor should apply, teachers must use their judgment to select the descriptor that best matches the student’s work overall. The “best-fit” approach allows teachers to select the achievement level that best describes the piece of work being assessed.

If the work is a strong example of achievement in a band, the teacher should give it the higher achievement level in the band. If the work is a weak example of achievement in that band, the teacher should give it the lower achievement level in the band.” (IB, 2010a, p.25)
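The hierarchical reading of descriptors in Appendix 3 can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the function name best_fit_level, the band layout (0, 1-2, 3-4, 5-6) and the true/false inputs standing in for a teacher's professional judgment are all hypothetical and are not taken from any IB material.

```python
from typing import Sequence, Tuple

# Hypothetical band layout for one criterion: level 0, then three dual bands.
BANDS: Sequence[Tuple[int, int]] = ((0, 0), (1, 2), (3, 4), (5, 6))


def best_fit_level(attained: Sequence[bool], strong_example: bool) -> int:
    """Return an achievement level using the hierarchical best-fit reading.

    attained[i] is the teacher's judgment of whether the work meets the
    descriptor for BANDS[i]. Bands are read in order, starting with level 0,
    until one is found that the work has not attained; the preceding band
    then describes the work. strong_example selects the higher level in
    that band, otherwise the lower level is given.
    """
    if len(attained) != len(BANDS):
        raise ValueError("one judgment is needed per band")
    band_index = 0
    for i, met in enumerate(attained):
        if not met:
            break  # first descriptor not attained: stop reading
        band_index = i
    low, high = BANDS[band_index]
    return high if strong_example else low
```

For example, work judged to meet the 0, 1-2 and 3-4 descriptors but not the 5-6 descriptor falls in the 3-4 band, and a strong example within that band would receive a 4. The judgment of whether each descriptor has been attained remains with the teacher; the sketch only records how those judgments combine.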
Appendix 4: A comparison of the aims of the sciences under the current MYP model and after the proposed changes of the Next Chapter.

Current MYP Sciences Guide (IB, 2010a, p.4)

The aims of the teaching and study of MYP sciences are to encourage and enable students to:
1. develop curiosity, interest and enjoyment towards science and its methods of inquiry
2. acquire scientific knowledge and understanding
3. communicate scientific ideas, arguments and practical experiences effectively in a variety of ways
4. develop experimental and investigative skills to design and carry out scientific investigations and to evaluate evidence to draw a conclusion
5. develop critical, creative and inquiring minds that pose questions, solve problems, construct explanations, judge arguments and make informed decisions in scientific and other contexts
6. develop awareness of the possibilities and limitations of science and appreciate that scientific knowledge is evolving through collaborative activity locally and internationally
7. appreciate the relationship between science and technology and their role in society
8. develop awareness of the moral, ethical, social, economic, political, cultural and environmental implications of the practice and use of science and technology
9. observe safety rules and practices to ensure a safe working environment during scientific activities
10. engender an awareness of the need for and the value of effective collaboration during scientific activities.

Proposed changes (IB, 2011, p.6)

The aims of the teaching and study of MYP sciences are to encourage and enable students to:
• understand and appreciate science and its implications through the areas of interaction
• consider science as a human endeavour with benefits and limitations
• cultivate analytical, inquiring and flexible minds that pose questions, solve problems, construct explanations and judge arguments
• develop skills to design and perform investigations, evaluate evidence and reach conclusions
• engender an awareness of the need to effectively collaborate and communicate
• apply language skills and knowledge in a variety of real-life contexts
• demonstrate sensitivity towards the living and non-living environments
• reflect on learning experiences and make informed choices
Appendix 5: Summary of assessment-related changes to the MYP sciences programme under the Next Chapter.

Each pair below compares the current Sciences Guide (IB, 2010a) with the proposed changes under the Next Chapter (IB, 2011).

Current: Six assessment criteria.
Proposed: Four assessment criteria.

Current: Zero-band plus three dual bands of achievement-level descriptors (0, 1-2, 3-4, 5-6).
Proposed: Zero-band plus four bands of achievement-level descriptors (0, 1, 2, 3, 4 in the current working guide for pilot schools)*.

Current: Command terms defined in the subject guide and used in achievement-level descriptors.
Proposed: Command terms defined across the whole MYP and used in a more focused manner in achievement-level descriptors.

Current: Attitudes in science criterion in use.
Proposed: Attitudes in science criterion removed.

Current: One world and Communication in science criteria in use.
Proposed: Science in the world criterion merges the aims of One world and Communication in science.

Current: Assessment of subject content acquisition primarily through Knowledge and understanding criterion. Modes of assessment open to teachers.
Proposed: Using knowledge criterion to assess subject content acquisition. Key proposal that this “must only be assessed through tests or exams” (IB, 2011, p.13).

Current: Lab and investigative work assessed through Scientific inquiry and Processing data criteria.
Proposed: Lab and investigative work assessed through Inquiring and designing and processing and evaluating criteria.

*This change is noted in a copy of an assessment rubric from the guide for pilot schools, shared by Sean Rankin, Curriculum and Assessment Manager for the MYP sciences.