This is my presentation at the 2012 Western Australian Institute of Educational Research held at Notre Dame University, Fremantle. It uses survey responses from teachers to evaluate just how 'high-stakes' NAPLAN is.

  • Good afternoon and welcome to my presentation, which explores whether or not NAPLAN can be considered a high-stakes test. In this presentation I will provide a brief summary of international similarities and differences in standardised tests, before I provide evidence of teacher perceptions of the impact that NAPLAN is having on curriculum and pedagogy in schools in WA and SA. This evidence is drawn from my ARC-funded research project, The effects of NAPLAN on Australian School Communities. At the end of the presentation I hope to facilitate a discussion as to whether or not NAPLAN qualifies as high-stakes.
  • NAPLAN, which began in 2008 with the specific intention of being a vehicle to improve education equity through reducing the gap in literacy/numeracy, has to be understood within a broader policy framework of: an increased role of the Federal Government in education; a global move towards education reform that de-emphasizes teacher professionalism and promotes debates about teacher quality and school efficiency; the linking of education to national productivity agendas in place of the social good; the idea that improving equity requires increased control/coercion; and the international experience, including PISA and competition. The purpose of this presentation is to look at the effects of high-stakes testing within a particular ‘vernacular’. As Rizvi and Lingard have pointed out: “As education has been reconstituted as central to the economic competitiveness of nations in the context of a global economy, many educational systems have instituted high-stakes, standardized testing to try to drive up educational standards” (Rizvi & Lingard, 2010, p. 98). The adoption of high-stakes testing, and its effects on teachers, students and schools, are often presented as the same around the globe. While there may be similarities, the results or impacts “always have a vernacular character as they build incrementally on what has gone before in specific educational systems” (Rizvi & Lingard, 2010, p. 97).
  • Public reporting and public accountability are central to the issue of what constitutes high stakes testing. Johnson et al. (2008) define high stakes tests in the United States as those which have consequences for student success (e.g. grade promotion or graduation), teacher accountability, the reputation of schools or the funding of schools. Ball (2008) notes that the parental perception of schools may be affected by high stakes testing, with a resulting detrimental effect of parental market choice being exerted on low SES schools that work in disadvantaged communities. Howe et al. (2001) also note the detrimental effect of “white flight” from low socio-economic schools in the United States, as a result of competition arising from the publication of high stakes testing results. Howe et al. note the unequal ability of different kinds of parents to “choose”, with middle class parents much better situated to make informed choices. They highlight the parental use of high stakes test results or ‘league tables’ as one of the key pieces of evidence that middle class parents consider when choosing a school for their child. Au (2008, p.501) describes the impact of the results of testing in terms of “sanctions or rewards to students, teachers, administrators, schools, school districts and other official bodies charged with the education of children”.
  • In the Australian context, whether NAPLAN constitutes high stakes testing has been subject to some debate. However, Lobascher (2011, pp. 9-10) argues that the publication of the results of the NAPLAN program on the MySchool website, with the associated media coverage, means that NAPLAN too may be labelled as a high stakes testing program. Lingard (2010) supports this view, citing the implementation of the recommendations relating to Queensland’s recent poor NAPLAN performance contained in the Masters (2009) report. That said, there are several key differences. Firstly, there is currently no suggestion that schools will be found to be ‘failing’ on the basis of their results and face closure. Schools that are found to be underperforming in the Australian context will be offered support and financial assistance under the current Federal Government policy. This is in stark contrast to the USA, where the No Child Left Behind Act (2001) brought with it the threat of school closures. Secondly, NAPLAN test scores currently have no impact on grade promotion for any student in Australia, whereas in the USA students get ‘held back’ if they have not performed adequately in their high stakes testing. Finally, the reporting of NAPLAN results in Australia is currently different to the previous models used in both the UK and the USA. In Australia, it is very difficult (though not impossible) to make a ‘league table’, and a range of contextual data is included on the MySchool website, including socioeconomic (SES) data.
  • Using a survey research design, quantitative data was obtained from Likert-type questions. These questions were intended to measure various aspects of teachers’ concerns regarding teaching after the introduction of NAPLAN, teachers’ perceptions about changes in pedagogy and curriculum, and motivational constructs relating to teaching and performance. Additionally, demographic information was collected relating to teaching experience, location, gender, school system and SES. The survey was well promoted by unions, school administrators and public media. Voluntary snowball sampling was used to obtain participants. To minimize presentation bias, no direct incentives were offered for participating in the research. In an effort to obtain a representative sample, a large sample of nearly 1000 was obtained, from an overall population of 24,749 teachers in WA and 17,858 teachers in SA based on 2007 census data (ABS, 2007). As data was voluntarily offered, care must be taken not to generalize to larger populations. However, group differences observed for samples of this size provide support for further exploration. MANOVAs were conducted across all survey items to determine the differences between levels of the demographic factors and to determine the possibility of inter-correlated constructs among the survey items. Factor analysis was used to confirm the hypothesized theoretical constructs within the survey items. Group differences were assessed using ANOVAs on the aggregate scores for each construct relating to teachers’ concerns about teaching and perceptions regarding changes to pedagogy and curriculum.
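To put the sample size in context, the arithmetic behind the population figures quoted above can be sketched as follows. The two population counts come from the text; the response count of 961 is the total given in the demographic breakdown of this presentation, and the variable names are illustrative only.

```python
# Teacher populations quoted in the text (2007 figures).
population = {"WA": 24_749, "SA": 17_858}

# Total number of teacher responses reported in the demographic breakdown.
responses = 961

total_population = sum(population.values())
sampling_fraction = responses / total_population

print(f"Combined teacher population: {total_population}")
print(f"Responses as a share of that population: {sampling_fraction:.1%}")
```

On these figures the responses amount to roughly 2% of the combined teacher population, which is one reason, alongside the voluntary sampling, for the caution about generalizing.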
  • Demographic information from the survey shows that the respondents had the following characteristics. There were 961 teacher responses (224 male, 737 female). Of these responses, 558 came from Western Australia and 383 came from South Australia. 590 participants taught in Government (or public) schools, while 226 taught in Catholic Schools and 145 taught in Independent Schools (such as Anglican or Uniting Church Schools or non-denominational schools). 233 responses were from high school teachers, while 728 were from primary teachers. Teachers were asked to report their perception of the drawing area of their school population in terms of SES. The following graph represents the breakdown of responses according to teacher perceptions of the SES of their school.
  • In these responses an average close to 2.0 indicates a fairly even spread of responses to the question. The lower the standard deviation, the more closely grouped the responses (remembering that, for roughly normally distributed data, about 68% of responses fall within +/- 1 SD of the mean). Questions to focus on: Higher-order thinking skills: the mean and SD show that while the majority of teachers feel that NAPLAN has not resulted in lessons that promote higher-order thinking skills, there is a significant minority (21%) who either Agree or Strongly Agree. This is a very important question for this analysis. The high mean and small SD show that NAPLAN is having a significant impact through lessons that teach to the test. 83.4% of respondents either Agree or Strongly Agree with this, a highly statistically significant response. Suggests that teachers tend to perceive that NAPLAN stifles classroom dialogue. Very interesting response pattern. While the overall spread suggests that teachers do not perceive that NAPLAN has improved literacy, there is a minority (23.4%) that do. 27.9% feel that it has made no difference, while 48.7% either Disagree or Strongly Disagree. Interestingly, there are different patterns in terms of numeracy: a far lower percentage feel that it has improved numeracy. This is also a significant response that points to a narrowing of the curriculum, supporting research that suggests that high-stakes testing squeezes out or limits time spent on non-tested subjects. If we agree that a broad curriculum is educative for our students, this is concerning. See above. This question also refers to authenticity in assessment and to research suggesting that students are intrinsically motivated (and vice versa) by assessments that connect with their real-life contexts. NAPLAN is particularly problematic in this context for rural, remote and Indigenous students, among others.
  • This question supports the spread of responses above. This question lends itself to the critique of high-stakes testing that it moves away from student-centred approaches to pedagogy in favour of teacher-centred approaches. This question (along with Q 11) has the most skewed response (mean = 0.8, SD = 0.8), indicating that teachers perceive NAPLAN as having a significant impact on pedagogy. See above for inclusivity. This response pattern was the most ‘even’ of all the questions asked. It would appear that teachers are generally comfortable with NAPLAN criteria. This question also points to teachers’ perceptions of the effects that NAPLAN is having on student motivation and engagement. This mirrors research in the US and UK that shows that high-stakes testing generally lowers student motivation and engagement, probably for the reasons outlined in Qs 8, 9, 10 and 11. Research into the impact of high-stakes tests on students from non-traditional and/or disadvantaged backgrounds supports the response pattern to this question: that such testing promotes learning environments and contexts that may increase disadvantage and lower equity.
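When reading the means and standard deviations quoted for these questions, it can help to recall how much of a distribution lies within one standard deviation of the mean. The sketch below assumes responses are approximately normally distributed, which five-point Likert items only loosely satisfy, so the result is a rule of thumb rather than a property of this survey's data. The function name is illustrative.

```python
import math


def normal_coverage(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


within_one_sd = normal_coverage(1.0)
within_two_sd = normal_coverage(2.0)

print(f"Within +/- 1 SD: {within_one_sd:.1%}")
print(f"Within +/- 2 SD: {within_two_sd:.1%}")
```

Under that assumption about two thirds of responses sit within one SD of the mean, so a small SD alongside a high or low mean indicates responses clustered towards one end of the scale.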
  • Of course, and I would be the first to admit, what this survey generates is a snapshot of how teachers were feeling at or around the time at which NAPLAN occurs. As such it is subjective, but no less valuable because of that subjectivity. It is not that the data generated is flawed; it does what it is intended to do, namely provide us with an insight into how a group of individuals perceive the impacts of a Federal education policy on what they do and how they do it. A common criticism is that the people who tend to respond to surveys are those who have an axe to grind, so that the sample is generally not representative. Research suggests that while there is an element of truth to this, in general with a large survey the effect of this bias evens out and is less significant than it may appear. However, there are a number of statistical analyses that follow this logic: even if the sample is skewed, what are the internal effects, relationships and interactions between demographics and responses? In other words, if everyone who responds has an axe to grind, or the same axe to grind, we would expect little variation across demographic differences. As I will show, this is not the case. In a paper I wrote with Allen Harbaugh from Murdoch, we ran one- and two-way MANOVAs to assess main effects and interaction effects on teachers’ global perceptions. These were followed by a series of exploratory one- and two-way ANOVAs of specific survey items to suggest potential sources for differences among teachers from different socioeconomic regions, states and systems. The following slides show our findings in relation to a set of questions that show an interaction effect between response patterns and demographics. There is another interaction effect between Qs 1, 3, 4, 5, 8, 10, 11, 12, 13 and 14.
  • There were significant interactions between teacher perceptions of the SES of the school and how they reported changes to the curriculum in their class. This is perhaps the most prominent finding of this research. One of the main aims of NAPLAN testing is to provide data to remove education inequities. However, these responses suggest that the impact of NAPLAN on curriculum is most keenly felt in schools in low socioeconomic drawing areas. If, as research suggests, a broad curriculum that encourages a range of learning experiences is crucial in reducing the equity gap, then the fact that curriculum narrowing is occurring with greater intensity in low SES schools may mean that the gap will grow further as a result of NAPLAN. However, this hypothesis can only be tentatively supported by the survey data; more research is needed to contextualise individual schools with curriculum approaches and NAPLAN results.
  • There were also significant relationships between the school system and perceptions of impact on curriculum. Based on the survey responses, teachers in all school systems pointed to a negative impact on the curriculum in their class and school (the literature indicates that a narrowed curriculum focus and teaching to the test have a negative impact on student learning), but teachers in government schools reported NAPLAN having a greater impact on curriculum. There were no significant differences between Catholic and Independent teacher perceptions. Further research is required to explore the hypothesis that these differences are due to different systemic approaches and the emphasis placed on NAPLAN.
  • Analysis of the data also suggests that there is a relationship between the State in which teachers work (either WA or SA) and their perceptions of the effects of NAPLAN on curriculum in their schools and classrooms. Broadly speaking, while teachers in both States tend to perceive a negative impact of NAPLAN on curriculum, this perception is more pronounced in WA than in SA. It is most likely that this difference reflects a combination of complex political, procedural, systemic and societal factors being played out in local settings. This requires further investigation to build a hypothesis as to why there is this difference between the States.
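The group comparisons reported on the last three slides (by SES, school system and State) rest on analysis of variance. As a minimal sketch of what a one-way ANOVA computes, the example below calculates the F statistic by hand for three groups. The scores and the group labels are made up for illustration and are not the survey data; the function name is likewise hypothetical, and the published analysis may have used different software and models.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of groups
    n = len(all_scores)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical aggregate "impact on curriculum" scores by perceived school SES.
scores_by_ses = {
    "low": [3.5, 3.0, 4.0, 3.5],
    "average": [3.0, 2.5, 3.0, 3.5],
    "high": [2.0, 2.5, 3.0, 2.5],
}

f_stat, df_between, df_within = one_way_anova_f(list(scores_by_ses.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread within groups would suggest. In practice the value is compared against the F distribution with the stated degrees of freedom to obtain a p-value.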
