Useful tools with chart1


Don't let the wool be pulled over your eyes.

Published in: Education, Technology

The state of Connecticut has given the citizens of Connecticut a very useful tool. A tool—it need be nothing more, nothing less, but a tool. How to use that tool and for what purpose is open to discussion and application.

The Connecticut Mastery Test is an instrument we can use to judge the achievement of our students: their ranking relative to past years (across time) and relative to other towns and cities (across districts), along with other uses, including program evaluation. It is a scientifically derived instrument constructed by psychometricians (cognitive and affective domain measurement specialists). While it can be argued that it does not measure what is being taught in every town and city of Connecticut, it is aimed at measuring what the state department of education considers the state standards for each grade level. These standards are well known to all Connecticut towns and communities and are what the state ranks as basic content areas to be taught and (hopefully) mastered by all of Connecticut’s students.

The test used to be administered toward the beginning of the school year (2000-2004 and earlier), but is now administered in the early part of March (2006-2009 and 2010). It is statistically calibrated, which means systematic errors have been minimized, items have been chosen that represent a theoretical construct (reading, math, writing aptitude, etc.) based on what the state thinks every student in Connecticut should have learned, and many other statistical and conceptual assumptions have been applied.
It is considered by some to be a model for other states to emulate.

In essence, the test consists of several tests (reading, writing, and math, to name a few at the elementary level) which themselves consist of subtests (in 4th grade math, for example, there are 25 strands which assess separate math skills; reading and writing have multiple choice and short answer comprehension items, a Degrees of Reading Power section, a writing essay, and others). The number of correct responses on the multiple choice sections and the ratings on the open ended and essay items are totaled as a raw score. This raw score is then converted to a scale score to produce a true interval score, which allows equal comparisons up and down the scale (i.e., a score change from 290 to 295 has the same magnitude as a score change from 360 to 365). The “scaling” here arguably produces a norm-referenced type of measurement that “spreads” the examinees’ scores (from 100-400) and allows for relative placement (placement/ranking within the group).

The scale scores are then used to place examinees in categories: for example, a 242 or above scale score might place a student in the “At or Above Goal” category for Reading. This is the mastery part of the test. Students are now said either to have mastered a content area or not. Now we are in the categorical or criterion-referenced type of measurement that tends to group examinees together in pass/fail or several subcategories. For the mastery test at the elementary level, there are 5 groupings: Advanced, At or Above Goal, Proficient, Basic, and Below Basic.

All test scores reflect some errors of measurement and are often reported as falling in a range that is estimated by error variances. So a student’s true score is reported within a confidence interval that has a certain probability attached, such as 95%. Theoretically, it can be stated that if the student took this test 100 times, the true score would be in this range 95 out of 100 times.
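As a rough numerical sketch of such a band: the usual 95% interval is the observed score plus or minus 1.96 standard errors of measurement. The 5-point standard error and the 244 observed score below are hypothetical assumptions for illustration only; the 242 “At or Above Goal” cut score is the example figure used above.

```python
# Sketch: a 95% confidence band around an observed scale score.
# The SEM of 5 scale-score points and the observed score of 244 are
# hypothetical; the 242 Reading cut score is the example from the text.

def confidence_interval(observed_score, sem, z=1.96):
    """Return the (low, high) band in which the true score would be
    expected to fall about 95 times out of 100 retestings."""
    return (observed_score - z * sem, observed_score + z * sem)

low, high = confidence_interval(244, sem=5)
print(f"95% band: {low:.1f} to {high:.1f}")  # 234.2 to 253.8

# The student is observed above the 242 cut score, but the band
# straddles it, so the category placement could be a false positive.
print("Band straddles the 242 cut score:", low < 242 <= high)
```

Note that the band here spans nearly 20 scale-score points, which is why a single observed score near a cut score is weak evidence for a category placement.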
Those are the bars that are often charted on a student’s individual report. Because of errors of measurement, a student scoring at or near a cut score (the score that is the threshold of a category) could be a false positive or false negative for the reported category. Hence, strong evidence that the individual’s observed score represents the true score is necessary before making a cut-score interpretation. When using a group mean score (a mean scale score), such as when comparing group means over time or across districts, less evidence is necessary, and saying the observed scores generally represent the groups’ true mean scores may satisfy any and all requirements.

This brief attempt to help understand the Connecticut Mastery Test uses a bit of psychometric theory that barely scratches the surface. The amount of work, time, and effort to produce this test is great, including specialists in all content areas, statistical techniques, pilot studies, experts in mental measurement (psychometricians), state standards, years of experimental studies, item banking, and much, much more. It is a tool, a gift really, given to us to let us see things we otherwise could not.

An example follows.

This chart shows trend lines for Branford (pink trend line), the Connecticut state average (dark blue), and 7 randomly selected districts in the same District Reference Group (DRG) as Branford. Percentages “At Goal or Above” are shown for the years 2006, 2007, 2008, and 2009. The districts were randomly selected by assigning numbers to the 23 other districts in Branford’s DRG. A random number generator was then used to select the sample, thereby eliminating selection bias to allow generalization of the results to the population DRG.
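The random-selection procedure described above can be sketched as follows. Only the counts come from the text (23 other DRG districts, a sample of 7); the numbering of districts, and the seed used to make this sketch repeatable, are illustrative assumptions.

```python
import random

# Assign numbers 1..23 to the other districts in Branford's DRG.
# (The actual district names are not listed in the text.)
drg_districts = list(range(1, 24))

# Draw 7 districts without replacement using a random number generator,
# eliminating selection bias so the results can be generalized to the DRG.
random.seed(2009)  # seed chosen only so this sketch is repeatable
sample = random.sample(drg_districts, k=7)
print(sorted(sample))
```

Sampling without replacement guarantees 7 distinct districts, and because every district has an equal chance of selection, no conscious or unconscious preference can creep into which towns get compared to Branford.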
The results show Branford in 2006 at the upper limit of the sample, far above Connecticut’s state average, then dropping over the next 3 years to approximately tie with the state average near the bottom limit for 2009.

The data used are available by following this link and then choosing the second link from the bottom of the right-hand column’s list of links.