Norms 
• The statistics used to develop derived scores 
in norm-referenced (NR) testing come from 
normative samples. 
• Most NR tests have several samples 
– Samples for different ages 
– Samples for different grades 
– Sometimes different samples for boys and girls
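As a minimal sketch of how a normative sample's statistics turn a raw score into derived scores, the example below computes a z-score, a standard score, and a percentile rank. The mean-100, SD-15 scale, the function name `derived_scores`, and the sample values are illustrative assumptions, not taken from any particular test.

```python
from statistics import mean, stdev


def derived_scores(raw, norm_sample):
    """Convert a raw score to a z-score, standard score, and percentile rank."""
    m = mean(norm_sample)
    sd = stdev(norm_sample)  # sample standard deviation of the norm group
    z = (raw - m) / sd
    standard = 100 + 15 * z  # assumed scale: mean 100, SD 15
    # Percentile rank: percent scoring below, plus half of those tied.
    below = sum(1 for s in norm_sample if s < raw)
    tied = sum(1 for s in norm_sample if s == raw)
    percentile = 100 * (below + 0.5 * tied) / len(norm_sample)
    return z, standard, percentile


# Hypothetical raw scores for one age group (illustrative, not real data).
age_8_norms = [12, 15, 15, 17, 18, 20, 20, 21, 23, 25]
z, standard, percentile = derived_scores(20, age_8_norms)
print(f"z={z:.2f}, standard={standard:.1f}, percentile={percentile:.0f}")
```

The same raw score yields different derived scores against a different norm group, which is why tests keep separate samples by age or grade.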
Representativeness of Norms 
• NR scores are only as valid as the norms are representative. 
• Dimensions represented 
– Gender 
– Age 
– Grade in school 
– Acculturation of parents 
• Education 
• Occupation 
• Income 
– Racial identity 
– Geography 
• Area of the country 
• Urban, suburban, rural 
• Community size 
• Population trends 
– Intelligence
Technical Considerations: Norms
• Finding people 
– Stratified National Samples 
– Cluster sampling 
– Representative communities 
– Poor procedures 
• Clinical samples 
• Whole schools or agencies 
• Friends 
• Each norm group must be representative. Even when 
the combined norm groups are representative, some 
individual groups may not be.
Technical Considerations: Norms (continued)
• Proper proportion of people 
– The sample proportion should be the same as the population 
proportion in each norm group. 
• Number of Subjects 
– Large enough to be stable 
– Large enough to represent infrequent groups or 
characteristics (e.g., Native Americans) 
– Large enough to get a full range of derived scores
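One way to check the proportion requirement is to compare each category's share of the norm group with its share of the population. The sketch below assumes hypothetical population proportions and an arbitrary 5-point tolerance; neither comes from a published standard.

```python
from collections import Counter


def proportion_gaps(sample_labels, population_props):
    """Return sample proportion minus population proportion per category."""
    counts = Counter(sample_labels)
    n = len(sample_labels)
    return {cat: counts.get(cat, 0) / n - p for cat, p in population_props.items()}


# Hypothetical population proportions and one norm group (not real data).
population = {"urban": 0.50, "suburban": 0.30, "rural": 0.20}
norm_group = ["urban"] * 60 + ["suburban"] * 30 + ["rural"] * 10

gaps = proportion_gaps(norm_group, population)
# Flag categories that miss the population share by more than the
# assumed tolerance of 0.05.
flagged = [cat for cat, gap in gaps.items() if abs(gap) > 0.05]
print(flagged)
```

The check should be run for each norm group separately, since a combined sample can match the population while individual age or grade groups do not.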
Massaging Norms 
• Norms are often adjusted after collection to make them more accurate 
– Smoothing – scores may be transformed to remove 
minor irregularities in shapes or progression of 
means. Outliers may be dropped. 
• Norms are often weighted – a procedure in which 
some scores are counted as less than one case 
and others are counted as more than one case in 
order to achieve the correct proportions of 
characteristics
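The weighting idea can be sketched as follows: each case gets a weight equal to its group's population proportion divided by its sample proportion, so over-represented groups count as less than one case and under-represented groups count as more. The group names, scores, and function names are hypothetical, and test publishers may use more elaborate procedures.

```python
def poststratification_weights(sample_counts, population_props):
    """Weight per case = population proportion / sample proportion."""
    n = sum(sample_counts.values())
    return {g: population_props[g] / (sample_counts[g] / n) for g in sample_counts}


def weighted_mean(scores_by_group, weights):
    """Mean in which each score counts as `weights[group]` cases."""
    total = 0.0
    weight_sum = 0.0
    for group, scores in scores_by_group.items():
        for s in scores:
            total += weights[group] * s
            weight_sum += weights[group]
    return total / weight_sum


# Hypothetical norm data: urban cases are over-represented (4 of 6),
# while the assumed population is split evenly.
scores = {"urban": [20, 22, 24, 26], "rural": [10, 14]}
population = {"urban": 0.5, "rural": 0.5}

counts = {g: len(v) for g, v in scores.items()}
weights = poststratification_weights(counts, population)
print(weights)                         # urban below 1, rural above 1
print(weighted_mean(scores, weights))  # mean after reweighting
```

Here each urban case counts as 0.75 of a case and each rural case as 1.5, which moves the mean from about 19.3 (unweighted) to 17.5.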
Old Norms 
• Norms must be current (<15 years old) 
– To represent today’s individuals 
– Over time, people learn the specific test content 
• Normative Updates 
– New norms for old test content 
– Statistics from smaller samples may be used to 
adjust means and standard deviations of old 
norms
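A normative update can be illustrated with a simple linear adjustment: the mean and standard deviation from a smaller, recent sample replace the old values when converting raw scores. This is only a sketch under that assumption. The old statistics, the update sample, and the mean-100, SD-15 scale are hypothetical, and actual updates may use other methods.

```python
from statistics import mean, stdev


def standard_score(raw, norm_mean, norm_sd, scale_mean=100, scale_sd=15):
    """Linear conversion of a raw score to an assumed mean-100, SD-15 scale."""
    return scale_mean + scale_sd * (raw - norm_mean) / norm_sd


# Hypothetical statistics from the original (old) norms.
old_mean, old_sd = 30.0, 6.0

# Hypothetical smaller, recent sample used for the update (not real data).
update_sample = [28, 31, 33, 34, 36, 37, 39, 42]
new_mean = mean(update_sample)
new_sd = stdev(update_sample)

raw = 36
old_score = standard_score(raw, old_mean, old_sd)
new_score = standard_score(raw, new_mean, new_sd)
print(f"old norms: {old_score:.1f}, updated norms: {new_score:.1f}")
```

In this made-up case the recent sample scores higher on average, so the same raw score earns a lower standard score under the updated norms than under the old ones.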
Norm Relevance 
• National vs. Local norms 
• Special norm groups (e.g., the SAT is normed 
on high school students who intend to go to 
college) 
• Avoid typological thinking. Just because you 
got the same score on the LSAT as successful 
lawyers does not mean that you should 
become a lawyer.
Using Norms Appropriately 
• Use the correct norm table. 
– How was the norm group sampled? If sampled by age, 
age norm tables are usually preferable. 
– The norm table should make sense. For achievement 
tests, grade tables are usually better than age tables. 
– Out-of-age or out-of-grade testing. When a student’s test 
score is extreme, there is a tendency to use younger 
(for low scores) or older (for high scores) norm groups. Don’t!
Final Warning 
• Norms are expensive and time-consuming to 
develop. Unrepresentative norms may be 
easy and convenient to get, but they do not 
produce accurate derived scores. 
• If a test’s norms are inadequate, select a 
different test.