Haramaya University TEFL 703
Instructor: Mulugeta Teka (PhD)
Tests
McMillan and Schumacher (1997:245) state that a test is a
standard set of questions presented to each subject that
requires completion of cognitive tasks.
They further elaborate that responses or answers are
summarized to obtain a numerical value that represents a
characteristic of the subject.
They also underline that the cognitive task can focus on what
the person knows (achievement), is able to learn (ability or
aptitude), chooses or selects (interest, attitude or values) or is
able to do (skills).
Thus, we can infer that an educational researcher can use a
test to measure achievement, aptitude, attitude and skills so as
to gather relevant information about his/her subjects.
Parametric Tests
Parametric tests are designed to represent a wide population, such
as a country or an age group. They are published as standardized
tests that are commercially available and have been piloted on a
large and representative sample of the whole population.
These tests are complete with the backup data of reliability
and validity statistics.
Parametric tests enable the researcher to use statistics
applicable to interval and ratio levels of data.
Moreover, data from these tests can be processed with statistics
such as the mean, the standard deviation and the t-test.
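The statistics named above can be illustrated with a minimal sketch. It assumes interval-level scores from two independent groups and uses the pooled-variance form of the t-test; the function name `independent_t` and the scores are hypothetical and are not taken from the course material.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups.

    Assumes interval-level scores and roughly equal variances
    (pooled-variance Student's t-test).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, n_a + n_b - 2

# Hypothetical scores for two class sections (invented for illustration).
section_a = [62, 70, 75, 68, 80]
section_b = [55, 60, 65, 58, 62]

print("mean A:", statistics.mean(section_a))
print("SD A:", round(statistics.stdev(section_a), 2))
t_value, df = independent_t(section_a, section_b)
print("t =", round(t_value, 2), "df =", df)
```

The resulting t value still has to be compared with a critical value from a t table (or a statistics package) for the given degrees of freedom before a conclusion is drawn.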
Non-parametric Tests
Non-parametric tests are tests which are designed for a
given specific population like a class in a school or a small
group in a school.
The researcher using these tests is confined to nominal and
ordinal levels of data.
These tests have less complicated computational statistics.
In most cases, they are prepared by classroom teachers.
They have the advantage of being tailored to particular
instructional, departmental and individual circumstances.
Non-parametric tests offer teachers a valuable opportunity
for quick, relevant and focused feedback on student
performance.
As opposed to parametric tests, non-parametric tests are used
for small samples. They do not make any assumptions about
how normal, even and regular the distribution of scores is.
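As one illustration of a statistic that needs only ordinal data and makes no assumption about the shape of the distribution, the sketch below computes the Mann-Whitney U statistic from ranks. The choice of this particular statistic, the function names and the ratings are assumptions made for the example; the slides do not name a specific non-parametric procedure.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(ordered):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(ordered) and values[ordered[end + 1]] == values[ordered[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[ordered[k]] = mean_rank
        pos = end + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)

# Hypothetical 1-5 ratings given by a teacher to two small groups.
group_1 = [3, 4, 2, 5, 4]
group_2 = [1, 2, 3, 2, 1]
print("U =", mann_whitney_u(group_1, group_2))
```

Because only the rank order of the ratings is used, the result does not depend on the distances between rating points, which is what makes the approach suitable for ordinal data.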
Parametric Vs. Non-parametric
When the two types of test are compared, parametric tests are more
powerful than non-parametric tests because:
a. Parametric tests are derived from standardized scores.
b. They help to compare sub-populations with a whole
population, for example to compare the results of one school or
local education authority with those of the whole country.
Hence, the researcher can prepare his/her own ‘home made’
tests to fit the purpose or employ a standardized test to measure
what he/she needs for a large number of students.
Nevertheless, the researcher should closely check what the
test intends to measure.
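The comparison of one school with the whole country can be sketched with standardized (z) scores. The example assumes that the test manual publishes a national mean and standard deviation; the norm values (50 and 10), the function names and the school scores are all hypothetical.

```python
import math
import statistics

def z_score(raw, norm_mean, norm_sd):
    """Standardize one raw score against published norms."""
    return (raw - norm_mean) / norm_sd

def school_vs_norm(school_scores, norm_mean, norm_sd):
    """z statistic for a school's mean score against the national norms."""
    n = len(school_scores)
    standard_error = norm_sd / math.sqrt(n)
    return (statistics.mean(school_scores) - norm_mean) / standard_error

# Hypothetical national norms and one school's scores (invented values).
NATIONAL_MEAN, NATIONAL_SD = 50, 10
school = [55, 60, 48, 52, 65, 58, 50, 62, 54]

print("z for a raw score of 65:", z_score(65, NATIONAL_MEAN, NATIONAL_SD))
print("z for the school mean:", round(school_vs_norm(school, NATIONAL_MEAN, NATIONAL_SD), 2))
```

A ‘home made’ test has no such published norms, so this kind of comparison is only available with a standardized test.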
Commercially Produced Tests
These are tests in the public domain which cover a vast range of topics and
which can be used for evaluative purposes.
These tests include diagnostic tests, aptitude tests, achievement tests,
norm-referenced tests, criterion-referenced tests, reading tests, readiness
tests, etc.
There are numerous reasons for using commercially published tests. They
are objective, piloted and refined.
Moreover, they are standardized across a named population. The tests
declare how reliable and valid they are and tend to be parametric tests.
Hence, they enable sophisticated statistics to be calculated.
They come complete with instructions and are quick to administer and mark.
Guides to the interpretation of the data are clearly included.
The golden rule for deciding to use a published test is that it must
demonstrate fitness for purpose. If it fails to demonstrate this, then tests
will have to be devised by the researcher.
The Researcher Produced Test
The researcher-produced test is a ‘home grown’ test that is
tailored very tightly to the local and institutional context, so
that the purposes, objectives and content of the test are
deliberately fitted to the specific needs of the researcher in a
specific, given context.
In spite of varied advantages, there are several important
considerations in devising a ‘home grown’ test. It might be
time consuming to devise, pilot, refine and then administer
the test.
Generally, when commercially produced tests are not
accessible, teacher-produced tests are the major alternative.
These teacher-produced tests are more similar to
non-parametric tests.
Aptitude Test
It is a kind of test used to predict future performance.
The results are used to make a prediction about performance
on some criterion like grades, teaching effectiveness,
certification or test scores prior to instruction, placement or
training.
The term aptitude refers to the predictive use of the scores
from a test, rather than the nature of the test items.
Some terms like intelligence or ability are used
interchangeably with aptitude.
A national examination given to preparatory students to
predict their future performance in the university is a good
example of it.
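The predictive use of scores can be sketched as a simple least-squares line that relates earlier exam scores to a later criterion. This is only one possible way to make such a prediction; the function names, the exam scores and the grade point averages are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

def predict(score, slope, intercept):
    """Predicted criterion value for a new test score."""
    return intercept + slope * score

# Hypothetical data: national exam scores and later first-year university GPA.
exam_scores = [40, 50, 60, 70, 80]
first_year_gpa = [2.0, 2.4, 2.9, 3.1, 3.6]

slope, intercept = fit_line(exam_scores, first_year_gpa)
print("predicted GPA for an exam score of 65:", round(predict(65, slope, intercept), 2))
```

How far such a prediction can be trusted depends on how strongly the test scores and the criterion are actually related in the population concerned.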
Achievement Tests
Achievement tests have a more restricted coverage and they are
more closely tied to school subjects, and measure more recent
learning than aptitude tests.
The purpose of an achievement test is to measure what has been
learned rather than to predict future performance. Achievement
tests can be:
a. Diagnostic (identify weaknesses and strengths)
b. Survey batteries (test different content areas)
c. Norm-referenced (compare a result in relation to others)
d. Criterion-referenced (require criteria to be fulfilled)
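The difference between the last two types in the list can be sketched in a few lines: a norm-referenced interpretation locates a score among other scores, while a criterion-referenced interpretation checks the score against a fixed standard. The function names, the 80% cutoff and the class scores are assumptions made for illustration.

```python
def percentile_rank(score, norm_group):
    """Norm-referenced: percent of the group scoring below `score`.

    Scores equal to `score` count as half, a common convention.
    """
    below = sum(1 for s in norm_group if s < score)
    equal = sum(1 for s in norm_group if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_group)

def meets_criterion(score, max_score, cutoff=0.8):
    """Criterion-referenced: has the student reached the required proportion?"""
    return score / max_score >= cutoff

# Hypothetical scores out of 30 for a class of ten students.
class_scores = [12, 15, 18, 18, 20, 22, 25, 27, 28, 30]

print("percentile rank of 25:", percentile_rank(25, class_scores))
print("25/30 meets the 80% criterion:", meets_criterion(25, 30))
print("22/30 meets the 80% criterion:", meets_criterion(22, 30))
```

The same raw score can therefore look strong in one interpretation and insufficient in the other, depending on the group and on the standard chosen.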
Performance Assessment
In the past few years, a new type of assessment has become
very popular as an alternative to traditional testing formats
that rely on written objective items.
Here the emphasis is on measuring student proficiency on
cognitive skills by directly observing how a student performs
the skill in an authentic context.
In this regard, a researcher can see the performance of his/her
students in a live presentation.
The data gathered from the actual performance can help
him/her to come to a conclusion about the students'
achievement.
Psychological / Non-cognitive Tests
Tests can also be cognitive or non-cognitive. Cognitive tests include
aptitude and achievement tests. On the other hand, non-cognitive tests
cover traits such as interests, attitudes, self-concept, values,
personality and beliefs.
Wiersma (200:310) writes that personality inventories consider
characteristics such as motivation, attitudes and emotional adjustment as a
whole.
There are two kinds of personality tests: projective and non-projective.
Projective tests use a word, a picture or some other stimulus to elicit an
unstructured response, whereas non-projective tests are paper and pencil
tests that require the individual to respond.
Attitudes involve an individual's feelings toward such things as ideas,
procedures, and social institutions.
Attitude inventories used for research tend to be quite specific. A school
teacher, for instance, can be inspired to test the attitude of grade 9
students towards group work in English periods.
For this purpose, the researcher can set a test to measure the attitude of his
students in a specific way.
McMillan and Schumacher (1997:250) emphasize that these
non-cognitive traits are important in school success and, at the
same time, that measuring them accurately is more difficult than
assessing cognitive traits or skills.
In testing, non-cognitive items are susceptible to faking. One
form of faking is social desirability, in which subjects answer items in
order to appear most normal or most socially desirable rather
than responding honestly. The other problem with non-cognitive
tests is that their reliability is generally lower than that
of cognitive tests.
Similarly, non-cognitive tests do not have “right” answers like
cognitive tests.
Therefore, a researcher needs to be cautious when he/she uses
non-cognitive tests to gather information, as the results can be
affected by the varied factors discussed above.
Empirical Procedures for Estimating Reliability
When we administer a researcher-made
test, we should check the reliability and
validity of that test.
Several procedures are used to estimate
reliability. All of them have computation
formulas that produce a reliability
coefficient. The common ones are the
following.
1. Parallel forms or alternate forms
2. Test-retest
3. Internal-consistency methods
4. Inter-rater reliability
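The procedures listed above can be sketched with three commonly used coefficients: a Pearson correlation (for parallel forms and test-retest), Cronbach's alpha (one internal-consistency method) and Cohen's kappa (one inter-rater index). The slides do not specify which formulas are intended, so these choices, the function names and all data are assumptions made for illustration.

```python
import math
import statistics

def pearson_r(x, y):
    """Correlation between two sets of scores from the same subjects.

    Usable for parallel forms (form A vs. form B) and for test-retest
    (first vs. second administration).
    """
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def cronbach_alpha(items):
    """Internal consistency; `items` holds one list of scores per test item."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

def cohen_kappa(rater_a, rater_b):
    """Inter-rater agreement corrected for chance agreement.

    Assumes the raters do not both use a single category for every case
    (otherwise expected agreement is 1 and kappa is undefined).
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical data, invented for illustration.
first_sitting = [10, 12, 14, 16, 18]
second_sitting = [11, 12, 15, 15, 19]
item_scores = [[1, 2, 3, 4], [2, 2, 3, 5], [1, 3, 3, 4]]  # 3 items, 4 students
marks_a = ["pass", "pass", "fail", "pass", "fail", "fail"]
marks_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

print("test-retest r:", round(pearson_r(first_sitting, second_sitting), 2))
print("Cronbach's alpha:", round(cronbach_alpha(item_scores), 2))
print("Cohen's kappa:", round(cohen_kappa(marks_a, marks_b), 2))
```

Each coefficient answers a different question, so a researcher would normally report the one that matches how the test was administered and scored.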
Validity
Another essential characteristic of measurement is validity,
the extent to which an instrument measures what it is
supposed to measure.
To facilitate an understanding of the need to gather
appropriate evidence to establish the validity of our cognitive
and psychological tests, three types of evidence are
identified.
A. Content-related evidence: This refers to the extent to
which the content of a test is judged to be representative of
some appropriate universe or large domain of content.
B. Criterion-related evidence: This indicates whether the
scores on an instrument predict scores on a well-specified,
predetermined criterion.
Criterion-related validity uses two types of evidence:
concurrent evidence and predictive evidence.
C. Construct-related evidence: This is an interpretation or
meaning that is given to a set of scores from instruments that
assess a trait or theory that cannot be measured directly,
such as an unobservable trait like intelligence,
creativity or anxiety.