3. English test
tools / instruments
to draw out evidence
of the existence of
English abilities
4. Good instruments:
The hidden English abilities
are guaranteed to be
observable.
Bad instruments:
1. Damage measurement and
evaluation.
2. Cannot describe the real
language abilities of the test takers.
6. RELIABLE = STABLE = CONSISTENT
Reliability
• A reliable test is a test that can produce stable or
consistent scores.
• Test scores demonstrate consistency or stability no
matter who administers the test (rater or inter-rater).
• The scores are consistent no matter when or where the test
is administered.
7. Reliability
– Observed Score is the data gathered by the researcher
– True Score is the actual unknown values that correspond to the
construct of interest
– Error
– Systematic Error is variation that results from constructs that are not of interest
– Unsystematic / Random Error is nonsystematic variation in the observed
scores
Observed Score = True Score + (Measurement) Error
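The equation above can be sketched numerically. A minimal Python simulation, with all values hypothetical: a true score of 80, a systematic error of -2 (e.g. a consistently harsh rater), and normally distributed random error:

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

true_score = 80.0        # hypothetical, unknown-in-practice true ability
systematic_error = -2.0  # e.g. a consistently harsh rater (a construct of disinterest)

# Ten hypothetical administrations: each observed score is the true score
# plus the systematic error plus random (unsystematic) error.
observed = [true_score + systematic_error + random.gauss(0, 3) for _ in range(10)]

mean_observed = sum(observed) / len(observed)
# Random errors tend to average out across administrations, while the
# systematic error does not: the mean drifts toward 80 - 2 = 78, not 80.
print(round(mean_observed, 1))
```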
8. Students | Rater 1 | Rater 2
A | 8   | 8
B | 8.6 | 8.6
C | 9   | 9
D | 8   | 8
E | 9.4 | 9.4
Perfect Consistency
Consistency between raters (inter-rater)
9. Students | Rater 1 | Rater 2
A | 8.2 | 8
B | 8.6 | 8.8
C | 8.9 | 9
D | 8   | 8.1
E | 9.3 | 9.4
Consistent Test
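The consistency between the two raters above can be quantified with a Pearson correlation on their scores. A minimal sketch in plain Python, using the five score pairs from the table:

```python
from math import sqrt

# Scores given by the two raters to students A-E (from the table above)
rater1 = [8.2, 8.6, 8.9, 8.0, 9.3]
rater2 = [8.0, 8.8, 9.0, 8.1, 9.4]

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

r = pearson(rater1, rater2)
print(round(r, 2))  # → 0.97: high, though not perfect, inter-rater consistency
```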
13. How do we determine whether a
measurement is reliable?
The principles of reliability estimation utilize
these APPROACHES:
Test-retest
Parallel forms
Internal consistency
14. TEST-RETEST
Uses the same test twice with the same group of
subjects on different testing occasions.
There is a repetition of the use of the same
instrument and the involvement of the same
subjects.
The repetition is done on different days.
15. TIME | ACTIVITY | PARTICIPANTS | RESULT
Wednesday, 3/10/18 | Vocabulary test | 50 students of SMA 1 Bangsal | Result 1
Wednesday, 10/10/18 | The same vocabulary test | 50 students of SMA 1 Bangsal | Result 2
16. Advantages
• We only need one set of a test.
Disadvantages
• Requires two testing occasions.
• It is not easy to create similar
conditions on different testing
occasions.
• If the two administrations are too close
together, the test takers may still
remember the content of the test.
• If the interval before the second test is
too long, it may affect the test takers'
performance.
• May cause boredom, ailment, and the like.
17. Parallel Forms
Requires two or more sets of tests.
Each set of the test is made equal to the
other sets in every aspect of the test.
Equal in:
Test format
Test length
Level of difficulty
Discrimination indexes used
Time allocation
Test content
18. [Flow diagram] SET A (administered on Tuesday) and SET B (administered on
Friday) are given to the same group of students. The scores produced from
completing Set A and the scores produced from completing Set B are then
submitted to a correlational analysis.
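The correlational analysis in the last step can be sketched in Python. The Set A and Set B scores below are hypothetical, invented only for illustration:

```python
from math import sqrt

# Hypothetical scores for the same six students on the two parallel sets
set_a = [72, 85, 64, 90, 78, 69]   # administered on Tuesday
set_b = [70, 88, 66, 87, 80, 71]   # administered on Friday

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# The parallel-forms reliability estimate is the correlation between the two sets
r = pearson(set_a, set_b)
print(round(r, 2))
```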
19. Advantages
• Has more variation in the sets of the tests.
Disadvantages
• Time-consuming to make two or more sets of the tests.
• Not easy to keep the students' motivation in doing the
second test.
20. Internal Consistency
Based on the logic that if the items in the test are highly
correlated, the test is said to be reliable.
Before developing a test, the theoretical
ability to be measured by the test should be defined.
The items of the test should be constructed to measure
a single ability (technically called a "construct").
Tests with higher internal consistency more accurately
measure the construct intended by the test developers.
22. Test takers' scores
(hypothetical dichotomous scoring on 10 items)
Students | Item scores (1-10)   | Total score
A        | 1 1 0 1 0 1 1 0 1 0 | 6
B        | 1 0 1 0 1 0 1 1 1 0 | 6
C        | 0 1 1 1 1 1 1 1 1 1 | 9
D        | 1 0 1 0 0 0 1 1 1 0 | 5
E        | 1 0 0 0 1 1 1 1 1 1 | 7
F        | 1 1 0 1 1 1 1 1 1 1 | 9
Total    | 5 3 3 3 4 4 6 5 6 3 |
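For dichotomous item data like this, internal consistency is often estimated with KR-20 (Kuder-Richardson Formula 20). A sketch in plain Python using the matrix above (the population variance of the total scores is assumed in the denominator):

```python
# Item matrix from the table above: one row per student (A-F), 10 dichotomous items
items = [
    [1, 1, 0, 1, 0, 1, 1, 0, 1, 0],  # A
    [1, 0, 1, 0, 1, 0, 1, 1, 1, 0],  # B
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],  # D
    [1, 0, 0, 0, 1, 1, 1, 1, 1, 1],  # E
    [1, 1, 0, 1, 1, 1, 1, 1, 1, 1],  # F
]
n = len(items)     # number of test takers
k = len(items[0])  # number of items

totals = [sum(row) for row in items]
mean = sum(totals) / n
var_total = sum((t - mean) ** 2 for t in totals) / n  # population variance

# p_i: proportion answering item i correctly; q_i = 1 - p_i
p = [sum(row[i] for row in items) / n for i in range(k)]
pq = sum(pi * (1 - pi) for pi in p)

# KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores)
kr20 = (k / (k - 1)) * (1 - pq / var_total)
print(round(kr20, 2))  # low internal consistency for this hypothetical data
```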
23. Approaches to performing internal consistency estimation
Split-half: split the scores into the test takers' achievement on the first
half of the items and that on the second half.
The split can be the first and second halves of the total items, or based on
the odd- and even-numbered items.
Split-half has some drawbacks, discussed on a later slide.
Inter-item estimation: the test scores are correlated with themselves
within the same test.
This is called inter-item correlation.
The scores obtained on each item are correlated with one another:
item 1 is correlated with items 2, 3, 4, 5, 6, 7, 8, 9, 10;
item 2 is correlated with items 1, 3, 4, 5, 6, 7, 8, 9, 10; and so on.
Examples:
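The two kinds of split can be compared directly on the dichotomous data from the earlier scoring table. A sketch in plain Python, with the Spearman-Brown formula stepping the half-test correlation up to full test length; note how the two splits give very different estimates for the same data:

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_brown(r):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r / (1 + r)

# Item matrix from the earlier table (students A-F, 10 dichotomous items)
items = [
    [1, 1, 0, 1, 0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
]

# Split 1: first half of the items vs second half
first = [sum(row[:5]) for row in items]
second = [sum(row[5:]) for row in items]
r_halves = pearson(first, second)

# Split 2: odd-numbered items vs even-numbered items
odd = [sum(row[0::2]) for row in items]
even = [sum(row[1::2]) for row in items]
r_oddeven = pearson(odd, even)

print(round(r_halves, 3), round(spearman_brown(r_halves), 3))  # ≈ 0.408, 0.580
print(round(r_oddeven, 3))  # negative for this particular split
```

The first-half/second-half split and the odd/even split yield clearly different values on the same answers, which is exactly the different-splits drawback of the split-half approach.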
26. DRAWBACKS OF SPLIT-HALF
Does not fully reflect the true value of the reliability of the
test (Kline, 1993:11).
Different splits may produce different reliability results
(Cronbach, 1951).
Test length affects the reliability of the test: the more
items in the test, the more reliable the test is (Wiersma and
Jurs, 1990:163).
27. Example of inter-rater estimation
Two or more raters evaluate the students' speaking
skills.
The scoring may be based on several aspects. A
statistical analysis may be used to analyze the data,
usually a t-test.
A correlational analysis may be applied to examine the
closeness of the scores given by the two raters.