2. WHAT IS A PROFICIENCY TEST?
- Measures a learner's general level of language ability
- Is uncommon within the classroom but very frequent as the end aim (and motivation) of language learning
- May assess a student's skills in reading, writing, listening, speaking, or vocabulary
(Dizayee, 2013)
3. PURPOSES
Examples of proficiency tests are the International English Language Testing System (IELTS) and the Test of English as a Foreign Language (TOEFL).
• Benchmarking of skills and higher mental abilities
• Providing motivation for academic excellence
• Providing feedback to schools on the levels of learning of their students
4. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
International English Language Testing System (IELTS)
The test "measures the ability to communicate in English across all four language skills—listening, reading, writing and speaking—for people who intend to study or work where English is the language of communication."
5. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
International English Language Testing System (IELTS)
LISTENING
Four sections of recorded texts that increase in difficulty as the test progresses; a mixture of monologues and conversations. The seven different task types include: forms, notes, tables, matching, multiple choice, and classification.
6. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
International English Language Testing System (IELTS)
READING
Three passages based on authentic texts drawn from books, magazines, and journals. The ten different task types include: short answer, sentence completion, and labeling a diagram.
7. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
International English Language Testing System (IELTS)
WRITING
(Two Tasks)
1. Write a 150-word report based on material found in a table or diagram, demonstrating the ability to describe and explain.
2. Write a short essay of 250 words in response to an opinion or problem. Test takers are expected to demonstrate the ability to discuss issues, construct an argument, and use appropriate tone and register.
8. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
International English Language Testing System (IELTS)
SPEAKING
A 10-15 minute one-on-one interaction between the test taker and an examiner. Requires the test taker to describe, narrate, and provide explanations on topics of personal and general interest.
Overall test time: 2 hours 45 minutes
9. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
Test of English as a Foreign Language (TOEFL)
LISTENING
50 multiple choice questions divided into three parts:
1. 30 questions about short conversations
2. 8 questions about longer conversations
3. 12 questions about lectures or talks
10. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
Test of English as a Foreign Language (TOEFL)
STRUCTURE AND WRITTEN EXPRESSION
40 multiple choice questions including sentence completion
(15 items) and error identification (25 items).
11. GENERAL PROCEDURES FOR TEST CONSTRUCTION:
Test of English as a Foreign Language (TOEFL)
READING
50 multiple-choice questions
Test of Written English (TWE): 30 minutes. Test takers are asked to write a 250-300 word essay on an assigned topic.
Overall test time: approx. 4 hours
12. CRITERIA FOR THE CONSTRUCTION OF TESTS:
(Ivanova & Terzieva, 2015)
1. REMEMBERING
Meant to check students' knowledge of words and grammatical constructions. Test takers are required to recognize, recall, and reproduce lexical and grammatical units that have been studied during their academic classes.
2. COMPREHENSION
Test tasks check students' understanding of the meaning of words. Students need to be able to paraphrase words, recognize synonyms and antonyms of words and expressions, etc.
13. CRITERIA FOR THE CONSTRUCTION OF TESTS:
(Ivanova & Terzieva, 2015)
3. APPLICATION
Students are instructed to detect and correct different types of mistakes
in a given context.
4. ANALYZING
Students are required to relate their theoretical knowledge to practice, compare and distinguish various options, and select the most appropriate ones for specific cases.
5. EVALUATING AND CREATING
Students are expected to organize their ideas into a comprehensible text, to summarize their viewpoints, and to evaluate other people's opinions and texts.
15. PURPOSE OF STANDARDIZED TESTS
"to provide fair, valid and reliable assessments that produce meaningful results. Standardized testing, if done carefully and with a high degree of quality assurance, can eliminate bias and prevent unfair advantages by testing the same or similar information under the same testing conditions."
18. STEPS IN MAKING STANDARDIZED TESTS:
1. DEFINING OBJECTIVES
Identifying a need to measure certain skills or knowledge. Once a decision is made to develop a test to accommodate this need, test developers ask some fundamental questions:
Who will take the test and for what purpose?
What skills and/or areas of knowledge should be tested?
How should test takers be able to use their knowledge?
What kinds of questions should be included? How many of each kind?
How long should the test be?
How difficult should the test be?
19. STEPS IN MAKING STANDARDIZED TESTS:
2. ITEM DEVELOPMENT COMMITTEES
Typically consist of educators and/or other professionals working under the guidance of the sponsoring agency or association. Responsibilities of these item development committees may include:
• defining test objectives and specifications
• helping ensure test questions are unbiased
• determining test format (e.g., multiple-choice, essay, constructed-response, etc.)
• considering supplemental test materials
• reviewing test questions, or test items
• writing test questions
20. STEPS IN MAKING STANDARDIZED TESTS:
3. WRITING AND REVIEWING QUESTIONS
Each test question undergoes numerous reviews and revisions to ensure that it is as clear as possible, that it has only one correct answer among the options provided on the test, and that it conforms to the style rules used throughout the test.
Scoring guides for open-ended responses, such as short written answers, essays, and oral responses, go through similar reviews.
21. STEPS IN MAKING STANDARDIZED TESTS:
4. THE PRETEST
After the questions have been written and reviewed, many are pretested with a sample group similar to the population to be tested. The results enable test developers to determine (see the sketch after this list):
• the difficulty of each question
• if questions are ambiguous or misleading
• if questions should be revised or eliminated
• if incorrect alternative answers should be revised or replaced
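For illustration only, here is a minimal sketch of how pretest results might be screened for the first point, item difficulty. The 0/1 response matrix and the flagging thresholds are assumptions for the example, not any testing agency's actual procedure:

```python
import numpy as np

# Hypothetical pretest data: rows = pilot test takers, columns = items (1 = correct).
rng = np.random.default_rng(0)
true_difficulty = rng.uniform(0.1, 0.95, size=40)            # assumed per-item difficulty
responses = (rng.random((200, 40)) < true_difficulty).astype(int)

p = responses.mean(axis=0)          # proportion correct = observed item difficulty
too_easy = np.where(p > 0.90)[0]    # nearly everyone answers correctly
too_hard = np.where(p < 0.20)[0]    # nearly everyone answers incorrectly
print("items to review:", sorted(set(too_easy.tolist()) | set(too_hard.tolist())))
```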
22. STEPS IN MAKING STANDARDIZED TESTS:
5. DETECTING AND REMOVING UNFAIR QUESTIONS
To meet fairness guidelines, trained reviewers must carefully inspect each individual test question, the test as a whole, and any descriptive or preparatory materials to ensure that language, symbols, words, phrases, and content generally regarded as sexist, racist, or otherwise inappropriate or offensive to any subgroup of the test-taking population are eliminated.
23. STEPS IN MAKING STANDARDIZED TESTS:
6. ASSEMBLING THE TEST
After the test is assembled, it is reviewed by other specialists, committee members, and sometimes other outside experts. Each reviewer answers all questions independently and submits a list of correct answers to the test developers.
24. STEPS IN MAKING STANDARDIZED TESTS:
7. Making Sure — Even After the Test is Administered — that the Test Questions are Functioning Properly
Statisticians and test developers review results to make sure that test questions are working as intended. Before final scoring takes place, each question undergoes preliminary statistical analysis, and results are reviewed question by question. If a problem is detected, such as the identification of a misleading answer to a question, corrective action, such as not scoring the question, is taken before final scoring and score reporting take place.
25. STEPS IN MAKING STANDARDIZED TESTS:
7. Making Sure — Even After the Test is Administered — that the Test Questions are Functioning Properly
Tests are also reviewed for reliability. Performance on one version of the test should reasonably predict performance on any other version of the test. If reliability is high, results will be similar no matter which version a test taker completes.
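One way to gauge this across-version consistency is to correlate the scores of the same group on two forms of the test. The sketch below is illustrative only; the scores are invented and this is not the procedure of any particular testing program:

```python
import numpy as np

# Hypothetical scores for the same eight test takers on two versions (forms) of a test.
form_a = np.array([71, 65, 88, 92, 54, 77, 81, 60])
form_b = np.array([69, 68, 85, 95, 58, 74, 83, 57])

# Alternate-forms reliability estimate: Pearson correlation between the two forms.
r = np.corrcoef(form_a, form_b)[0, 1]
print(f"alternate-forms reliability = {r:.2f}")  # values near 1.0 indicate high reliability
```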
27. WHAT IS ITEM ANALYSIS?
Item Analysis (a.k.a. Test Question Analysis) is a useful means of discovering how well individual test items assess what students have learned.
28. WHAT IS ITEM ANALYSIS?
An item analysis provides a systematic approach to examining tests to determine whether individual questions function the way they were intended (Worthen et al., 1999).
29. WHAT IS ITEM ANALYSIS?
For instance, it helps us to answer the following
questions.
• Is a particular question as difficult, complex, or rigorous as you
intend it to be?
• Does the item do a good job of separating students who know the content from those who may merely guess the right answer or apply test-taking strategies to eliminate the wrong answers?
• Which items should be eliminated or revised before use in subsequent administrations of the test?
30. WHAT IS ITEM ANALYSIS?
With this process, you can improve test score validity and reliability by analyzing item performance over time and making the necessary adjustments.
Test items can be systematically analyzed regardless of how the test is administered.
31. FOUR STEPS TO ITEM ANALYSIS
1. RELIABILITY
Test Score Reliability is an index of the likelihood that scores would remain consistent over time if the same test were administered repeatedly to the same learners.
Item Reliability is an indication of the extent to which your test measures learning about a single topic, such as a single content area.
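For dichotomously scored (right/wrong) items, one common internal-consistency index is KR-20. The sketch below is a minimal illustration; the function name and the 0/1 response-matrix layout are assumptions for the example:

```python
import numpy as np

def kr20(responses: np.ndarray) -> float:
    """Kuder-Richardson Formula 20: internal consistency of 0/1-scored items.

    responses: 2-D array with rows = students and columns = items (1 = correct).
    """
    k = responses.shape[1]                          # number of items
    p = responses.mean(axis=0)                      # proportion correct per item
    q = 1.0 - p                                     # proportion incorrect per item
    var_total = responses.sum(axis=1).var(ddof=1)   # sample variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / var_total)
```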
32. FOUR STEPS TO ITEM ANALYSIS
2. DIFFICULTY
Item Difficulty represents the percentage of students
who answered a test item correctly.
3. DISCRIMINATION
Item Discrimination is the degree to which students with high overall exam scores also got a particular item correct.
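As a concrete sketch (assuming the same 0/1 response matrix as in the KR-20 example; the 27% grouping fraction is a common convention rather than a fixed rule), difficulty and an upper-lower discrimination index could be computed like this:

```python
import numpy as np

def item_difficulty(responses: np.ndarray) -> np.ndarray:
    """Proportion of students who answered each item correctly (rows = students)."""
    return responses.mean(axis=0)

def discrimination_index(responses: np.ndarray, frac: float = 0.27) -> np.ndarray:
    """Upper-lower discrimination: item difficulty in the top scorers minus the bottom scorers."""
    totals = responses.sum(axis=1)        # each student's total score
    order = np.argsort(totals)            # students sorted from lowest to highest total
    n = max(1, int(len(totals) * frac))   # size of the upper and lower groups
    low, high = responses[order[:n]], responses[order[-n:]]
    return high.mean(axis=0) - low.mean(axis=0)
```

Difficulty values near 1.0 mean nearly everyone answered the item correctly; discrimination values well above 0 suggest the item separates high scorers from low scorers.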
33. FOUR STEPS TO ITEM ANALYSIS
4. DISTRACTORS
Distractors are the multiple-choice response options that are not the correct answer. They are plausible but incorrect options, often developed based upon students' common misconceptions or miscalculations, to see whether students have moved beyond them.
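A simple way to check distractors is to tally how often each option is chosen: a distractor that almost no one selects is doing no work and is a candidate for revision. A minimal sketch, with invented option labels and responses:

```python
from collections import Counter

def distractor_counts(choices, options=("A", "B", "C", "D")):
    """Tally how often each response option was selected for a single item."""
    counts = Counter(choices)
    return {opt: counts.get(opt, 0) for opt in options}

# Invented responses for one item whose correct answer (key) is "C".
picks = ["A", "C", "C", "B", "C", "A", "C", "D", "C", "C"]
print(distractor_counts(picks))  # {'A': 2, 'B': 1, 'C': 6, 'D': 1}
```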
34. CONCLUSION TO ITEM ANALYSIS
Item analysis is an empowering process. Knowledge of score reliability, item difficulty, item discrimination, and effective distractor design can help an instructor decide whether to retain items for future administrations, revise them, or eliminate them from the test item pool.