Types of tests: proficiency, achievement, diagnostic, placement
Types of testing: direct vs indirect tests, discrete point vs integrative tests, criterion-referenced vs norm-referenced tests, objective vs subjective tests
Kinds of tests and testing
Proficiency tests, achievement tests, diagnostic tests, placement tests; direct and indirect testing, discrete-point and integrative testing, norm-referenced and criterion-referenced testing, objective and subjective testing, computer-adaptive testing
For the presentation transcription which contains more information, click here:
http://www.4shared.com/file/bLzJpPYqce/presentation_transcription__2_.html
PowerPoint based on the article "Testing for Language Teachers" (Arthur Hughes), pages 83 to 112 (Chapter 9: Testing writing). This work was done by Idoia Argudo and Marta Ribas for a subject at Universidad de Cantabria.
A Brief History on the Approaches to
Language Testing
In the 1950s, an era of behaviorism and special attention to contrastive analysis, testing focused on specific language elements such as the phonological, grammatical, and lexical contrasts between two languages.
Between the 1970s and 1980s, communicative theories of language brought with them a more integrative view of testing, in which specialists claimed that the whole of the communicative event was considerably greater than the sum of its linguistic elements (Clark, 1983; Brown, 2004: 8).
Definition of Language Testing
According to Oller (1979: 1-2), a language test is a device that tries to assess how much has been learned in a foreign language course, or some part of a course, by learners.
According to Brown (2004: 3), a language test is a method of measuring a person's ability, knowledge, or performance in a given domain.
Testing is an important part of English Language Teaching. It helps teachers to construct effective tests and to take the testing system to new heights.
2. TEACHING AND TESTING
BACKWASH
The effect of testing on teaching and learning
If the test is important, it can dominate all teaching and learning activities
It can be harmful or beneficial
Harmful backwash: the test content and testing techniques are at variance with the objectives of the course
Beneficial backwash: the test content and techniques are consistent with the objectives of the course, so testing has an immediate positive effect on teaching
3. All measures of mental ability are necessarily
indirect,
incomplete,
imprecise,
subjective,
relative
To minimize the effects of these limitations:
A. Provide clear theoretical definitions of the abilities we want to measure;
B. Specify precisely the conditions, or operations, that we will follow in eliciting and observing performance;
C. Quantify the observations so as to ensure that our measurement scales have the properties we require.
4. GENERAL TYPES OF TESTS
Proficiency tests
Achievement tests
Diagnostic tests
Placement tests
Selection Tests
Competition tests
Aptitude tests
Language aptitude tests
Vocational aptitude tests
KINDS OF TESTS AND TESTING
5. PROFICIENCY TESTS
measure language ability regardless of any previous training
are not based on the content or objectives of language courses
are based on a specification of what candidates must be able to do to be considered proficient
Proficiency: having sufficient command of the language
used for a particular purpose, such as:
a translator in the United Nations
a student seeking admission to American or British universities
used for a general purpose, such as:
general proficiency tests
FCE, CPE, TOEFL, IELTS
6. ACHIEVEMENT TESTS
are directly related to language courses
Used to determine whether students have achieved
the objectives of the course or not.
Kinds of Achievement tests
1. Final achievement tests
2. Progress achievement tests
7. Final achievement tests
are administered at the end of a course of study.
Their contents are related to the course concerned.
Syllabus-content approach:
Should the test be based directly on a detailed course syllabus?
Disadvantage: if the syllabus is badly designed, the results of the test could be misleading
Course-objectives approach:
Should the test be based on course objectives?
Advantages:
compelling course designers to be explicit about course objectives
making it possible for the test to show how far objectives have been achieved
compelling course designers to choose a syllabus which is consistent with the course objectives
working against poor teaching practice
promoting a more beneficial backwash effect
8. PROGRESS ACHIEVEMENT TESTS
measure the progress of students; one way to measure progress is to administer the final achievement test repeatedly
Disadvantage:
the low scores in the early stages are discouraging
The alternative is to establish a series of well-defined short-term objectives. These should make a clear progression towards the final achievement test based on course objectives
Pop quizzes
make a rough check on students' progress
keep students on their toes
9. DIAGNOSTIC TESTS
• are used to identify learners' strengths and weaknesses
• are intended to ascertain what learning still needs to take place
• can tell us that someone is particularly weak in, say, speaking as opposed to reading in a language
• Proficiency tests may prove adequate for this purpose
• Teachers may even need to analyze samples of a person's performance in writing or speaking in order to create profiles of the student's ability in certain categories
10. PLACEMENT TESTS
are intended to place students at the stage of the teaching program most appropriate to their abilities
are used to assign students to classes at different levels
are constructed for particular situations
depend on the identification of the key features at different levels of teaching
APTITUDE TESTS
indicate an individual's facility for acquiring specific skills and learning
are used to measure aptitude for learning and to predict future performance
11. DIRECT VS. INDIRECT TESTING
Direct tests require the candidate to perform precisely the skill that we wish to measure
If we want to know how well candidates can write compositions, we get them to write compositions.
If we want to know how well they pronounce a language, we ask them to speak.
The tasks, and the texts used, should be as authentic as possible.
Direct testing is easier to carry out for measuring the productive skills.
12. Attractions of direct testing
1. It is straightforward to create the conditions that elicit the required behaviors
2. Assessment and interpretation are straightforward
3. It produces a helpful backwash effect
SEMI-DIRECT TESTING
e.g. speaking tests where candidates respond to tape-recorded stimuli, their own responses being recorded and later scored
13. INDIRECT TESTING
measures the abilities that underlie the skills tested.
EXAMPLE: one section of the TOEFL serves as an indirect measure of writing ability; candidates identify the faulty part of items such as:
At first the old woman seemed unwilling to accept anything that was offered her by my friend and I.
The main appeal of indirect testing:
testing a large number of elements in one test
giving it to a large number of students
scoring it objectively
The main problem of indirect testing:
the relationship between performance on the test and actual performance of the skills being tested is weak in strength and uncertain in nature
14. DISCRETE POINT VS. INTEGRATIVE TESTING
A. DISCRETE-POINT TESTING
refers to the testing of one element at a time, item by item.
might take the form of a series of items, each testing a particular grammatical structure.
is a testing approach which cuts language skills and components up into smaller parts and then tests them one by one.
is an atomistic approach to language teaching and learning.
15. B. INTEGRATIVE TESTING
requires the candidate to combine many language elements in the completion of a task, e.g.:
writing a composition
taking notes while listening to a lecture
taking a dictation
completing a cloze passage
Unlike discrete-point tests, integrative tests tend to be direct, although some integrative methods, such as the cloze procedure, are indirect.
Diagnostic tests of grammar tend to be discrete-point.
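A fixed-ratio cloze passage of the kind mentioned above can be generated mechanically. The sketch below is a minimal illustration in Python: the sample passage, the deletion ratio, and the function name are assumptions, and real cloze tests often delete words on a rational rather than fixed basis.

```python
def make_cloze(text, n=5, lead_in=2):
    """Fixed-ratio cloze: leave the first `lead_in` words intact, then
    replace every nth word with a gap. Returns the gapped passage and
    the answer key (the deleted words, in order)."""
    words = text.split()
    answers = []
    for i in range(lead_in + n - 1, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers

passage = ("Testing is an important part of language teaching because it "
           "shows teachers how far the objectives of a course have been achieved")
gapped, key = make_cloze(passage, n=5)
# every 5th word after the first two becomes a gap; `key` holds the answers
```

Candidates fill each gap from context, which is why the cloze procedure counts as integrative even though it measures writing and reading only indirectly.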
16. NORM-REFERENCED VS. CRITERION-REFERENCED TESTING
A. NORM-REFERENCED TESTING (NRT)
relates one candidate's performance to that of other candidates.
We are not told directly what the student is capable of doing in the language.
B. CRITERION-REFERENCED TESTING (CRT)
provides direct information about what a candidate can actually do in the language.
17. OBJECTIVE VS. SUBJECTIVE TESTING
A. OBJECTIVE TESTING
No judgment is required on the part of the scorer (multiple-choice tests)
B. SUBJECTIVE TESTING
Judgment is called for on the part of the scorer (compositions)
There are different degrees of subjectivity in testing.
Scoring compositions is more subjective than scoring short-answer items.
Objectivity in scoring brings greater reliability to testing.
Scoring rubrics can increase the reliability of subjective tests such as compositions.
18. COMPUTER ADAPTIVE TESTING
an efficient way of collecting information on testees' ability: there is no real need for strong candidates to attempt easy items, and no need for weak candidates to attempt difficult items.
Items of average difficulty are presented initially.
Those who respond correctly are presented with a more difficult item.
Those who respond incorrectly are presented with an easier item.
The computer adapts the items to the testee's level.
Oral interviews are typically a form of adaptive testing.
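The adaptive procedure described on this slide can be sketched as a simple loop. This is a deliberately minimal Python illustration: the three-level item bank, the fixed test length, and the up-one/down-one rule are all assumptions, and operational adaptive tests estimate ability with item response theory rather than a ladder of difficulty levels.

```python
import random

def adaptive_test(item_bank, answers_correctly, n_items=5):
    """Minimal computer-adaptive loop: start at average difficulty,
    move up a level after a correct response, down after an incorrect one.
    `item_bank` maps difficulty (0 = easy, 1 = average, 2 = hard) to items;
    `answers_correctly(item)` simulates the candidate's response."""
    level = 1  # initial item of average difficulty
    log = []
    for _ in range(n_items):
        item = random.choice(item_bank[level])
        correct = answers_correctly(item)
        log.append((item, level, correct))
        level = min(level + 1, 2) if correct else max(level - 1, 0)
    return log

bank = {0: ["easy-1"], 1: ["avg-1"], 2: ["hard-1"]}
# a strong candidate (always correct) climbs to and stays at the hardest level
strong_run = adaptive_test(bank, lambda item: True, n_items=4)
```

The same mechanism is what a live interviewer does intuitively when pitching questions at the candidate's apparent level, which is why oral interviews count as adaptive testing.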
19. COMMUNICATIVE LANGUAGE TESTING
measures the ability to take part in acts of communication, including reading and listening
It is assumed that it is usually communicative ability that we want to test.
20. VALIDITY: Definition
A test is valid if it measures accurately what it is intended
to measure
Types of validity
Construct
Content
Criterion-related
Face
CHAPTER 4 : VALIDITY
22. Construct Validity
the degree to which a test measures what it claims, or purports, to be measuring
Construct: an attribute, ability, or skill that happens in the human brain and is defined by established theories.
Intelligence, motivation, anxiety, proficiency, and fear are all examples of constructs.
They exist in theory and have been observed to exist in practice.
Constructs exist in the human brain and are not directly observable.
There are two types of construct validity: convergent and discriminant validity. Construct validity is established by looking at numerous studies that use the test being evaluated.
23. 2. CONTENT VALIDITY
The test content is a representative sample of the language skills being tested.
The test is content valid if it includes a proper sample.
Importance of content validity:
the greater a test's content validity, the more likely its construct validity
a test without content validity is likely to have a harmful backwash effect, since areas that are not tested are likely to become ignored in teaching and learning
24. 3. CRITERION-RELATED VALIDITY
the degree to which results on the test agree with those provided by an independent criterion
Kinds of criterion-related validity:
A. Concurrent validity
is established when the test and the criterion are administered at the same time
B. Predictive validity
concerns the degree to which a test can predict candidates' future performance
25. VALIDITY COEFFICIENT
a mathematical measure of similarity that shows the degree of validity
Perfect validity will result in a coefficient of 1.00
Total lack of validity results in a coefficient of 0.00
What counts as satisfactory validity depends on the test's purpose and importance
A coefficient of 0.70 might be considered low if the test is important
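In practice the validity coefficient is a correlation, typically Pearson's r, between scores on the test and scores on the independent criterion. A minimal sketch (the score lists below are invented for illustration):

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two paired score lists,
    e.g. scores on a new test (xs) and on a criterion measure (ys)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

test_scores = [55, 60, 70, 80, 90]       # invented test scores
criterion_scores = [50, 62, 68, 85, 88]  # invented criterion scores
validity_coefficient = pearson(test_scores, criterion_scores)
# a value near 1.00 indicates high concurrent validity
```

The same statistic underlies the 1.00 and 0.00 endpoints quoted above: identical rank orderings push r towards 1, unrelated orderings towards 0.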
VALIDITY IN SCORING
A reading test may call for short written responses.
If the scoring of these responses takes spelling and grammar into account, then it is not valid in scoring.
26. 4. FACE VALIDITY
the way the test looks to the examinees, test administrators, educators, and the like
If you want to test students' pronunciation but you do not ask them to speak, your test lacks face validity.
If your test contains items or materials which are not acceptable to candidates, teachers, educators, etc., your test lacks face validity.
HOW TO MAKE TESTS MORE VALID?
Write explicit specifications for the test which include all the constructs to be measured.
Make sure that you include a representative sample of the content.
Use direct testing.
Make sure the scoring is valid.
Make the test reliable.
27. RELIABILITY
refers to the stability or consistency of scores
nearly the same scores for the same individuals in two sessions
Multiple-choice tests have a high coefficient of reliability
Look at the tables on p. 37
RELIABILITY COEFFICIENT
The ideal coefficient is 1.00
Total lack of reliability is 0.00
What counts as satisfactory reliability depends on the purpose and importance of the test:
Vocabulary, structure, and reading tests: .90 - .99
Auditory comprehension tests: .80 - .89
Oral production tests: .70 - .79
CHAPTER 5 : RELIABILITY
28. HOW TO ESTIMATE RELIABILITY?
the way in which the reliability coefficient is arrived at
Test-retest method
The same students take the same test twice, and the two sets of scores are then compared.
Drawbacks of this method:
If the second administration is too soon, the students will remember items, and their scores will be inflated.
If the interval is too long, the students will have forgotten or improved, and that will affect the scores.
29. The alternate forms method
requires two equivalent forms of the test, but the problem is that such forms are usually not available
The split-half method
the most common method of estimating reliability
The subjects take the test one time, but each subject is given two scores, one for each half of the test.
The two sets of scores are then used to obtain the reliability coefficient.
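The split-half method can be sketched numerically. In the Python sketch below, the odd/even split, the 0/1 item data, and the Spearman-Brown correction (the standard adjustment for having correlated two half-length tests) are all illustrative choices rather than the only way to carry the method out:

```python
from statistics import mean

def split_half_reliability(item_scores):
    """Estimate reliability from a single administration: split each
    candidate's items into odd- and even-numbered halves, correlate the
    half-test totals, then apply the Spearman-Brown correction."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    mo, me = mean(odd), mean(even)
    cov = sum((o - mo) * (e - me) for o, e in zip(odd, even))
    var_o = sum((o - mo) ** 2 for o in odd)
    var_e = sum((e - me) ** 2 for e in even)
    r_half = cov / (var_o * var_e) ** 0.5
    return 2 * r_half / (1 + r_half)  # correct for the halved test length

# one row of right(1)/wrong(0) item scores per candidate (invented data)
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
reliability = split_half_reliability(scores)
```

Because the two halves come from the same sitting, this method avoids the memory and forgetting problems of test-retest.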
30. THE STANDARD ERROR OF MEASUREMENT AND THE TRUE SCORE
All test scores are estimates
All tests contain some degree of error
You have to use a statistic known as the standard error of measurement to estimate the limits within which an obtained score is likely to diverge from the true score.
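The standard error of measurement has a simple standard formula, SEM = s * sqrt(1 - r), where s is the standard deviation of the test scores and r the reliability coefficient. A minimal sketch (the standard deviation, reliability, and obtained score below are invented values for illustration):

```python
def standard_error_of_measurement(sd, reliability):
    """SEM = standard deviation * sqrt(1 - reliability). Roughly 68% of
    the time, the true score lies within one SEM of the obtained score."""
    return sd * (1 - reliability) ** 0.5

# a test with standard deviation 10 and reliability 0.91 (invented values)
sem = standard_error_of_measurement(10, 0.91)
obtained = 65
likely_band = (obtained - sem, obtained + sem)  # plausible range of the true score
```

Note how the formula ties the two chapters together: the higher the reliability coefficient, the smaller the SEM, and the closer an obtained score sits to the true score.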
31. SCORER RELIABILITY
consistency of scoring
nearly the same score for the same test
in other words, comparing the scores of two or more scorers for the same students
In composition tests, scores usually fluctuate.
In multiple-choice tests, scorer reliability is nearly perfect.
If the scoring of a test is not reliable, then the test results cannot be reliable either.
32. HOW TO MAKE TESTS MORE RELIABLE?
1) Take enough samples of behavior
The more items you have on a test, the more reliable the test will be.
Considerations to be taken into account when adding extra items:
Additional items should be independent of each other and of existing items.
Each additional item should represent a fresh start for the candidate.
Tests should be neither too long nor too short.
33. HOW TO MAKE TESTS MORE RELIABLE?
2) Exclude items which do not discriminate well between weaker and stronger students
Items on which strong students and weak students perform with a similar degree of success contribute little to the reliability of a test.
Items that are too easy or too difficult should be excluded.
A small number of easy items may be kept at the beginning of a test to give candidates confidence and reduce the stress they feel.
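One common way to quantify how well an item separates weaker from stronger students is a discrimination index: the proportion of a high-scoring group answering the item correctly minus the proportion of a low-scoring group. The Python sketch below is illustrative; the half-and-half grouping and the score data are assumptions (item-analysis practice often uses the top and bottom 27%).

```python
def discrimination_index(item_correct, total_scores, fraction=0.5):
    """D = p(upper group answered correctly) - p(lower group answered
    correctly). `item_correct[i]` is 1 if candidate i got this item right;
    `total_scores[i]` is that candidate's whole-test score. Items with D
    near 0 (or negative) discriminate poorly and are candidates for removal."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, int(len(order) * fraction))
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

totals = [10, 20, 30, 40]  # invented whole-test scores for four candidates
too_easy = discrimination_index([1, 1, 1, 1], totals)   # everyone right
good_item = discrimination_index([0, 0, 1, 1], totals)  # only the strong right
```

An item everyone answers correctly yields D = 0 and adds nothing to reliability, which is the point this slide is making.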
34. HOW TO MAKE TESTS MORE RELIABLE?
3) Do not allow candidates too much freedom
The procedure of giving candidates a choice of questions has a negative effect on reliability.
In general, candidates should not be given a choice.
4) Write unambiguous items
5) Provide clear and explicit instructions
6) Ensure that tests are well laid out and perfectly legible
7) Make candidates familiar with the format and testing techniques
35. HOW TO MAKE TESTS MORE RELIABLE?
8) Provide uniform and non-distracting conditions of administration
9) Use items that permit scoring which is as objective as possible
10) Provide a detailed scoring key
11) Train scorers
12) Ensure all scorers follow the same criteria for scoring
13) Identify candidates by number, not name
14) Employ multiple, independent scoring
36. RELIABILITY AND VALIDITY
A valid test must be reliable.
However, a reliable test may not be valid at all.
Increasing the reliability of a test may come at the expense of validity.
There will always be some tension between reliability and validity.
The tester has to balance gains in one against losses in the other.
37. CHAPTER 6 :
ACHIEVING BENEFICIAL BACKWASH
Test the abilities whose development you want to encourage.
Beware of the usual reasons for not testing particular abilities:
the attraction of objectively scored formats such as multiple choice
the difficulty of subjective scoring in the case of subjective tests
the expense involved in terms of time and money
Determine the points that should be tested and give them sufficient weight in relation to the other abilities.
38. How to achieve beneficial backwash:
I. Sample widely and unpredictably.
II. Use direct testing.
III. Make testing criterion-referenced.
IV. Base achievement tests on objectives.
V. Ensure the test is known and understood by students and teachers.
VI. Where necessary, provide assistance to teachers.
VII. Count the cost.
39. CHAPTER 7:
STAGES OF TEST DEVELOPMENT
1) Make a full and clear statement of the testing ‘problem’
2) Write complete specifications for the test
3) Write and moderate items
4) Try the items on native speakers
5) Try the items on non-native speakers
6) Analyze the results of the trial and make necessary
changes
7) Calibrate scales
8) Validate
9) Write handbooks for test takers, test users, and staff
10) Train any necessary staff (interviewers, raters, etc.)
40. Stating the problem
The questions to be answered in order to state the problem:
i) What kind of test is it to be?
ii) What is its precise purpose?
iii) What abilities are to be tested?
iv) How detailed must the results be?
v) How accurate must the results be?
vi) How important is backwash?
vii) What constraints are set by the unavailability of expertise, facilities, and time?
41. 2) Writing specifications for the test
i) Determining content
Specifying instructional objectives
Preparing a table of specifications
Determining the number of items
ii) Necessary operations by the test developer
Specification of text types: letters, forms, academic essays
Addressees of texts
Length of text(s)
Topics (familiar/unfamiliar)
Readability
Structural and vocabulary range
Dialect, accent, style
Speed of processing: words to be read per minute, rate of speech
42. iii) Structure, timing, medium/channel and techniques
Test structure (test sections: grammar, vocabulary, reading)
Number of items
Number of passages
Medium/channel (tape, paper & pencil, ...)
Timing
Techniques
iv) Critical levels of performance
v) Scoring procedures: subjective or objective
3) Writing and moderating items
i) Sampling (based on the contents)
ii) Writing items
iii) Moderating items (reviewing)
43. 4) and 5) Pretesting
Informal trialling of items on native speakers
Trialling items on non-native speakers (pretesting)
6) Item analysis (analysis of the results of the trial)
reliability
level of difficulty
discrimination index
distractors
clarity of instructions and items
timing
7) Calibration of scales
8) Validation
9) Writing handbooks for test takers, test users, and staff
10) Training staff
44. 7) Calibration of scales
For testing speaking and writing, a team of experts looks at samples of the skills and assigns each of them to a point on the relevant scale.
8) Validation
This is essential for proficiency tests and repeatedly used tests.
9) Writing handbooks for test takers, test users, and staff
This is essential for proficiency tests and repeatedly used tests.
10) Training staff
This is essential for proficiency tests and repeatedly used tests.
See pp. 66 - 72 for examples of test development.