Language Testing Techniques:
Direct Testing vs. Indirect Testing
Discrete Testing vs. Integrative Testing
Norm-Referenced Testing vs. Criterion-Referenced Testing
Objective Testing vs. Subjective Testing
A brief summary of the Test Methods and Test Facets affecting testing performance (Source: Fundamental Considerations in Language Testing - Lyle F. Bachman)
Types of tests: proficiency, achievement, diagnostic, placement
Types of testing: direct vs indirect tests, discrete point vs integrative tests, criterion-referenced vs norm-referenced tests, objective vs subjective tests
A Brief History of the Approaches to Language Testing
In the 1950s, an era of behaviorism and special
attention to contrastive analysis, testing focused on
specific language elements such as the phonological,
grammatical, and lexical contrasts between two
languages.
Between the 1970s and 1980s, communicative theories
of language brought with them a more integrative view of
testing in which specialists claimed that the whole of the
communicative event was considerably greater than the
sum of its linguistic elements (Clark, 1983; Brown, 2004: 8).
Definition of Language Testing
According to Oller (1979: 1-2), a language test is a
device that tries to assess how much learners have learned
in a foreign language course, or in some part of a course.
According to Brown (2004: 3), a language test is a
method of measuring a person’s ability, knowledge, or
performance in a given domain.
The aim is to add knowledge about teaching that can help students and teachers in the learning process, so that both can assess how they interact to achieve their goals in class. Assessment of learning focuses on the development and use of assessment tools to improve the teaching-learning process. It emphasizes the use of testing to measure knowledge, comprehension, and other thinking skills. It allows students to go through the standard steps of test construction for quality assessment. Students will experience how to develop rubrics for performance-based and portfolio assessment. The presentation includes educational technology and statistical tools that help to determine what students have learned.
2. Chapter 2 Measurement
◊ Introduction
◊ Definition of terms: measurement, test, evaluation
♦ Measurement
◊ Quantification
◊ Characteristics
◊ Rules and procedures
♦ Test
♦ Evaluation
◊ Essential measurement qualities
◊ Properties of measurement scales
◊ Characteristics that limit measurement
◊ Steps in measurement
◊ Summary
Instructor: Professor Khoshsima
Presenter: Omidi, A
Sat 19/04/2014
3. Introduction
◊ In developing language tests, we must take into account considerations and follow procedures
that are characteristic of tests and measurement in the social sciences in general. Likewise, our
interpretation and use of the results of language tests are subject to the same general limitations
that characterize measurement in the social sciences.
◊ The purpose of this chapter is to introduce the fundamental concepts of measurement, an
understanding of which is essential to the development and use of language tests.
◊ Fundamental Concepts of Measurement
► The terms ‘measurement’, ‘test’, and ‘evaluation’ and how these are distinct from each other
► Different types of measurement scales and their properties
► The essential qualities of measures – reliability and validity
► The characteristics of measures limiting our interpretations of test results
4. Definition of terms: measurement, test, evaluation
◊ The terms ‘measurement’, ‘test’, and ‘evaluation’ are often used synonymously; indeed they
may, in practice, refer to the same activity. When we ask for an evaluation of an individual’s language
proficiency, for example, we are frequently given a test score. This attention to the superficial
similarities among these terms, however, tends to obscure the distinctive characteristics of each. So
an understanding of the distinctions among the terms is vital to the proper development and use of
language tests.
♦ Measurement
► Measurement in the social sciences is ‘the process of quantifying the characteristics of persons
according to explicit procedures and rules’.
► This definition includes three distinguishing features :
quantification, characteristics, and explicit procedures and rules
(1) Quantification
Quantification involves assigning numbers, and this distinguishes measures from qualitative
descriptions such as verbal accounts or nonverbal, visual representations. Non-numerical categories
or rankings such as letter grades (‘A, B, C, …’), or labels (for example, ‘excellent, good, average, …’) may
have the characteristics of measurement. When we actually use categories or rankings, we
frequently assign numbers to them in order to analyze and interpret them.
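A minimal sketch of this step, assigning numbers to verbal labels so they can be analyzed. The numeric values and the names LABEL_TO_NUMBER and quantify are assumptions made for illustration only; the source does not prescribe any particular coding.

```python
# Hypothetical coding scheme: the numeric values are an assumption,
# chosen only so that a "better" label receives a larger number.
LABEL_TO_NUMBER = {"excellent": 3, "good": 2, "average": 1}


def quantify(labels):
    """Replace each verbal label with the number assigned to it."""
    return [LABEL_TO_NUMBER[label] for label in labels]


# Hypothetical set of labels given to four performances.
ratings = ["good", "excellent", "average", "good"]
numbers = quantify(ratings)
print(numbers)
```

Because the coding rule is written down explicitly, anyone applying it to the same labels obtains the same numbers.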
5. (2) Characteristics
We can assign numbers to both physical and mental characteristics of persons. Physical attributes
such as height and weight can be observed directly. In testing, however, we are almost always interested
in quantifying mental attributes and abilities, sometimes called traits or constructs, which can only be
observed indirectly. These mental attributes include characteristics such as aptitude, intelligence,
motivation, field dependence/independence, attitude, native language, fluency in speaking, and
achievement in reading comprehension. Whatever attributes or abilities we measure, it is important to
understand that it is these attributes or abilities and not the persons themselves that we are measuring.
(3) Rules and procedures
The third distinguishing characteristic of measurement is that quantification must be done according to
explicit rules and procedures. That is, the blind or haphazard assignment of numbers to characteristics
of individuals cannot be regarded as measurement.
► In order to be considered a measure, an observation of an attribute must be replicable:
for other observers,
in other contexts,
and with other individuals.
6. ♦ Test
Carroll’s (1968) definition of a test:
► A psychological or educational test is ‘a procedure designed to elicit certain behavior from which one
can make inferences about certain characteristics of an individual.’
► As one type of measurement, a test necessarily quantifies characteristics of individuals according
to explicit procedures.
► What distinguishes a test from other types of measurement is that :
it is designed to obtain a specific sample of behavior.
7. ♦ Evaluation
Evaluation can be defined as the systematic gathering of information for the purpose of making
decisions (Weiss, 1972).
► Evaluation is involved only when the results of tests are used as a basis for making a decision.
► One aspect of evaluation is the collection of reliable and relevant information.
► Evaluation does not necessarily entail testing.
► Tests in and of themselves are not evaluative.
► Tests are often used for pedagogical purposes either as a means of motivating students to study or
as a means of reviewing material taught.
► Tests may also be used for purely descriptive purposes.
Information-providing function of measurement versus decision-making function of evaluation
10. Properties of measurement scales
Unlike physical attributes, such as height, weight, voice pitch, and temperature, we cannot directly
observe intrinsic attributes or abilities, and we therefore must establish our measurement scales by
definition, rather than by direct comparison.
The scales we define can be distinguished in terms of four properties:
1. A measure has the property of distinctiveness if different numbers are assigned to persons with
different values on the attribute.
2. It is ordered in magnitude if larger numbers indicate larger amounts of the attribute.
3. The measure has equal intervals if equal differences between ability levels are indicated by equal
differences in numbers.
4. The measure has an absolute zero point if a value of zero indicates the absence of the attribute.
Measurement specialists have defined four types of measurement scales –
nominal, ordinal, interval, and ratio – according to how many of these four properties they possess.
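The relationship between the four properties and the four scale types can be sketched as a small lookup. The names PROPERTIES, SCALE_PROPERTY_COUNT and properties_of are hypothetical; the only claim taken from the slides is that each scale type adds one property to the previous one.

```python
# The four properties, in the order in which the scale types acquire them.
PROPERTIES = ("distinctiveness", "ordering", "equal intervals", "absolute zero")

# Each scale type possesses the first n properties of the tuple above.
SCALE_PROPERTY_COUNT = {"nominal": 1, "ordinal": 2, "interval": 3, "ratio": 4}


def properties_of(scale):
    """Return the properties possessed by the given scale type."""
    return PROPERTIES[:SCALE_PROPERTY_COUNT[scale]]


for scale in SCALE_PROPERTY_COUNT:
    print(scale, "->", ", ".join(properties_of(scale)))
```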
11. Nominal/categorical scale
A nominal scale comprises numbers that are used to ‘name’ the classes or categories of a given
attribute. That is, we can use numbers as a shorthand code for identifying different categories.
Nominal scales possess the property of distinctiveness.
► A special case of a nominal scale is a dichotomous scale, in which the attribute has only two
categories, such as ‘sex’ (male and female), or ‘status of answer’ (true and false) on some types of
tests.
The attribute ‘native language’
native language background     number
Chinese                        1
Bengali                        2
French                         3
Arabic                         4
…                              …
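As a rough illustration of nominal coding, the sketch below uses the codes from the table above. The group of test takers and the variable names are hypothetical.

```python
from collections import Counter

# Codes copied from the table above; they only name the categories.
NATIVE_LANGUAGE_CODE = {"Chinese": 1, "Bengali": 2, "French": 3, "Arabic": 4}

# Hypothetical group of test takers.
group = ["French", "Chinese", "Arabic", "Chinese"]
codes = [NATIVE_LANGUAGE_CODE[language] for language in group]

# Counting how many persons fall into each category is meaningful on a
# nominal scale. Averaging the codes would not be, because code 4 does
# not indicate "more" of the attribute than code 1.
counts = Counter(codes)
print(codes, counts)
```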
12. Ordinal scale
An ordinal scale comprises the numbering of different levels of an attribute that are ordered with
respect to each other.
The most common example of an ordinal scale is a ranking, in which individuals are ranked ‘first’,
‘second’, ‘third’, and so on, according to some attribute or ability.
The use of subjective ratings in language tests is another example of ordinal scales. The points or
levels on an ordinal scale can be characterized as ‘greater than’ or ‘less than’ each other.
Ordinal scales possess the property of ordering in addition to the property of distinctiveness.
The attribute ‘speaking ability’
student     rank
Ali         1st
Reza        2nd
Mehdi       3rd
…           …
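A ranking like the one above could be derived from subjective ratings as follows. The ratings and the 1-5 rating scale are assumptions made for illustration, and ties are not handled in this sketch.

```python
# Hypothetical speaking ratings on an assumed 1-5 scale.
ratings = {"Ali": 5, "Reza": 4, "Mehdi": 2}

# Order the students from highest to lowest rating, then number them.
ordered = sorted(ratings, key=ratings.get, reverse=True)
ranks = {name: position for position, name in enumerate(ordered, start=1)}
print(ranks)
```

The ranks show who is ahead of whom but not by how much: the gap between Reza and Mehdi is larger than the gap between Ali and Reza, yet both are one rank apart.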
13. Interval scale
An interval scale is a numbering of different levels in which the distances, or intervals, between
the levels are equal. That is, in addition to the ordering that characterizes ordinal scales, interval scales
consist of equal distances or intervals between ordered levels. Thus they possess the properties of
distinctiveness, ordering, and equal intervals.
14. Ratio scale
The distinguishing feature of a ratio scale is an absolute zero point.
The reason we call a scale with an absolute zero point a ratio scale is that we can make comparisons in
terms of ratios with such scales.
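A short illustration of why ratios require an absolute zero point, using temperature rather than a language measure. This example is not taken from the slides: it assumes the familiar fact that the Celsius zero is arbitrary while the Kelvin zero is absolute.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (absolute zero = 0 K)."""
    return celsius + 273.15


c_low, c_high = 10.0, 20.0

# On the Celsius scale the zero point is arbitrary, so this ratio is
# arithmetically valid but not a meaningful comparison.
naive_ratio = c_high / c_low

# On the Kelvin scale zero means absence of the attribute, so the
# ratio can be interpreted.
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

print(naive_ratio, round(kelvin_ratio, 4))
```

The two ratios differ, which shows that a statement such as "twice as much" depends on where zero is placed unless the scale has an absolute zero point.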
15. Characteristics that limit measurement
As test developers and test users, we must know that:
► our tests are not perfect indicators of the abilities we want to measure, and
► test results must always be interpreted with caution.
The most valuable basis for keeping this clearly in mind can be found in:
► an understanding of the characteristics of measures of mental abilities and
► the limitations these characteristics place on our interpretation of test scores.
These limitations are of two kinds:
1. limitations in specification, and
2. limitations in observation and quantification
16. Limitations in specification
In any language testing situation, as with any non-test situation in which language use is involved, the
performance of an individual will be affected by a large number of factors. The most important factor
affecting test performance, with respect to language testing, is the individual’s language ability.
► In order to measure a given language ability, we must be able to specify what it is.
♦ This specification generally is at two levels:
(1) At the theoretical level, we need to specify the ability in relation to, or in contrast to, other
language abilities and other factors that may affect test performance.
(2) At the operational level, we need to specify the instances of language performance that we
are willing to interpret as indicators of the ability we wish to measure. This level of specification
defines the relationship between the ability and the test score.
17. Limitations in observation and quantification
In addition to the limitations related to the underspecification of factors that affect test performance,
there are characteristics of the processes of observation and quantification that limit our
interpretations of test results. These derive from the fact that all measures of mental ability are
necessarily indirect, incomplete, imprecise, subjective, and relative.
► Indirectness
► Incompleteness
► Imprecision
► Relativeness
► Subjectivity
18. Steps in measurement
Interpreting a language test score as an indication of a given level of language ability involves being
able to infer, on the basis of an observation of that individual’s language performance, the degree to
which the ability is present in the individual. The limitations discussed above restrict our ability to make
such inferences.
A major concern of language test development is to minimize the effects of these limitations.
To accomplish this, the development of language tests needs to be based on a logical sequence of
procedures linking the putative ability to the observed performance.
◊ This sequence includes three steps:
(1) Identifying and defining the construct theoretically;
(2) Defining the construct operationally;
(3) Establishing procedures for quantifying observations.
19. ◊ The first step – defining constructs theoretically
The first step in the measurement of a given language ability is to distinguish the construct we wish to
measure from other similar constructs by defining it clearly, precisely, and unambiguously.
This can be accomplished by determining what specific characteristics are relevant to the given
construct.
♦ Two distinct approaches to defining language proficiency:
(1) The ‘real life’ approach:
In this approach, language proficiency itself is not defined, but rather, a domain of actual, or
‘real life’ language use is identified that is considered to be characteristic of the performance
of competent language users.
Example: ILR oral proficiency interview, ACTFL oral proficiency interview
(2) The ‘interactional/ability’ approach:
In this approach, language proficiency is defined in terms of its component abilities.
Example: the functional framework of Halliday, or the communicative framework of Munby
20. ◊ The second step – defining constructs operationally
This step enables us to relate the constructs we have defined theoretically to our observations of
behavior. It involves determining how to isolate the construct and make it observable.
We must decide what specific procedures, or operations, we will follow to elicit the kind of
performance that will indicate the degree to which the given construct is present in the individual.
The theoretical definition itself will suggest relevant operations.
The context in which the language testing takes place also influences the operations we would
follow.
For an operational definition to provide a suitable basis for measurement, it must elicit language
performance in a standard way, under uniform conditions.
21. ◊ The third step – quantifying our observations
This step involves establishing procedures for quantifying or scaling our observations of performance.
◊ The primary concern in establishing scales for measuring mental abilities:
defining the units of measurement
◊ The units of measurement of language tests are typically defined in two ways:
(1) One way is to define points or levels of language performance or language ability on a scale.
(2) Another common way of defining units of measurement is to count the number of tasks
successfully completed.
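The second way of defining units, counting successfully completed tasks, can be sketched in a few lines. The item results below are hypothetical.

```python
# Hypothetical item-level results for one test taker:
# True means the task was completed successfully.
responses = [True, False, True, True, False]

# Each successfully completed task counts as one unit of measurement.
raw_score = sum(responses)
print(raw_score, "out of", len(responses))
```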
22. Relevance of steps to the interpretation of test results
◊ The first step – defining constructs theoretically
provides the basis for evaluating the validity of the uses of test scores
◊ The second step – defining constructs operationally
the observed relationships among different measures of the same theoretical construct provide the
basis for investigating concurrent relatedness
◊ The third step – quantifying our observations
directly related to reliability
23. Summary
► Fundamental measurement terms and concepts – measurement, test, and evaluation
► Four properties of measurement scales – distinctiveness, ordering, equal intervals and
an absolute zero point
► Four types of scales or levels of measurement – nominal, ordinal, interval, and ratio
► Essential qualities of measurement scales – reliability and validity
► Characteristics of measures of ability limiting our interpretation and use of test scores
► Fundamental steps in the development of tests in order to minimize the effects of the limitations
and to maximize the reliability of test scores