Testing and Teaching Vocabulary, by Paul Nation



   I.      Statistical information about vocabulary

Three ways of counting words in a book:

   1. To know how long it is: count words on a page and multiply by the number of
      pages.
   2. To know how many words we need to understand in order to comprehend:
      decide what we count as a word, then make a list of all the words.
   3. To do a frequency count we make a list of all the different words and count
      how often each one occurs.

According to the last point we can deduce that:

   a) Knowing the 2,000 most frequent words of English means that you
      recognize 81% of the words on any page.
   b) Low frequency words are important for the understanding of the text;
      however, it is not very useful to focus on learning or teaching them because we
      are not likely to meet them very often.
   c) Specialized vocabulary occurs quite frequently in university textbooks, but not
      very frequently in language as a whole.
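The frequency count in step 3 and the coverage figure in point (a) can be sketched as follows. This is a minimal illustration, not Nation's procedure: the tokenizer (lowercased runs of letters and apostrophes), the function names and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def frequency_count(text):
    """List all the different word forms and count how often each occurs."""
    # Assumption: a "word" is a lowercased run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def coverage(counts, top_n):
    """Share of the running words covered by the top_n most frequent words."""
    total = sum(counts.values())
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


# Hypothetical sample text, far too small for a real count.
sample = "the cat sat on the mat and the dog sat on the rug"
counts = frequency_count(sample)
print(counts.most_common(3))
print(round(coverage(counts, 3), 2))
```

Run over a large corpus with top_n set to 2,000, the same coverage calculation is the kind of count that lies behind a figure such as the 81% quoted above.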

   II.      Making the test.
   - Each section of the test consists of six words and three definitions.
   - Easy to make and to mark.
   - The words in each section were chosen so that they would be representative of
       all the words at that level.
   - Very low chances of guessing correctly.
   - 18 words are matched at each level.
   - A further 18 words act as distractors; the distractors are words, not meanings.

 People who know the words can do that level quickly; a native speaker did the whole
test in five minutes and got full marks. A maximum of 50 minutes should be allowed
for sitting the test. The items put together in each section were not related in their
meanings. The definitions in the test use words from a higher frequency level than the
tested words.



   III.    Using the test.
The instructions should not require any explanation, but if explanation is needed it
must be given.

   -    Give one mark for each correct matching.
   -    It takes more or less two minutes to mark and add the score of one test.
   -    A score of 12 out of 18 indicates that approximately one-third of the words at
        that level are not known.
   -    High frequency words
   a)   Are worth individual attention and thus activities such as learning lists of words
        and vocabulary study using books like Barnard (1972) are appropriate.
   b)   Teachers who are doubtful about getting learners to study vocabulary lists
        should read Nation (1982), which reviews experimental research on list
        learning.
   c)   Direct teaching of vocabulary is also appropriate, and large numbers of words
        can be learned by meeting them incidentally in context.
   -    Specialized vocabulary
   a)   Can be treated in much the same way as high frequency vocabulary.
   b)   Learning prefixes and roots is a useful aid to learning these words.
   -    Individual low frequency words
   a)   Do not deserve teaching time, unless they contain useful prefixes or roots, or
        are an example of some other regular feature that will help vocabulary learning
        in general.
   b)   Are better handled by guessing words from context.
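The marking rule earlier in this section (one mark per correct match, 18 items per level) turns into a quick estimate of how much of a level is unknown. The sketch below is illustrative only: the function names are invented here, and the level size of 1,000 words is an assumption, since these notes do not say how many words each level represents.

```python
def unknown_share(score, items=18):
    """Proportion of words at a level that the learner probably does not know."""
    if not 0 <= score <= items:
        raise ValueError("score must be between 0 and the number of items")
    return (items - score) / items


def estimated_unknown_words(score, level_size, items=18):
    """Scale the unknown share up to the whole level (level_size is assumed)."""
    return round(unknown_share(score, items) * level_size)


# A score of 12 out of 18 leaves 6 of 18 items wrong, i.e. one-third.
print(round(unknown_share(12), 3))
# Assuming, for illustration only, a level of 1,000 words.
print(estimated_unknown_words(12, level_size=1000))
```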



Second language vocabulary assessment: Current practices and new directions, by John
Read.



   I.      Measuring vocabulary size.

There are several purposes for measuring vocabulary:

   a) Vocabulary size is closely associated with reading comprehension ability.
   b) Vocabulary tests have traditionally had a significant role in research on reading
      development and in literacy programmes.
   c) Vocabulary assessment can reveal the extent of the lexical gap learners face in
      coping with authentic reading materials and undertaking other communicative
      tasks in the target language.
   d) Vocabulary size measures typically require a relatively large sample of words
      that represent a defined frequency range.

   II.     Sampling from word frequency lists.

   a) For native speakers the sample of words to be tested has come from a large
      dictionary of contemporary English in order to cover as many as possible of the
      words that the participants in the study are likely to know.
   b) One limitation of dictionaries for native-speaker users is that they do not give
      any explicit information about the frequency of words.
   c) This deficiency in dictionaries is overcome by word frequency lists that are
      based on computer corpora.

   - The preferred lexical unit for current vocabulary size research is the word family.

   - Vocabulary size tests for second language learners understandably focus on a
   narrower range of words than those for native speakers.

   -      Low frequency words are much less likely to be known, especially by learners in
          a foreign language environment.
   -      One list that does combine these virtues is the Academic Word List (AWL),
          Coxhead’s (2000) set of 570 word families occurring frequently in written texts
          across a range of university disciplines.
   -      The AWL is based on the assumption that students participating in an English
          for Academic Purposes programme are intending to study in a range of
          disciplines and therefore the vocabulary they learn should represent a common
          core of words.

   III.      The yes/no format

   -    The next step in developing a vocabulary size test is to select a sample of target
        words for the test items.
    - The most widely used measure of English vocabulary size for second language
        learners, Nation’s Vocabulary Levels Test, requires:
    a) the test-takers to match words with their synonyms
    b) the test-takers to match words with their short definitions
Another test, referred to above, has:
    - A multiple choice format
    - Each target word presented in a short non-defining sentence, followed by
        four possible definitions as options
The simplest test format was originally called the checklist and is now generally
referred to as the Yes/No format. In this case the test-takers are presented with a
series of words and just indicate whether they know each word or not.

   Anderson and Freebody (1983):
   - Included among the items a substantial proportion of non-words.
   - Provided a basis for adjusting the scores of test-takers who responded “Yes” to non-words.
   - Penalty for claimed knowledge of non-words.
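The penalty for claimed knowledge of non-words can be sketched with a standard correction-for-guessing formula, (h - f) / (1 - f), where h is the proportion of real words marked "Yes" and f is the proportion of non-words marked "Yes". These notes do not state which adjustment was actually applied, so the formula, the function name and the example figures are assumptions for illustration.

```python
def adjusted_yes_no_score(yes_to_words, n_words, yes_to_nonwords, n_nonwords):
    """Estimate the proportion of real words known, penalising 'Yes' to non-words."""
    hit_rate = yes_to_words / n_words                # h: "Yes" to real words
    false_alarm_rate = yes_to_nonwords / n_nonwords  # f: "Yes" to non-words
    if false_alarm_rate >= 1:
        # Saying "Yes" to every non-word leaves no basis for an estimate.
        return 0.0
    adjusted = (hit_rate - false_alarm_rate) / (1 - false_alarm_rate)
    return max(0.0, adjusted)


# Hypothetical learner: "Yes" to 40 of 60 real words and to 4 of 20 non-words.
print(round(adjusted_yes_no_score(40, 60, 4, 20), 3))
# The same learner with no "Yes" responses to non-words keeps the raw hit rate.
print(round(adjusted_yes_no_score(40, 60, 0, 20), 3))
```

The more non-words a test-taker claims to know, the further the adjusted score falls below the raw proportion of "Yes" responses.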
DIALANG (www.dialang.org)
   - A web-based system through which learners of 14 European languages can
      assess their proficiency in the target language.
   - When learners access the system, they are invited to assess their own skills in
      the target language and also to take a Vocabulary Size Placement Test in the
      yes/no format.
   - When they take a specific skill test, the system will present them with items
      and texts that are broadly suited to their level.

   Diagnosis (Alderson, 2005)
   - Focus on the structural system of the target language.
   - There are well-established procedures for measuring the size of a learner’s
      vocabulary
   - The yes/no format has proved to be an informative and cost-effective means of
      assessing the state of learners’ vocabulary knowledge, particularly for
      placement and diagnostic purposes.

   IV.    Testing depth of knowledge.

   The concept of depth is built on:
   - The recognition that knowing a word involves more than the
       learner’s ability to associate the written form of a word with a simple
       statement of its meaning.
   Learner’s L2 lexicon:
   - How the word is pronounced and spelled.
   - What its morphological forms are.
   - How it functions syntactically.
   - How frequent it is.
   - How it is used appropriately from a sociolinguistic perspective.
       And so on.

One type of test that has been adopted to some extent is Read’s (1993, 1998) word
associates format.
   - It applies word association by creating items that consist of a target word and
       six or eight other words.
   - The relationships between the words are primarily semantic and collocational.
   - The format offers opportunities to assess some key elements of the core
       meaning of the target word.
   - The aim was to design a simple type of item that would test deep word
       knowledge in a meaningful way.


Another measure of deep word knowledge which has gained some currency is
Paribakht and Wesche’s (1997) Vocabulary Knowledge Scale (VKS).
    - It is linked to the incidental acquisition of word meaning through intensive
        reading activities.

   V.     Assessing vocabulary use in context
Read and Chapelle (2001)
   - Noted that most existing vocabulary tests implicitly defined vocabulary
      knowledge as a trait, a mental attribute of the learner that could be described
      and measured without any reference to the contexts in which the words are
      used.
   - Vocabulary size measures such as the Yes/No format discussed above represent
      a classic example of this approach to assessment.

One way to approach the distinctive features of academic registers is through the
study of technical vocabulary.
   - This is an area that has not received much scholarly attention until recently.
   - We have lacked systematic procedures for identifying the technical words in
       particular texts.

Rating scale for finding technical words:

Step 1
Words with no semantic relationship to anatomy: the, is, between, amounts,
common, directly
Step 2
Words whose meaning is minimally related: superior, part, forms, pairs,
structures, surrounds
Step 3
Words whose meaning is closely related to anatomy but also in general use:
chest, trunk, neck, abdomen, ribs, breast
Step 4
Words with a specific meaning in anatomy, not used in general language:
thorax, sternum, costal, pectoral, fascia, periosteum, viscera

The alternative approach, then, is to apply the tools of corpus analysis to the
identification of vocabulary that is characteristic of particular texts or registers.
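A corpus-based approach of this kind can be illustrated by comparing how often each word occurs in a specialized text and in a general reference text. The sketch below uses a simple ratio of smoothed relative frequencies. It is not the measure used by any of the authors cited in these notes, and the function name and the two tiny word lists are invented for the example.

```python
from collections import Counter


def keyness(target_words, reference_words):
    """Rank the words of a specialized text by how much more frequent they are
    there than in a reference text (ratio of add-one smoothed rates)."""
    target = Counter(target_words)
    reference = Counter(reference_words)
    vocab_size = len(set(target) | set(reference))
    scores = {}
    for word, count in target.items():
        # Add-one smoothing so that words absent from the reference text
        # do not cause a division by zero.
        target_rate = (count + 1) / (len(target_words) + vocab_size)
        reference_rate = (reference[word] + 1) / (len(reference_words) + vocab_size)
        scores[word] = target_rate / reference_rate
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical miniature "corpora"; real keyword analysis needs large samples.
anatomy = "the thorax is the part between the neck and the abdomen".split()
general = "the man is in the house and the dog is in the garden".split()
for word, score in keyness(anatomy, general):
    print(f"{word:10s} {score:.2f}")
```

Words such as "thorax" score above 1 because they are proportionally more frequent in the specialized text, while function words such as "is" score below 1.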

Chujo and Utiyama (2006)
   - Have pushed the keyword concept a step further in their research on the
       vocabulary of business English.
   - Their work is promising, in that it may provide an automated alternative to Chung
       and Nation’s (2003) rational basis for identifying degrees of technicalness in the
       vocabulary of a particular register.

Video

   -    In the 1980s, vocabulary testing was outside the mainstream of the field.
   -    Multiple-choice items testing knowledge of a word were a routine
        component of tests like TOEFL.
   -    In the 1970s and 1980s, communicative approaches to language teaching had an
        impact on vocabulary testing as well.
   -    The old ways of testing from the previous period came to be seen as obsolete,
        which led to a reconsideration of how to assess vocabulary.
   -    Vocabulary size testing requires a large sample of words.

        Notes on the reading of Purpura’s book: Assessing Grammar


Chapter 1: Differing notions of ‘Grammar’ for assessment

   -   Historical importance of grammar for learning an L2.
   -   It was thought that grammar was sufficient for learners to acquire another
       language
   -   It was thought that to know a language means to know about its grammatical
       system and recite its rules
   -   The Grammar-translation approach gave birth to the Natural Approach
       (Krashen and Terrell, 1983) and other methods and approaches
   -   Grammar is a set of rules to be internalized and used for communication
       (current perspective)
   -   The assessment of grammar was based on tasks that would lead students to
       demonstrate their ability to communicate in speaking and writing
   -   The assessment of grammar has changed a lot over time as well as the teaching
       of grammar
   -   Grammar and linguistics:
   -   Syntactocentric perspective: the arrangement of words in a sentence
   -   Communication perspective: how language is used to convey meaning
   -   Many theories and types of distinctive grammar, such as:
           o Formal grammar
           o Traditional grammar
           o Structural grammar
           o Transformational-generative grammar
           o Universal grammar
   -   Form and Use perspective: focused on the function of grammar, rather than
       meaning
   -   Corpus linguistics: how often and where a linguistic form occurs in spoken or
       written texts.
   -   It examines linguistic and non-linguistic features
   -   Lexical form (Katz and Fodor, 1963) ‘the grammatical dimension of lexis’
   -   Biber et al.: a corpus-based study of the degree to which linguistic features are
       likely to occur in certain texts. It provides distributional and frequency
       information on the lexico-grammatical features of the language
   -   It helps provide an empirical basis for determining which learning points to
       teach or to test
-   Kennedy (1998) states that language teachers might promote L2 vocabulary
    development or introduce students to features of the L2 that allow them to
    function appropriately in social contexts
-   Communication-based perspectives of language:
-   They go beyond the view of language as patterns of morphosyntax observed within
    relatively decontextualized sentences
-   It views grammar as a set of linguistic norms, preferences and expectations that
    an individual invokes to convey a host of pragmatic meanings that are
    appropriate, acceptable and natural depending on the situation
-   3 speech acts: locutionary, illocutionary and perlocutionary
-   Halliday and Hasan (1976, 1989): clear relationship between syntax and
    semantics: COHESION
-   Language teachers benefit from these linguistic theories. They are relevant to
    how grammatical ability might be assessed
-   By consulting these resources and relating them to L2 learning processes,
    teachers should have the information they need to create viable lesson plans
    that suit their students’ needs and construct assessments of how students are
    progressing.

    Chapter 5: Designing test tasks to measure L2 grammatical ability

-   How a test-taker performs depends on the type of item:
    Authenticity (Bachman and Palmer)
-   Test method: the way we elicit test performance
-   According to a certain TLU situation, test designers use a certain TLU task taken
    from the TLU domain
-   This also helps to define the grammatical constructs to assess and the test
    specifications
-   A TASK is any activity that requires students to do something for the express
    purpose of learning the target language. It has a number of instructions
    that control the kind of activity to be performed. It contains input and elicits a
    response
        o Task-naturalness: a grammatical construction must arise naturally
            during the performance of a particular task
        o Task-utility: it is possible to complete the task (meaningfully) without
            the structure, but with the structure the task becomes easier
        o Task-essentialness: The task cannot be completed unless the
            grammatical form is used
-   Task fulfillment: defining the construct according to what examinees can do in
    a single instance of communication rather than what they know, or have the
    capacity to do in any instance of communication
-   TLU domains: Real-life domain / Language-instruction domain
-   Bachman and Palmer’s task characteristics framework
    1. The setting
    2. The test rubrics
    3. The input
    4. The expected response
    5. The relationship between input and response
-   3 uses: describe TLU tasks as a basis for designing test tasks, specify the test
    tasks, and compare the characteristics of the TLU tasks with the test tasks
-   Objective tasks: do not require expert judgment
-   Subjective tasks: require expert judgment
                                       Task Types
-   Selected-response tasks: present input in the form of an item. The test-taker is
    expected to select a response. They are intended to measure recognition or recall
    of grammatical form and meaning
-   Such as:
        o Multiple-choice task
        o Multiple-choice error-identification task
        o Matching task
        o Discrimination task
        o Noticing task
-   Limited-response tasks: elicit a response embodying a limited amount of
    language production. They are intended to assess one or more areas of
    grammatical knowledge depending on the construct definition
-   Such as:
        o Gap-filling
        o Short-answer task
        o Dialogue (or discourse) completion task
-   Extended-production tasks: present input in the form of a prompt rather than as an
    item. They aim to elicit large amounts of data, the quality and quantity of which
    can vary greatly for each test-taker
-   Such as:
        o Information-gap task
        o Story-telling and reporting task
        o Role-play and simulation tasks


                                                                       Catalina Correa
                                                                      Camilo Saavedra
