The assessment of deep word knowledge in young learners


The assessment of deep word knowledge in young first and second language learners

Published in: Education, Technology

  • They (1998) made a start by developing deep word knowledge tests based on word associations for 9- to 11-year-old Dutch children, native as well as non-native speakers.
  • According to Qian and these researchers, certain…. (The construct of deep word knowledge is further specified as the decontextualized knowledge of word meanings and word associations.)
  • In terms of the mental lexicon, children expand their lexical knowledge at school in two directions. The first is breadth, which refers to the size…, that is, vocabulary size; the other is depth, which refers to the amount …. How well are these items mastered? This is deep word knowledge, or the depth of lexical knowledge. [The number of lexical items known (vocabulary size) vs. how well these items are mastered (vocabulary depth or quality); depth of word knowledge answers the question ‘How much do you know of a word?’ The amount or extensiveness of knowledge of individual words is referred to as deep word knowledge or depth of lexical knowledge.]
  • 2. Therefore, 3. (memorize) which means that all kinds of connections with related words have to be established. 4. Thus, vocabulary learning might be ‘more a matter of system learning than of item learning’ (Haastrup & Henriksen, 2000). Additional note: In the same vein, learning new words is more than the acquisition of isolated lexical units:
  • According to John Read (1993), there are three types of associations, i.e., relationships between the target word and its associates: paradigmatic, syntagmatic and analytic. According to Clark (1970), syntagmatic responses are influenced to a considerable extent by the left-to-right production of sentences.
  • Unlike syntagmatic relations, such as bird-fly, bird-nest, bird-egg, para…. A swallow is the subordinate term: it is a migratory bird, and the superordinate is animal. (In school, new meanings and new relationships between familiar words are learned by way of a continuing process of generalizing, categorizing and abstracting.) In this way much academic knowledge is logically organized and can be pictured as a tree structure.
  • In brief, (idiosyncratic: having distinctive characteristics) they can figure out word meaning in isolation.
  • (Read aloud) Here are some established assessment instruments. (The key question which will be addressed in the next section is how to measure deep word knowledge. Research on this dimension so far has involved adolescent and adult second language learners (Read, 1993; Greidanus & Nienhuis, 2001; Qian, 1999), whereas in this study we focus on primary school children at a relatively early stage of their second language learning, and their schooling generally.)
  • As you can see in this photo, the teacher says a word, and the child listens and points to the corresponding picture.
  • The second measure of vocabulary size is the Vocabulary Levels Test. The child reads and matches the correct definition, then writes down the number.
  • Paul Meara: This test has 5 levels. If you know what a word means, check the box beside it; if you aren't sure, do not check the box.
  • There are two main methods of measuring depth, described by Read (2000) as the developmental and dimensions approaches. Little validation. Limitations (see Read, 2000; Schmitt, 2010).
  • VKS asks learners to rate their knowledge of words not in binary terms (I know/I don't know what this word means) but on a five-point scale (ranging from "I don't remember having seen this word before" to "I can use this word in a sentence").
  • The words here on the left side may help to explain the meaning of "sudden". The words here on the right side are nouns that may come after "sudden" in a phrase or a sentence.
  • The test was simplified for these younger test-takers in two ways: 4. In order to visualize the word associations. (4. …associated with the stimulus word best. → What is the best WAF format: 6-option or 8-option?)
  • Figure 1 is the sample item “banana” from the word association task. Fruit has a paradigmatic relation with banana because it is the superordinate. Peel is a part of a banana, so it has a partonomic association (constituents), and yellow is a distinguishing perceptual feature of a banana. The remaining 3 distractors can be associated with banana, but in a more contextualized way. In Figure 1, the three related words to be selected are the superordinate fruit, the partonomic peel, and the decontextualized syntagmatic relation yellow, a distinguishing perceptual feature of a banana. Generally speaking, the task is to pick the 3 words out of 6 that fit the stimulus best or always go together with the central word. 1. The intended correct answers represent different kinds of relationship that can be found in a semantic network, such as paradigmatic relations (i.e., superordinate….) 2. Partonomic association (category membership)
  • In order to prevent cheating and to minimize the effects of fatigue or lack of time, which might result in the items towards the end not being tested adequately, a variant of each form was created (marked with an asterisk). There were 4 test forms. (WAT-scoring methods were…) The scoring system for the WAT reflects the examinees’ actual knowledge of the words. 2. We considered (note: changing the interpretation of poor scores): Treating each stimulus word – associated word relation (i.e., each (possible) connection) as a separate item causes unwanted interdependence between the items, which might inflate reliability estimates. → the number of items for which the three intended connecting lines were drawn → the word web as just one whole item
  • Table 1 represents (read the top part aloud; point to the top, then to the left). (Read aloud) A sub-group of 86 children were also given the definition task (TAK), (then read the bottom line aloud) from a standardized Dutch test battery for young Dutch (L2) learners. TAK = Language Test for Non-native Children, Upper Grades
  • (Divided into two groups.) The data collection was carried out by test assistants, who had to follow an extensive protocol to make the administration of the tests as uniform as possible. They went over the WAT procedure with the children, making use of two sample items (see Section II.1). See p. 218. 2. … to the children, who completed them in class.
  • The reliability of the test items was estimated in terms of …. To equate the test parts, an item response analysis was conducted, fitting a two-parameter model using BILOG (Mislevy & Bock, 1986). Scoring students’ answers is essentially a mechanical task, and thus rater reliability is not an issue in the WAT. Mislevy, R. J., & Bock, R. D. (1986). PC-Bilog. Item analysis and test scoring with binary logistic models. Mooresville, IN: Scientific Software.
  • Evidence of validity was obtained by…. from a concurrent definition task, and from an analysis of the scores in terms of language background, which is known to be associated with different levels of vocabulary knowledge. The WAT should at least correlate substantially with such a criterion measure, e.g., in this case the WAT and the TAK are validated against each other.
  • Students' task: Children make any associations with a stimulus word. Ss performed quite well.
  • In sum, this initial analysis indicates that
  • Table 2 shows that the test forms for the 3rd graders had an acceptable degree of difficulty, given that the children answered approximately 50% of the 30 items correctly. (Blue: Grade 5.) The distributions are skewed to the left. Small deviations from normality are found (kurtosis). 795 children: 405 for WAT-A and 390 for WAT-B. WAT-A and WAT-B have 10 items in common and each contains 20 unique items (30 items each in total). The differences between WAT-A and WAT-B within each grade are not significant. Results report: The tests are adequate for Grade 3 and, to a slightly lesser degree, for Grade 5. For the 5th graders, the score variance is slightly smaller than in Grade 3 and the
  • Estimated overall IRT reliability was .915.
  • Language background had a significant effect.
  • Table 6 shows the correlation between the two scores. Correction for attenuation shows that the correlation can only amount to a maximum of 0.82 in each grade. → The variables do not coincide completely. Both tests thus show a strong ‘true’ correlation, but do not seem to measure exactly the same construct: a perfect correlation (r = 1.0) does not lie within the 95% confidence interval of the correlation, 0.82 (95% c.i.: 0.69–0.90).
  • because of delays in their language development (Verhallen, 1994; Verhallen & Schoonen, 1993). Children who are second language learners in particular drop out
  • whereas
  • 4. Scoring: a 4-point scale per item could be applied to interpret WAT scores. (2. What is the best way to score the WAF? What is the best way to INTERPRET various WAF scores, especially ‘split’ scores?) 4. What effect does guessing behavior have on the WAF? Guessing will have more influence on the results.
  • Conclusion: the authors claimed that gaining….
  • All-or-nothing method vs. a less harsh scoring method. A well-established test format involving word associations has been adapted for this younger target group, and the feasibility of assessing deep word knowledge in primary school children is evaluated empirically. So far we have English tests for the 6th graders; it is hoped that we can construct our own test format involving word associations to assess deep lexical knowledge. 1-1 Relationships between associates and distractors: semantically related or not? 2-1 Ss cannot demonstrate their lexical knowledge. Give Ss positive feedback on their partial knowledge. How to score the WAT? Size: the number of known words. What effects do distractor type and distribution of associations have on the WAF? Frequency range (amount of exposure to print). How does L2 learners’ depth of vocabulary knowledge relate to the degree and the type of [lexical inferencing] strategies they use? Decontextualization of attribution of meaning.
  • 1200 words for junior high school graduates and 320 words for elementary school children in Taipei City.
  • In this article the construct of deep word knowledge is further specified as the decontextualized knowledge of word meanings and word associations.

    1. 1. The Assessment of Deep Word Knowledge in Young First and Second Language Learners Presenter: Chia-Hui Cindy Shen Schoonen, R., & Verhallen, M. (2008). The assessment of deep word knowledge in young first and second language learners. Language Testing, 25 , 211-236. Advisor: Dr. Vincent W. Chang
    2. 2. Outline 1 Introduction 2 Literature review 3 Method 4 Results 5 Discussion and conclusion 6 Questions for discussion
    3. 3. I. Introduction <ul><li>1. Lexical knowledge in second language acquisition and school </li></ul><ul><li>Certain levels and qualities of lexical knowledge are important prerequisites for successful language learning and language use (Qian, 1999; Zareva, Schwanenflugel & Nikolova, 2005), and therefore for school success (Carlo, August, McLaughlin, Snow, Dressler, Lippman, Lively & White, 2004; Saville-Troike, 1984; Snow, Burns & Griffin, 1998). </li></ul>
    4. 4. Mental lexicon Breadth: the size of the lexicon as the number of ‘entries’ (words). Depth: the amount of knowledge of individual words. <ul><li>There are multiple components of word knowledge for second language learners to acquire (Nation, 2001). </li></ul>
    5. 5. <ul><li>Words function in a semantic network of related words (Aitchison, 1994; Meara & Wolter, 2004). </li></ul><ul><li>Differences in lexical knowledge can be thought of as differences in the number of connections in the network, as well as of the type of relations (Meara, 1996, 1997). </li></ul><ul><li>New words are embedded in a lexical network (Aitchison, 1994; Meara, 1997; Meara & Wolter, 2004; Read, 2004). </li></ul><ul><li> Thus, vocabulary learning might be a matter of system learning (Haastrup & Henriksen, 2000). </li></ul>
    6. 6. The semantic relationships <ul><li>paradigmatic : the two words are synonyms or similar in meaning, and belong to the same word class, e.g., ‘edit’ and ‘revise.’ </li></ul><ul><li>syntagmatic : the two words could occur one after another, as collocations, e.g., ‘edit’ and ‘text.’ </li></ul><ul><li>analytic : the association represents one aspect of the meaning of the target word, and is likely to form part of its dictionary definition, e.g., ‘edit’ and ‘publishing.’ </li></ul>cf. Greidanus, T., & Nienhuis, L. (2001). Testing the quality of word knowledge in a second language by means of word associations: Types of distractors and types of associations . The Modern Language Journal, 85 , 567-577.
    7. 7. <ul><li>Paradigmatic relations are hierarchical (Cruse, 1986; Verhallen & Schoonen, 1993; Wolter, 2001). </li></ul><ul><li>The hierarchical classification is characterized by class inclusion . </li></ul>
    8. 8. Dutch as a second language (DSL) children: came up with more syntagmatic, personal, or idiosyncratic meaning associations. Monolingual Dutch peers: provided more paradigmatic and decontextualized meanings.
    9. 9. <ul><li>2. The operationalization of deep lexical knowledge </li></ul><ul><li>Measures of vocabulary size </li></ul><ul><li>Peabody Picture Vocabulary Test (Dunn & Dunn, 1997) for L1 children </li></ul><ul><li>Vocabulary Levels Test (Schmitt, Schmitt, & Clapham, 2001) </li></ul><ul><li>Yes/No recognition tests (Meara & Buxton, 1987) </li></ul>
    10. 10. <ul><li>Measures of vocabulary size : </li></ul><ul><li>Peabody Picture Vocabulary Test (Dunn & Dunn, 1997) for L1 children </li></ul>
    11. 11. <ul><li>Measures of vocabulary size : </li></ul><ul><li>Vocabulary Levels Test (Schmitt, Schmitt, & Clapham, 2001) </li></ul><ul><li> </li></ul>
    12. 12. <ul><li>Measures of vocabulary size : </li></ul><ul><li>Yes/No recognition tests (Meara & Buxton, 1987) </li></ul>
    13. 13. <ul><li>Assessment of deep word knowledge : </li></ul><ul><li>Wesche and Paribakht’s (1996) Vocabulary Knowledge Scale (VKS) </li></ul><ul><li>Read’s (1993, 2000) word association format </li></ul>
    14. 14. <ul><li>Assessment of deep word knowledge : </li></ul><ul><li>Wesche and Paribakht’s (1996) Vocabulary Knowledge Scale (VKS) </li></ul>
    15. 15. <ul><li>Assessment of deep word knowledge : </li></ul><ul><li>Read’s (1993, 2000) word association format </li></ul><ul><li>Read, J. (1993). The development of a new measure of vocabulary knowledge. Language Testing, 10 , 355-371. </li></ul>
    16. 16. <ul><li>The Word Association Task (WAT) </li></ul><ul><li>Schoonen and Verhallen (2008) </li></ul><ul><li>for Grades 3-6 </li></ul><ul><li>six possibly associated words </li></ul><ul><li>to select three words out of six </li></ul><ul><li>to draw connecting lines as a word web </li></ul><ul><li>Only content words (nouns, verbs and adjectives) were included in the test. </li></ul>
    17. 17. The Word Association Task (WAT) paradigmatic relation (the superordinate); partonomic association; perceptual feature
    18. 18. III. Method <ul><li>Participants </li></ul><ul><li>795 third graders (age 9) and fifth graders (age 11) who learn Dutch as a second language from 19 schools in the Netherlands </li></ul>
    19. 19. <ul><li>Instruments and scoring </li></ul><ul><li>a variant of each form was created with the items in reverse order (*) </li></ul><ul><li>4 test forms : A, A*, B and B* </li></ul>
    20. 20. The TAK consists of 25 stimulus words from a standardized definition task; the responses were subsequently scored on a three-point scale (0-2).
    21. 21. <ul><li>Procedures </li></ul><ul><li>The four test forms of WAT-A and WAT-B were randomly assigned, and two sample items were used to explain the procedure. </li></ul><ul><li>The children who were selected for the definition task had an individual session with the test assistant in accordance with the standardized instructions for the TAK. </li></ul>
    22. 22. <ul><li>Analyses </li></ul><ul><li>Reliability : internal consistency (Cronbach’s alpha) </li></ul><ul><li>A one-factor model was fitted to the test scores. </li></ul><ul><li>The overlap of 10 items in the two forms A and B was used to equate the two test forms for further analyses. </li></ul><ul><li>To equate two test forms, an item response analysis was conducted fitting a two-parameter model using BILOG (Mislevy & Bock, 1986). </li></ul>
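Internal consistency as named on this slide can be illustrated with a minimal sketch of Cronbach's alpha. The function name and the tiny score matrix are made up for illustration and are not the study's data; each item is assumed to be scored 0 or 1.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (rows = children, columns = items)."""
    k = len(scores[0])
    # Variance of each item across children.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the children's total scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 children, 3 items scored 0/1.
demo = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(cronbach_alpha(demo))  # 0.75
```

Population variance is used for both the items and the totals; using sample variance for both would give the same ratio.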
    23. 23. <ul><li>Validity : by comparing the test scores with scores on a concurrent measure of the same or a highly related construct (Cronbach, 1971; Messick, 1989), e.g., the definition test. </li></ul>
    24. 24. IV. Results <ul><li>1. Utility of the test </li></ul><ul><li>From the initial sample of 822 students who were each presented with 30 items, only 15 students sometimes drew more than the required three connecting lines. </li></ul><ul><li>Children more frequently marked fewer than three alternatives. </li></ul><ul><li>More than 83% of the children marked three alternatives throughout the whole test, and the number of times (i.e., items) that children marked fewer than three associations was limited to two or three out of the 30 items; only 1% of the children marked fewer than three alternatives on more than five items. </li></ul>
    25. 25. <ul><li>The test format is satisfactory for this age range. </li></ul><ul><li>It is suitable for classroom administration in a reasonably short period of time . </li></ul>In terms of test utility , the WAT is useful and efficient.
    26. 26. 2. Level of difficulty and dispersion <ul><li>The tests are adequate for Grade 3 and for Grade 5. </li></ul><ul><li>The distributions do not deviate strongly from normality and show </li></ul><ul><li>sufficient individual differences. </li></ul>
    27. 27. 3. Reliability and item analysis <ul><li>A two-parameter model was fitted using BILOG (Mislevy & Bock, 1986), and it fitted well. </li></ul><ul><li>IRT evidence with reliability of .75- .83. </li></ul><ul><li>No items were removed. </li></ul>
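For readers unfamiliar with the two-parameter model mentioned here, the following is a rough sketch of its item response function in the standard logistic form. It is not BILOG's implementation; the function name and the parameter values are illustrative assumptions, and the optional scaling constant D = 1.7 used in some software is omitted.

```python
import math

def p_correct_2pl(theta, a, b):
    """Probability of answering an item correctly under a 2PL model.

    theta: ability of the child, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: discrimination 1.2, difficulty 0.0.
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(p_correct_2pl(theta, a=1.2, b=0.0), 3))
```

When ability equals the item difficulty the probability is exactly .5, and a larger discrimination makes the curve steeper around that point.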
    28. 28. 4. Validity analysis <ul><li>a. Known-group validity: </li></ul>The interaction between language background and grade was non-significant.
    29. 29. 4. Validity analysis: <ul><li>b. Concurrent validity: </li></ul><ul><li>To what extent do the WAT and the definition task provide convergent information? </li></ul><ul><li>Correction for attenuation shows that the correlation can only amount to a maximum of .82 in each grade. </li></ul><ul><li>The variables do not coincide completely. </li></ul>
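The correction for attenuation referred to on this slide is usually computed with Spearman's formula, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below assumes that formula; the function name and the input numbers are invented for illustration and are not the values from the study.

```python
import math

def disattenuated_r(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated correlation between true scores."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical inputs: observed r = .36, reliabilities .81 and .64.
print(disattenuated_r(0.36, 0.81, 0.64))
```

A corrected value clearly below 1.0 is what supports the slide's conclusion that the two variables do not coincide completely.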
    30. 30. V. Discussion and conclusion <ul><li>Non-native children are apparently exposed to a less-varied language input than their native peers, which results in an increasing shortfall in breadth of vocabulary (Verhoeven & Vermeer, 1996). </li></ul>
    31. 31. <ul><li>The scores on the word association test appear to be strongly correlated with those on the definition task ( r = .82). However, the two tasks are not interchangeable. </li></ul><ul><li>1. They do not measure exactly the same thing: </li></ul><ul><li>The WAT is a receptive task, in which one must consider the relations existing within the semantic network. </li></ul><ul><li>The definition task (TAK) has a productive character, meaning that speaking ability plays an important role. </li></ul>
    32. 32. <ul><li>2. There are some practical differences : </li></ul><ul><li>The WAT version has the advantage of being easy to administer in the classroom thereby avoiding the kind of assessment problems one encounters in administering and scoring the definition task . </li></ul><ul><li>WAT could distinguish between students with previously-known differences in level of more advanced word knowledge. </li></ul>
    33. 33. <ul><li>Gaining more insight into the concept of deep word knowledge is very important as it may reveal vast differences among children who superficially appear to have a similar amount and type of word knowledge. </li></ul>
    34. 34. VI. Questions for discussion <ul><li>What is the best way to interpret WAT scores? </li></ul><ul><li>Item scoring: all-or-nothing vs. credit for each associate </li></ul>
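The two scoring options in this question can be compared with a small sketch. The intended associates for the sample item "banana" (fruit, peel, yellow) come from the speaker notes; the distractor "monkey" and the function names are hypothetical.

```python
def score_all_or_nothing(selected, intended):
    """1 point only if exactly the intended associates were connected."""
    return 1 if set(selected) == set(intended) else 0

def score_per_associate(selected, intended):
    """One point per intended associate selected (0-3 for a WAT item)."""
    return len(set(selected) & set(intended))

intended = {"fruit", "peel", "yellow"}
answer = {"fruit", "peel", "monkey"}  # "monkey" is a made-up distractor

print(score_all_or_nothing(answer, intended))  # 0
print(score_per_associate(answer, intended))   # 2
```

The per-associate variant yields a four-point scale per item (0 to 3) and rewards partial knowledge, whereas the all-or-nothing variant treats the word web as one whole item.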
    35. 35. <ul><li>What are the limitations in measuring how well a learner knows a word? </li></ul><ul><li>young EFL learners’ limited vocabulary size > guessing behavior > over- or underestimation of vocabulary knowledge </li></ul><ul><li>Can the WAT be applied to the EFL context in Taiwan? </li></ul><ul><li>as for the test developers…. </li></ul><ul><li>strategies and teaching materials </li></ul><ul><li>test items with illustrations or 3 to 4 scenarios with 4 to 6 descriptions to match </li></ul>
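How much guessing can matter in the three-out-of-six format can be estimated with a short sketch. It assumes a child who guesses blindly and picks exactly three of the six options at random, which is a simplification; the function name is made up.

```python
from math import comb

def p_k_correct_by_guessing(k, n_options=6, n_associates=3, n_picks=3):
    """Chance of hitting exactly k intended associates by blind guessing."""
    ways = comb(n_associates, k) * comb(n_options - n_associates, n_picks - k)
    return ways / comb(n_options, n_picks)

for k in range(4):
    print(k, p_k_correct_by_guessing(k))
```

Under this assumption there are 20 possible selections, so all-or-nothing scoring leaves a 1-in-20 chance of a lucky full score per item, while per-associate scoring gives an expected 1.5 points out of 3 from guessing alone. This is one way to see why guessing has more influence under the less strict scoring.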
    36. 36. The Assessment of Deep Word Knowledge in Young First and Second Language Learners Thank You ! Schoonen, R., & Verhallen, M. (2008). The assessment of deep word knowledge in young first and second language learners. Language Testing, 25 , 211-236. Presenter: Chia-Hui Cindy Shen Advisor: Dr. Vincent W. Chang