Assessing vocabulary



How to assess vocabulary? This paper attempts to address some of the limitations found in vocabulary assessment in the EFL context in Indonesia.



Assessing Vocabulary
A paper assignment for Language Testing and Evaluation (ENGL 6201)
Ihsan Ibadurrahman (G1025429)
I. Introduction

Vocabulary is an essential part of learning a language, without which communication would suffer. A message can still be conveyed to some extent without correct grammatical structure, but without vocabulary nothing is conveyed. Thus, one may expect a communication breakdown when the exact word is missing even though the correct syntax is there, as in “Would it perhaps be possible that you could lend me your ___”. Conversely, with little grammar but sufficient knowledge of vocabulary, it would suffice to say “Correction pen?” (Thornbury, 2002). In the context of ESL teaching, it would make sense for the teaching of vocabulary to take priority over the teaching of grammar, especially given today’s growing use of the communicative approach, where limited vocabulary is the primary cause of students’ inability to express what they intend to say in communicative activities (Chastain, 1988, as cited in Coombe, 2011). Since words are the basic building blocks of language, most second language learners embark on the ambitious venture of memorizing lists of words. For ESL learners, learning a language is essentially a matter of learning new words (Read, 2000). Vocabulary is also closely tied to comprehension; it is generally believed that vocabulary assessment may be used to predict reading comprehension performance (DeVries, 2012; Pearson et al., 2007; Read, 2000; Thornbury, 2002). This implies that vocabulary is needed in order to comprehend a text fully (Nemati, 2010). Vocabulary is thus important for both communication and comprehension, especially in reading.

Given the importance of vocabulary in language teaching, it makes sense to be able to assess it. Such measurement may inform teachers of how much vocabulary learning has taken place in the class and whether the teaching has indeed been effective. Thornbury (2002) contends that
vocabulary tests may also give teachers two added advantages: they provide beneficial backwash and an opportunity for recycling vocabulary. Provided that students are informed in advance that vocabulary is part of the assessment, they may review and learn vocabulary in earnest, thus creating a beneficial backwash effect. The test also gives students a chance to recycle and use their previously learned vocabulary in a new way (Coombe, 2011). However, despite the many benefits it has for language teaching, vocabulary assessment does not receive the attention it deserves. Pearson et al. (2007: 282) argue that vocabulary assessment is “grossly undernourished”: instead of living up to a good standard of measurement, much more effort has been exerted in favor of practicality, with tests designed for reasons of economy and convenience. This phenomenon exists in the Indonesian EFL context, where vocabulary assessment is merely a part of the reading test in the standardized national examinations rather than a test of its own, and is fashioned in the form of convenient multiple-choice tests (Aziez, 2011).

This paper aims to give an overview of current practices in vocabulary assessment. It also attempts to outline some recommendations on testing vocabulary and to show how they may be relevant to EFL vocabulary teaching in the Indonesian context. To achieve these aims, recent journal articles dated 2005 onwards were gathered and studied. Books and other journal articles relevant to the study were also used to support these articles. The paper begins by describing the many facets of vocabulary assessment that teachers first need to take into account. It then goes on to elaborate on some common types of vocabulary assessment. Where relevant, the advantages and disadvantages of each test technique are also discussed.
It then goes on to report briefly what researchers have done on vocabulary assessment, including the new direction that they have taken. Recommendations on how to assess vocabulary will then be presented and related to the EFL context in Indonesia. Finally, the paper closes with a conclusion which summarizes the content of the paper.
II. The dichotomies in vocabulary assessment

Before going into the details of the common types of vocabulary assessment, it is useful to present the many facets of vocabulary assessment. The first thing test designers need to do is decide which aspects of vocabulary they want to test. This is especially true in vocabulary assessment because vocabulary is multi-faceted: words can be examined in many different ways, not just for their meanings (Samad, 2010). These aspects are often viewed as binary oppositions, or dichotomies. Thus, vocabulary can be assessed informally or formally, as part of a larger test or in a vocabulary test of its own, and with a focus on learners’ passive vocabulary or its active counterpart. Some of the many facets of vocabulary assessment found in the literature are discussed as follows.

a. Informal versus formal

Formal vocabulary assessment refers to tests that are standardized and have been designed in such a way that reliability and validity are ensured (DeVries, 2012). A test of vocabulary can sometimes be part of a placement test or a proficiency test that measures the extent of a learner’s vocabulary knowledge. In proficiency tests such as TOEFL (Test of English as a Foreign Language), vocabulary is usually tested as part of a larger construct such as reading, where candidates are tested on their vocabulary knowledge based on the context of a reading passage. Formal assessment also includes the achievement test, which is typically administered at the end of a course and is designed to measure whether the words taught during that course have been successfully learned. Informal assessments, on the other hand, are not usually standardized and are typically done as a formative test, or a progress check, to see whether students have made progress in learning the specific words that we want them to learn. Learning words is not something that can be done overnight.
Especially in second language learning, where there is less exposure to the words, learners need to
recycle the vocabulary from time to time by doing some kind of vocabulary revision activity. Such activities are a form of informal vocabulary assessment, intended primarily to check whether learners have progressed in their vocabulary learning (Thornbury, 2002). DeVries (2012) proposes teacher observation during ongoing classroom activities as one of the most useful forms of informal vocabulary assessment. Observations may give teachers the first indication of whether or not words have been grasped by learners, from which follow-up activities may ensue. In sum, the distinction between informal and formal assessment relates to the nature of the test itself, particularly the demands of the testing and the standard to which the test is held. The three distinctions that follow are proposed by Read (2000), who calls them the three dimensions of vocabulary assessment.

b. Discrete versus embedded

Read’s (2000) first dimension of vocabulary assessment concerns the construct of the test itself: whether it is independent of, or dependent on, other constructs. Where vocabulary is measured in a test in its own right, the test is called discrete. However, when a test of vocabulary forms part of a larger construct, it is called embedded. Using this first dimension, we can say that the progress check tests found at the end of a unit in most course books fall into the former category, whereas the TOEFL test mentioned previously clearly falls into the latter.
c. Selective versus comprehensive

The second dimension deals with the specification of the vocabulary that is included in the test. A vocabulary test is said to be selective when certain words are selected as the basis of vocabulary measurement. Its comprehensive counterpart, on the other hand, examines all the words that are spoken or written by the candidate. A selective vocabulary measure is typically found in conventional vocabulary tests where the test designer selects the words to be tested, such as those found in TOEFL reading comprehension questions. A comprehensive vocabulary measure is typically found in a speaking or writing test where raters judge the overall quality of the words used rather than looking specifically at particular words.

d. Context-independent versus context-dependent

The last dimension concerns the use of context in a vocabulary test. If the words in the test are presented in isolation, without a context, the test is called context-independent; when test-takers must make use of the context in order to give the appropriate answer, it is called context-dependent. In the former case, learners are typically asked to indicate whether or not they know specific words. For example, the yes/no vocabulary checklist asks learners to tick the words on the list that they know. In the latter case, learners must engage with the context in order to come up with the right response. For example, in a TOEFL reading passage, learners must refer to the text and use the available context in order to know which option is the closest synonym of a word.

e. Receptive versus productive

Another distinction to make in vocabulary assessment is whether we want to test learners’ receptive (passive) vocabulary or their productive (active) vocabulary. Receptive vocabulary is the vocabulary
needed to comprehend a listening or reading text, while productive vocabulary is the vocabulary learners use in writing or speaking. It is understood that learners have more receptive vocabulary than productive vocabulary at their disposal. Knowing this distinction is crucial because we certainly do not need to test whether learners can use every word productively; there are words which we simply want learners to be able to comprehend.

f. Size versus depth

The last distinction is one that has gained currency in research on vocabulary assessment (Cervatiuc, 2007; Read, 2007). Size (or breadth) of vocabulary refers to the amount of vocabulary a learner has, while depth is the quality of the learner’s knowledge of these words. It is generally understood that knowing a word does not simply entail knowing its meaning, but other aspects as well, such as its pronunciation, part of speech, collocations, register, and morphological changes. This word knowledge deepens through a gradual process of learning (Cervatiuc, 2007). A vocabulary depth test is thus used to assess learners’ knowledge of some of these aspects of words. As Read (1999) puts it, vocabulary size measures how many words learners know, whereas vocabulary depth deals with how well they know these words.

The binary distinctions listed thus far suggest that we must have a clear reason for testing before a vocabulary test is constructed. Nation (2008, as cited in Samad, 2010) proposes a matrix which lists different reasons for vocabulary assessment along with the corresponding test formats. An adapted version of the matrix is shown in the table below:
Reason for testing                  Useful formats and existing tests
To encourage learning               Teacher labeling, matching, completion, translation
For placement                       Vocabulary Levels test, Dictation level test, Yes/No, Matching, Multiple choice
For diagnosis                       Vocabulary Levels test, Dictation level test, EVST Yes/No
To award a grade                    Translation, Matching, Multiple-choice
To evaluate learning                Form recognition, Multiple-choice, Translation, Interview
To measure learners’ proficiency    Lexical Frequency Profile, Vocabulary size test, Translation

Table 1: Reasons for assessing vocabulary and their corresponding test formats (Samad, 2010: 78)

Examples of some of the formats presented in the table will be given in the next section.

III. How vocabulary is assessed

This section briefly outlines some commonly used formats in vocabulary assessment. The list below roughly follows the chronological order in which they appeared in the testing of vocabulary. The first four formats listed below were the earliest measures of vocabulary, which primarily ask learners to demonstrate their vocabulary knowledge by labeling, giving definitions, and translating. Traditionally, such assessment was done orally via an individual interview (Pearson et al., 2007). However, the mass testing triggered by World War I called for more reliable and practical scoring. This gave birth to the next two test techniques in the list: the Yes/No list and Multiple Choice Questions (MCQs). Research on Second Language Acquisition (SLA) and reading soon changed the view of how words are learned. It became a widespread belief that words are learned best when they are presented in context (Thornbury, 2002). This view motivated more contextualized vocabulary assessments such as the cloze-test. Next in the list is the assessment of the four skills (writing, speaking, listening, and reading), where vocabulary is sometimes part of the construct and where context is used to demonstrate learners’ ability to use the words (active vocabulary).

a. Labeling

One of the most commonly used test techniques in vocabulary assessment is labeling, where learners are typically asked to write down the word for a given picture, as illustrated below.

[Excerpt from Redman, S. (2003), Vocabulary in Use: Pre-intermediate & intermediate, p. 13, CUP.]

Alternatively, a single picture can be used and learners asked to label parts of it. Although it may be relatively easy to come up with a picture, especially with the growing mass of picture content available on the net, the technique is limited to what pictures can show, and thereby to testing concrete nouns (Hughes, 2003).

b. Definitions

In definitions, learners are asked to write the word that corresponds to the given definition, as illustrated below.
A ____________ is a person who teaches in your class.
______________ is a loud noise that you hear after lightning, in a storm.
______________ is the first day of the week.

Definitions allow a wider range of vocabulary to be tested, unlike the labeling format, which is restricted to concrete nouns. However, Hughes (2003) pinpoints one issue with this kind of test: not all words can be uniquely defined. To address this limitation, dictionary definitions may provide a shortcut and save us the headache of finding the best, clear-cut, unambiguous definition.

c. Translation

There are many different ways in which vocabulary can be measured using translation. Learners can choose the correct translation in an MCQ, or simply be asked to provide the translation for each word given in a list, as follows:

Teacher __________
Taxi driver __________
Student __________
Librarian __________
Actor __________
Shop keeper __________
President __________
Professor __________

Note that the above example may also be reversed, asking learners to provide the English words for words given in the L1. One pitfall in using translation is that a word may have more than one meaning, and therefore there may be more than one correct answer, which is an issue of reliability. However, the use of context may help address this limitation. This can be done by adding sentences in which the word to be translated is underlined. Another issue in translation is the assumption that the teacher knows the students’ mother tongue (Coombe, 2011). It may also be noted that the use of translation is regarded as somewhat controversial in the current trend in language education, where the use of the mother tongue is discouraged (Read, 2000); learners should instead be
given a healthy dose of L2 exposure in the classroom (Harmer, 2007). However, a recent study by Hayati and Mohammadi (2009) suggests that translation leads to longer retention of words than another vocabulary learning technique, the ‘task-based’ approach, whereby learners are asked to remember the definition, part of speech, collocations, and other aspects of a word (referred to earlier as vocabulary depth). Their findings imply that translation may still have its place in vocabulary assessment.

d. Matching

Another common vocabulary test presents learners with two columns of information and asks them to match a word in one column to an item in the other. Items in the left-hand column are referred to as premises, and items in the other column are called options. Words can be matched on the basis of a related meaning, a synonym, an antonym, or a collocation, as exemplified in the excerpt below:

[Excerpt from Redman, S. (2003), Vocabulary in Use: Pre-intermediate & intermediate, p. 13, CUP.]

Ur (1991) cautions against the use of matching, since learners can use the process of elimination to answer items even when they do not know the words in question. She thus recommends including more options in a matching task.
e. Yes/No list

The Yes/No format is particularly useful when we wish to test a large sample of items in a relatively short time. This is achievable because in this format learners are only asked to mark a word if they know what it means. For this practical reason, the Yes/No format is typically used to measure learners’ vocabulary size, since measuring size requires a large sample of items.

Give a tick (√) if you know what the word means.
__ Mayonnaise
__ Catastrophe
__ Belligerent
__ Distinctive

f. Multiple choice question

MCQs are among the most common test techniques in vocabulary assessment, especially in formal tests (Coombe, 2011). An MCQ consists of a stem and response options. The learners’ task is simply to find the one correct answer among the options. In a vocabulary test, MCQs can be used to test learners’ knowledge of synonyms, antonyms, meanings in context, or a range of English expressions, as shown in the excerpt below:

[Excerpt from McCarthy and O’Dell (2008), Academic Vocabulary in Use, p. 41, CUP.]
Although MCQs are often criticized for the sheer difficulty of constructing good items, the limited number of distractors available, and the element of guessing, they nevertheless remain one of the most popular vocabulary tests because of their practicality, versatility, familiarity, and high reliability.

g. Cloze-test

The cloze-test, also known as a sentence completion or gap-fill item, is yet another common vocabulary test, in which learners are asked to fill in the missing words in a text. It is relatively more demanding than the previous test formats, since learners must demonstrate their ability to use the word based on the context provided in the text. The cloze-test comes in many forms. The first is the fixed cloze test, in which every n-th word is deleted from a reading passage at a fixed ratio. The second form is called the selective-deletion or rational cloze test, where instead of deleting words at a fixed ratio, the test designers purposefully delete particular words from a reading passage. Another form of cloze-test uses multiple choice questions for answering the items. Thus, instead of having to write down the answer, learners need to choose one correct response for every deleted word. To eliminate the possibility of having more than one correct answer, it is desirable to provide the first letter of each deleted word, as illustrated in the following excerpt:

[Excerpt from McCarthy and O’Dell (2008), Academic Vocabulary in Use, p. 43, CUP.]
In the excerpt above, the first letter serves as a clue as to what the answer should be. A more extreme version of this is the C-test, where instead of giving the first letter of each deleted word, the first half of the word is revealed. One main advantage of the cloze-test is that it is easy to write; however, Read (2000) casts doubt on the use of the cloze-test as a true vocabulary measure. Since there are many aspects to attend to in answering a cloze-test item, the score may not reflect only the learner’s lexical knowledge but may instead be seen as gauging the learner’s overall proficiency in the language, including reading ability.

h. Embedded test

As previously mentioned, an embedded vocabulary test is not a vocabulary test on its own but part of a larger construct, as found in the testing of the four language skills (reading, listening, writing, and speaking). In such assessment, the rater judges the overall quality of learners’ vocabulary in a given skill. In reading, the learner is mainly asked to work out the meaning of a word from the context of a reading passage. In listening, vocabulary can be one part of a larger writing component where students’ knowledge of word spelling is assessed. Since writing and speaking are both productive skills, vocabulary is given somewhat more weight. IELTS writing and speaking, for instance, include ‘lexical resource’ as one of the four marking criteria. Lexical resource refers to the quality of learners’ vocabulary: whether, for example, the use of words is appropriate, varied, and natural, or incorrect, limited, and stilted.
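The cloze formats described under g. above (fixed-ratio deletion, rational deletion, first-letter clues, and the C-test style half-word clue) are mechanical enough to sketch in code. The following is only an illustrative sketch, not a procedure taken from the sources cited: the function names, the `hint` parameter, the four-underscore gap, and the choice to reveal the shorter half of odd-length words are all assumptions made for the example.

```python
import re

def make_gap(core, hint="none"):
    """Return the gapped form of one word.

    hint="first" keeps the first letter; hint="half" keeps the first half
    (rounded down, an assumption); anything else shows a plain gap.
    """
    if hint == "first":
        kept = core[:1]
    elif hint == "half":
        kept = core[: len(core) // 2]
    else:
        kept = ""
    return kept + "____"

def build_cloze(text, should_delete, hint="none"):
    """Gap every word for which should_delete(position, word) is true."""
    output, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        # Split a token into its word part and trailing punctuation;
        # tokens that do not fit this shape (e.g. "don't") are left intact.
        match = re.fullmatch(r"(\w+)(\W*)", word)
        if match is None or not should_delete(position, match.group(1)):
            output.append(word)
            continue
        core, tail = match.groups()
        answers.append(core)
        output.append(make_gap(core, hint) + tail)
    return " ".join(output), answers

def fixed_ratio_cloze(text, n=5, hint="none"):
    """Fixed cloze: delete every n-th word."""
    return build_cloze(text, lambda position, core: position % n == 0, hint)

def rational_cloze(text, targets, hint="none"):
    """Rational (selective-deletion) cloze: delete only the chosen words."""
    wanted = {target.lower() for target in targets}
    return build_cloze(text, lambda position, core: core.lower() in wanted, hint)

sample = ("Vocabulary is an essential part of learning a language, "
          "without which communication would suffer.")
print(fixed_ratio_cloze(sample, n=5, hint="first"))
print(rational_cloze(sample, ["essential", "suffer"]))
```

Both functions return the gapped text together with the answer key, so the same passage can be reused for a plain gap-fill, a first-letter version, or a C-test style version simply by changing `hint`.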
IV. Research on vocabulary assessment

Read (2000) provides one of the most comprehensive historical accounts of research into vocabulary assessment. He states that vocabulary assessment is a field of study to which not much attention has been paid, particularly by language testing researchers themselves. In the 1990s, most of the studies were conducted by Second Language Acquisition (SLA) researchers, who might not have had an adequate understanding of testing and measurement but needed vocabulary tests as research instruments in order to validate their own findings. Basically, these SLA researchers were interested in using tests to examine whether specific lexical strategies are fruitful in terms of vocabulary acquisition. The four recurring topics that SLA researchers contributed to the field were systematic vocabulary learning, incidental vocabulary learning, inferring the meaning of words from context, and communication strategies. Systematic vocabulary learning looks at systematic, orderly ways of acquiring vocabulary. Incidental vocabulary learning concerns the extent to which learners absorb new words incidentally over time. The third topic investigates learners’ use of context in working out the meaning of unknown words. The last topic deals with the range of communication strategies used in situations where learners lack the vocabulary to express what they wish to say. Other key contributors to earlier studies of vocabulary assessment are first language reading researchers. This is primarily due to the consistent findings of a positive correlation between vocabulary knowledge and reading.
Testing researchers proper, on the other hand, take an interest in the construction of vocabulary tests and in how they may cater for different testing purposes such as diagnosis, achievement, and proficiency, rather than in the processes of vocabulary learning and in using some sort of vocabulary measurement to establish how effective each of these processes is. The twentieth century marked the period when researchers in the field of language testing began to take an interest in
vocabulary assessment. The first research area that gained currency at that time was objective testing, a kind of testing that does not require judgment in its scoring. Read (2000) also contends that the most frequently used test technique in objective testing is the multiple-choice question, which is favored particularly in vocabulary assessment because: (a) the availability of synonyms, translations, and short defining phrases makes it easy to construct distractors, (b) the source of the vocabulary to test is available thanks to the development of lists of the most frequent words in English, and (c) objective vocabulary measurement can also be used to indicate overall language proficiency. The use of MCQs means that vocabulary testing throughout the twentieth century was typically selective, discrete, and context-independent. However, with the growing concern for using context in vocabulary assessment, more and more tests, such as the cloze test, use context in their construct. As Read (2000) acknowledges, though, the dilemma of a context-dependent vocabulary measure is that it becomes quite difficult to separate the scoring of pure vocabulary knowledge from other skills such as reading ability.

Read (2007) continues the documentation of research in vocabulary assessment from his 2000 publication. He states that a growing trend in current research on vocabulary assessment is the measurement of vocabulary size (or breadth) and depth. These two distinct vocabulary measures will be briefly discussed in turn.

Vocabulary size is an area that has gained currency in second language vocabulary assessment. But what is it that makes it worth studying? As pointed out by Nation and Beglar (2007), measuring learners’ vocabulary size is important for several reasons. First, it may inform teachers of how well their learners can cope with real-life, authentic tasks such as reading newspapers and novels, watching movies, or speaking in English.
Since data on the vocabulary size needed to perform each task are available, teachers can test learners’ current vocabulary size and estimate how close their learners are to performing
these tasks. Secondly, such measurement is needed to track the progress of learners’ vocabulary. Lastly, vocabulary size can also be used to compare non-native speakers with native speakers, predicting how close learners are to achieving the size of a native speaker’s vocabulary. In measuring vocabulary size, researchers have used word lists as a source for assessing the number of words a learner knows. This is more feasible now than ever before thanks to the growing use of computer corpora, which can provide high-quality word lists. The first step researchers must take in measuring vocabulary size is thus to choose an available word list. After that, words are selected from the list, and finally a suitable test technique is chosen. One commonly used test for assessing vocabulary size is the Vocabulary Levels Test (VLT), designed by Paul Nation, which has been used as a second language diagnostic test (Cervatiuc, 2007). The test is available online (http://www.lextutor.ca/tests/) and has both a receptive and a productive version, which can be used to measure learners’ passive and active vocabulary respectively.

In contrast to vocabulary size, there has been relatively little progress in research on the depth of vocabulary. This is due to the lack of a definition of what constitutes ‘depth’ and of an agreed construct for developing such a test. It is generally understood that knowing a word does not simply mean knowing its definition. The fact that there are many aspects of a word that a learner must know, such as its pronunciation, spelling, morphological forms, part of speech, and collocations, means that there are quite a lot of things to measure, and there is little agreement on which ones should constitute a learner’s depth of vocabulary (Read, 2007). As such, it is relatively more difficult to construct this kind of test. However, one widely acknowledged vocabulary depth test is the Word Associates Test (WAT), designed by John Read in 1993 (Cervatiuc, 2007).
As its name suggests, learners’ vocabulary is measured using word associations such as synonyms, collocations, and related meanings. Typically, the WAT measures how well learners know words by asking them to tick the four out of eight possible options that have these associations, as exemplified
below:

[Item from Read (1998), Word Associates Test, taken from <http://www.lextutor.ca/tests/associates/>]

V. Assessing vocabulary in Indonesian EFL context

Judging from my personal experience of learning English in Indonesia as a compulsory subject for six years, and of teaching English in a senior high school in Bandung for seven years, vocabulary assessment in Indonesia seems to leave a lot to be desired. Two main issues contributed to my painful experience of being assessed on, and of assessing, vocabulary: the nature of teaching and learning in school, and the tough national examination. These two issues will be elaborated in turn.

From the time I was a student learning English in junior and senior high school until I became an English teacher myself, vocabulary learning has remained unchanged. Typically, learners are asked to read a passage from a book, followed by a comprehension exercise which may include some vocabulary exercises. Usually, words from these exercises are recycled only once, in the review unit. After that, the words that learners have learned are never repeated; they seem to vanish into thin air. As Thornbury (2002) recommends, a learner needs to encounter a word at least eight times for it to ‘stick’ in their mental lexicon. This suggests that teachers should incorporate informal vocabulary assessment, as
mentioned previously in this paper, in their teaching so that these words get recycled, are used meaningfully in a different way, and are eventually stored in long-term memory.

Another sad fact about vocabulary assessment in the Indonesian context is that the words learned in class do not even appear in the national examination, an achievement test taken at the end of the school year as a requirement for graduation. Instead, vocabulary items seem to be used as a kind of reading exercise to get students used to one component of reading skill, which is guessing meanings from context. Vocabulary testing is therefore largely used as a means to an end rather than as an end in itself. In terms of Read’s (2000) dimensions of vocabulary testing, then, the national tests in Indonesia have mainly been embedded and selective. Below is an example of an item taken from the 2010 English national examination.

[Item taken from Ujian Nasional 2009/2010, Kementrian Pendidikan Nasional]

The above sample typifies the test as a context-dependent vocabulary measure, in which the word inhabitant is not presented in isolation but used in a sentence taken from a text. As Read (2000) points out, this might come from the assumption that words never occur by themselves but form an integrated part of a whole text. However, a closer look at the item reveals that it might not be context-dependent in the true sense. Test-takers who attempt this question might respond C. Citizens without necessarily having to read the whole text in order to come up with that answer. It must be highlighted that the key element of a context-dependent question is that learners must
engage with the context in order to give the appropriate response. As a comparison, below is a context-dependent item as illustrated by Read (2000):

Humans have an innate ability to recognize the taste of salt because it provides us with sodium, an element which is essential to life. Although too much salt in our diet may be unhealthy, we must consume a certain amount of it to maintain our wellbeing.

What is the meaning of consume in this text?
A. Use up completely
B. Eat or drink
C. Spend wastefully
D. Destroy

Taken from Read (2000: 12), Assessing Vocabulary.

In contrast to the previous sample item, all four options in the above item are possible meanings of consume, which puts learners in a position where they must read the text and use the available context to select the correct response, which is B. Eat or drink.

The English National Examination (ENE) has 50 multiple-choice items altogether, with 15 listening comprehension questions and 35 reading comprehension questions. However, not all of these 35 reading questions pertain to vocabulary assessment: in the 2010 ENE, only 5 questions, or 10% of all items, assess vocabulary knowledge. The large proportion of reading in the test seems to indicate an emphasis on reading skill. This might be one of the ways in which the government invests in developing a culture of reading, as stipulated in article three of the National Educational Law of July 2003 (UNESCO, 2011). Such emphasis is also desirable because, once students enroll in university, they are expected to read English textbooks, which make up 80% of the required reading (Nurweni & Read, 1999). However, these reading texts are deemed too difficult for the students to comprehend. In relation to the vocabulary size mentioned previously, learners must reach at least the 4000-word level in order to attain 95% coverage of a text, on the assumption that the remaining
5% is the maximum tolerable proportion of unfamiliar words (Laufer, 1989, as cited in Aziez, 2011; Nurweni & Read, 1999). With 95% coverage, learners may still be able to comprehend a 200-word reading passage containing 10 unknown words. A recent study suggests that, at the 4000-word level, learners might be able to cover at least 95.96% of the words occurring in the 2010 senior high school ENE and, surprisingly, 95.80% in its junior high school counterpart, which is only marginally lower (Aziez, 2011). This means that the reading texts in both junior and senior high school national exams belong to the same word level, which further suggests that test designers have not fully considered the vocabulary load appropriate to these two different levels of schooling. Even more surprisingly perhaps, Nurweni and Read (1999) reveal that most senior high school students entering university do not even come close to the required 4000-5000 word level, meaning that dealing with these reading passages is arduous work for them. A difficult reading passage is also said to affect test reliability, which refers to the degree of consistency and accuracy of a test. As pointed out by Samad (2010), a test that contains difficult reading passages might contribute to errors, which in turn affect how accurately the test reflects one’s true score.

To sum up, vocabulary measurement in senior high schools in Indonesia largely employs embedded, selective, and context-dependent tests in the form of MCQs. It is embedded in the sense that vocabulary measurement forms part of a larger measure of reading skill, and thus measures only learners’ receptive (or passive) vocabulary. It is selective since the words to be tested are chosen by the test designers, and context-dependent since the word is not presented in isolation. However, the test is not context-dependent in the full sense, since learners do not need to engage with the context in order to arrive at the correct response (Read, 2000). To overcome this and some of the other problems mentioned previously, the following suggestions may be helpful:
1. The vocabulary construct should be revamped so as to reflect a true context-dependent vocabulary measure, in which all the options are possible variant meanings of the word, forcing the learner to make use of the context in order to arrive at the correct response.
2. Teachers need to make use of informal vocabulary assessment so that words get recycled at least eight times. This can be done in various ways, such as a 10-minute review game at the beginning of every class to recycle and use these words in different ways. Although these same words may not come up in the national exam, teachers can still train students in the reading skill of inferring the meaning of unknown words in a passage, or familiarize students with the types of text that will appear in the exam.
3. In order to improve test reliability, test designers must carefully consider the level of difficulty of texts in national exams. Reading texts that are too difficult will considerably affect the accuracy of the score (Samad, 2010).
4. If the educational goal is to enable students to deal with English textbooks when they later enter university, then vocabulary teaching geared towards mastering at least the 4000-word level is desirable (Nurweni & Read, 1999). Teachers may consider using the Academic Word List (AWL) devised by Coxhead, since most of these words are related to academic registers. This list and many other word lists are widely available for download on the internet. This implies that the testing of vocabulary should also be directed at measuring vocabulary size, so that teachers may track the number of words students know over time. Thankfully, there are many websites that can do this automatically, such as vocabularysize.com (http://my.vocabularysize.com/).
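The coverage figures discussed in this section (95% coverage, or 10 unknown words in a 200-word passage) can be illustrated with a short calculation. What follows is only a minimal sketch in Python, not the procedure used in the studies cited: the function names and the tiny word list are hypothetical, and it matches exact word forms rather than the word families that published coverage studies normally count.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_coverage(text, known_words):
    """Return the fraction of running words in `text` found in `known_words`.

    Simplifying assumption: exact word forms are compared, so "eat" and
    "eats" count as different words.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def max_unknown_tokens(passage_length, coverage):
    """Number of unknown running words implied by a given coverage level."""
    return round(passage_length * (1 - coverage))


if __name__ == "__main__":
    # Hypothetical known-word list, for illustration only.
    known = {"we", "must", "a", "certain", "amount", "of", "it"}
    sentence = "We must consume a certain amount of it"
    print(lexical_coverage(sentence, known))  # 7 of the 8 tokens are known
    print(max_unknown_tokens(200, 0.95))      # 10
```

In practice a teacher would replace the hypothetical word set with a real frequency list and run it over each reading passage to see whether coverage falls below the 95% threshold.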
VI. Conclusion

Vocabulary is an essential building block of language learning. As such, it makes sense to be able to measure it accurately. Important as it is, it is sobering to know that there is a paucity of research into vocabulary assessment. Even more saddening is the fact that most of the contributors to the field are SLA and first-language reading researchers, who might not have an adequate understanding of testing but need vocabulary measurement to validate their own findings. It was not until the later twentieth century that researchers in the field of language testing itself began to pay more attention to vocabulary assessment. The current trend in vocabulary assessment is towards measuring learners’ vocabulary size and vocabulary depth, that is, how many words they know versus how well they know these words. Other distinctions that researchers might use in assessing vocabulary are receptive versus productive, informal versus formal, discrete versus embedded, selective versus comprehensive, and context-dependent versus context-independent. These distinctions may be realized in various test formats such as labeling, definition, translation, MCQs, Yes/No checklists, matching, cloze tests, and embedded tests.

This paper has also described the current practice of vocabulary assessment in Indonesian EFL contexts, particularly in senior high schools in Bandung. The common practice is the receptive, embedded, context-dependent, selective measurement of vocabulary through MCQs. The paper also mentioned problems in this assessment, including the tough reading passages that make up 70% of the total questions in the English National Examination. The difficult reading passages severely hamper learners’ comprehension, which in turn threatens test reliability. This paper thus calls strongly on test designers to reconsider the difficulty of the reading texts, and on teachers to adopt direct teaching of high-frequency words, which further confirms the need to measure learners’ vocabulary size and is in line with the new direction in vocabulary assessment.
References

Aziez, F. (2011). Examining the Vocabulary Levels of Indonesia’s English National Examination Texts. Asian EFL Journal, Vol. 51, pp. 16-29.
Cervatiuc, A. (2007). Assessing Second Language Vocabulary Knowledge. International Forum of Teaching and Studies, Vol. 3(3), pp. 40-78.
Coombe, C. (2011). Assessing vocabulary in the classroom. Retrieved April 2th, from: http://marifa.hct.ac.ae/files/2011/07/Assessing-Vocabulary-in-the-Language-Classroon.pdf
DeVriez, B. (2012). Vocabulary assessment as predictor of literacy skills. New England Reading Association Journal, Vol. 47(2), pp. 4-9.
Harmer, J. (2007). The Practice of English Language Teaching (4th ed.). Essex: Pearson Longman.
Hayati, A., Mohammadi, M. (2009). Task-based instruction vs. translation method in teaching vocabulary: The case of Iranian-secondary school students. Iranian Journal of Language Studies, Vol. 3(2), pp. 153-176. Retrieved 14th April, from: http://www.ijls.net/volumes/volume3issue2/hayati2.pdf
Hughes, A. (2003). Testing for Language Teachers (2nd ed.). Cambridge: Cambridge University Press.
McCarthy, M., O’Dell, F. (2008). Academic Vocabulary in Use. Cambridge: Cambridge University Press.
Nation, P., Beglar, D. (2007). A vocabulary size test. The Language Teacher, Vol. 31(7), pp. 9-12.
Nemati, A. (2010). Proficiency and Size of Receptive Vocabulary: Comparing EFL and ESL Environments. International Journal of Education Research and Technology, Vol. 1(1), June 2010, pp. 46-53.
Nurweni, A., Read, J. (1999). The English Vocabulary Knowledge of Indonesian University Students. English for Specific Purposes, Vol. 18(2), pp. 161-175.
Pearson, P., Hiebert, E., Kamil, M. (2007). Vocabulary assessment: What we know and what we need to learn. Reading Research Quarterly, Vol. 42(2), pp. 282-296.
Read, J. (1998). Word Associates Test. Retrieved 15th April, from: http://www.lextutor.ca/tests/associates/
Read, J. (2000). Assessing Vocabulary. Cambridge: Cambridge University Press.
Read, J. (2007). Second Language Vocabulary Assessment: Current Practices and New Directions. International Journal of English Studies, Vol. 7(2), pp. 105-125.
Redman, S. (2003). Vocabulary in Use: Pre-intermediate & intermediate. Cambridge: Cambridge University Press.
Samad, A. (2010). Essentials of Language Testing for Malaysian Teachers. Selangor: Universiti Putra Malaysia Press.
Thornbury, S. (2002). How to Teach Vocabulary. Essex: Pearson ESL.
UNESCO [United Nations Educational, Scientific, and Cultural Organization] (2011). Indonesia. World Data on Education (7th ed.). Retrieved 15th April, from: http://unesdoc.unesco.org/images/0019/001931/193181e.pdf
Ur, P. (1991). A Course in Language Teaching. Cambridge: Cambridge University Press.