TESOL 2010 Boston
This talk describes the design, development and some validation work of the revised Oxford Online Placement Exam.


  • TESOL 2010 Purpura & Beeston 03/27/10
  • English language students in programs around the world need to be placed quickly and efficiently into class levels at the beginning of a course. Oxford University Press has provided a placement exam for many years, but recently decided to revise it. To get information on what stakeholders wanted, the Assessment Team at Oxford surveyed some 300 stakeholders. Based on this research, the revised exam would: measure language knowledge and use (e.g., listening); make score-based placement decisions aligned with the Common European Framework of Reference (CEFR); report scores in relation to the CEFR; provide detailed feedback to constituents; and be short, easy to administer, and inexpensive. 03/27/10
  • The new placement exam would have five sections: a language knowledge section (i.e., grammatical & pragmatic knowledge) & four language skills sections. The grammar section was developed first. The decision to measure language knowledge separately stems from research showing that grammatical knowledge (i.e., form & meaning) is a critical component & strong predictor of the ability to communicate in meaningful & pragmatically appropriate ways. Thus, the grammar section would be used to determine what CEFR-level-specific test items would follow. The grammar section of the Oxford Online Placement Test (OOPT) will be the focus of today's talk. 03/27/10
  • KEY SALES MESSAGE: Because the test is computer adaptive (CAT), it can provide a tailored test for each person, so the test can be shorter yet still reliable, i.e., measure a test taker with a known and acceptable level of error = TEST RELIABILITY. NB: The placement test item bank has around 1,200 items across the CEFR levels, so each test of approximately 45 items can be expected to be somewhat different from any other test, though tests cannot be guaranteed to be unique. Can the item bank be learnt? If you keep retaking the test, e.g., if you buy test access as an individual, you could keep moving up the ability continuum and getting higher scores, but you would need to make a huge effort to do this and would arguably be learning more and more English as you did so, i.e., the system would select items of increasing difficulty. But getting a higher score by learning more English is fine! 03/27/10
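The adaptive mechanism described in this note can be sketched in a few lines. The sketch below is a simplified illustration, not OUP's actual engine: it assumes a dichotomous Rasch model, always administers the unused item whose difficulty is closest to the current ability estimate, and nudges the estimate after each response (the `step` size and the update rule are simplifying assumptions):

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pick_item(theta, bank, used):
    """Choose the unused item whose difficulty is nearest the ability estimate."""
    return min((i for i in bank if i not in used), key=lambda i: abs(bank[i] - theta))

def update_theta(theta, b, correct, step=0.5):
    """Move the ability estimate toward the observed response (crude gradient step)."""
    return theta + step * ((1.0 if correct else 0.0) - rasch_p(theta, b))

def run_cat(bank, answer, n_items=45, theta=0.0):
    """Administer up to n_items adaptively; answer(item) returns True if correct."""
    used = set()
    for _ in range(min(n_items, len(bank))):
        item = pick_item(theta, bank, used)
        used.add(item)
        theta = update_theta(theta, bank[item], answer(item))
    return theta
```

Because each item is chosen relative to the evolving estimate, two test takers of different ability see largely different items, which is why a roughly 45-item adaptive test can approach the precision of a much longer fixed form.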
  • We drew on a number of sources of information to define grammatical knowledge in the OOPT. 03/27/10
  • ILACA 2009 03/27/10 James E Purpura, Teachers College, Columbia Univ. Over the years, the Common European Framework of Reference has been increasingly used for the development of language curricula, textbooks, and assessments. This framework claims to provide a comprehensive description of (1) what language learners have to learn to do in order to use a language for communication and (2) what knowledge and skills they have to develop so as to be able to act effectively. The CEFR also claims to provide language proficiency descriptors that can be used to portray language growth on a vertical scale across different levels of achievement. In other words, the CEFR describes what learners might be expected to do with the language at these levels, but it does not actually specify what knowledge components learners need to have acquired at different proficiency levels! 03/27/10
  • TESOL 2010 Purpura, Grabowski, Dakin, Ameriks, Beeston
  • Grammatical knowledge is defined in terms of knowledge of grammatical forms and meanings; this includes a variety of forms and meanings at the sub-sentential, sentential, and discourse levels. Pragmatic knowledge is also part of language ability in this model, but Task 1 only explicitly measures grammatical knowledge. At the advanced levels: lexical forms (noun compounding; co-dependence restrictions, e.g., depend on) and lexical meanings (denotation, connotation, collocation); morphosyntactic forms (inflectional & derivational affixes; syntactic structure; tense, aspect, mood; complex/compound sentence types) and morphosyntactic meanings (notions of time and space; passivization; interrogation); cohesive forms and meanings (referential forms, ellipsis; logical connectors); information management forms (word order, parallelism) and meanings (given/new information; emphasis, foregrounding). 03/27/10
  • We coded the items for a number of variables; we will discuss a few here. We first coded each item for its hypothesized CEFR level. Then we coded the items for grammatical knowledge: the possibilities were F (form), M (meaning), or F&M. After that, the items were coded for the type of grammatical form or meaning according to Purpura (2004); item 191 was a morphosyntactic (MS) form item designed to measure the past simple. Finally, we coded each item for its specific grammatical focus; item 191 was a tense & aspect item. 03/27/10
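The coding scheme just described lends itself to a simple structured record. The sketch below is illustrative only: the field names, and the CEFR value shown for item 191 (which the notes do not state), are assumptions rather than the project's actual variables:

```python
from dataclasses import dataclass

@dataclass
class ItemCoding:
    item_id: int
    cefr_level: str            # hypothesized CEFR level (value below is a placeholder)
    knowledge: str             # "F", "M", or "F&M"
    category: str              # form/meaning type per Purpura (2004)
    grammatical_focus: str     # specific focus, e.g. "tense & aspect"

# Item 191: a morphosyntactic form item measuring the past simple
item_191 = ItemCoding(
    item_id=191,
    cefr_level="A2",           # hypothetical: the notes do not give this item's level
    knowledge="F",
    category="morphosyntactic form",
    grammatical_focus="tense & aspect",
)
```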
  • KIRBY STARTS HERE: Once we had a definition of grammatical knowledge, we prepared test specs and commissioned items to be written to those specs. 03/27/10
  • We are focusing on Task 1 in this paper. Task 1 tests knowledge of grammatical forms: it is designed to measure test takers' knowledge of grammar. In this task, test takers read a short gapped dialogue and then complete the dialogue by selecting one of four options. 03/27/10
  • ECOLT 2009 Purpura, Grabowski, Dakin, Ameriks, Beeston 03/27/10
  • The levels identified by the coders often differed by more than one level from the codings generated by Rasch analysis. Given the number of discrepancies, this raised questions about which set of item codings to believe as a basis for specifying what it means to have grammatical knowledge at a given CEFR level; perhaps both. Even though item writers were asked to write developmentally sequenced distractors, the items were scored dichotomously by means of the simple Rasch model. This raises the question of the correspondence between the measurement model and the scoring method: should partial credit be used? An example of a discrepancy is when an item is judged to sit at one level on the CEFR scale, but the Rasch-based coding rooted in item difficulty places it at another. The problem with using item difficulties is that difficulty is often a function of the key AND the options, yet the analysis is based only on the key; again, partial credit may be warranted. 03/27/10
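The dichotomous-versus-partial-credit question can be made concrete. The sketch below is not the OOPT analysis code; it simply contrasts the simple Rasch model, which conditions only on the key, with partial-credit category probabilities (Masters, 1982) that could reward developmentally sequenced distractors:

```python
import math

def rasch_dichotomous(theta, b):
    """P(correct) under the simple Rasch model: only the key matters."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def rasch_partial_credit(theta, thresholds):
    """Category probabilities under the partial credit model.

    thresholds are step difficulties between adjacent score categories,
    so an ordered set of distractors can earn intermediate credit.
    Returns [P(score = 0), P(score = 1), ..., P(score = len(thresholds))].
    """
    logits = [0.0]                      # cumulative sums of (theta - tau_k)
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    total = sum(math.exp(l) for l in logits)
    return [math.exp(l) / total for l in logits]
```

Under the partial-credit form an item's operating characteristics depend on all the options, not just the key, which speaks to the concern that key-only Rasch difficulties may misplace items on the CEFR scale.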
  • We scored these responses on multiple dimensions.
  • A) Performance: Test takers can successfully… • Use grammatical resources to understand the literal and intended meanings explicitly stated within the spoken text (i.e., literal & intended meaning). • Use grammatical resources to understand a range of implied or pragmatic meanings implicit within and beyond the spoken text (implied meaning). More specifically, depending on the proficiency level of the test, listeners must be able to infer: contextual meanings (e.g., Where does this conversation take place? What can you conclude from the conversation? What's the main point of the lecture?); sociolinguistic meanings (e.g., What is the relationship between the two speakers?); sociocultural meanings (e.g., What does the speaker mean by "the Big Apple"?); psychological meanings (e.g., What is the woman's attitude toward the man? How do you think the man feels? What is the tone of the conversation?); rhetorical meanings (e.g., What is the purpose of this announcement? How does the discussion end? What is the woman trying to do in this lecture?). • The interpretation of meaning can be derived from explicit information within the spoken text. This is referred to as "endophoric reference" (Halliday & Hasan, 1989). For example, a traditional "detail" item like Where does the man live? might be coded as "literal meaning/endophoric reference" if the place the man lives is explicitly mentioned in the text. Similarly, a "gist" item, where the main idea is explicitly stated in the spoken text (e.g., "The purpose of my talk today is to…"), would be coded as "literal meaning/endophoric reference." • The implied meaning of a spoken text can be derived from information implicit within the spoken text; this too is endophoric reference (Halliday & Hasan, 1989). For example, consider the "gist" item: What is the speaker's main message?
If the main idea of the speaker’s message is not explicitly stated in the text, but can be inferred through implicit contextualization cues, then this item would be coded “implied (contextual) meaning/endophoric reference.” Consider also a lexical expression item such as: What does the man mean when he says “Suppose you had to be there.”? If the answer to this question is not explicitly stated in the spoken text, but can be inferred from the surrounding text, this item would also be coded “implied (contextual) meaning/endophoric reference.” Finally, consider an “inference” item, where the test taker is asked where the conversation seems to be taking place. If the location is not explicitly stated, but can be inferred from lexical cues in the co-text, this item would be coded “implied (contextual) meaning/endophoric reference.” • The implied meaning of a spoken text can also be derived implicitly from information that lies outside the spoken text itself. This is called exophoric reference (Halliday & Hasan, 1989). These extensions of meaning draw on (1) the overall context of the situation, (2) the norms, assumption, expectations and presuppositions associated with the situational context, and (3) the background knowledge of the listener in relation to that context. For example, consider the following “tone” items: What is the tone of the conversation? or How do you think [person X]) feels about the situation?” If the answer to these questions are not explicitly stated in the spoken text, but can be inferred, not so much from the surrounding text, but from the context of the situation coupled with the norms and expectations of interaction in this situation, this item would be coded “implied (psychological) meaning/exophoric reference.” In this case, the pragmatic meaning clearly lies outside the spoken text itself and in the context of the situation. 
• Note that sociocultural references (e.g., the Big Apple, the Windy City, the Angel of the North, the Bull Ring) should be avoided if they are deemed outside the reference of the Oxford Test candidature. Nonetheless, sociocultural references are a part of the living language, and on some level, it is impossible to avoid them for certain populations (e.g., some students might not be familiar with train or airplane timetables). If the reference is clearly outside the reference, dialogues should be written so that the meaning can be derived from within the text, perhaps as a repair strategy. For example: A: I’m off to the Big Apple tomorrow. B: What? A: You know… New York City. B: Oh, I see. How long are you going for? B) Focus: Test Takers need to have knowledge of … • Grammatical Forms & Meanings : This task requires examinees to know a range of grammatical forms so their literal and intended meaning s can be understood from spoken text (See Purpura, 2004). • Pragmatic Meanings: This task also requires examinees to be able to use grammatical resources to understand a range of pragmatic (implied) meanings (See Purpura, 2004). More specifically, depending on the proficiency level, this task may require the ability to interpret: contextual meanings (e.g., situational features) sociolinguistic meanings (e.g., social identity markers referring to gender, age, status; social meanings pointing to power, politeness and formality; and social norms, preferences, and situational expectations) sociocultural meanings (e.g., cultural references, figurative meanings, metaphor) psychological meanings (e.g., affective stance, point of view, tone) rhetorical meanings (e.g., coherence and genre)
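The meaning-type/reference-type coding scheme described above can be sketched as a small data structure. This is a hypothetical illustration of the coding record (the item ID, stem, and class names are invented for the example), not an artifact of the OOPT project itself.

```python
# Hypothetical coding record for a listening item, following the
# meaning-type / reference-type scheme described in the notes above.
from dataclasses import dataclass

MEANING_TYPES = {"literal", "contextual", "sociolinguistic",
                 "sociocultural", "psychological", "rhetorical"}
REFERENCE_TYPES = {"endophoric", "exophoric"}  # Halliday & Hasan (1989)

@dataclass
class ListeningItemCoding:
    item_id: str
    stem: str
    meaning_type: str      # "literal" vs. one of the implied categories
    reference_type: str    # meaning recoverable inside vs. outside the text

    def __post_init__(self):
        assert self.meaning_type in MEANING_TYPES
        assert self.reference_type in REFERENCE_TYPES

    def label(self) -> str:
        # Produce the shorthand label used in the notes,
        # e.g. "implied (contextual) meaning/endophoric reference".
        if self.meaning_type == "literal":
            return f"literal meaning/{self.reference_type} reference"
        return f"implied ({self.meaning_type}) meaning/{self.reference_type} reference"

# A "tone" item whose answer lies in the context of the situation:
tone_item = ListeningItemCoding(
    "L-042", "What is the tone of the conversation?",
    "psychological", "exophoric")
# tone_item.label() -> "implied (psychological) meaning/exophoric reference"
```

The point of the two-field coding is that it keeps the *what* (meaning type) separate from the *where* (inside or outside the spoken text), exactly as the notes distinguish them.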
  • The levels identified by the coders often differed by more than one level from the codings generated by the Rasch analysis. Given the number of discrepancies, this raised questions about which set of item codings to believe as a basis for specifying what it means to have grammatical knowledge at a given CEFR level; perhaps both. Even though item writers were asked to write developmentally sequenced distractors, the items were scored dichotomously by means of the simple Rasch model. This raises the question of the correspondence between the measurement model and the scoring method: would partial credit be more appropriate? An example of a discrepancy is when an item is judged to be at one level on the CEFR scale, but the Rasch-based coding, rooted in item difficulty, places the item at a different level. The problem with using item difficulties is that difficulty is often a function of the key AND the options, but the analysis is based only on the key; again, partial credit might address this.
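The dichotomous-versus-partial-credit question above can be made concrete. Below is a minimal Python sketch contrasting the simple (dichotomous) Rasch probability of a correct response with Masters-style partial credit probabilities over ordered score categories. The ability and threshold values are purely illustrative, not OOPT parameters.

```python
import math

def rasch_p_correct(theta, b):
    """Dichotomous Rasch model: probability that a person with
    ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pcm_probs(theta, thresholds):
    """Partial credit model (Masters, 1982): probabilities of scoring
    in each category 0..m, given step thresholds tau_1..tau_m.
    P(x) is proportional to exp(sum_{k<=x}(theta - tau_k)),
    with the empty sum for x=0 equal to 0."""
    logits = [0.0]
    running = 0.0
    for tau in thresholds:
        running += theta - tau
        logits.append(running)
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

# A person at theta = 0.5 on an item of difficulty 0.0,
# scored right/wrong:
p = rasch_p_correct(0.5, 0.0)        # ~0.62

# The same item rescored with two ordered steps, e.g. distractors
# reflecting partial knowledge of form vs. full form & meaning:
probs = pcm_probs(0.5, [-0.5, 0.5])  # [P(score 0), P(1), P(2)]
```

Under the dichotomous model, a candidate who picks the "developmentally closer" distractor earns the same zero as one who picks a random option; the partial credit model lets that ordering inform the difficulty and ability estimates.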

Tesol 2010 Boston: Presentation Transcript

  • THE TESOL-ILTA JOINT SESSION The Design and Development of the Web-Based Online Oxford Placement Exam James E. Purpura, Teachers College, Columbia University [email_address] TESOL, Boston, March 2010 (Thanks to Yoko Ameriks, Simon Beeston, Jee Wha Dakin, Kristin DiGennaro, & Kirby Grabowski) (For support, thanks to: Oxford University Press, TESOL International & The International Language Testing Association (ILTA, www.iltaonline.com))
  • Purpose of this Talk
    • To discuss the design & development of the Oxford Online Placement Test (OOPT).
    Contents of the Talk
    • I will discuss the test mandate.
    • I will discuss the development of the language elements section of the test (grammatical & pragmatic knowledge) along with some validation results;
    • I will then discuss the development of the language use section of the test, focusing on listening ability.
  • Test Mandate for a Revised Placement Test Needs Assessment for a Revised OUP Placement Exam • 300 stakeholders surveyed • Goals of new Oxford Online Placement Test (OOPT):
    • Measure language knowledge & use (e.g., listening);
    • Make score-based placement decisions aligned with
    • Common European Framework of Reference (CEFR);
    • Report scores in relation to the CEFR;
    • Provide detailed feedback to constituents;
    • Be short, easy to administer & inexpensive (delivered online & computer-adaptive);
    • Able to customize the administration system for schools to collect data & add information.
  • Initial Design Decisions • The revised OOPT would have 5 sections: a language knowledge section (i.e., grammatical & pragmatic knowledge) & 4 language use (i.e., skills) sections. • The decision to measure language knowledge separately stemmed from research showing that grammatical knowledge (i.e., form & meaning) is a critical resource (Purpura, 1999, 2006) & a strong predictor of the ability to communicate in meaningful & pragmatically appropriate ways (Ameriks, 2009; Chang, 2002, 2004; Grabowski, 2009; Kim, 2009; Liao, 2009). • The language knowledge section would consist of 4 tasks: (1) MC grammar (F&M); (2) MC Meaning (Literal & Intended); (3) Cloze (F&M); & (4) MC Pragmatic Meaning.
  • Initial Design Decisions • The grammar section was developed first, since it would be used to determine what CEFR-level-specific items the examinees would receive in the rest of the test, which is computer-adaptive.
  • How Does the Computer-Adaptive System Work? 1. The candidate views the test demo info, then registers & selects a level description, e.g., B1. 2. The candidate starts the test with a B1-level item from Task 1 (MC F&M). Person ability is calculated & the next item selected, and so on, until 10 items have been administered, at which point the person may be identified at B1 or another level (item-level adaptivity across the CEFR levels). 3. The Task 2 (Meaning) testlet is selected from the database based on the examinee's level at the end of Task 1, and Task 2 is administered. The same procedure occurs for Task 3 (F&M), drawing on pre-constructed CEFR-level tests. 4. Listening items (15 items) are selected on the basis of the level at the end of Task 3. 5. A score is generated for Grammar and Listening, along with an Average; it is derived from the 2 levels combined into one "block" score showing the examinee's relative position in each. This is a probabilistic model based on IRT.
  • How the tests work adaptively: B1, 1st item CORRECT → harder 2nd item CORRECT → harder 3rd item CORRECT → harder 4th item INCORRECT → easier 5th item CORRECT → harder 6th item INCORRECT → easier 7th item CORRECT → harder 8th item INCORRECT → easier 9th item CORRECT. The test focuses on the student's ability & adapts to provide items targeted for that person's ability level; the final test is shorter but still accurate.
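The harder/easier walkthrough above can be sketched as a toy Rasch-based adaptive loop. This is an illustrative approximation, not the OOPT engine: the item bank, the closest-difficulty selection rule, and the crude stepwise ability update are all simplifying assumptions made for the example.

```python
import math

def p_correct(theta, b):
    # Rasch probability of a correct response for ability theta,
    # item difficulty b.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, bank, used):
    """Pick the unused item whose difficulty is closest to the current
    ability estimate (roughly maximum information under Rasch)."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=lambda i: abs(bank[i] - theta))

def update_theta(theta, b, correct, step=0.7):
    """Move the estimate up after a correct answer, down after an
    incorrect one, in proportion to how surprising the response was."""
    residual = (1.0 if correct else 0.0) - p_correct(theta, b)
    return theta + step * residual

def administer(bank, answer_fn, n_items=10, theta=0.0):
    """Run a short adaptive test: select, administer, update, repeat."""
    used = set()
    for _ in range(n_items):
        i = next_item(theta, bank, used)
        used.add(i)
        theta = update_theta(theta, bank[i], answer_fn(i))
    return theta
```

For example, simulating a candidate who answers correctly whenever the item difficulty is below about 0.8 logits, the loop converges on an ability estimate between those harder and easier items, which is the behavior the slide's CORRECT/INCORRECT zigzag depicts.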
  • Initial Design Decisions • The listening section was developed last. Listening ability is defined in terms of understanding a range of literal, intended & implied meanings. This section consists of 3 tasks.
  • The Language Knowledge Section Assessing grammatical & pragmatic knowledge
  • The Language Knowledge Section Purpose: to measure the grammatical & pragmatic knowledge resources that examinees need to communicate at six levels of CEFR proficiency. Information needed: • How the structures & associated meanings in the OUP course books relate to the scaled descriptors of grammatical knowledge in the CEFR. Could we get this from the CEFR itself? • How grammatical knowledge is defined & operationalized at the 6 CEFR levels. Could we get this from the CEFR itself?
  • The Grammar Section Problems: Limitations of the CEFR .
    • The CEFR claims to provide a comprehensive description of:
    • (1) What language learners have to learn to do in order to use a language for communication (can-do statements);
    • (2) What knowledge & skills they have to develop to be able to act effectively (COE, 2001, p. 1)
    In other words… • The CEFR describes what learners might be expected to do with the language at these levels, but it does not specify what learners need to know to perform at these different levels!
    • Grammar descriptors in the CEFR
    • OUP grammar curricula (language foci, topics, situations, text types)
    • Pedagogical English grammars
    • SLA research (e.g., morpheme studies, F&M research, etc.)
    • Models of language ability (e.g., Canale & Swain, 1980; Bachman & Palmer, 1996; Purpura, 2004, etc.).
    Sources of Information for Defining Grammatical Knowledge on the OOPT
  • The Oxford Curricula (aligned with CEFR) Grammar covered at A1/A2 level • countable & uncountable nouns • adjectives & adverbs of frequency • how much/many & other quantifiers • present & past simple/present perfect • present continuous, going to future • can/can’t , like + ing, would like to • some/any; there is/are/was/were • subject & object pronouns • possessive adjectives/pronouns • demonstrative adjectives/pronouns • prepositions of time & place • comparative/superlative adj. • Saxon genitive • telling the time Vocabulary & Topic Areas • classroom expressions • personal information • daily routines, days of the week • family, holidays • shopping, food & drink Instances of Language Use • greetings (formal/informal) • dinner invitations • making small talk • past experiences • making reservations
  • Grammar from the CEFR B1 > B2 Intermediate [Independent user-Vantage] • Uses accurately & appropriately vocabulary adequate for understanding and responding to a range of familiar & some unfamiliar topics & can identify appropriate lexical items. • Can handle with reasonable accuracy simple & some more complex grammatical forms. Can produce longer relevant utterances in spoken & written discourse using a variety of cohesive devices. • Can read & understand different kinds of texts. Can identify overall meaning, specific detail, attitudes & opinions & can understand text structure. • Can understand standard spoken language on both familiar & unfamiliar topics relating to personal, social, academic life or work. • Can communicate in most situations, ask & respond to questions appropriately & participate in interactions without resorting to repair strategies.
  • SLA Research—Developmental Orders
  • SLA Research—Developmental Sequences
  • SLA Research—Developmental Sequences
  • We decided to use Purpura’s (2004) model of language ability as a basis for test development at 6 CEFR levels & as a basis for validation. Models of Communicative Language Ability as an Organizing Framework. Models of language ability: Canale & Swain, 1980; Bachman, 1990; Bachman & Palmer, 1996; Chapelle, 1998; Purpura, 2004, etc.
  • Forms Meanings Grammatical Knowledge Based on Jaszczolt (2002) & Wedgwood (2005) The model assumes that grammatical forms are the primary resource for conveying a range of literal & intended meanings. What is Language Knowledge? Knowledge of forms at the sentence & discourse levels Logical representation of semantic meanings conveyed by grammatical forms Use in Context Pragmatic Meanings Production of implicatures, irony, stance, affect, identity, power, metaphor, politeness, etc. This serves to convey a range of implied ( pragmatic) meanings in context. And this determines overall communicative effectiveness.
  • Model of CLA Graphological or Phonological Meanings Graphological or Phonological Forms Lexical Meanings Lexical Forms Morphosyntactic Meanings Morphosyntactic Forms Cohesive Meanings Cohesive Forms Info. Management Meanings Info. Management Forms Interactional Meanings Interactional Forms Sentence Level Discourse Level Language Ability Grammatical Knowledge Grammatical Form Grammatical Meaning
  • The OOPT Language Knowledge Section • Item distractors (especially in task 1) were written to reflect partial knowledge of the form and/or meaning being tested, when possible. • Specs for each task were drawn up, one for each CEFR level tested (yes, 24!). Items were designed “theoretically” to target the different levels. • Designed to measure grammatical & pragmatic knowledge. Tasks Task 1: Testing knowledge of forms & meanings 10 items (out of 300) Task 2: Testing meanings (literal & intended) 10 items (out of 142) Task 3: Testing forms & meanings 10 items (out of 91) Task 4: Testing pragmatic meanings (implied) To be determined
  • Task 1: Assessing Forms & Meanings
  • Task 1: Example Item
    • A2 Level
    ID: 0191 Select a word or phrase to complete the conversation shown below. Woman: Sorry I didn’t make it to the party. Man: What happened? Woman: Oh, I _____ really ill so I couldn’t go. a. felt b. feel c. feels d. feeling
    • Item Coding: Item 0191, A2 (at least 3 coders)
    Item ID: 0191 | Hypothesized proficiency level: A2 | Possible key: felt (a) | Number of turns: 3 | Response dependency: same turn
    Language Knowledge: Grammatical knowledge [X]
    Grammatical Form: Morphosyntactic (simple past tense) [X] (other form/meaning codes: lexical, cohesive, info. management)
    Grammatical knowledge coded as: Form [X] (vs. Meaning, or Form & Meaning)
    Item Focus: Tense & aspect; other verb forms (past part) [X] (other possible foci: adjectives & adjective phrases; adverbs & adverbials; comparisons; complements (DO + IO) & complementation; conditionals; focus & emphasis (cleft); formulaic expressions; lexis; logical connectors & conjunctions; modals and phrasal modals (have to); non-referential it and there; nouns & noun phrases; passive voice; phrasal verbs; prepositions & prepositional phrases; pronouns & reference; questions (wh-, y/n, tags) & answers; relative clauses)
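The coding grid for item 0191 can be captured as a simple record. The field names and the `summarize` helper below are hypothetical conveniences for illustration; the coded values themselves come from the slide.

```python
# Hypothetical machine-readable form of the item-coding grid.
# Field names are invented for the example; values follow the slide.
item_0191 = {
    "item_id": "0191",
    "hypothesized_cefr_level": "A2",
    "key": "felt",                      # option (a)
    "number_of_turns": 3,
    "response_dependency": "same turn",
    "language_knowledge": "grammatical",
    "grammatical_knowledge": "form",    # the F / M / F&M coding
    "form_type": "morphosyntactic",     # per Purpura (2004)
    "form_detail": "simple past tense",
    "item_focus": "tense & aspect",
}

def summarize(item):
    """One-line summary of a coded item, for coder review sheets."""
    return (f"Item {item['item_id']} ({item['hypothesized_cefr_level']}): "
            f"{item['grammatical_knowledge']} / {item['form_type']} / "
            f"{item['item_focus']}")
```

Storing the codings this way makes it straightforward to tabulate agreement across the three coders or to cross-check hypothesized levels against Rasch difficulty estimates later.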
  • Item Foci for Grammar--abbreviated
    • What is the nature of grammatical knowledge at each of the six proficiency levels targeted in the OOPT?
    • Does a dichotomous scoring model or a partial credit scoring model of item difficulty provide better information with respect to the scaled CEFR levels?
    Validation: Some Research Questions
    • 22,136 examinees completed Task 1 as part of the OOPT.
    Data Collection
    • Examinees were given items targeting their school-based CEFR level, along with items immediately above & below (counterbalanced single group design).
    • Data from 217 (out of 300) items from Task 1 have been obtained from the online test.
  • Data Analysis: Dichotomous Data
    • In the construction of each item, a hypothesized CEFR level was assigned. These codings were made independently by four researchers, then compared, and differences were adjudicated.
    • The dichotomously scored data for the Task 1 items were modeled with the Rasch model.
    • Item difficulties were calculated.
    • Item codings for the item difficulties were also determined based on the correspondence between the test items & teachers’ intuitions (‘can do’ statements) about their students’ CEFR levels (Pollitt, 2009); these are the Rasch-based codings.
    • The Rasch-based CEFR item levels were then compared with the original CEFR levels hypothesized by the researchers.
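    The kind of Rasch-based difficulty estimation described above can be sketched minimally with the PROX (normal-approximation) method; this is an illustrative stand-in, not the estimation routine actually used for the OOPT, and the function name and toy data are invented for the example.

    ```python
    import math

    def prox_item_difficulties(responses):
        """PROX (normal-approximation) Rasch estimates of item difficulty from a
        persons-by-items matrix of dichotomous (0/1) scores. Returns logit
        difficulties centred at 0; harder items get higher logits."""
        n_persons = len(responses)
        n_items = len(responses[0])
        difficulties = []
        for j in range(n_items):
            p = sum(row[j] for row in responses) / n_persons   # proportion correct
            p = min(max(p, 1e-6), 1 - 1e-6)                    # guard against 0 or 1
            difficulties.append(math.log((1 - p) / p))         # logit of the failure rate
        mean_d = sum(difficulties) / n_items
        return [d - mean_d for d in difficulties]              # centre the scale at 0

    # Toy data: 5 test-takers x 3 items (item 1 easiest, item 3 hardest)
    data = [
        [1, 1, 0],
        [1, 0, 0],
        [1, 1, 1],
        [0, 0, 0],
        [1, 1, 0],
    ]
    print(prox_item_difficulties(data))
    ```

    Items answered correctly by fewer test-takers come out with higher logit difficulties; this logit scale is the one on which the Rasch-based item levels were compared with the hypothesized CEFR levels.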
  • Item Descriptors for Grammatical Form with the Item Focus ‘Tense & Aspect’ Within the A2 CEFR Level for Task 1 (ordered from easiest to hardest)

    A2 descriptor                                                         | Key                     | Item Difficulty | Hypoth. CEFR
    Future progressive                                                    | we'll be packing        | 24.74           | B2
    Negation with “do” aux                                                | I don't                 | 25.51           | A2
    "will" used with offers                                               | I’ll take               | 28.64           | B1
    Future time in question form                                          | will you do             | 32.46           | B1
    Future tense                                                          | I’ll do/I’m going to do | 33.50           | B2
    Present continuous to express future time                             | am going                | 34.24           | A1
    Present tense of verb "go"                                            | go                      | 34.28           | A1
    Simple past                                                           | we went                 | 34.36           | A2
    Present continuous in fixed expression (“Who shall I say is calling?”)| is calling              | 35.38           | B1
    Simple past                                                           | felt                    | 35.45           | A2
    Simple past in interrogative                                          | did                     | 37.18           | A2
  • Example: Dichotomous Scoring (A2 Level item, hypothesized at B2)
    ID: 0104 Select a word or phrase to complete the conversation shown below.
    Woman: What are you thinking about?
    Man: I was just thinking-- this time next week _____ our suitcases to go on holiday.
    a. we’ve packed (0)  b. we’d be packing (0)  c. we’ll be packing (1)  d. we’ve been packing (0)
  • Observations
    • Many items scaled at the hypothesized levels, but some Rasch-based item difficulties did not correspond to what we felt we knew about the item types from our teaching experience or from SLA research.
    • There was a limitation in using MC questions scored dichotomously as a basis for item difficulty, since difficulty can be a function of the stem, the key, AND the distractors on the one hand, and the ability level of the test-takers on the other.
    • So we modeled the data with a partial-credit model. In fact, from the beginning, item writers had been instructed, whenever possible, to write items based on levels of interlanguage achievement.
    • While the PC analyses involved more work (coding), we thought they might provide a better representation of grammatical knowledge.
  • Our Next Steps: Partial Credit Scoring
    • For each of the 217 Task 1 items, we coded each distractor (from 0 to 2) based on how much development we felt it represented in relation to the key (given a score of 3). The coding drew on three sets of information:
        • The average logit measures for the TTs who chose that particular distractor
        • Frequencies of each option
        • Substantive reasoning
    • The Rasch-based item difficulties were then recalculated & item characteristic curves, outfit stats., & step calibrations were examined for data-model fit.
    • When there was a mismatch, the data were reexamined & recoded if substantively justifiable.
  • Example: Partial Credit Scoring (C2 Level)
    ID: 0034 Select a word or phrase to complete the conversation shown below.
    Woman: It’s time I _____ a new car.
    Man: Why? What’s wrong with the one you’ve got?
    a. buy (2)  b. bought (3)  c. should buy (1)  d. had bought (0)
  • Example: Partial Credit Scoring Item ID 0034
  • Example: Partial Credit Scoring (A1 Level)
    ID: 0008 Select a word or phrase to complete the conversation shown below.
    Man: What do you usually _____ in the evenings?
    Woman: I like to watch television.
    a. do (3)  b. look (1)  c. have (0)  d. make (1)
    Mean logit measures of the test-takers choosing each option: 12.85, 16.30, 18.48, 40.15 (the ??? on the slide flags the anomalously high value)
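    The option-to-score assignments shown in the two example items above (IDs 0034 and 0008) can be sketched as a simple scoring function; the function and its dichotomous branch are illustrative, with the dichotomous branch recovering standard 1/0 key scoring for comparison.

    ```python
    # Option-to-score maps taken from the example slides: the key gets 3,
    # distractors 0-2 according to the development they reflect.
    option_scores = {
        "0034": {"a": 2, "b": 3, "c": 1, "d": 0},   # C2 item: key "bought"
        "0008": {"a": 3, "b": 1, "c": 0, "d": 1},   # A1 item: key "do"
    }

    def score_response(item_id, choice, partial_credit=True):
        """Score one multiple-choice response with partial credit (0-3)
        or dichotomously (1 for the key, 0 for any distractor)."""
        scores = option_scores[item_id]
        if partial_credit:
            return scores[choice]
        key = max(scores, key=scores.get)   # the key carries the top score category
        return 1 if choice == key else 0

    # A near-miss distractor earns credit under PC scoring but not dichotomously:
    print(score_response("0034", "a", partial_credit=True))    # 2
    print(score_response("0034", "a", partial_credit=False))   # 0
    ```

    This is the sense in which partial credit preserves information that dichotomous scoring throws away: two test-takers who both miss the key can still be distinguished by which distractor they chose.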
  • Dichotomous vs. Partial Credit Item Difficulty: dichotomous scoring vs. partial credit scoring (comparison charts)
  • What did we learn from this?
    • 1) The items do measure a large range of grammatical forms and meanings, as evidenced by our content analysis (i.e., item focus).
    • 2) The items were scalable, whether scored dichotomously or with partial credit.
    • 3) In terms of the items & distractors, there was an empirical basis for scoring some items dichotomously; for other items, there was an empirical basis for scoring them using a variety of partial credit score assignments.
  • What did we learn from this?
    • 4) The partial credit model seems to better represent the fact that many grammatical forms & meanings do not sit at one discrete level. Item difficulty, viewed as a series of somewhat overlapping ranges across the corresponding CEFR levels, is compelling in its ability to show development along a continuum rather than being fixed to one CEFR level.
    • 5) Even though we found good data-model fit for the test when scored using partial credit, the score values that we assigned were somewhat arbitrary. As such, a partial credit model using a more fine-grained scoring method (i.e., one with more score categories) may offer a more precise representation of development within an item.
  • Test taker performance #1
  • Task 2: Assessing Literal, Intended & Implied Meanings
  • Model of CLA
    Language Ability > Grammatical Knowledge: Grammatical Form & Grammatical Meaning
    Sentence level: graphological or phonological forms & meanings; lexical forms & meanings; morphosyntactic forms & meanings
    Discourse level: cohesive forms & meanings; info. management forms & meanings; interactional forms & meanings
  • Task 2: Example Item Literal meaning of the elliptical form “not really.”
  • Task 2: Classical Item Analyses
    • The percentage of students who answered correctly (facility)
    • The percentage of students who answered A & C
    • The percentage of students in the bottom-scoring and the top-scoring group who selected each option
    • The discrimination index: top group minus bottom group, so .96 - .63 = .33
    • The point biserial gives an overall index of discrimination
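    The classical statistics listed on this slide can be computed as follows. This is a sketch under stated assumptions: the 27% top/bottom grouping convention and the toy data are illustrative choices, not figures from the OOPT analyses.

    ```python
    def item_statistics(scores, totals, group_frac=0.27):
        """Classical item analysis for one dichotomously scored item.
        scores: 0/1 item scores; totals: each test-taker's total test score.
        Returns (facility, discrimination index, point biserial)."""
        n = len(scores)
        facility = sum(scores) / n                     # proportion answering correctly
        # Discrimination index: p(top group) - p(bottom group), grouped by total score
        order = sorted(range(n), key=lambda i: totals[i])
        k = max(1, int(n * group_frac))                # assumed 27% group convention
        bottom, top = order[:k], order[-k:]
        discrimination = (sum(scores[i] for i in top) - sum(scores[i] for i in bottom)) / k
        # Point biserial: correlation of the 0/1 item score with the total score
        mean_t = sum(totals) / n
        sd_t = (sum((t - mean_t) ** 2 for t in totals) / n) ** 0.5
        if facility in (0.0, 1.0) or sd_t == 0:
            return facility, discrimination, 0.0
        mean_correct = sum(t for s, t in zip(scores, totals) if s == 1) / sum(scores)
        r_pbis = (mean_correct - mean_t) / sd_t * (facility / (1 - facility)) ** 0.5
        return facility, discrimination, r_pbis

    # Toy data: 10 test-takers
    scores = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
    totals = [38, 35, 12, 30, 15, 10, 33, 29, 14, 31]
    print(item_statistics(scores, totals))
    ```

    A well-functioning item shows a clearly positive discrimination index and point biserial, meaning higher-scoring test-takers are more likely to answer it correctly.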
  • Task 3: Assessing Forms & Meanings
  • Model of CLA (same model as shown earlier)
  • Task 3 Item
  • Task 4: Assessing Pragmatic (Implied) Meanings
  • Model of CLA (same model as shown earlier)
  • Situation: A conversation between a customer & waiter at a restaurant about the menu. Task 4 Item: CEFR Level A1 (contextual, sociolinguistic & psychological meanings). Administered to all levels of the Community English Program.
  • Situation: A conversation between two close friends. Task 4: CEFR Level B1 (contextual, sociolinguistic & psychological meanings). Circle the best answer.
    • An Abbreviated Coding chart
    All items were designed to measure contextual & sociolinguistic meanings. Some were also designed to measure psychological meanings and/or sociocultural meanings.
    Item ID: 001 | Proficiency level: B2 | Situation: restaurant, allergic to nuts
    Pragmatic knowledge coded (X): contextual, sociolinguistic (sociocultural & psychological meanings not marked for this item)
  • The Listening Section
  • Listening Section Purpose:
    • To measure a range of literal, intended, and implied (pragmatic) meanings. More specifically…
    • To measure literal meanings with endophoric reference (“to understand the lines,” i.e., literal paraphrase): the answer is explicitly stated in the text.
  • Listening Section
    • To measure implied meanings with endophoric reference (“to understand between the lines”): the answer is not stated explicitly but is implied by what is stated in the text.
    • To measure implied meanings with exophoric reference (“to understand beyond the lines”): the answer is neither explicitly nor implicitly stated in the text; rather, it must be inferred by drawing on knowledge from beyond the text.
  • Listening for Literal Meaning (Endophoric Ref.) “I think I’ll just have a coffee” = “The woman wants some coffee.”
  • Listening for Implied Meaning (Endo. Ref.) By asking for the building location, she is implying that she wants to go there. The man understands her implied meaning, & gives her the address of the art gallery. The meaning is implicitly stated in the text (i.e., endophoric).
  • Listening for Implied Meaning (Exophoric Ref.) Nothing in the text explicitly states that the woman is addressing a group of students. This can be surmised, however, from the type of activities & locations the woman describes and from the shared knowledge that students are often required to take tests when they arrive at a new school. The woman’s implied audience is beyond what is stated in the text (i.e., exophoric reference). Therefore, the focus of the item is on implied contextual or sociocultural meaning rather than literal meaning, and the reference to this information is exophoric because the meaning can only be derived from outside the spoken text.
  • Questions?