Me? Write an exam?  How to write a communicative exam   JoAnn Miller, Macmillan Publishers [email_address]   www.efltasks.net
Written Oral Projects Portfolio What kinds of evaluations do you do?
Alternative Testing Not paper and pencil. Continuous, not one-off. Definition: “the ongoing process involving the student and teacher in making judgments about the student’s progress in language using non-conventional strategies”; “techniques that can be used within the context of instruction and can be easily incorporated into the daily activities of the school or classroom.” The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
Portfolios a collection of student production over time  shows the stages in the learning process  and the stages of the student’s growth. The Reflective Portfolio: Two Case Studies from the United Arab Emirates,  Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
Portfolios? more subjective acceptance  physical limits
Why written tests? Not the only way and maybe not the fairest. But easiest with large numbers of students More objective Accepted by institutions, parents, students
Exam Banks A collection of exams for classroom use maintained by the institution itself. Written by the teachers themselves or a special committee Following institutional guidelines Could be various “cycles” covering the same material
The benefits of an exam bank Less work for teachers More standardization in large one-campus schools and in multi-campus schools Criteria, instructions and grading  Face validity Format Uniform length
The basic questions Why test? How often? When? What? How?
Students need to self-evaluate. Institutions require student evaluations. Students need to be evaluated to see if they are ready for the next level. Teachers need to evaluate how they have taught. Why do YOU test?
Once a semester…too little Once a week…too much Depends on students and institution Testing takes time from language exposure in a communicative context How often do YOU test? Why? Do you like it?
Why written tests? Not the only way and maybe not the fairest. But easiest with large numbers of students More objective Better? Portfolio: Collection of all the student has done. But…it’s more subjective
First, the situation Only your students or exam bank? Criteria? Relative weight of grammar, function, vocabulary, reading, listening, writing, etc.? Institutional criteria? Students. Age? Interests? Who will give it? Correct it? What about your situation? How do you decide the relative weight of grammar, function, vocabulary, etc.? Who are your students? Who will give the exam? Correct it?
Students Younger students More images Shorter exams Humor Older students Professionalism Humor
Teachers Level of English Lack of mathematical skills Time factor Ease of grading Answer key Understand why
Factors: Students (younger / shorter) Teachers (mathematical skills) Total points in course Minimum?  Maximum? How many points would you like on your exams?
How many points for each skill? If institution tells you, just follow through If not, text (the common denominator) General text analysis How much time is spent on each skill Count exercises in a few units, determine percent Keep institutional goals in mind
Discussion What are your students like? What are your teachers like? What is the purpose of your English programs?
Content Validity Assessment should be based on a  content-analysis  of the text being used
Content Analysis You must test only material students have seen The only common denominator is the textbook Analysis of percent of time spent on each topic (grammar structure, vocabulary item, function, etc.)
Content Analysis: Information from the contents
Topic                                     Tally          Count   %     Points
Functions (10 points)
  Talking about imitation products        ///// ///      8       25%   2.5 pts
  Talking about food and food festivals   ///// /////    10      31%   3 pts
  Discussing the movie industry           ///// ////     9       28%   3 pts
  Making a business plan                  /////          5       16%   1.5 pts
  Total                                                  32
Grammar (5 points)
  Nouns in groups                         ///            3       37%   2 pts
  Indefinite pronouns                     /////          5       63%   3 pts
  Total                                                  8
Vocabulary (10 points)
  Food                                    /////          5       63%   6 pts
  Business language                       ///            3       37%   4 pts
  Total                                                  8
Worked example: 8 / 32 = 25%; 25% x 10 = 2.5 pts
 
(Attitude. Macmillan, 2000.)
What information can you get out of this overview?
Content Analysis: Information from overview
Skill                                          Sections in Unit
Functions (10 points)
  Talking about similarities and differences
  Talking about preferences
  Discussing opinions
  Talking about past events
Grammar (10 points)
  comparative adjectives
  comparatives and superlatives
  Superlative adjectives
Vocabulary (10 points)
  sports
  feelings
Categorization of activities
Skill                                          Sections in Unit
Functions (10 points)
  Talking about similarities and differences   1 1
  Talking about preferences                    2 2 2
  Discussing opinions                          3 3
  Talking about past events                    4 4 4 4
Grammar (10 points)
  comparative adjectives                       1 1 1 1 w1 w1
  comparatives and superlatives                2 2 w2 w2 w3
  Superlative adjectives                       3 3 w2
Vocabulary (10 points)
  sports                                       1 2 2 w1 w1
  feelings                                     2 2
Analysis (now you work) Count number of entries in each row (A)
Skill                                          Sections in Unit   Count
Functions (10 points)
  Talking about similarities and differences   1 1                2
Get total number of entries per skill aspect (B)
Skill                                          Sections in Unit   Count
Functions (10 points)
  Talking about similarities and differences   1 1                2
  Talking about preferences                    2 2 2              3
  Discussing opinions                          3 3                2
  Talking about past events                    4 4 4 4            4
  Total                                                           11
Divide A by B to get percent of A (C%)
Skill                                          Sections in Unit   Count   %
Functions (10 points)
  Talking about similarities and differences   1 1                2       18%
  Talking about preferences                    2 2 2              3
  Discussing opinions                          3 3                2
  Talking about past events                    4 4 4 4            4
  Total                                                           11
Multiply C% by the number of points for that skill / aspect (D) = the number of points per topic
Skill                                          Sections in Unit   Count   %     Points
Functions (10 points)
  Talking about similarities and differences   1 1                2       18%   1.8
  Talking about preferences                    2 2 2              3
  Discussing opinions                          3 3                2
  Talking about past events                    4 4 4 4            4
  Total                                                           11
Skill                                          Sections in Unit   Count   %     Points
Functions (10 points)
  Talking about similarities and differences   1 1                2       18%   1.8
  Talking about preferences                    2 2 2              3       27%   2.7
  Discussing opinions                          3 3                2       18%   1.8
  Talking about past events                    4 4 4 4            4       36%   3.6
  Total                                                           11
Grammar (10 points)
  comparative adjectives                       1 1 1 1 w1 w1      6       43%   4.3
  comparatives and superlatives                2 2 w2 w2 w3       5       36%   3.6
  Superlative adjectives                       3 3 w2             3       21%   2.1
  Total                                                           14
Vocabulary (10 points)
  sports                                       1 2 2 w1 w1        5       71%   7.1
  feelings                                     2 2                2       29%   2.9
  Total                                                           7
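The count, percent and points steps can be sketched in a few lines of Python for anyone who prefers not to do the arithmetic by hand. This is only an illustration: the function name points_per_topic and the dictionary layout are assumptions made for this example, not part of the presentation. It multiplies the unrounded share by the points for the skill, which for this unit gives the same figures as multiplying the rounded percent.

```python
def points_per_topic(counts, skill_points):
    """Return {topic: (count, percent, points)} for one skill or aspect.

    counts maps each topic to the number of entries found for it (A).
    skill_points is the number of exam points assigned to the whole skill.
    """
    total = sum(counts.values())              # B: all entries for the skill
    if total == 0:
        raise ValueError("no entries counted for this skill")
    table = {}
    for topic, count in counts.items():
        share = count / total                 # C: A divided by B
        percent = round(share * 100)
        points = round(share * skill_points, 1)   # D: share of the skill's points
        table[topic] = (count, percent, points)
    return table


# Counts taken from the Functions rows of the table (10 points in total).
functions = {
    "Talking about similarities and differences": 2,
    "Talking about preferences": 3,
    "Discussing opinions": 2,
    "Talking about past events": 4,
}

for topic, (count, percent, points) in points_per_topic(functions, 10).items():
    print(f"{topic:45} {count:2} {percent:3}% {points:4}")
```

The same call works for Grammar and Vocabulary by passing their own counts and point totals.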
Communicative testing We teach “communicatively” but we test “traditionally”. What  IS  communicative testing? Communicative testing means testing  in context .
Grammar? What will you test?
Which version? Why?
Circle the correct answer.
1. Do you like __________? a. swimming b. to swum c. swim
2. Where ________ live? a. does she b. she does c. she
3. I _________ speak French. a. no speak b. doesn’t c. don’t
4. What __________? a. does he do b. does he c. he does do
Write the correct forms of the words in parentheses.
Alice: Where (1)______ you _________ (live)?
Bart: Acapulco.
Alice: My brother (2)___________ (go) there every summer on vacation, but he (3)_________ (not speak) Spanish.
Bart: Acapulco (4)_________ (attract) tourists from all over the world. Many people there (5)___________ (speak) English very well. What about you, (6)______ you ________ (speak) Spanish?
Alice: A little.
Vocabulary? What will you test?
Which version? Why?
Match the letters (a to e) with the numbers (1 to 5).
1. Your mother’s husband is your___.
2. Your mother’s father is your___.
3. Your mother’s brother is your___.
4. Your uncle’s son is your___.
5. Your father’s sister is your___.
a. uncle b. cousin c. aunt d. father e. grandfather
Underline the word in each pair that completes the conversation correctly.
My (1) [uncle / aunt] likes (2) [playing / going to] movies. He is my father’s (3) [sister / brother]. He’s (4) [heavy / average] and he has (5) [blue / brown] hair. His birthday is on October (6) [twelve / twelfth].
Functions? What will you test?
What is a function? The communicative purpose of the users of the language. How language is used. Usually expressed as gerunds:  introducing, apologizing, asking directions, requesting
Examples of a Functional Cycle Function:  Requesting (1)  Open the window, please. (2)  Would you open the window? (3)  Could you please open the window? (4)  Would you mind opening the window? (5)  I was wondering if you would mind opening the window. (6)  I’d be grateful if you opened the window. Each time the difference in register is emphasized.
How to test? Complete the conversation.
Complete the conversation logically. Use the words in parentheses.
Miriam: Tell me about your new apartment.
Mary: (1)____________________ (big / living room).
Miriam: (2)___________________ (how / bedrooms)?
Mary: There are two, but (3)________ (any furniture) in one of them.
Or:
Miriam: Tell me about your new apartment.
Mary: (1)_______________________ (living room).
Or:
Miriam: Tell me about your new apartment.
Mary: (1)________________________________.
What is an exam section? A certain number of items testing the same skill / aspect To be communicative, they should be written as a conversation, note, letter, or some “real” type of discourse Isolated sentences are difficult to contextualize and don’t represent real communication
Determining sections You can combine point values within the same aspect. You can divide point values between two sections You shouldn’t combine point values across aspects / skills…the testing methods are different.
Combining point values within the same aspect.
Vocabulary (8 points)
  Family relationships    2.5
  Describing people       0.5
  say tell ask            1
  Phrasal verbs           2
  Everyday expressions    2
Dividing point values between two sections
Vocabulary (8 points)
  Family relationships    2.5
  Describing people       0.5
  say tell ask            1
  Phrasal verbs           2
  Everyday expressions    2
  1.5   1
Accuracy and Fluency Fluency (Communication) The ability to produce written and / or spoken language with ease Communicate ideas effectively Accuracy Ability to produce grammatically correct sentences
Production and Recognition Production Student writes more than one word Can be creative / involves more “mental” work More than one answer may be possible Recognition Student recognizes correct answer Not creative Only one correct answer
Objective and Subjective Sections Subjective There is more than one possible answer Corrector must be trained and experienced There can be surprises Students can protest grading Objective There is only one answer Anyone can correct the exam No surprises No argument from students
So what does this all mean?
Accuracy        Accuracy        Accuracy        Accuracy      Accuracy / Fluency   Fluency      Fluency
Prod. / Recog.  Prod. / Recog.  Prod. / Recog.  Recognition   Recognition          Production   Production
Subj. / Obj.    Subj. / Obj.    Objective       Objective     Objective            Subjective   Subjective
Point Values Give more points to… Production items Give fewer points to… Recognition items Give partial credit in Fluency / Production sections Use fractions only if users are mathematical
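Where fractions are not welcome, the points from a content analysis can be turned into whole numbers that still add up to the total for the skill. The sketch below uses largest-remainder rounding, which is one possible technique chosen for this illustration and not something the presentation prescribes; the function name whole_points is likewise hypothetical.

```python
def whole_points(counts, total_points):
    """Split total_points (an integer) across topics in proportion to counts.

    Each topic first gets the whole-number part of its share; the points
    left over go to the topics with the largest remainders, so the result
    always sums to total_points.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no entries counted")
    base = [c * total_points // total for c in counts]
    remainders = [c * total_points % total for c in counts]
    leftover = total_points - sum(base)
    # Indices ordered by remainder, largest first; ties keep their original order.
    by_remainder = sorted(range(len(counts)), key=lambda i: remainders[i], reverse=True)
    for i in by_remainder[:leftover]:
        base[i] += 1
    return base


# Counts 2, 3, 2 and 4 out of 11 entries, with 10 points for the skill.
print(whole_points([2, 3, 2, 4], 10))  # [2, 3, 2, 3]
```

The fractional shares 1.8, 2.7, 1.8 and 3.6 become 2, 3, 2 and 3, which still total 10.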
Correcting communicative sections
Correcting
Grammar, Reading, Vocabulary, Listening: In general these sections are all right or all wrong. We rarely give partial credit. These sections test accuracy.
Communicative sections: You can give partial credit. These sections test fluency. Ask yourself if the student’s answer communicates what the student wants to say.
Examples of partial credit Correct answer: What’s your name? Student writes:  What you name? Correct answer:  If you invited me, I’d go. Student writes:  If you invite me, I go. Correct answer:  I went to the movies yesterday. Student writes:  I go to the movies yesterday.   I go to the movies.
IV. The clerk knows Cleopatra. Caesar asks the clerk about Cleo. Complete the conversation. Use the words in parentheses. (4 points, 0.5 each)
Clerk: Yes, I know her.
Julius: (1) _______________________________________ (work)?
Clerk: (2) ________________________________ (palace downtown).
Julius: (3) _______________________________________ (do)?
Clerk: (4) ___________________________________ (help people).
Julius: (5) ___________________________________ (close friend)?
Clerk: Yes, (6) ____________________________________ (funny).
Julius: (7) _______________________________________ (sports)?
Clerk: Yes, (8) ___________________________________ (tennis).
Actual Student Responses on the worksheet
Communicative Exam Sections Now you are going to write an exam section. Get out: Your content analysis, Your exam plan, Your textbook
Complete the conversation These sections evaluate a student’s ability to communicate ideas if they are corrected for communication and not for accuracy. They can be written with different degrees of cueing.
Testing Reading and Listening Two formats are commonly used: T/ F Multiple Choice
True / False
Advantages: Can test large amounts of content. Students can answer 3-4 questions per minute.
Disadvantages: They are easy. Students have a 50-50 chance of getting the right answer by guessing. It is difficult to discriminate between students who know the material and students who don't. Need a large number of items for high reliability.
Designing Test Questions, Grayson H. Walker Teaching Resource Center, The University of Tennessee at Chattanooga, http://www.utc.edu/Administration/WalkerTeachingResourceCenter/FacultyDevelopment/Assessment/test-questions.html
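To see why the 50-50 guessing chance calls for a large number of items, the probability of reaching a given score by guessing alone can be worked out with the binomial formula. The sketch below is an illustration only: the function name and the choice of a 60% pass mark are assumptions, not figures from the presentation.

```python
from math import comb


def prob_at_least(correct_needed, items, p_guess=0.5):
    """Probability of getting at least correct_needed of items right by pure guessing."""
    return sum(
        comb(items, k) * p_guess**k * (1 - p_guess) ** (items - k)
        for k in range(correct_needed, items + 1)
    )


# Chance of guessing at least 60% of the answers on true/false sections of different lengths.
for items in (5, 10, 20, 40):
    needed = -(-items * 6 // 10)  # 60% of the items, rounded up
    print(items, "items:", round(prob_at_least(needed, items), 3))

# The same 10-item section when each item has three options (for example T / F / Not mentioned).
print("three options:", round(prob_at_least(6, 10, 1 / 3), 3))
```

With 10 true/false items the chance of guessing 6 or more is about 0.38; longer sections and a third option both push that chance down.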
Tips for Writing Good True/False items Avoid double negatives. Avoid long/complex sentences. Use specific determinants with caution:  never, only, all, none, always, could, might, can, may, sometimes, generally, some, few. Use only one central idea in each item; don't emphasize the trivial. Don't lift items straight from the text. Make more false than true (60/40).  (Students are more likely to answer true.)
How to “save” a T/F section… Add a third option “ Not mentioned” OR  Have student correct F answers But only if students have practiced this version in the textbook
Multiple Choice: Parts of a question
Stem: XXXXXXXXXXXXXXXXXX?
Options: YYYYYYYYY / ZZZZZZZZ / AAAAAAAA (the distractors plus the correct answer)
Stems
Don’t include words that do not contribute to the basis for choosing among the options.
The American flag has three colors. One of them is (1) red (2) green (3) black
vs.
One of the colors of the American flag is (1) red (2) green (3) black
Or
If the pressure of a certain amount of gas is held constant, what will happen if its volume is increased?
a. The temperature of the gas will decrease.
*b. The temperature of the gas will increase.
c. The temperature of the gas will remain the same.
Kehoe, Jerard. Writing Multiple-Choice Test Items. ERIC/AE Digest Series EDO-TM-95-3, October 1995. http://www.ericdigests.org/1997-1/test.html
Include as much information in the stem and as little in the options as possible.  California: a. Contains the tallest mountain in the United States b. Has an eagle on its state flag. c. Is the second largest state in terms of area. *d. Was the location of the Gold Rush of 1849.
Avoid irrelevant clues to the correct option. For example, grammatical construction: A word used to describe a noun is called an: *a. Adjective. b. Conjunction. c. Pronoun. d. Verb.
Options (Kehoe)
1. Use three or four options.
2. Construct distractors that are comparable in length, complexity, and grammatical form.
Which of the following would do the most to promote the application of nuclear discoveries to medicine?
a. Trained radioactive therapy specialists.
*b. Developing standardized techniques for treatment of patients.
c. Do not place restrictions on the use of radioactive substances.
d. If the average doctor is trained to apply radioactive treatments.
8. You have just spent ten minutes trying to teach one of your new employees how to change a typewriter ribbon. The employee is still having a great deal of difficulty changing the ribbon, even though you have always found it simple to do. At this point, you should:  a. Tell the employee to ask an experienced employee working nearby to change the ribbon in the future. b. Tell the employee that you never found this difficult, and ask what he or she finds difficult about it. *c. Review each of the steps you have already explained, and determine whether the employee understands them. d. Tell the employee that you will continue teaching him or her later, because you are becoming irritable.
9. Which of the following is the best indication of high morale in a supervisor’s unit? a. The employees are rarely required to work overtime. *b. The employees are willing to give first priority to attaining group objectives, subordinating any personal desires they may have. c. The supervisor enjoys staying late to plan the next day. d. The unit gives expensive birthday presents
Ordering Multiple Choice Items
Numerical: a. 1939 b. 1940 c. 1941 d. 1942
Sequential: a. Heating ice from -100°C to 0°C. b. Melting ice at 0°C. c. Heating water from 0°C to 100°C. d. Evaporating water at 100°C.
Alphabetical: a. Changing a from .01 to .05. b. Decreasing the degrees of freedom. c. Increasing the spread of the exam scores. d. Reducing the size of the treatment effect.
After the options are written, vary the location of the answer randomly.
Burton, Steven J., Richard R. Sudweeks, Paul F. Merrill, Bud Wood. How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty, Brigham Young University Testing Services and The Department of Instructional Science. 1991. http://testing.byu.edu/info/handbooks/betteritems.pdf
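For items whose options have no natural numerical, sequential or alphabetical order, varying the position of the answer can be left to a random shuffle. The following is a minimal sketch; the function name shuffle_options and the sample item (adapted from the flag example on the Stems slide) are chosen for illustration only.

```python
import random


def shuffle_options(options, correct_index, rng=None):
    """Return the options in random order plus the new position of the correct answer."""
    rng = rng or random.Random()
    order = list(range(len(options)))
    rng.shuffle(order)                      # random permutation of the positions
    shuffled = [options[i] for i in order]
    return shuffled, order.index(correct_index)


options = ["red", "green", "black"]         # "red" (index 0) is the correct answer
shuffled, answer_at = shuffle_options(options, 0)
for letter, text in zip("abc", shuffled):
    marker = "*" if shuffled[answer_at] == text else " "
    print(f"{marker}{letter}. {text}")
```

Passing a seeded random.Random as rng makes the order reproducible, which helps when the answer key is generated from the same script.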
 
 
Thank you very much JoAnn Miller [email_address]  /  [email_address]   www.efltasks.net

Me, write an exam?

  • 1.
    Me? Write anexam? How to write a communicative exam JoAnn Miller, Macmillan Publishers [email_address] www.efltasks.net
  • 2.
    Written Oral ProjectsPortfolio What kinds of evaluations do you do?
  • 3.
    Alternative Testing Notpaper and pencil Constant not punctual Definition: “ the ongoing process involving the student and teacher in making judgments about the student’s progress in language using non-conventional strategies “ techniques that can be used within the context of instruction and can be easily incorporated into the daily activities of the school or classroom.” The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
  • 4.
    Portfolios a collectionof student production over time shows the stages in the learning process and the stages of the student’s growth. The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
  • 5.
    Portfolios? more subjectiveacceptance physical limits
  • 6.
    Why written tests?Not the only way and maybe not the fairest. But easiest with large numbers of students More objective Accepted by institutions, parents, students
  • 7.
    Exam Banks Acollection of exams for classroom use maintained by the institution itself. Written by the teachers themselves or a special committee Following institutional guidelines Could be various “cycles” covering the same material
  • 8.
    The benefits ofan exam bank Less work for teachers More standardization in large one-campus schools and in multi-campus schools Criteria, instructions and grading Face validity Format Uniform length
  • 9.
    The basic questionsWhy test? How often? When? What? How?
  • 10.
    Student’s need toself-evaluate Institutions require student evaluations Students’ need to be evaluated to see if they are ready for the next level Teacher’s need to evaluate how they have taught Why do YOU test?
  • 11.
    Once a semester…toolittle Once a week…too much Depends on students and institution Testing takes time from language exposure in a communicative context How often do YOU test? Why? Do you like it?
  • 12.
    Why written tests?Not the only way and maybe not the fairest. But easiest with large numbers of students More objective Better? Portfolio: Collection of all the student has done. But…it’s more subjective
  • 13.
    Exam Banks Acollection of exams for classroom use maintained by the institution itself. Written by the teachers themselves or a special committee Following institutional guidelines Could be various “cycles” covering the same material
  • 14.
    The benefits ofan exam bank Less work for teachers More standardization in large one-campus schools and in multi-campus schools Criteria, instructions and grading Face validity Format Uniform length
  • 15.
    First, the situationOnly your students or exam bank? Criteria? Relative weight of grammar, function, vocabulary, reading, listening, writing, etc.? Institutional criteria? Students. Age? Interests? Who will give it? Correct it? What about your situation? How do you decide the relative weight of grammar, function, vocabulary, etc.? Who are your students? Who will give the exam? Correct it?
  • 16.
    Students Younger studentsMore images Shorter exams Humor Older students Professionalism Humor
  • 17.
    Teachers Level ofEnglish Lack of mathematical skills Time factor Ease of grading Answer key Understand why
  • 18.
    Factors: Students (younger/ shorter) Teachers (mathematical skills) Total points in course Minimum? Maximum? How many points would you like on your exams?
  • 19.
    How many pointsfor each skill? If institution tells you, just follow through If not, text (the common denominator) General text analysis How much time is spent on each skill Count exercises in a few units, determine percent Keep institutional goals in mind
  • 20.
    Discussion What areyour students like? What are your teachers like? What is the purpose of your English programs?
  • 21.
    Content Validity Assessmentshould be based on a content-analysis of the text being used
  • 22.
    Content Analysis Youmust test only material students have seen The only common denominator is the textbook Analysis of percent of time spent on each topic (grammar structure, vocabulary item, function, etc.)
  • 23.
    Content Analysis: Informationfrom the contents Functions (10 points): Talking about imitation products Talking about food and food festivals Discussing the movie industry Making a business plan Grammar (5 points): Nouns in groups Indefinite Pronouns Vocabulary (10 points): Food Business language ///// /// ///// ///// ///// //// ///// /// ///// ///// /// 8 10 9 5 32 3 5 8 5 3 8 8 / 32 =__% 25% 31% 28% 16% 25% X 10 = ___pts 2.5 pts 3 pts 3 pts 1.5 pts 37% 2 pts 63% 3 pts 63% 6 pts 37% 4 pts
  • 24.
  • 25.
  • 26.
    What information canyou get out of this overview?
  • 27.
    Content Analysis: Informationfrom overview Skill Sections in Unit   Functions (10 points)                   Talking about similarites and differences                   Talking about prefrences                   Discussing opinions                   Talking about past events                                       Grammar (10 points)                   comparative adjectives                   comparatives and superlatives                   Superlative adjectives                                       Vocabulary (10 points)                   sports                   feelings                  
  • 28.
    Categorization of activitiesSkill Sections in Unit   Functions (10 points)                   Talking about similarities and differences 1 1               Talking about preferences 2 2 2             Discussing opinions 3 3               Talking about past events 4 4 4 4                               Grammar (10 points)                   comparative adjectives 1 1 1 1 w1 w1       comparatives and superlatives 2 2 w2 w2 w3         Superlative adjectives 3 3 w2                                 Vocabulary (10 points)                   sports 1 2 2 w1 w1         feelings 2 2              
  • 29.
    Analysis (now youwork) Count number of entries in each row (A) Count Functions (10 points)                   Talking about similarities and differences 1 1         2    
  • 30.
    Get total numberof entries per skill aspect (B)     Skill Sections in Unit Count Functions (10 points)                   Talking about similarities and differences 1 1         2     Talking about prefernces 2 2 2       3     Discussing opinions 3 3         2     Talking about past events 4 4 4 4     4                   11    
  • 31.
    Divide A byB to get percent of A (C%) Skill Sections in Unit Count % Functions (10 points)                   Talking about similarities and differences 1 1         2 18%   Talking about prefrences 2 2 2       3     Discussing opinions 3 3         2     Talking about past events 4 4 4 4     4                   11                                            
  • 32.
    Multiply C% bythe number of points for that skill / aspect (D) = the number of points per topic Skill Sections in Unit Count % Points Functions (10 points)                   Talking about similarities and differences 1 1         2 18% 1.8 Talking about prefernces 2 2 2       3     Discussing opinions 3 3         2     Talking about past events 4 4 4 4     4                   11                                            
  • 33.
    Skill Sections inUnit Count % Points Functions (10 points)                   Talking about similarities and differences 1 1         2 18% 1.8 Talking about prefernces 2 2 2       3 27% 2.7 Discussing opinions 3 3         2 18% 1.8 Talking about past events 4 4 4 4     4 36% 3.6               11     Grammar (10 points)                   comparative adjectives 1 1 1 1 w1 w1 6 43% 4.3 comparatives and superlatives 2 2 w2 w2 w3   5 36% 3.6 Superlative adjectives 3 3 w2       3 21% 2.1               14     Vocabulary (10 points)                   sports 1 2 2 w1 w1   5 71% 7.1 feelings 2 2         2 29% 2.9               7                        
  • 34.
    Communicative testing Weteach “communicatively” but we test “traditionally”. What IS communicative testing? Communicative testing means testing in context .
  • 35.
  • 36.
    Which version? Why?Circle the correct answer 1. Do you like __________? swimming b. to swum c. swim 2. Where ________ live? does she b. she does c. she 3. I _________ speak French. no speak b. doesn’t c. don’t 4. What __________? a. does he do b. does he c. he does do Write the correct forms of the words in parentheses. Alice: Where (1)______ you _________ (live)? Bart: Acapulco. Alice: My brother (2)___________ (go) there every summer on vacation, but he (3)_________(not speak) Spanish. Bart: Acapulco (4)_________ (attract) tourists from all over the world. Many people there (5)___________(speak) English very well. What about you, (6)______ you ________(speak) Spanish? Alice: A little.
  • 37.
  • 38.
    Which version? Why?Match the letters (a to e) with the numbers (1 to 5). 1. Your mother’s husband is your___. 2. Your mother’s father is your___. 3. Your mother’s brother is your___. 4. Your uncle’s son is your___. 5. Your father’s sister is your ___. uncle cousin aunt father grandfather Underline the word in each pair that completes the conversation correctly. My (1) [ uncle / aunt ] likes (2) [ playing / going to ] movies. He is my father’s ( 3) [ sister / brother] . He’s (4) [ heavy / average ] and he has (5) [ blue / brown ] hair. His birthday is on October (6) [twelve / twelfth ] .
  • 39.
  • 40.
    What is afunction? The communicative purpose of the users of the language. How language is used. Usually expressed as gerunds: introducing, apologizing, asking directions, requesting
  • 41.
    Examples of aFunctional Cycle Function: Requesting (1) Open the window, please. (2) Would you open the window? (3) Could you please open the window? (4) Would you mind opening the window? (5) I was wondering if you would mind opening the window. (6) I’d be grateful if you opened the window. Each time the difference in register is emphasized.
  • 42.
    How to test? Complete the conversation. Complete the conversation logically. Use the words in parentheses. Miriam: Tell me about your new apartment. Mary: (1)____________________(big / living room). Miriam: (2)___________________(how / bedrooms)? Mary: There are two, but (3)________(any furniture) in one of them. Or: Miriam: Tell me about your new apartment. Mary: (1)_______________________(living room). Or: Miriam: Tell me about your new apartment. Mary: (1)________________________________.
  • 43.
    What is anexam section? A certain number of items testing the same skill / aspect To be communicative, they should be written as a conversation, note, letter, or some “real” type of discourse Isolated sentences are difficult to contextualize and don’t represent real communication
  • 44.
    Determining sections Youcan combine point values within the same aspect. You can divide point values between two sections You shouldn’t combine point values across aspects / skills…the testing methods are different.
  • 45.
    Combining point valueswithin the same aspect. Vocabulary (8 points) Family relationships 2.5 Describing people 0.5 say tell ask 1 Phrasal verbs 2 Every day expressions 2 Dividing point values between two sections Vocabulary (8 points) Family relationships 2.5 Describing people 0.5 say tell ask 1 Phrasal verbs 2 Every day expressions 2 1.5 1
  • 46.
    Accuracy and FluencyFluency (Communication) The ability to produce written and / or spoken language with ease Communicate ideas effectively Accuracy Ability to produce grammatically correct sentences
  • 47.
    Production and RecognitionProduction Student writes more than one word Can be creative / involves more “mental” work More than one answer may be possible Recognition Student recognizes correct answer Not creative Only one correct answer
  • 48.
    Objective and SubjectiveSections Subjective There is more than one possible answer Corrector must be trained and experienced There can be surprises Students can protest grading Objective There is only one answer Anyone can correct the exam No surprises No argument from students
  • 49.
    So what doesthis all mean? Accuracy Accuracy Accuracy Accuracy Accuracy / Fluency Fluency Fluency Prod. /Recog. Prod. /Recog. Prod. /Recog. Recognition Recognition Production Production Subj. / Obj Subj. / Obj Objective Objective Objective Subjective Subjective
  • 50.
    Point Values Give more points to… Production items Give fewer points to… Recognition items Give partial credit in Fluency / Production sections Use fractions only if users are mathematical
  • 51.
  • 52.
    Correcting Grammar, Reading, Vocabulary, Listening In general these sections are all right or all wrong. We rarely give partial credit. These sections test accuracy. Communicative sections You can give partial credit. These sections test fluency. Ask yourself if the S’s answer communicates what the S wants to say.
  • 53.
    Examples of partial credit Correct answer: What’s your name? Student writes: What you name? Correct answer: If you invited me, I’d go. Student writes: If you invite me, I go. Correct answer: I went to the movies yesterday. Student writes: I go to the movies yesterday. I go to the movies.
  • 54.
    IV. The clerk knows Cleopatra. Caesar asks the clerk about Cleo. Complete the conversation. Use the words in parentheses. (4 points, .5 each) Clerk: Yes, I know her. Julius: (1) _______________________________________ (work)? Clerk: (2) ________________________________ (palace downtown). Julius: (3) _______________________________________ (do)? Clerk: (4) ___________________________________ (help people). Julius: (5) ___________________________________ (close friend)? Clerk: Yes, (6) ____________________________________ (funny). Julius: (7) _______________________________________ (sports)? Clerk: Yes, (8) ___________________________________ (tennis). Actual Student Responses on the worksheet
  • 55.
    Communicative Exam Sections Now you are going to write an exam section. Get out Your content analysis Your exam plan Your textbook
  • 56.
    Complete the conversation These sections evaluate a student’s ability to communicate ideas if they are corrected for communication and not for accuracy. They can be written with different degrees of cueing.
  • 57.
    Testing Reading and Listening Two formats are commonly used: T / F Multiple Choice
  • 58.
    True / False Advantages: Can test large amounts of content Students can answer 3-4 questions per minute Disadvantages: They are easy Students have a 50-50 chance of getting the right answer by guessing It is difficult to discriminate between students that know the material and students who don't Need a large number of items for high reliability Designing Test Questions, Grayson H. Walker Teaching Resource Center, The University of Tennessee at Chattanooga, http://www.utc.edu/Administration/WalkerTeachingResourceCenter/FacultyDevelopment/Assessment/test-questions.html
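The two disadvantages about guessing and the number of items are linked, and a short calculation may make the link concrete. The sketch below assumes a student who guesses blindly on every item (a 50-50 chance each time) and a hypothetical pass mark of 70%. Neither the pass mark nor the function name `prob_at_least` comes from the source; they are illustrative assumptions.

```python
from math import comb


def prob_at_least(correct_needed, items, p_guess=0.5):
    """Binomial probability of at least `correct_needed` right answers
    out of `items` when every answer is an independent blind guess."""
    return sum(
        comb(items, k) * p_guess**k * (1 - p_guess) ** (items - k)
        for k in range(correct_needed, items + 1)
    )


# Hypothetical pass mark of 70%, tried on sections of different lengths.
for items in (10, 20, 40):
    needed = (7 * items + 9) // 10  # 70% of the items, rounded up
    chance = prob_at_least(needed, items)
    print(f"{items} items, {needed} needed: {chance:.4f}")
```

Under these assumptions the chance of reaching the pass mark by pure guessing shrinks as the section gets longer, which is one way to read the advice that T / F sections need a large number of items to be reliable.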
  • 59.
    Tips for Writing Good True/False items Avoid double negatives. Avoid long/complex sentences. Use specific determinants with caution: never, only, all, none, always, could, might, can, may, sometimes, generally, some, few. Use only one central idea in each item; don't emphasize the trivial. Don't lift items straight from the text. Make more false than true (60/40). (Students are more likely to answer true.)
  • 60.
    How to “save” a T/F section… Add a third option “Not mentioned” OR Have students correct F answers But only if students have practiced this version in the textbook
  • 61.
    Multiple Choice Parts of question Stem Options Distractors XXXXXXXXXXXXXXXXXX? YYYYYYYYY ZZZZZZZZ AAAAAAAA { { Correct answer
  • 62.
    Stems Don’t include words that do not contribute to the basis for choosing among the options. The American flag has three colors. One of them is (1) red (2) green (3) black vs. One of the colors of the American flag is (1) red (2) green (3) black Or If the pressure of a certain amount of gas is held constant, what will happen if its volume is increased? a. The temperature of the gas will decrease. *b. The temperature of the gas will increase. c. The temperature of the gas will remain the same. Kehoe, Jerard. Writing Multiple-Choice Test Items. ERIC/AE Digest Series EDO-TM-95-3, October 1995. http://www.ericdigests.org/1997-1/test.html
  • 63.
    Include as much information in the stem and as little in the options as possible. California: a. Contains the tallest mountain in the United States. b. Has an eagle on its state flag. c. Is the second largest state in terms of area. *d. Was the location of the Gold Rush of 1849.
  • 64.
    Avoid irrelevant clues to the correct option. For example, grammatical construction: A word used to describe a noun is called an: *a. Adjective. b. Conjunction. c. Pronoun. d. Verb.
  • 65.
    Options (Kehoe) 1. Use three or four options. 2. Construct distractors that are comparable in length, complexity, and grammatical form. Which of the following would do the most to promote the application of nuclear discoveries to medicine? a. Trained radioactive therapy specialists. *b. Developing standardized techniques for treatment of patients. c. Do not place restrictions on the use of radioactive substances. d. If the average doctor is trained to apply radioactive treatments.
  • 66.
    8. You have just spent ten minutes trying to teach one of your new employees how to change a typewriter ribbon. The employee is still having a great deal of difficulty changing the ribbon, even though you have always found it simple to do. At this point, you should: a. Tell the employee to ask an experienced employee working nearby to change the ribbon in the future. b. Tell the employee that you never found this difficult, and ask what he or she finds difficult about it. *c. Review each of the steps you have already explained, and determine whether the employee understands them. d. Tell the employee that you will continue teaching him or her later, because you are becoming irritable.
  • 67.
    9. Which of the following is the best indication of high morale in a supervisor’s unit? a. The employees are rarely required to work overtime. *b. The employees are willing to give first priority to attaining group objectives, subordinating any personal desires they may have. c. The supervisor enjoys staying late to plan the next day. d. The unit gives expensive birthday presents.
  • 68.
    Ordering Multiple Choice Items Numerical a. 1939 b. 1940 c. 1941 d. 1942 Burton, Steven J., Richard R. Sudweeks, Paul F. Merrill, Bud Wood. How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty, Brigham Young University Testing Services and The Department of Instructional Science. 1991. http://testing.byu.edu/info/handbooks/betteritems.pdf Sequential a. Heating ice from -100°C to 0°C. b. Melting ice at 0°C. c. Heating water from 0°C to 100°C. d. Evaporating water at 100°C. Alphabetical a. Changing a from .01 to .05. b. Decreasing the degrees of freedom. c. Increasing the spread of the exam scores. d. Reducing the size of the treatment effect. After the options are written, vary the location of the answer randomly.
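For items whose options have no natural numerical, sequential or alphabetical order, the advice to vary the location of the answer randomly could be sketched as follows. This is only an illustration under stated assumptions: `shuffle_options` is a hypothetical helper, the option texts are assumed to be distinct, and the fixed seed is an arbitrary choice so that the same answer key can be regenerated.

```python
import random


def shuffle_options(correct, distractors, rng):
    """Return (options, index_of_correct) with the correct answer
    placed at a random position among the distractors."""
    options = [correct] + list(distractors)
    rng.shuffle(options)
    return options, options.index(correct)


# Fixed seed (an arbitrary, illustrative value) so the key is reproducible.
rng = random.Random(2024)

# Options taken from the flag item on the Stems slide.
options, answer_index = shuffle_options("red", ["green", "black"], rng)

print("One of the colors of the American flag is")
for position, text in enumerate(options):
    marker = "*" if position == answer_index else " "
    print(f"{marker}({position + 1}) {text}")
```

Letting a random number generator choose the position, rather than the item writer, helps avoid unintentional patterns such as the correct answer landing on the same option number too often.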
  • 69.
  • 70.
  • 71.
    Thank you very much JoAnn Miller [email_address] / [email_address] www.efltasks.net

Editor's Notes

  • #4 There are testing alternatives [click] that aren’t the traditional “paper and pencil” exams. [click] You can evaluate students’ progress constantly, not punctually (at specific times) [click] Definitions from REF 1: The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
  • #5 Reference 1 (REF1) From The Reflective Portfolio: Two Case Studies from the United Arab Emirates, Christine Coombe and Lisa Barlow, Forum Online, http://exchanges.state.gov/forum/vols/vol42/no1/p18.htm
  • #6 Portfolio assessment is very popular in places like the US where classes are small and teachers have office hours and preparation time [click] But portfolios are more subjective: you can’t guarantee that two teachers will grade the same way, and [click] teachers in Mexico have physical limitations: lots of students, lots of groups, little free time to correct [click] Would they be accepted by the SEP, schools, parents and students? Have participants discuss in pairs and then compare responses.
  • #7 Why do we rely on written tests here in Mexico? [click] It isn’t the only way to test and it probably isn’t the fairest…some students don’t do well on written tests [click] But they are the easiest way to test with the large numbers of students we have here [click] They are more objective and can be designed so that results are fairly consistent among teachers at the same institution [click] And they are accepted by institutions (official study plans, schools), parents and the students themselves
  • #8 Only show the title of the slide. Ask teachers how many have exam banks at their schools (If some participants don’t know what exam banks are, ask for a definition). Put students into groups, try to get at least one teacher who has worked with an exam bank in each group. Have them discuss the advantages and disadvantages of the exam banks. When they finish, listen to some of their conclusions. Then continue on with the slide. Exam banks have many advantages for large schools. They are a collection of exams [click] that could be written by the teachers or a special committee [click]. They must be based on institutional guidelines. [click] It is possible to have various cycles of the exams.
  • #9 Benefits: Teachers don’t have to write all their own exams. Instead of writing 5, 10 or more exams a semester, they write one or two. It saves time and energy and improves quality. [click] They are useful in larger schools because they help standardize performance, assuring that all the students who pass from one level to another are ready. The criteria, instructions and grading are uniform. [click] Face validity (an exam looks like the teachers and students expect it to). The format is similar. All exams in the exam bank follow the same rules so there are no surprises. They even look similar. [click] Exams are also about the same length. In reality, exam banks are the most time-efficient, reliable way to test. Each exam can be written with the same criteria, and the results between groups are much more reliable. Various cycles can be created so that cheating is minimized, and each individual teacher works less since, instead of writing an exam for every class they are teaching, they might just write a couple of exams for the exam bank. In pairs: Do they have exam banks? How do they work? If not, would they work?
  • #22 Base the content analysis on the textbook for the reasons we mentioned previously…Slide 13
  • #23 You can’t test students on something they haven’t seen. [click] If you are writing for other teachers, the only thing you have in common is the textbook. If you are only writing for your students and they are absent, the only resource they have is their textbooks. [click] You need to make an analysis of how much time is spent on each aspect you want to test.
  • #25 Here is an example of a final content analysis. If you are using an exam bank, all the teachers should have it. The students should also have it. It can help them study. Look at the Grammar section. Notice that one point of the 15 points on that section of the exam is dedicated to the comparative. If the student knows that only one point is dedicated to that structure, he will know that it is more important to study BE and the simple present (7 points). This can also help the teachers since no one would spend hours teaching all aspects of the comparative if it is not represented with more than one point on the exam. This would be a positive washback effect.
  • #35 Ask participants what they think “communicative testing” refers to. After hearing some possibilities, tell them: The “Communicative Approach” has been in existence since the late 80s. We now accept the ideas of teaching communicatively: group work, teaching language in context, etc, but we continue testing traditionally [click] Ask participants what “traditional testing” is (isolated sentences, transformation, grammar-based) [click] Communicative testing means testing in context…not isolated sentences
  • #36 HO2-- Let’s look at some examples. What will you test? [click] Grammar? Refer participants to HO2. Have them look at the grammar examples and compare them in pairs. What are the differences ? When they finish giving you some differences go to the next slide.
  • #37 [click] The example on the left is traditional. There is no context. It is OK for simple structures like this one, but what about more complex structures such as If clauses or present perfect/past? Here is an example you can give them: If I __________ (be invited) to your party, I __________ (go). What is the answer? If I am invited to your party, I’ll go (If 1). If I were invited to your party, I’d go (If 2). If I had been invited to your party, I would have gone (If 3). Since there is no context, any of them would be correct. [click] In the other example the grammar is presented in context, a conversation. If this had been about If clauses, it might have been: “I’m sorry. I didn’t know she was going to have a party. If I ______________ (be invited), I ___________ (go).” Obviously If 3.
  • #38 What will you test? [click] Vocabulary? Refer participants to the HO2. Have them look at the vocabulary examples and compare them in pairs. What are the differences? When they finish giving you some differences go to the next slide.
  • #39 [click] The example on the left is a traditional vocabulary section. It tests if they learned a vocabulary list, but it doesn’t test if they really know how to use the vocabulary. [click] The example on the right tests many different problems students can have when they work with vocabulary: Difficult pairs: (1) (3) (6) Collocation: (2) (4) (5) The context lets you test more.
  • #40 Ask participants how they could test functions…collect some ideas.
  • #41 Go over the definition of functions just in case someone doesn’t know what they are.
  • #42 Go through the examples. Emphasize how the following aspects become more complex as we go down the list: Grammar structures (easy to more difficult) Length of sentences (short to long) Register (from informal to formal)
  • #43 This is the best type of section to test students’ knowledge of functions. (In HO2) [click] Students complete a conversation (or paragraph) with sentences (or even phrases) that communicate the correct function logically. Go over the first example. Show that there isn’t one correct answer. The first one could be: It has / I have / There is a big living room. Elicit possible answers for # 2 and 3. The section can be even more open, as the other two options show. Elicit possible answers. Have participants compare this with the grammar and vocabulary sections they have seen. What are the differences? Emphasize that here the purpose is communication and that communication can occur even if the students don’t use complete sentences or make some grammar errors. This will be seen later in the workshop when they learn to correct these sections.
  • #46 Have participants look at the example and suggest what kinds of exam sections they could write combining vocabulary themes and then [click] dividing them…. If there is time, in pairs, have them think of conversations, paragraphs, etc. and describe what they would be like to practice these sections. Then have them go back to their content analyses and decide how point values can be combined or divided to reach the number of points they determined previously for their exam (see What to test and how often, slides 11-15). 15 minutes.
  • #48 In production, the student writes more than one word (remind them of Complete the Conversation for testing functions). The student can be more creative and the teacher has to be more alert because more than one answer might be correct. In recognition (for example, multiple choice or true/false), there is only one correct answer and it isn’t creative. Production means more work. Teachers with large numbers of students can’t handle a 50-50 balance. A good exam could be 20-80 or 30-70 (production/recognition) and still be fair. TOEFL and other similar exams are all recognition…they never test whether the student can produce language. Years ago this led to a big influx of Asian students into US universities. They had studied grammar and reading, but no speaking or writing. They did great recognizing correct answers on the TOEFL and got very high scores, but when they arrived in the US, authorities realized they couldn’t say two words in English….The TOEFL exam was revised and now includes writing sections, and oral interviews are often required to study in the US.
  • #49 Sections can also be objective or subjective. It is good to have some subjective sections, but an exam with sections that are all graded subjectively makes it difficult to judge student ability between different teachers…. However, even though they are more difficult to grade, they do give more information about students’ ability. The limitations can be overcome if the graders are trained…. This is a very common problem with oral grading.
  • #52 It’s easy to correct an “a, b, c” section, but how do you correct a “complete the conversation” section so that it really tests communicative ability and competence?
  • #53 Go over the summary
  • #54 These are examples of partial credit. Go over them one-by-one. Ask participants if they would give partial credit or not if they were grading communication… First example: Does it communicate? Would you be able to answer the student’s question? Probably. If it were a C1 student on the first exam, I’d give credit, but write in the corrections. On later exams, I’d probably not be as generous. Second example: The meaning is different. In the correct answer, you didn’t invite me; in the student’s answer you might do so in the future. Although the grammar is correct, it doesn’t communicate and I wouldn’t give any credit. Third example: Click slowly, discussing as you go. The first one communicates and if it were the first time students had worked with the past I’d accept it and just write in the correction. They use “yesterday” to indicate the tense. [click] This could also be accepted if it were in answer to the question “What did you do yesterday?” Correcting these sections, you have to ask “does the answer communicate the idea required by the conversation?” If so, accept it or give partial credit depending on the level of the students.
  • #55 This is the exam section used in the activity (HO7). Go over it and get possible answers before they begin working. Have them work in pairs, grading the students’ work. (15 minutes) Then go over the worksheet, comparing how they graded.
  • #57 Go over this with the participants. It will be the hardest section to write since they have very little experience with communicative exams.
  • #59 Go over with participants. Ask for their opinions. From (REF5) Designing Test Questions, Grayson H. Walker Teaching Resource Center, The University of Tennessee at Chattanooga, http://www.utc.edu/Administration/WalkerTeachingResourceCenter/FacultyDevelopment/Assessment/test-questions.html
  • #60 Go over… From same source as previous slide
  • #62 Get examples so you can be sure they understand what the stem (first part) and options (a, b, c) are
  • #63 REF 3: From Kehoe, Jerard. Writing Multiple-Choice Test Items. ERIC/AE Digest Series EDO-TM-95-3, October 1995. http://www.ericdigests.org/1997-1/test.html Go through the points one-by-one…checking comprehension as you go
  • #64 Continued
  • #66 You might want to discuss how many options they feel they should use. Kehoe recommends 3-4, but some really formal exams use 5 options.
  • #69 These are the different orders possible…. Talk about which seem best and why. Do the Multiple Choice handout (HO5) in pairs (30 minutes). These are the poor examples from the Burton article. Have participants think about what is wrong with them and how they could be improved. Then go over them. Use REF 4 to correct… REF 4: Burton, Steven J., Richard R. Sudweeks, Paul F. Merrill, Bud Wood. How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty, Brigham Young University Testing Services and The Department of Instructional Science. 1991. http://testing.byu.edu/info/handbooks/betteritems.pdf When they finish HO5, have them write a multiple choice section using their content analyses and the section divisions they recently did. When they finish, have them compare and criticize their work. Be careful they are using multiple choice for reasonable sections…(20 minutes)