Corpus Tools for Language Teaching


  1. CORPUS TOOLS FOR LANGUAGE TEACHING (PART 2)
     LANGUAGE TEACHING WORKSHOP SERIES 2013-2014
     Center for Language Acquisition (CLA) and Center for Advanced Language Proficiency Education and Research (CALPER)
     The Pennsylvania State University
     CLAandCALPERatPennState
  2. GOALS OF THE WORKSHOP
     - Review: What is a language corpus?
     - Question: How are corpora used for language teaching and learning?
     - Model: What do corpus-based classroom activities look like?
     - Explore: Creating a corpus-based learning activity
     - Discuss: Share your ideas
  3. PRESENTERS
     - Brody Bluemel
       - Department of Applied Linguistics & Asian Studies
       - Chinese, ESL, German
     - Edie Furniss
       - Department of Applied Linguistics and IECP
       - Russian, ESL, French
     - Meredith Doran
       - Center for Language Acquisition/Applied Linguistics
       - French, Spanish, ESL
  4. WHAT IS A LANGUAGE CORPUS?
     - A collection of ‘real world’ language samples (written texts, oral transcriptions, audio/video)
     - Principled selection of language (one language, one historical period, one genre type, or a mix of written genre types, e.g. academic, journalistic, literary, blogs)
     - Usually electronic and searchable
     - Used for language research and teaching
  5. WHAT ARE SOME EXAMPLES OF CORPORA?
     - COBUILD (Bank of English, 650 million words)
     - MICASE (Michigan Corpus of Academic Spoken English)
     - Brigham Young University Corpora
       - Various
       - Includes COCA (Corpus of Contemporary American English)
     - Dialect corpora
       - Chinese:
       - English: ICE (International Corpus of English)
  6. WHAT CAN CORPORA SHOW US?
     - Word frequency
       - What are the top 500 words in this language? How often does the word ‘jocular’ or ‘pickle’ occur?
     - Word clusters
       - Which short expressions are common? (e.g., ‘you know what I mean,’ ‘do you want me to’)
     - Collocations
       - Which words typically go together? (e.g. weak coffee, weak soup?)
     - Concordances
       - Examples of particular words/phrases in context
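To make the first two ideas concrete, here is a minimal sketch of counting word frequencies and word clusters. The three-sentence `corpus` is invented for illustration and the function names are hypothetical; real corpus tools work on far larger samples and handle tokenization more carefully.

```python
import re
from collections import Counter

# Hypothetical mini-corpus (invented sentences, not from a real corpus).
corpus = [
    "Do you want me to make weak coffee or strong coffee?",
    "You know what I mean, I can never drink weak coffee.",
    "Do you want me to call you later?",
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

def clusters(texts, n):
    """Count every run of n consecutive words within each text."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

freq = word_frequencies(corpus)
four_word = clusters(corpus, 4)
two_word = clusters(corpus, 2)

print(freq.most_common(5))           # most frequent words
print(four_word["do you want me"])   # how often this cluster occurs
print(two_word["weak coffee"])       # a candidate collocation
```

Raw counts of adjacent pairs are only a rough stand-in for collocation; corpus tools usually add a statistical measure to separate typical pairings from chance ones.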
  7. CONCORDANCE
     - Example of concordance for modal ‘CAN’
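For readers without a concordancer at hand, the following sketch shows roughly what a keyword-in-context (KWIC) display does for ‘can’. The sentences are invented and the `concordance` function is a hypothetical helper, not the tool shown in the workshop.

```python
import re

# Hypothetical sentences (invented for illustration).
sentences = [
    "She can swim across the lake.",
    "It can get very cold here in winter.",
    "You can leave early today.",
    "Can you open the window?",
]

def concordance(texts, keyword, width=3):
    """Return (left, node, right) tuples with up to `width` words of context per side."""
    lines = []
    for text in texts:
        tokens = re.findall(r"\w+", text)
        for i, token in enumerate(tokens):
            if token.lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, token, right))
    return lines

kwic = concordance(sentences, "can")
for left, node, right in kwic:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20}  [{node}]  {right}")
```

Lining up the keyword in a single column is what lets learners scan for patterns to its left and right.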
  8. BENEFITS OF CORPORA FOR LANGUAGE TEACHING
     - More accurate descriptions of language than textbooks/intuitions
     - Exposure to contextualized, meaningful language in ‘real’ usages
     - Examples of specific registers/genres of language
     - Reference tool for independent/autonomous language investigation and learning
     Cited from: Jonathan Smart, Northern Arizona University
  9. OTHER BENEFITS FOR LEARNERS
     - Gives learners exposure to non-textbook language patterns
     - Gives learners access to a much larger language sample than classes can normally provide
     - Can answer questions about everyday usages (Do people really say/use this? Which constructions are common? Which vocabulary is frequent/rare? Is this word or feature typical in speech/writing?)
  10. SPECIFIC APPLICATIONS OF CORPORA FOR TEACHING
     - Use contextualized examples for quizzes, activities, explanations
     - Share lists of frequent words or expressions with learners
     - Research features of language for lesson design (e.g., Are the modals can, could, may, might, shall, will equally used? Should they be equally taught?)
     - Pre-select materials from corpora to help learners discover/explore particular language patterns
     - Have students search corpora using a search tool
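The modal question above can be explored with a simple count. This is a sketch under stated assumptions: `sample` is an invented passage, so its numbers say nothing about real English, and `modal_counts` is a hypothetical helper. With text exported from a real corpus, the same comparison would give a first answer.

```python
import re
from collections import Counter

MODALS = ["can", "could", "may", "might", "shall", "will"]

# Hypothetical sample text; substitute text exported from a real corpus.
sample = (
    "I can help you if you will wait. She could not come, but she will call. "
    "We may leave early. It might rain, so you can take an umbrella."
)

def modal_counts(text, modals=MODALS):
    """Return a count for each modal plus the total number of word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t in modals)
    return {m: counts[m] for m in modals}, len(tokens)

counts, total = modal_counts(sample)
for modal, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    # Normalizing per 1,000 words makes samples of different sizes comparable.
    print(f"{modal:<6} {n:>3}  ({1000 * n / total:.0f} per 1,000 words)")
```

A plain string match cannot tell the modal ‘may’ from the month or ‘will’ from the noun, so counts from real text need a spot check against concordance lines.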
  11. WHAT TO TEACH WITH A CORPUS?
     - Vocabulary
       - Word meanings in context, combinations/collocations, parts of speech, common expressions, differences in meaning
     - Grammar
       - Differences between similar forms, how forms are used in context
     - Pragmatics
       - Greetings, genre features and their cultural meanings (e.g. job letters, CVs, personal letters)
  12. EXAMPLE: VOCABULARY
     - Modal ‘CAN’
     - Explain to students that CAN has four functions:
       - ability
       - possibility
       - permission
       - request
     - Students find examples of each type of usage via corpus search
     - Might evaluate frequency of usage types, connotational values, pragmatic applications
  13. USEFUL CORPORA FOR TEACHERS
     - Lextutor: useful for study of vocabulary
     - MICASE (Michigan Corpus of Academic Spoken English)
     - SACODEYL: European Youth Language, pedagogical focus
     - Brigham Young University Corpora, including COCA (Corpus of Contemporary American English)
     - Backbone: European Pedagogic Corpora for Content & Language Integrated Learning
     - Linguee
     - The Internet! (Webcorp as search tool)
  14. WAYS LEARNERS CAN WORK WITH CORPORA
     - Illustration: Looking at data
     - Interaction: Discussing and sharing observations and opinions
     - Intervention: Providing learners with hints or clearer guides for seeing patterns
     - Induction: Learners making their own rules for particular features
     (Flowerdew, 1999)
  15. EXAMPLES/DEMONSTRATIONS
     - Brody
       - Parallel Corpora for Reading Comprehension and Lexical Acquisition
     - Edie
       - Lexical study: Color terms
       - Lextutor
       - Russian National Corpus
     - Meredith
       - Pragmatic choices: “I would like to” in Spanish
       - Linguee
       - Word Reference Forum
  16. PARALLEL CORPORA FOR READING & WRITING
     - Activity: 12 Zodiac Animals
       Using a parallel corpus, students were asked to:
       - Read the story
       - Identify 5 new vocab items
       - Identify the 12 zodiac animals
       - Write 10 sentences using new vocabulary
       - Summarize the story in their own words
     - 12 Zodiac Animals Story
     - Vocabulary Term Search: "看"
  17. OUTCOMES
     - Beginning-level students were able to read through and comprehend the meaning of the story.
     - Students learned to use new vocabulary items with multiple meanings and uses.
     - Sample sentences:
       - 你看,那是一只好看的老虎。 Look, there’s a good-looking tiger.
       - 我看到一个球。 I saw a ball.
     - The corpus served as a reference for constructing grammatical sentences.
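A vocabulary term search in a parallel corpus amounts to finding aligned sentence pairs whose source side contains the term. The sketch below assumes a simple list of (Chinese, English) pairs; the first two reuse the sample sentences above, the third is an invented pair added for contrast, and `search_parallel` is a hypothetical helper rather than the tool used in the activity.

```python
# Aligned (Chinese, English) sentence pairs. The first two come from the
# sample sentences above; the third is a hypothetical pair for contrast.
parallel = [
    ("你看,那是一只好看的老虎。", "Look, there's a good-looking tiger."),
    ("我看到一个球。", "I saw a ball."),
    ("老虎很大。", "The tiger is big."),
]

def search_parallel(pairs, term):
    """Return every aligned pair whose source-language side contains the term."""
    return [(src, tgt) for src, tgt in pairs if term in src]

hits = search_parallel(parallel, "看")
for src, tgt in hits:
    print(src, "|", tgt)
```

Seeing 看 rendered as ‘look’, ‘good-looking’, and ‘saw’ side by side is what lets beginners notice that one term carries several meanings and uses.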
  18. PARALLEL CORPORA FOR OTHER LANGUAGES:
     - Explore complex terms/structures.
     - Identify stories/content-based websites to work through.
     - Reference for using new terms and constructs correctly.
  19. PRAGMATICS OF REQUESTING IN SPANISH
     - Goal: Raise students’ awareness of polite request forms in Spanish (e.g., “I would like . . .” vs. “I want . . .”)
     Step 1: Brainstorming
     Have students generate some ways to ‘request’ in Spanish:
     - Yo quiero . . . (“I want . . .”, present tense)
     - Me gustaría . . . (“It would please me . . .”, conditional)
     - Deseo . . . (“I desire . . .”, present)
     - Quisiera . . . (“I was wanting . . .”, imperfect subjunctive)
  20. Researching Requests (‘asking politely’)
     Step 2: Illustration (looking at data)
     Ask students to search Linguee for translations of “I would like . . .” in Spanish
     - What translations of “I would like” do you find?
     - What words tend to follow these constructions? (Verbs? Nouns? List at least 6 constructions you found.)
     - In what kinds of texts does each option appear? Formal? Informal?
     - What patterns do you notice? Do the forms seem to differ in meaning in some way?
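The question of which words tend to follow each request form can also be checked mechanically once students have collected example sentences. This is a sketch only: the Spanish sentences are invented, not taken from Linguee, and `following_words` is a hypothetical helper.

```python
import re
from collections import Counter

# Hypothetical example sentences (invented, not drawn from Linguee).
examples = [
    "Me gustaría reservar una mesa.",
    "Me gustaría saber el precio.",
    "Quisiera un café, por favor.",
    "Quisiera reservar una habitación.",
    "Quiero un café.",
]

def following_words(texts, phrase):
    """Count the word that comes immediately after each occurrence of the phrase."""
    target = phrase.lower().split()
    n = len(target)
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == target:
                counts[tokens[i + n]] += 1
    return counts

for form in ["me gustaría", "quisiera", "quiero"]:
    print(form, "->", dict(following_words(examples, form)))
```

Students can then sort the following words into verbs and nouns by hand, which keeps the pattern-finding with the learner rather than the script.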
  21. Researching Requests (‘asking politely’)
     Step 3: Interaction
     - Ask students to report on patterns they observed
     - Discuss their observations
     Step 4: Intervention
     - Consult Word Reference Forum (or similar sites) for further evidence/perspectives on usage
     - Offer teacher explanation, intuition, evidence
  22. Researching Requests (‘asking politely’)
     Step 5: Induction
     - Students try to formulate their ‘rules’ for usage
     Step 6: Application (optional)
     - Students create role plays
     - Use a range of forms they found
     - Focus on politeness/appropriateness
  23. EXPLORE & CREATE
     - In small groups:
       - Go to the CLA website
       - Click on the ‘Corpus workshop 2’ Upcoming Event (left column)
       - Scroll down and click on the Google Doc link
     - Identify a sample item or set of items for a corpus-based inquiry in a language you teach.
       - Greetings
       - Close synonyms (e.g., happy vs. psyched)
       - Color terms and their applications
       - Register differences
       - Slang words and their uses
       - Idiomatic expressions
  24. DISCUSS
     - Ideas and applications of corpora generated in your group
     - Key features or aspects of corpora we haven’t yet considered?
     - Questions?
