Tracking Learning: Using Corpus Linguistics to Assess Language Development


Published in: Education
  • All of these accounts of IL developmental trajectories reflect the underlying process of restructuring and provide a challenge for current corpus analytic techniques.
  • Yet today so much of our learning is technology-mediated (email, chat, forum discussions, essays, voice-message boards, etc.) that collecting much of the language students generate in instructional settings has become feasible.
  • To rely on a corpus alone for assessing IL development, all naturally occurring language use would have to be captured and added to the corpus. Even in the most Orwellian of worlds, this is not a possibility. In other words, the weakness of CBA lies in the inability to collect an exhaustive sample of a learner’s language. Therefore, it would be unwise to declare a learner deficient in some aspect of the language just because the structure/function/etc. was not present in the corpus. With “testing” a very different limitation is encountered: is the language elicited truly representative of the learner’s ability? The two approaches to assessment are therefore complementary. If

    1. 1. Tracking Learning: Using Corpus Linguistics to Assess Language Development James Lantolf Steve Thorne CALPER Center for Advanced Language Proficiency Education and Research The Pennsylvania State University
    2. 2. Tracking Learning: Approaches to Assessment <ul><li>Traditional Classroom Assessment </li></ul><ul><ul><li>Achievement, Placement, Formative </li></ul></ul><ul><li>Standardized Tests </li></ul><ul><ul><li>AP, TOEFL, OPI, STAMP </li></ul></ul><ul><li>Alternative Assessment </li></ul><ul><ul><li>Portfolio & LinguaFolio </li></ul></ul><ul><ul><li>Performance Assessment, Task-Based </li></ul></ul><ul><li>CALPER Assessment </li></ul><ul><ul><li>Dynamic Assessment </li></ul></ul><ul><ul><li>Corpus-Informed Assessment </li></ul></ul>
    3. 3. Today’s Talk <ul><li>What is a corpus? </li></ul><ul><li>Types of corpora </li></ul><ul><li>Corpus-informed assessment </li></ul><ul><ul><li>Developmental learner corpora </li></ul></ul><ul><ul><li>Contrastive learner corpus analysis against baseline </li></ul></ul><ul><li>Two examples of corpus-informed assessment </li></ul><ul><ul><li>Advanced ESL academic discourse competence </li></ul></ul><ul><ul><li>German modal particles </li></ul></ul>
    4. 4. What is a Corpus? <ul><li>A corpus (plural corpora ) </li></ul><ul><ul><li>Large collection of texts </li></ul></ul><ul><ul><li>Gathered according to specific criteria </li></ul></ul><ul><ul><li>Stored in an electronic database with relevant meta-data associated with each text entry </li></ul></ul><ul><ul><ul><li>Student ID </li></ul></ul></ul><ul><ul><ul><li>Time/date </li></ul></ul></ul><ul><ul><ul><li>Activity type </li></ul></ul></ul><ul><li>Corpora can be constructed from written language use (especially digital texts) or transcribed from spoken interaction </li></ul>
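A minimal sketch of how one corpus entry and its meta-data might be represented. The three meta-data fields mirror the slide (student ID, time/date, activity type); the class name `CorpusEntry`, the helper `select`, and all sample values are hypothetical and only illustrate the idea of filtering texts by meta-data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CorpusEntry:
    """One text in the corpus plus the meta-data listed on the slide."""
    student_id: str        # who produced the text
    timestamp: datetime    # when it was produced
    activity_type: str     # e.g. "chat", "email", "essay"
    text: str              # the language sample itself


def select(entries, **criteria):
    """Return the entries whose meta-data match every keyword criterion."""
    return [e for e in entries
            if all(getattr(e, key) == value for key, value in criteria.items())]


# Hypothetical sample data, for illustration only.
entries = [
    CorpusEntry("s01", datetime(2005, 9, 12, 10, 0), "chat", "first sample text"),
    CorpusEntry("s01", datetime(2005, 9, 19, 10, 0), "email", "second sample text"),
    CorpusEntry("s02", datetime(2005, 9, 12, 10, 5), "chat", "third sample text"),
]

chat_by_s01 = select(entries, student_id="s01", activity_type="chat")
```

Keeping the meta-data on every entry is what later allows sub-corpora to be built by learner, by date, or by activity.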
    5. 5. Basic Tenets of Corpus Analysis <ul><li>Data driven, highly empirical </li></ul><ul><li>Objective approach </li></ul><ul><li>A grammar of use based on attested utterance types </li></ul><ul><li>A grammar of probability based on frequency and distribution </li></ul><ul><li>Language use and structure: </li></ul><ul><ul><li>Collocational patterns </li></ul></ul><ul><ul><li>Lexicon heart of systematicity in language, i.e., grammar </li></ul></ul><ul><ul><li>Formulaic sequences comprise ~60% of language use (Wray, 2002; Schmitt & Carter, 2004) </li></ul></ul>
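The frequency and collocation tenets above can be sketched in a few lines. This is an illustration under simple assumptions, not the presenters' tooling: the tokenizer is a deliberately crude regular expression, adjacent word pairs stand in for collocational patterns, and the sample sentence is invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-zäöüß']+", text.lower())


def frequency_list(tokens):
    """Count how often each word form occurs."""
    return Counter(tokens)


def bigram_counts(tokens):
    """Count adjacent word pairs, a crude proxy for collocational patterns."""
    return Counter(zip(tokens, tokens[1:]))


# Invented sample text, for illustration only.
sample = "You might want to check the syllabus. You might want to email me."
tokens = tokenize(sample)
freq = frequency_list(tokens)
bigrams = bigram_counts(tokens)
```

Sorting `bigrams.most_common()` is one quick way to surface recurring sequences such as "might want" or "want to" as candidates for closer inspection.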
    6. 6. Corpora & Language Assessment <ul><li>For advanced proficiency -- develop and/or utilize genre, modality, and context-specific corpora </li></ul><ul><li>Focus can be on grammatical, lexical, metaphoric, discourse, pragmatic features </li></ul><ul><li>Typical problems and errors of usage can be found in learner data </li></ul><ul><li>Teachers and learners themselves can observe and assess their own and one another’s performance </li></ul><ul><li>Expert-speaker corpora can reveal what learners are not using/doing, as well as how appropriately, successfully, and differentially they are using the target language </li></ul>
    7. 7. Comparing Assessment Approaches Testing: <ul><li>Elicited performance indicative of competence </li></ul><ul><li>“Authenticity” and/or ecological validity of test instrument </li></ul><ul><li>Sampling issues </li></ul><ul><li>Reliability </li></ul><ul><li>Critical question: Is the elicited performance representative of the individual’s state of language development? </li></ul>Corpus-based: <ul><li>Naturally-occurring language performance indicates competence </li></ul><ul><li>Volume of language learners produce across tasks/genres and time </li></ul><ul><li>Sampling issues become irrelevant </li></ul><ul><li>Reconceptualize reliability </li></ul><ul><li>Critical question: Have enough data been collected to conclude that an individual’s performance is representative of her state of language development? </li></ul>
    8. 8. ITA Project Describing, assessing, and developing academic discourse with international teaching assistants Steve Thorne Jonathan Reinhardt Paula Golombek
    9. 9. ITAcorp Project <ul><li>ITAs highly competent researchers </li></ul><ul><li>Expand repertoire of options for performing often complex social roles (instructor, adjudicator, tutor, advisor, fellow student, mediator) </li></ul><ul><li>Assessment --> Contrastive corpus analysis of ITACorp with baseline corpus -- MICASE </li></ul><ul><li>Grammar as choice as it relates to meaning and social actions </li></ul><ul><li>Formulaic sequences, small words, modulation </li></ul><ul><li>Corpus-informed pedagogical intervention to prepare students to participate successfully in spoken and written genres of academic discourse </li></ul>
    10. 10. Methodology <ul><li>Contrastive corpus analysis of MICASE and ITAcorp --> what are the differences in language use between expert/native and ITA/advanced ESL speakers? </li></ul><ul><li>Identified directive and obligative constructions </li></ul><ul><li>Quantified usage of directive language in both corpora </li></ul><ul><li>The case of wanna / want to </li></ul>
    11. 11. Corpus Assessment: Time 1 <ul><li>The case of “you want to” | “you could …” </li></ul><ul><li>Please + [imperative] </li></ul>
    12. 12. Corpus Assessment: Time 2 <ul><li>Post intervention usage of “you want to” </li></ul><ul><li>10 instances of usage across 25 advanced ESL students </li></ul><ul><li>Concordance lines of proceduralized usage in context </li></ul>
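Concordance lines of the kind mentioned above show each occurrence of a search phrase with its surrounding context (keyword in context, KWIC). The sketch below is a plain-Python illustration, not the tool used in the project; the function name `kwic`, the context width, and the sample sentence are all assumptions.

```python
def kwic(tokens, node, width=4):
    """Return keyword-in-context lines for every occurrence of `node`.

    `node` is a list of lowercase tokens (e.g. ["you", "want", "to"]);
    `width` is the number of context tokens shown on each side.
    """
    n = len(node)
    lines = []
    for i in range(len(tokens) - n + 1):
        if [t.lower() for t in tokens[i:i + n]] == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            hit = " ".join(tokens[i:i + n])
            # Right-align the left context so the node phrases line up.
            lines.append(f"{left:>30} [{hit}] {right}")
    return lines


# Invented sample utterance, for illustration only.
tokens = "So for this problem you want to start with the free-body diagram".split()
lines = kwic(tokens, ["you", "want", "to"], width=3)
```

Reading the aligned lines side by side is what makes it possible to judge whether a form such as "you want to" is being used in an appropriate context rather than merely counted.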
    13. 13. Corpus Assessment <ul><li>Corpus-informed Assessment and Materials Development: German Modal Particles </li></ul><ul><li>Nina Vyatkina </li></ul>
    14. 14. Teaching the MPs: Challenges <ul><li>Modal Particles: ja, doch, denn, mal </li></ul><ul><li>Rampant polysemy in MPs </li></ul><ul><li>Strongly context-bound meaning </li></ul><ul><li>Absence of a direct counterpart in English (translated by tag questions, intonation, omitted) </li></ul><ul><li>Absence of an informal “particle-friendly climate” in traditional language classrooms </li></ul><ul><li>Overly formal treatment in textbooks </li></ul><ul><ul><li>Sentence-based rather than utterance-based [interactive] </li></ul></ul>
    15. 15. Participants <ul><li>7 American students and 16 German students discussing intercultural topics in German and in English using email and chat during 8 semester weeks (Fall 2005) </li></ul>
    16. 16. German Modal Particles <ul><li>German modal particles: indeclinable “smallwords” typical of conversations </li></ul><ul><li>‘The German listener expects a particle. If it is absent, the sentence acquires a specific stylistic value: without a particle it sounds choppy, harsh, unfriendly, its utterance is apodictic, abrupt, blatantly noncommittal.’ (Weydt, 1969) </li></ul>
    17. 17. Pedagogical intervention <ul><li>Classroom intracultural sessions: explicit instruction based on the data produced by the participants in the Internet-mediated sessions </li></ul><ul><li>Internet-mediated intercultural sessions: practice in language use in CMC with native speakers (Belz, 2006) </li></ul>[Diagram: repeating cycle of data-driven instruction, CMC practice, and QUAN & QUAL analysis]
    18. 18. Relative frequency: modality/intervention effect <ul><li>* Statistically significant difference in mean relative frequencies (no. MPs/1000 German words), p<.05 </li></ul>
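The measure named on this slide, modal particles per 1,000 German words averaged over learners, can be sketched as follows. This is an illustration only: the function names are hypothetical, the token lists are invented, and the significance test reported on the slide is not reproduced here.

```python
from statistics import mean

# The four particle forms named in the talk.
MODAL_PARTICLE_FORMS = {"ja", "doch", "denn", "mal"}


def relative_frequency(tokens, forms=MODAL_PARTICLE_FORMS, per=1000):
    """Occurrences of the target forms per `per` tokens.

    String matching cannot tell a modal particle from its homonym
    (e.g. "ja" used as an answer), so real counts need manual coding.
    """
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.lower() in forms)
    return per * hits / len(tokens)


def mean_relative_frequency(tokens_by_learner):
    """Average the per-learner relative frequencies."""
    return mean(relative_frequency(toks) for toks in tokens_by_learner.values())


# Invented token lists: 1 hit in 10 tokens and 1 hit in 20 tokens.
tokens_by_learner = {
    "learner_a": ["ja"] + ["wort"] * 9,
    "learner_b": ["mal"] + ["wort"] * 19,
}
mean_rf = mean_relative_frequency(tokens_by_learner)
```

Normalizing per 1,000 words keeps learners who write a lot from dominating the comparison with learners who write little.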
    19. 19. MP Dispersion in the corpus <ul><li>Learners: </li></ul><ul><li>ja </li></ul><ul><li>denn </li></ul><ul><li>doch </li></ul><ul><li>mal </li></ul><ul><li>NSs: </li></ul><ul><li>ja </li></ul><ul><li>denn </li></ul><ul><li>doch </li></ul><ul><li>mal </li></ul>
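    20. 20. MP use by NSs and learners (absolute numbers) <ul><li>Stages | Learners: Accurate use | Learners: Inaccurate use </li></ul><ul><li>Pre-Interv. (4 weeks) | 3 | 0 </li></ul><ul><li>Interv. W1 | 7 | 2 </li></ul><ul><li>Interv. W2 | 6 | 0 </li></ul><ul><li>Interv. W3 | 27 | 3 </li></ul><ul><li>Total Post-Interv. | 65 | 6 </li></ul><ul><li>Post-Int. W4 | 22 | 1 </li></ul><ul><li>NSs | 89 | 80 </li></ul>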
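One way to read the learner counts on this slide is as an accuracy rate per stage. The sketch below assumes that each learner row gives the stage, then accurate uses, then inaccurate uses; that column order is an interpretation of the flattened table, and the native-speaker row is left out because its two figures are not clearly labeled.

```python
# Counts as read from the slide, assuming each row is
# (accurate uses, inaccurate uses) for the learners at that stage.
LEARNER_COUNTS = {
    "Pre-Interv. (4 weeks)": (3, 0),
    "Interv. W1": (7, 2),
    "Interv. W2": (6, 0),
    "Interv. W3": (27, 3),
    "Post-Int. W4": (22, 1),
}


def accuracy(accurate, inaccurate):
    """Share of accurate uses among all uses; None when there are no uses."""
    total = accurate + inaccurate
    return accurate / total if total else None


accuracy_by_stage = {stage: accuracy(a, i) for stage, (a, i) in LEARNER_COUNTS.items()}
total_accurate = sum(a for a, _ in LEARNER_COUNTS.values())
total_inaccurate = sum(i for _, i in LEARNER_COUNTS.values())
```

Under this reading the five stage rows sum to 65 accurate and 6 inaccurate uses, which matches the totals row on the slide.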
    21. 21. Awareness-raising exercise 1 Questions adopted in part from Möllering and Nunan (1995) <ul><li>In this excerpt from your chat with your German partner, what lexical category (part of speech) do the words ja, mal, aber belong to? </li></ul><ul><li>Can you list other words belonging to this category? </li></ul><ul><li>What functions do these words have in the examples from your partners’ writing? </li></ul><ul><li>Which of these words have you ever used (in this course or earlier) in the same functions? </li></ul><ul><ul><li>Soren: Wann kommst Du mal nach Deutschland? </li></ul></ul><ul><ul><li>Jeremy: Hoffentlich komme ich der Fruhling 2007. </li></ul></ul><ul><ul><li>Soren: Oh das dauert aber noch </li></ul></ul><ul><ul><li>Soren: Das ist ja noch über ein Jahr </li></ul></ul><ul><ul><li>Soren: Naja, vielleicht schaffst Du es ja dann mal bei mir vorbei zu kommen. </li></ul></ul>
    22. 22. Awareness-raising exercise 2 <ul><li>In the given excerpt from your chat with your German partner, separate modal particles (MP) from their homonyms (H). Try to paraphrase the meaning of each word marked in bold. </li></ul><ul><li>How many MPs and their homonyms did you use? And your partner? </li></ul><ul><ul><li>Chip : Was hast du ueber FKK geschrieben? [...] </li></ul></ul><ul><ul><li>Simone : Zu welcher Frage meinst Du denn (___) ? </li></ul></ul><ul><ul><li>Chip : Umm... ich muss die Frage finden </li></ul></ul><ul><ul><li>Simone : Es gab ja (___) mehrere Fragen zum Thema FKK […] </li></ul></ul><ul><ul><li>Simone : Die Serie läuft doch (___) aber noch in den USA? [...] </li></ul></ul><ul><ul><li>Simone : dann mal (___) bis zur nächsten e-mail! </li></ul></ul><ul><ul><li>Chip : Ja (___) ! Bis zum naechsten Mal (___) ! :-) </li></ul></ul><ul><ul><li>Simone : Du kannst mir ja (___) mal (___) schreiben, was Du außerhalb der Uni noch so machst </li></ul></ul>
    23. 23. Awareness-raising exercise 3 Questions adopted in part from Möllering (2004) <ul><li>Consider the following concordance lines with the modal particle MAL extracted by means of WordSmith Tools® from your partners’ writing and answer the questions: </li></ul><ul><ul><li>Underline all the finite verbs in the clauses containing MAL. Do you see any patterns? </li></ul></ul><ul><ul><li>In these examples, does the content refer to the past, present, or future time? </li></ul></ul><ul><ul><li>What is the lexical category (part of speech) of the word expressing the subject in each line? </li></ul></ul><ul><ul><li>What sentence types do the examples contain – declaratives, exclamatives, commands, questions? </li></ul></ul>
    24. 24. Corpus-informed Assessment: Conclusions, Questions, & Resources <ul><li>Representativeness and ecological validity? </li></ul><ul><ul><li>Assemble corpus data to adequately and significantly represent production </li></ul></ul><ul><ul><li>Use benchmark corpora for assessing learner language successes and problems </li></ul></ul><ul><ul><li>Developmental corpus assessment of individuals and class-cohorts </li></ul></ul><ul><li>CALPER materials: </li></ul><ul><ul><li>Corpus tutorial -- see </li></ul></ul><ul><ul><li>INVESTIGATING REAL LANGUAGE -- June 25-27, 2007 </li></ul></ul><ul><ul><li>DYNAMIC ASSESSMENT workshop June 25-27, 2007 </li></ul></ul><ul><ul><li>CALPER Corpus Tool available Summer, 2007 </li></ul></ul>
    25. 25. <ul><li>Thanks -- please visit our website for more information on CALPER materials, events, and services: </li></ul><ul><li> </li></ul>
    26. 27. Challenges to Corpus Approaches <ul><li>One data source among many: ethnographic details, visual field, introspection, clinical and experimental elicitation </li></ul><ul><li>Descriptive not explanatory </li></ul><ul><li>Focus on externalized language use / performance – psycholinguistics and language processing inferred </li></ul><ul><li>Corpora are “real” (representation of actual use), but are they “authentic” (meaningful and applicable to learners, e.g., Widdowson, 2002) </li></ul><ul><li>Only as good as its representativeness </li></ul><ul><li>Harkening back to contrastive error analysis? No, contrastive analysis of actual use that does not need to include incapacity evaluations of learners </li></ul>
    27. 28. Types of Corpora & Analyses Synchronic: <ul><li>Learner Corpora (Granger) </li></ul><ul><li>Contrastive IL Analyses </li></ul><ul><li>Frequencies, ratios </li></ul><ul><li>Genre/Register/Variation (Biber, Swales, Sinclair) </li></ul><ul><li>Factor & cluster analyses </li></ul><ul><li>Youmans’ Vocabulary Management Profiling </li></ul><ul><li>Mutual information </li></ul><ul><li>Descriptive Benchmark (BNC, ANC) </li></ul><ul><li>Frequencies, ratios </li></ul>Diachronic: <ul><li>Developmental Learner Corpora (Myles, Payne, Belz, Thorne et al.) </li></ul><ul><li>Frequencies, ratios, ??? </li></ul><ul><li>Historical Corpora (Davies) </li></ul><ul><li>Frequencies, ratios </li></ul>
    28. 29. Corpus Design and Construction <ul><li>Aggregative </li></ul><ul><li>Genre, register </li></ul><ul><li>Meta-data: </li></ul><ul><ul><li>Situational context </li></ul></ul><ul><ul><li>Activity </li></ul></ul><ul><ul><li>Level of proficiency </li></ul></ul>Synchronic: <ul><li>Learner Corpora (Granger) </li></ul><ul><li>Contrastive IL Analyses </li></ul><ul><li>Frequencies, ratios </li></ul><ul><li>Genre/Register/Variation (Biber, Swales, Sinclair) </li></ul><ul><li>Factor & cluster analyses </li></ul><ul><li>Youmans’ Vocabulary Management Profiling </li></ul><ul><li>Mutual information </li></ul><ul><li>Descriptive Benchmark (BNC, ANC) </li></ul><ul><li>Frequencies, ratios </li></ul>
    29. 30. Corpus Design and Construction <ul><li>Role of meta-data: </li></ul><ul><ul><li>Individual </li></ul></ul><ul><ul><li>Task </li></ul></ul><ul><ul><li>Time </li></ul></ul><ul><li>Corpus construction as a form of experimental research </li></ul>Diachronic: <ul><li>Developmental Learner Corpora (Myles, Payne, Belz, Thorne et al.) </li></ul><ul><li>Frequencies, ratios, ??? </li></ul><ul><li>Historical Corpora (Davies) </li></ul><ul><li>Frequencies, ratios </li></ul>
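Mutual information, listed among the analyses above, scores how much more often two words occur together than their individual frequencies would predict. The sketch below computes pointwise mutual information for adjacent word pairs from raw counts; it is one common formulation, offered as an assumption about what the slide refers to, and the sample tokens are invented.

```python
import math
from collections import Counter


def pmi(tokens, w1, w2):
    """Pointwise mutual information (base 2) of the adjacent pair (w1, w2).

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated from raw counts in `tokens`.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)
    joint = bigrams[(w1, w2)] / n_bigrams
    if joint == 0:
        # The pair never occurs together in this sample.
        return float("-inf")
    p1 = unigrams[w1] / n_unigrams
    p2 = unigrams[w2] / n_unigrams
    return math.log2(joint / (p1 * p2))


# Invented sample, for illustration only.
tokens = "you want to go and you want to stay".split()
score = pmi(tokens, "want", "to")
```

A positive score means the pair co-occurs more often than chance would suggest; scores from very small samples like this one are unstable and serve only to show the arithmetic.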
    30. 31. Corpus Annotation <ul><li>Frequency and location of tags </li></ul><ul><ul><li>Laughter for hyperbole </li></ul></ul><ul><ul><li>Language use as social action </li></ul></ul><ul><li>Part-of-speech </li></ul><ul><li>Lemmatization </li></ul><ul><li>Syntactic tagging </li></ul><ul><li>Error tagging </li></ul><ul><li>Semantic tagging </li></ul>
    31. 32. Corpus Informing Language Theory <ul><li>Not only what is possible (e.g., nativist and UG approaches), but what is likely or frequent in usage </li></ul><ul><li>Illustrates the limits of introspection about language (enormous differences between intuition and actual use) </li></ul><ul><li>Language structure: formulaic sequences comprise ~60% of language use (Wray, 2002; Schmitt & Carter, 2004) </li></ul><ul><li>Emergent grammar (Hopper, 2002; Bybee, 2001) </li></ul><ul><ul><li>Grammar a consequence, not a precondition -- epiphenomenal </li></ul></ul><ul><ul><li>Grammar = observable repetition in discourse </li></ul></ul><ul><ul><li>Grammar contingent upon lexical environment </li></ul></ul><ul><ul><li>“Grammar contracts as texts expand” --> fragments and repertoires </li></ul></ul>
    32. 33. Revisioning Ellipsis <ul><li>Speakers add features as necessary rather than taking away from what would be required in written discourse (see also Wittgenstein, 1953; Rommetveit, 1974) </li></ul><ul><ul><ul><li>Omission of auxiliaries (be, have, do) is common, but not often from the speaker’s or 1st-person perspective </li></ul></ul></ul><ul><ul><ul><li>Empty “it” and existential “there is” are often dropped in spoken discourse </li></ul></ul></ul><ul><ul><ul><li>Pronouns are often dropped before modal verbs, e.g., “can happen,” “should be” </li></ul></ul></ul><ul><ul><ul><li>Overall, beginning bits are left out </li></ul></ul></ul><ul><li>Grammatical description SHOULD represent spoken language use and should relate items and structures to interactional and situational functions </li></ul>
    33. 34. Importance of Measuring & Understanding Process <ul><li>Alfred Binet (1909) advocated process assessment, though he never designed an instrument to measure it. </li></ul><ul><li>Buckingham (1921): accounting for learning processes is as important as accounting for products. </li></ul>
    34. 35. Challenges of Assessing Process <ul><li>Feasibility </li></ul><ul><ul><li>“the most direct procedure for determining an individual’s proficiency…would simply be to follow that individual surreptitiously over an extended period of time…It is clearly impossible, or at least highly impractical, to administer a ‘test’ of this type in the language learning situation” (Clark 1978, as quoted in Bachman, 1990). </li></ul></ul><ul><li>Scalability - the bane of “alternative” assessment </li></ul>
    35. 36. Depicting Process in SLA <ul><li>Accuracy of production of L2 forms and IL development suggests a curvilinear rather than a linear relationship (Norris & Ortega, 2003). </li></ul><ul><li>Threshold and stage effects (Meisel, Clahsen & Pienemann, 1991). </li></ul><ul><li>U-shaped behavior (Kellerman, 1985). </li></ul><ul><li>Omega-shaped behavior: temporary increase in frequency followed by a normalization (Wolfe-Quintero, Inagaki, & Kim, 1998). </li></ul>
    36. 37. Using Corpus to Assess IL Development <ul><li>Addressing feasibility and scalability </li></ul><ul><ul><li>Proliferation of technology-mediated language learning </li></ul></ul><ul><ul><li>More powerful computers and more refined software. </li></ul></ul><ul><ul><li>Automated speech recognition - “dirty” ASR </li></ul></ul>
    37. 38. “Complementary” Assessment <ul><li>Use testing techniques (traditional or performance) in conjunction with corpus-based assessment to generate a more detailed and broad-based account of IL development. </li></ul>
    38. 39. Academic Discourse Performance <ul><li>ITAs’ success as instructors and future faculty depends on successful participation in written and spoken academic discourse </li></ul><ul><li>e.g. spoken genres: </li></ul><ul><li>small lecture presentation </li></ul><ul><li>large lecture presentation </li></ul><ul><li>discussion leading </li></ul><ul><li>lab section leading </li></ul><ul><li>seminar leading </li></ul><ul><li>advising </li></ul><ul><li>colloquia participation </li></ul><ul><li>interviewing </li></ul><ul><li>meeting participation </li></ul><ul><li>office hours conducting </li></ul><ul><li>service encounters </li></ul><ul><li>tutorial leading </li></ul><ul><li>socializing </li></ul><ul><li>conference presentation </li></ul>
    39. 40. The ITA “problem” <ul><li>Jan 2005: proposed legislation in North Dakota would have forced universities to reimburse class fees to students who complained about an instructor’s English ability. If ten percent or more of the students had complained, the instructor would have been relieved of teaching duties pending further review. A watered-down version of the bill passed. </li></ul><ul><li>High number of international graduate students in the U.S.: 50% of U.S. graduate students in engineering and sciences are international </li></ul>
    40. 41. Directive Language <ul><li>DL is language with directive illocutionary force (Searle, 1979) used functionally for making suggestions or giving advice </li></ul><ul><li>In traditional frameworks, DL has primarily deontic qualities of obligative modality </li></ul><ul><li>In textbooks, DL is taught as a series of modals & semi-modals (must, mustn’t, have to, should, ought to, need to, needn’t) </li></ul><ul><li>In SYS-FUNC, DL would be considered part of the MODULATION system, a continuum between obligation (what I want you to do) and inclination (what you want to do) </li></ul>
    41. 42. Why Study Directive Language? <ul><li>DL is an important part of several academic discourse genres and professional competence </li></ul><ul><li>Inappropriate or unintended use of DL may result in miscommunication or misunderstanding of speaker intention </li></ul><ul><li>DL is highly interpersonal, involving speaker authority and power hierarchies </li></ul>
    42. 43. Research <ul><li>Contrastive genre-comparable spoken corpora </li></ul><ul><ul><li>ITAcorp (ITA language use): office hours role plays (CMC, presentation, post-evaluation)-- approx. 120,000 tokens </li></ul></ul><ul><ul><li>MICASE (base-line ‘expert’ corpus): Advising and Office Hours sub-corpora--180,000 tokens </li></ul></ul><ul><ul><li>MICASE data as model </li></ul></ul><ul><li>Analytical framework: </li></ul><ul><ul><li>Corpus: usage-based, frequency & distribution </li></ul></ul><ul><ul><li>Qualitative: (professional) discourse analysis, SYS-FUNC & APPRAISAL </li></ul></ul>
    43. 44. Preliminary Contrastive Analysis of wanna / want to
    44. 45. You [+ hedge] want to / wanna [+ hedge] (MICASE vs. ITAcorp) <ul><li>MICASE shows 12x the hedged use of want to / wanna </li></ul><ul><li>ITAcorp uses followed the pedagogical intervention on hedged wanna DL </li></ul>
    45. 46. Additional Preliminary Descriptive Findings <ul><li>In comparison to MICASE data, ITAs as represented in ITAcorp: </li></ul><ul><ul><li>Generally use very few hedges or intensifiers </li></ul></ul><ul><ul><li>Generally underuse periphrastic forms </li></ul></ul><ul><ul><li>Overuse obligative modals (must, should) and please + imperative </li></ul></ul><ul><ul><li>Use ‘can’ for obligative ‘should’ </li></ul></ul><ul><ul><li>Use only the basic conditional, underuse ‘you could’, and never use ‘I would’ </li></ul></ul><ul><ul><li>Navigate between ‘I’ and exclusive ‘we’ strategically, invoking departmental or professorial authority when the going gets tough </li></ul></ul>
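The pattern in this slide's title, "you" plus an optional hedge before or after "want to / wanna", can be approximated with a regular expression. This is a rough sketch rather than the project's actual query: the hedge list is hypothetical (the slides do not say which hedges were counted), tokens are split on whitespace, and the sample sentences are invented.

```python
import re

# Hypothetical hedge list; the slides do not say which hedges were counted.
HEDGES = ["might", "may", "probably", "maybe", "just", "kind of", "sort of"]
_hedge = "|".join(re.escape(h) for h in HEDGES)

# "you" + optional hedge + "want to"/"wanna" + optional hedge
PATTERN = re.compile(
    rf"\byou\s+(?:(?P<pre>{_hedge})\s+)?(?:want\s+to|wanna)\b(?:\s+(?P<post>{_hedge})\b)?",
    re.IGNORECASE,
)


def count_hedged(text):
    """Count 'you want to / wanna' hits that carry a hedge on either side."""
    return sum(1 for m in PATTERN.finditer(text)
               if m.group("pre") or m.group("post"))


def hedged_rate(text, per=1000):
    """Hedged hits per `per` whitespace-separated tokens."""
    n_tokens = len(text.split())
    return per * count_hedged(text) / n_tokens if n_tokens else 0.0


# Invented sample utterances, for illustration only.
expert_like = "you might want to look at chapter two and you wanna just try it"
learner_like = "you want to submit it and you must read chapter two"
```

Dividing the normalized rate of one corpus by that of the other gives the kind of ratio reported on the slide, provided both corpora are tokenized the same way.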
    46. 47. Next Steps <ul><li>Complementary ethnographic data (survey, interviews) for ITAcorp participants </li></ul><ul><li>Use audio to produce narrow transcriptions of select data </li></ul><ul><li>Focus on differences across modality (CMC vs. F2F) </li></ul><ul><li>Focus on classroom presentation of a concept (contrasting with MICASE) </li></ul><ul><li>Gather data from non-role play ITA professional activity (section leader, lecturer, office hours) </li></ul><ul><li>Develop set of corpus-informed pedagogical interventions focusing on professional discourse competencies </li></ul>
    47. 48. What is Data-Driven Learning? <ul><li>Application of tools (concordancers) and techniques from corpus linguistics in the service of language learning. </li></ul><ul><li>Inquiry-based pedagogy </li></ul><ul><li>Learner as researcher </li></ul><ul><li>“Research is too important to leave to researchers” (Johns, 1991, p. 2) </li></ul>
    48. 49. Paradigms of L2 Instruction <ul><li>Traditional approaches: </li></ul><ul><li>Present -> Practice -> Produce </li></ul><ul><li>Data-driven learning: </li></ul><ul><li>Observe -> Hypothesize -> Experiment </li></ul>
    49. 50. Impact of Corpus Techniques on L2 Pedagogy <ul><li>Materials development </li></ul><ul><ul><li>How do native/expert speakers actually use the target language? </li></ul></ul><ul><ul><li>What drives sequencing? </li></ul></ul><ul><li>Instructional activities </li></ul><ul><ul><li>Example - link </li></ul></ul><ul><li>Data-driven learning tools </li></ul><ul><ul><li>KWICionary - link </li></ul></ul>
    50. 51. Research on Data-Driven Learning <ul><li>Vocabulary Acquisition: improved through the use of concordances (Steven, 1991; Cobb, 1997) </li></ul><ul><li>Writing Instruction: students can correct their own errors with concordances (Gaskell & Cobb, 2004; Ross & Payne, 2005) </li></ul>
    51. 52. Pedagogical Issues for DDL <ul><li>Learning a new way to learn language </li></ul><ul><li>Relationship between proficiency level and data-driven learning approach </li></ul><ul><li>Should frequency of use drive materials development? </li></ul>
    52. 53. Next Generation Corpus Tools <ul><li>Text files => relational databases </li></ul><ul><ul><li>Storing data as smallest atomic unit </li></ul></ul><ul><ul><li>Associate extensive meta-data with each data entry </li></ul></ul><ul><li>Application-based => web-based </li></ul><ul><ul><li>Promote aggregation and sharing of data </li></ul></ul><ul><ul><li>Location-independent collaborative research </li></ul></ul><ul><li>Integration with online learning environments </li></ul><ul><li>Online Corpus Analytic Tool (OCAT) </li></ul>
    53. 54. OCAT Design <ul><li>Relational database backend </li></ul><ul><li>Extensive meta-data can be assigned to each data entry. </li></ul><ul><li>Multiple corpora can be linked and meta-data fields aligned to create meta-corpora. </li></ul><ul><li>Dynamic sub-corpora </li></ul><ul><li>Users can create corpora and upload data via a web interface. </li></ul><ul><li>Location-independent collaborative research </li></ul><ul><li>Concordance query, frequency lists, Mean Tokens per Learner </li></ul><ul><li>Data visualization techniques </li></ul>
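A relational backend of the kind described here can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical and are not OCAT's actual schema; the sketch only illustrates storing tokens as the smallest unit, attaching meta-data to each entry, querying a dynamic sub-corpus, and computing mean tokens per learner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    learner_id TEXT NOT NULL,
    activity TEXT NOT NULL,
    produced_on TEXT NOT NULL      -- ISO date
);
CREATE TABLE token (
    entry_id INTEGER NOT NULL REFERENCES entry(id),
    position INTEGER NOT NULL,     -- order of the token within the entry
    form TEXT NOT NULL
);
""")


def add_entry(learner_id, activity, produced_on, text):
    """Store one text as an entry row plus one row per token."""
    cur = conn.execute(
        "INSERT INTO entry (learner_id, activity, produced_on) VALUES (?, ?, ?)",
        (learner_id, activity, produced_on),
    )
    conn.executemany(
        "INSERT INTO token (entry_id, position, form) VALUES (?, ?, ?)",
        [(cur.lastrowid, i, tok.lower()) for i, tok in enumerate(text.split())],
    )


# Invented sample data, for illustration only.
add_entry("s01", "chat", "2005-09-12", "you want to try again")
add_entry("s01", "email", "2005-09-19", "please try again")
add_entry("s02", "chat", "2005-09-12", "you must try")

# Frequency list over a dynamic sub-corpus (here: chat entries only).
chat_freq = dict(conn.execute("""
    SELECT t.form, COUNT(*) FROM token t
    JOIN entry e ON e.id = t.entry_id
    WHERE e.activity = 'chat'
    GROUP BY t.form
"""))

# Mean tokens per learner across the whole corpus.
(mean_tokens,) = conn.execute("""
    SELECT AVG(n) FROM (
        SELECT COUNT(*) AS n FROM token t
        JOIN entry e ON e.id = t.entry_id
        GROUP BY e.learner_id
    )
""").fetchone()
```

Because sub-corpora are defined by a `WHERE` clause on the meta-data rather than by separate files, the same stored tokens can serve many different comparisons.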
    54. 55. Assessing Language Development
    55. 56. Assessing Language Development
    56. 57. Assessing Language Development