Developing Teaching Materials
with Authentic Data and Corpus
Analysis Tools

  Hongyin Tao
  University of California, Los Angeles

  tao@humnet.ucla.edu
The Project
 • One of the National Resource Center
  projects at Penn State University
 • Center for Advanced Language
  Proficiency Education and Research
  (CALPER), Penn State University
 • Supported by the US Department of
  Education
 • http://calper.la.psu.edu/chinese.php
Goals

 • This project is developing teaching materials
  for advanced learners based on a collection of
  authentic examples of contemporary spoken
  Chinese.
 • The materials will highlight some of the
  interactive aspects of spoken Chinese, including
  features such as topic transition, assessments,
  repairs, linking, and acknowledgement.
 • In addition, the materials will be useful for
  teaching grammar points, such as the use of the
  particle le, from a discourse perspective.
BACKGROUND

   • The vast majority of spoken language teaching materials
    available for learners of Chinese are based either on
    constructed sentences or on assumed features of
    spoken language. Rarely do they rely on naturally occurring
    spoken language.
   • Discourse linguists, however, have shown that there are
    fundamental and systematic differences between written
    and spoken language. Hence, examining natural
    conversation offers important insights into the workings of
    spoken language.
   • Example: discourse analyses show that spoken Chinese
    makes heavy use of ellipsis, verb-less constructions,
    discourse markers, formulations, backchannels, and so
    forth. These are much less frequent in written discourse
    and, as a result, are rarely made explicit in second language
    instruction.
ACTIVITIES (I): Research Database

 • Collection of conversational Mandarin
  Chinese: over 60 hours, transcribed into
  300,000 words.
 • The data come from speakers in settings
  such as discussing readings, narrating stories
  based on past experience, talking to each
  other while playing games, talking about
  movies, talking on campus tours, conversing
  at dinner parties, and talking while shopping
  at a farmers' market.
ACTIVITIES (II): Materials

 • A practical guide to selected features of
  natural spoken Chinese, which will highlight
  important interactive aspects of the spoken
  language and features of spoken grammar.
 • The materials are aimed at students who
  have had at least 300 hours of instruction in
  Chinese but will also be valuable for
  teachers who would like to use them with
  somewhat less advanced students.
ACTIVITIES (III): Other

 • Workshops: Intensive workshops have
  been conducted for (K-12 and college)
  teachers of Mandarin Chinese on ways
  to improve language teaching
  with natural discourse data.
 • Software: A suite of software tools called
  A Corpus Worker’s Toolkit (ACWT) has been
  developed for data processing and analysis.
 • Publications: Papers and a Chinese Corpus
  Resource Guide.
Material Development and Use

 • Select segments from the transcriptions
  based on genre types and on linguistic and
  discourse-pragmatic features;
 • Clean up the transcriptions;
 • Use various text analysis computer
  programs to process the data.
Structural Features

 • conditionals; extended coordination;
  verbal classifiers; clause linking
  devices; inclusive and exclusive
  pronominals; complex interrogatives;
  ba constructions; bei constructions;
  aspect markers; syntactic
  constructions (e.g. verb
  complements); and relative clauses.
Discourse Pragmatic and
Sociolinguistic Features

 • story opening; disagreement in
  strategic forms; topic transition;
  making and soliciting assessments;
  repairs; checking for clarification;
  linking discourse units;
  acknowledgement and response to
  the previous speaker; expressing
  personal emotions and stances; and
  evidential marking.
Use of Technology

 • Chinese annotation tools for code
  conversion, Romanization, etc.
 • Tokenization
 • Vocabulary/Word List
 • Concordance
 • Web/hyperlink presentation
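
The word list and concordance steps above can be illustrated with a minimal Python sketch. It is not the ACWT implementation; the function names `word_list` and `concordance` and the sample utterance are hypothetical, invented for illustration. It assumes the text has already been tokenized (word segmentation for Chinese is a separate step) and shows a frequency list and a keyword-in-context view for the particle 了 (le).

```python
from collections import Counter


def word_list(tokens):
    """Return (word, count) pairs sorted by descending frequency."""
    return Counter(tokens).most_common()


def concordance(tokens, keyword, width=2):
    """Return keyword-in-context lines with `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical, already-tokenized sample (invented, not from the project corpus)
tokens = "我 吃 了 饭 了 他 也 吃 了".split()

for word, count in word_list(tokens):
    print(word, count)

for line in concordance(tokens, "了"):
    print(line)
```

Sorting hits by the token to the left or right of the keyword is a common next step, since it groups recurring patterns together for classroom discussion.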
Sample Online Materials

 • http://calper.la.psu.edu/chinese.php

 • http://www.alc.ucla.edu/Chinese/CALPER/
Summary

 • Highlight differences between spoken and
  written language and different types of
  communicative strategies;
 • Maximize learner (and teacher) exposure to
  real language use;
 • Expose learners to the ‘current’ state of the
  target language;
 • Emphasize both linguistic and pragmatic
  competences;
 • Emphasize both language and cultural
  knowledge.
