Sacodeyl Birmingham 2007

Published in: Education, Technology


  • 1. Spoken multimedia corpora for pedagogical purposes Sabine Braun (University of Surrey) Pascual Pérez-Paredes (Universidad de Murcia) Ylva Berglund (Oxford University) Birmingham Corpus Linguistics Conference 2007
  • 2. Introduction
    • The usefulness of corpora in language pedagogy is widely recognised.
    • But there is a need for pedagogically relevant corpora, reflected e.g. in initiatives to create 'ad-hoc' corpora in pedagogical contexts.
    • The creation of pedagogically relevant corpora raises challenges for corpus design.
    • Past and current initiatives have largely focussed on written corpora; spoken discourse is becoming more important in pedagogical contexts.
    • The creation of pedagogically relevant spoken corpora raises additional challenges for corpus design.
  • 3. The challenges (1)
    • CORPUS DESIGN: traditional reference corpora (content, size, data format, transcription, annotation, query)
    • CORPUS EXPLOITATION: Data-Driven Learning (focus on non-linear reading: concordances and co-texts)
    • Corpora contain textual records of discourse; their interpretation requires (re-)contextualisation.
    • Learners may have difficulties analysing corpus data; they require pedagogical mediation.
    • Pedagogical corpus uses differ from linguistic description; this requires e.g. pedagogically motivated query options.
    • Corpora need to be integrated with curricula; this requires e.g. complementarity of content and effective delivery.
    • Traditional corpus design and exploitation do not fully support these pedagogical requirements.
  • 4. The challenges (2)
    • CORPUS DESIGN: traditionally, representation in written format
    • CORPUS EXPLOITATION: work with text-only data and e.g. conversational markup
    • Spoken discourse is more dependent on shared physical contexts.
    • It is adjusted to aural and online perception (e.g. chunking).
    • It is affected by limitations of processing capacity (false starts, repair).
    • It is marked by accents.
    • It is multimodal.
    • Again, written representation and text-only exploitation do not fully support pedagogical requirements.
  • 5. Requirements
    • Format: multimedia to retain multimodal character of spoken language
    • Content: complementary with curriculum topics, more coherence than in traditional corpora
    • Pedagogically motivated transcription, annotation and alignment (transcript-video)
    • Combination of query methods: text-based exploration and application of corpus techniques
    • Pedagogical enrichment of corpora with complementary resources (e.g. exercises, explanations)
    • Effective delivery of corpora and additional resources to learners/teachers
  • 6. Corpus creation (1)
    • Examples: ELISA and SACODEYL
      • ELISA: professional English; accounts of professional life; different varieties
      • SACODEYL: 7 European languages; youth language corpora; speakers aged 13-15 and 16-18
    • Interview format
    • Video clips with transcript
    • Communicatively relevant topics, e.g. in SACODEYL topics outlined in the Common European Framework
    • Elicitation process: briefing informants and prompting them during the interview, ensuring naturally flowing discourse
  • 7. Corpus creation (1): example of topics in SACODEYL
    Columns: Topic | Interview question | Age | CEF descriptor | Grammatical functions
    • Holidays | Where did you spend your last holidays? | 13-15, 16-18 | A2: can describe past activities, personal experiences | Past tense
    • Holidays | What are your plans for the next holidays? | 13-15 | B1: can describe dreams, hopes and ambitions | Future, conditional, modal verbs
    • Plans for the future | What are your plans for your career? | 16-18 | B1: can explain/give reasons for my plans, intentions and actions | Future
    • Plans for the future | On what grounds do you decide? | 16-18 | B2: can speculate about causes, consequences, hypothetical situations | Conditional, modal verbs
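The topic grid on this slide lends itself to a simple record structure that a teacher or tool could filter by age band, CEF level or grammatical function. The following is a minimal sketch only: the class name `TopicEntry`, the helper `questions_for` and the way the rows are grouped are illustrative assumptions, not part of the SACODEYL tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicEntry:
    """One row of the topic grid (hypothetical structure)."""
    topic: str
    question: str
    ages: tuple        # age bands the question is aimed at
    cef_level: str     # Common European Framework level
    descriptor: str    # CEF "can do" descriptor
    grammar: tuple     # grammatical functions the question may elicit


# The four rows as read from the slide.
GRID = [
    TopicEntry("Holidays", "Where did you spend your last holidays?",
               ("13-15", "16-18"), "A2",
               "can describe past activities, personal experiences",
               ("Past tense",)),
    TopicEntry("Holidays", "What are your plans for the next holidays?",
               ("13-15",), "B1",
               "can describe dreams, hopes and ambitions",
               ("Future", "Conditional", "Modal verbs")),
    TopicEntry("Plans for the future", "What are your plans for your career?",
               ("16-18",), "B1",
               "can explain/give reasons for my plans, intentions and actions",
               ("Future",)),
    TopicEntry("Plans for the future", "On what grounds do you decide?",
               ("16-18",), "B2",
               "can speculate about causes, consequences, hypothetical situations",
               ("Conditional", "Modal verbs")),
]


def questions_for(grid, age=None, cef_level=None, grammar=None):
    """Return the questions that match every filter that is not None."""
    hits = []
    for entry in grid:
        if age is not None and age not in entry.ages:
            continue
        if cef_level is not None and entry.cef_level != cef_level:
            continue
        if grammar is not None and grammar not in entry.grammar:
            continue
        hits.append(entry.question)
    return hits
```

For example, `questions_for(GRID, age="16-18", cef_level="B1")` selects the career question, which is the kind of lookup a teacher might use to match interview sections to a curriculum unit.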
  • 8. Corpus creation (2)
    • Continuum: raw, orthographic transcription – annotated corpora
    • Diagram labels: transcription, markup, pedagogic annotation, XML files, TEI-compliant corpora
  • 9. Corpus creation (2)
    • Tools: SACODEYL TRANSCRIPTOR, SACODEYL ANNOTATOR
    • Diagram labels: transcription, markup, pedagogic annotation, XML files, TEI-compliant corpora
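To make the step from transcription to XML files more concrete, the sketch below builds a small TEI-style tree from speaker turns grouped into topic sections. It is an illustration under stated assumptions: the element names (`teiHeader`, `fileDesc`, `titleStmt`, `div`, `u`, `who`) follow general TEI conventions for spoken texts, but the actual schema produced by the SACODEYL tools is not shown on the slides, the function `build_tei` is hypothetical, and the informant's reply is invented. The result is not a validated TEI document (no namespace, no complete header).

```python
import xml.etree.ElementTree as ET


def build_tei(title, sections):
    """Build a TEI-style tree.

    sections: list of (topic, turns) pairs
    turns:    list of (speaker_id, utterance_text) pairs
    """
    tei = ET.Element("TEI")
    header = ET.SubElement(tei, "teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = title

    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    for topic, turns in sections:
        # One <div> per interview section; the topic label stands in
        # for the pedagogic annotation layer.
        div = ET.SubElement(body, "div", {"type": "section", "n": topic})
        for speaker, utterance in turns:
            ET.SubElement(div, "u", {"who": "#" + speaker}).text = utterance
    return tei


# Invented sample data: the question comes from the topic grid,
# the reply is made up for illustration.
sections = [
    ("Holidays", [
        ("E", "Where did you spend your last holidays?"),
        ("Chico", "I went to the coast with my family."),
    ]),
]
tree = build_tei("Sample interview", sections)
xml_text = ET.tostring(tree, encoding="unicode")
```

Keeping the topic label on the section element means a later query step can collect all sections on one topic across interviews without re-reading the transcripts.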
  • 10. Corpus creation (3) SACODEYL TRANSCRIPTOR
  • 11. Corpus creation (2)
    • [METADATA]
    • Title: La Unión Europea une a los ciudadanos
    • Date Recording:2006-11-05
    • Date Transcription:2007-02-02
    • Locale:I.E.S. Floridablanca,Murcia, España
    • Principal Investigator: Pascual Perez-Paredes
    • Researcher:Pascual Perez-Paredes
    • Transcriber: Encarnación Tornero Valero
    • Editor:
    • Autority: SACODEYL Project
    • ID:
    • Language:ES
    • MediaFileName:ES02.avi
    • Participants:
      • person:Chico name: role: Entrevistado sex: Hombre age: 16 description:
      • person: E name: Andrés Mercader Rodríguez role: Entrevistador sex: Hombre age: 32 description:
    • [/METADATA]
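A header block like the one above is easy to read into a dictionary for indexing or export. The sketch below assumes one "Key: value" field per line between the `[METADATA]` and `[/METADATA]` markers; the real file layout used by the SACODEYL Transcriptor is not documented on the slide, and the function name `parse_metadata` is hypothetical. The nested participant records are left out for simplicity.

```python
def parse_metadata(text):
    """Collect 'Key: value' lines found between [METADATA] and [/METADATA]."""
    fields = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[METADATA]":
            inside = True
        elif line == "[/METADATA]":
            break
        elif inside and ":" in line:
            # Split at the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


# A few fields copied from the slide, one per line (layout assumed).
sample = """[METADATA]
Title: La Unión Europea une a los ciudadanos
Date Recording:2006-11-05
Date Transcription:2007-02-02
Language:ES
MediaFileName:ES02.avi
Editor:
[/METADATA]"""

meta = parse_metadata(sample)
```

Empty fields such as `Editor:` come back as empty strings, so a later step can tell "field present but unfilled" apart from "field missing".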
  • 18. Corpus query
    • Query options will support text- and corpus-based exploration and include e.g.
      • Easy access to entire interviews
      • A topic index supporting the analysis of similar sections across interviews ("topic concordances")
      • Other indices based on the annotation categories
      • Ready-made data (e.g. frequency lists of each interview; selective concordances)
      • A concordancer for extended/advanced search; adapted to pedagogical requirements
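The query options listed on this slide can be sketched in a few lines: a frequency list per interview, a keyword-in-context (KWIC) concordance, and a topic index that gathers sections on the same topic across interviews. This is a minimal illustration, not the project's query interface: all function names, the interview IDs and the sample sentences are invented, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case word tokens; a crude, Unicode-aware tokenizer."""
    return re.findall(r"\w+", text.lower())


def frequency_list(text):
    """Word frequencies, most frequent first."""
    return Counter(tokenize(text)).most_common()


def kwic(text, keyword, width=3):
    """Keyword-in-context lines as (left context, keyword, right context)."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def topic_index(interviews):
    """Map each topic to the (interview_id, section_text) pairs that cover it.

    interviews: {interview_id: [(topic, section_text), ...]}
    """
    index = defaultdict(list)
    for interview_id, sections in interviews.items():
        for topic, section_text in sections:
            index[topic].append((interview_id, section_text))
    return dict(index)


# Invented sample data for illustration only.
interviews = {
    "EN01": [
        ("Holidays", "Last year we went to Spain and we went to the beach every day."),
        ("Plans for the future", "I want to study music."),
    ],
    "EN02": [
        ("Holidays", "We went camping in Wales."),
    ],
}

index = topic_index(interviews)
holiday_hits = kwic(interviews["EN01"][0][1], "went", width=2)
```

Here `index["Holidays"]` brings together the holiday sections of both interviews, which is the idea behind a "topic concordance", while `kwic` gives the conventional word-level view of the same material.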
  • 20. Pedagogical enrichment
    • The corpora will be enriched with prototypical learning activities.
    • These will focus on a single interview section, on one interview as a whole, or on sections across interviews.
    • They will include e.g.
      • linguistic and cultural explanations and exercises (form-focussed as well as communication-oriented),
      • (listening) comprehension and production tasks,
      • explorative tasks (concordance-based as well as interview-based).
    • Use of the authoring tool Telos Language Partner to create learning packages with a range of activities.
  • 25. Corpus delivery
    • Effective delivery as a further prerequisite for integration into curriculum
    • In SACODEYL, use of Moodle learning platform, giving access to:
      • Corpora (query interfaces)
      • Resources created in the project (different types of learning activities)
      • Resources created by future corpus users
  • 26. Summary
    • Method outlined is transferable to other pedagogical contexts, topics, languages
    • Method helps to use corpora more efficiently in pedagogical contexts – from sporadically used resource to systematic exploitation
    • Corpus creation complies with standards to facilitate reuse of corpora for other contexts (research)
  • 27. Contact
    • Sabine Braun: [email_address]
    • Pascual Pérez-Paredes: [email_address]
    • Ylva Berglund: [email_address]
    • And visit our poster session…
    • As well as our websites: