Sacodeyl Birmingham 2007
http://www.um.es/sacodeyl

Published in: Education, Technology

Transcript

  • 1. Spoken multimedia corpora for pedagogical purposes Sabine Braun (University of Surrey) Pascual Pérez-Paredes (Universidad de Murcia) Ylva Berglund (Oxford University) Birmingham Corpus Linguistics Conference 2007
  • 2. Introduction
    • The usefulness of corpora in language pedagogy is widely recognised.
    • But there is a need for pedagogically relevant corpora, reflected e.g. in initiatives to create 'ad-hoc' corpora in pedagogical contexts.
    • The creation of pedagogically relevant corpora raises challenges for corpus design.
    • Past and current initiatives have largely focussed on written corpora; spoken discourse is becoming more important in pedagogical contexts.
    • The creation of pedagogically relevant spoken corpora raises additional challenges for corpus design.
  • 3. The challenges (1)
    • CORPUS DESIGN: Traditional reference corpora (content, size, data format, transcription, annotation, query)
    • CORPUS EXPLOITATION: Data-Driven Learning (focus on non-linear reading: concordances and co-texts)
    • Corpora contain textual records of discourse; their interpretation requires (re-)contextualisation.
    • Learners may have difficulties analysing corpus data; they require pedagogical mediation.
    • Pedagogical corpus uses differ from linguistic description; this requires e.g. pedagogically motivated query options.
    • Corpora need to be integrated with curricula; this requires e.g. complementarity of content and effective delivery.
    Traditional corpus design and exploitation do not fully support these pedagogical requirements.
  • 4. The challenges (2)
    • CORPUS DESIGN: Traditionally, representation in written format
    • CORPUS EXPLOITATION: Work with text-only data and e.g. conversational markup
    • Spoken discourse is more dependent on shared physical contexts.
    • It is adjusted to aural and online perception (e.g. chunking).
    • It is affected by limitations of processing capacity (false starts, repair).
    • It is marked by accents.
    • It is multimodal.
    Again, this does not fully support pedagogical requirements.
  • 5. Requirements
    • Format: multimedia to retain multimodal character of spoken language
    • Content: complementary with curriculum topics, more coherence than in traditional corpora
    • Pedagogically motivated transcription, annotation and alignment (transcript-video)
    • Combination of query methods: text-based exploration and application of corpus techniques
    • Pedagogical enrichment of corpora with complementary resources (e.g. exercises, explanations)
    • Effective delivery of corpora and additional resources to learners/teachers
  • 6. Corpus creation (1)
    • ELISA
      • Professional English
      • Accounts of professional life
      • Different varieties
    • SACODEYL
      • 7 European languages
      • Youth language corpora
      • Speakers 13-15 and 16-18
    • Examples: ELISA and SACODEYL
      • Interview format
      • Video clips with transcript
      • Communicatively relevant topics, e.g. in SACODEYL topics outlined in the Common European Framework
      • Elicitation process: briefing informants and prompting them during the interview, ensuring naturally flowing discourse
  • 7. Corpus creation (1): Example of topics in SACODEYL
    • Topic: Holidays
      • Interview question: Where did you spend your last holidays? Age: 13-15, 16-18. CEF: A2, can describe past activities, personal experiences. Grammatical functions: past tense.
      • Interview question: What are your plans for the next holidays? Age: 13-15. CEF: B1, can describe dreams, hopes and ambitions. Grammatical functions: future, conditional, modal verbs.
    • Topic: Plans for the future
      • Interview question: What are your plans for your career? Age: 16-18. CEF: B1, can explain/give reasons for my plans, intentions and actions. Grammatical functions: future.
      • Interview question: On what grounds do you decide? Age: 16-18. CEF: B2, can speculate about causes, consequences, hypothetical situations. Grammatical functions: conditional, modal verbs.
  • 8. Corpus creation (2)
    • Continuum: from raw, orthographic transcription to annotated corpora
    • Transcription
    • Markup
    • Pedagogic annotation
    • XML files, TEI-compliant corpora
  • 9. Corpus creation (2)
    • Tools: SACODEYL TRANSCRIPTOR, SACODEYL ANNOTATOR
    • Transcription
    • Markup
    • Pedagogic annotation
    • XML files, TEI-compliant corpora
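The slides above name XML files and pedagogic annotation but do not show the markup itself. The following is a minimal sketch of how topic-annotated interview sections stored as XML could be grouped by topic. The element and attribute names (`corpus`, `div`, `u`, `topic`, `interview`, `who`) and the informant answers are hypothetical and chosen only for illustration; they are not the SACODEYL schema or data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified markup: one <div> per interview section,
# one <u> per speaker turn. Not the project's actual schema.
SAMPLE = """<corpus>
  <div topic="holidays" interview="A">
    <u who="interviewer">Where did you spend your last holidays?</u>
    <u who="informant">We went to the coast.</u>
  </div>
  <div topic="plans" interview="A">
    <u who="interviewer">What are your plans for your career?</u>
    <u who="informant">I would like to be a teacher.</u>
  </div>
  <div topic="holidays" interview="B">
    <u who="interviewer">Where did you spend your last holidays?</u>
    <u who="informant">I stayed at home.</u>
  </div>
</corpus>"""


def topic_index(xml_text):
    """Group interview sections by their topic annotation."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for div in root.iter("div"):
        # Collect (speaker, text) pairs for every turn in this section.
        turns = [(u.get("who"), (u.text or "").strip()) for u in div.iter("u")]
        index[div.get("topic")].append((div.get("interview"), turns))
    return dict(index)


index = topic_index(SAMPLE)
for interview, turns in index["holidays"]:
    print(interview, turns[1][1])
```

Grouping sections this way puts the answers to the same question from different interviews side by side.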
  • 10. Corpus creation (3) SACODEYL TRANSCRIPTOR
  • 11. Corpus creation (2)
    • [METADATA]
    • Title: La Unión Europea une a los ciudadanos
    • Date Recording: 2006-11-05
    • Date Transcription: 2007-02-02
    • Locale: I.E.S. Floridablanca, Murcia, España
    • Principal Investigator: Pascual Perez-Paredes
    • Researcher: Pascual Perez-Paredes
    • Transcriber: Encarnación Tornero Valero
    • Editor:
    • Autority: SACODEYL Project
    • ID:
    • Language: ES
    • MediaFileName: ES02.avi
    • Participants:
      • person: Chico
        • name:
        • role: Entrevistado
        • sex: Hombre
        • age: 16
        • description:
      • person: E
        • name: Andrés Mercader Rodríguez
        • role: Entrevistador
        • sex: Hombre
        • age: 32
        • description:
    • [/METADATA]
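The metadata header on this slide is a block of `Key: value` fields between `[METADATA]` and `[/METADATA]`. Below is a minimal sketch of reading the top-level fields, assuming one field per line. The function name `parse_metadata` is hypothetical, the nested participant fields are deliberately not handled, and this is not the SACODEYL Transcriptor's own code.

```python
def parse_metadata(text):
    """Parse a [METADATA] ... [/METADATA] block into a dict.

    Assumption: one "Key: value" pair per line. Participant
    sub-fields are out of scope for this sketch.
    """
    fields = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[METADATA]":
            inside = True
            continue
        if line == "[/METADATA]":
            break
        if not inside or ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# Field values copied from the slide; the line layout is an assumption.
sample = """[METADATA]
Title: La Unión Europea une a los ciudadanos
Date Recording:2006-11-05
Language:ES
MediaFileName:ES02.avi
Editor:
[/METADATA]"""

meta = parse_metadata(sample)
print(meta["Title"], meta["MediaFileName"])
```

Empty fields such as `Editor:` are kept as empty strings, so a missing value can be told apart from a missing key.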
  • 12.  
  • 13.  
  • 14.  
  • 15.  
  • 16.  
  • 17.  
  • 18. Corpus query
    • Query options will support text- and corpus-based exploration and include e.g.
      • Easy access to entire interviews
      • A topic index supporting the analysis of similar sections across interviews ("topic concordances")
      • Other indices based on the annotation categories
      • Ready-made data (e.g. frequency lists of each interview; selective concordances)
      • A concordancer for extended/advanced search; adapted to pedagogical requirements
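Two of the query options listed above, frequency lists and a concordancer, can be sketched in a few lines. This is a simplified illustration, not the project's query tool. The function names, the regular-expression tokeniser and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive tokeniser)."""
    return re.findall(r"\w+", text.lower())


def frequency_list(text, top=5):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokenize(text)).most_common(top)


def kwic(text, keyword, width=3):
    """Keyword-in-context lines: (left context, keyword, right context)."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, not taken from any corpus.
interview = "Last year I went to Spain. I went with my family and we went swimming."

print(frequency_list(interview, top=1))
for left, key, right in kwic(interview, "went", width=2):
    print(f"{left:>12} [{key}] {right}")
```

A pedagogical concordancer would add at least sentence-aware context and links back to the video clip; the sketch shows only the core lookup.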
  • 19. Corpus query
  • 20. Pedagogical enrichment
    • The corpora will be enriched with prototypical learning activities.
    • These will focus on one interview section, one interview as a whole, or sections across interviews…
    • They will include e.g.
      • linguistic and cultural explanations and exercises (form-focussed as well as communication-oriented),
      • (listening) comprehension and production tasks,
      • explorative tasks (concordance-based as well as interview-based).
    • Use of the authoring tool Telos Language Partner to create learning packages with a range of activities.
  • 21. Pedagogical enrichment
  • 22. Pedagogical enrichment
  • 23. Pedagogical enrichment
  • 24. Pedagogical enrichment
  • 25. Corpus delivery
    • Effective delivery as a further prerequisite for integration into curriculum
    • In SACODEYL, use of Moodle learning platform, giving access to:
      • Corpora (query interfaces)
      • Resources created in the project (different types of learning activities)
      • Resources created by future corpus users
  • 26. Summary
    • Method outlined is transferable to other pedagogical contexts, topics, languages
    • Method helps to use corpora more efficiently in pedagogical contexts – from sporadically used resource to systematic exploitation
    • Corpus creation complies with standards to facilitate reuse of corpora for other contexts (research)
  • 27. Contact
    • Sabine Braun: [email_address]
    • Pascual Pérez-Paredes: [email_address]
    • Ylva Berglund: [email_address]
    • And visit our poster session…
    • As well as our websites:
    • www.um.es/sacodeyl
    • www.corpora4learning.net/elisa
