SPinTX Corpus-to-Classroom:
A Teacher-Centered Pedagogical Interface for
the Spanish in Texas Corpus
Barbara E. Bullock, Almeida Jacqueline
Toribio, Rachael Gilg, Martí Quixal & Arthur
Wendorf
Who we are
• Barbara E. Bullock & Almeida Jacqueline Toribio
• Project Directors / Sociolinguistics Researchers
• Rachael Gilg
• Project Manager / Web Developer
• Arthur Wendorf
• Corpus Linguist / Developer
• Martí Quixal
• Computational Linguist / Developer
• Carl Blyth
• Director of COERLL
Agenda
• Part 1: Introduction to the Corpus-to-Classroom Project
• Part 2: Project Results
• The SpinTX Video Archive: a pedagogically-friendly interface to the
Spanish in Texas Corpus
• Involving teachers in the development of open educational
resources
• A model for open source corpus development
Corpus-to-Classroom
Corpora in the Classroom: the promise
• Corpus: a large, structured collection of language
• Benefits:
• Naturalistic language use
• Motivation
• “Real” language
• Discovery learning
• Examples:
Corpora in the Classroom: the reality
• Large linguistic corpora are of limited utility to untrained
end users.
• Designed for researchers, not educators.
• Collections such as YouTube are popular for language
classes, but can present problems
• Searching for appropriate content is time-consuming using
available search methods.
• Content is not necessarily openly licensed and can disappear
without warning.
Our two-pronged approach
Spanish in Texas Corpus Project
A project of COERLL, a National Foreign Language
Resource Center (2010-2014)
• Video interviews provide rich content
SpinTX: Corpus-to-Classroom Project
Grant from the University of Texas Longhorn
Innovation Fund for Technology (2012-2013)
• Collection of pre-selected, corrected, annotated
clips from the larger corpus
• Open-source, pedagogically-friendly search and
authoring tools
Spanish in Texas Corpus: Goals
• To make publicly available authentic data about
variation in Spanish as spoken in Texas
• for education
• for research
• Encourage teachers/students/public to view local
varieties as a resource
Corpus-to-Classroom: Goals
• develop a pedagogically friendly interface for using
the Spanish in Texas corpus
• involve teachers and learners, via crowd-sourcing,
social networking, and workshops, in the
development of open educational resources
• create a model for using open source tools and a
pedagogical interface that can be adapted for any
language corpus collection
Corpus Overview
Spanish in Texas corpus
• Approx. 92 videos of sociolinguistic interviews (avg.
30–45 min)
• Transcribed (approx. 600,000 words)
• Time-synced video caption files
• Tagged for linguistic features
SpinTX Video Archive corpus
• Approx. 327 video clips from 33 speakers (avg. 1-4
min)
• Transcribed (approx. 80,000 words)
• Time-synced video caption files
• Tagged for linguistic and pedagogical features
• Completely open (no registration required, open CC
license)
• Teacher-friendly interface
Corpus Tagging: Basic
• Time-synced captions
• Part-of-speech tags (dual language)
• POS
• POS, simplified
• Gender
• Tense
• Aspect
• Mood
• Speaker identification
• Age
• Gender
• Region
Corpus Tagging: Pedagogical
• Topics (manually added)
• Automatic tags using custom rulesets
• Grammatical
• aggregated from textbooks
• Pragmatics
• discourse markers, place holders (“este”), attenuators
• Vocabulary
• concept words
• Functional (planned)
• greetings, ask for help, express opinions
• Bilingual forms (planned)
• CS, loans, loan translations
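The automatic tags above could be produced by rules along the following lines. This is only a minimal sketch: the tag names, the regular expressions, and the `tag_segment` helper are hypothetical and are not taken from the project's actual rulesets.

```python
import re

# Hypothetical ruleset: tag name -> pattern applied to lower-cased caption text.
# A bare regex cannot tell placeholder "este" from the demonstrative "este";
# a real rule would also need part-of-speech context.
RULES = {
    "pragmatics:placeholder": re.compile(r"\beste\b"),
    "grammar:por_vs_para": re.compile(r"\b(por|para)\b"),
    "grammar:gustar": re.compile(r"\bgust(a|an|ó|aba|aban)\b"),
}


def tag_segment(text):
    """Return the sorted list of pedagogical tags whose rule matches the text."""
    lowered = text.lower()
    return sorted(name for name, pattern in RULES.items() if pattern.search(lowered))


if __name__ == "__main__":
    print(tag_segment("Y este... me gusta ir para la casa"))
```

Keeping the rules in a plain dictionary would let teachers or textbook-derived lists extend the tag set without touching the matching code.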
Interview Metadata
Original Transcript (from Automatic Sync)
Upload Video and Transcript to YouTube
Review Transcript in Google Docs
Download SRT file
Prepare Transcript for TreeTagger
Run through TreeTagger
Combine Data from SRT File and
TreeTagger File, and add additional Tags
Divide CSV Files and Videos into Clips and
adjust Timings and Numberings
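The "combine" step in the workflow above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: it assumes the tagger output has one token per line as word, tag, and lemma separated by tabs, and that a simple regular expression splits each caption into the same tokens the tagger saw. The function names and CSV column names are hypothetical.

```python
import csv
import io
import re

TIME_LINE = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text):
    """Yield (cue_number, start, end, text) for each caption block in an SRT file."""
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        match = TIME_LINE.match(lines[1])
        if not match:
            continue
        yield int(lines[0]), match.group(1), match.group(2), " ".join(lines[2:])


def parse_tagged(tagged_text):
    """Read tagger output lines of the assumed form word<TAB>tag<TAB>lemma."""
    rows = []
    for line in tagged_text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def combine(srt_text, tagged_text):
    """Attach each tagged token to the caption (and timing) it came from."""
    tokens = iter(parse_tagged(tagged_text))
    rows = []
    for number, start, end, text in parse_srt(srt_text):
        # Assumption: this regex tokenises the caption the way the tagger did.
        for _ in re.findall(r"\w+|[^\w\s]", text):
            token = next(tokens, None)
            if token is None:
                return rows
            rows.append((number, start, end) + token)
    return rows


def to_csv(rows):
    """Serialise the combined rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["cue", "start", "end", "word", "pos", "lemma"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Because every row carries its caption's start and end time, a later step could divide the CSV into clips by filtering on those timestamps.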
The SpinTX Video Archive: a
pedagogically-friendly interface
to the Spanish in Texas Corpus
Needs assessment: teacher interviews
• How do you use authentic video in your teaching?
• Describe searches you have done in the past for video
content. What were you looking for and were you able to
find it?
• How can you imagine using clips from the Spanish in
Texas video corpus in your classes?
Needs assessment results: primary goals
• Enable teachers to easily find videos that suit the
curriculum/work plan
• Search by grammar, theme, vocabulary, etc.
• Provide open, non-ephemeral content
• Downloadable from open site with a license enabling remixing
• Curating sets of videos for comparison and study
• Favoriting and tagging videos
• Provide access to supporting materials.
• Creating a “community of practice” around the videos so materials
can be shared among educators.
Needs assessment results: secondary goals
• Materials for teacher trainers
• Teachers of heritage learners can learn about local variation
• Video recording as a cross-competence task
• Interviews collected by students can be contributed to the corpus
Ideas for future development
• Advanced search capability
• support for wildcards
• improved phrase searching
• improved “keyword in context” result view
• Data visualizations
• word and/or tag clouds
• language maps
• Enhanced word-level annotations
• hover over a word in a transcript and see all annotations
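As an illustration of the search ideas above, wildcard matching and a "keyword in context" view might be combined as in this minimal sketch. The `kwic` function and its parameters are hypothetical; it uses shell-style wildcards from Python's standard `fnmatch` module, which is one possible choice and not necessarily what the interface would adopt.

```python
import fnmatch


def kwic(tokens, pattern, window=3):
    """Return (left context, keyword, right context) for tokens matching a wildcard pattern."""
    hits = []
    wanted = pattern.lower()
    for i, token in enumerate(tokens):
        # fnmatchcase applies shell-style wildcards: * matches any run of characters.
        if fnmatch.fnmatchcase(token.lower(), wanted):
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


if __name__ == "__main__":
    words = "mi mamá hablaba español y yo hablo inglés".split()
    for left, keyword, right in kwic(words, "habl*", window=2):
        print(f"{left:>12} | {keyword} | {right}")
```

A pattern such as `habl*` would let a teacher collect several forms of one verb in a single search and compare them side by side.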
Formative evaluation of Beta version
Data collection methods:
• Online user survey
• Web analytics (navigation patterns, popular content)
• Search analytics
• User observation and feedback through ongoing
workshops and focus groups
Results will drive future development of the interface.
Involving Teachers in the
Development of OER
Workshops with Educators
• Summer 2012 Workshop
• ~100 secondary and college Spanish teachers
• Fall 2012 Working Group
• ~10 Univ. of Texas Spanish teachers
• Spring 2013 Workshops
• Multiple conferences & Univ. of Texas Spanish teachers
• Summer 2013 Working Group
• ~10 secondary and college Spanish teachers
Sample materials from the community (1)
Sample materials from the community (2)
• Idea from teacher workshop: Use videos for grammar
lessons to develop the student’s metalinguistic and critical
thinking skills as they pertain to language.
• Searched and selected clips for lesson on “por vs. para”.
• Lesson tested in heritage learners class.
• Anecdotal evidence that video lessons were effective and
motivating to students.
Template development ideas
• Using video clips from the SpinTX video archive, create
an activity for classroom use (at any level).
• Focus on Topics: Familia, Idioma, Identidad
• Focus on Grammar: Por vs. Para, Gustar, Ser vs. Estar
• Four steps
• Predict: Before watching
• Observe: While watching
• Discuss: After watching
• Produce: Follow-up activity
Publication of OER
• Community-developed lesson plans will be available on
the SpinTX website by August 2013
• We encourage the publication of videos on third-party
platforms for remixing educational content, such as TED-Ed
(http://www.ed.ted.com)
A Model for Open Source
Corpus Development
Open source development
• Open Source Software
• TreeTagger (part-of-speech tagger)
• Drupal
• Open APIs
• YouTube Captioning API
• Google Fusion Tables API
• Custom code developed for the project
• Freely available in our GitHub repository: http://github.com/coerll
Enable sharing of content and data
• With educators:
• SpinTX interface allows embedding, downloading, & social sharing
of videos and transcripts.
• With researchers:
• Source tagged data in our GitHub repository
https://github.com/coerll/SpinTXCorpusData
• Documentation of data in our GitHub wiki
https://github.com/coerll/SpinTXCorpusData/wiki
Open content licenses
• Creative Commons provides licenses for Open
Educational Resources
• We use CC BY-NC-SA (Attribution, Non-Commercial, Share-Alike)
Open Project Documentation
• Research protocols, development processes and
methodologies, and other project documentation
publicly available:
• Corpus-to-Classroom Blog: http://sites.la.utexas.edu/corpus-to-classroom/
• “For Researchers” page on spanishintexas.org:
http://spanishintexas.org/for-researchers/
Questions
Links
• SpinTX Video Archive:
http://www.spintx.org
• Spanish in Texas Corpus:
http://www.spanishintexas.org

More Related Content

Similar to SPinTX Corpus-to-Classroom: A Teacher-Centered Pedagogical Interface for the Spanish in Texas Corpus

Blended Learning-Best Practices
Blended Learning-Best PracticesBlended Learning-Best Practices
Blended Learning-Best Practices
Saint Michael's College
 
Designing for Diversity: Creating Learning Experiences that Travel the Globe
Designing for Diversity: Creating Learning Experiences that Travel the GlobeDesigning for Diversity: Creating Learning Experiences that Travel the Globe
Designing for Diversity: Creating Learning Experiences that Travel the Globe
Una Daly
 
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
acornrevolution
 
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course DesignThe OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
Laura Murray
 
OER Vetting: Cultural Relevance, Accessibiilty, & Licensing
OER Vetting:  Cultural Relevance, Accessibiilty, & LicensingOER Vetting:  Cultural Relevance, Accessibiilty, & Licensing
OER Vetting: Cultural Relevance, Accessibiilty, & Licensing
Una Daly
 
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
Chris Willmott
 
Using pedagogic corpora in ELT
Using pedagogic corpora in ELTUsing pedagogic corpora in ELT
Using pedagogic corpora in ELT
Pascual Pérez-Paredes
 
Eurocall2014 SpeakApps Presentation - Speaking Practice
Eurocall2014 SpeakApps Presentation - Speaking PracticeEurocall2014 SpeakApps Presentation - Speaking Practice
Eurocall2014 SpeakApps Presentation - Speaking Practice
SpeakApps Project
 
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
National Council on Interpreting in Health Care (NCIHC)
 
Flip your classroom tech in elt-challenges and remedies
Flip your classroom   tech in elt-challenges and remediesFlip your classroom   tech in elt-challenges and remedies
Flip your classroom tech in elt-challenges and remediesEric H. Roth
 
Design and Development of Learning Resource Materials
Design and Development of Learning Resource MaterialsDesign and Development of Learning Resource Materials
Design and Development of Learning Resource Materials
Noel Ortega
 
Evaluating CALL
Evaluating CALLEvaluating CALL
Evaluating CALL
Jonathan Smart
 
How Open Education Practices Support Student Centered Design & Accessibility
How Open Education Practices Support Student Centered Design & AccessibilityHow Open Education Practices Support Student Centered Design & Accessibility
How Open Education Practices Support Student Centered Design & Accessibility
Una Daly
 
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
Alannah Fitzgerald
 
The SIOP model...an Overview
The SIOP model...an OverviewThe SIOP model...an Overview
The SIOP model...an Overview
Beth Amaral
 
Materials devlopment good_practice
Materials devlopment good_practiceMaterials devlopment good_practice
Materials devlopment good_practicelllvt
 
OER.pptx
OER.pptxOER.pptx
HTC Presentation Version 1
HTC Presentation Version 1HTC Presentation Version 1
HTC Presentation Version 1
Michael Rost
 

Similar to SPinTX Corpus-to-Classroom: A Teacher-Centered Pedagogical Interface for the Spanish in Texas Corpus (20)

Testing
TestingTesting
Testing
 
Blended Learning-Best Practices
Blended Learning-Best PracticesBlended Learning-Best Practices
Blended Learning-Best Practices
 
Designing for Diversity: Creating Learning Experiences that Travel the Globe
Designing for Diversity: Creating Learning Experiences that Travel the GlobeDesigning for Diversity: Creating Learning Experiences that Travel the Globe
Designing for Diversity: Creating Learning Experiences that Travel the Globe
 
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
Two Hot Topics in Online Language Learning: Corpus Linguistics and Telecollab...
 
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course DesignThe OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
The OER Workshop as a Launching Pad to Zero-Textbook Cost Course Design
 
OER Workshop
OER Workshop OER Workshop
OER Workshop
 
OER Vetting: Cultural Relevance, Accessibiilty, & Licensing
OER Vetting:  Cultural Relevance, Accessibiilty, & LicensingOER Vetting:  Cultural Relevance, Accessibiilty, & Licensing
OER Vetting: Cultural Relevance, Accessibiilty, & Licensing
 
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
AudioVisuals In the Disciplines: Developing libraries of recommended TV and r...
 
Using pedagogic corpora in ELT
Using pedagogic corpora in ELTUsing pedagogic corpora in ELT
Using pedagogic corpora in ELT
 
Eurocall2014 SpeakApps Presentation - Speaking Practice
Eurocall2014 SpeakApps Presentation - Speaking PracticeEurocall2014 SpeakApps Presentation - Speaking Practice
Eurocall2014 SpeakApps Presentation - Speaking Practice
 
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
Target Your Training: Techniques to Adapt Your Content to Meet Your Students ...
 
Flip your classroom tech in elt-challenges and remedies
Flip your classroom   tech in elt-challenges and remediesFlip your classroom   tech in elt-challenges and remedies
Flip your classroom tech in elt-challenges and remedies
 
Design and Development of Learning Resource Materials
Design and Development of Learning Resource MaterialsDesign and Development of Learning Resource Materials
Design and Development of Learning Resource Materials
 
Evaluating CALL
Evaluating CALLEvaluating CALL
Evaluating CALL
 
How Open Education Practices Support Student Centered Design & Accessibility
How Open Education Practices Support Student Centered Design & AccessibilityHow Open Education Practices Support Student Centered Design & Accessibility
How Open Education Practices Support Student Centered Design & Accessibility
 
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
Bridging Informal MOOCs & Formal English for Academic Purposes Programmes wit...
 
The SIOP model...an Overview
The SIOP model...an OverviewThe SIOP model...an Overview
The SIOP model...an Overview
 
Materials devlopment good_practice
Materials devlopment good_practiceMaterials devlopment good_practice
Materials devlopment good_practice
 
OER.pptx
OER.pptxOER.pptx
OER.pptx
 
HTC Presentation Version 1
HTC Presentation Version 1HTC Presentation Version 1
HTC Presentation Version 1
 

Recently uploaded

Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
Nguyen Thanh Tu Collection
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
PedroFerreira53928
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
Vivekanand Anglo Vedic Academy
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 

Recently uploaded (20)

Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 

SPinTX Corpus-to-Classroom: A Teacher-Centered Pedagogical Interface for the Spanish in Texas Corpus

  • 1. SPinTX Corpus-to-Classroom: ATeacher-Centered Pedagogical Interface for the Spanish in Texas Corpus Barbara E. Bullock, Almeida Jacqueline Toribio, Rachael Gilg, Martí Quixal & Arthur Wendorf
  • 2. Who we are • Barbara E. Bullock & Almeida Jacqueline Toribio • Project Directors / Sociolinguistics Researchers • Rachael Gilg • Project Manager / Web Developer • Arthur Wendorf • Corpus Linguist / Developer • Martí Quixal • Computational Linguist / Developer • Carl Blyth • Director of COERLL 2
  • 3. Agenda • Part 1: Introduction to the Corpus-to-Classroom Project • Part 2: Project Results • The SpinTX Video Archive: a pedagogically-friendly interface to the Spanish in Texas Corpus • Involving teachers in the development of open educational resources • A model for open source corpus development 3
  • 5. Corpora in the Classroom: the promise • Corpus: a large, structured, collection of language • Benefits: • Naturalistic language use • Motivation • „Real‟ language • Discovery learning • Examples: 5
  • 6. Corpora in the Classroom: the reality • Large linguistic corpora are of limited utility to untrained end users. • Designed for researchers, not educators. • Collections such as YouTube are popular for language classes, but can present problems • Searching for appropriate content is time-consuming using available search methods. • Content is not necessarily openly-licensed and can disappear without warning. 6
  • 7. Our two-pronged approach Spanish in Texas Corpus Project A project of COERLL, a National Foreign Language Resource Center (2010-2014) • Video interviews provide rich content SpinTX: Corpus-to-Classroom Project Grant from the University of Texas Longhorn Innovation Fund for Technology (2012-2013) • Collection of pre-selected, corrected, annotated clips from the larger corpus • Open-source, pedagogically-friendly search and authoring tools 7
  • 8. Spanish in Texas Corpus: Goals • To make publically available authentic data about variation in Spanish as spoken in Texas • for education • for research • Encourage teachers/students/public to view local varieties as a resource 8
  • 9. Corpus-to-Classroom: Goals • develop a pedagogically friendly interface for using the Spanish in Texas corpus • involve teachers and learners, via crowd-sourcing, social networking, and workshops, in the development of open educational resources • create a model for using open source tools and a pedagogical interface that can be adapted for any language corpus collection 9
  • 10. Corpus Overview Spanish in Texas corpus • Approx. 92 videos of sociolinguistic interviews (avg. 30–45 min) • Transcribed (approx. 600,000 words) • Time-synced video caption files • Tagged for linguistic features SpinTX Video Archive corpus • Approx. 327 video clips from 33 speakers (avg. 1-4 min) • Transcribed (approx. 80,000 words) • Time-synced video caption files • Tagged for linguistic and pedagogical features • Completely open (no registration required, open CC license) • Teacher-friendly interface 10
  • 11. Corpus Tagging: Basic • Time-synced captions • Part-of-speech tags (dual language) • POS • POS, simplified • Gender • Tense • Aspect • Mood • Speaker identification • Age • Gender • Region 11
  • 12. Corpus Tagging: Pedagogical • Topics (manually added) • Automatic tags using custom rulesets • Grammatical • aggregated from textbooks • Pragmatics • discourse markers, place holders (“este”), attenuators • Vocabulary • concept words • Functional (planned) • greetings, ask for help, express opinions • Bilingual forms (planned) • CS, loans, loan translations 12
  • 13. 13
  • 15. Original Transcript (from Automatic Sync)
  • 16. Upload Video and Transcript to YouTube
  • 17. Review Transcript in Google Docs
  • 21. Combine Data from SRT File and TreeTagger File, and add additional Tags
  • 22. Divide CSV Files and Videos into Clips and adjust Timings and Numberings
  • 23. The SpinTX Video Archive: a pedagogically-friendly interface to the Spanish in Texas Corpus 23
  • 24. Needs assessment: teacher interviews • How do you use authentic video in your teaching? • Describe searches you have done in the past for video content. What were you looking for and were you able to find it? • How can you imagine using clips from the Spanish in Texas video corpus in your classes? 24
  • 25. Needs assessment results: primary goals • Enable teachers to easily videos that suit the curriculum/work plan • Search by grammar, theme, vocabulary, etc. • Provide open, non-ephemeral content • Downloadable from open site with a license enabling remixing • Curating sets of videos for comparison and study • Favoriting and tagging videos • Provide access to supporting materials. • Creating a “community of practice” around the videos so materials can be shared among educators. 25
  • 26. Needs assessment results: secondary goals • Materials for teacher trainers • Teachers of heritage learners can learn about local variation • Video recording as a cross-competence task • Interviews collected by students can be contributed to the corpus 26
  • 27. 27
  • 28. Ideas for future development • Advanced search capability • support for wildcards • improved phrase searching • improved “keyword in context” result view • Data visualizations • word and/or tag clouds • language maps • Enhanced word-level annotations • hover over a word in a transcript and see all annotations 28
  • 29. Formative evaluation of Beta version Data collection methods: • Online user survey • Web analytics (navigation patterns, popular content) • Search analytics • User observation and feedback through ongoing workshops and focus groups Results will drive future development of the interface. 29
  • 30. Involving Teachers in the Development of OER 30
  • 31. Workshops with Educators • Summer 2012 Workshop • ~100 secondary and college Spanish teachers • Fall 2012 Working Group • ~10 Univ. of Texas Spanish teachers • Spring 2013 Workshops • Multiple conferences & Univ. of Texas Spanish teachers • Summer 2013 Working Group • ~10 secondary and college Spanish teachers 31
  • 32. Sample materials from the community (1)
  • 34. Sample materials from the community (2) • Idea from teacher workshop: Use videos for grammar lessons to develop students' metalinguistic and critical thinking skills as they pertain to language. • Searched and selected clips for a lesson on “por vs. para”. • Lesson tested in a heritage learners' class. • Anecdotal evidence that video lessons were effective and motivating to students.
  • 35. Template development ideas • Using video clips from the SpinTX video archive, create an activity for classroom use (at any level). • Focus on Topics: Familia, Idioma, Identidad • Focus on Grammar: Por vs. Para, Gustar, Ser vs. Estar • Four steps • Predict: Before watching • Observe: While watching • Discuss: After watching • Produce: Follow-up activity
  • 36. Publication of OER • Community-developed lesson plans will be available on the SpinTX website by August 2013 • We encourage the publication of videos on third-party platforms for remixing educational content, such as TED-Ed (http://www.ed.ted.com)
  • 37. A Model for Open Source Corpus Development
  • 38. Open source development • Open Source Software • TreeTagger (part-of-speech tagger) • Drupal • Open APIs • YouTube Captioning API • Google Fusion Tables API • Custom code developed for the project • Freely available in our GitHub repository: http://github.com/coerll
  • 39. Enable sharing of content and data • With educators: • SpinTX interface allows embedding, downloading, & social sharing of videos and transcripts. • With researchers: • Source tagged data in our GitHub repository: https://github.com/coerll/SpinTXCorpusData • Documentation of data in our GitHub wiki: https://github.com/coerll/SpinTXCorpusData/wiki
  • 40. Open content licenses • Creative Commons provides licenses for Open Educational Resources • We use CC BY-NC-SA (Attribution, Non-Commercial, Share-Alike)
  • 41. Open Project Documentation • Research protocols, development processes and methodologies, and other project documentation are publicly available: • Corpus-to-Classroom Blog: http://sites.la.utexas.edu/corpus-to-classroom/ • “For Researchers” page on spanishintexas.org: http://spanishintexas.org/for-researchers/
  • 43. Links • SpinTX Video Archive: http://www.spintx.org • Spanish in Texas Corpus: http://www.spanishintexas.org

Editor's Notes

  1. Will introduce corpora in general, our source corpus, and the pedagogical corpus
  2. Discuss examples briefly one at a time. How frequently do teachers use them? How easy are they to use? Emphasis on YouTube as probably the most popular in language classes, but hard to use.
  3. Discuss examples briefly one at a time. How frequently do teachers use them? How easy are they to use? Emphasis on YouTube as probably the most popular in language classes, but hard to use.
  4. Describe original corpus. This is similar to the other corpora we looked at earlier. Introduce SpinTX corpus and highlight differences.
  5. Will introduce corpora in general, our source corpus, and the pedagogical corpus
  6. We asked teachers how they use videos and how they would like to use videos (interviews and focus groups).
  7. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  8. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  9. 1. Anonymous user: Watch intro video. Show search criteria: topics, grammar, pragmatics, keywords, etc. Show video page: related items, transcripts with highlighting, sharing & downloading tabs. 2. Registered user: How to favorite and tag a video. Tagged video lists.
  10. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  11. We asked teachers how they use videos and how they would like to use videos. Here is how we have met their needs.
  12. But that’s not all!
  13. This will be an ongoing process that will hopefully eventually be taken over by the users.
  14. This will be an ongoing process that will hopefully eventually be taken over by the users.
  15. This will be an ongoing process that will hopefully eventually be taken over by the users.
  16. This will be an ongoing process that will hopefully eventually be taken over by the users.
  17. This will be an ongoing process that will hopefully eventually be taken over by the users.
  18. This will be an ongoing process that will hopefully eventually be taken over by the users.
  19. 5 guidelines for developing open corpora. Will also illustrate how we have implemented each guideline.