Presentation by project directors Barbara E. Bullock and Almeida Jacqueline Toribio at the 24th Conference on Spanish in the United States, March 2013 in McAllen, Texas.
Spanish in the U.S.: Developing an open linguistic corpus
1. Spanish in the US:
Developing an open linguistic corpus
Barbara E. Bullock & Almeida Jacqueline Toribio
24th Conference on Spanish in the United States
9th Conference on Spanish in Contact with Other Languages
March 6-9, 2013, McAllen, Texas
2. Spanish in Texas Corpus Project
• Purpose: to make publicly available
authentic data about variation in Spanish as
spoken in Texas
– for education
– for research
3. Motivation
• Document continuity and variation
• Understand variation in its local context
• Overcome the challenges of studying
naturalistic data
– Cost: gathering, transcribing and coding data
– Accountability: corpora upon which studies are
based are rarely made available to the public
• Encourage teachers/students/public to view
local varieties as a resource
4. Inspiration
• Garland Bills, Vivian Cook, Lourdes Ortega,
Ricardo Otheguy, Bonnie Urciuoli, Guadalupe
Valdés, Walt Wolfram, Ana Celia Zentella, …
• language variation ‘in the public interest’
• an empirical turn in thinking about contact varieties
• Ornstein-Galicia’s (1981) call: investigate Spanish
varieties in your own backyard, share resources, create
concordances of usage
5. Impetus
• What is needed
– large, representative samples of oral Spanish in
the U.S.
– metadata about the speakers
– a context and protocols for sharing architecture,
scripts, analytical techniques, and data
… as well as findings
6. Why open?
• To facilitate access
– attract as many eyes as possible to the same data
– accelerate the production of findings, which is
particularly important for the study of U.S.
Spanish
• To reduce costs in terms of time and money,
especially for those who can least afford it
7. But…
• Large corpora are of limited utility to
untrained end users
– Teachers need short videos that are appropriate
for classroom use
– And teachers need tools
• to easily search videos,
• to author materials,
• to curate their own collections
8. Two-pronged approach
• Spanish in Texas Corpus Project
– Video interviews that provide rich content
• SpinTX: Corpus-to-Classroom
– Collection of pre-selected, corrected, annotated
clips from the larger corpus
– Open-source, pedagogically-friendly search and
authoring tools
9. Goals of this talk
• Document our efforts to develop an open corpus of
U.S. Spanish, using open-source tools
• Define ‘open’
• Describe the protocols that we are using to convert
Spanish in TX interviews into a pedagogically useful corpus
• Showcase materials and tools that we have for use
• Share our work with others who may be interested in
developing open Spanish in X corpora
• Look ahead to an open sociolinguistic/computational
research corpus of the full Spanish in TX interviews
10. Origins of the project
• Language Resource Center
[LRC], 2010-2013
• Center for Open Educational Resources for
Language Learning [COERLL]
11. Open Educational Resources [OER]
Educational material offered freely for anyone to
use, typically involving some permission to remix,
improve, and redistribute
creativecommons.org
12. Spanish in Texas Media License
• Attribution Required
• Non-Commercial
• Share-Alike
14. Spanish in Texas Corpus Project
• Spanish in Texas is our first collection of video
interviews
– provides content for SpinTX Corpus-to-Classroom
• Additional collections
– Spanish in Texas CS collection
– Hindi-English CS collection
15. Spanish in Texas Corpus Project
• Ideally serve as a reference corpus for oral
Spanish in Texas
– large (1 million + words), representative of
variation, fully open
– currently 134 interviews; approx. 600,000 words
• This will help establish a better “baseline” for
heritage language research and teaching than
the traditionally assumed monolingual one
17. SpinTX: Corpus-to-Classroom
• Aims
– develop a pedagogically friendly interface for
using the corpus
– involve teachers and learners, via crowd-sourcing,
social networking, and workshops, in the
development of open educational resources
– create a model for using open source tools and a
pedagogical interface that can be adapted for any
language corpus collection
18. Funding
• Department of Education, Title VI
• College of Liberal Arts
• Longhorn Innovation Fund for Technology
[LIFT]
19. Our team
• Directors: Barbara E. Bullock & Almeida J. Toribio
• Project Manager and Web Architect: Rachael Gilg
• Consultants
– Graphic Designer: Nathalie Steinfeld Childre
– Computational Linguist: Martí Quixal
– Digital Media Producer: Scott Zúñiga
– Educational Technologist: Arthur Wendorf
– Outreach Coordinator: Jeffrey Michno
– Content Manager, Intern Coordinator: Jacqueline Larsen Serigos
– Materials Development: Jesse Abing, Joshua Frank
• Undergraduate Interns
– 2011-present: 12
20. Our team
• Collaborators
– University of Texas Pan American
• José Esteban Hernández
• Stephanie Brock
• José Flores
• Viridiana Gallegos
• Rossy Limas
• Michelle Madrid
– Texas A&M International University
• Patricia González
• Conchita Hickey
• Lisa Flores
– Others
• Daniel Villa, New Mexico State University
• María Irene Moyna, Texas A&M University
• MaryEllen García, University of Texas, San Antonio
• Jens Clegg, Indiana University-Purdue University
• Abby Dings, Southwestern University
23. Recruit ‘locally’
• Recruit and train interns
– Institutional Review Board training
– Video shooting and audio recording
– Practice interviews on site
• Recruit family, friends, acquaintances
– Any Spanish-speaking resident of TX
• Conduct interviews in their home
communities
24. Video Production Protocol
• HD video cameras
• Professional quality condenser microphones
– interviewer and interviewee are each recorded
into a separate channel
• Interviewer wears headphones to monitor
audio
25. Interview protocol
• Sampling of a large set of questions (~75)
– from NPR StoryCorps (Historias)
– biographical information
• Average Length: 30-45 min.
• Language: Spanish and mixed
• Consent form and talent release
• Metadata on speaker and interviewer
– Google Docs
27. Processing the Videos
• Intake interview materials
– create unique ID for video and forms
– archive raw video and remove from camera
• Video and transcript preparation
– Final Cut Pro
– Upload to Automatic Sync (3-5 day turnaround)
– convert transcript to UTF-8, upload to Google Drive
collection
– upload to YouTube to create synced caption file (SRT)
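
The transcript and caption steps above can be sketched in Python. This is only an illustrative sketch, not the project's actual scripts: the function names, the `Caption` record, and the fallback encoding (`cp1252`) are assumptions introduced here. It decodes a transcript to UTF-8 text and reads SRT caption blocks into timed records.

```python
import re
from dataclasses import dataclass


@dataclass
class Caption:
    index: int
    start: float  # seconds from the beginning of the video
    end: float
    text: str


# SRT timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds)
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_utf8(raw: bytes, fallback_encoding: str = "cp1252") -> str:
    """Decode transcript bytes, trying UTF-8 first.

    The fallback encoding is an assumption; the real source encoding
    depends on the transcription service.
    """
    try:
        return raw.decode("utf-8-sig")  # also strips a byte-order mark
    except UnicodeDecodeError:
        return raw.decode(fallback_encoding)


def _seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str) -> list[Caption]:
    """Split SRT text into caption records (index, start, end, text)."""
    captions = []
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 3:
            continue  # skip empty or malformed blocks
        start, end = lines[1].split("-->")
        captions.append(
            Caption(int(lines[0]), _seconds(start), _seconds(end),
                    " ".join(lines[2:]))
        )
    return captions
```

With captions in this form, a clip can be located by time span and its text searched alongside the speaker metadata.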
41. Manual Coding for Complex Cases
• Annotation of ‘lo’ as an article that allows for the
elision of nouns, as in “lo bueno de esta clase es…”
– The rule requires two words in sequence: “lo”
followed by an adjective, optionally with some words
in between (in fact only adjective modifiers, such as
adverbs, since the BARRIER operator tells the scanning
process to stop if a typical NP boundary is crossed).
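
The logic of this rule can be approximated in plain Python. This is a hedged sketch, not the project's rule: the tag names (`ADJ`, `ADV`) and the function `find_nominal_lo` are hypothetical, and the barrier is simplified to "any token that is not an adverb".

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, part-of-speech tag)

# Assumption: only adverbs may stand between "lo" and the adjective.
# Any other tag acts as the barrier and stops the scan.
ALLOWED_BETWEEN = {"ADV"}


def find_nominal_lo(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return (index of 'lo', index of adjective) for each match."""
    matches = []
    for i, (form, _) in enumerate(tokens):
        if form.lower() != "lo":
            continue
        j = i + 1
        while j < len(tokens):
            pos = tokens[j][1]
            if pos == "ADJ":
                matches.append((i, j))
                break
            if pos not in ALLOWED_BETWEEN:
                break  # barrier crossed: give up on this "lo"
            j += 1
    return matches
```

For example, “lo bueno de esta clase” and “lo muy bueno” both match, while “lo vi ayer” does not, because the verb stops the scan before any adjective is reached.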
42. Automatic annotation levels for clips
• Grammatical
– aggregated from textbooks
• Functional
– greetings, ask for help, express opinions
• Pragmatics
– discourse markers, place holders (“este”),
attenuators
• Bilingual forms
– CS, loans, loan translations
49. OER Materials
• Spanish in Texas searchable clip corpus
available this spring
– approximately 500 clips and growing
• All specially created code scripts are available
now through GitHub
• IRB, talent release, Google metadata survey
template, etc. available
50. OER Materials
• In the spirit of OER, please share-alike
• Add to repository any pedagogical materials
you or your students might develop from
Spanish in Texas clip corpus
51. Classroom and Community
• We are designing the corpus and tools with
the end-users
– using locally-relevant language samples to
illustrate every aspect of Spanish
• Users model their own language for
pedagogical purposes
• The corpus is the textbook
53. SpinTX: Corpus-to-Scholarship
• Full interviews (videotaped, captioned, POS-tagged)
will be made available
• Syntactically-parsed corpora
• Additional public protocols, open-source
search tools
54. Corpus-to-Scholarship: Share-alike
• When you use the corpus, share-alike
– crowd-sourcing approach to additional annotation
levels (e.g., Praat TextGrids)
• we’ll use stand-off annotation
– sociolinguists would ideally share data coding
– corpus linguists would ideally share scripts
• Any users could contribute their collections:
video, transcript, and metadata
– we’ll run it through SpinTX processing
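
Stand-off annotation, as mentioned above, keeps the transcript untouched and stores each annotation separately as a pointer into it. A minimal Python sketch follows; the class names, layer names, and labels are assumptions for illustration, not the project's schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Annotation:
    start: int   # character offset into the transcript (inclusive)
    end: int     # character offset (exclusive)
    layer: str   # hypothetical layer name, e.g. "pragmatics"
    label: str
    author: str  # who contributed the annotation


@dataclass
class StandoffDocument:
    text: str  # the base transcript is never modified
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, start: int, end: int, layer: str,
                 label: str, author: str) -> Annotation:
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("offsets fall outside the transcript")
        ann = Annotation(start, end, layer, label, author)
        self.annotations.append(ann)
        return ann

    def spans(self, layer: str) -> List[Tuple[str, str]]:
        """Return (covered text, label) pairs for one annotation layer."""
        return [(self.text[a.start:a.end], a.label)
                for a in self.annotations if a.layer == layer]
```

Because every layer only points into the same text, contributors can add or share layers independently without conflicting edits to the transcript itself.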
55. Archive
• Spanish in Texas Corpus to be archived at the
Nettie Lee Benson Latin American Collection,
University of Texas Libraries
57. Thanks
• To all of our collaborators
• Especially to our students and their friends,
neighbors, and families who shared their time
and their language with us