This document summarizes a workshop on using language corpora in teaching. It defines what corpora are and describes the different types of corpora and the approaches to using them in the classroom. Specific corpus resources are presented for exploring grammar, lexicon, translations, language variation, and more. Attendees then collaborate in groups to discuss challenges and ideas for applying corpora in their own teaching, before sharing their ideas with the whole group. The workshop aims to demonstrate how electronic language corpora can enhance language instruction.
1. How to Use Corpora in
Language Teaching
Brody Bluemel
Department of Applied Linguistics
The Pennsylvania State University
LANGUAGE TEACHING WORKSHOP SERIES
The Pennsylvania State University, February 2014
Sponsored by the Center for Language Acquisition (CLA) and the
Center for Advanced Language Proficiency Education and
Research (CALPER).
2. Outline
URL: https://sites.google.com/site/corpusteaching/
Presentation:
What are language corpora?
Approaches to using corpora in language teaching
Introduction to several available resources
Collaborate:
What ideas do you have for using corpora in your classroom?
Discussion:
Share ideas
3. What are corpora?
Leech (1992): “an unexciting phenomenon, a helluva lot of
text, stored on a computer”
Sinclair (1991): “a collection of naturally-occurring language
text, chosen to characterize a state or a variety of language”
Sinclair (2004): “a collection of pieces of language text in
electronic form, selected according to external criteria to
represent, as far as possible, a language or language variety
as a source of data for linguistic research”
Corpora: A systematized set of texts, typically accessed
electronically, that are used for linguistics research and
pedagogy.
4. Types of Corpora
General vs. Specialized
Native vs. Learner Corpora
Monolingual vs. Translation Corpora
Parallel Corpora, Comparable Corpora, Equivalent
Corpora
Language Variation Corpora
Synchronic vs. Diachronic Corpora
Spoken vs. Written Corpora
5. Approaches to using corpora in
language teaching
General vs. Specialized Corpora
Grammar, lexicon, rhetoric, style, expressions, formulaic speech
British National Corpus
American National Corpus
BYU Corpus Interface
MICASE
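A common way to explore grammar, lexicon, and formulaic speech in a corpus is a keyword-in-context (KWIC) concordance, which lists every occurrence of a word with the text around it. The following is a minimal sketch in Python, not the interface of any of the corpora listed above. The function name `kwic`, the `width` parameter, and the sample sentences are invented for illustration.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per match of `keyword` in `text`.

    `width` is the number of characters of context shown on each side
    (a hypothetical parameter chosen for this sketch).
    """
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented toy text standing in for a real corpus.
sample = (
    "I look forward to hearing from you. "
    "We look forward to the workshop. "
    "She did not look back."
)

for line in kwic(sample, "look"):
    print(line)
```

Reading down the aligned keyword column lets students notice recurring patterns, such as "look forward to", on their own.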
6. Approaches to using corpora in
language teaching
Native vs. Learner Corpora
Comparison, Analysis, Error Analysis, L1 specific
challenges
International Corpus of Learner English (ICLE)
Extensive List of Multilingual Learner Corpora
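Comparing a learner corpus with a native-speaker corpus usually means comparing normalized frequencies, since the two collections rarely have the same size. The sketch below assumes a simple regular-expression tokenizer and uses two invented toy texts. The helper names `tokenize` and `per_thousand` are hypothetical and do not come from ICLE or any other resource named above.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def per_thousand(text, word):
    """Frequency of `word` per 1,000 tokens of `text`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000 * Counter(tokens)[word] / len(tokens)


# Invented examples; a real comparison would load full corpus files.
learner_text = "I am agree with the idea. I am agree that the school is good."
reference_text = "I agree with the idea. Many people agree that the school is good."

for label, text in [("learner", learner_text), ("reference", reference_text)]:
    print(label, round(per_thousand(text, "am"), 1))
```

A large gap between the two normalized figures is only a starting point. The concordance lines behind the numbers still need to be read to decide whether the difference reflects an error pattern.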
7. Approaches to using corpora in
language teaching
Translation Corpora
Parallel Corpora
Phrasing, conceptualizing complex concepts, reading
comprehension
www.parallelcorpus.com
EU Joint Research Centre
E-C Concord
www.linguee.com
8. Approaches to using corpora in
language teaching
Language Variation Corpora
Exploration of dialects
Phonemica
International Corpus of English (ICE)
Synchronic vs. Diachronic Corpora
Language change, modern speech, Understanding
novels and other texts
Spoken vs. Written Corpora
Genre and use
9. Online Resources
Presentation URL: https://sites.google.com/site/corpusteaching/
Multilingual Corpora:
Additional Resources:
Non-English Corpora
Corpus Tools & Websites
www.linguee.com
Extensive list of Online Corpora
Learner Corpora
Bookmarks for corpus-based linguists
Athel Corpus Resources
The corpora list
CALPER Corpus Tutorial
One of my favorites:
http://dict.bing.com.cn/
10. Primary Resources
Books and journals
Aijmer (2009): Corpora and Language Teaching
Hunston (2002): Corpora in Applied Linguistics
McEnery et al. (2006): Corpus-Based Language
Studies
Sinclair (2004): How to Use Corpora in
Language Teaching
International Journal of Corpus Linguistics
Corpora
11. Collaborate
In groups of 3-4, discuss ideas, innovations, and questions
you have about applying corpus technology in the classroom.
Specific questions to consider:
Questions or applications of corpora that haven’t been
discussed?
What challenges do you foresee in applying corpora in
teaching?
What unique features about YOUR classroom should be
considered? (characteristics of the language you teach,
student population, etc.)
How would this technology benefit you in your teaching?
How do you plan to use corpus technology in your classroom?
12. Discussion
Share:
Ideas and possible applications of corpora generated
in your group discussion
Any key features or aspects of corpora we haven’t
yet considered
Questions:
Any questions regarding using corpora, finding
resources, or anything else.
13. Thank You!
Contact: Brody Bluemel (btb5129@psu.edu)
The Pennsylvania State University
Department of Applied Linguistics
Editor's Notes
Dear Colleagues,
A friendly reminder of tomorrow's language teaching workshop:
"How to Use Corpus Tools in Language Teaching"
Wednesday, February 5, 2014
4:40-5:45 p.m.
267 Willard
This workshop offers an overview of how language corpora (collections of authentic textual and/or spoken language samples) can be highly valuable resources for the teaching and learning of second languages. Examples of available corpora in various languages, including a new corpus tool for learning Chinese, will be shown as models. Topics to be addressed include:
What is a language corpus?
How can learners benefit from working with corpus materials?
What do corpus-based activities and assignments look like?
How can teachers find and use language corpora in their teaching?
The event is free and open to the public. Light refreshments will be provided.
For further information, please contact mcd15@psu.edu. We hope you will join us!
This workshop is sponsored by the Center for Language Acquisition (CLA) and the Center for Advanced Language Proficiency Education and Research (CALPER).
Chinese: learning and using the orthographic system (Bluemel, in press; Tsai & Choi, 2005)
German: learning gender, case, prepositions, and word order (St. John, 2001)
EFL/ESL: learning articles, prepositions, and aspect (Frankenberg-Garcia, 2005; McEnery & Wilson, 2001)
Italian: verb tense (Laviosa, 2002)
Spanish: lexical and semantic analysis and differentiation (Lavid, Hita, & Zamorano-Mansilla, 2010)
Source info (learner): L1, gender, program
Sample: date, mode, task, genre
Which numbers matter:
Number of tokens, types, categories, samples in each category, and words in each sample
Descriptive adequacy:
Bigger corpus generally better for low-frequency words, but note Zipf's Law (1935)
100K words of spontaneous speech enough for descriptive studies of prosody
0.5 million words enough for study of verb-form morphology
0.5-1 million words enough for studies of most syntactic processes and high-frequency vocabulary
Reliability of smaller corpus can be empirically tested against larger corpus
Biber (1990):
Measured internal variation of 50 pairs of samples from same texts
Samples: 2000-5000 words enough
Biber (1993):
Used multivariate techniques of factor analysis and cluster analysis to study variation
Pilot studies necessary to fine-tune structure
One million words good for grammatical studies
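The counts mentioned in these notes (tokens and types) can be computed with a few lines of Python. This is a rough sketch that assumes a simple regular-expression tokenizer. The function name `corpus_stats` and the sample sentence are invented for illustration. Zipf's Law observes that a word's frequency falls off roughly in inverse proportion to its frequency rank, which is why most types in any corpus occur only once or twice and why low-frequency words call for larger corpora.

```python
import re
from collections import Counter


def corpus_stats(text):
    """Return basic size statistics for a text treated as a tiny corpus."""
    # Tokens are running words; types are distinct word forms.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        # Types that occur exactly once (hapax legomena).
        "hapaxes": sum(1 for c in counts.values() if c == 1),
        # Most frequent types, ordered by frequency rank.
        "top": counts.most_common(3),
    }


# Invented toy sentence; a real study would read corpus files instead.
sample = "the cat sat on the mat and the dog sat on the rug"
stats = corpus_stats(sample)
print(stats)
```

Even in this toy sentence most types appear only once, while a single function word accounts for a large share of the tokens.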