Steven Saffels
April 2014
• The language used on the AP Spanish Exam appears to consist
primarily of lexically rich but grammatically simple text.
• Vocabulary – relatively specialized for specific topics
• Text consists mostly of simple sentences and relies on noun phrase modification
• 86% of all verbs are in present, past, and infinitive forms.
• Recurrent formulaic expressions are used to introduce source
texts.
• (Anthony, 2011)
• Corpus research:
• Uses a computer program called a concordancer
• to analyze key words, phrases & parts of words
• in a large, representative, computerized collection of texts, called a corpus.
(O’Keeffe, McCarthy & Carter 2007)
• Allows for very extensive, systematic and descriptive data
(De Kock 2001)
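What a concordancer does with a key word can be sketched in a few lines. This is a minimal illustration only, not the tool used in the study: the function names `tokenize` and `concordance`, the `width` parameter, and the sample sentences are all assumptions made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its word tokens (letters only, accents kept)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def concordance(tokens, keyword, width=3):
    """Return one keyword-in-context line per hit, with `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical toy text, not taken from the exam corpus
sample = "Este artículo apareció en el sitio de internet. El informe apareció en la radio."
kwic = concordance(tokenize(sample), "apareció", width=2)
for line in kwic:
    print(line)
```

A real concordancer adds sorting, wildcard search and frequency lists on top of this basic keyword-in-context view.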
• Relatively few corpus studies in languages other than English
(Parodi, 2007)
• “Gap” between corpus-based research results and pedagogical practice
(Cortes 2013)
The present study aspires to help redress both
the lack of corpus research in Spanish and
the gap between research and practice
by applying corpus methodologies
to a pedagogical problem
from a Spanish L2 classroom:
how best to prepare high school students for success on a
high-stakes, skills-based exam of proficiency in Spanish.
• RQ1: How representative is the AP Spanish Exam of
broader usage of Spanish? Specifically, in terms of:
• Vocabulary
• Parts of speech
• Verb forms
• RQ2: What are the most frequent recurrent word
combinations?
• What are the salient 3-, 4-, 5-, or 6-grams used on the exam?
• Are there any salient tendencies in n-gram use?
• RQ3: Are the “transition phrases” suggested by a popular
test-prep book used frequently on the exam?
• Year-end, Skills-based exam
• No vocabulary or grammar specifications
• Students must use information from authentic texts to:
• Write a personal letter
• Compose a synthesis essay
• Respond orally to a simulated conversation
• Make an oral synthesis presentation
(College Board, 2008-2013)
• Total of 10 texts
• 18,333 word tokens
• Most of the text is from the articles and radio
reports used as sources for the presentational
writing and speaking exercises.
• List of the 5,000 most frequent words in Spanish
• Based on a subset of the 100-million-word Corpus del Español (CDE)
(Davies 2002-)
• Balanced, representative corpus:
• Spoken/Written
• Latin America/Spain
• 30% of the top 300 words in Davies’ (2006) frequency dictionary do not
appear on the AP word list.
• Of those, 41% are verbs, including many core vocabulary items for
lower-level Spanish classes:
PONER (PUT)
LLAMAR (CALL)
VENIR (COME)
SALIR (LEAVE)
VOLVER (RETURN)
VIVIR (LIVE)
MIRAR (LOOK)
EMPEZAR (BEGIN)
ENTRAR (ENTER)
ENTENDER (UNDERSTAND)
PEDIR (REQUEST)
RECIBIR (RECEIVE)
TERMINAR (FINISH)
SACAR (TAKE OUT)
NECESITAR (NEED)
LEER (READ)
ABRIR (OPEN)
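The comparison behind a figure like "30% of the top 300 words" amounts to checking a frequency-ranked list against the exam word list. The sketch below shows one way to do that. The helper `missing_from` and both toy lists are hypothetical stand-ins, not the study's data.

```python
def missing_from(reference_list, target_vocab, top_n):
    """Return the top_n reference lemmas absent from target_vocab, and their share."""
    top = reference_list[:top_n]
    missing = [lemma for lemma in top if lemma not in target_vocab]
    return missing, len(missing) / len(top)

# Hypothetical toy data: a frequency-ranked lemma list and an exam word list
reference = ["de", "ser", "poner", "país", "llamar",
             "mundo", "venir", "agua", "salir", "ciudad"]
exam_vocab = {"de", "ser", "país", "mundo", "agua", "ciudad", "salud"}

missing, share = missing_from(reference, exam_vocab, top_n=10)
print(missing, f"{share:.0%}")
```

Both lists need to be lemmatized in the same way before comparison, otherwise inflected forms such as "pone" would be counted as missing lemmas.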
• General Nouns:
• COSA (THING)
• HOMBRE (MAN)
• MUJER (WOMAN)
• MODO (WAY)
• RELACIÓN (RELATIONSHIP)
• Body Parts:
• MANO (HAND)
• OJO (EYE)
• Human Relations:
• HIJO (SON)
• SEÑOR (MISTER)
• MADRE (MOTHER)
• NOSOTROS (WE)
• NADIE (NOBODY)
• Religion:
• VERDAD (TRUTH)
• SANTO (HOLY)
• DIOS (GOD)
• Time/Space:
• PUNTO (POINT)
• LADO (SIDE)
• NOCHE (NIGHT)
• PRINCIPIO (BEGINNING)
• PUEBLO (TOWN)
• Several of the generally common adjectives that are
missing from the AP Corpus frequency list are typically
pre-modifiers.
• AQUEL (THAT): desde aquel día (from that day)
• TAL (SUCH): hacerlo de tal manera (to do it in such a way)
• PROPIO (OWN): tiene su propio estilo (has his own style)
• NINGÚN (NONE): no hay ningún problema (there’s no problem)
• CUALQUIER (ANY): puede hacer cualquier cosa (can do anything)
• ÚNICO (ONLY): ¿Usted es el único hijo? (Are you the only son?)
• Terms used to introduce source texts for the
presentational writing and speaking activities
• Not especially salient for the student taking the exam:
this is referential information that is not necessary for
interpreting the texts
• May help guide students in quickly selecting
appropriate strategies to make the most efficient use of
time
FUENTE (SOURCE)
DIARIO (NEWSPAPER)
INFORME (REPORT)
ARTÍCULO (ARTICLE)
APARECER (APPEAR)
RADIO (RADIO)
EMITIR (BROADCAST)
SIGUIENTE (FOLLOWING)
TITULADO (TITLED)
TEXTO (TEXT)
CONVERSACIÓN (CONVERSATION)
IMPRESO (PRINTED)
PERIÓDICO (NEWSPAPER)
ADAPTACIÓN (ADAPTATION)
GRABACIÓN (RECORDING)
• Geography: país (country), mundo (world), ciudad (city), español
(Spanish), lengua (language), mundial (worldwide), idioma
(language), estado (state)
• Environment: cambio (change), climático (climate), invierno (winter),
oso (bear), ave (bird), combustible (fuel), nieve (snow),
calentamiento (warming)
• Wellbeing: agua (water), físico (physique), salud (health), organismo
(body), risa (laughter), peso (weight), alimento (food), kilómetros
(kilometers)
• Technology: computadora (computer), internet (Internet), digital
(digital), electrónico (electronic), red (network), tecnología
(technology), virtual (virtual)
• Fine arts: arte (art), música (music), orquesta (orchestra), artista
(artist), producción (production), pintura (painting), músico
(musician), lienzo (canvas)
WORD CLASS       AP TOKENS     AP %    CDE TOKENS    CDE %
PREPOSITIONS         3,079   22.79%     5,553,520   24.72%
ARTICLES             2,544   18.83%     4,643,039   20.67%
CONJUNCTIONS         1,371   10.15%     3,781,609   16.83%
PRONOUNS               608    4.50%     2,046,356    9.11%
VERBS                1,213    8.98%     1,928,260    8.58%
ADVERBS                566    4.19%     1,764,952    7.86%
COMMON NOUNS         2,435   18.02%     1,459,968    6.50%
ADJECTIVES           1,294    9.58%       614,069    2.73%
PROPER NOUNS           296    2.19%       365,057    1.62%
NUMERALS               100    0.74%       246,519    1.10%
INTERJECTIONS            5    0.04%        64,277    0.29%
TOTAL               13,511     100%    22,467,626     100%
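Because the two corpora differ greatly in size, the comparison rests on each word class's share of its own corpus rather than on raw counts. The sketch below recomputes those shares from the token counts in the table and adds an AP-to-CDE ratio. The ratio is an illustrative addition, not a measure reported in the slides, and the names `shares`, `ap_pct` and `cde_pct` are assumptions.

```python
# Token counts copied from the word class table (AP Corpus vs. Corpus del Español)
ap = {"PREPOSITIONS": 3079, "ARTICLES": 2544, "CONJUNCTIONS": 1371,
      "PRONOUNS": 608, "VERBS": 1213, "ADVERBS": 566, "COMMON NOUNS": 2435,
      "ADJECTIVES": 1294, "PROPER NOUNS": 296, "NUMERALS": 100, "INTERJECTIONS": 5}
cde = {"PREPOSITIONS": 5553520, "ARTICLES": 4643039, "CONJUNCTIONS": 3781609,
       "PRONOUNS": 2046356, "VERBS": 1928260, "ADVERBS": 1764952,
       "COMMON NOUNS": 1459968, "ADJECTIVES": 614069, "PROPER NOUNS": 365057,
       "NUMERALS": 246519, "INTERJECTIONS": 64277}

def shares(counts):
    """Convert raw token counts into percentages of the corpus total."""
    total = sum(counts.values())
    return {cls: 100 * n / total for cls, n in counts.items()}

ap_pct, cde_pct = shares(ap), shares(cde)
for cls in ap:
    # A ratio above 1 means the class is over-represented in the AP Corpus
    ratio = ap_pct[cls] / cde_pct[cls]
    print(f"{cls:14s} AP {ap_pct[cls]:6.2f}%  CDE {cde_pct[cls]:6.2f}%  ratio {ratio:.2f}")
```

On these figures, common nouns and adjectives come out well above a ratio of 1, while pronouns and adverbs sit at roughly half their CDE share.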
WORD CLASS       ACADEMIC      NEWS   FICTION      ORAL
COMMON NOUNS      241,116   222,729   209,619   169,680
PREPOSITIONS      162,221   156,788   132,089   118,329
ARTICLES          153,504   139,484   124,272   105,910
VERBS             116,308   136,359   187,788   183,306
ADJECTIVES         90,953    72,177    58,305    50,667
CONJUNCTIONS       72,917    76,856    97,745   116,953
PROPER NOUNS       53,161    57,932    22,147    28,177
ADVERBS            27,754    37,160    55,152    79,902
PRONOUNS           24,821    32,464    65,805    73,150
NUMERALS            6,125     8,705     5,426     9,434
INTERJECTIONS          93       286       818     8,134
VERB FORM             AP TOKENS     AP %   CDE TOKENS    CDE %
PRESENT                     747   61.58%    1,190,971   37.52%
INFINITIVE                  211   17.39%      459,890   14.49%
PRETERITE                   126   10.39%      386,218   12.17%
IMPERFECT                    27    2.23%      443,182   13.96%
PAST PARTICIPLE              53    4.37%      259,488    8.18%
GERUND                        4    0.33%      107,727    3.39%
CONDITIONAL                  13    1.07%       57,225    1.80%
FUTURE                       12    0.99%       67,040    2.11%
SUBJUNCTIVE-PRESENT          17    1.40%      126,093    3.97%
SUBJUNCTIVE-PAST              3    0.25%       73,073    2.30%
SUBJUNCTIVE-FUTURE            0    0.00%        3,141    0.10%
TOTALS                    1,213     100%    3,174,048     100%
N   FREQ   RANGE   N-GRAM                             ENGLISH                   STRUCTURE              FUNCTION      SUBCATEGORY
6   10     5       apareció en el sitio de internet   appeared on the website   Verb Phrase fragment   Referential   Intangible framing
4   15     8       este artículo apareció en          this article appeared in  Verb Phrase fragment   Referential   Intangible framing
3   20     10      artículo apareció en               article appeared in       Verb Phrase fragment   Referential   Intangible framing
3   19     7       apareció en el                     appeared on the           Verb Phrase fragment   Referential   Intangible framing
3   14     5       el sitio de                        the site of               Noun Phrase fragment   Referential   Intangible framing
3   14     5       en el sitio                        on the site               Prep Phrase fragment   Referential   Identification/Focus
3   14     5       sitio de internet                  internet site             Noun Phrase fragment   Referential   Identification/Focus
3   11     10      informe de la                      report from the           Noun Phrase fragment   Referential   Identification/Focus
3   11     5       se presentó en                     was presented on          Verb Phrase fragment   Referential   Intangible framing
• Lexical Bundle – an N-gram that occurs a certain number
of times across a certain number of texts in a corpus
• Cut-off numbers are determined by the type of corpus and the length
of the N-gram
• Based on these criteria, the six-word expression
apareció en el sitio de internet (appeared on the website)
can be considered a lexical bundle for this corpus.
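The definition above combines two cut-offs: how often an N-gram occurs (frequency) and in how many texts it occurs (range). A minimal sketch of that filter follows. The function names, the toy texts, and the thresholds `min_freq=2` and `min_range=2` are hypothetical, since the slides do not state the cut-off values used.

```python
from collections import Counter, defaultdict

def ngrams(tokens, n):
    """Return every contiguous n-token sequence in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def lexical_bundles(texts, n, min_freq, min_range):
    """Map each n-gram meeting both cut-offs to its (frequency, range)."""
    freq = Counter()
    text_ids = defaultdict(set)  # which texts each n-gram occurs in
    for idx, tokens in enumerate(texts):
        for gram in ngrams(tokens, n):
            freq[gram] += 1
            text_ids[gram].add(idx)
    return {
        " ".join(gram): (freq[gram], len(text_ids[gram]))
        for gram in freq
        if freq[gram] >= min_freq and len(text_ids[gram]) >= min_range
    }

# Hypothetical toy corpus of three tokenized texts
texts = [
    "este artículo apareció en el diario".split(),
    "este artículo apareció en el sitio de internet".split(),
    "el informe apareció en la radio".split(),
]
bundles = lexical_bundles(texts, n=3, min_freq=2, min_range=2)
print(bundles)
```

The range condition is what keeps a phrase repeated many times inside a single text from being counted as a bundle for the whole corpus.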
• Empirically identified, frequency-based expressions
which could be salient for the examinee and therefore
useful for interpreting the texts:
• todo el mundo (the whole world)
• a través de (throughout)
• por ciento de (percent of)
• una de las (one of the)
• se trata de (is about)
• cuál es el (what is the…?)
• de enero de (of January of)
• de noviembre de (of November of)
• en la ciudad (in the city)
• One of the most popular textbooks for the AP Spanish course
(Díaz, 2014).
• Contains an exhaustive list of transition words and phrases
• Very few of these appear in the AP Corpus
TRANSITION       ENGLISH          FREQ
que              that              565
y                and               482
como             like, as           84
o                or                 57
pero             but                49
también          also               34
si               if                 33
cuando           when               30
porque           because            26
durante          during             18
según            according to       17
además           in addition        14
para que         so that            12
por ejemplo      for example        11
sobre todo       above all          11
aunque           although            9
entonces         then                8
sin embargo      however             8
mientras         while               7
o sea            that is             7
ya que           since               7
al + inf         upon + -ing         5
sino             but rather          5
a partir de      as of               4
como si          as if               4
luego            later, then         4
primero          first               4
sino que         but rather          3
tampoco          neither             3
una vez que      once                3
tanto… como…     as much… as…        3
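Frequencies like those in the table can be obtained by searching the corpus for each transition as a whole-word, possibly multi-word, string. The sketch below is one simple way to do that, not the procedure used in the study. The function `count_phrases` and the sample text are assumptions made up for the example.

```python
import re

def count_phrases(text, phrases):
    """Count case-insensitive, whole-word occurrences of each phrase in text."""
    counts = {}
    for phrase in phrases:
        # Lookarounds stop "que" from matching inside "porque" or "aunque"
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        counts[phrase] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Hypothetical sample text, not taken from the exam
sample = ("Sin embargo, el artículo dice que el clima cambia. "
          "Por ejemplo, los osos sufren porque la nieve desaparece; "
          "sin embargo, hay esperanza.")
counts = count_phrases(sample, ["sin embargo", "por ejemplo", "porque", "que", "aunque"])
print(counts)
```

This simple matcher has limits: it counts "sino" inside "sino que" as well, and it cannot find discontinuous patterns such as "tanto… como…" or "al + inf", which need a gap-tolerant pattern or part-of-speech tagging.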
• The language used on the AP Spanish Exam appears to consist
primarily of lexically rich but grammatically simple text.
• High frequency of relatively obscure & specific vocabulary items
• Many common “general” vocabulary items are missing
• Texts consist mostly of simple sentences with few conjunctions
• Communication relies on noun phrase modification (academic register)
• 83% of all verbs in present, infinitive or preterite forms.
• Recurrent word combinations are primarily used to introduce
source texts.
• In order to successfully interpret the tasks on the AP
Spanish Exam, students must possess a broad
vocabulary that is strongly rooted in, but extends well
beyond, the most frequent lexical items in the language.
• An AP student’s vocabulary should include a variety of
synonyms, especially a wide range of nouns related to
specific themes that express concrete entities and
abstract concepts.
• Present, Preterite & Imperfect tenses along with the
Infinitive account for:
• 86% of all verbs in the AP Corpus
• 78% of all verbs in the Corpus del Español
• The most important grammatical focus for the AP class
might well be that of the noun phrase.
• Complex verb tenses should not be the organizing factor
for an upper-level Spanish curriculum
• Anderson, N. J. (2014). Developing Engaged Second Language Readers. In M. Celce-Murcia, D. M.
Brinton, & M. A. Snow (Eds.), Teaching English as a Second or Foreign Language. 4th ed. (pp. 170-188).
Boston: Heinle Cengage.
• Anthony, L. (2011). AntConc (Version 3.2.4w) [Computer Software]. Tokyo, Japan: Waseda University.
Available from http://www.antlab.sci.waseda.ac.jp/
• Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken
and Written English. Essex, England: Longman.
• College Board. (2008-2013). AP Spanish Language Exam: Free-Response Questions. Retrieved from
http://apcentral.collegeboard.com/apc/public/courses/teachers_corner/221848.html.
• Cortes, V. (2013, January). Waiting for the revolution. Plenary talk presented at the Conference for the
American Association of Corpus Linguistics (AACL), San Diego, California, USA.
• Davies, M. (2002-). Corpus del Español: 100 million words, 1200s-1900s. Available online at
http://corpusdelespanol.org.
• Davies, M. (2006). A Frequency dictionary of Spanish: Core vocabulary for learners. New York:
Routledge.
• De Kock, J. (2001). [Preface]. In J. De Kock (Ed.), Gramática española: Enseñanza e investigación (Vol.
7. Lingüística con corpus). (pp. 7-8). Salamanca: Ediciones Universidad de Salamanca.
• Díaz, J. M. (2014). AP Spanish: Preparing for the Language and Culture Examination. Boston: Pearson
Education.
• Parodi, G. (2007). Catching up with corpus linguistics: Register-diversified studies from different corpora
in different Spanish-speaking countries. In G. Parodi (Ed.), Working with Spanish Corpora. (pp. 1-10).
New York: Continuum.
• Tracy-Ventura, N., Cortes, V., & Biber, D. (2007). Lexical bundles in speech and writing. In G. Parodi
(Ed.), Working with Spanish Corpora. (pp. 217-231). New York: Continuum.

An exploratory corpus study of the AP Spanish

  • 2. • The impression of the language used on the AP Spanish Exam is that it primarily consists of lexically rich but grammatically simple text. • Vocabulary – relatively specialized for specific topics • Mostly of simple sentences and relies on noun phrase modification • 86% of all verbs are in present, past, and infinitive forms. • Recurrent formulaic expressions are used to introduce source texts.
  • 3.
  • 4. • (Anthony, 2011) • Corpus research: • Uses a computer program called a concordancer • analyze key words, phrases & parts of words • in a large, representative, computerized collection of texts, called a corpus. (O’Keefe, McCarthy & Carter 2007) • Allow very extensive, systematic and descriptive data (De Kock 2001)
  • 5. • Relatively few corpus studies in languages other than English (Parodi, 2007) • “Gap” between corpus-based research results and pedagogical practice (Cortes 2013)
  • 6. The present study aspires to: help redress both the lack of corpus research in Spanish and the gap between research and practice by applying corpus methodologies to a pedagogical problem from a Spanish L2 classroom: How to best prepare high school students for success on a high-stakes, skills-based exam of proficiency in Spanish.
  • 7. • RQ1: How representative is the AP Spanish Exam of broader usage of Spanish? Specifically, in terms of: • Vocabulary • Parts of speech • Verb forms • RQ2: What are the most frequent recurrent word combinations? • What are the salient 3-, 4-, 5-, or 6-grams used on the exam? • Are there any salient tendencies in n-gram use? • RQ3: Are the “transition phrases” suggested by a popular test-prep book used frequently on the exam?
  • 8. • Year-end, Skills-based exam • No vocabulary or grammar specifications • Students must use information from authentic texts to: • Write a personal letter • Compose a synthesis essay • Respond orally to a simulated conversation • Make an oral synthesis presentation (College Board, 2008-2013)
  • 9. • Total of 10 texts • 18,333 word tokens • Most of the text is from the articles and radio reports used as sources for the presentational writing and speaking exercises.
  • 10. • List of the 5,000 most frequent words in Spanish • Based on a subset of the 100-million-word Corpus del Español (CDE) (Davies 2002-) • Balanced, representative corpus: • Spoken/Written • Latin America/Spain
  • 11.
  • 12. • 30% of the top 300 words in Davies’ (2006) do not appear on the AP word list. • Of those, 41% are verbs, including many core vocabulary items for lower-level Spanish classes: PONER (PUT) LLAMAR (CALL) VENIR (COME) SALIR (LEAVE) VOLVER (RETURN) VIVIR (LIVE) MIRAR (LOOK) EMPEZAR (BEGIN) ENTRAR (ENTER) ENTENDER (UNDERSTAND) PEDIR (REQUEST) RECIBIR (RECEIVE) TERMINAR (FINISH) SACAR (TAKE OUT) NECESITAR (NEED) LEER (READ) ABRIR (OPEN)
  • 13. • General Nouns: • COSA (THING) • HOMBRE (MAN) • MUJER (WOMAN) • MODO (WAY) • RELACIÓN (RELATIONSHIP) • Body Parts: • MANO (HAND) • OJO (EYE) • Human Relations: • HIJO (SON) • SEÑOR (MISTER) • MADRE (MOTHER) • NOSOTROS (WE) • NADIE (NOBODY) • Religion: • VERDAD (TRUTH) • SANTO (HOLY) • DIOS (GOD) • Time/Space • PUNTO (POINT) • LADO (SIDE) • NOCHE (NIGHT) • PRINCIPIO (BEGINNING) • PUEBLO (TOWN)
  • 14. • Several of the generally common adjectives that are missing from the AP Corpus frequency list are typically pre-modifiers. • AQUEL (THAT) desde aquel día (from that day) • TAL (SUCH) hacerlo de tal manera (to do it in such a way) • PROPIO (OWN) tiene su propio estilo (has his own style) • NINGÚN (NONE) no hay ningún problema (there’s no problem) • CUALQUIER (ANY) puede hacer cualquier cosa (can do any thing) • ÚNICO (ONLY) ¿Usted es el único hijo? (You are the only son?)
  • 15. • Terms used to introduce source texts for the presentational writing and speaking activities • Not extremely salient for the student taking the exam— referential information—not necessary for interpreting the texts • May be helpful to guide students in quickly selecting appropriate strategies to make the most efficient use of time FUENTE (SOURCE) DIARIO (NEWSPAPER) INFORME (REPORT) ARTÍCULO (ARTICLE) APARECER (APPEAR) RADIO (RADIO) EMITIR (BROADCAST) SIGUIENTE (FOLLOWING) TITULADO (TITLED) TEXTO (TEXT) CONVERSACIÓN (CONVERSATION) IMPRESO (PRINTED) PERIÓDICO (NEWSPAPER) ADAPTACIÓN (ADAPTATION) GRABACIÓN (RECORDING)
  • 16.
  • 17. • Geography: país (country), mundo (world), ciudad (city), español (Spanish), lengua (language), mundial (worldwide), idioma (language), estado (state) • Environment: cambio (change), climático (climate), invierno (winter), oso (bear), ave (bird), combustible (fuel), nieve (snow), calentamiento (warming) • Wellbeing: agua (water), físico (physique), salud (health), organismo (body), risa (laughter), peso (weight), alimento (food), kilómetros (kilometers) • Technology: computadora (computer), internet (Internet), digital (digital), electrónico (electronic), red (network), tecnología (technology), virtual (virtual) • Fine arts: arte (art), música (music), orquesta (orchestra), artista (artist), producción (production), pintura (painting), músico (musician), lienzo (canvas)
  • 18.
  • 19. WORD CLASS AP TOKENS % CDE TOKENS % PREPOSITIONS 3,079 22.79% 5,553,520 24.72% ARTICLES 2,544 18.83% 4,643,039 20.67% CONJUNCTIONS 1,371 10.15% 3,781,609 16.83% PRONOUNS 608 4.50% 2,046,356 9.11% VERBS 1,213 8.98% 1,928,260 8.58% ADVERBS 566 4.19% 1,764,952 7.86% COMMON NOUNS 2,435 18.02% 1,459,968 6.50% ADJECTIVES 1,294 9.58% 614,069 2.73% PROPER NOUNS 296 2.19% 365,057 1.62% NUMERALS 100 0.74% 246,519 1.10% INTERJECTIONS 5 0.04% 64,277 0.29% TOTAL 13,511 100% 22,467,626 100%
  • 20. WORD CLASS ACADEMI C NEWS FICTIO N ORAL COMMON NOUNS 241,116 222,729 209,619 169,680 PREPOSITIONS 162,221 156,788 132,089 118,329 ARTICLES 153,504 139,484 124,272 105,910 VERBS 116,308 136,359 187,788 183,306 ADJECTIVES 90,953 72,177 58,305 50,667 CONJUNCTIONS 72,917 76,856 97,745 116,953 PROPER NOUNS 53,161 57,932 22,147 28,177 ADVERBS 27,754 37,160 55,152 79,902 PRONOUNS 24,821 32,464 65,805 73,150 NUMERALS 6,125 8,705 5,426 9,434 INTERJECTIONS 93 286 818 8,134
  • 21. VERB FORM AP % CDE % PRESENT 747 61.58% 1,190,971 37.52% INFINITVE 211 17.39% 459,890 14.49% PRETERITE 126 10.39% 386,218 12.17% IMPERFECT 27 2.23% 443,182 13.96% PAST PARTICIPLE 53 4.37% 259,488 8.18% GERUND 4 0.33% 107,727 3.39% CONDITIONAL 13 1.07% 57,225 1.80% FUTURE 12 0.99% 67,040 2.11% SUBJUNCTIVE- PRESENT 17 1.40% 126,093 3.97% SUBJUNCTIVE-PAST 3 0.25% 73,073 2.30% SUBJUNCTIVE- FUTURE 0 0.00% 3,141 0.10% TOTALS 1,213 100% 3,174,048 100%
  • 22.
  • 23. ?- Gram Fre q Rang e N-Gram English Structure Function Subcategory 6 10 5 apareció en el sitio de internet appeared on the website Verb Phrase fragment Referential Intangible framing 4 15 8 este artículo apareció en this article appeared in Verb Phrase fragment Referential Intangible framing 3 20 10 artículo apareció en article appeared in Verb Phrase fragment Referential Intangible framing 3 19 7 apareció en el appeared on the Verb Phrase fragment Referential Intangible framing 3 14 5 el sitio de the site of Noun Phrase fragment Referential Intangible framing 3 14 5 en el sitio on the site Prep Phrase fragment Referential Identification/ Focus 3 14 5 sitio de internet internet site Noun Phrase fragment Referential Identification/ Focus 3 11 10 informe de la report from the Noun Phrase fragment Referential Identification/ Focus 3 11 5 se presentó en was presented on Verb Phrase fragment Referential Intangible framing
  • 24. • Lexical Bundle – an N-gram that occurs a certain number of times across a certain number of texts in a corpus • Cut-off numbers determined by the type of corpus and the length of N-gram • Based on these criteria, the six-word expression apareció en el sitio de internet (appeared on the website) can be considered a lexical bundle for this corpus.
  • 25. • Empirically identified, frequency-based expressions which could be salient for the examinee and therefore useful for interpreting the texts: • todo el mundo (the whole world) • a través de (through) • por ciento de (percent of) • una de las (one of the) • se trata de (is about) • cuál es el (what is the…?) • de enero de (of January of) • de noviembre de (of November of) • en la ciudad (in the city)
  • 26.
  • 27. • One of the most popular textbooks for the AP Spanish course. • Contains exhaustive list of transition words and phrases • Very few of these appear in the AP Corpus
  • 28.
    | TRANSITION | English | FREQ |
    | --- | --- | --- |
    | que | that | 565 |
    | y | and | 482 |
    | como | like, as | 84 |
    | o | or | 57 |
    | pero | but | 49 |
    | también | also | 34 |
    | si | if | 33 |
    | cuando | when | 30 |
    | porque | because | 26 |
    | durante | during | 18 |
    | según | according to | 17 |
    | además | in addition | 14 |
    | para que | so that | 12 |
    | por ejemplo | for example | 11 |
    | sobre todo | above all | 11 |
    | aunque | although | 9 |
    | entonces | then | 8 |
    | sin embargo | however | 8 |
    | mientras | while | 7 |
    | o sea | that is | 7 |
    | ya que | since | 7 |
    | al + inf | upon + -ing | 5 |
    | sino | but rather | 5 |
    | a partir de | as of | 4 |
    | como si | as if | 4 |
    | luego | later, then | 4 |
    | primero | first | 4 |
    | sino que | but rather | 3 |
    | tampoco | neither | 3 |
    | una vez que | once | 3 |
    | tanto… como… | as much… as… | 3 |
  • 29. • The impression of the language used on the AP Spanish Exam is that it primarily consists of lexically rich but grammatically simple text. • High frequency of relatively obscure & specific vocabulary items; • Many common “general” vocabulary items are missing • Texts consist mostly of simple sentences with few conjunctions • Communication relies on noun phrase modification—academic register • 83% of all verbs in present, infinitive or preterite forms. • Recurrent word combinations are primarily used to introduce source texts.
  • 30. • In order to successfully interpret the tasks on the AP Spanish Exam, students must possess a broad vocabulary that is strongly rooted in, but extends well beyond, the most frequent lexical items in the language. • An AP student’s vocabulary should include a variety of synonyms, especially a wide range of nouns related to specific themes that express concrete entities and abstract concepts.
  • 31. • Present, Preterite & Imperfect tenses along with the Infinitive account for: • 86% of all verbs in the AP Corpus • 78% of all verbs in the Corpus del Español • The most important grammatical focus for the AP class might well be that of the noun phrase. • Complex verb tenses should not be the organizing factor for an upper-level Spanish curriculum
  • 32.
    • Anderson, N. J. (2014). Developing Engaged Second Language Readers. In M. Celce-Murcia, D. M. Brinton, & M. A. Snow (Eds.), Teaching English as a Second or Foreign Language. 4th ed. (pp. 170-188). Boston: Heinle Cengage.
    • Anthony, L. (2011). AntConc (Version 3.2.4w) [Computer Software]. Tokyo, Japan: Waseda University. Available from http://www.antlab.sci.waseda.ac.jp/
    • Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Essex, England: Longman.
    • College Board. (2008-2013). AP Spanish Language Exam: Free-Response Questions. Retrieved from http://apcentral.collegeboard.com/apc/public/courses/teachers_corner/221848.html.
    • Cortes, V. (2013, January). Waiting for the revolution. Plenary talk presented at the Conference for the American Association of Corpus Linguistics (AACL), San Diego, California, USA.
    • Davies, M. (2002-). Corpus del Español: 100 million words, 1200s-1900s. Available online at http://corpusdelespanol.org.
    • Davies, M. (2006). A Frequency dictionary of Spanish: Core vocabulary for learners. New York: Routledge.
    • De Kock, J. (2001). [Preface]. In J. De Kock (Ed.), Gramática española: Enseñanza e investigación (Vol. 7. Lingüística con corpus). (pp. 7-8). Salamanca: Ediciones Universidad de Salamanca.
    • Díaz, J. M. (2014). AP Spanish: Preparing for the Language and Culture Examination. Boston: Pearson Education.
    • Parodi, G. (2007). Catching up with corpus linguistics: Register-diversified studies from different corpora in different Spanish-speaking countries. In G. Parodi (Ed.), Working with Spanish Corpora. (pp. 1-10). New York: Continuum.
    • Tracy-Ventura, N., Cortes, V., & Biber, D. (2007). Lexical bundles in speech and writing. In G. Parodi (Ed.), Working with Spanish Corpora. (pp. 217-231). New York: Continuum.
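
The frequency and range criteria described on slides 23 and 24 can be sketched in Python. This is only a minimal illustration, not the AntConc procedure used in the study: the whitespace tokenizer, the function names, the toy corpus and the cut-off values are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Yield every contiguous n-word sequence in a list of tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def ngram_stats(texts, n):
    """Return {ngram: (frequency, range)} for a list of raw text strings.

    frequency = total occurrences across the whole corpus
    range     = number of distinct texts that contain the n-gram
    """
    freq = Counter()
    text_ids = defaultdict(set)
    for text_id, text in enumerate(texts):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        for gram in ngrams(tokens, n):
            freq[gram] += 1
            text_ids[gram].add(text_id)
    return {gram: (freq[gram], len(text_ids[gram])) for gram in freq}


def lexical_bundles(stats, min_freq, min_range):
    """Keep only n-grams meeting both cut-offs (cut-off values are hypothetical)."""
    return {gram: fr for gram, fr in stats.items()
            if fr[0] >= min_freq and fr[1] >= min_range}


# Toy corpus of three invented "texts", for illustration only.
toy_corpus = [
    "este artículo apareció en el sitio de internet",
    "este informe apareció en el sitio de internet",
    "el informe se presentó en la radio",
]
stats = ngram_stats(toy_corpus, 3)
bundles = lexical_bundles(stats, min_freq=2, min_range=2)
```

Counting range separately from frequency matters because a phrase repeated many times inside a single text would otherwise look as prominent as one that recurs across the whole corpus.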

Editor's Notes

  1. Thank you for being here. Thank Viviana and Eric. This study is the culmination of a two-term exploration of the power and utility of corpus linguistics methodologies as applied to the context of the Spanish foreign language classroom.
  2. In this study, I explored the language of the AP Spanish Exam. The impression of the language used on the AP Exam is that it primarily consists of lexically rich but grammatically simple text. -In this presentation, I will demonstrate that: -The vocabulary of the AP Spanish Exam is relatively specialized for specific topics. -That it uses mostly simple sentences and relies primarily on the noun phrase to express meaning -that 86% of all verbs are accounted for by an in-depth knowledge of the present, past and infinitive forms. -and that many recurrent formulaic expressions are used to introduce information for the source texts
  3. For starters, what is a corpus study?
  4. -Corpus-based research begins by using a computer program, called a concordancer, to analyze key words or phrases, within a large, computerized collection of texts, called a corpus. -Here, you see what a concordancer program looks like. This is AntConc, the program I used in my analysis -These tools provide very extensive, systematic and descriptive linguistic data.
  5. -The vast majority of corpus research to this point has been focused on English. Until recently, very few studies have applied these methodologies to other languages, such as Spanish. -Additionally, there appears to be a general “gap” between corpus-based research results and pedagogical practice.
  6. This study hopes to help redress both the lack of corpus research in Spanish and the gap between research and practice by applying corpus methodologies to a pedagogical problem from a Spanish foreign language classroom: How to best prepare high school students for success on a high-stakes, skills-based exam of proficiency in Spanish.
  7. My research questions are: How representative is the AP Spanish Exam of broader usage of Spanish? Specifically, in terms of: Vocabulary Parts of speech And Verb forms Next, What are the most frequent / recurrent word combinations? What are the salient N-grams used on the exam? And, Are there any salient tendencies in N-gram use? Finally, Are the “transition phrases” suggested by a popular test-prep book used frequently on the exam?
  8. The Advanced Placement Program provides high-school students access to classes with “college-level” curricula. Students who score well on the year-end exam have the opportunity to earn college credit at most American universities. This exam does not have any specific vocabulary / or grammar specifications. On the free-response portion of the exam, students must use information from authentic sources to complete a series of communicative tasks, such as what you see here.
  9. Here we see some basic facts about the AP Corpus: -It has a total of 10 texts, and a total of 18,333 words. This corpus is quite small for reasons of design and feasibility. -The vast majority of the text in my corpus comes from the written / and oral sources for the presentational writing and speaking activities.
  10. For this investigation, I cross-referenced all words in my analysis with their rank in Davies’ Frequency Dictionary of Spanish. -This work is a catalog of the 5,000 most frequent words in Spanish, based on a sub-corpus of the Corpus del Español. -This is the first major corpus of Spanish to feature a large number of spoken texts as well as a geographical balance between Spain and Latin America.
  11. Lexical Analysis: Several interesting conclusions can be drawn by examining those items which are listed as very frequent in Davies’ frequency dictionary / but are absent from the AP Corpus.
  12. I compared Davies’ list of most frequent terms in Spanish with the word list of the AP Corpus. Looking at the top 300 words in Davies’ list, / 90 do not appear on the AP list. Of those 90, / 41% are verbs. Among these “absent” verbs are many of the core vocabulary items for lower-level Spanish classes. These are all verbs that are quite frequent in Spanish in general, but which do not show up in the AP Corpus.
  13. There are also many basic nouns missing from the AP Corpus list. A number of these are general nouns like thing, man, or way. One can conjecture that / in academic writing / many of these lexical items would be replaced by more specific terms. Other categories of nouns missing from the AP Corpus include human relationships, location in time or space, names of body parts and religious terms.
  14. Spanish primarily relies on post-modification of nouns, meaning that the adjective typically comes after the noun, unlike in English. So for instance, grande means big, and la casa is the house, but the usual word order is la casa grande, with the adjective following the noun. One unique feature of many of the generally common adjectives which are missing from the AP Corpus, however, is that they are typically pre-modifiers, meaning that these adjectives tend to appear before the noun that they describe. Here you see some examples from the Corpus del Español of each of these adjectives as a pre-modifier.
  15. On the other end of the spectrum, there is a large group of lexical items that are unusually frequent in the AP Corpus because they are used in the organization of the test itself. In particular, these terms are used in introducing the source texts for the presentational writing and speaking activities. These terms are not extremely salient for the student taking the exam since they convey only referential information, and they are not usually necessary for successfully interpreting the texts themselves.
  16. However, many of the lexical items that are unusually frequent in the AP Corpus are related to the exam question topics for a particular year. For example, the word BICICLETA (BICYCLE) appears a normalized 9,274 times per million words in the AP Corpus, but only 12 times per million in the Corpus del Español. Notice that all but one of the occurrences in the AP Corpus appear in only one year’s exam. The question for the 2011 presentational writing activity was: ¿Cuál es el impacto del uso de la bicicleta en distintos lugares del mundo? OR (What is the impact of the use of the bicycle in different places of the world?)
  17. Other thematic groups of unusually frequent words in the AP Corpus include: -Geography -Environment -Wellbeing -Technology -Fine Arts -Education You see here just a few of the unusually frequent terms for each topic.
  18. Turning to Grammatical Analysis… By compiling a table of the various word classes used in the AP Corpus and comparing that with the Corpus del Español, I noticed several interesting trends.
  19. The AP Corpus has a higher percentage of nouns and adjectives than the Corpus del Español, indicating a reliance on the noun phrase to carry most of the information within the text. The AP Corpus also uses many fewer conjunctions than the Corpus del Español, implying a scarcity of complex and compound sentences. Likewise, the percentage of adverbs is somewhat lower in the AP Corpus than in the Corpus del Español, indicating again that the AP Corpus relies primarily on noun phrase modification rather than verb phrase modification or discourse-level adverbials. The percentages of all other word classes were more or less equal between the two corpora.
  20. All of these observations fit with the general trends for the academic register of writing in Spanish, as shown in this table. The frequency numbers from the Corpus del Español for the various parts of speech are broken down by register. The register with the highest number for that particular word class is indicated in yellow and the register with the lowest number in blue. As you can see, the academic register of Spanish in the Corpus del Español shows higher numbers of noun phrase constituents, including common nouns, prepositions, articles and adjectives. In contrast, verbs, conjunctions, adverbs, pronouns and interjections are all least common in the academic register. These results reflect previous findings of the relationship between word class and register in English, as reported in the Longman Grammar of 1999.
  21. An analysis of verb tenses shows some very clear differences between the Corpus del Español and the AP Corpus. As you can see, almost 62% of all verb forms in the AP Corpus are in the present tense, while the Corpus del Español contains less than 40% present tense verbs. The AP Corpus also has a slightly higher percentage of infinitives than the Corpus del Español, but has far fewer past tense verbs, including a much smaller percentage of verbs in the imperfect aspect. Furthermore, there are considerably fewer verbs in the subjunctive mood, fewer participial forms and slightly smaller percentages of the future and conditional forms. In fact, the present, infinitive and preterite forms account for 83% of all verb tokens in the AP Corpus.
  22. Words in isolation, however, do not show the whole picture. Complex relationships between the ideas presented in a text are often communicated through formulaic multi-word expressions that can act as important “textual building blocks”, as they are called by Tracy-Ventura et al. (2007)
  23. In order to empirically identify these “textual building blocks,” I used the concordancer to find the most frequent N-grams in the corpus. N-grams are simply repeated sequences of words: the concordancer identifies every possible 3-word sequence, and then counts how many occurrences there are of that 3-gram. It is important to emphasize that this process is completed without any intuitive notions of what expressions will result. This table shows the most frequent N-grams in the AP Corpus, as well as analysis of the structure and function of each. No 5-grams were identified. As you can see, this empirical, corpus-driven methodology does not normally produce expressions which are complete structural units or which have idiomatic meaning. The majority of the 25 n-grams identified / consist of noun phrase or prepositional phrase fragments. In terms of the function, almost all N-grams found / were referential in nature. Most of these referential phrases are used to introduce written source texts and audio recordings on the exam. Much of this language is recurrent word-for-word from one year to the next, with the only difference being the name of the newspaper or website. These expressions are not extremely salient for the examinee because they do not form part of the actual text; however, they can hold important clues for the student about how to interpret the text and what metacognitive listening or reading strategies to employ in attending to the text that will follow.
  24. One of these N-grams meets the frequency and range requirements to be a lexical bundle. In order to be considered a lexical bundle, an expression must occur extremely frequently in a corpus and must appear across many different texts within the corpus. The longer a lexical bundle is, the rarer it is. Based on these criteria, the six-word expression apareció en el sitio de internet (appeared on the website) can be considered a lexical bundle for this corpus. This exact word combination occurs in half of the texts for a total of 10 times in the AP Corpus. The fact that this phrase is so long and that it has such a wide range within the corpus is quite extraordinary.
  25. As previously mentioned, many of the N-grams identified are used to introduce source texts on the exam. However, several of the other expressions on this list are much more salient for the examinee, and therefore would be more useful for interpreting the texts. Exposing students to phrases such as these could help them develop what Anderson (2014) calls automaticity in rapid fluent reading, that is, the ability to recognize words and meaning with little or no conscious attention. Most of these expressions are completely un-idiomatic, having a purely straightforward, literal meaning. The compound preposition a través de (through) and the verbal phrase se trata de (is about) can be semantically opaque to students, and so must be taught as a unit. Likewise, students who have limited experience with authentic texts can sometimes be confounded by even seemingly straightforward expressions such as una de las (one of the). It is therefore important that students be challenged to develop automaticity with even structural phrases like these.
  26. Research question number 3 was: Are the “transition phrases” suggested by a popular test-prep book used frequently on the exam?
  27. AP Spanish: Preparing for the Language Examination by José Díaz is one of the most popular textbooks for the AP Spanish course. It contains exhaustive lists of transition words and phrases in several appendices to help students prepare for the AP Exam. Very few of the more than 150 terms appear in the AP Corpus, but many of those that do appear / are quite frequent.
  28. These lists are intended to help students with organization and textual cohesion in their own written and spoken production; however, it is often difficult for students to interpret the sometimes complex relationships of meaning that these terms express. Guiding students to notice these phrases when they are used in authentic texts could be an important step towards students beginning to use them in their own production.
  29. As we have seen, the language of the AP Spanish Exam is lexically rich but grammatically simple. -It consists of many obscure words which are very specific to a particular subject matter. -Many of the most common vocabulary items are missing from the texts. -It is made up of mostly simple sentences with few conjunctions. -The high percentage of common nouns and adjectives is consistent with an academic register of writing. -The clear majority of verbs appear in the present tense, and in fact, only 3 verb forms account for 83% of all verb tokens in the AP Corpus. -The AP Corpus presents a large number of recurrent multi-word expressions which primarily function to couch introductory information regarding the source texts. -One of these frequency-based formulaic expressions appears to qualify as a lexical bundle.
  30. Pedagogical Implications: In order to successfully interpret the tasks of the AP Spanish Exam, students must possess a broad vocabulary that is strongly rooted in, but extends well beyond, the most frequent lexical items in the language. An AP student’s vocabulary should include a variety of synonyms, especially a wide range of nouns related to specific themes that express concrete entities and abstract concepts.
  31. Furthermore, AP Spanish teachers need not overemphasize advanced conjugation patterns and complicated sequences of tenses, since patterns taught in Spanish I and Spanish II account for 86% of all verbs on the AP Test and 78% of all verbs in the Corpus del Español. In fact, the most important grammatical focus for the AP class might well be that of the noun phrase, which is often neglected in the intermediate and upper levels. I am certainly not suggesting that teachers limit their curriculum to only the “basic” grammatical structures; however, I am proposing that complex verb tenses should not be the organizing factor for an upper-level Spanish curriculum, but that lexical, discourse and communicative factors should be the guiding features of the AP curriculum.
  32. Thank you very much! Do you have any questions?
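
The per-million figures quoted in note 16 are normalized frequencies, which allow a small corpus to be compared with a much larger one. A minimal sketch of that normalization follows; the function name and the raw count of 100 are hypothetical and chosen only for illustration, while the corpus size of 18,333 words is the figure given in note 9.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size


# Hypothetical raw count of a word in a corpus of 18,333 words.
ap_rate = per_million(100, 18_333)

# A word occurring 12 times in exactly one million words has a rate of 12.
reference_rate = per_million(12, 1_000_000)

# Ratio of the two rates: how many times more frequent the word is
# in the small corpus than in the reference corpus.
ratio = ap_rate / reference_rate
```

Because a single thematic text can inflate the rate in a corpus this small, a normalized frequency is best read alongside the number of texts in which the word actually occurs.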