2. The Dictionary of Italian Collocations. LREC 2010 - Stefania Spina - The Dictionary of Italian Collocations. Part of the APRIL project ("Personalised web environment for language learning"): NLP resources as a support for the lexical competence of students of Italian within a Virtual Learning Environment (VLE).
3. Presentation outline: background and motivation; reference corpus; methodology; dictionary compilation; integration within the VLE.
4. Background: different syntactic and semantic profiles, but prototypical features: semantic non-compositionality; non-substitutability of components by semantically similar words; non-insertion of external items. A continuum rather than definite categories.
5. Continuum. Semantic non-compositionality: tagliare la corda "run away" vs. aprire la porta "open the door". Non-substitutability: camera oscura "dark room" (*stanza oscura) vs. {fare|porre|rivolgere|formulare} una domanda "ask a question". Insertion of external items: fare una lunga calda riposante doccia "take a long, hot, restful shower" vs. sistema *molto operativo "operating system".
6. Motivation: collocations in SLA. Improving learners' fluency; non-native speakers and L2 vocabulary: first single words, then more extended chunks; tendency to overuse the creative combination of isolated words (Sinclair's open choice principle). Examples from Italian learner corpora: "preoccupata per il corso che mi mette nelle difficoltà" (Russia), target: mettere in difficoltà "cause problems"; "e poi alla fine ho fatto questa decisione" (Vietnam), target: prendere una decisione "make a decision".
7. DICI: collocations require specific pedagogical attention. The Dictionary of Italian Collocations (DICI): it is corpus-based; it is a learner-oriented tool: a list of the most common Italian collocations, classified on a frequency basis; it is also based on statistical methodologies (dispersion in the different textual genres represented in the corpus).
8. Reference corpus: the Perugia corpus, POS-tagged and lemmatized.
9. POS filtering: analysis of existing lists of collocations: 150 different POS sequences; the 10 most productive POS sequences selected.
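The POS-filtering step can be sketched as a frequency count over POS sequences. This is a minimal illustration only: the function name `top_pos_sequences`, the tag labels (`VERB`, `DET`, `NOUN`, `ADJ`) and the three sample entries are assumptions made for the example, since the slides do not give the tagset or the existing lists that were analysed.

```python
from collections import Counter


def top_pos_sequences(tagged_collocations, k=10):
    """Count the POS sequence of each collocation and return the k most frequent.

    tagged_collocations: list of collocations, each a list of (word, tag) pairs.
    """
    counts = Counter(
        tuple(tag for _, tag in collocation)
        for collocation in tagged_collocations
    )
    return counts.most_common(k)


# Hypothetical sample; the tags are illustrative, not the Perugia corpus tagset.
sample = [
    [("prendere", "VERB"), ("una", "DET"), ("decisione", "NOUN")],
    [("fare", "VERB"), ("una", "DET"), ("domanda", "NOUN")],
    [("camera", "NOUN"), ("oscura", "ADJ")],
]

for sequence, count in top_pos_sequences(sample, k=2):
    print(" ".join(sequence), count)
```

With a real list of collocations, `k=10` would correspond to keeping the ten most productive sequences out of the distinct ones observed.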
12. Dispersion. Examples: aggrottare la fronte "to frown" (fiction); vincere le elezioni "to win the elections" (press); dare una definizione "to give a definition" (academic prose). Juilland's D value (Juilland - Chang-Rodriguez, 1964). D value combined with frequency = usage. Usage value ≥ 2: 2047 candidate collocations. Manual selection. Final result: list of 1553 word combinations = dictionary entries.
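As a rough sketch of the dispersion step, Juilland's D is commonly computed as D = 1 - V / sqrt(n - 1), where V is the coefficient of variation of a collocation's frequencies across n subcorpora, and the usage coefficient as D multiplied by the total frequency. The code below follows that common formulation and assumes equally sized subcorpora; the slides do not say how frequencies were normalised before the threshold of 2 was applied, so the counts are made up and the function names are illustrative.

```python
import math


def juilland_d(freqs):
    """Juilland's D for a list of per-subcorpus frequencies (equal-sized subcorpora assumed)."""
    n = len(freqs)
    if n < 2:
        raise ValueError("at least two subcorpora are needed")
    mean = sum(freqs) / n
    if mean == 0:
        return 0.0
    # Population standard deviation, so that D stays between 0 and 1.
    variance = sum((f - mean) ** 2 for f in freqs) / n
    coefficient_of_variation = math.sqrt(variance) / mean
    return 1 - coefficient_of_variation / math.sqrt(n - 1)


def usage(freqs):
    """Usage coefficient: dispersion multiplied by total frequency."""
    return juilland_d(freqs) * sum(freqs)


# Made-up counts in three genres (e.g. fiction, press, academic prose).
evenly_spread = [10, 10, 10]
genre_bound = [30, 0, 0]
print(juilland_d(evenly_spread), usage(evenly_spread))
print(round(juilland_d(genre_bound), 6), round(usage(genre_bound), 6))
```

A collocation concentrated in a single genre gets a D close to 0 and therefore a low usage value even when its raw frequency is high, which is why a usage threshold filters out genre-bound candidates.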
14. Compilation of the Dictionary. Lexical database enriched with two kinds of data: visible to the learner (client output): definition, examples, part of speech, syntactic context of occurrence of collocations; to be processed by other applications (server): internal syntactic configuration for automatic recognition.
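One way to picture the two kinds of data is a record whose learner-facing fields are separated from a server-side field. The sketch below is purely illustrative: the class name `CollocationEntry`, all field names, the example sentence and the encoding of the internal configuration as a tuple of lemmas are assumptions, not the actual schema of the DICI database.

```python
from dataclasses import dataclass, asdict


@dataclass
class CollocationEntry:
    # Learner-visible data (client output).
    headword: str
    definition: str
    examples: list
    part_of_speech: str
    syntactic_context: str
    # Server-side data used by other applications for automatic recognition.
    internal_configuration: tuple

    def learner_view(self):
        """Return only the fields meant to be shown to the learner."""
        data = asdict(self)
        data.pop("internal_configuration")
        return data


# Hypothetical entry; the values are invented for illustration.
entry = CollocationEntry(
    headword="prendere una decisione",
    definition="make a decision",
    examples=["Devo prendere una decisione."],
    part_of_speech="verb + noun",
    syntactic_context="verb with direct object",
    internal_configuration=("prendere", "decisione"),
)
print(entry.learner_view())
```

Keeping the recognition pattern out of the learner view mirrors the client/server split described on the slide.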
15. DB integration in the VLE. Virtual Learning Environment: web application specifically devoted to language learning. LELE (Linguistically-Enhanced Learning Environment): provide language learners with additional NLP resources, in order to improve their linguistic competence; receptive and productive learning activities concerning the recognition and the active use of collocations.
16. LELE Features: to automatically recognize and highlight multi-word units in written Italian texts; to show additional linguistic information about the selected collocations; to generate collocation tests for collocational competence assessment of second or foreign language learners. …
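The recognition and highlighting feature could work roughly as follows on a lemmatized text. This is a hedged sketch, not the LELE implementation: the functions `find_collocation` and `highlight`, the `max_gap` parameter and the bracket markup are all assumptions. The gap is allowed because, as the continuum examples show, external items can sometimes be inserted between the components of a collocation.

```python
def find_collocation(tokens, lemmas, max_gap=3):
    """Find occurrences of a lemma sequence in a lemmatized text.

    tokens: list of (form, lemma) pairs.
    lemmas: the collocation's component lemmas, in order.
    max_gap: maximum number of intervening tokens between two components.
    Returns a list of index tuples, one per match.
    """
    matches = []
    for start, (_, lemma) in enumerate(tokens):
        if lemma != lemmas[0]:
            continue
        positions = [start]
        for target in lemmas[1:]:
            prev = positions[-1]
            window = range(prev + 1, min(prev + 2 + max_gap, len(tokens)))
            hit = next((i for i in window if tokens[i][1] == target), None)
            if hit is None:
                break
            positions.append(hit)
        else:
            # Every component was found within the allowed gap.
            matches.append(tuple(positions))
    return matches


def highlight(tokens, positions):
    """Wrap the matched word forms in square brackets."""
    marked = set(positions)
    return " ".join(
        f"[{form}]" if i in marked else form
        for i, (form, _) in enumerate(tokens)
    )


# Hand-lemmatized illustrative sentence: "Ho fatto una lunga doccia".
sentence = [
    ("Ho", "avere"), ("fatto", "fare"), ("una", "uno"),
    ("lunga", "lungo"), ("doccia", "doccia"),
]
for match in find_collocation(sentence, ("fare", "doccia")):
    print(highlight(sentence, match))  # Ho [fatto] una lunga [doccia]
```

Matching on lemmas rather than word forms lets inflected forms such as "fatto" be recognized, which is why a lemmatized corpus and lemma-based entries are useful here.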
17. LELE scheme [diagram not reproduced; surviving label: server]
20. Conclusions. Next step: same methodology to the whole corpus, for all the 10 selected POS sequences. Further research: refine statistical measures; assign collocations to different levels of competence; other tools (productive tasks).
21. Stefania Spina, stefania.spina@unistrapg.it, http://april.unistrapg.it