Semantic Hand-Tagging of the SenSem Corpus Using Spanish WordNet Senses


Published on

Irene Castellón, Salvador Climent, Marta Coll-Florit, Marina Lloberes, German Rigau.
9 January 2012
Published at GWC'12 (Global WordNet Conference)

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Semantic Hand-Tagging of the SenSem Corpus Using Spanish WordNet Senses

  1. 1. Semantic Hand-Tagging of the SenSem Corpus Using Spanish WordNet Senses Irene Castellón Salvador Climent Marta Coll-Florit GRIAL, UB GRIAL, UOC GRIAL, UOC Barcelona, Spain Barcelona, Spain Barcelona, Spain Marina Lloberes German Rigau GRIAL, UOC IXA, EHU/UPV Barcelona, Spain Donostia, Spain Abstract Sentences are tagged at both syntactic and se- mantic levels: verb sense, phrase and construc- This paper presents the semantic annotation tion types, aspect, argument functions and se- of the SenSem Spanish corpus, a research fo- mantic roles. cused on the semantic annotation of the nom- Currently, the project finished the semantic inal heads of the verbal arguments, with the tagging of the heads of the verbal arguments. final goal of acquiring semantic preferences This phase of the project consists of tagging for verb senses. We used Spanish WordNet noun heads with ESPWN1.6, ─ a resource inte- 1.6 senses in the annotation process. This process involves the analysis of the adequacy grated into the Multilingual Central Repository 1 of WordNet for semantic annotation and, in (MCR) (Atserias et al., 2004b) which follows the cases of inadequacy, the proposal of solu- EuroWordNet architecture (Vossen, 1998) and tions. The results are the tagged corpus, a currently maps multiple wordnets and ontologies, guide with semantic annotation criteria, and a e.g. Top Ontology (Álvez et al., 2008) and critical assessment of WordNet as a resource SUMO (Niles and Pease, 2003). for semantic corpus tagging. The next section comments on the state of the art in semantic annotation of corpus and §3 intro-1 Introduction duces our methodology. Then §4 presents the as-The semantic annotation of corpora provides a sessment of ESPWN1.6 as a resource for seman-basis for characterizing lexical-semantic infor- tic tagging; and §5 shows the set of solutions andmation. In this paper we present a relevant part annotation criteria stated to solve problemsof the long-term project of annotation of the arisen. The results of the project are presented inSenSem Spanish corpus (Alonso et al., 2007), i.e. section 6 and, finally, conclusions and proposedthe semantic annotation of the nominal heads of future work conclude the paper.the verbal arguments using the Spanish WordNet 2 Semantic Annotation of Corpora1.6 (hereinafter ESPWN1.6) (Atserias et al.,2004a). The final goal of this research is the ac- There are several semantically-tagged corporaquisition of semantic preferences for verb senses. for English such as PropBank (Palmer et al., This process involved two lead-up tasks: 2003) or FrameNet (Baker et al., 1997). Howev- er, the projects that are more closely related to1) Assessing the adequacy of ESPWN1.6 (and ours are Ontonotes (Yu et al., 2007), SemCor thus WordNets in general) for corpus annota- (Miller et al., 1994) and Multi-SemCor (Ben- tion. tivogli and Pianta, 2005) as they use WordNets2) Finding solutions for such inadequacy prob- for sense tagging. lems ─ thus setting an annotator’s guide pro- For Spanish, two main semantically-annotated viding stable criteria, specially for sense dis- corpora have been developed: ADESSE (Garcia ambiguation. Miguel and Albertuz, 2005) and AnCora (Taulé et al., 2008). Both consist of a corpus annotated SenSem is a databank of Spanish which maps with references to a verbal database.a corpus and a verbal database. The corpus con-sists of 25,000 sentences, 100 for each of the 250most frequent verbs of Spanish (Davies, 2002). 1
  2. 2. SenSem is structured in the same way, being to annotators for such preliminary test were quiteits main difference that (i) it is not a posteriori general ─ they are listed in §5.1 as general in-mapping since from the beginning it was de- structions.signed and built in order to annotate corpus data The initial agreement between annotators waswith verb sense information from the database; only 40%. Then, we discussed the annotation de-(ii) it is a balanced representation as each verb cisions for producing a first set of disambigua-lemma holds 100 sentence examples; and (iii) it tion criteria. A subsequent annotation of a subsetapplies to the most frequent verbs of the lan- of the corpus based on this consensus reached anguage. agreement between annotators of 84.32%. The establishment of specific criteria for disambigua-3 SenSem Methodology tion and semantic annotation continued until they became stable.Our methodology draws upon that of Eusemcor This process led to a general assessment of the(Agirre et al., 2006) which carried out a concur- drawbacks of the nominal part of ESPWN1.6 asrent processes of corpus annotation, correction a resource for semantic annotation which can beand extension of WordNet for the Basque lan- quite straightforwardly ported to other wordnetsguage. However, for several practical reasons, in similar projects for other languages. This as-we did not modified the Spanish WordNet. We sessment is presented in the following section.just keep record of the potential changes to be in-corporated when necessary. 4 Assessment of ESPWN1.6 as a Re- The process presented here was performed by source for Semantic Taggingsix linguists, three annotators and three annota-tors and judges. The task spent 6 months and its We have classified the drawbacks found whencost was 8 person/month. The process was divid- using ESPWN1.6 for semantic annotation intoed into the following steps: five general types: technical (§4.1), structural1) Automatic POS tagging of the corpus using (§4.2), derived from ambiguity or uncertainty of FreeLing (Padró et al., 2010). This process led meaning (§4.3), related to incompleteness (§4.4), to identifying the heads of the nominal argu- and interlinguistic (§4.5). ments. 4.1 Technical problems2) Adapting Eusemcor interface to the specific Firstly, let us consider some problems derived needs of the project. from decisions taken on the original design of3) A preliminar test in order to set an agreement WordNet and ESPWN1.6. between annotators for developing the initial Lexicographical description. WordNet glosses criteria. and/or examples might conflict with the meaning4) Full corpus annotation. one can infer from the ontological relations of the concept.5) In parallel with 4, coordination sessions with judges and annotators for establishing, extend- Morphology. ESPWN1.6 was derived by map- ing or modifying the initial criteria. ping and translating the English WordNet ver- sion 1.6. Thus, it encodes as lemmas English The annotation was performed using an inter- feminine forms which in Spanish are inflectionalface implemented as a web service which facili- (e.g. sister-‘hermano/a’). Similar problems aretates the annotation of all occurrences of a single found with diminutives. This causes mismatch-lemma in the corpus2. This ensures annotation ing between FreeLing and ESPWM1.6 lemmati-consistency as the same linguist deals with all in- zation criteria thus leading to several problemsstances of the same word and, in turn, it allows in annotation.him/her to get a holistic view of the distributionof the lemma in different ESPWN1.6 meanings. 4.2 WordNet 1.6 structural drawbacks The annotation of the test phase was per- Since ESPWN1.6 follows the expand approachformed by a team of four linguists (Carrera et al., (Vossen 1998) it also inherits the English Word-2008). They all annotated the same subset of the Net structural problems.corpus with a total of 50 sentences. Instructions Autohyponymy. Quite often two senses of the2 same word exhibit a direct hyponymy relation-
  3. 3. ship. In these cases, WordNet encodes different as various types of regular polysemy (Apresjan,levels of generalization of the same concept. 1973) are not applied consistently in WordNet. We can classify the general problem of exces- For instance, the lemma ‘trabajo’ (work) in sive granularity of meaning in WordNet on theESPWN1.6 spans in seven senses, including ‘tra- following basic types:bajo_4’ (“productive work”) and ‘trabajo_1’(“activity directed toward making or doing some- Regular Polysemy. Some entities or situationthing”); this two synsets are related by hy- types are systematically polysemous, as in theponymy ─‘trabajo_4’ ISA ‘trabajo_1’. case of events and its resulting state. For exam- ple, in one occurrence of “the education ofFalse hyponymy. The taxonomic structure of youth”, it is difficult to discern whether someoneWordNet encodes ontological inaccuracies. is talking about either the event or the final stateGuarino (1998) classifies such false hyponymy of the person as “educated” ─ both possibilitiesinto the following categories: are synsets in ESPWN1.6.‒ Confusion of meaning in situations of multiple Sense proliferation because of the point of inheritance. Incompatible meanings collapse view. Frequently, many different WordNet into one synset, e.g. ‘Statue of Liberty’ having synsets actually denote the same type of entity both hypernyms ‘statue’ (an object) and ‘plas- but being observed from different points of view. tic_art’ (an abstract concept). For instance, ‘familia’ (family) has three senses‒ Reduction of meaning e.g. ‘organization’ as in ESPWN1.6 corresponding respectively to “a hyponym of ‘group’ ─ while the meaning of social unit living together”, a “primary social organization is wider than that of group, not group” and “people descended from a common an specialization of it. ancestor”.‒ Overgeneralization. It happens in case of ex- Senses modulated by context. Only a very rich cessive heterogeneity of cohyponyms so that context could allow annotators to disambiguate indirectly WordNet forces an inappropriate between the possible meanings of a word. For in- more general interpretation of the hyperonym stance, ‘interno’ has three meanings in ESP- ─ e.g. an amount of matter as hyponym of a WN1.6: (i) a child in a boarding school, (ii) a physical object. person confined in an institution (like a prison or‒ Confusion between type and role. This is a hospital) or (iii) a resident doctor in a hospital. common mistake in the WordNet taxonomy. A Annotation is performed on a sentence basis so taxonomic relation by definition must be es- usually there is no context enough to draw such tablished between entity types but in WordNet subtle differences. there are relations from type to role, e.g. ‘per- Sense intersection. In some cases two meanings sona_1’ (person) as a hyponym of of a word do not stand for completely disjoint as- ‘agente_causal_1’ (causal agent). pects of the denotation ─ instead, they intersect.‒ Relationship confusion. One of the most com- Clearly these types of ambiguity are facts of mon structural problems is the confusion be- language, not drawbacks of WordNet. The point tween taxonomy and meronymy; e.g. here is that WordNet’s developers rather than ad- ‘hueso_1’ (bone) shows up as hyponym of dress these problems in some simple and struc- ‘connective_tissue_1’ when in fact bone is an tured way, they choose to add new word mean- entity “made of” tissue, so the relationship ings, not always consistently. should be meronymy. A different but related problem arises not from4.3 Ambiguity and vagueness of meaning WordNet design but from the information it pro- vides:Excessive granularity of WordNet senses is themost mentioned problem in the literature, espe- Insufficient or misleading information. Whencially that concerning Word Sense Disambigua- neither the gloss nor the examples or the struc-tion (Agirre and Edmonds 2007). As a direct ture of ESPWN1.6 help to make the semanticconsequence of this problem semantic distinc- distinctions clear.tions are difficult to be drawn by human annota-tors: for the same lemma very similar senses can 4.4 Problems of ESPWN1.6 incompletenessbe possible. Moreover, regular distinctions, such
  4. 4. WordNet, despite being the largest lexical-se- of WordNet into Spanish is very limited. In thismantic knowledge base, does not include all the section we list the drawbacks straightforwardlyexisting vocabulary of a language. The main di- caused by this.mensions of its incompleteness of are the follow- Lack of Spanish equivalent. There are casesing: where the concept does exist in the EnglishLack of synset. There are lemmas not appearing WordNet but the ESPWN1.6 has not the equiva-in ESPWN1.6. Moreover also the resource does lent translation. For instance, ‘juez’ (judge) isnot contain the corresponding interlingual con- monosemous in ESPWN1.6 as it appears onlycept, e.g. ‘seguridad social’, whose English with the usual sense related to the legal system.translation would be something like social health However the sense of “evaluator”, which issystem. present in the English WordNet (‘judge_2’, ‘evaluator_1’) is not present in ESPWN1.6.Lack of variant. Some synsets are incomplete asthey do not bear some Spanish synonyms. For in- Synsets split because of morphological mis-stance, ‘testamento vital’ is not encoded in ESP- matches. Since English has no grammatical gen-WN1.6 but it is in the English WordNet. der there are cases in which two lemmas of Eng- lish correspond to only one in Spanish. ThisMetaphorical senses. WordNet does not include causes the inappropriate splitting of the Spanishin a systematic way those meanings created by word in two different synsets, as in ‘tia_1’ andmetaphorical extension. For example, ESP- ‘tio_2’ as a result of the projection of EnglishWN1.6 ‘puente’ (bridge) does incorporate the ‘aunt_1’ and ‘uncle_1’.original architectural sense plus those senses de-noting a dental prothesis and the gymnastics ex- Cultural bias. Some concepts in ENGWN1.6ercise, but it does not has the sense of a special are culturally marked. This is improperly project-bank holiday in Spain. ed to Spanish thus causing several imbalances in ESPWN1.6. For example, the lemma ‘presi- Moreover, in some cases ESPWN1.6 includes dente’ (president) does not include the case ofthe metaphorical sense but not the original one, the head of the government in a kingdom or sim-as in ‘pie’ (foot), which bears the meaning corre- ilar type of state; WordNet only has the Presidentsponding to the base of a mountain but surpris- as the head of a (republican) state.ingly not that of the limb of a person. These are both examples of incompleteness of 5 Solutions and Annotators Guidethe building process of ESPWN1.6 from theEnglish WordNet. In the first case because the This section details the solutions adopted for themeaning of ‘bridge’ as “holiday” is not lexical- problems outlined above, which have been col-ized in English and in the second because of a lected in a guideline for the annotators. Most aresimple oversight. pragmatic solutions, often conditioned to the ini- tial requirement of performing the annotationLack of multiword units. Similarly, ESPWN1.6 without changing the original ESPWN1.6. Thisimplements multiword units without following decision relied on the fact that their improvementconsistent criteria, e.g. it includes ‘agente secre- was out of the scope of the project and we hadto’ (secret agent), which has a non-composition- not the rights to do it. However, the assessmental reading, and also ‘police officer’, which has a described in §4 is a good base to face the task incompositional reading. future projects.Lack of named entities. Proper names in Word-Net are mostly culturally related to the United 5.1 Technical problemsStates. ESPWN1.6 incorporates a number of General instructions. The following instruc-named entities but of course not all of the exist- tions have been defined in order to carry out theing ones. annotation process:4.5 Interlinguistic problems: language ver- ‒ The relations of a synset should be always sions of WN1.6 inspected, at least the hypernym(s) and the first level of hyponyms.Several reported problems are caused by the wayESPWN1.6 was built, i.e. as an expansion of theEnglish WordNet so that the level of adaptation
  5. 5. ‒ The semantic features and ontologies inte- grated into the MCR, especially TO and Synsets of the lemmata ‘consejo’ : SUMO, must always be considered. consejo_1: it is a committee. consejo_2: it is a recommendation. ‒ Glosses are usually less informative than re- consejo_3: it is about guidelines. lations and semantic features. Anyway, the consejo_4: it is also a committee. annotator should pay more attention to the consejo_5: it is about a ‘tip-off’. consejo_6: it is a not stable committee, an impro- English glosses than to the Spanish ones vised one. since the latter might be non accurate trans- lations of the former. Annotation guidelines: - When choosing between 2, 3 and 5, sense 2Morphology. The tagging is not performed should be always used unless there are contextswhen lemmatization mismatches appear between where senses 2 and 5 are clearly differentiatedFreeLing and ESPWN1.6 but the case is record- (highly unlikely).ed in a special separate list. - 1 and 4 are virtually identical and both are hy- ponyms of "administrative unit". 1 is always anno-Operators. Some special notation operators had tated. - 6 is only used when it is very clear that the con-to be defined in order to allow for more flexibili- text deals with not stable committees. Otherwise, 1ty in the selection of the appropriate synset. For is used.instance we stated markers for indicating the Figure 1: Guide for disambiguating “consejo”metaphorical or metonymycal use of a synset(MTF, MET), for identifying a multiword(MLTW) or for marking a partitive noun, e.g. 5.4 Problems related to incompleteness‘slice’ in ‘a slice of bread’ (PRT). Metaphorical and metonymic senses. When the5.2 Structural problems metaphorical or the metonymic senses of a word are not declared in ESPWN1.6 we annotate theAutohyponymy. Whenever two synsets standing occurrence using the synset for the literal inter-for the same concept are found in this relation- pretation and we mark it with the MTF o METship, the more general is chosen to tag the word operator.─except when the context is rich enough to drawthe distinction. Named Entities. Named entities (proper names, dates and amounts of money) are tagged usingFalse hyponymy. These errors are treated case the MUC categories3 since they have become aby case as separate problems of ambiguity of standard in in NLP. Moreover, we see them asmeaning as explained below. well suited for establishing selective preferences5.3 Problems of ambiguity and vagueness ─ which is the final goal of SenSem.In cases of serious difficulty for distinguishing Lack of sense. When a lemma in the corpus issenses in specially complex lemmas, the judges not recorded in ESPWN1.6 as synset or the apro-developed a particular guide for every one. See priate sense is not recorded as a variant, the factin Figure 1 an abridged guide for ‘consejo’. is documented and described for a future revision of the resource. Other criteria involve specific classes of Multiwords. When the interpretation of a multi-nouns, such as the following: in the case of poly- word is compositional (e.g. ‘colegio electoral’,semy between an objective description of an en- electoral college) the annotator tags analyticallytity and the corresponding social use (e.g. be- every part of the compound. If it is not composi-tween the physical and the social meaning of tional but the concept is recorded in ESPWN1.6‘año’-year), unless strong evidence against, the (e.g. ‘célula madre’, stem cell), it is tagged associal use will be chosen. MTW (multiword). If it is neither compositional nor is recorded in ESPWN1.6 (e.g. ‘puesta en In several cases of excessive granularity of marcha’, the action or effect of putting on ormeaning synsets have been clustered. 58 clusters switching on something) the compound is taggedaffecting 129 synsets have been created. 3 Message Understanding Conference. http://www- html
  6. 6. using two operators: MTW (multiword) and FLT oped which may also be useful for similar(lack of sense). projects. Besides, the casuistry detected and recorded5.5 Interlinguistic problems during the annotation is now being applied byInterlinguistic drawbacks such as lacking Span- our research group for developing the Spanishish equivalents or synsets split because of mor- WordNet 3.0 (Fernandez et al., 2008; Oliver andphological mismatches can not be solved without Climent, 2011).changing ESPWN1.6 as it is now. Therefore the The SenSem corpus is freely available under aannotator is instructed to record the case for the GPL license4.future. Acknowledgements6 Results This research has been carried out thanks to theAs a result of the research here presented, 23,307 projects PATHS ICT-2010-270082, FFI2008-forms for 3,693 noun lemmas of the SenSem cor- 02579-E/FILO and KNOW2 TIN2009-14 715-pus have been semantically annotated with ESP- C04 from the Spanish Ministry of Education andWN1.6; this corresponds to the 82.6% of the to- Science.tal amount of verbal arguments in the corpus. Inthis phase of the project, in order to achieve the Referencesoptimal cost-effectiveness only lemmas which Eneko Agirre, Izaskun Aldezabal, Jone Etxeberria, Elioccur more than 5 times have been annotated. Izagirre, Karmele Mendizabal, Eli Pociello, andTherefore, 17.4% of the corpus remains un- Mikel Quintian. 2006. Improving the Basquetagged. 91 lemmas were not tagged because they WordNet by corpus annotation. In Proceedings ofare not recorded in ESPWN1.6, being most of the Third International WordNet Conference,them culturally-specific concepts; as explained, pages 287-290.they have been recorded for their inclusion in fu- Eneko Agirre and Philip Edmonds, editors. 2007.ture versions of the Spanish WordNet. Word Sense Disambiguation: Algorithms and Ap- plications, volume 33 of Text, Speech and Lan-Conclusions and future work guage Technology. Springer-Verlag. Laura Alonso, Joan Antoni Capilla, Irene Castellón,This paper has presented the methodology and Ana Fernández, and Glòria Vázquez. 2007. Thedevelopment of a project for the semantic disam- SenSem Project: Syntactico-Semantic Annotationbiguation and annotation of the arguments of the of Sentences in Spanish. In Nikolov, K., Bontche-nominal heads of the SenSem corpus. SenSem is va, K., Angelova, G. & Mitkov, R. (Eds.), Recenta balanced corpus containing 100 sentences for Advances in Natural Language Processing IV. Se-each of the 250 most frequent verbs in Spanish. lected papers from RANLP 2005. Benjamins Pub- The result, added to previous developments in lishing Co, pages 89-98.SenSem, is a richly tagged corpus both syntacti- Javier Álvez, Jordi Atserias, Jordi Carrera, Salvadorcally and semantically as the annotation includes Climent, Egoitz Laparra, Antoni Oliver, andverb sense, head sense of the arguments, type of German Rigau. 2008. Complete and Consistentphrase, argument functions, thematic roles, con- Annotation of WordNet Using the Top Conceptstruction type and aspectual information. The Ontology. In Proceedings of the 6th Language Re-SenSem corpus is mapped to a database bearing sources and Evaluation Conference (LREC 2008), pages 1529-1534.relevant information for each verb sense, there-fore the result is a well suited resource for empir- Ju. D. Apresjan. 1973. Regular Polysemy. Linguistics,ical studies focused on verbs and for the acquisi- 142(12):5-32.tion and representation of selectional prefer- Jordi Atserias, Rigau G., Villarejo L. Spanishences. WordNet 1.6: Porting the Spanish Wordnet across As a collateral result of the process, a critical Princeton versions. LREC’04. ISBN 2-9517408-1-assessment of ESPWN1.6 has been presented. 6. Lisboa, 2004.Possibly, such an assessment is applicable to oth- Jordi Atserias, Lluís Villarejo, German Rigau, Enekoer WordNets as resources for annotating semanti- Agirre, John Carroll, Bernardo Magnini, and Piekcally corpus of other languages. As a result of the Vossen. 2004. The MEANING Multilingual Cen-assessment, an annotation guide has been devel- 4
  7. 7. tral Repository. In Proceedings of the Second In- of the 27th Conference of The Spanish Society For ternational WordNet Conference (GWC’04), pages Natural Language Processing (SEPLN) 23-30. Brno, Czech Republic. Lluís Padró, Miquel Collado, Samuel Reese, MarinaBaker, C. Fillmore, and J. Lowe. 1997. The berkeley Lloberes, and Irene Castellón. 2010. FreeLing 2.1: framenet project. In COLING/ACL’98, Montreal, Five Years of Open-source Language Processing Canada. Tools. In Proceedings of the Seventh conference on International Language Resources and Evalua-Luisa Bentivogli and Emanuele Pianta. 2005. Exploit- tion (LREC’10), pages 931-936. ing Parallel Texts in the Creation of Multilingual Semantically Annotated Resources: The MultiSem- Martha Palmer, Daniel Gildea, and Paul Kingsbury. Cor Corpus. In Natural Language Engineering. 2003. The Proposition Bank: An Annotated Corpus Special Issue on Parallel Texts, 11(3):247-261. of Semantic Roles. Computational Linguistics, 31(1):71-106.Jordi Carrera, Irene Castellón, Salvador Climent, and Marta Coll-Forit. 2008. Towards Spanish verbs’ Mariona Taulé, M. Antònia Martí, and Marta selectional preferences automatic acquisition. Se- Recasens. 2008. AnCora: Multilevel Annotated mantic annotation of SenSem corpus. In Proceed- Corpora for Catalan and Spanish. In Proceedings ings of the 6th international conference on Lan- of the 6th conference on International Language guage Resources and Evaluation (LREC 2008), Resources and Evaluation (LREC 2008), pages 96- pages 2397-2402. 101.Malcolm Davies. 2002. Un corpus anotado de Piek Vossen. 1998. Introduction to EuroWordNet. 100.000.000 de palabras del español histórico y Ide, N., Greenstein, D. & Vossen, P. (Eds.), moderno. In Procesamiento del Lenguaje Natural, Computers and the Humanities. Special Issue on 29, pages 21-27. EuroWordNet, 32(2):73-89.Ana Fernández-Montraveta, Glòria Vázquez, and Liang-Chih Yu, Chung-Hsien Wu, and Eduard Hovy. Christiane Fellbaum. 2008. The Spanish Version of 2007. OntoNotes: Corpus Cleanup of Mistaken WordNet 3.0. In Storrer, A., Geyken, A., Siebert, Agreement Using Word Sense Disambiguation. In A., & Würzner, K.M. (Eds.), Text Resources and Proceedings of the 22nd International Conference Lexical Knowledge. Mouton de Gruyter, pages on Computational Linguistics (Coling 2008) Vol. I 175-182.José M. García-Miguel and Francisco J. Albertuz. 2005. Verbs, semantic classes and semantic roles in the ADESSE project. In Erk, K., Melinger, A. & Schulte im Walde, S. (Eds.), Proceedings of the In- terdisciplinary Workshop on the Identification and Representation of Verb Features and Verb Classes, pages 50-55.Nicola Guarino. 1998. Some Ontological Principles for Designing Upper Level Lexical Resources. In Proceedings of the First International Conference on Language Resources and Evaluation, pages 527-534.George A. Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G. Thomas. 1994. Using a semantic concordance for sense identifica- tion. In Proceedings of the ARPA Human Lan- guage Technology Workshop, pages 240-243.Ian Niles and Adam Pease. 2003. Linking Lexicons and Ontologies: Mapping WordNet to the Suggest- ed Upper Model Ontology. In Proceedings of the 2003 International Conference on Information and Knowledge Engineering, pages 23-26.Antoni Oliver and Salvador Climent. 2011. Construcción de los WordNets 3.0 para castellano y catalán mediante traducción automática de corpus anotados semánticamente. In Proceedings