1. Motivations Lexical Resources and Corpora -Ura nominalizations in Portuguese First Results Co-predication General Conclusions
An Overview on Portuguese Nominalizations
Livy Real (1), Alexandre Rademaker (1, 2)
(1) IBM Research Brazil
(2) FGV/EMAp
TYTLES, ESSLLI, 2015
Livy Real, Alexandre Rademaker TYTLES, ESSLLI, Barcelona - 2015
Outline
1 Motivations
2 Lexical Resources and Corpora
3 -Ura nominalizations in Portuguese
4 First Results
5 Co-predication
6 General Conclusions
Motivations
Main purpose: to describe the nominalizations formed by a specific morpheme in Portuguese, considering all of its possible meanings
Empirical description: corpus-based analysis combined with lexical resources
Nominalizations formed by -ura
Brazilian and European Portuguese
Goal: to find generalizations about the type relations within a single word and about co-predication structures
Motivations - Why -ura?
Ambiguous suffix (Real, 2006, 2008).
No longer productive, so the set of words formed by -ura is finite.
Straightforward counterparts in other Romance languages (-ure in French: coupure; -ura in Catalan: obertura).
Previous literature (Sandmann, 1988; Rocha, 1999; Real, 2006).
We already know of co-predication structures with words formed by -ura.
Example
A assinatura levou três horas e estava ilegível.
The signature/signing took three hours and was unreadable.
Lexical Resources and Corpora
Lexical resources - OpenWordNet-PT (de Paiva et al., 2012)
http://wnpt.brlcloud.com/wn/
Dictionaries + Corpora + Google Engine
Corpus Brasileiro http://www.linguateca.pt/ACDC
Dictionaries
Porto’s Dictionaries http://www.infopedia.pt
Caldas Aulete Dictionary http://www.aulete.com.br
Houaiss Dictionary http://www.houaiss.uol.com.br
Lexical Resources and Corpora
OpenWordnet-PT
Standard wordnet for Portuguese
Completely linked to Princeton’s Wordnet
Freely available in RDF
Automatically created and manually curated
OpenWordnet-PT has more -ura entries than any available
dictionary for Portuguese; easier to search.
-Ura Nominalizations - Selection
Extraction of all nominals ending in the graphic form -ura (442 words) from OpenWordnet-PT
Manual selection of the words actually formed by the morpheme -ura (150 words)
Examples
dobradura, assinatura, brochura, semeadura, tesoura, queimadura,
viatura, gordura, legislatura, púrpura, arquitetura, floricultura,
armadura, manjedoura, Cingapura, jura
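The first selection step (keeping every nominal whose written form ends in -ura) is mechanical and can be sketched in a few lines of Python. The word list below is a small hand-picked sample, six of the example words plus two unrelated nouns, and not the OpenWordnet-PT data; the function name `ends_in_ura` is only illustrative. The second step, deciding which words really contain the morpheme -ura, was done manually and is not modelled here.

```python
from typing import Iterable, List


def ends_in_ura(nouns: Iterable[str]) -> List[str]:
    """Keep the nouns whose written form ends in 'ura' (case-insensitive)."""
    return [noun for noun in nouns if noun.lower().endswith("ura")]


# Illustrative sample (an assumption, not the real resource):
# six nouns from the examples plus two that do not end in -ura.
sample = ["dobradura", "assinatura", "tesoura", "púrpura",
          "Cingapura", "jura", "casa", "leitor"]

candidates = ends_in_ura(sample)
print(candidates)
```

A purely graphic filter also lets through items such as the place name Cingapura, which is why a manual pass over the 442 candidates was still needed.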
-Ura Nominalizations - Categorization
Started from the categorization proposed for eventive nouns in Real (2014)
Checked every other possible reading of each word in all the dictionaries considered
Types
event, result, physical result, locative, collective, means, property,
instrument, a given portion, rest, function, duration of a function,
science/art.
-Ura Nominalizations - Examples
Event: Deduziu-se que a mãe lhe deu muita chicotada a cada travessura.
‘It was deduced that his mother gave him many lashes for every prank (every time he misbehaved).’
Result: A análise do material revelou que, 30 dias após a microenxertia, ocorreu a soldadura parcial dos microenxertos.
‘The analysis of the material showed that, 30 days after micrografting, the partial welding of the micrografts had occurred.’
-Ura Nominalizations - Examples
Physical Result: A varredura mostra somente picos, como pode ser visto na Figura 8, onde o espelho de simetria de 0° é mostrado.
‘The scan shows only peaks, as can be seen in Figure 8, where the 0° symmetry mirror is shown.’
Locative: Meu certificado está na pasta com meus documentos na prefeitura, mas o prefeito não o reconheceu.
‘My certificate is in the folder with my documents at the city hall, but the mayor did not recognize it.’
-Ura Nominalizations - Examples
Collective: Uffizi tem o mais completo testemunho do século XV, um momento decisivo da história da arte, marcado pela passagem da tradição bizantina medieval para a pintura do Renascimento.
‘The Uffizi holds the most complete record of the 15th century, a decisive moment in art history, marked by the passage from the medieval Byzantine tradition to Renaissance painting.’
Means: A narrativa é um cavalo: um meio de transporte cujo tipo de andadura, trote ou galope, depende do percurso a ser executado.
‘The narrative is a horse: a means of transportation whose type of gait, trot or gallop, depends on the route to be covered.’
-Ura Nominalizations - Examples
Property: Possui cerca de 48% de umidade e 24% de gordura.
‘It has around 48% moisture and 24% fat.’
Instrument: Caricaturizada, a gostosona desfila engravatada, com chapéu, abotoadura e tudo mais.
‘Caricatured, the hot girl parades wearing a tie, a hat, cufflinks and everything else.’
-Ura Nominalizations - Examples
A given portion: Assim verificamos que os 587 pés que aquelas dez propriedades dos Calça Pereira comportam podiam render uma média de 23,5 moeduras, isto é, uns 940 alqueires de azeite, que valeriam, ao preço de 60 reais o alqueire, 5640 reais.
‘Thus we verified that the 587 trees held by those ten properties of the Calça Pereira family could yield an average of 23.5 milling portions, that is, some 940 bushels of olive oil, which would be worth, at the price of 60 reais per bushel, 5640 reais.’
-Ura Nominalizations - Examples
Rest: O arroz-caril, confeccionado com especiarias e moedura de coco, era característico de Goa e estava muito difundido em Moçambique.
‘Rice curry, made with spices and coconut grindings, was characteristic of Goa and was widespread in Mozambique.’
Function: Mario renunciou à magistratura em novembro.
‘Mario resigned from the magistracy in November.’
-Ura Nominalizations - Examples
Duration of a function: Para a legislatura de 1995-1998, os dados provêm do Brasil.
‘For the 1995-1998 legislative term, the data come from Brazil.’
Science/Art: A Itália exprimiu-se, durante certos séculos, pela arquitetura, escultura, pintura.
‘For some centuries, Italy expressed itself through architecture, sculpture and painting.’
First Results
150 words formed by -ura
33 lexicalized and idiosyncratic senses
Two types that are available to action nominals in Portuguese are not possible (or not frequent) for words formed by -ura: resultative state and abstract result.
First Results
Type                      Number of senses
Event                     74
Result                    62
Physical result           43
Property                  38
Instrument                24
Collective                21
Science/Art               20
Locative                  8
Means                     8
A given portion           7
Function                  6
Rest                      5
Duration of a function    3
Lexicalized               33
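As a quick arithmetic check on the table, the short Python sketch below sums the sense counts and divides by the 150 words studied. It assumes that the rows are disjoint counts of senses that can simply be added up; the slides do not state a total, so the resulting figures are a derived estimate rather than a reported result.

```python
# Sense counts per type, copied from the table above.
senses_per_type = {
    "event": 74,
    "result": 62,
    "physical result": 43,
    "property": 38,
    "instrument": 24,
    "collective": 21,
    "science/art": 20,
    "locative": 8,
    "means": 8,
    "a given portion": 7,
    "function": 6,
    "rest": 5,
    "duration of a function": 3,
    "lexicalized": 33,
}
WORDS = 150  # words formed by -ura selected for the study

# Assumption: every row counts distinct senses, so the rows can be summed.
total_senses = sum(senses_per_type.values())
average_per_word = total_senses / WORDS

print(total_senses)
print(round(average_per_word, 2))
```

Under that assumption, each word carries a little more than two senses on average, which is consistent with -ura being described as an ambiguous suffix.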
First Results
Trying to find generalizations about the types a word can assume:
First Results - Generalizations
A nominal form that has the type ‘rest’ always belongs to the type ‘event’ as well (such as lavadura ‘washing’ and varredura ‘scan’);
A noun that belongs to the type ‘a given portion’ (such as moedura ‘milling’ and semeadura ‘sowing’) always has the following types as well: ‘event’, ‘result’, ‘event.result’;
Every noun that belongs to ‘duration of a function’ also has the ‘function’ type;
Nouns that belong to the type ‘means’ do not belong to any other type.
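These generalizations are implications over sets of types, so they can be checked mechanically once each word is annotated. The Python sketch below does this over a hypothetical toy lexicon: the words come from the slides, but the type sets are filled in only as far as the examples and generalizations suggest, and they are not the study's annotation. The complex type ‘event.result’ is left out for simplicity, and the helper names `violations` and `not_exclusive` are illustrative.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon (an assumption, not the annotated data set).
LEXICON: Dict[str, Set[str]] = {
    "lavadura": {"event", "rest"},
    "varredura": {"event", "rest", "physical result"},
    "moedura": {"event", "result", "a given portion", "rest"},
    "semeadura": {"event", "result", "a given portion"},
    "legislatura": {"function", "duration of a function"},
    "magistratura": {"function"},
    "andadura": {"means"},
}


def violations(lexicon: Dict[str, Set[str]], antecedent: str,
               required: Set[str]) -> List[str]:
    """Words that have `antecedent` but lack at least one type in `required`."""
    return sorted(word for word, types in lexicon.items()
                  if antecedent in types and not required <= types)


def not_exclusive(lexicon: Dict[str, Set[str]], type_name: str) -> List[str]:
    """Words that have `type_name` together with at least one other type."""
    return sorted(word for word, types in lexicon.items()
                  if type_name in types and len(types) > 1)


checks = {
    "rest -> event": violations(LEXICON, "rest", {"event"}),
    "a given portion -> event, result":
        violations(LEXICON, "a given portion", {"event", "result"}),
    "duration of a function -> function":
        violations(LEXICON, "duration of a function", {"function"}),
    "means is exclusive": not_exclusive(LEXICON, "means"),
}

for name, offenders in checks.items():
    print(name, "holds" if not offenders else f"fails for {offenders}")
```

Each check returns the list of offending words, so a counterexample in a larger annotated lexicon would be reported by name rather than only as a failed test.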
Co-predication - Corpus Brasileiro
Searched for co-predications with all the words in Corpus Brasileiro (more than 1 billion words from various text genres);
Manually checked 2000 random sentences for very common words, such as assinatura (70699 sentences) and gordura (29874 sentences);
Manually checked all the sentences for rare words, such as enxaguadura (12 sentences) and andadura (35 sentences);
No standard co-predications were found!
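The checking protocol above (a random sample of 2000 sentences for very common words, every sentence for rare ones) can be sketched as follows. The slides do not say where the line between "rare" and "very common" was drawn, so this sketch assumes the cut-off is the sample size itself; the function name `sentences_to_check`, the fixed seed and the placeholder sentences are likewise assumptions.

```python
import random
from typing import List, Sequence

SAMPLE_SIZE = 2000  # sentences checked for very common words


def sentences_to_check(sentences: Sequence[str],
                       sample_size: int = SAMPLE_SIZE,
                       seed: int = 0) -> List[str]:
    """Return every sentence for rare words, a random sample otherwise."""
    if len(sentences) <= sample_size:
        return list(sentences)
    # A seeded generator keeps the sample reproducible (assumed, not stated).
    return random.Random(seed).sample(list(sentences), sample_size)


# Placeholder data standing in for corpus hits of a rare and a common word.
rare_hits = [f"sentence {i}" for i in range(12)]
common_hits = [f"sentence {i}" for i in range(70699)]

print(len(sentences_to_check(rare_hits)))
print(len(sentences_to_check(common_hits)))
```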
Co-predication - Examples from Corpus Brasileiro
Essa assinatura que eu dei nesse relatório da oposição eu estou confirmando.
‘I am confirming this signature that I put on this opposition report.’
A assinatura que ocorreu na tarde, foi mostrado na noite nos telejornais RedeTV!
‘The signature/signing that took place in the afternoon was shown in the evening on the RedeTV! news programs.’
Co-predication - Google
Searched Google with regular expressions following previously proposed structures (Jezek & Melloni, 2011)
Constraints on co-predication with action nominals (ANs)
i. Split co-predication between main clause and subordinate clause;
ii. Temporal disjunction between the two predications;
iii. Omission of the internal argument.
Example:
"assinatura *, que *,"
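The query template in the example can be generated for every word on the list. The Python sketch below builds the quoted search string and, as a separate assumption, gives a local regular expression that approximates what the wildcards are meant to capture (the noun, optional material, then a relative clause set off by commas). The names `google_query` and `local_pattern` are illustrative, and the regular expression is not claimed to reproduce how the search engine interprets `*`.

```python
import re


def google_query(word: str) -> str:
    """Build the quoted wildcard query used as the search template."""
    return f'"{word} *, que *,"'


def local_pattern(word: str) -> re.Pattern:
    """Approximate the template as a regex for use on local text (assumed)."""
    # word, optional material without commas, then ', que <clause>,'
    return re.compile(rf"\b{re.escape(word)}\b[^,]*, que [^,]+,",
                      re.IGNORECASE)


print(google_query("assinatura"))

pattern = local_pattern("assinatura")
hit = "A assinatura do contrato, que levou três horas, estava ilegível."
miss = "A assinatura estava ilegível."
print(bool(pattern.search(hit)), bool(pattern.search(miss)))
```

Matching the template only retrieves candidates with the right shape; whether a hit is a genuine co-predication still has to be judged by hand.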
Co-predication - Google
Again, we did not find any standard co-predication!
Example
Acho que fica mais escuro e brilhante que com a opção de pintura metalizada (que custa quase 1000 euros).
‘I think it looks darker and shinier than with the metallic paint option (which costs almost 1000 euros).’
Co-predication - Again!
We constructed sentences with both rare and very frequent words formed by -ura and tested them with three native speakers (with no background in linguistics):
A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’
A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’
A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’
Co-predication
Surprise again!!! Almost none of the sentences were accepted by
the three speakers!
Co-predication - Speakers’ evaluation
?!! A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’
??! A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’
??? A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’
Word           Frequency in Corpus Brasileiro
Abertura       70699
Arquitetura    25669
Enxaguadura    12
Co-predication - Hypothesis
Hypothesis: there seems to be a relation between a word's frequency of use and the grammaticality of co-predications with it.
General conclusions
A nominal form that has the type ‘rest’ also belongs to the type ‘event’ (such as lavadura ‘washing’ and varredura ‘scan’), but co-predication between the two types is impossible;
A noun that belongs to the type ‘a given portion’ (such as moedura ‘milling’ and semeadura ‘sowing’) always has the types ‘event’, ‘result’ and ‘event.result’ as well, but any co-predication with ‘a given portion’ is blocked;
No lexicalized sense can be co-predicated with any other type.
General Remarks
It seems that how common a word is in the everyday lexicon has a large effect on the grammaticality of co-predication. Is this a cognitive issue? How should we deal with it?
If so, formal approaches that consider co-predication only from a syntactic-semantic perspective are losing very important features of this phenomenon, as pointed out by Real & Retoré (2014). We now have stronger evidence to argue that co-predication has an idiosyncratic nature.
General Conclusions
Obrigada! Thank you!
More Remarks
(?) I cannot believe this crappy construction took three years!
Não acredito que esta construção de merda levou três anos!
(?) How did such an ugly signature take three minutes?
Como uma assinatura tão feia levou três minutos?
More Remarks
Should we start looking for the non-formal variables that surround co-predication?
If type (mis)match has more to do with pragmatics and context than with truly formal variables, proposals that treat each noun as a type are not so problematic.