PARAPHRASING BIOMEDICAL SUPPORT VERB
CONSTRUCTIONS FOR MACHINE TRANSLATION
Anabela Barreiro
CLUP
CENTRO DE LINGUÍSTICA
UNIVERSIDADE DO PORTO
2009 NooJ conference and workshop
Anabela Barreiro Touzeur - Tunisia, 8-10 June 2009
MAIN GOALS
• Build a body of lexical, syntactic and semantic knowledge around support verb constructions (SVC)
• Apply this linguistic knowledge to paraphrasing: paraphrase and rewrite SVC, such as fazer uma operação (literally, to make an operation/surgery), with corresponding correct equivalences. The equivalences can be:
  • strong lexical verbs, such as operar (to operate on)
  • lexical-syntactic and stylistic variants of the original SVC, such as realizar uma operação (to perform an operation/surgery) or submeter-se a uma operação/cirurgia (to undergo/have an operation/surgery or to be operated on), depending on the lexical-semantic filling of the arguments and on their argument structure
• Show how these paraphrases increase source text quality
• Show how MT results can improve significantly when applying paraphrasing capabilities
GENERAL MOTIVATION
1. MT changed the world of translation
   it can no longer be ignored or underestimated
2. It is increasingly used and useful
   it often replaces human translation when the client requires gisting
3. However, good quality MT is still an ambitious goal
   after more than 50 years, MT still eludes researchers and developers, who are constantly challenged to provide better translations
How can MT move forward?
CHALLENGES TO MACHINE TRANSLATION
1. homography and common-noun nuance
2. anaphora + distant referential associations
3. ellipsis
4. extra-sentential and extra-textual information + extra-linguistic
knowledge
5. lexical divergences, idioms, etc.
6. named entities
7. long sentences and wordiness
8. unusual word order
9. multiword expressions (including support verb constructions)
SUPPORT VERB CONSTRUCTION
= multiword or complex predicate consisting of a semantically weak verb (the support verb) and a predicate noun, a predicate adjective, or a predicate adverb.
Predicate nouns can be:
• morphologically related to a verb
  PT: fazer uma apresentação de N = apresentar N
  EN: pay a visit to N = to visit N
• autonomous
  PT: fazer um mestrado - *mestrar
  EN: have fun - *to fun
WHY SUPPORT VERB CONSTRUCTIONS?
• Abundant in language and indispensable for communication
• Extensively and systematically studied within the Lexicon-Grammar Theory [M. Gross and followers], also in Portuguese [Ranchhod, 1990] [Baptista, 2005] [Chacoto, 2005] and in contrastive studies [Salkoff, 1990, 1999]
• Can often be replaced by stylistic variants (paraphrases)
• Present several degrees of variability (variable - invariable - idiomatic)
• Semantically weak and ambiguous
• Cannot be translated literally
ASSESSMENT TEST: Search pattern in corpora (COMPARA)
(<dar.V+inf> + <tomar.V+inf> + <pôr.V+inf> + <fazer.V+inf> + <ter.V+inf>) <Modif>? <N>
Selected the first 100 sentences for each verb and manually annotated the 500 sentences as SVC or NO SVC
RESULTS:
These verbs occur very frequently in SVCs. Overall, in 64.2% of their occurrences, these verbs are support verbs.
PT        dar    tomar   pôr    fazer     ter
transl.   give   take    put    make/do   have
NO SVC    11%    12%     23%    53%       80%
SVC       89%    88%     77%    47%       20%
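The overall 64.2% figure can be recovered from the SVC row of the table. The sketch below assumes that, because each verb contributed the same number of sentences (100), the overall share is the plain average of the five per-verb shares; the variable names are illustrative only.

```python
# Per-verb share of occurrences annotated as SVC, copied from the table (percent).
svc_share = {"dar": 89, "tomar": 88, "pôr": 77, "fazer": 47, "ter": 20}

# Equal sample sizes (100 sentences per verb) make the overall share
# the unweighted mean of the per-verb shares.
overall = sum(svc_share.values()) / len(svc_share)
print(f"{overall:.1f}%")  # 64.2%
```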
MT OF SUPPORT VERB CONSTRUCTIONS
MT systems mishandling support verb constructions
"Ah… that's better!"
We promptly verify the superior quality of the results in the second case!
IMPROPER TREATMENT OF BIOMEDICAL SUPPORT VERB CONSTRUCTIONS IN MT
PT: Por falta de condições técnicas, ele foi removido para o Hospital das Clínicas, onde se fez uma amputação a nível de ombro.
FreeTranslation: For lack of technical conditions, he was removed for the Hospital of the Clinics, where was done an amputation in terms of shoulder.
WorldLingo: Due to conditions techniques, it it was removed for the Hospital of the Clinics, where if the shoulder level made an amputation.

PT: Por falta de condições técnicas, ele foi transportado para o Hospital das Clínicas, onde os médicos amputaram o seu braço ao nível do ombro.
FreeTranslation: Due to conditions techniques, it it was carried to the Hospital of the Clinics, where the doctors had amputated its arm to the level of the shoulder.
WorldLingo: For lack of technical conditions, he was transported for the Hospital of the Clinics, where the doctors amputated his arm level with the shoulder.

PT: Por falta de condições técnicas, ele foi transportado para o Hospital das Clínicas, onde o seu braço foi amputado ao nível do ombro.
FreeTranslation: for lack of technical conditions, he was transported for the Hospital of the Clinics, where arm was amputated level with the shoulder.
WorldLingo: Due to conditions techniques, it it was carried to the Hospital of the Clinics, where its arm was amputated to the level of the shoulder.
STATEMENT
Linguistic knowledge of paraphrases can improve the quality of machine translation
• The conversion of support verb constructions into morphosyntactically and/or semantically related verbs or verbal expressions, together with the adequate specification and inclusion of predicate-argument structure information, produces a controlled language that is applicable to both general and domain-specific contexts and makes MT more reliable
1. Reduces complexity (= paraphrasing by simplification)
2. Reduces lexical ambiguity (paraphrase conveys the correct interpretation)
3. Facilitates interpretation when the support verb construction is idiomatic
4. Reduces wordiness (sometimes, improves source text quality)
5. Improves translation quality
THEORETICAL BACKGROUND - LINGUISTICS
• Meaning-Text Theory [Mel’čuk, 1988, 1996, 2003] – Support verb constructions are described and formalized in terms of lexical functions (LF). LF can be used as an interlingua to facilitate lexical transfer in MT projects. [Milićević, 2007] establishes a typology of paraphrases based on lexical-syntactic rules.
• FrameNet [Fillmore et al., 2002, 2003] – records the information necessary for the representation of argument mapping relations between a support verb and a nominalization.
• NomBank [Meyers et al., 2004b, 2004b] - maps syntactic positions in nominalizations to verbal arguments and identifies the allowed complements for a nominalization, relating the nominal complements to the arguments of the corresponding verb, including information about support verbs.
• Lexicon-Grammar Theory [M. Gross and followers] - theoretical framework adopted in this research
RELATED WORK ON PARAPHRASING
• question answering – discovering paraphrased answers provides additional evidence that an answer is correct - [Ibrahim et al., 2003], [Paşca, 2003], [Duboué & Chu-Carroll, 2006]
• information extraction and text mining - paraphrases help text categorization tasks or mapping to texts with similar characteristics - [Ibrahim et al., 2003], [Shinyama et al., 2002] [Shinyama & Sekine, 2003], [Sekine, 2005] [Paşca, 2005], [Paşca & Dienes, 2005]
• summarization - the identification of paraphrases allows information across documents to be condensed and helps improve the quality of the generated summaries [McKeown et al., 2002], [Barzilay, 2001, 2003], [Hirao et al., 2004] [Zhou et al., 2006b]
• natural language generation – the generation of paraphrases allows the production of more varied and fluent text - [Iordanskaja et al. 1991]
• machine translation – paraphrases help create a more fluent translation and are valuable in the evaluation of MT results - [Zhou et al., 2006], [Callison-Burch et al., 2006a, 2006b, 2007 and 2008]
RELATED WORK ON CONTROLLED WRITING AND
STYLISTIC EDITORS/AUTHORING AIDS
• Controlled language - application of human language techniques in an industrial environment. It is used to support technical writers in producing high-quality technical documentation by checking spelling, grammar, style and terminology in technical documents (text optimisation). Automated rewriting for controlled language translation
• Stylistic editing - focuses on clarity and expression; re-ordering of paragraphs, sentences and words; elimination of verbiage and jargon; simplification and untangling of complicated clauses; elimination of inconsistencies; use of stylistic paraphrases (reductions/extensions)
• Authoring aids - cover mostly spell checking, simple grammar checking (correction of text), writing and morphological variants, and synonyms
LREC 2008
Some available tools:
MULTIDOC [Haller, 2000]
KANT CE Checker [Mitamura and Nyberg, 2001] [Mitamura et al., 2003] [Rascu, 2006]
CLAT [Schmidt-Wigger, 1998]; [Carl et al., 2002]; [Hernandez and Rascu, 2004]
CLOUT - the Controlled Language Optimized for Machine Translation.
METHODOLOGY
Task 1
Analysis, extraction from corpora and formalization of support verb constructions using information described in dictionaries and grammars. The NooJ linguistic environment was the tool used.
Task 2
Paraphrasing of predicate noun constructions
Phase 1
Development of resources and pre-processing
• Construction of simple grammars to recognize and extract SVCs from texts
• Definition of properties in the dictionary and establishment of semantic links between verbs and SVCs + establishment of classes of paraphrases
• Application of grammars with dictionary information to texts
• Refinement of grammars
• Finer recognition of SVCs in texts -> paraphrasing + translation
Phase 2
Evaluation experiments
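The recognition step can be pictured with a small sketch that mirrors the corpus search pattern from the assessment test (support verb, optional modifier, noun). This is not a NooJ grammar: it assumes the text has already been tokenized, lemmatized and tagged, and the tag names, the `MODIFIER_TAGS` set standing in for `<Modif>`, and the function name are all hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str, str]  # (surface form, lemma, part-of-speech tag)

SUPPORT_VERBS = {"dar", "tomar", "pôr", "fazer", "ter"}
MODIFIER_TAGS = {"DET", "ADJ"}  # assumed stand-in for <Modif>


def find_svc_candidates(tokens: List[Token]) -> List[Tuple[str, ...]]:
    """Return surface sequences of the form: support verb, optional modifier, noun."""
    matches = []
    for i, (_, lemma, pos) in enumerate(tokens):
        if pos != "V" or lemma not in SUPPORT_VERBS:
            continue
        j = i + 1
        if j < len(tokens) and tokens[j][2] in MODIFIER_TAGS:
            j += 1  # the modifier is optional, as in <Modif>?
        if j < len(tokens) and tokens[j][2] == "N":
            matches.append(tuple(t[0] for t in tokens[i : j + 1]))
    return matches


sentence = [
    ("onde", "onde", "ADV"), ("se", "se", "PRO"), ("fez", "fazer", "V"),
    ("uma", "um", "DET"), ("amputação", "amputação", "N"),
]
print(find_svc_candidates(sentence))  # [('fez', 'uma', 'amputação')]
```

Matches produced this way are only candidates: as the assessment test shows, a sequence that fits the pattern is not always an SVC, which is why the grammars are refined with dictionary information.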
FORMALIZATION IN THE DICTIONARIES
Explicit marking of derivation and semantic verb associations
Adjective entries:
• Identification of derivational paradigms for predicate adverbs - adverbializations (annotation AVDRV)
literal,A+FLX=PRINCIPAL+IN+symb+EN=literal+DRV=AVDRV00:LITERALMENTE
Autonomous predicate nouns:
• Identification of autonomous predicate nouns (annotation Npred)
• Identification of a semantically related verb (annotation VRB)
• Link to the support verb (annotation VSUP)
curso,N+FLX=ANO+Npred+IN+inst+EN=course+VSUP=tirar+VRB=estudar+NPrep=de+Det=um
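To make the entry format concrete, here is a minimal sketch that splits an entry such as the one for curso into its lemma, category, bare features and key=value properties. It assumes a naive split on "," and "+" is sufficient for these examples; the function name and the output layout are hypothetical and are not part of NooJ. Reading VSUP + Det + lemma + NPrep as the surface form of the SVC is likewise an assumption made for illustration.

```python
def parse_entry(line: str) -> dict:
    """Split 'lemma,CATEGORY+feature+KEY=value...' into its parts."""
    lemma, rest = line.split(",", 1)
    category, *features = rest.split("+")
    entry = {"lemma": lemma, "category": category, "flags": [], "props": {}}
    for feat in features:
        if "=" in feat:
            key, value = feat.split("=", 1)
            # A key may occur more than once, so values are kept in a list.
            entry["props"].setdefault(key, []).append(value)
        else:
            entry["flags"].append(feat)
    return entry


entry = parse_entry(
    "curso,N+FLX=ANO+Npred+IN+inst+EN=course+VSUP=tirar+VRB=estudar+NPrep=de+Det=um"
)
props = entry["props"]
# Assumed reading: support verb + determiner + predicate noun + preposition.
svc = " ".join([props["VSUP"][0], props["Det"][0], entry["lemma"], props["NPrep"][0]])
print(svc, "->", props["VRB"][0])  # tirar um curso de -> estudar
```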
FORMALIZATION IN THE DICTIONARIES
Explicit marking of derivation and support verb associations
Verb entries:
• Identification of derivational paradigms for predicate nouns - nominalizations (annotation NDRV) and predicate adjectives (annotation ADRV)
• Link to the support verbs (annotation VSUP) occurring with the derived predicate nouns and adjectives
adaptar,V+FLX=FALAR+Aux=1+INOP57+Subset=132+EN=adapt+VSUP=fazer+DRV=NDRV00:CANÇÃO
azedar,V+FLX=LIMPAR+Aux=1+OBJTRundif98+Subset=740+EN=sour+VSUP=estar+DRV=ADRV00:ALTO
NEW RESOURCES
• Port4NooJ
  • a set of open source Portuguese linguistic resources representing ontological relations and paraphrastic relations. These resources integrate a bilingual extension for PT-EN MT.
  • Publicly available resources at:
    http://www.nooj4nlp.net
    http://www.linguateca.pt/Repositorio/Port4Nooj/
• DicTUM (Dicionário de Termos e Unidades Multipalavra)
  • a Dictionary of Multiword Expressions
Semantic relations
Verb - Predicate Nouns            8472
Predicate Adjectives - Adverbs     222
Total Links                       8694
NEW TOOLS
• ReWriter
  • a monolingual standalone paraphraser to pre-edit texts, using paraphrasing capabilities
  • Portuguese version ReEscreve - publicly available service at:
    http://www.linguateca.pt/ReEscreve/
• ParaMT
  • a bilingual/multilingual paraphraser to be integrated in machine translation systems
USEFULNESS OF THE AFOREMENTIONED PARAPHRASING TOOLS
• Controlled writing & text pre-editing
• Machine translation
• Text production and stylistics (authoring aids)
REWRITER – APPLICATION INTERFACE
TEXT PRODUCTION & STYLISTICS
PARAMT: A BILINGUAL/MULTILINGUAL PARAPHRASER FOR MT
MACHINE TRANSLATION
Recognition of Portuguese SVC and translation into English verbs
EVALUATION EXPERIMENT 1
Evaluation of recognition and paraphrasing of SVC with ReWriter
• 5 support verbs
• 500 baseline sentences, manually annotated (100 for each elementary support verb)

Verb      SVC recognition precision   SVC recognition recall   SVC paraphrasing precision
Pôr       73/73 - 100%                73/100 - 73%             72/73 - 98.6%
Tomar     75/75 - 100%                75/100 - 75%             68/73 - 93.1%
Ter       65/65 - 100%                65/100 - 65%             59/65 - 90.7%
Dar       57/60 - 95%                 57/100 - 57%             46/51 - 90.1%
Fazer     43/45 - 95.5%               43/100 - 43%             40/45 - 88.8%
Overall   98.4%                       62.6%                    93.4%
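The overall recognition figures appear to be pooled over all five verbs rather than averaged per verb. The sketch below works under that assumption: it sums the reported counts and recomputes precision and recall. The tuple layout and variable names are illustrative only.

```python
# verb: (correctly recognized SVC, all recognized SVC, recall denominator as reported)
counts = {
    "pôr": (73, 73, 100),
    "tomar": (75, 75, 100),
    "ter": (65, 65, 100),
    "dar": (57, 60, 100),
    "fazer": (43, 45, 100),
}

correct = sum(c for c, _, _ in counts.values())
recognized = sum(r for _, r, _ in counts.values())
expected = sum(e for _, _, e in counts.values())

precision = 100 * correct / recognized  # pooled (micro-averaged) precision
recall = 100 * correct / expected       # pooled (micro-averaged) recall
print(f"precision {precision:.1f}%  recall {recall:.1f}%")
# precision 98.4%  recall 62.6%
```

Pooling the counts reproduces the reported 98.4% and 62.6%; a plain average of the five per-verb precision values would give a slightly lower number (about 98.1%).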
EVALUATION EXPERIMENT 2
• 5 different support verbs
• 100 sentences with SVC (for each language, PT and EN)
  • 20 sentences for each support verb, each manually assigned a paraphrase
    > 20 SVC sentences + 20 paraphrases of the original SVC sentences
• TOTAL: 200 sentences for each language
  • 100 with SVC + 100 with paraphrases of the SVC
• 10 testers translated and evaluated translations
  • students of the MA in Translation and Linguistic Services

Impact of SVC paraphrases in translation
            Equal quality result   SVC translation is better   Paraphrase translation is better
EN-PT MT    17%                    26%                         57%
PT-EN MT    19%                    30%                         51%

For both EN-PT and PT-EN, testers considered that the paraphrase translations were of better quality.
The evaluation results confirm that these paraphrases helped the systems produce better MT.
ENHANCED DICTIONARY ENTRIES WITH
SYNTACTIC-SEMANTIC INFORMATION
adaptar,V+FLX=FALAR+Aux=1+INOP57+Subset132+EN=adapt+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de
favor,N+FLX=MAR+Npred+AB+state+EN=favor+VSUP=fazer+NPrep=a+VRB=ajudar
amputar,V+FLX=FALAR+Aux=1+OBJTRundif21+BioMed+EN=amputate+SUJ=AG+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de+OD=BP+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
amputar,V+FLX=FALAR+Aux=1+OBJTRundif21+BioMed+EN=amputate+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de+OD=BP+OI=PAT+VSTYLE=sofrer+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
rápido,A+FLX=RÁPIDO+PV+eagerType+EN=quick+DRV=AVDRV06:RAPIDAMENTE
adoçar,V+FLX=COMEÇAR+Aux=1+OBJTRundif75+Subset604+EN=sweeten+DRV=ADRV11:VERDE+VCOP=tornar
transplantar,V+FLX=FALAR+Aux=1+RECTR26+Subset=504+BioMed+EN=transplant+SUBJ=AG+VSUP=fazer+DRV=NDRV79:ANO+NPrep=de+DO=BP+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
transplantar,V+FLX=FALAR+Aux=1+RECTR26+Subset=504+BioMed+EN=transplant+SUBJ=AG+VSUP=fazer+DRV=NDRV79:ANO+NPrep=de+DO=BP+IO=PAT+VSTYLE=sofrer+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
médico,N+FLX=ANO+AN+des+med+AG+EN=doctor
médico,N+FLX=ANO+AN+des+med+AG+EN=physician
doente,N+FLX=ANO+AN+des+med+PAT+EN=patient
GRAMMAR TO RECOGNIZE AND
PARAPHRASE BIOMEDICAL SVC
PARAPHRASING FOR CONTROLLED WRITING AND
TEXT STYLISTICS
ARG0 = AGENT (AG)
Elementary SVC > Lexical Verb – fazer uma amputação = amputar
(to amputate)
Elementary SVC > non-elementary SVC - realizar/efectuar uma amputação
(to perform an amputation)
ARG0 = PATIENT (PAT)
Submeter-se/ser submetido a uma operação (to undergo surgery)
Ser operado (to be operated on)
Elementary SVC > non-elementary SVC - realizar/efectuar uma operação
(to perform an operation)
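The agent-side variants listed above can be sketched as a simple expansion over the support verb (VSUP) and stylistic variants (VSTYLE) recorded for amputar in the enhanced dictionary entries. This is an illustration only, not the ReWriter grammar: the predicate noun amputação and the determiner uma are supplied by hand, since deriving them from the DRV paradigm is outside what the slides show, and the function name is hypothetical.

```python
def svc_variants(features: dict, noun: str, det: str) -> list:
    """List 'verb + determiner + predicate noun' variants for one dictionary entry."""
    verbs = features.get("VSUP", []) + features.get("VSTYLE", [])
    return [f"{verb} {det} {noun}" for verb in verbs]


# Features copied from the agent-subject entry for amputar.
amputar = {"VSUP": ["fazer"], "VSTYLE": ["realizar", "efectuar"]}

# Noun and determiner are assumptions supplied by hand for this sketch.
variants = svc_variants(amputar, noun="amputação", det="uma")
print(variants)
# ['fazer uma amputação', 'realizar uma amputação', 'efectuar uma amputação']
```

Each generated string is a candidate equivalent of the lexical verb amputar; verbs would still need to be inflected to fit the sentence being rewritten.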
RECOGNITION AND MONOLINGUAL
PARAPHRASING OF BIOMEDICAL-RELATED SVC
MAIN CONCLUSION
• The linguistic knowledge on paraphrases formalized in this research, when applied to an MT system, improves its output quality.
When SVC were identified and replaced with semantically equivalent or similar verbal expressions as a pre-processing step to translation:
• a 21% improvement was observed in the evaluated quality of the results of PT-EN machine translation
• a 31% improvement was observed in the results of EN-PT machine translation
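These two figures match the gap, in percentage points, between the share of cases where the paraphrase translation was judged better and the share where the SVC translation was judged better in Evaluation Experiment 2. Reading the improvement this way is an inference from the numbers, not something the slides state explicitly; the sketch below simply checks the arithmetic.

```python
# Tester judgments from Evaluation Experiment 2 (percent of sentence pairs).
results = {
    "EN-PT": {"equal": 17, "svc_better": 26, "paraphrase_better": 57},
    "PT-EN": {"equal": 19, "svc_better": 30, "paraphrase_better": 51},
}

gains = {}
for pair, shares in results.items():
    assert sum(shares.values()) == 100  # the three outcomes cover all cases
    # Assumed definition: improvement = paraphrase-better minus SVC-better.
    gains[pair] = shares["paraphrase_better"] - shares["svc_better"]

print(gains)  # {'EN-PT': 31, 'PT-EN': 21}
```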
FUTURE RESEARCH SUGGESTED BY THIS THESIS
• Enlargement of syntactic-semantic relations between predicates and other morphosyntactically and/or semantically related elements
  Examples: operar – operação – operado
            urgente – urgentemente
• Assignment of thematic roles and semantic functions to nominal entities
  Examples: médico+AG+EN=doctor
            doente+PAT+EN=patient
• Paraphrasing of various linguistic phenomena for ReWriter extensibility (comprising stylistic variance and controlled language)
• Development and enhancement of ParaMT
• Enhancement of the machine translation model
PARAPHRASING FOR REWRITER EXTENSIBILITY
Linguistic phenomenon | Expression or sentence | Paraphrase
Adverbial | à volta da órbita (around the orbit of the eye) | periorbital (periorbital)
Adverbial | de forma interactiva (in an interactive way) | interactivamente (interactively)
Relative clause | N0 que têm sido escritos (N0 that have been written) | N0 que foram escritos > N0 escritos (N0 that were written > N0 written)
Relative clause | A velocidade a que se move a luz (Ŧ *The speed to which light moves) | A velocidade da luz (The speed of light)
Relative clause | O papel que a Europa tem (The role that Europe plays/has) | O papel da Europa (The role of Europe ≡ Europe’s role)
Relative clause | As dificuldades que temos (The difficulties we have) | As nossas dificuldades (Our difficulties)
If clause | se for necessário (if it is necessary) | se necessário (if necessary)
Named entity | A rainha de Inglaterra (The queen of England) | A rainha inglesa (The British queen)
Noun phrase | O heróico povo português (The heroic Portuguese people) | Os heróicos portugueses (The heroic Portuguese)

A formal linguistic study of paraphrases such as these would represent a significant contribution to NLP in general, and to MT in particular.
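A few of the fixed-expression pairs in the table can be pictured as a lookup of rewrite rules applied to running text. The sketch below is a deliberately simple string-level illustration, not how ReWriter is implemented: the rule list, the function name and the sample sentence are all made up for the example, and patterns with variables such as N0 would need real grammatical analysis.

```python
import re

# (pattern, paraphrase) pairs taken from the table; only invariable expressions.
RULES = [
    (r"\bse for necessário\b", "se necessário"),
    (r"\bde forma interactiva\b", "interactivamente"),
]


def rewrite(text: str) -> str:
    """Apply each rewrite rule in turn and return the paraphrased text."""
    for pattern, paraphrase in RULES:
        text = re.sub(pattern, paraphrase, text)
    return text


# Sample sentence invented for this illustration.
sample = "Use o menu de forma interactiva e, se for necessário, repita."
print(rewrite(sample))
# Use o menu interactivamente e, se necessário, repita.
```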
RELEVANT PUBLICATIONS
• Anabela Barreiro. "ParaMT: a Paraphraser for Machine Translation". In António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira & Paulo Quaresma (eds.), Computational Processing of the Portuguese Language, 8th International Conference, Proceedings (PROPOR 2008) Vol. 5190, (Aveiro, Portugal, September 8-10, 2008), Springer Verlag, pp. 202-211.
• Anabela Barreiro. "Port4NooJ: Portuguese Linguistic Module and Bilingual Resources for Machine Translation". In
Proceedings of the 2007 International NooJ Conference (Barcelona, Spain, June 7-9, 2007), Cambridge Scholars
Publishing.
• Anabela Barreiro & Elisabete Ranchhod. "Machine Translation Challenges for Portuguese". Linguisticæ
Investigationes 28.1 (2005), pp. 3-18. (Machine Translation, Controlled Languages and Specialised Languages).
Amsterdam/Philadelphia: John Benjamins Publishing Company. ISSN: 0378-4169.
• Anabela Barreiro "Novas Ferramentas e Recursos Linguísticos para a Tradução Automática: Por ocasião d´O Fim
do Início de uma Nova Era no Processamento da Língua Portuguesa". In Luís Costa, Diana Santos & Nuno Cardoso
(eds.), Perspectivas sobre a Linguateca / Actas do encontro Linguateca : 10 anos. Linguateca, 2008, pp. 13-23.
• Anabela Barreiro. "Port4NooJ Linguistic Resources Overview". 2008.
http://poloclup.linguateca.pt/Port4NooJ/ResourcesOverview.pdf
• Anabela Barreiro. "Tutorial introdutório do Port4NooJ". November 21, 2008.
• Belinda Maia & Anabela Barreiro "Uma experiência de recolha de exemplos classificados de tradução automática
de inglês para português". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento
computacional da língua portuguesa. Lisboa, Portugal:IST Press, 2007, pp. 205-216.
• Luís Sarmento, Anabela Barreiro, Belinda Maia & Diana Santos "Avaliação de Tradução Automática: alguns
conceitos e reflexões". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento
computacional da língua portuguesa. Lisboa, Portugal:IST Press, 2007, pp. 181-190.
REFERENCES
• Max Silberztein. "NooJ: A Cooperative, Object-Oriented Architecture for NLP". INTEX pour la Linguistique et le
traitement automatique des langues. Presses Universitaires de Franche-Comté. Cahiers de la MSH Ledoux, 2004.
Besançon, France.
• Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young & Ralph
Grishman. "Annotating noun argument structure for NomBank". In Maria Teresa Lino, Maria Francisca Xavier,
Fátima Ferreira, Rute Costa & Raquel Silva (eds.), Proceedings of the 4th International Conference on Language
Resources and Evaluation (LREC'2004) (Lisboa, Portugal, May 26-28, 2004).
• Morris Salkoff. A French-English Grammar: A Contrastive Grammar On Translational Principles. Amsterdam,
Philadelphia: John Benjamins Publishing Company. 1999.
• Maurice Gross. "Les bases empiriques de la notion de prédicat sémantique". Formes Syntaxiques et Prédicat Sémantiques, Langages 63 (1981), pp. 7-52. Paris, France: Larousse.
• Maurice Gross. "Simple sentences. Discussion of Fred W. Householder". Text Processing (1982), pp. 297-315.
Almqvist Wiksell.
• Elisabete Ranchhod (ed.). Sintaxe dos predicados nominais com Estar. Lisboa, Portugal: INIC. 1990.
• Jorge Baptista. Sintaxe dos Nomes Predicativos com verbo-suporte SER DE. Lisboa: Fundação para a Ciência e a
Tecnologia/Fundação Calouste Gulbenkian. 2005.
• Lucília Chacoto. O Verbo Fazer em Construções Nominais Predicativas. PhD dissertation. Universidade do
Algarve, Portugal. 2005.
• Diana Santos. "Lexical gaps and idioms in Machine Translation". In Hans Karlgren (ed.), Proceedings of the 14th International Conference on Computational Linguistics (COLING'90) 2, (Helsinki, 20-25 August 1990), pp. 330-335.
• Zellig Harris. "Transformations in Linguistic Structure". Proceedings of the American Philosophical Society (1964), pp. 418-422. Repr. in 1970a, pp. 472-481.
• Catherine Fuchs. La Paraphrase. Presses Universitaires de France. 1982. ISBN: 9782130371038.
• Bud Scott. "The Logos Model: An Historical Perspective". Machine Translation 18 (2003), pp. 1-72.
• Belinda Maia & Anabela Barreiro "Uma experiência de recolha de exemplos classificados de tradução automática de
inglês para português". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento computacional
da língua portuguesa. Lisboa, Portugal: IST Press, 2007, pp. 205-216.
• Chris Callison-Burch. Paraphrasing and Translation. PhD Thesis. University of Edinburgh. 2007.
• Fillmore, C. J., Johnson, C. R. & Petruck, M. R. L. "Background to Framenet". International Journal of Lexicography 16.3
(2003), pp. 235-250.
• Igor A. Mel’čuk. "Lexical Functions: A Tool for the Description of Lexical Relations in the Lexicon". In L. Wanner (ed.), Lexical Functions in Lexicography and Natural Language Processing. Amsterdam/Philadelphia: Benjamins, 1996, pp. 37-102.
• Regina Barzilay & Lillian Lee. "Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment".
In Proceedings of Human Language Technology conference (HLT- NAACL 2003) 2003, pp. 16-23.
• Jasmina Milićević. Semantic Equivalence Rules in Meaning-Text Paraphrasing. Amsterdam and Philadelphia: Benjamins.
2007. ISBN: 978-90-272-3094-2. In Honour of Igor Mel'čuk
• Jasmina Milićević. La paraphrase. Modélisation de la paraphrase langagière. Bern: Peter Lang. 2007. ISBN: 978-3-03911-197-8.
ACKNOWLEDGMENTS
This research work was funded by Fundação para a Ciência e a Tecnologia (doctoral scholarship SFRH/BD/14076/2003), co-financed by POSI and partly supported by Fundação para a Computação Científica Nacional - Linguateca.
In addition to the doctoral scholarship, I had a ½ month contract from Linguateca to help make Port4NooJ and ReEscreve public.
I currently have a new contract with Linguateca to enlarge and enhance the linguistic resources herein described.
Thank you very much (شكرا جزيلا)

More Related Content

Similar to Paraphrasing biomedical support verb constructions for machine translation

Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...ravi sharma
 
Automatic Speech Recognition of Malayalam Language Nasal Class Phonemes
Automatic Speech Recognition of Malayalam Language Nasal Class PhonemesAutomatic Speech Recognition of Malayalam Language Nasal Class Phonemes
Automatic Speech Recognition of Malayalam Language Nasal Class PhonemesEditor IJCATR
 
Hybrid Phonemic and Graphemic Modeling for Arabic Speech Recognition
Hybrid Phonemic and Graphemic Modeling for Arabic Speech RecognitionHybrid Phonemic and Graphemic Modeling for Arabic Speech Recognition
Hybrid Phonemic and Graphemic Modeling for Arabic Speech RecognitionWaqas Tariq
 
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert Systemcsandit
 
Performance on speech enhancement objective quality measures using hybrid wav...
Performance on speech enhancement objective quality measures using hybrid wav...Performance on speech enhancement objective quality measures using hybrid wav...
Performance on speech enhancement objective quality measures using hybrid wav...karthik annam
 
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORMSTANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORMijnlc
 
A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems IJECEIAES
 
Text to-voice applications-to_promote_speaking_skill[1]
Text to-voice applications-to_promote_speaking_skill[1]Text to-voice applications-to_promote_speaking_skill[1]
Text to-voice applications-to_promote_speaking_skill[1]Amany AlKhayat
 
CALL (computer Assisted Language)
CALL (computer Assisted Language)CALL (computer Assisted Language)
CALL (computer Assisted Language)syeda12345
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifiereSAT Publishing House
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifiereSAT Journals
 
Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Natural language processing with python and amharic syntax parse tree by dani...
Natural language processing with python and amharic syntax parse tree by dani...Daniel Adenew
 
Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemIJERA Editor
 
33 9765 development paper id 0034 (edit a) (1)
33 9765 development paper id 0034 (edit a) (1)33 9765 development paper id 0034 (edit a) (1)
33 9765 development paper id 0034 (edit a) (1)IAESIJEECS
 

Similar to Paraphrasing biomedical support verb constructions for machine translation (20)

Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
 
 

 


Paraphrasing biomedical support verb constructions for machine translation

  • 1. PARAPHRASING BIOMEDICAL SUPPORT VERB CONSTRUCTIONS FOR MACHINE TRANSLATION
    Anabela Barreiro, CLUP - Centro de Linguística, Universidade do Porto
    2009 NooJ conference and workshop, Touzeur - Tunisia, 8-10 June 2009
  • 2. MAIN GOALS
    • Build a body of lexical, syntactic and semantic knowledge around SVC
    • Apply this linguistic knowledge to paraphrasing: paraphrase and rewrite SVC, such as fazer uma operação (to make an operation/surgery), with corresponding correct equivalences. The equivalences can be:
      • strong lexical verbs, such as operar (to operate on)
      • lexical-syntactic and stylistic variants of the original SVC, such as realizar uma operação (to perform an operation/surgery) or submeter-se a uma operação/cirurgia (to undergo/have an operation/surgery, or to be operated on), depending on the lexical-semantic filling of the arguments and on their argument structure
    • Show how these paraphrases increase source text quality
    • Show how MT results can improve significantly when applying paraphrasing capabilities
  • 3. GENERAL MOTIVATION
    1. MT changed the world of translation: it can no longer be ignored or underestimated.
    2. It is increasingly used and useful: it often replaces human translation when the client requires gisting.
    3. However, good-quality MT is still an ambitious goal: after more than 50 years, MT still eludes researchers and developers, who are constantly challenged to provide better translations.
    How can MT move forward?
  • 4. CHALLENGES TO MACHINE TRANSLATION
    1. homography and common-noun nuance
    2. anaphora + distant referential associations
    3. ellipsis
    4. extra-sentential and extra-textual information + extra-linguistic knowledge
    5. lexical divergences, idioms, etc.
    6. named entities
    7. long sentences and wordiness
    8. unusual word order
    9. multiword expressions (including support verb constructions)
  • 5. SUPPORT VERB CONSTRUCTION
    = a multiword or complex predicate consisting of a semantically weak verb (the support verb) and a predicate noun, a predicate adjective, or a predicate adverb.
    Predicate nouns can be:
    • morphologically related to a verb
      PT: fazer uma apresentação de N = apresentar N
      EN: pay a visit to N = to visit N
    • autonomous
      PT: fazer um mestrado - *mestrar
      EN: have fun - *to fun
  • 6. WHY SUPPORT VERB CONSTRUCTIONS?
    • Abundant in language and indispensable to communicate
      ASSESSMENT TEST: search pattern in corpora (COMPARA):
      (<dar.V+inf> + <tomar.V+inf> + <pôr.V+inf> + <fazer.V+inf> + <ter.V+inf>) <Modif>? <N>
      The first 100 sentences for each verb were selected, and the resulting 500 sentences were manually annotated as SVC or NO SVC.
      RESULTS: these verbs occur very frequently in SVCs. Overall, in 64.2% of their occurrences, these verbs are support verbs.

      PT        dar    tomar   pôr    fazer     ter
      transl.   give   take    put    make/do   have
      NO SVC    11%    12%     23%    53%       80%
      SVC       89%    88%     77%    47%       20%

    • Extensively and systematically studied within the Lexicon-Grammar Theory [M. Gross and followers], also in Portuguese [Ranchhod, 1990] [Baptista, 2005] [Chacoto, 2005] and in contrastive studies [Salkoff, 1990, 1999]
    • Can often be replaced by stylistic variants (paraphrases)
    • Present several degrees of variability (variable, invariable, idiomatic)
    • Semantically weak and ambiguous
    • Cannot be translated literally
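The overall figure on this slide can be checked from the SVC row of the table. The short sketch below assumes, as the slide states, that each verb contributes the same number of sentences (100), so the pooled rate is simply the mean of the five percentages; the variable names are illustrative only.

```python
# Share of SVC uses per verb (in percent), taken from the SVC row of the table.
svc_share = {"dar": 89, "tomar": 88, "pôr": 77, "fazer": 47, "ter": 20}

# Every verb contributes 100 sentences, so the pooled rate over the
# 500 sentences equals the plain mean of the per-verb percentages.
overall = sum(svc_share.values()) / len(svc_share)

print(f"Overall SVC rate: {overall:.1f}%")
```

The result agrees with the 64.2% reported on the slide.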
  • 7. MT OF SUPPORT VERB CONSTRUCTIONS
    MT SYSTEMS MISHANDLING SUPPORT VERB CONSTRUCTIONS
    AH… THAT’S BETTER!
    WE PROMPTLY VERIFY THE SUPERIOR QUALITY OF THE RESULTS IN THE SECOND CASE!
  • 8. IMPROPER TREATMENT OF BIOMEDICAL SUPPORT VERB CONSTRUCTIONS IN MT
    PT: Por falta de condições técnicas, ele foi removido para o Hospital das Clínicas, onde se fez uma amputação a nível de ombro.
      FreeTranslation: For lack of technical conditions, he was removed for the Hospital of the Clinics, where was done an amputation in terms of shoulder.
      WorldLingo: Due to conditions techniques, it it was removed for the Hospital of the Clinics, where if the shoulder level made an amputation.
    PT: Por falta de condições técnicas, ele foi transportado para o Hospital das Clínicas, onde os médicos amputaram o seu braço ao nível do ombro.
      FreeTranslation: Due to conditions techniques, it it was carried to the Hospital of the Clinics, where the doctors had amputated its arm to the level of the shoulder.
      WorldLingo: For lack of technical conditions, he was transported for the Hospital of the Clinics, where the doctors amputated his arm level with the shoulder.
    PT: Por falta de condições técnicas, ele foi transportado para o Hospital das Clínicas, onde o seu braço foi amputado ao nível do ombro.
      FreeTranslation: for lack of technical conditions, he was transported for the Hospital of the Clinics, where arm was amputated level with the shoulder.
      WorldLingo: Due to conditions techniques, it it was carried to the Hospital of the Clinics, where its arm was amputated to the level of the shoulder.
  • 9. STATEMENT
    Linguistic knowledge of paraphrases can improve the quality of machine translation.
    • The conversion of support verb constructions into morphosyntactically and/or semantically related verbs or verbal expressions, together with the adequate specification and inclusion of predicate-argument structure information, produces a controlled language that is applicable to both general and domain-specific contexts and makes MT more reliable. It:
      1. Reduces complexity (= paraphrasing by simplification)
      2. Reduces lexical ambiguity (the paraphrase conveys the correct interpretation)
      3. Facilitates interpretation when the support verb construction is idiomatic
      4. Reduces wordiness (sometimes, improves source text quality)
      5. Improves translation quality
  • 10. THEORETICAL BACKGROUND - LINGUISTICS
    • Meaning-Text Theory [Mel’čuk, 1988, 1996, 2003] - support verb constructions are described and formalized in terms of lexical functions (LF). LF can be used as an interlingua to facilitate lexical transfer in MT projects. [Milićević, 2007] establishes a typology of paraphrases based on lexical-syntactic rules.
    • FrameNet [Fillmore et al., 2002, 2003] - records the information necessary for the representation of argument mapping relations between a support verb and a nominalization.
    • NomBank [Meyers et al., 2004b, 2004b] - maps syntactic positions in nominalizations to verbal arguments and identifies the allowed complements for a nominalization, relating the nominal complements to the arguments of the corresponding verb, including information about support verbs.
    • Lexicon-Grammar Theory [M. Gross and followers] - theoretical framework adopted in this research
  • 11. RELATED WORK ON PARAPHRASING
    • question answering - discovering paraphrased answers provides additional evidence that an answer is correct - [Ibrahim et al., 2003], [Paşca, 2003], [Duboué & Chu-Carroll, 2006]
    • information extraction and text mining - paraphrases help text categorization tasks or mapping to texts with similar characteristics - [Ibrahim et al., 2003], [Shinyama et al., 2002], [Shinyama & Sekine, 2003], [Sekine, 2005], [Paşca, 2005], [Paşca & Dienes, 2005]
    • summarization - the identification of paraphrases allows information across documents to be condensed and helps improve the quality of the generated summaries - [McKeown et al., 2002], [Barzilay, 2001, 2003], [Hirao et al., 2004], [Zhou et al., 2006b]
    • natural language generation - the generation of paraphrases allows the production of more varied and fluent text - [Iordanskaja et al., 1991]
    • machine translation - paraphrases help create a more fluent translation and are valuable in the evaluation of MT results - [Zhou et al., 2006], [Callison-Burch et al., 2006a, 2006b, 2007 and 2008]
  • 12. RELATED WORK ON CONTROLLED WRITING AND STYLISTIC EDITORS/AUTHORING AIDS
    • Controlled language - application of human language techniques in an industrial environment. It is used to support technical writers in producing high-quality technical documentation by checking spelling, grammar, style and terminology in technical documents (text optimisation). Automated rewriting for controlled language translation
    • Stylistic editing - focuses on clarity and expression; re-ordering of paragraphs, sentences and words; elimination of verbiage and jargon; simplification and untangling of complicated clauses; elimination of inconsistencies; use of stylistic paraphrases (reductions/extensions)
    • Authoring aids - cover mostly spell checking, simple grammar checking (correction of text), writing and morphological variants and synonyms
    LREC 2008
    Some available tools:
      MULTIDOC [Haller, 2000]
      KANT CE Checker [Mitamura and Nyberg, 2001] [Mitamura et al., 2003] [Rascu, 2006]
      CLAT [Schmidt-Wigger, 1998]; [Carl et al., 2002]; [Hernandez and Rascu, 2004]
      CLOUT - the Controlled Language Optimized for Machine Translation
  • 13. METHODOLOGY
    Task 1: Analysis, extraction from corpora and formalization of support verb constructions, using information described in dictionaries and grammars. The NooJ linguistic environment was the tool used.
    Task 2: Paraphrasing of predicate noun constructions
    Phase 1: Development of resources and pre-processing
      • Construction of simple grammars to recognize and extract SVCs from texts
      • Definition of properties in the dictionary and establishment of semantic links between verbs and SVCs + establishment of classes of paraphrases
      • Application of grammars with dictionary information to texts
      • Refinement of grammars
      • Finer recognition of SVCs in texts -> paraphrasing + translation
    Phase 2: Evaluation experiments
  • 14. FORMALIZATION IN THE DICTIONARIES
    Explicit marking of derivation and semantic verb associations
    Adjective entries:
      • Identification of derivational paradigms for predicate adverbs - adverbializations (annotation AVDRV)
      literal,A+FLX=PRINCIPAL+IN+symb+EN=literal+DRV=AVDRV00:LITERALMENTE
    Autonomous predicate nouns:
      • Identification of autonomous predicate nouns (annotation Npred)
      • Identification of a semantically related verb (annotation VRB)
      • Link to the support verb (annotation VSUP)
      curso,N+FLX=ANO+Npred+IN+inst+EN=course+VSUP=tirar+VRB=estudar+NPrep=de+Det=um
  • 15. FORMALIZATION IN THE DICTIONARIES
    Explicit marking of derivation and support verb associations
    Verb entries:
      • Identification of derivational paradigms for predicate nouns - nominalizations (annotation NDRV) and predicate adjectives (annotation ADRV)
      • Link to the support verbs (annotation VSUP) occurring with the derived predicate nouns and adjectives
      adaptar,V+FLX=FALAR+Aux=1+INOP57+Subset=132+EN=adapt+VSUP=fazer+DRV=NDRV00:CANÇÃO
      azedar,V+FLX=LIMPAR+Aux=1+OBJTRundif98+Subset=740+EN=sour+VSUP=estar+DRV=ADRV00:ALTO
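To make the entry format on these slides easier to read, here is a minimal sketch of how such a line could be split into a lemma, a category and a set of features. It is not NooJ's own parser: the function name `parse_entry` and the returned structure are hypothetical, and the sketch assumes a simplified format (lemma, comma, category, then `+`-separated codes or `KEY=VALUE` pairs) with no escaped characters.

```python
def parse_entry(line):
    """Split a dictionary line of the form lemma,CAT+F1+KEY=VALUE+... (simplified)."""
    lemma, _, rest = line.partition(",")
    category, *fields = rest.split("+")
    features = {}
    for field in fields:
        key, sep, value = field.partition("=")
        # Bare codes (e.g. INOP57) are stored as True flags; KEY=VALUE pairs
        # are collected in lists because a key such as VSTYLE may repeat.
        features.setdefault(key, []).append(value if sep else True)
    return {"lemma": lemma, "category": category, "features": features}


entry = parse_entry(
    "adaptar,V+FLX=FALAR+Aux=1+INOP57+Subset=132+EN=adapt"
    "+VSUP=fazer+DRV=NDRV00:CANÇÃO"
)
print(entry["lemma"], entry["category"], entry["features"]["VSUP"])
```

Read this way, the entry says that the verb adaptar combines with the support verb fazer (VSUP) and derives its predicate noun through the paradigm named in DRV.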
  • 16. NEW RESOURCES
    • Port4NooJ
      • a set of open source Portuguese linguistic resources representing ontological relations and paraphrasal relations. These resources integrate a bilingual extension for PT-EN MT.
      • Publicly available resources at: http://www.nooj4nlp.net http://www.linguateca.pt/Repositorio/Port4Nooj/
    • DicTUM (Dicionário de Termos e Unidades Multipalavra)
      • a Dictionary of Multiword Expressions

    Semantic relations
      Verb - Predicate Nouns            8472
      Predicate Adjectives - Adverbs     222
      Total Links                       8694
  • 17. NEW TOOLS
    • ReWriter
      • a monolingual standalone paraphraser to pre-edit texts, using paraphrasing capabilities
      • Portuguese version ReEscreve - publicly available service at: http://www.linguateca.pt/ReEscreve/
    • ParaMT
      • a bilingual/multilingual paraphraser to be integrated in machine translation systems
  • 18. USEFULNESS OF THE AFOREMENTIONED PARAPHRASING TOOLS
    • Controlled writing & text pre-editing
    • Machine translation
    • Text production and stylistics (authoring aids)
  • 19. REWRITER - APPLICATION INTERFACE
    TEXT PRODUCTION & STYLISTICS
  • 20. PARAMT: A BILINGUAL/MULTILINGUAL PARAPHRASER FOR MT
    MACHINE TRANSLATION
    Recognition of Portuguese SVC and translation into English verbs
  • 21. EVALUATION EXPERIMENT 1
    Evaluation of recognition and paraphrasing of SVC with ReWriter
    • 5 support verbs
    • 500 baseline sentences, manually annotated (100 for each elementary support verb)

    Verb      SVC Recognition Precision   SVC Recognition Recall   SVC Paraphrasing Precision
    Pôr       73/73 - 100%                73/100 - 73%             72/73 - 98.6%
    Tomar     75/75 - 100%                75/100 - 75%             68/73 - 93.1%
    Ter       65/65 - 100%                65/100 - 65%             59/65 - 90.7%
    Dar       57/60 - 95%                 57/100 - 57%             46/51 - 90.1%
    Fazer     43/45 - 95.5%               43/100 - 43%             40/45 - 88.8%
    Overall   98.4%                       62.6%                    93.4%
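The overall recognition figures in this table appear to be pooled over the five verbs rather than averaged per verb. The sketch below recomputes them under that assumption, using only the counts printed on the slide; the variable names are illustrative, and the paraphrasing column is left out because its pooling is not stated.

```python
# (correctly recognized SVCs, all recognized SVCs) per support verb,
# copied from the recognition-precision column of the table.
recognition = {
    "pôr": (73, 73),
    "tomar": (75, 75),
    "ter": (65, 65),
    "dar": (57, 60),
    "fazer": (43, 45),
}
BASELINE_PER_VERB = 100  # recall denominator used on the slide

correct = sum(hit for hit, _ in recognition.values())
retrieved = sum(total for _, total in recognition.values())

# Pooled (micro-averaged) precision and recall over all five verbs.
precision = correct / retrieved
recall = correct / (BASELINE_PER_VERB * len(recognition))

print(f"precision {precision:.1%}, recall {recall:.1%}")
```

Pooling the counts gives 313/318 for precision and 313/500 for recall, which match the overall row; a plain mean of the per-verb precision percentages would come out slightly lower.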
  • 22. EVALUATION EXPERIMENT 2
    Impact of SVC paraphrases in translation
    • 5 different support verbs
    • 100 sentences with SVC (for each language, PT and EN): 20 sentences for each support verb, each manually assigned a paraphrase > 20 SVC sentences + 20 paraphrases of the original SVC sentences
    • TOTAL: 200 sentences for each language (100 with SVC + 100 with paraphrases of the SVC)
    • 10 testers translated and evaluated translations (students of the MA in Translation and Linguistic Services)

               Equal quality result   SVC translation is better   Paraphrase translation is better
    EN-PT MT   17%                    26%                         57%
    PT-EN MT   19%                    30%                         51%

    For both EN-PT and PT-EN, testers considered that the paraphrase translations presented a better quality. The evaluation results clearly confirm that these paraphrases helped the systems produce better MT.
  • 23. ENHANCED DICTIONARY ENTRIES WITH SYNTACTIC-SEMANTIC INFORMATION
    adaptar,V+FLX=FALAR+Aux=1+INOP57+Subset132+EN=adapt+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de
    favor,N+FLX=MAR+Npred+AB+state+EN=favor+VSUP=fazer+NPrep=a+VRB=ajudar
    amputar,V+FLX=FALAR+Aux=1+OBJTRundif21+BioMed+EN=amputate+SUJ=AG+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de+OD=BP+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
    amputar,V+FLX=FALAR+Aux=1+OBJTRundif21+BioMed+EN=amputate+VSUP=fazer+DRV=NDRV00:CANÇÃO+NPrep=de+OD=BP+OI=PAT+VSTYLE=sofrer+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
    rápido,A+FLX=RÁPIDO+PV+eagerType+EN=quick+DRV=AVDRV06:RAPIDAMENTE
    adoçar,V+FLX=COMEÇAR+Aux=1+OBJTRundif75+Subset604+EN=sweeten+DRV=ADRV11:VERDE+VCOP=tornar
    transplantar,V+FLX=FALAR+Aux=1+RECTR26+Subset=504+BioMed+EN=transplant+SUBJ=AG+VSUP=fazer+DRV=NDRV79:ANO+NPrep=de+DO=BP+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
    transplantar,V+FLX=FALAR+Aux=1+RECTR26+Subset=504+BioMed+EN=transplant+SUBJ=AG+VSUP=fazer+DRV=NDRV79:ANO+NPrep=de+DO=BP+IO=PAT+VSTYLE=sofrer+VSTYLE=realizar+VSTYLE=efectuar+VASP=iniciar+VASP=prosseguir+VASP=concluir
    médico,N+FLX=ANO+AN+des+med+AG+EN=doctor
    médico,N+FLX=ANO+AN+des+med+AG+EN=physician
    doente,N+FLX=ANO+AN+des+med+PAT+EN=patient
  • 24. GRAMMAR TO RECOGNIZE AND PARAPHRASE BIOMEDICAL SVC
  • 25. PARAPHRASING FOR CONTROLLED WRITING AND TEXT STYLISTICS
    ARG0 = AGENT (AG)
      Elementary SVC > Lexical Verb - fazer uma amputação = amputar (to amputate)
      Elementary SVC > non-elementary SVC - realizar/efectuar uma amputação (to perform an amputation)
    ARG0 = PATIENT (PAT)
      Submeter-se/ser submetido a uma operação (to undergo surgery)
      Ser operado (to be operated on)
      Elementary SVC > non-elementary SVC - realizar/efectuar uma operação (to perform an operation)
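The role-dependent choice of paraphrase on this slide can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the `LEXICON` table and the `paraphrases` function are hypothetical, the article "uma" is hard-coded because both example nouns are feminine, and the real system uses NooJ grammars and dictionaries rather than Python code.

```python
# Hypothetical mini-lexicon built from the examples on the slide.
LEXICON = {
    "amputação": {"verb": "amputar", "participle": "amputado",
                  "vstyle": ["realizar", "efectuar"]},
    "operação": {"verb": "operar", "participle": "operado",
                 "vstyle": ["realizar", "efectuar"]},
}


def paraphrases(noun, arg0_role):
    """Return candidate paraphrases of 'fazer uma <noun>' for a given ARG0 role."""
    entry = LEXICON[noun]
    if arg0_role == "AG":
        # Agent subject: strong lexical verb, then stylistic support verbs.
        return [entry["verb"]] + [f"{v} uma {noun}" for v in entry["vstyle"]]
    if arg0_role == "PAT":
        # Patient subject: 'undergo' constructions and the passive.
        return [f"submeter-se a uma {noun}",
                f"ser submetido a uma {noun}",
                f"ser {entry['participle']}"]
    raise ValueError(f"unknown role: {arg0_role}")


print(paraphrases("amputação", "AG"))
print(paraphrases("operação", "PAT"))
```

The point of the design is that the same predicate noun yields different paraphrase sets depending on whether the subject is marked as agent (AG) or patient (PAT), which is why the dictionaries annotate nouns such as médico and doente with these roles.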
  • 26. RECOGNITION AND MONOLINGUAL PARAPHRASING OF BIOMEDICAL-RELATED SVC
  • 27. MAIN CONCLUSION
    • Linguistic knowledge on paraphrases formalized in this research, when applied to an MT system, improves its output quality.
    When SVC were identified and replaced with semantically equivalent or similar verbal expressions as a pre-processing step to translation:
      • a 21% improvement was observed in the evaluated quality of the results of PT-EN machine translation
      • a 31% improvement was observed in the results of EN-PT machine translation
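The 21% and 31% figures coincide with the gap, in Evaluation Experiment 2, between the share of cases where the paraphrase translation was judged better and the share where the SVC translation was judged better. The sketch below assumes that this difference (in percentage points) is how the improvement was computed; the slides do not state the formula explicitly.

```python
# Tester judgments from Evaluation Experiment 2, in percent of sentence pairs.
judgments = {
    "PT-EN": {"equal": 19, "svc_better": 30, "paraphrase_better": 51},
    "EN-PT": {"equal": 17, "svc_better": 26, "paraphrase_better": 57},
}

# Assumed definition: net gain = paraphrase-better share minus SVC-better share.
net_gain = {
    direction: row["paraphrase_better"] - row["svc_better"]
    for direction, row in judgments.items()
}

for direction, gain in net_gain.items():
    print(f"{direction}: {gain} percentage points")
```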
  • 28. FUTURE RESEARCH SUGGESTED BY THIS THESIS
     Enlargement of syntactic-semantic relations between predicates and other morphosyntactically and/or semantically related elements
      Exs: operar, operação, operado; urgente, urgentemente
     Assignment of thematic roles and semantic functions to nominal entities
      Exs: médico+AG+EN=doctor; doente+PAT+EN=patient
     Paraphrasing of various linguistic phenomena for ReWriter extensibility (comprising stylistic variance and controlled language)
     Development and enhancement of ParaMT
     Enhancement of the machine translation model
  • 29. PARAPHRASING FOR REWRITER EXTENSIBILITY
    Linguistic Phenomenon | Expression or sentence | Paraphrase
    Adverbial | à volta da órbita (around the orbit of the eye) | periorbital (periorbital)
    Adverbial | de forma interactiva (in an interactive way) | interactivamente (interactively)
    Relative Clause | N0 que têm sido escritos (N0 that have been written) | N0 que foram escritos > N0 escritos (N0 that were written > N0 written)
    Relative Clause | A velocidade a que se move a luz (Ŧ *The speed to which light moves) | A velocidade da luz (The speed of light)
    Relative Clause | O papel que a Europa tem (The role that Europe plays/has) | O papel da Europa (The role of Europe ≡ Europe's role)
    Relative Clause | As dificuldades que temos (The difficulties we have) | As nossas dificuldades (Our difficulties)
    If clause | se for necessário (if it is necessary) | se necessário (if necessary)
    Named Entity | A rainha de Inglaterra (The queen of England) | A rainha inglesa (The British queen)
    Noun Phrase | O heróico povo português (The heroic Portuguese people) | Os heróicos portugueses (The heroic Portuguese)
    A formal linguistic study of paraphrases such as these would represent a significant contribution to NLP in general, and to MT in particular.
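A few of the fixed-expression pairs above can be applied as plain rewrite rules. The sketch below is only an illustration of that idea in Python: the `RULES` list copies three pairs from the slide, the `rewrite` function is a hypothetical helper, and the two input sentences are invented for the example. It does not cover the structural cases (relative clauses, noun phrases), which need syntactic analysis of the kind a NooJ grammar provides.

```python
import re

# Expression -> paraphrase pairs taken from the slide.
RULES = [
    ("se for necessário", "se necessário"),
    ("de forma interactiva", "interactivamente"),
    ("à volta da órbita", "periorbital"),
]

# Match each expression only at word boundaries so that longer words are not touched.
PATTERNS = [(re.compile(r"\b" + re.escape(src) + r"\b"), tgt) for src, tgt in RULES]


def rewrite(text):
    """Apply every rule in order and return the rewritten text."""
    for pattern, target in PATTERNS:
        text = pattern.sub(target, text)
    return text


print(rewrite("Pode repetir o exame, se for necessário."))
print(rewrite("O sistema responde de forma interactiva."))
```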
  • 30. RELEVANT PUBLICATIONS
    • Anabela Barreiro. "ParaMT: a Paraphraser for Machine Translation". In António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira & Paulo Quaresma (eds.), Computational Processing of the Portuguese Language, 8th International Conference, Proceedings (PROPOR 2008), Vol. 5190 (Aveiro, Portugal, 8-10 September 2008), Springer Verlag, pp. 202-211.
    • Anabela Barreiro. "Port4NooJ: Portuguese Linguistic Module and Bilingual Resources for Machine Translation". In Proceedings of the 2007 International NooJ Conference (Barcelona, Spain, June 7-9, 2007), Cambridge Scholars Publishing.
    • Anabela Barreiro & Elisabete Ranchhod. "Machine Translation Challenges for Portuguese". Linguisticæ Investigationes 28.1 (2005), pp. 3-18. (Machine Translation, Controlled Languages and Specialised Languages). Amsterdam/Philadelphia: John Benjamins Publishing Company. ISSN: 0378-4169.
    • Anabela Barreiro. "Novas Ferramentas e Recursos Linguísticos para a Tradução Automática: Por ocasião d´O Fim do Início de uma Nova Era no Processamento da Língua Portuguesa". In Luís Costa, Diana Santos & Nuno Cardoso (eds.), Perspectivas sobre a Linguateca / Actas do encontro Linguateca : 10 anos. Linguateca, 2008, pp. 13-23.
    • Anabela Barreiro. "Port4NooJ Linguistic Resources Overview". 2008. http://poloclup.linguateca.pt/Port4NooJ/ResourcesOverview.pdf
    • Anabela Barreiro. "Tutorial introdutório do Port4NooJ". 21 November 2008.
    • Belinda Maia & Anabela Barreiro. "Uma experiência de recolha de exemplos classificados de tradução automática de inglês para português". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa. Lisboa, Portugal: IST Press, 2007, pp. 205-216.
    • Luís Sarmento, Anabela Barreiro, Belinda Maia & Diana Santos. "Avaliação de Tradução Automática: alguns conceitos e reflexões". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa. Lisboa, Portugal: IST Press, 2007, pp. 181-190.
  • 31. REFERENCES
    • Max Silberztein. "NooJ: A Cooperative, Object-Oriented Architecture for NLP". INTEX pour la Linguistique et le traitement automatique des langues. Presses Universitaires de Franche-Comté. Cahiers de la MSH Ledoux, 2004. Besançon, France.
    • Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young & Ralph Grishman. "Annotating noun argument structure for NomBank". In Maria Teresa Lino, Maria Francisca Xavier, Fátima Ferreira, Rute Costa & Raquel Silva (eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'2004) (Lisboa, Portugal, May 26-28, 2004).
    • Morris Salkoff. A French-English Grammar: A Contrastive Grammar On Translational Principles. Amsterdam/Philadelphia: John Benjamins Publishing Company. 1999.
    • Maurice Gross. "Les bases empiriques de la notion de prédicat sémantique". Formes Syntaxiques et Prédicats Sémantiques, Langages 63 (1981), pp. 7-52. Paris, France: Larousse.
    • Maurice Gross. "Simple sentences. Discussion of Fred W. Householder". Text Processing (1982), pp. 297-315. Almqvist & Wiksell.
    • Elisabete Ranchhod (ed.). Sintaxe dos predicados nominais com Estar. Lisboa, Portugal: INIC. 1990.
    • Jorge Baptista. Sintaxe dos Nomes Predicativos com verbo-suporte SER DE. Lisboa: Fundação para a Ciência e a Tecnologia/Fundação Calouste Gulbenkian. 2005.
    • Lucília Chacoto. O Verbo Fazer em Construções Nominais Predicativas. PhD dissertation. Universidade do Algarve, Portugal. 2005.
    • Diana Santos. "Lexical gaps and idioms in Machine Translation". In Hans Karlgren (ed.), Proceedings of the 14th International Conference on Computational Linguistics (COLING'90) 2 (Helsinki, 20-25 August 1990), pp. 330-335.
  • 32. REFERENCES
    • Zellig Harris. "Transformations in Linguistic Structure". Proceedings of the American Philosophical Society (1964), pp. 418-422. Repr. in 1970a, pp. 472-481.
    • Catherine Fuchs. La Paraphrase. Presses Universitaires de France. 1982. ISBN: 9782130371038.
    • Bud Scott. "The Logos Model: An Historical Perspective". Machine Translation 18 (2003), pp. 1-72.
    • Belinda Maia & Anabela Barreiro. "Uma experiência de recolha de exemplos classificados de tradução automática de inglês para português". In Diana Santos (ed.), Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa. Lisboa, Portugal: IST Press, 2007, pp. 205-216.
    • Chris Callison-Burch. Paraphrasing and Translation. PhD Thesis. University of Edinburgh. 2007.
    • Charles J. Fillmore, Christopher R. Johnson & Miriam R. L. Petruck. "Background to Framenet". International Journal of Lexicography 16.3 (2003), pp. 235-250.
    • Igor A. Mel'čuk. "Lexical Functions: A Tool for the Description of Lexical Relations in the Lexicon". In L. Wanner (ed.), Lexical Functions in Lexicography and Natural Language Processing. Amsterdam/Philadelphia: Benjamins, 1996, pp. 37-102.
    • Regina Barzilay & Lillian Lee. "Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment". In Proceedings of Human Language Technology conference (HLT-NAACL 2003), 2003, pp. 16-23.
    • Jasmina Milićević. Semantic Equivalence Rules in Meaning-Text Paraphrasing. Amsterdam/Philadelphia: Benjamins. 2007. ISBN: 978-90-272-3094-2. In Honour of Igor Mel'čuk.
    • Jasmina Milićević. La paraphrase. Modélisation de la paraphrase langagière. Bern: Peter Lang. 2007. ISBN: 978-3-03911-197-8.
  • 33. ACKNOWLEDGMENTS
    This research work was funded by Fundação para a Ciência e a Tecnologia (doctoral scholarship SFRH/BD/14076/2003), co-financed by POSI and partly supported by Fundação para a Computação Científica Nacional - Linguateca. In addition to the doctoral scholarship, I had a ½ month contract from Linguateca to help make Port4NooJ and ReEscreve public. I currently have a new contract with Linguateca to enlarge and enhance the linguistic resources herein described.
    شكرا جزيلا (Thank you very much)