SPIDER: A SYSTEM FOR PARAPHRASING
       IN DOCUMENT EDITING AND REVISION
          APPLICABILITY IN MACHINE TRANSLATION PRE-EDITING




                            Anabela Barreiro



                              ab@metatrad.com




CICLing 2011                                        February 20-26, 2011
Anabela Barreiro                                    Tokyo, Japan
OUTLINE
          INTRODUCTION
                  PARAPHRASES IN NLP
                  PARAPHRASES IN PEDAGOGICAL AND PROFESSIONAL CONTEXTS

          SPIDER
                  FIRST STEPS
                  IMPORTANT FEATURES
                  PARAPHRASES COVERED BY SPIDER
                  INTERFACE
                  LINGUISTIC RESOURCES
                  EVALUATION RESULTS

          THE FUTURE
                  FUTURE APPLICATIONS?
                  FUTURE RESEARCH


IMPORTANCE OF PARAPHRASES IN NLP TASKS
         Question Answering
          [Ibrahim et al., 2003], [Paşca, 2003], [Duboué & Chu-Carroll, 2006]
         Information Extraction and Text Mining
          [Ibrahim et al., 2003], [Shinyama et al., 2002], [Shinyama & Sekine, 2003],
          [Sekine, 2005], [Paşca, 2005], [Paşca & Dienes, 2005]
         Summarization
          [McKeown et al., 2002], [Barzilay, 2001, 2003], [Hirao et al., 2004],
          [Zhou et al., 2006b]
         Natural Language Generation
          [Iordanskaja et al., 1991]
         Plagiarism Detection
          [Potthast et al., 2010], [Vila et al., 2010]
         Machine Translation
          [Zhou et al., 2006], [Callison-Burch et al., 2006a, 2006b, 2007, 2008],
          [Barreiro, 2008, 2009, 2011]



THE PRACTICAL NEED FOR PARAPHRASES
                    IN PEDAGOGICAL CONTEXTS

          Text Processing and Authoring Aids
           Writing and revision of original/creative/customized texts
          Learning Tools
           Native and second language learning
           Creation of clear and understandable text content
           e.g. students learning language and writing skills
          Style Editors
           Uniformity/consistency of style




THE PRACTICAL NEED FOR PARAPHRASES
                    IN PROFESSIONAL CONTEXTS
          Technical Writing
           Professional, high-quality documentation and domain-specific texts
           Controlled language
          Linguistic Quality Assurance
           Linguistic quality of generic texts and specialized documentation
           Verification/validation of meaningful content
          Text Optimization
           Readable / publishable texts (business-oriented or purpose-oriented content)
          Terminology
           Search for the “exact” term or relevant keywords
          Translation
           Indispensable for human and machine translation (pre-editing and post-editing)


SPIDER PARAPHRASING SYSTEM
                                      FIRST STEPS

            Initially developed for Portuguese
            1st version – ReEscreve
            publicly available service at http://www.linguateca.pt/ReEscreve/

            2nd version – eSPERTo (Portuguese for "the smart/clever one; expert")
            currently being integrated into a cyber school project within the scope of an
            educational program

            Writing exercises – students learning how to improve their writing skills in
            the Portuguese language

            English SPIDER
            prototype to assist writing of domain-specific texts



SPIDER
                             IMPORTANT FEATURES
       • Applies linguistic knowledge to recognize and generate paraphrases
         automatically, including transformations of multi-word units; the
         suggestions preserve the semantics and grammaticality (inflectional
         features) of the source text
       • Uses text-editing mechanisms that offer several alternatives for each
         expression and let the user choose among them (according to personal
         preference, style, idiomaticity, etc.)
       • Allows users to suggest new expressions that can be applied immediately
         to their text, making the editing process easier, more flexible, and
         upgradable
       • Designed to help with writing optimization, understandability, and
         translatability (improving the quality of the source text so that it
         has a positive impact on translation)


PARAPHRASES COVERED BY SPIDER
       Synonyms in context (e.g. phrasal verbs into equivalent expressions)
             to clear up (weather) = (weather) to become better/brighter
       Support verb constructions into single verbs and stylistic variants
             to make a decision = to decide; to make an audit = to perform an audit
       Aspectual constructions into single verbs
             to launch an attack = to attack
       Adverbials (compounds into single adverbs)
             in a constructive way = constructively
       Relatives into participial adjectives
             the president that was elected = the president elect
       Relatives into possessives
             the role that Europe plays/has = the role of Europe
       Relatives into compound nouns (and vice-versa)
             a container for the milk = a milk container; a bottle made of plastic = a plastic bottle
       Agentive passives into actives
             the man was released by the police officer = the police officer released the man
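
The paraphrase pairs listed above can be pictured as a lookup from an expression to its alternatives. The following is only a minimal sketch under stated assumptions: the flat table `PARAPHRASES` and the helpers `suggest` and `rewrite` are hypothetical names introduced for illustration, and the matching is plain string matching. SPIDER itself relies on dictionaries and grammars, so it can handle inflected forms (e.g. "made a decision") that this sketch cannot.

```python
import re

# Hypothetical rule table built from the examples on the slide. This is an
# illustration only; it is not how SPIDER stores its linguistic knowledge.
PARAPHRASES = {
    "make a decision": ["decide"],
    "make an audit": ["perform an audit"],
    "launch an attack": ["attack"],
    "in a constructive way": ["constructively"],
}


def _pattern(expression):
    """Whole-phrase, case-insensitive pattern for one table entry."""
    return re.compile(r"\b" + re.escape(expression) + r"\b", re.IGNORECASE)


def suggest(text):
    """Return (expression, alternatives) for every table entry found in text."""
    return [
        (expression, alternatives)
        for expression, alternatives in PARAPHRASES.items()
        if _pattern(expression).search(text)
    ]


def rewrite(text, expression, choice):
    """Replace every occurrence of expression with the alternative chosen by the user."""
    return _pattern(expression).sub(lambda _match: choice, text)


if __name__ == "__main__":
    sentence = "They must make a decision in a constructive way."
    for expression, alternatives in suggest(sentence):
        sentence = rewrite(sentence, expression, alternatives[0])
    print(sentence)
```

The choice of alternative is left to the caller, mirroring the idea that the system proposes and the user decides.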


INTERFACE
                       SUGGESTIONS FOR EXAMPLE SENTENCES
 Suggestions for general-language linguistic phenomena
 [Screenshot callouts: compound adverbs > single adverbs; relatives > participial
 adjectives; support verb constructions > single verbs]




INTERFACE
       SELECTION OF PARAPHRASING GRAMMARS FOR SPECIFIC LINGUISTIC PHENOMENA
    Users can select general and technical dictionaries (more than one may be
    selected) and grammars for specific linguistic transformations (one, several,
    or all may be selected). The interface provides sample texts for testing.
    [Screenshot callouts: informative details about the linguistic resources
    selected; sample LEGAL text]




INTERFACE
                          SELECTION OF A DOMAIN DICTIONARY
 [Screenshot callouts: identification of legal terms in the text; suggestions for
 the term “breach of law”]
 Users can select one term from the list of suggestions or provide a new suggestion

INTERFACE
  SUGGESTIONS PROVIDED AND USER’S CAPABILITY TO ADD NEW REWRITING OPTIONS
     • The user can suggest new words or expressions (synonyms or paraphrases)
     • The user can go back and change the selected option as many times as necessary
     • In the rewritten text, expressions from the source text appear in red, and
       suggestions provided by SPIDER and selected by the user appear in green
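
The interaction described on the interface slides (pick one of the suggestions, add a new one, change the choice later) can be sketched as a small data structure. This is an illustrative assumption, not SPIDER's code: the class `Expression` and its methods are hypothetical, and the alternatives shown for "breach of law" are invented placeholders, since the slides do not list the actual suggestions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Expression:
    """One rewritable expression in the text, with its candidate rewrites."""

    source: str                                   # wording in the source text
    suggestions: List[str] = field(default_factory=list)
    chosen: Optional[str] = None                  # None keeps the source wording

    def add_suggestion(self, text: str) -> None:
        """Register a user-provided synonym or paraphrase (no duplicates)."""
        if text not in self.suggestions:
            self.suggestions.append(text)

    def choose(self, text: Optional[str]) -> None:
        """Select a suggestion, or pass None to go back to the source wording."""
        if text is not None and text not in self.suggestions:
            raise ValueError(f"unknown suggestion: {text!r}")
        self.chosen = text

    def render(self) -> str:
        """Wording that would appear in the rewritten text."""
        return self.source if self.chosen is None else self.chosen


if __name__ == "__main__":
    # The alternatives below are hypothetical examples, not SPIDER output.
    term = Expression("breach of law", ["violation of law"])
    term.add_suggestion("infringement of the law")   # user-supplied option
    term.choose("infringement of the law")
    print(term.render())
    term.choose(None)                                # change of mind: revert
    print(term.render())
```

Keeping the source wording alongside the choice is what makes it possible to revise the selection as many times as necessary.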




LINGUISTIC RESOURCES
        Eng4NooJ – linguistic knowledge system
       • OpenLogos dictionary (http://logos-os.dfki.de/), converted into NooJ
         format and enhanced with new properties, including derivational,
         morpho-syntactic, and semantic relations
       • Morphological system
       • Contextual rules and grammars
       • Domain-specific dictionary (sample “legal terms”)




LINGUISTIC RESOURCES
      General language dictionary entries (morpho-syntactic and semantic relations)
      impress,V+FLX=POLISH+SAL=PVPCpleasetype+PT=impressionar+DRV=NDRV01:BOOK+VSUP=make+VSUP=cause+NPREP=on
      aesthetic,A+FLX=NATURAL+SAL=AVstate+PT=aesthetically+DRV=AVDRV03
      skepticism,N+FLX=BOOK+SAL=ABcause+PT=cepticismo+DRV=NAVDRV02

      Rules to transform morpho-syntactically and semantically related words of
      different parts of speech
       NDRV04 = <B>ion/Npred+Nom
       ADRV02 = <B>icable
       AVDRV01 = <E>ly/ADV
       AVDRV04 = <B>tically/ADV

      [Figure: grammar to recognize adverbial compounds and transform them into
      equivalent single adverbs]

      Contextual rules (rules to improve precision in specific contexts)
       [bring(vt) N(charge; action) > present(vt) N(idem)]
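
To make the entry and rule notation on this slide concrete, here is a minimal sketch that splits a dictionary entry into lemma, category, and properties, and applies a derivation rule to a word. It rests on assumptions: `parse_entry` and `apply_rule` are hypothetical helpers, `<B>` is read as "delete the last character" and `<E>` as the empty string (the usual reading of NooJ's morphological operators), and the input words "create", "apply", and "constructive" are chosen for illustration only. It is not the Eng4NooJ implementation.

```python
import re
from collections import defaultdict


def parse_entry(line):
    """Split 'lemma,CAT+KEY=VALUE+...' into (lemma, category, properties)."""
    lemma, _, rest = line.partition(",")
    category, *props = rest.split("+")
    properties = defaultdict(list)
    for prop in props:
        key, _, value = prop.partition("=")
        properties[key].append(value)  # repeated keys (e.g. VSUP) keep all values
    return lemma, category, dict(properties)


def apply_rule(rule, word):
    """Apply a rule such as '<B>ion/Npred+Nom'; return (derived word, tags)."""
    commands, _, tags = rule.partition("/")
    for token in re.findall(r"<[A-Z]>|.", commands):
        if token == "<B>":       # assumed meaning: delete the last character
            word = word[:-1]
        elif token == "<E>":     # assumed meaning: empty string, nothing to do
            continue
        elif len(token) == 3:    # any other <X> operator is outside this sketch
            raise ValueError(f"unsupported operator {token}")
        else:                    # literal character: append it
            word += token
    return word, tags


if __name__ == "__main__":
    entry = ("impress,V+FLX=POLISH+SAL=PVPCpleasetype+PT=impressionar"
             "+DRV=NDRV01:BOOK+VSUP=make+VSUP=cause+NPREP=on")
    print(parse_entry(entry))
    print(apply_rule("<B>ion/Npred+Nom", "create"))
    print(apply_rule("<E>ly/ADV", "constructive"))
```

Under this reading, the `VSUP` and `NPREP` properties of "impress" are what link the verb to support verb constructions such as "make an impression on".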



LINGUISTIC RESOURCES




 [Screenshot: sample of terms classified as Information + Instructional/legal]




EVALUATION RESULTS: PARAPHRASING
                                     PRECISION
       Corpus: 500 sentences (100 sentences for each of 5 elementary support verbs)

       Support verb   SVC recognition precision   SVC recognition recall   SVC paraphrasing precision
       Pôr            73/73 (100%)                73/100 (73%)             72/73 (98.6%)
       Tomar          75/75 (100%)                75/100 (75%)             68/73 (93.1%)
       Ter            65/65 (100%)                65/100 (65%)             59/65 (90.7%)
       Dar            57/60 (95%)                 57/100 (57%)             46/51 (90.1%)
       Fazer          43/45 (95.5%)               43/100 (43%)             40/45 (88.8%)
       Average        62.6/63.6 (98.4%)           62.6/100 (62.6%)         57/61 (93.4%)

       Evaluation of recognition and paraphrasing of support verb constructions (SVC)
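
The recognition columns follow the usual definitions: precision is the share of recognized constructions that are correct, and recall is the share of constructions in the sample that were recognized. The sketch below recomputes both from the counts in the table. The names `RECOGNITION`, `precision`, `recall`, and `summarize` are hypothetical, and the denominator of 100 per verb is taken from the recall column, which implies one construction per sentence.

```python
# (correct recognitions, total recognitions, constructions in the sample),
# copied from the recognition columns of the table above.
RECOGNITION = {
    "Pôr": (73, 73, 100),
    "Tomar": (75, 75, 100),
    "Ter": (65, 65, 100),
    "Dar": (57, 60, 100),
    "Fazer": (43, 45, 100),
}


def precision(correct, retrieved):
    """Share of recognized constructions that are correct."""
    return correct / retrieved


def recall(correct, relevant):
    """Share of constructions in the sample that were recognized."""
    return correct / relevant


def summarize(table):
    """Per-verb (precision, recall) plus a pooled 'All' row."""
    rows = {
        verb: (precision(c, r), recall(c, n))
        for verb, (c, r, n) in table.items()
    }
    total_c = sum(c for c, _, _ in table.values())
    total_r = sum(r for _, r, _ in table.values())
    total_n = sum(n for _, _, n in table.values())
    rows["All"] = (precision(total_c, total_r), recall(total_c, total_n))
    return rows


if __name__ == "__main__":
    for verb, (p, r) in summarize(RECOGNITION).items():
        print(f"{verb:6s} precision={p:.1%} recall={r:.1%}")
```

Pooling the counts over all five verbs gives the same figures as the "Average" row for recognition, because that row averages numerators and denominators separately.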



EVALUATION RESULTS: IMPACT ON
                   TRANSLATABILITY (MT)
     Same corpus, 50 sentences selected randomly

     (i) Support verb constructions were pre-processed automatically with SPIDER and
          converted into equivalent single verbs
     (ii) The pre-processed sentences (automatically generated paraphrases) and the
          original sentences were submitted to MT, and the output translations of
          both versions were compared

     • In 29 cases (58%), the better translation was that of the automatically generated paraphrase
     • In 9 cases (18%), the better translation was that of the original support verb construction
     • In 12 cases (24%), the two translations were equally good or equally bad

     CONCLUSION
     The experiment indicates that paraphrases such as those generated by SPIDER help
     improve translation quality
     • Automated paraphrasing of support verb constructions with SPIDER gave the better
       MT output for 58% of the sampled sentences, against 18% for the original constructions
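
The percentages on this slide are simple shares of the 50 sampled sentences. The sketch below recomputes them and, as an addition that is not part of the original evaluation, shows one possible way to check whether the 29-to-9 split could be due to chance: an exact two-sided sign test with ties excluded. The names `OUTCOMES`, `shares`, and `sign_test_p` are hypothetical.

```python
from math import comb

# Outcome counts copied from the slide (50 sentences in total).
OUTCOMES = {"paraphrase better": 29, "original better": 9, "tie": 12}


def shares(outcomes):
    """Fraction of the sample falling into each outcome."""
    total = sum(outcomes.values())
    return {label: count / total for label, count in outcomes.items()}


def sign_test_p(wins, losses):
    """Exact two-sided sign test p-value; ties must be left out by the caller."""
    n = wins + losses
    k = max(wins, losses)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    for label, share in shares(OUTCOMES).items():
        print(f"{label}: {share:.0%}")
    p_value = sign_test_p(OUTCOMES["paraphrase better"], OUTCOMES["original better"])
    print(f"sign test p-value: {p_value:.4f}")
```

The sign test only uses the direction of each comparison, which matches the data reported here (better, worse, or tied) and needs no numeric translation score.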

FUTURE APPLICATIONS?
     • Writing / authoring aid (word processing applications)
     • Language composition tool for general and technical language (e.g. student texts
       or legal texts)
     • Text production and style editor
     • Terminology verification tool for professional use of terminology in technical
       domains (elimination of informal, idiomatic, and slang usage)
     • Empirical testbed for linguistic quality assurance (source and target texts)
     • Text editing (machine translation pre-editing and post-editing) and translation aid
     • Controlled language tool
          • Consistent, direct, and simple language
          • Restricted grammar (avoiding certain types of construction)
          • Avoidance of complex reasoning, figures of speech, metaphors, etc.
          • Elimination of wordiness
     • “Revision memory” tool (≈ “translation memory”) for recycling validated, reviewed
       sentences, structures, or phrases



FUTURE RESEARCH
                    FROM SPIDER TO MACHINE TRANSLATION

         a fazer um estágio para   dar aulas de / tutor         Religião
         a fazer um estágio para   dar aulas de / lecture       Religião
         a fazer um estágio para   dar aulas de / teach         Religião
         começa a                  dar exemplos / exemplify     :
         sentia-se capaz de        dar um murro em / punch      quem quisesse detê-lo
         gostávamos de lhe         dar uma palavrinha / speak   .




                                                                                    $EN



