12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

  1. Human Translation & Translation Workflow. Prof. Gloria Corpas Pastor, Dr. Jorge Leiva Rojo, Dr. Míriam Seghiri Domínguez, Universidad de Málaga. Birmingham, 13th November 2013
  2. Human Translation Workflow I: Overview
  3. HTW I: Overview (Prof. Corpas); HTW II: Professional Translation (Dr. Leiva); HTW III: Corpus-based translation (Dr. Seghiri)
  4. MAIN TRAINING EVENTS AND CONFERENCES (WP7): Scientific and technological training; Complementary skills training; Scientific and technological workshop; Business showcases
  5. TUTORIAL ON HUMAN TRANSLATION AND TRANSLATION WORKFLOW. Relevant to all research sub-programmes (* WP1 & WP5). Introduce the most common translation workflow to researchers: • to learn how translators currently work • to design new translation technologies • to cover confidence and quality estimation in HTWs
  6. LIST OF CONTENTS: Market studies (e.g. industry, quality, technology, language service providers); The translation workflow (e.g. certification, project management, agents, emerging trends); Training translators using corpora (compilation protocol, analysis, translation strategies, etc.)
  7. Human Translation Workflow II: Professional Translation
  8. Table of contents: 1. Introduction. Market studies; 2. The translation workflow; 3. Emerging trends
  9. 1. Introduction. Market studies (Trusted Translations)
  10. 1. Introduction: 25,000 companies in the world (Translation Bureau, 2012). 1,500 translation companies in Europe; average turnover 300,000 € in 2005 (EUATC, 2005). Translation and interpreting (+ software & website localization) sector’s assumed value: 5.7 billion € in 2008; 9.1 billion € estimated in 2013 (European Commission, 2009). Highest growth rate of all industries in Europe. World-wide annual growth: 5.13% (DePalma et al., 2013).
  11. 1. Introduction: 700 participants (LSP) (European Commission, 2009): 43% freelancers or sole proprietors; 36% 1-10 employees; 21% 10+ employees. Growth of big companies is quicker than growth of the rest of the language market (Boucau, 2009). Supply exceeds demand, yet the number of well-qualified linguists is too small to cover the growing demand.
  12. 1. Introduction: Six hyper-languages of the web (English, French, Italian, German, Spanish, Japanese) and Chinese to undergo major growth (cf. Common Sense Advisory, 2011). Prices dependent on exchange rates, not influenced by inflation (cf. Goddard, 2013). Prices relatively stable 2004-2008. Market is very competitive. 80% of providers charge less than 0.15 $ / word (Translation Bureau, 2012).
  13. 1. Introduction: Average per-word rate for the 30 most commonly used languages on the web fell 34.71%, from 0.205 US$ (2010) to 0.134 US$ (2012). Global supply, advances in technology, economic issues and more aggressive buyers have conspired to drive down prices since 2008; the situation remains unchanged.
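As a quick check on the figures in slide 13, the relative drop can be recomputed from the two quoted rates. This is only a sketch: the function name `percent_drop` is our own, and the small gap between the result (about 34.63%) and the 34.71% on the slide presumably comes from the slide's rates being rounded, which is an assumption on our part.

```python
# Per-word rates quoted on the slide (US$ per word).
RATE_2010 = 0.205
RATE_2012 = 0.134


def percent_drop(old: float, new: float) -> float:
    """Return the relative decrease from old to new, as a percentage."""
    return (old - new) / old * 100


drop = percent_drop(RATE_2010, RATE_2012)
print(f"Drop in average per-word rate: {drop:.2f}%")
```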
  14. 1. Introduction: Domain and technological skills should be better addressed (European Commission, 2009). “Use of technology by LSPs is sporadic” (Translation Bureau, 2012). It requires an investment to build and maintain infrastructure and a significant repository of data in order for the tool to be effective; this is difficult for small enterprises, the bulk of businesses within the industry.
  15. 1. Introduction: Decrease in resistance to MT (Systran #1; Google #2) (European Commission, 2009). MT does not produce a sufficient level of quality; output has to be reviewed by qualified translators. MT is not widely adopted (large volume translations) (Translation Bureau, 2012). HAMT is growing in usage: a 2009 study indicated that HAMT doubled the translation output and was 45% cheaper (Translation Bureau, 2012).
  16. 2. The translation workflow (Project Management Watch)
  17. “[A translation brief is a] definition of the communicative purpose for which the translation is needed. The ideal brief provides explicit or implicit information about the intended text function(s), the target-text addressee(s), the medium over which it will be transmitted, the prospective place and time and, if necessary, motive of production or reception of the text” (Nord, 1997).
  18. (Translate Media, 2013)
  19. 2. The translation workflow. Standards: ISO 9001:2008, ISO 17100 (mid 2014 [Rosam, 2013]); EN 15038:2006 (EU); ASTM (USA); GB/T 19363 (China); CAN/CGSB-131.10 (Canada). Their aims: to define translation’s basic terms and concepts; to establish the basics for the client-translation service provider relationship to meet market needs; to determine the implementation of the translation process.
  20. 2. The translation workflow: EN 15038 needs amendments. “The standard although well intended does neither indicate nor reflect the quality of the output of an LSP. Due to downward pressures and trends in pricing, many translation agencies need to operate with limited budgets in order to stay competitive. As a result, if low cost and low quality translation work is performed, the mere fact that such work is revised does not guarantee high quality” (European Commission, 2009).
  21. 2. The translation workflow (DePalma et al., 2013)
  22. 3. Emerging trends (fiverr)
  23. 3. Emerging trends (Lynch, 2012)
  24. 3. Emerging trends: CAT tool suppliers to deal with newer media and new crowd-based supply chains (DePalma et al., 2013). Users want different forms of content translated: emails, blogs, tweets. Slight decrease in turnover due to the economic downturn; small enterprises with turnovers below 50,000 € (European Commission, 2009).
  25. “Post-Editing is the process by which language professionals edit machine translation outputs to create human-quality translations” (Marcu, 2013).
  26. 3. Emerging trends: Crowd-sourced translations (Muntés Molero et al., 2012)
  27. Human Translation Workflow III: Corpus-based translation
  28. Table of contents: 1. Introduction; 2. Corpora in Translation Training; 3. Guidelines for Corpus Creation (3.1. Design Criteria; 3.2. Compilation Protocol); 4. Using Corpora to Translate; 5. Using the corpus to translate; 6. Corollary
  29. Introduction
  30. The inclusion of documentation as a core subject in the curriculum of Translation and Interpretation degrees clearly underlines its importance to translators. Training in this discipline is considered essential for a translator given that only sufficient and conscientious work on documentation will allow an adequate translation of a specialised text.
  31. The sources of information that may be utilised by the translator are extremely varied, ranging from an oral consultation with an expert to a search using specialised glossaries and dictionaries. However, in the field of translation perhaps the most relevant documentation activity today involves the use of the Internet and, closely related to this, the compilation and management of virtual corpora.
  32. Here, we shall present a systematic methodology for corpus compilation based on electronic resources available on the Internet. The methodology will be illustrated through the example of the creation of a virtual corpus of Telecommunications made up of one subcorpus in English and one subcorpus in Spanish.
  33. CORPUS OF TELECOMMUNICATIONS: English subcorpus; Spanish subcorpus
  34. Telecommunications, why? Telecommunication is now the world’s largest industry [and] the world’s fastest-changing industry from any measure of change you can name technology, players applications and users. In one decade, this industry is going from totally-closed, government-controlled, highly regulated, monopolistic, bureaucratic, plodding thing to an exploding free-for-all (Newton, 1994: 1).
  35. Corpora in Translation Training
  36. What is a corpus? Corpus, pl. corpora, from the Latin word corpus, i.e. “body”. A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis (Francis, 1982).
  37. Characteristics of corpora: • collections of text • naturally-occurring / authentic text • representative of a given language • collected according to specific criteria • stored in machine-readable format • used for linguistic analysis
  38. Different types of corpora. By what criteria can corpora be distinguished/classified? • language • size • purpose
  39. The advantages of using corpora in translation have been shown by various studies (cf. Laviosa, 1998; Bowker, 2002; Bowker and Pearson, 2002; Zanettin et al., 2003). Advantages: their objectivity, their reusability and multiple usage. They are user-friendly and allow access to and management of huge quantities of information in almost no time.
  40. Translators turn to the Internet in search of solutions to information and documentation problems because they are not only translating between languages but also between discourse communities and cultures. The compilation of corpora and the Internet appear to be two of the most important documentation resources in the practice and research of specialised translation. Corpora for a particular speciality are not available for consultation on the Internet. Translators have no alternative other than to compile their own virtual corpora for the specific translation that has been commissioned in each case.
  41. In order for a collection of texts to be considered a corpus in the strict sense of the term, it must meet a set of clear design criteria and follow a specific compilation protocol, so that the collection may be deemed representative of the field of specialisation or the particular type of document that is being translated.
  42. Guidelines for Corpus Creation
  43. Professional Competences: - Translating - Linguistic and textual - Research, information acquisition & processing - Cultural - Technical. The knowledge of how to compile and use corpora is an essential part of modern translational competence (Varantola, 2003).
  44. 1) Design Criteria
  45. The extract comes from a brochure from the company DVEO: <http://www.dveo.com/broadcast-systems/TDMB-and-DAB-modulator.shtml>.
  46. The objective is to create a specialized corpus on Telecommunications in English and Spanish compiled exclusively from resources available on the Internet. It is restricted to texts that have been drawn up in the UK and Spain. It will include only original documents (comparable corpus) and complete texts, and it will be documented.
  47. CORPUS DESIGN. Text type: brochures, research articles. Language/s: English (subcorpus 1) & Spanish (subcorpus 2). Diatopic restrictions: United Kingdom & Spain. Original or translations: Comparable (original). Complete text or partial: complete. Documented: yes.
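The design criteria on slide 47 can be written down as a small data structure and used to screen candidate texts before they enter the corpus. The following is a minimal sketch, not part of the original tutorial: the class names, field names, language and region codes, and the example URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DesignCriteria:
    """Design criteria for the corpus (field names are illustrative)."""
    text_types: tuple          # accepted text types
    region_by_language: dict   # diatopic restriction per language
    originals_only: bool       # comparable corpus: no translations
    complete_texts_only: bool  # no partial texts
    documented: bool           # each text must record its source


@dataclass
class Candidate:
    """A candidate text found on the Internet (hypothetical metadata)."""
    text_type: str
    language: str
    region: str
    is_translation: bool
    is_complete: bool
    source_url: str


def meets_criteria(doc: Candidate, criteria: DesignCriteria) -> bool:
    """Return True only if the candidate satisfies every design criterion."""
    if doc.text_type not in criteria.text_types:
        return False
    if criteria.region_by_language.get(doc.language) != doc.region:
        return False
    if criteria.originals_only and doc.is_translation:
        return False
    if criteria.complete_texts_only and not doc.is_complete:
        return False
    if criteria.documented and not doc.source_url:
        return False
    return True


# Criteria as listed on the slide; "en"/"GB" and "es"/"ES" are our own codes.
TELECOM_CRITERIA = DesignCriteria(
    text_types=("brochure", "research article"),
    region_by_language={"en": "GB", "es": "ES"},
    originals_only=True,
    complete_texts_only=True,
    documented=True,
)
```

Keeping the region tied to the language (rather than as two separate lists) prevents, for example, an English text written in Spain from slipping into the English subcorpus.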
  48. 2) Compilation Protocol
  49. The Compilation Protocol consists of 4 steps (Seghiri, 2011): I. Locating and accessing resources; II. Downloading data; III. Text formatting; IV. Data storage
  50. Step I: Locating and accessing resources
  51. The main sources of information to compile our corpus have been: institutional searches, carried out on the web sites of international organisations and institutions (International Telecommunication Union, Telefonica, etc.); key word searches using a search engine (www.google.com, www.yahoo.co.uk, etc.)
  52. key word searches
  53. Step II: Downloading Data
  54. This step can be performed manually (Ctrl+S). A group of pages can also be downloaded with programs such as GNU Wget: http://www.gnu.org/software/wget
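For readers who prefer to script Step II instead of saving pages by hand or using GNU Wget, the batch download can be sketched with Python's standard library. This is an illustration only: the function names and the file-naming scheme are assumptions, and the URL list would come from Step I.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url: str) -> str:
    """Derive a flat file name from a URL (the naming scheme is an assumption)."""
    parsed = urlparse(url)
    stem = (parsed.netloc + parsed.path).strip("/").replace("/", "_")
    return stem or "index"


def download_all(urls, target_dir):
    """Fetch each URL and save the raw bytes under target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        destination = target / local_name(url)
        with urllib.request.urlopen(url) as response:
            destination.write_bytes(response.read())
        saved.append(destination)
    return saved


# Example call (requires network access; the folder name is hypothetical):
# download_all(["http://www.gnu.org/software/wget"], "telecom_corpus/raw/en")
```

Saving the raw bytes unchanged keeps the original file available, so the format conversion in Step III can be repeated later if needed.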
  55. Step III: Text formatting
  56. III. Text formatting. Clear preference for .HTML and .PDF. Format conversion: ASCII or plain text format (cf. clean-policy, Sinclair 1991: 21).
  57. http://www.pdf-to-html-word.com/pdf-to-text
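The HTML half of the format conversion in Step III can be illustrated with Python's built-in `html.parser` module. This is a rough sketch under our own assumptions, not the tool used in the tutorial: the class and function names are hypothetical, PDF conversion is not covered, and joining text chunks with spaces may split words that contain inline tags.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join the chunks and collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Convert an HTML string to plain text (markup removed)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text("<h1>DAB modulator</h1><p>Digital radio &amp; TV</p>"))
```

Because the Spanish subcorpus contains accented characters, saving the result as UTF-8 plain text rather than strict ASCII is probably the safer choice; that is our suggestion, not something the slides specify.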
  58. Step IV: Data Storage
  59. IV. Data storage
  60. bilingual (EN-ES); documented; comparable; virtual
  61. Using the Corpus to Translate
  62. 1) Concordancers
  63. COMPARABLE CONCORDANCERS: AntConc 3.2 is a non-commercial, freely downloadable concordancer for Windows, Mac and Linux. This versatile software features several tools, which display lists of words and keywords (Word List, Keyword List), list, sort and search for lexical bundles (Collocates), generate lines in KWIC format (Concordance), indicate the position of the keyword within a given corpus (Concordance Plot), and allow the user to have access to the whole source file or corpus (File View). http://www.antlab.sci.waseda.ac.jp/antconc_index.html
  64. AntConc [Monolingual Freeware Multiplatform Concordancer]
  65. Another monolingual concordancer, for Windows only, is the Multilingual Corpus Toolkit, which supports many European and Asian languages. http://personalpages.manchester.ac.uk/staff/scott.piao/research/DownLoad/download.htm Freeware concordancers for Mac are Conc 1.7/1.8 and Concorder 1.0. Conc: http://www.sil.org/computing/conc/conc.html Concorder: http://mac.softpedia.com/get/WordProcessing/Concorder.shtml
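To make the KWIC (key word in context) format concrete, here is a minimal sketch of what a monolingual concordancer does with a plain-text corpus file. It is not how AntConc is implemented; the function name, the `width` parameter and the bracket notation around the keyword are our own choices.

```python
import re


def kwic(text: str, keyword: str, width: int = 30):
    """Return one KWIC line per occurrence of keyword (whole word, any case)."""
    # Collapse line breaks and repeated spaces so the context is one line.
    flat = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Pad both contexts so the keyword lines up in a single column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sample text, for illustration only.
sample = "The modulator accepts an ASI input. Each modulator is tested before shipping."
for line in kwic(sample, "modulator", width=20):
    print(line)
```

Aligning the keyword in one column is what lets a translator scan the left and right contexts for recurring collocates.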
  66. PARALLEL CONCORDANCERS: A bilingual or multilingual concordancer is a program for parallel corpora, i.e. corpora of source texts and their translations into other languages. As a rule, this kind of software requires input aligned at sentence level. Most bi-/multilingual concordancers are commercial. A well-known example is ParaConc 0.9, the multilingual version of MonoConc Pro. It can analyse up to four languages in parallel (one source text corpus and up to three target corpora).
  67. ParaConc [Bilingual Commercial Suite for Windows. Alignment]
  68. ParaConc [Concordancing]
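The reason parallel concordancers need sentence-aligned input can be shown with a very small sketch: once each source sentence is paired with its translation, a search in the source side returns the matching target sentences directly. This is not ParaConc's algorithm; the function name is hypothetical and the sentence pairs are invented for illustration.

```python
def parallel_concordance(aligned_pairs, term: str):
    """Return the (source, target) pairs whose source sentence contains term."""
    needle = term.lower()
    return [(src, tgt) for src, tgt in aligned_pairs if needle in src.lower()]


# Invented EN-ES sentence pairs, already aligned one to one.
PAIRS = [
    ("The modulator supports two input formats.",
     "El modulador admite dos formatos de entrada."),
    ("The antenna is sold separately.",
     "La antena se vende por separado."),
]

for source, target in parallel_concordance(PAIRS, "modulator"):
    print(source, "|", target)
```

Note that this simple substring search would not work on the comparable corpus described in this tutorial, since its English and Spanish texts are originals rather than translations of each other.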
  69. Corollary
  70. Comparable corpora are particularly useful for meeting translators’ information needs. Representative Corpora: finding information on terminology, phraseology, concepts, cultural issues and text discourse for direct and inverse translation.
  71. Corpora: instant access to authentic language and real usage; syntagmatic patterns and translation equivalents unavailable in other resources or technologies; guidance to style, text-structuring devices and conventions in both SL and TL; useful for the translation of any kind of text type, language/s and in any direction
  72. Human Translation & Translation Workflow. Prof. Gloria Corpas Pastor, Dr. Jorge Leiva Rojo, Dr. Míriam Seghiri Domínguez, Universidad de Málaga. Birmingham, 13th November 2013