SlideShare a Scribd company logo
1 of 11
Download to read offline
Overview
353
Using Translation Memory to Speed up Translation
Process
Marija Brkić
Department of Informatics, University of Rijeka
Omladinska 14, 51000 Rijeka, Croatia
mbrkic@uniri.hr
Sanja Seljan
Department of Information Sciences
Faculty of Humanities and Social Sciences, University of Zagreb
Ivana Lučića 3, 10000 Zagreb, Croatia
sanja.seljan@ffzg.hr
BoĆŸena BaĆĄić Mikulić
Primary school “Turnić”
Franje Čandeka 20, 51000 Rijeka, Croatia
bozena.basic@gmail.com
Summary
Translation process is one aspect of human creativity. Due to globalization, EU
accession negotiations, and the need for information exchange, the amount of
translation work increases on a daily basis. The translation process is hindered
by the fact that the languages involved differ culturally, stylistically, syntacti-
cally and lexically. This paper explores the benefits and limitations of TMs
(translation memories). TMs are not used for replacing humans in the transla-
tion process, but rather for enhancing the human translation process. In this
paper, a detailed analysis of Atril’s DĂ©jĂ  Vu X system is presented, along with
its time-saving implications, which are based on the reuse of previously stored
segments. Excerpts from three different digital camera user manuals are trans-
lated from English into Croatian. Evaluation is performed by measuring the
time difference between human and TM-based translation speeds in prepara-
tion, translation, and revision phases, and with regard to six different parame-
ters.
Key words: Translation Memory (TM), DĂ©jĂ  Vu, Computer-Assisted Transla-
tion (CAT), language pair, translation unit (TU), translation speed
Introduction
The EU relies on the principles of open access to documents, multilingualism
and democracy. Therefore, the EU legislation needs to be translated into each of
INFuture2009: “Digital Resources and Knowledge Sharing”
354
the official languages. On the other hand, the legislation of each particular
member state needs to be translated into one of the EU’s official languages
(Seljan & Pavuna, 2006b). To sum up, translation demands in the EU surpass
human capacities. There are 23 official languages with 23 x 22 = 506 language
pairs. A huge number of pages need to be translated on a daily basis. Short
deadlines, demands for consistency and data-sharing, and insufficient number of
translators further impede the translation process, particularly for newly admit-
ted countries (Seljan & Pavuna, 2006a). Nowadays, fortunately, translators have
fully automatic MT systems and CAT tools at their disposal (ValderrĂĄbanos,
2003). Moreover, the usage of such tools has been recommended by the Direc-
torate General for Translation of the European Commission (DGT), which is
European Commission’s in-house translation department (Seljan & Pavuna,
2006a). Depending on their needs, translators can opt for MT or CAT tools.
MT was first conceived as a technology that significantly speeds up the transla-
tion process and offers human-like quality translations. Soon, it became clear
that such a goal is far-fetched (ValderrĂĄbanos, 2003), which led to the develop-
ment of TM technology. Current computational models of MT are limited to
tasks for which rough translations are adequate, tasks where human post-editors
are used and tasks limited to sub-domains in which fully automatic high quality
translations are achievable (FAHQT) (Jurafsky & Martin, 2009). TMs, on the
other hand, exploit machine memory for storing translated segments in order to
reuse them in future translations. Their usability, therefore, increases with the
size of the stored data.
As Croatia’s EU accession negotiations are underway, it is high time for the de-
velopment of Croatian language tools and resources. This paper explores the
benefits of using TMs, in particular Atril’s DĂ©jĂ  Vu X (DVX) system, and pre-
sents the results of a study in which English and Croatian are source and target
languages, respectively.
Translation memory
TM technology is based on the notion of reuse of previously translated seg-
ments. It is usually integrated into a system which has a terminology manage-
ment module and a lexicon (ValderrĂĄbanos, 2003). This technology does not
aim at replacing humans, but rather at enhancing the human translation process.
A TM can be defined as a database which stores corresponding source and tar-
get language translations, called translation units (TUs).
Approaches
There are two TM implementation approaches. Despite the differences in im-
plementation, TMs are designed with the common purpose of storing previously
translated material in an organized way, in order to present it to the user in fu-
ture translations (Gow, 2003).
M. Brkić, S. Seljan, B. Baơić Mikulić, Using Translation Memory to 

355
Sentence-based approach
A sentence-based approach divides source and target language texts into corre-
sponding TUs, which can be sentences, titles, subtitles or list entries. TUs are
stored in a database and retrieved in future translations in cases of identical or
similar TUs in new source texts. Sentence units are easy to identify if they start
with capital letters and end with full stops (Gow, 2003). However, abbreviations
or full stops which are not at the end of sentences pose problems. These prob-
lems can be solved by defining new sentence delimitation rules (DĂ©jĂ  Vu,
2009). The main benefit of the sentence-based approach, compared to a charac-
ter-string-in-bitext-based approach, is that exact matches are more likely to be
relevant because sentence-based TMs represent an extreme form of high preci-
sion, low recall search (Simard & Langlais, 2000). Fuzzy matching algorithms,
on the other hand, are based on statistical models of similarity. Since these
models are only loose approximations, the matching algorithms sometimes cre-
ate useless matches, known as ‘noise’, or fail to generate matches, the phe-
nomenon known as ‘silence’ (Bowker, 2002 in Gow, 2003).
Character-string-in-bitext(CSB)-based approach
A CSB-based approach involves storing of source texts and corresponding
translations in a database. The resulting texts are called bitexts. Bitexts can be
used for preparatory background reading. In this approach, identical character
strings of any length are recognized and reused (Gow, 2003). Working with
sentence segments, instead of entire sentences, has its advantages (Simard &
Langlais, 2000). It enables identification of identical sentence segments or even
several consecutive identical sentences at once (Macklovitch & Russell, 2000 in
Gow, 2003). ‘Noise’ phenomenon is still present, but this time as a result of
finding unreliably small matches. ‘Silence’, on the other hand, occurs because
there is no support for fuzzy matching. One of the disadvantages of this ap-
proach is that internal repetitions have to be recycled exclusively through termi-
nology databases, because only entire translations are added to databases (Gow,
2003).
Advantages
Two major advantages of TM technology are consistency and speed. Consis-
tency is of crucial importance in non-literary texts, for example software and
hardware manuals (ValderrĂĄbanos, 2003), or business, legal, scientific and
technical texts (Gow, 2003). These texts are highly repetitive. The longer they
are, the more likely they are to contain repetitive content (AustermĂŒhl, 2001 in
Gow, 2003). Repetition can occur, not only internally, but also across several
texts in the same domain (Gow, 2003). Moreover, software documentation, be-
sides being highly repetitive, is subject to frequent version updates. It is thus an
ideal candidate for exploiting TM benefits (Bruckner, 2001). Consistency is es-
INFuture2009: “Digital Resources and Knowledge Sharing”
356
pecially important in cases where several translators work on the same project
and share the same TM on the network (O’Brien, 1998 in Gow, 2003).
Speed is important, regardless of the domain, because globalization has brought
forward endless translation demands (ValderrĂĄbanos, 2003). Using TMs in the
translation process implies cost reduction. For example, the translation process
can be started as soon as the first draft of the document to be translated is ob-
tained. Furthermore, translation vendors can lower prices and thus earn more
contracts (Gordon, 1997 in Gow, 2003). On the other hand, freelance translators
can save up their valuable time or increase their earnings by increasing the
translation speed (Gordon, 1996 in Gow, 2003).
In addition, using TMs preserves original page layouts in translated documents,
because formatting information is hidden in embedded codes (Seljan & Pavuna,
2006a).
Finally, TMs are usually integrated into systems which have tools for building
dictionaries (Webb, 1998) and reporting detailed statistics on internal and exter-
nal repetition and word counts. These tools can help project managers in sched-
uling localization products (Esselink, 2000).
Disadvantages
Although TMs bring a lot of advantages, there are also some limitations of their
usage.
First of all, using TMs implies initial decrease in productivity because transla-
tors need to master the environment (Webb, 1998). The odds of finding quality
matches increase with the size of TMs (Gow, 2003). Moreover, the beneficiary
effects of using TMs are felt only on repetitive texts (ValderrĂĄbanos, 2003).
Therefore, cost-effects of investing additional time into importing existent
translations through the process of alignment should be calculated (Seljan &
Pavuna, 2006a).
Although, according to Esselink (2000), TMs indisputably save time, regular
database maintenance is time-consuming (AustermĂŒhle, 2001 in Gow, 2003).
Furthermore, source texts need to be in digital form (Gow, 2003) and suitable
file formats because not all formats are supported since TM systems require
filters to preserve formatting. As an effect, they are usually bundled only with
filters for most commonly used formats (Esselink, 2000).
TMs affect quality of the translation because, by using them, translators tend to
avoid using anaphoric or cataphoric references in order to make segments more
‘universal’.
Seljan and Pavuna (2006a) add lack of language knowledge and context insen-
sitivity to the list of drawbacks and point out additional software, maintenance
and education costs.
There are also other concerns with regard to using TMs. For example, it is
questionable whether translators should be paid differently for identical and
M. Brkić, S. Seljan, B. Baơić Mikulić, Using Translation Memory to 

357
fuzzy matches recovered from TMs. Furthermore, the ownership of final TMs is
also unclear.
DĂ©jĂ  Vu X
DVX is a very powerful and adaptive CAT system which integrates several
CAT tools – lexicon, terminology database, TM, alignment module, etc. The
first version of this system appeared in 1993. DVX is an example of a hybrid
approach. Matches are ranked into the following categories: perfect/exact
match, fuzzy match, guaranteed match (there is overlapping of neighbouring
TUs as well) and assembled from portions match. It features auto-search, as-
sembling, propagation and pre-translation. Besides, it gives a detailed statistical
analysis of source and target language texts. Terminology database is the only
database that needs to be manually filled and it allows linguistic enrichment of
inserted terms. A list of lexicon entries is automatically built. However, entries
which are to be included need to be manually translated. DVX combines mod-
ern TM technology with example-based machine translation (EBMT). EBMT
implies translation by analogy and enables combining several segments into one
translation segment (DĂ©jĂ  Vu, 2009).
According to a survey, which included 699 translators from 50 different coun-
tries, DVX was the second TM on the list of popularity, meaning that 61% of
translators were acquainted with the system. On the usage list, it was the fourth.
To be precise, there were 23% of translators using the system (Laugodaki,
2006). The statistics show that the system is preferred by translators with higher
level of information literacy. DVX scored better than competitors’ systems in
functionality, efficiency, speed, reliability, price, and usability. According to the
survey, it also had better customer support.
Experimental study
The feature to be examined in this study was the speed of the translation proc-
ess, with the goal of measuring the time difference between human and TM-
based translation speeds (Bruckner, 2001). The following parameters were
taken into account:
‱ Measure (minutes),
‱ Evaluation procedure (comparing times needed by each translator to de-
liver translation),
‱ Score (time needed for the translation process),
‱ Metric (faster / slower),
‱ Languages (English – source language, Croatian – target language),
‱ Text type (hardware documentation), and
‱ System used (DĂ©jĂ  Vu).
The study was conducted by two expert translators, who translated three ex-
cerpts from Kodak’s, Nokia’s and Canon’s digital camera user manuals, re-
INFuture2009: “Digital Resources and Knowledge Sharing”
358
spectively. Each excerpt was two standard pages long and contained informa-
tion on battery and relating equipment care and maintenance. Therefore, each
excerpt contained terms and phrases from the same domain. The excerpts were
translated from English into Croatian.
The experimental study was conducted in the following phases:
1. Text selection phase
2. Preparation phase (lexical analysis of the source language texts and their
technical registers)
3. Translation phase (setting up environment, translating, building up lexi-
con, filling terminology database)
4. Revision phase (post-editing)
Preparation phase
For all the excerpts, the preparation phase was performed jointly by the two
translators. The lexical analysis of the source language texts (English) lasted 45,
10 and 18 minutes, respectively.
Translation phase
Prior to the translation process, the translator using the TM spent 5 minutes set-
ting up the environment. The translator had some previous experience with
DVX translation memory. After the text was inserted into DVX, the system cal-
culated that the internal repetition was 6 per cent.
The translation speed in the first translation (Kodak) was identical for both
translators (37 minutes).
After the first text was translated, the TM translator had to manually build up
the system’s lexicon, which lasted 15 minutes and included 68 entries. The in-
serted entries were mostly nouns in nominative singular and plural, and mascu-
line adjectives in singular. Filling the terminology database lasted 10 minutes.
Only 18 phrases were added due to Croatian rich morphological system.
The second excerpt (Nokia) first underwent the pre-translation process. Besides
the exact matches, fuzzy matches and parts assembled from portions were also
allowed. After processing 69 source language sentences, the system found 6
sentences with 1 or more fuzzy matches and 31 sentences assembled from por-
tions, some of which are presented in Figure 1.
The traditional translation process for the second text lasted 31 minutes, while,
with the help of the TM, it lasted 30 minutes. After the second text translation,
111 new entries were added to the lexicon and 13 new phrases were added to
the terminology database. These processes lasted 15 and 10 minutes, respec-
tively.
The third excerpt (Canon) also underwent the process of pre-translation prior to
the translation. The system processed 41 source sentences and found 1 sentence
with a unique exact match and 32 sentences assembled from portions. The sys-
tem found substantially more matches than it had in the second text (Figure 2).
M. Brkić, S. Seljan, B. Baơić Mikulić, Using Translation Memory to 

359
Figure 1: Matches found by DVX when the second source text was inserted
Figure 2: Matches found by DVX after the third source text was inserted
The translation phase for the third text lasted 29 minutes for the traditional
translation and 23 minutes with the TM. It took 5 minutes to add 45 new entries
to the lexicon, and 3 minutes to add 7 new phrases to the terminology database.
INFuture2009: “Digital Resources and Knowledge Sharing”
360
Revision Phase
The revision phase for the three translations in the traditional translation process
lasted 5, 7, and 5 minutes, respectively, while in the TM translation process it
lasted 5, 8, and 5 minutes, respectively.
Evaluation
According to the automation level, evaluation methods can be divided into
automatic and manual. Although time-consuming, manual methods better suit
real-life system application. They can be implemented in two different ways.
One way is that a translator scores each segment on a 1-4 scale from “absolutely
useless” to “no changes needed”. The other way includes modifying each re-
sulting segment into acceptable translation and counting post-editing steps
needed (HodĂĄsz, 2006).
The results of a 1-4 scale test performed on the third translation are presented in
Chart 1. Since the TM is still under development, these are only initial results.
With the growth of the TM, the increase in the percentage of segments classi-
fied as ‘few changes needed’ or ‘no changes needed’ can be expected.
Chart 1: Effectiveness of TM
20%
71%
7% 2%
absolutely useless
many changes needed
few changes needed
no changes needed
Discussion
The search process gives the highest priority to the TM matches and the lowest
to the lexicon matches, with the terminology database matches in between.
Nevertheless, the user is supplied with all the matches in a separate window,
which enables them to choose the most appropriate match for the given context.
Most of the problems with DVX regard morphology and word order. A word
extracted from the lexicon which is a masculine adjective needs to be changed
into feminine or neuter in order to match the context syntactically. The same is
valid for verbs, since the lexicon contains their infinitive form.
Furthermore, words are extracted in order of appearance and they often need to
be rearranged because of the syntactic rules of the target language.
There are also capitalization issues which need to be resolved. For example, if
there is a word which was the first word in a previously stored segment, it re-
mains capitalized regardless of its position in subsequent occurrences.
M. Brkić, S. Seljan, B. Baơić Mikulić, Using Translation Memory to 

361
Additionally, punctuation is also retained if the word found in one of the re-
sources is followed by a punctuation mark.
However, one of the main advantages of using the TM is the user interface,
which enables the translator to see parallel sentences or units in the same row.
That eases the translation process because the translator does not have to spend
a lot of time inserting or deleting units of the source text and scanning through
both texts.
Speed of translation process
Since the main advantage of any TM system should be speeding up the transla-
tion process, here follow the results of this case study. Table 1 presents time
(expressed in minutes) needed for each phase (preparation phase has been
omitted because it was done jointly by the two translators).
Table 1: Time spent for each phase
Translation
phase
(minutes)
Lexicon and
terminology database
filling (TM) (minutes)
Revision
phase
(minutes)
Total
(minutes)
Traditional 37 / 5 421st
text TM 37 25 7 69
Traditional 31 / 5 362nd
text TM 30 25 8 63
Traditional 29 / 5 343rd
text TM 23 8 5 36
Chart 2: Time spent in translation process
37
29
37
23
31
30
0
5
10
15
20
25
30
35
40
First text Second
text
Third text
Time(minutes)
Traditional
translation
TM
It is evident that the TM translator becomes faster than the traditional translator
in the second, and even more, in the third translation. Since the TM was empty
prior to the first translation, the translation phase of the first excerpt was of the
same length for both translators. As DVX grows in size, the TM translator be-
INFuture2009: “Digital Resources and Knowledge Sharing”
362
comes increasingly faster than the traditional one (Chart 2). The difference in
speed is the most obvious in the third translation, where the TM translator is 6
minutes faster.
Taking into account the time which the TM translator needs to spend in the
process of filling the lexicon and the terminology database, it is evident that
using TM systems can be somewhat time-consuming in the beginning (Chart 3).
Chart 3: Total time for traditional translation and TM
42 36
69 63
34
36
0
15
30
45
60
75
First text Second
text
Third text
Time(minutes)
Traditional
translation
TM
Nevertheless, it is quite plausible that the time which the TM translator spends
in filling the lexicon and terminology database gradually decreases because
both databases have smaller number of entries to be added in, while the time
needed for human translation remains the same. Therefore, the time difference
for the TM translator and the traditional translator becomes less significant.
Conclusion
This paper presents a detailed analysis of the benefits of the Atril’s DĂ©jĂ  Vu X
translation memory system, with its time-saving implications based on the reuse
of previously stored translation units.
In this case study, segments from three different digital camera user manuals are
translated, with the aim of presenting how TMs ensure consistency in non-liter-
ary texts. After measuring the time difference between human and TM-based
translation speeds and taking into account 6 parameters, those being measure,
evaluation procedure, score, metric, languages and text type, it can be con-
cluded that TMs speed up the translation process, especially in later phases, i.e.
when texts show certain level of local or global repetition. Nevertheless, even
though TMs unquestionably save time, regular lexicon and terminology data-
base maintenance is still time-consuming. However, this task does not take as
much of translator’s time as databases grow in size. On the other hand, lack of
knowledge asks for intensive work in the revision phase. Even so, TMs repre-
sent a valuable resource, especially when several translators work in the same
M. Brkić, S. Seljan, B. Baơić Mikulić, Using Translation Memory to 

363
domain, and aim to produce fast, consistent and professional-quality transla-
tions.
Acknowledgments
This work has been supported by the Ministry of Science, Education and Sports
of the Republic of Croatia, under the grants No.130-1300646-0909 and No.009-
0361935-0852.
References
Bruckner, Christine; Plitt, Mirko. Evaluating the Operational Benefit of Using Machine Transla-
tion Output as Translation Memory Input. // Proceedings of the VIII MT Summit. Spain,
Santiago de Compostela, 2001
DĂ©jĂ  Vu – Translation memory and productivity system. http://www.atril.com/ (July 25, 2009)
Esselink, Bert. A Practical Guide to Localization. Amsterdam/Philadelphia : John Benjamins
Publishing Company, 2000.
Gow, Francie. Metrics for Evaluating Translation Memory Software. M.A. thesis. University of
Ottawa, Canada, 2003.
http://www.chandos.ca/Metrics_for_Evaluating_Translation_Memory_Software.pdf (August
3, 2009)
HodĂĄsz, GĂĄbor. Evaluation Methods of a Linguistically Enriched Translation Memory System. //
Proceedings of the Fifth International Conference on Language Resources and Evaluation.
Italy, Genoa, 2006, 2044-2047
Jurafsky, Daniel; Martin, James H. Speech and language processing : an introduction to natural
language processing, computational linguistics, and speech recognition. New Jersey : Pearson
education, 2009
Lagoudaki, Elina. Translation Memories Survey 2006: Users’ perceptions around TM use // Pro-
ceedings of the ASLIB International Conference Translating & the Computer 28. UK, Lon-
don, 2006.
Seljan, Sanja; Pavuna, Damir. Translation Memory Database in the Translation Process. // Pro-
ceedings of the 17th International Conference on Information and Intelligent Systems IIS
2006 / Aurer, B.; Bača, M. (ed.). Croatia, VaraĆŸdin : FOI, 2006a, 327-332
Seljan, Sanja; Pavuna, Damir. Why Machine-Assisted Translation (MAT) Tools for Croatian? //
Proceedings of the 28th Conference on Information Technology Interfaces – ITI 2006. Croa-
tia, Cavtat, 2006b, 469-474
Simard, Michel; Langlais, Philippe. Sub-sentential Exploitation of Translation Memories. //
LREC 2000 Second International Conference on Language Resources and Evaluation.
Greece, Athens, 2000
Valderråbanos, Antonio S.; Esteban, José; Iraola, Luis. TransType2 - A New Paradigm for
Translation Automation. // Proceedings of the MT Summit IX. New Orleans, Louisiana, 2003,
498-501
Webb, Lynn E. Advantages and Disadvantages of Translation Memory : A Cost/Benefit Analysis.
M.A. thesis. Monterey Institute of International Studies, Monterey, California, 1998.
http://www.tradulex.org/Bibliography/Webb.htm (October 31, 1998)

More Related Content

What's hot

Introduction to development of lexical databases
Introduction to development of lexical databasesIntroduction to development of lexical databases
Introduction to development of lexical databasesMuhammad Shoaib Chaudhary
 
AMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITYAMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITYijnlc
 
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...ijnlc
 
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...IJNSA Journal
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...ijaia
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ijnlc
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTGuy De Pauw
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...RIILP
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) inventionjournals
 
Analysis review on feature-based and word-rule based techniques in text stega...
Analysis review on feature-based and word-rule based techniques in text stega...Analysis review on feature-based and word-rule based techniques in text stega...
Analysis review on feature-based and word-rule based techniques in text stega...journalBEEI
 
Review on feature-based method performance in text steganography
Review on feature-based method performance in text steganographyReview on feature-based method performance in text steganography
Review on feature-based method performance in text steganographyjournalBEEI
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indianeSAT Publishing House
 
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...mlaij
 
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESMULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESijcseit
 
Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemIJERA Editor
 
Dictionary based concept mining an application for turkish
Dictionary based concept mining  an application for turkishDictionary based concept mining  an application for turkish
Dictionary based concept mining an application for turkishcsandit
 
Comparison of Compression Algorithms in text data for Data Mining
Comparison of Compression Algorithms in text data for Data MiningComparison of Compression Algorithms in text data for Data Mining
Comparison of Compression Algorithms in text data for Data MiningIJAEMSJORNAL
 

What's hot (17)

Introduction to development of lexical databases
Introduction to development of lexical databasesIntroduction to development of lexical databases
Introduction to development of lexical databases
 
AMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITYAMBIGUITY-AWARE DOCUMENT SIMILARITY
AMBIGUITY-AWARE DOCUMENT SIMILARITY
 
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...
STRUCTURED AND QUANTITATIVE PROPERTIES OF ARABIC SMS-BASED CLASSIFIED ADS SUB...
 
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T...
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
 
Compiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFSTCompiling Apertium Dictionaries with HFST
Compiling Apertium Dictionaries with HFST
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
Analysis review on feature-based and word-rule based techniques in text stega...
Analysis review on feature-based and word-rule based techniques in text stega...Analysis review on feature-based and word-rule based techniques in text stega...
Analysis review on feature-based and word-rule based techniques in text stega...
 
Review on feature-based method performance in text steganography
Review on feature-based method performance in text steganographyReview on feature-based method performance in text steganography
Review on feature-based method performance in text steganography
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indian
 
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
 
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUESMULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES
 
Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis System
 
Dictionary based concept mining an application for turkish
Dictionary based concept mining  an application for turkishDictionary based concept mining  an application for turkish
Dictionary based concept mining an application for turkish
 
Comparison of Compression Algorithms in text data for Data Mining
Comparison of Compression Algorithms in text data for Data MiningComparison of Compression Algorithms in text data for Data Mining
Comparison of Compression Algorithms in text data for Data Mining
 

Similar to Using translation memory_to_speed_up_tra

Using ontology based context in the
Using ontology based context in theUsing ontology based context in the
Using ontology based context in theijaia
 
Direct Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsDirect Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsIJCI JOURNAL
 
Language Grid
Language GridLanguage Grid
Language Gridlindh
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguisticsSpringer
 
Ethical issues regarding machine assisted translation of literary texts
Ethical issues regarding machine assisted translation of literary textsEthical issues regarding machine assisted translation of literary texts
Ethical issues regarding machine assisted translation of literary textsantonellarose
 
Building of Database for English-Azerbaijani Machine Translation Expert System
Building of Database for English-Azerbaijani Machine Translation Expert SystemBuilding of Database for English-Azerbaijani Machine Translation Expert System
Building of Database for English-Azerbaijani Machine Translation Expert SystemWaqas Tariq
 
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...IJCI JOURNAL
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicijnlc
 
Semantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorSemantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorCognitum
 
ATAR: Attention-based LSTM for Arabizi transliteration
ATAR: Attention-based LSTM for Arabizi transliterationATAR: Attention-based LSTM for Arabizi transliteration
ATAR: Attention-based LSTM for Arabizi transliterationIJECEIAES
 
Role of Machine Translation and Word Sense Disambiguation in Natural Language...
Role of Machine Translation and Word Sense Disambiguation in Natural Language...Role of Machine Translation and Word Sense Disambiguation in Natural Language...
Role of Machine Translation and Word Sense Disambiguation in Natural Language...IOSR Journals
 
A Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationA Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationDarian Pruitt
 
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...TELKOMNIKA JOURNAL
 
SECOND LANGUAGE RESEARCH.pptx
SECOND LANGUAGE RESEARCH.pptxSECOND LANGUAGE RESEARCH.pptx
SECOND LANGUAGE RESEARCH.pptxssuser1ac0fa
 
Translation j 2009-cocci (1)
Translation j 2009-cocci (1)Translation j 2009-cocci (1)
Translation j 2009-cocci (1)FabiolaPanetti
 
A Critical Approach To Online Dictionaries Problems And Solutions
A Critical Approach To Online Dictionaries   Problems And SolutionsA Critical Approach To Online Dictionaries   Problems And Solutions
A Critical Approach To Online Dictionaries Problems And SolutionsNat Rice
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...ijnlc
 
Fq2510361043
Fq2510361043Fq2510361043
Fq2510361043IJERA Editor
 
Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...Moses Altovar
 

Similar to Using translation memory_to_speed_up_tra (20)

Using ontology based context in the
Using ontology based context in theUsing ontology based context in the
Using ontology based context in the
 
Direct Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete UnitsDirect Punjabi to English Speech Translation using Discrete Units
Direct Punjabi to English Speech Translation using Discrete Units
 
Language Grid
Language GridLanguage Grid
Language Grid
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
Ethical issues regarding machine assisted translation of literary texts
Ethical issues regarding machine assisted translation of literary textsEthical issues regarding machine assisted translation of literary texts
Ethical issues regarding machine assisted translation of literary texts
 
Building of Database for English-Azerbaijani Machine Translation Expert System
Building of Database for English-Azerbaijani Machine Translation Expert SystemBuilding of Database for English-Azerbaijani Machine Translation Expert System
Building of Database for English-Azerbaijani Machine Translation Expert System
 
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...
 
An expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabicAn expert system for automatic reading of a text written in standard arabic
An expert system for automatic reading of a text written in standard arabic
 
Semantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditorSemantic Rules Representation in Controlled Natural Language in FluentEditor
Semantic Rules Representation in Controlled Natural Language in FluentEditor
 
ATAR: Attention-based LSTM for Arabizi transliteration
ATAR: Attention-based LSTM for Arabizi transliterationATAR: Attention-based LSTM for Arabizi transliteration
ATAR: Attention-based LSTM for Arabizi transliteration
 
Role of Machine Translation and Word Sense Disambiguation in Natural Language...
Role of Machine Translation and Word Sense Disambiguation in Natural Language...Role of Machine Translation and Word Sense Disambiguation in Natural Language...
Role of Machine Translation and Word Sense Disambiguation in Natural Language...
 
A Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationA Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference Annotation
 
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...
Quality Translation Enhancement Using Sequence Knowledge and Pruning in Stati...
 
SECOND LANGUAGE RESEARCH.pptx
SECOND LANGUAGE RESEARCH.pptxSECOND LANGUAGE RESEARCH.pptx
SECOND LANGUAGE RESEARCH.pptx
 
Translation j 2009-cocci (1)
Translation j 2009-cocci (1)Translation j 2009-cocci (1)
Translation j 2009-cocci (1)
 
A Critical Approach To Online Dictionaries Problems And Solutions
A Critical Approach To Online Dictionaries   Problems And SolutionsA Critical Approach To Online Dictionaries   Problems And Solutions
A Critical Approach To Online Dictionaries Problems And Solutions
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
 
arttt.pdf
arttt.pdfarttt.pdf
arttt.pdf
 
Fq2510361043
Fq2510361043Fq2510361043
Fq2510361043
 
Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...
 

Recently uploaded

Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...soniya singh
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 

Recently uploaded (20)

Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Call Girls In Mukherjee Nagar đŸ“± 9999965857 đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar đŸ“±  9999965857  đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar đŸ“±  9999965857  đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar đŸ“± 9999965857 đŸ€© Delhi đŸ«Š HOT AND SEXY VVIP 🍎 SE...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi âžĄïž 8264348440 💋📞 Independent Escort S...
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 

Using translation memory_to_speed_up_tra

  • 1. Overview 353 Using Translation Memory to Speed up Translation Process Marija Brkić Department of Informatics, University of Rijeka Omladinska 14, 51000 Rijeka, Croatia mbrkic@uniri.hr Sanja Seljan Department of Information Sciences Faculty of Humanities and Social Sciences, University of Zagreb Ivana Lučića 3, 10000 Zagreb, Croatia sanja.seljan@ffzg.hr BoĆŸena BaĆĄić Mikulić Primary school “Turnić” Franje Čandeka 20, 51000 Rijeka, Croatia bozena.basic@gmail.com Summary Translation process is one aspect of human creativity. Due to globalization, EU accession negotiations, and the need for information exchange, the amount of translation work increases on a daily basis. The translation process is hindered by the fact that the languages involved differ culturally, stylistically, syntacti- cally and lexically. This paper explores the benefits and limitations of TMs (translation memories). TMs are not used for replacing humans in the transla- tion process, but rather for enhancing the human translation process. In this paper, a detailed analysis of Atril’s DĂ©jĂ  Vu X system is presented, along with its time-saving implications, which are based on the reuse of previously stored segments. Excerpts from three different digital camera user manuals are trans- lated from English into Croatian. Evaluation is performed by measuring the time difference between human and TM-based translation speeds in prepara- tion, translation, and revision phases, and with regard to six different parame- ters. Key words: Translation Memory (TM), DĂ©jĂ  Vu, Computer-Assisted Transla- tion (CAT), language pair, translation unit (TU), translation speed Introduction The EU relies on the principles of open access to documents, multilingualism and democracy. Therefore, the EU legislation needs to be translated into each of
  • 2. INFuture2009: “Digital Resources and Knowledge Sharing” 354 the official languages. On the other hand, the legislation of each particular member state needs to be translated into one of the EU’s official languages (Seljan & Pavuna, 2006b). To sum up, translation demands in the EU surpass human capacities. There are 23 official languages with 23 x 22 = 506 language pairs. A huge number of pages need to be translated on a daily basis. Short deadlines, demands for consistency and data-sharing, and insufficient number of translators further impede the translation process, particularly for newly admit- ted countries (Seljan & Pavuna, 2006a). Nowadays, fortunately, translators have fully automatic MT systems and CAT tools at their disposal (ValderrĂĄbanos, 2003). Moreover, the usage of such tools has been recommended by the Direc- torate General for Translation of the European Commission (DGT), which is European Commission’s in-house translation department (Seljan & Pavuna, 2006a). Depending on their needs, translators can opt for MT or CAT tools. MT was first conceived as a technology that significantly speeds up the transla- tion process and offers human-like quality translations. Soon, it became clear that such a goal is far-fetched (ValderrĂĄbanos, 2003), which led to the develop- ment of TM technology. Current computational models of MT are limited to tasks for which rough translations are adequate, tasks where human post-editors are used and tasks limited to sub-domains in which fully automatic high quality translations are achievable (FAHQT) (Jurafsky & Martin, 2009). TMs, on the other hand, exploit machine memory for storing translated segments in order to reuse them in future translations. Their usability, therefore, increases with the size of the stored data. As Croatia’s EU accession negotiations are underway, it is high time for the de- velopment of Croatian language tools and resources. This paper explores the benefits of using TMs, in particular Atril’s DĂ©jĂ  Vu X (DVX) system, and pre- sents the results of a study in which English and Croatian are source and target languages, respectively. Translation memory TM technology is based on the notion of reuse of previously translated seg- ments. It is usually integrated into a system which has a terminology manage- ment module and a lexicon (ValderrĂĄbanos, 2003). This technology does not aim at replacing humans, but rather at enhancing the human translation process. A TM can be defined as a database which stores corresponding source and tar- get language translations, called translation units (TUs). Approaches There are two TM implementation approaches. Despite the differences in im- plementation, TMs are designed with the common purpose of storing previously translated material in an organized way, in order to present it to the user in fu- ture translations (Gow, 2003).
  • 3. M. Brkić, S. Seljan, B. BaĆĄić Mikulić, Using Translation Memory to 
 355 Sentence-based approach A sentence-based approach divides source and target language texts into corre- sponding TUs, which can be sentences, titles, subtitles or list entries. TUs are stored in a database and retrieved in future translations in cases of identical or similar TUs in new source texts. Sentence units are easy to identify if they start with capital letters and end with full stops (Gow, 2003). However, abbreviations or full stops which are not at the end of sentences pose problems. These prob- lems can be solved by defining new sentence delimitation rules (DĂ©jĂ  Vu, 2009). The main benefit of the sentence-based approach, compared to a charac- ter-string-in-bitext-based approach, is that exact matches are more likely to be relevant because sentence-based TMs represent an extreme form of high preci- sion, low recall search (Simard & Langlais, 2000). Fuzzy matching algorithms, on the other hand, are based on statistical models of similarity. Since these models are only loose approximations, the matching algorithms sometimes cre- ate useless matches, known as ‘noise’, or fail to generate matches, the phe- nomenon known as ‘silence’ (Bowker, 2002 in Gow, 2003). Character-string-in-bitext(CSB)-based approach A CSB-based approach involves storing of source texts and corresponding translations in a database. The resulting texts are called bitexts. Bitexts can be used for preparatory background reading. In this approach, identical character strings of any length are recognized and reused (Gow, 2003). Working with sentence segments, instead of entire sentences, has its advantages (Simard & Langlais, 2000). It enables identification of identical sentence segments or even several consecutive identical sentences at once (Macklovitch & Russell, 2000 in Gow, 2003). ‘Noise’ phenomenon is still present, but this time as a result of finding unreliably small matches. ‘Silence’, on the other hand, occurs because there is no support for fuzzy matching. One of the disadvantages of this ap- proach is that internal repetitions have to be recycled exclusively through termi- nology databases, because only entire translations are added to databases (Gow, 2003). Advantages Two major advantages of TM technology are consistency and speed. Consis- tency is of crucial importance in non-literary texts, for example software and hardware manuals (ValderrĂĄbanos, 2003), or business, legal, scientific and technical texts (Gow, 2003). These texts are highly repetitive. The longer they are, the more likely they are to contain repetitive content (AustermĂŒhl, 2001 in Gow, 2003). Repetition can occur, not only internally, but also across several texts in the same domain (Gow, 2003). Moreover, software documentation, be- sides being highly repetitive, is subject to frequent version updates. It is thus an ideal candidate for exploiting TM benefits (Bruckner, 2001). Consistency is es-
  • 4. INFuture2009: “Digital Resources and Knowledge Sharing” 356 pecially important in cases where several translators work on the same project and share the same TM on the network (O’Brien, 1998 in Gow, 2003). Speed is important, regardless of the domain, because globalization has brought forward endless translation demands (ValderrĂĄbanos, 2003). Using TMs in the translation process implies cost reduction. For example, the translation process can be started as soon as the first draft of the document to be translated is ob- tained. Furthermore, translation vendors can lower prices and thus earn more contracts (Gordon, 1997 in Gow, 2003). On the other hand, freelance translators can save up their valuable time or increase their earnings by increasing the translation speed (Gordon, 1996 in Gow, 2003). In addition, using TMs preserves original page layouts in translated documents, because formatting information is hidden in embedded codes (Seljan & Pavuna, 2006a). Finally, TMs are usually integrated into systems which have tools for building dictionaries (Webb, 1998) and reporting detailed statistics on internal and exter- nal repetition and word counts. These tools can help project managers in sched- uling localization products (Esselink, 2000). Disadvantages Although TMs bring a lot of advantages, there are also some limitations of their usage. First of all, using TMs implies initial decrease in productivity because transla- tors need to master the environment (Webb, 1998). The odds of finding quality matches increase with the size of TMs (Gow, 2003). Moreover, the beneficiary effects of using TMs are felt only on repetitive texts (ValderrĂĄbanos, 2003). Therefore, cost-effects of investing additional time into importing existent translations through the process of alignment should be calculated (Seljan & Pavuna, 2006a). Although, according to Esselink (2000), TMs indisputably save time, regular database maintenance is time-consuming (AustermĂŒhle, 2001 in Gow, 2003). Furthermore, source texts need to be in digital form (Gow, 2003) and suitable file formats because not all formats are supported since TM systems require filters to preserve formatting. As an effect, they are usually bundled only with filters for most commonly used formats (Esselink, 2000). TMs affect quality of the translation because, by using them, translators tend to avoid using anaphoric or cataphoric references in order to make segments more ‘universal’. Seljan and Pavuna (2006a) add lack of language knowledge and context insen- sitivity to the list of drawbacks and point out additional software, maintenance and education costs. There are also other concerns with regard to using TMs. For example, it is questionable whether translators should be paid differently for identical and
  • 5. M. Brkić, S. Seljan, B. BaĆĄić Mikulić, Using Translation Memory to 
 357 fuzzy matches recovered from TMs. Furthermore, the ownership of final TMs is also unclear. DĂ©jĂ  Vu X DVX is a very powerful and adaptive CAT system which integrates several CAT tools – lexicon, terminology database, TM, alignment module, etc. The first version of this system appeared in 1993. DVX is an example of a hybrid approach. Matches are ranked into the following categories: perfect/exact match, fuzzy match, guaranteed match (there is overlapping of neighbouring TUs as well) and assembled from portions match. It features auto-search, as- sembling, propagation and pre-translation. Besides, it gives a detailed statistical analysis of source and target language texts. Terminology database is the only database that needs to be manually filled and it allows linguistic enrichment of inserted terms. A list of lexicon entries is automatically built. However, entries which are to be included need to be manually translated. DVX combines mod- ern TM technology with example-based machine translation (EBMT). EBMT implies translation by analogy and enables combining several segments into one translation segment (DĂ©jĂ  Vu, 2009). According to a survey, which included 699 translators from 50 different coun- tries, DVX was the second TM on the list of popularity, meaning that 61% of translators were acquainted with the system. On the usage list, it was the fourth. To be precise, there were 23% of translators using the system (Laugodaki, 2006). The statistics show that the system is preferred by translators with higher level of information literacy. DVX scored better than competitors’ systems in functionality, efficiency, speed, reliability, price, and usability. According to the survey, it also had better customer support. Experimental study The feature to be examined in this study was the speed of the translation proc- ess, with the goal of measuring the time difference between human and TM- based translation speeds (Bruckner, 2001). The following parameters were taken into account: ‱ Measure (minutes), ‱ Evaluation procedure (comparing times needed by each translator to de- liver translation), ‱ Score (time needed for the translation process), ‱ Metric (faster / slower), ‱ Languages (English – source language, Croatian – target language), ‱ Text type (hardware documentation), and ‱ System used (DĂ©jĂ  Vu). The study was conducted by two expert translators, who translated three ex- cerpts from Kodak’s, Nokia’s and Canon’s digital camera user manuals, re-
  • 6. INFuture2009: “Digital Resources and Knowledge Sharing” 358 spectively. Each excerpt was two standard pages long and contained informa- tion on battery and relating equipment care and maintenance. Therefore, each excerpt contained terms and phrases from the same domain. The excerpts were translated from English into Croatian. The experimental study was conducted in the following phases: 1. Text selection phase 2. Preparation phase (lexical analysis of the source language texts and their technical registers) 3. Translation phase (setting up environment, translating, building up lexi- con, filling terminology database) 4. Revision phase (post-editing) Preparation phase For all the excerpts, the preparation phase was performed jointly by the two translators. The lexical analysis of the source language texts (English) lasted 45, 10 and 18 minutes, respectively. Translation phase Prior to the translation process, the translator using the TM spent 5 minutes set- ting up the environment. The translator had some previous experience with DVX translation memory. After the text was inserted into DVX, the system cal- culated that the internal repetition was 6 per cent. The translation speed in the first translation (Kodak) was identical for both translators (37 minutes). After the first text was translated, the TM translator had to manually build up the system’s lexicon, which lasted 15 minutes and included 68 entries. The in- serted entries were mostly nouns in nominative singular and plural, and mascu- line adjectives in singular. Filling the terminology database lasted 10 minutes. Only 18 phrases were added due to Croatian rich morphological system. The second excerpt (Nokia) first underwent the pre-translation process. Besides the exact matches, fuzzy matches and parts assembled from portions were also allowed. After processing 69 source language sentences, the system found 6 sentences with 1 or more fuzzy matches and 31 sentences assembled from por- tions, some of which are presented in Figure 1. The traditional translation process for the second text lasted 31 minutes, while, with the help of the TM, it lasted 30 minutes. After the second text translation, 111 new entries were added to the lexicon and 13 new phrases were added to the terminology database. These processes lasted 15 and 10 minutes, respec- tively. The third excerpt (Canon) also underwent the process of pre-translation prior to the translation. The system processed 41 source sentences and found 1 sentence with a unique exact match and 32 sentences assembled from portions. The sys- tem found substantially more matches than it had in the second text (Figure 2).
  • 7. M. Brkić, S. Seljan, B. BaĆĄić Mikulić, Using Translation Memory to 
 359 Figure 1: Matches found by DVX when the second source text was inserted Figure 2: Matches found by DVX after the third source text was inserted The translation phase for the third text lasted 29 minutes for the traditional translation and 23 minutes with the TM. It took 5 minutes to add 45 new entries to the lexicon, and 3 minutes to add 7 new phrases to the terminology database.
  • 8. INFuture2009: “Digital Resources and Knowledge Sharing” 360 Revision Phase The revision phase for the three translations in the traditional translation process lasted 5, 7, and 5 minutes, respectively, while in the TM translation process it lasted 5, 8, and 5 minutes, respectively. Evaluation According to the automation level, evaluation methods can be divided into automatic and manual. Although time-consuming, manual methods better suit real-life system application. They can be implemented in two different ways. One way is that a translator scores each segment on a 1-4 scale from “absolutely useless” to “no changes needed”. The other way includes modifying each re- sulting segment into acceptable translation and counting post-editing steps needed (HodĂĄsz, 2006). The results of a 1-4 scale test performed on the third translation are presented in Chart 1. Since the TM is still under development, these are only initial results. With the growth of the TM, the increase in the percentage of segments classi- fied as ‘few changes needed’ or ‘no changes needed’ can be expected. Chart 1: Effectiveness of TM 20% 71% 7% 2% absolutely useless many changes needed few changes needed no changes needed Discussion The search process gives the highest priority to the TM matches and the lowest to the lexicon matches, with the terminology database matches in between. Nevertheless, the user is supplied with all the matches in a separate window, which enables them to choose the most appropriate match for the given context. Most of the problems with DVX regard morphology and word order. A word extracted from the lexicon which is a masculine adjective needs to be changed into feminine or neuter in order to match the context syntactically. The same is valid for verbs, since the lexicon contains their infinitive form. Furthermore, words are extracted in order of appearance and they often need to be rearranged because of the syntactic rules of the target language. There are also capitalization issues which need to be resolved. For example, if there is a word which was the first word in a previously stored segment, it re- mains capitalized regardless of its position in subsequent occurrences.
  • 9. M. Brkić, S. Seljan, B. BaĆĄić Mikulić, Using Translation Memory to 
 361 Additionally, punctuation is also retained if the word found in one of the re- sources is followed by a punctuation mark. However, one of the main advantages of using the TM is the user interface, which enables the translator to see parallel sentences or units in the same row. That eases the translation process because the translator does not have to spend a lot of time inserting or deleting units of the source text and scanning through both texts. Speed of translation process Since the main advantage of any TM system should be speeding up the transla- tion process, here follow the results of this case study. Table 1 presents time (expressed in minutes) needed for each phase (preparation phase has been omitted because it was done jointly by the two translators). Table 1: Time spent for each phase Translation phase (minutes) Lexicon and terminology database filling (TM) (minutes) Revision phase (minutes) Total (minutes) Traditional 37 / 5 421st text TM 37 25 7 69 Traditional 31 / 5 362nd text TM 30 25 8 63 Traditional 29 / 5 343rd text TM 23 8 5 36 Chart 2: Time spent in translation process 37 29 37 23 31 30 0 5 10 15 20 25 30 35 40 First text Second text Third text Time(minutes) Traditional translation TM It is evident that the TM translator becomes faster than the traditional translator in the second, and even more, in the third translation. Since the TM was empty prior to the first translation, the translation phase of the first excerpt was of the same length for both translators. As DVX grows in size, the TM translator be-
  • 10. INFuture2009: “Digital Resources and Knowledge Sharing” 362 comes increasingly faster than the traditional one (Chart 2). The difference in speed is the most obvious in the third translation, where the TM translator is 6 minutes faster. Taking into account the time which the TM translator needs to spend in the process of filling the lexicon and the terminology database, it is evident that using TM systems can be somewhat time-consuming in the beginning (Chart 3). Chart 3: Total time for traditional translation and TM 42 36 69 63 34 36 0 15 30 45 60 75 First text Second text Third text Time(minutes) Traditional translation TM Nevertheless, it is quite plausible that the time which the TM translator spends in filling the lexicon and terminology database gradually decreases because both databases have smaller number of entries to be added in, while the time needed for human translation remains the same. Therefore, the time difference for the TM translator and the traditional translator becomes less significant. Conclusion This paper presents a detailed analysis of the benefits of the Atril’s DĂ©jĂ  Vu X translation memory system, with its time-saving implications based on the reuse of previously stored translation units. In this case study, segments from three different digital camera user manuals are translated, with the aim of presenting how TMs ensure consistency in non-liter- ary texts. After measuring the time difference between human and TM-based translation speeds and taking into account 6 parameters, those being measure, evaluation procedure, score, metric, languages and text type, it can be con- cluded that TMs speed up the translation process, especially in later phases, i.e. when texts show certain level of local or global repetition. Nevertheless, even though TMs unquestionably save time, regular lexicon and terminology data- base maintenance is still time-consuming. However, this task does not take as much of translator’s time as databases grow in size. On the other hand, lack of knowledge asks for intensive work in the revision phase. Even so, TMs repre- sent a valuable resource, especially when several translators work in the same
  • 11. M. Brkić, S. Seljan, B. BaĆĄić Mikulić, Using Translation Memory to 
 363 domain, and aim to produce fast, consistent and professional-quality transla- tions. Acknowledgments This work has been supported by the Ministry of Science, Education and Sports of the Republic of Croatia, under the grants No.130-1300646-0909 and No.009- 0361935-0852. References Bruckner, Christine; Plitt, Mirko. Evaluating the Operational Benefit of Using Machine Transla- tion Output as Translation Memory Input. // Proceedings of the VIII MT Summit. Spain, Santiago de Compostela, 2001 DĂ©jĂ  Vu – Translation memory and productivity system. http://www.atril.com/ (July 25, 2009) Esselink, Bert. A Practical Guide to Localization. Amsterdam/Philadelphia : John Benjamins Publishing Company, 2000. Gow, Francie. Metrics for Evaluating Translation Memory Software. M.A. thesis. University of Ottawa, Canada, 2003. http://www.chandos.ca/Metrics_for_Evaluating_Translation_Memory_Software.pdf (August 3, 2009) HodĂĄsz, GĂĄbor. Evaluation Methods of a Linguistically Enriched Translation Memory System. // Proceedings of the Fifth International Conference on Language Resources and Evaluation. Italy, Genoa, 2006, 2044-2047 Jurafsky, Daniel; Martin, James H. Speech and language processing : an introduction to natural language processing, computational linguistics, and speech recognition. New Jersey : Pearson education, 2009 Lagoudaki, Elina. Translation Memories Survey 2006: Users’ perceptions around TM use // Pro- ceedings of the ASLIB International Conference Translating & the Computer 28. UK, Lon- don, 2006. Seljan, Sanja; Pavuna, Damir. Translation Memory Database in the Translation Process. // Pro- ceedings of the 17th International Conference on Information and Intelligent Systems IIS 2006 / Aurer, B.; Bača, M. (ed.). Croatia, VaraĆŸdin : FOI, 2006a, 327-332 Seljan, Sanja; Pavuna, Damir. Why Machine-Assisted Translation (MAT) Tools for Croatian? // Proceedings of the 28th Conference on Information Technology Interfaces – ITI 2006. Croa- tia, Cavtat, 2006b, 469-474 Simard, Michel; Langlais, Philippe. Sub-sentential Exploitation of Translation Memories. // LREC 2000 Second International Conference on Language Resources and Evaluation. Greece, Athens, 2000 ValderrĂĄbanos, Antonio S.; Esteban, JosĂ©; Iraola, Luis. TransType2 - A New Paradigm for Translation Automation. // Proceedings of the MT Summit IX. New Orleans, Louisiana, 2003, 498-501 Webb, Lynn E. Advantages and Disadvantages of Translation Memory : A Cost/Benefit Analysis. M.A. thesis. Monterey Institute of International Studies, Monterey, California, 1998. http://www.tradulex.org/Bibliography/Webb.htm (October 31, 1998)