Transzaar - CAT Tool for Indian Languages including English and Arabic


Transzaar is an AI-powered language service platform developed by eBhasha Setu that streamlines language processing tasks such as translation, transliteration, localization, and text analysis. It gives human translators the linguistic tools and resources they need to deliver high-quality translations with a better turnaround time.

  1. Transzaar: Empowers Human Translators
     Rashid Ahmad, Priyank Gupta, Nagaraju Vuppala, Sanket Kumar Pathak, Ashutosh Kumar, Gagan Soni, Sravan Kumar, Manish Shrivastava, Avinash K Singh, Arbind K Gangwar, Pawan Kumar, Mukul K Sinha
     IIIT Hyderabad, India; eBhasha Setu, India
     The 18th International Conference on Computational Science and Its Applications (ICCSA 2018), Monash University, Melbourne, Australia, 2-5 July 2018
     For more details visit at
  2. Outline
     • Introduction
     • Related Work
     • Background and Motivation
     • Transzaar: Problems, Solutions and Features
     • Evaluation and Usage
     • Conclusion and Future Work
  3. Introduction
     • Digital content on the Internet
       – Accessible from devices (desktop, mobile, tablet) through telecom data service providers.
       – Understandability is a big issue because content is not available to people in their native language.
     • Content creation in native/local languages
       – Writing fresh content.
       – Translating existing content into local languages.
     • Transzaar: an AI-powered CAT tool
       – It offers Computer Aided Translation (CAT) functionality.
       – With Transzaar, translators can post-edit machine translation (MT) output easily and deliver with a better turnaround time.
  4. Related Work
     • The ALPS system, developed in 1981, was the first CAT tool designed for personal computers.
     • CASMACAT is a modular, web-based translation workbench that offers advanced functionalities for computer-aided translation.
     • PET is a post-editing tool that can be used to evaluate translation quality in terms of how much effort the translations require to be fixed.
     • MateCat is a tool whose objective is to improve the integration of machine translation (MT) and human translation within the so-called computer-aided translation (CAT) framework.
     • Anubis provides a detailed description of one of the state-of-the-art CAT systems.
     • Commercially available CAT tools include SDL Trados, Wordfast, Memsource, DejaVu, and SmartCat.
  5. Background and Motivation
     • MT systems do not produce publishable-quality content.
     • Hence MT as a technology, far from replacing humans, should be considered a tool in the hands of human translators.
     • This observation motivated our team to build a new tool, Transzaar, whose prime function is to assist human translators.
  6. Transzaar CAT Tool
     • Transzaar: an AI-powered CAT tool
       – It has linguistic tools and resources and a feedback loop.
       – It aids the process of post-editing, increasing the productivity of a human translator two- to three-fold.
       – It collects user feedback continuously; the new data generated this way helps the MT system and the tool to learn and improve periodically.
  7. Transzaar: Translator Problems
     • Mechanical tasks
       – Searching lexical databases and managing translations.
     • Repetitive tasks
       – Handling named entities and terminology.
     • Cognitive tasks
       – Translating from scratch is a recall task. For the human brain, “Recognition is easy; Recall is hard”.
  8. Transzaar: Solution Techniques
     • Search
       – It provides one-click search across multiple lexical resources for a selected/highlighted word or term.
     • Visual alerts
       – While the human translator post-edits the machine-translated content, all named entities and terminology are flagged in the editing pane by visual alerts.
     • Recall to recognition
       – It is easier for the translator to select from the most likely correct results (or to formulate a translation from an approximate result) than to provide a solution on a blank slate.
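The visual-alert idea from the Solution Techniques slide can be illustrated with a minimal sketch. It is not Transzaar's implementation: the glossary contents, the function names `flag_terms` and `mark_up`, and the `[[...]]` marker are all assumptions made for illustration, and the matching is plain case-insensitive lookup against a term list rather than real named entity recognition.

```python
import re

# Hypothetical glossary: source term -> suggested target rendering.
GLOSSARY = {
    "machine translation": "मशीन अनुवाद",
    "Hyderabad": "हैदराबाद",
}

def flag_terms(segment, glossary):
    """Return sorted (start, end, matched_text, suggestion) tuples for glossary terms."""
    hits = []
    for term, suggestion in glossary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, segment, flags=re.IGNORECASE):
            hits.append((m.start(), m.end(), m.group(0), suggestion))
    return sorted(hits)

def mark_up(segment, hits):
    """Wrap each flagged span in [[...]] to stand in for a visual alert."""
    out, cursor = [], 0
    for start, end, text, _ in hits:
        if start < cursor:
            continue  # skip a hit that overlaps one already marked
        out.append(segment[cursor:start])
        out.append("[[" + text + "]]")
        cursor = end
    out.append(segment[cursor:])
    return "".join(out)

segment = "Machine translation research at Hyderabad"
print(mark_up(segment, flag_terms(segment, GLOSSARY)))
```

Returning character offsets rather than a marked-up string keeps the flagging step separate from how an editor chooses to render the alert, and the suggestion carried with each hit is what turns a recall task into a recognition task.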
  9. Transzaar Features
     • Editor kit
       – Cut/copy/paste, find/replace, predictive typing, segmentation, etc.
     • Resources integration
       – Multiple MT systems, translation memories (TM), and various other resources can be integrated.
     • Concordance lookup
       – Concordance lookup in a parallel aligned corpus, on the source or the target language side.
     • Web-based
       – Platform independent; every user can access it from anywhere, at any time.
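As a rough illustration of the concordance lookup feature, the sketch below searches one side of a small sentence-aligned corpus and returns each match with its surrounding context and the aligned sentence from the other side. The corpus, the function name `concordance`, and its `side` and `width` parameters are hypothetical; a production tool would typically use an index rather than a linear scan.

```python
import re

# Hypothetical sentence-aligned corpus: (source sentence, target sentence).
CORPUS = [
    ("The bank approved the loan.", "बैंक ने ऋण स्वीकृत किया।"),
    ("He sat on the river bank.", "वह नदी के किनारे बैठा।"),
    ("Open the file.", "फ़ाइल खोलें।"),
]

def concordance(corpus, query, side="source", width=30):
    """Return (left_context, match, right_context, aligned_sentence) for each hit.

    side selects which half of each pair is searched: "source" or "target".
    width is the number of context characters kept on each side of the match.
    """
    if not query:
        return []
    results = []
    pattern = re.compile(re.escape(query), flags=re.IGNORECASE)
    for src, tgt in corpus:
        text, other = (src, tgt) if side == "source" else (tgt, src)
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            results.append((left, m.group(0), right, other))
    return results

for left, match, right, aligned in concordance(CORPUS, "bank"):
    print(f"{left}[{match}]{right}  ->  {aligned}")
```

Showing the aligned sentence next to each hit lets a translator see how the same word was rendered in different contexts, here the financial and the riverside sense of "bank".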
  10. Transzaar Distinct Features
     • Text analysis (pre-processing)
       – Extraction of collocations, terminology, named entities (NEs), multiword expressions (MWEs), etc.
     • Transliteration
       – Transliteration of foreign-language terms (mostly proper nouns) or technical terms into the target language script.
     • Spell checker
       – Spell-checking tool for major Indian languages.
     • Customization
       – Translators can bring in their individual creativity through customized lexical resources or tools.
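The slides do not say how the text analysis step extracts collocations. One common approach, shown here purely as an assumed stand-in, scores adjacent word pairs by pointwise mutual information (PMI) and keeps the pairs that occur often enough. The function name `collocations`, the `min_count` threshold, and the whitespace tokenization are illustrative choices only.

```python
import math
from collections import Counter

def collocations(sentences, min_count=2):
    """Rank adjacent word pairs by PMI, keeping pairs seen at least min_count times."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scored.append(((w1, w2), math.log2(p_pair / p_independent)))
    # Highest PMI first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored

sample = [
    "machine translation helps translators",
    "machine translation is fast",
    "the translators work fast",
]
for pair, pmi in collocations(sample):
    print(pair, round(pmi, 2))
```

The frequency threshold matters because PMI overrates pairs seen only once; on real text a larger `min_count` and language-aware tokenization would be needed, especially for Indian-language scripts.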
  11. Evaluation and Usage
     • The Transzaar CAT tool is currently used by 65+ users.
     • It is used by a language service startup that provides translation services among Indian languages.
     • For an experienced, highly skilled translator, the tool improves productivity more than two-fold.
     • Because the tool integrates its own MT systems for some language pairs (accuracy > 83%) at the back end, we see a greater improvement in productivity for those language pairs.
  12. Conclusion and Future Work
     • We have introduced Transzaar, a next-generation AI-powered tool that offers computer-aided translation functionality to human translators and helps to improve their productivity.
     • It also helps a human translator improve the fluency and accuracy of machine-translated content so that it matches the naturalness of a native speaker.
     • In the future, our team plans to extend the tool with a factory mode of translation that supports multiple kinds of translation workflow.
  13. Thank You
     Questions and queries?