TaaS Workshop 2014, Terminology as a Service, Indra Samite, Tilde


This presentation covers the cloud-based terminology services “TaaS: Terminology as a Service”, established within the EU-funded TaaS project. TaaS provides user-friendly, collaborative, multilingual terminology services for translation and localization businesses, among others:

· Search terminology in various sources
· Identify term candidates in your documents and extract them automatically
· Look up translation candidates in various sources
· Refine and approve terms and their translations
· Share your terminology with other users
· Collaborate with your friends & colleagues
· Use your terminology in other working environments

TaaS is available for open Beta testing at

The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no 296312

Published in: Technology, Business
  1. Wednesday, 4 June / 09:10 – 09:40. Terminology as a Service. Indra Samite, Tilde. TaaS Workshop 2014, 4 June, Dublin (Ireland). The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no 296312.
  2. TAUS TaaS Workshop, Dublin, 04.06.2014.
  3. What does a terminologist do?
  4. Terminology field: embracing innovation
  5. SaaS: Software as a Service
  6. PaaS: Platform as a Service
  7. TaaS: Terminology as a Service
  8. TaaS Partners
     * Tilde, Latvia (Coordinator)
     * TAUS, Netherlands
     * Kilgray, Hungary
     * University of Cologne, Germany
     * University of Sheffield, UK
  9. TaaS
     * Industry & Research Collaboration project
     * Supported by EU 7th R&I Framework Programme
     * Resulted in TaaS cloud-based services
     * Accessible for free at the online portal
  10. Invited speakers
     * Ioannis Iakovidis, Interverbum Technology
     * Uwe Muegge & Carl Yao, CSOFT International
     * Luigi Muzii, sQuid
     * Maria Pia Montoro, Intrasoft International, Terminology Blogger at WordLo
  11. Discussion, from 11:55
     * Little survey for warming up
     * Panel of speakers
     * Reflections from the audience
  12. Welcome to the Cloud! Terminology as a Service. Andrejs Vasiļjevs, Indra Sāmīte, Tilde. TAUS TaaS Workshop / Dublin / 04.06.2014.
  13. About Tilde
     * Language technology developer
     * Translation and Terminology systems
     * >350 000 users
     * Localization service provider
     * Leadership in smaller languages
     * Offices in Riga (Latvia), Tallinn (Estonia) and Vilnius (Lithuania)
     * 130 employees
     * Strong R&D team
     * 5 PhDs plus engineers and students, 80+ research papers
     * Coordinator of several EU industry-academic R&D projects
  14. EuroTermBank Portal
  15. Microsoft Language Portal
  16. ECDC Terminology Server
  17. Complexity of terminology work
     * Term identification in the source text
     * Consulting online databases and local files for translation equivalents
     * Creating and maintaining terminology glossaries
     * Sharing term glossaries and involving others in their polishing
     * Structuring data in the industry standard formats
     * Integrating term glossaries in CAT and other productivity tools
     * Keeping terminology up to date
     * etc.
  18. TaaS User Needs Survey Results: Importance of terminology work
     * Very important: 43.5%
     * Quite important: 39.9%
     * Less important: 14.8%
     * Not important: 1.8%
  19. TaaS User Needs Survey: willingness to share
     * Chart values: 60.5%, 39.5%
     * Yes, provided that… (chart values: 24.9%, 19.2%, 14.2%, 11.4%, 7.6%, 6.0%, 16.7%): Joint contribution to the DB, Access control, Legal aspects, External quality control, Little effort, Anonymity, Other
     * No, because… (chart values: 48.6%, 22.0%, 16.5%, 8.3%, 4.6%): Legal restrictions, Poor quality/Lack of time, Own asset, Risk of misunderstanding
  20. TaaS Mission
     * Simplify the process for language workers to prepare, store and share task-specific multilingual term glossaries
     * Provide instant access to term translation equivalents and translation candidates for professional translators
     * Improve quality of machine translation systems by dynamic integration of terminology data
  21. Cloud-based platform that automates the terminology work for human and machine use
  22. Key services of TaaS
     * Automatic extraction of monolingual term candidates from user-uploaded documents
     * Automatic retrieval of translation equivalents from different public and industry terminology databases
     * Translation candidate acquisition from multilingual web data
     * Facilities for users to clean up automatically acquired terminological data
     * Data sharing and integration facilities through APIs and export tools
  23. TaaS Services
  24. Term identification and annotation
  25. Integration
     * Support for industry standard formats
     * Integration into CAT and productivity tools
     * API to integrate TaaS services into various software applications
  26. TaaS in the service for MT
  27. [Diagram. Labelled elements: Online Terminology Services; Translation; Training; SMT System Training and adaptation; Online Translation Service; Input Text for Translation; Parallel corpus; Monolingual corpus; Bilingual term collections; Monolingual Term Extraction; Bilingual Term Extraction; Trained SMT Model; Translated Text]
  28. TaaS Architecture [diagram; labelled elements listed below]
     * Presentation Layer: Web Page UI, Public API
     * Application Logic Layer: Terminology collection management, User management, Terminology collection search, Terminology collection creation
     * Data Storage Layer (Shared Term Repository): Shared Term Repository DB, File Store
     * High-performance Computing (HPC) Cluster: SGE, HPC frontend, CPUs
     * External clients and resources: Web Browsers, CAT tools, MT, External TDBs
     * Connections: https REST, http/https html
     * Term extraction workflows: Full collection creation workflow, Monolingual collection creation, Translation candidate extraction, ....
     * Modules: Result processing, Collection Importer, Marked Text enrichment, Text tagging with terms, Statistical DB acquisition, Statistical DB feeding, Bilingual Term Extraction System, Parameter retriever, Translation lookup, ETB & STR, IATE, TAUS API, Statistical DB, Collection merger, Term extraction, TXT extractor, TWSC, Kilgray Term Extractor, Collection creator, Term normalizer
  29. Focus areas: Research, Development, Usage
     * Term extraction
     * Collection of domain specific multilingual corpora
     * Max(FTC)
     * Usability
     * Outreach
     * Sustainability
     * Quality
     * Performance
     * Scalability
     * Interoperability
  30. TaaS in action. Indra Sāmīte, Business Development Director, Tilde
  31. The Modern Translator
  32. Features
     * Search for individual terms in various sources
     * Identify term candidates in your documents and extract them automatically
     * Automated lookup of translation candidates in various sources
     * Refine and approve terms and their translations
     * Share your terminology with other users
     * Collaborate with colleagues & team
     * Use your terminology in other working environments
  33. Simple Search
  34. Simple Search
     * Search for terms in various sources
  35. Identify & extract
  36. Identify & extract
  37. Automated Lookup
  38. Automated Lookup
  39. Refine & Approve
  40. Refine & Approve
  41. Refine & Approve
  42. Share & Collaborate
  43. CAT integrated
  44. Machine translation by Tilde
  45. LetsMT
     * In the cloud
     * Do it yourself or Custom
     * CAT integrated
     * Terminology ready
     * Vast database of resources for training
     * Productivity boosting
  46. MT friendly
  47. Strategic impact
     * Trusted terminology resources
     * Sharing new terminology data
     * Reuse of terminology resources
     * Quality improvement of MT systems
     * Efficient work patterns
     * Increase competitiveness
     * Translation quality improvement
  48. Sign up now!
     * Free access to online services
     * Integrated with memoQ 2014
     * Integrated with OmegaT (July ’14)
  49. Thank You! The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement n° 296312. Contact: @TermServ on Twitter; Terminology Services group on LinkedIn
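
As a rough illustration of the "automatic extraction of monolingual term candidates" service listed among the key services of TaaS, the Python sketch below counts word n-grams that contain no stop words and keeps those that occur repeatedly. It is only a minimal sketch under stated assumptions: the function name `extract_term_candidates`, the tiny `STOP_WORDS` list, and the frequency threshold are hypothetical and are not taken from TaaS, whose actual extraction components are named in the architecture slide but not described.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "for", "to",
              "is", "are", "by", "on", "with"}


def extract_term_candidates(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs for frequent stop-word-free n-grams.

    Simplification: n-grams are built over the whole token stream, so they
    may cross sentence boundaries.
    """
    # Lowercase the text and keep alphabetic tokens, allowing inner hyphens.
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Skip any n-gram that contains a stop word.
            if any(tok in STOP_WORDS for tok in gram):
                continue
            counts[" ".join(gram)] += 1
    # Keep only candidates seen at least min_freq times.
    return [(term, freq) for term, freq in counts.most_common()
            if freq >= min_freq]


sample = ("Machine translation improves when terminology is consistent. "
          "Terminology for machine translation is shared in the cloud.")
print(extract_term_candidates(sample))
```

In a workflow like the one the slides describe, candidates of this kind would still be looked up in terminology sources and then refined and approved by a user; frequency counting alone does not decide what counts as a term.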