BigTM.net at MT Summit XII



  1. BigTM.net: Gaining a Head Start on Translation Projects
     Achim Ruopp
     Digital Silk Road
     achim@digitalsilkroad.net
  2. A New Translation Project
     “The Bundle Crusher is a sturdy machine with moving parts driven by electric motors, pneumatics, and hydraulics.”
     “The downstream platen of the compression bridge may move side-to-side unexpectedly and strike personnel in its path.”
  3. A New Translation Project: Bi-Lingual Terminology
     Criteria
       • In domain
       • Correct
       • Current
       • In context
     Sources
       • Translation memory from previous projects
       • Domain dictionaries
  4. BigTM.net Custom Translation Search
     What is it?
       • Custom translation search engine
       • Input: project source text
       • Searches the web for similar bi-lingual pages
       • Indexes discovered bi-lingual pages
       • Provides search UI
       • Automated search and indexing overnight
     Current Language Pairs
       • English - French / Italian / German / Spanish
  5. BigTM.net Project Page
  6. BigTM.net Search Results
  7. Terminology Criteria
     Don’t find as many examples as possible; find the right examples
  8. Privacy
     Never shared
       • Source text
       • Customized index
       • Terms
       • Dictionary
     User management built-in
       • Grant/revoke rights for translators
     Public: general index of found pages
       • No association to projects possible
  9. BigTM.net Architecture
     [Diagram; labels: Source Text, Search Engine, BigTM.net, Extracted Terms, Candidate Pairs, Parallel Content, Search Index, The Web, Search UI]
  10. BigTM.net Data Flow
      [Diagram; labels: Matcher, Searchable Index, Classifier, MT System, CAT Tool, Aligner]
  11. Pilot Project Statistics
  12. Integration with Translation Tools
      Tools with web search integration
      Dictionary download
        • CSV format with probabilities
      Translation memory download
        • Standard TMX format
      Open Issues
        • IP (BigTM.net respects robots.txt)
        • Most efficient segmentation: sub-segment matching?
  13. Training of domain-specific statistical MT systems
      Supplement general domain corpus with domain-specific corpus downloaded from BigTM.net
      KDE documentation prototype English-German
  14. Benefits Summary
      Automatically searches the web for similar bi-lingual pages
      Provides a searchable index
      Rapidly prototypes terminology
      Provides a core translation memory
      Training of domain-specific machine translation systems
  15. Beta coming soon!
      Website: http://www.bigtm.net/
      Email: bigtm@digitalsilkroad.net
      Email your source text to get added to the beta program
      Limited Beta starting September 15th
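
Slide 12 mentions translation memory download in standard TMX format. The slides do not show how BigTM.net produces that file, so the following is only a minimal sketch of what writing aligned segment pairs as TMX 1.4 could look like. The function name `build_tmx`, the header values, and the sample German text are illustrative assumptions, not part of BigTM.net.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the reserved xml:lang attribute with its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="de"):
    """Return a TMX 1.4 document (as a string) for (source, target) pairs.

    Hypothetical helper for illustration; header values are placeholders.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-exporter",   # assumed tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        # One translation unit per aligned pair, one variant per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Illustrative aligned pairs; the German side is sample text only.
pairs = [
    ("Provides a searchable index", "Stellt einen durchsuchbaren Index bereit"),
    ("Provides search UI", "Bietet eine Suchoberfläche"),
]
tmx_text = build_tmx(pairs)
```

A CAT tool that imports TMX should be able to read a file written from `tmx_text`, though real exports usually carry more metadata (creation dates, source URLs) than this sketch includes.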

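
Slide 12 also lists IP as an open issue and notes that BigTM.net respects robots.txt. The slides give no detail on how the crawler does this; as a rough sketch, candidate pages could be filtered with Python's standard `urllib.robotparser` before they are fetched. The user-agent string, the sample rules, and the URLs below are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleCrawler"  # hypothetical; not BigTM.net's real user agent


def make_parser(robots_txt):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs the robots.txt rules permit for this user agent."""
    return [u for u in urls if parser.can_fetch(user_agent, u)]


# Sample rules and candidate pages, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""
candidates = [
    "http://example.com/docs/manual_en.html",
    "http://example.com/private/draft_en.html",
]
parser = make_parser(SAMPLE_ROBOTS)
crawlable = allowed_urls(parser, candidates)
```

In a live crawler the rules would be fetched per host (for example with `RobotFileParser.set_url` and `read`) instead of being parsed from an inline string.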