MT at Yandex: Overview and Ways We Use it in Localization. Farkhat Aminov, Yandex


This presentation gives an overview of Yandex's machine translation technology and how it has evolved over the three years of its existence. Farkhat will talk about lessons learned and success stories, as well as prospects and plans for the future. Farkhat will also present the localization case as a best practice in implementing MT technology.

Published in: Technology, Business

  1. TAUS ROUNDTABLE 2014, 22 May, Moscow (Russia)
  2. THURSDAY, 22 May, 12:10 – 12:40. MT at Yandex: Overview and Ways We Use It in Localization. Farkhat Aminov, Yandex.
  3. MT at Yandex: Overview and Ways We Use It in Localization
  4. What is Yandex.Translate: phrase-based SMT for 42 languages; online translation service (24/7, 100 Gb daily); translation application for mobile platforms; full-featured API; and, besides that, a team of researchers, developers, linguists, engineers, designers and managers.
  5. Milestones. 2009: the beginning of experiments with SMT. 2010: alpha version of our own SMT system for the English-Russian pair. 2011: beta version of the system on translate.yandex.ru. 2012: API, machine dictionaries, synonyms, mobile applications. 2013: support for all European languages. Today: Asian languages.
  6. Languages (cumulative total in parentheses). Mar 2011 (3): +Russian, +English, +Ukrainian. Sep 2011 (5): +Polish, +Turkish. Aug 2012 (9): +German, +Spanish, +Italian, +French. Mar 2013 (13): +Bulgarian, +Romanian, +Serbian, +Czech. May 2013 (17): +Belorussian, +Dutch, +Danish, +Swedish. Jun 2013 (19): +Portuguese, +Croatian. Sep 2013 (33): +Estonian, +Latvian, +Lithuanian, +Armenian, +Azerbaijani, +Greek, +Slovak, +Slovenian, +Albanian, +Catalan, +Macedonian, +Finnish, +Hungarian, +Norwegian. Dec 2013 (36): +Arabic, +Hebrew, +Georgian. Apr 2014 (42): +Vietnamese, +Indonesian, +Malay, +Maltese, +Bosnian, +Icelandic.
  7. How we coped
  8. Quality. Quality differs: similar languages; vast linguistic resources; complex languages; language differences; small corpora.
  9. Quality: Examples. RU: На Петербургский международный экономический форум каждый год собираются руководители крупнейших российских и иностранных компаний, главы государств и политические лидеры. UK: На Петербурзький міжнародний економічний форум щороку збираються керівники найбільших російських і закордонних компаній, глави держав і політичні лідери. EN: At the St. Petersburg international economic forum every year gather heads of major Russian and foreign companies, the heads of state and political leaders. DE: Auf St. Petersburg International Economic Forum jedes Jahr versammeln sich die Führer der größten Russischen und ausländischen Unternehmen, die Staats-und die politischen Führer. TR: Üzerinde Petersburg uluslararası ekonomik forumu her yıl toplanır yöneticileri, büyük rus ve yabancı şirketlerin, devlet başkanları ve siyasi liderler.
  10. Quality: Evaluation. Main metric: BLEU. Additional metrics: OOV rate. Experiments: METEOR, HMEANT. Manual evaluation: adequacy / fluency. Test sets (manual or automatic) for all translation models.
  11. Quality: Improvement. More data, more experiments, more research.
  12. Users and Use Cases
  13. Yandex.Translate: Web
  14. Yandex.Translate: Web. [Chart] Monthly active users: over 6 000 000. Daily translations: 3 000 000.
  15. Yandex.Translate: Mobile Apps. iOS, Windows Phone, Android.
  16. Yandex.Translate: Mobile Apps. [Chart, Jun 2013 to Apr 2014] Monthly active users: 500 000. Installs: 1 300 000.
  17. Yandex.Translate: Topics. Language pairs: EN-RU, DE-RU, PL-RU. Topics: web page texts, news, educational texts, correspondence, literature, cooking recipes, documentation, song lyrics / poems, general subjects.
  18. Yandex.Translate: Languages. [Chart] Shares: 80.80%, 5.90%, 3%, 3.40%, 1.60%, 1.50%, 1.50%, 0.80%, 0.40%. Legend: E…, G…, U…, F…, I…, T…, S…, P…, V…
  19. Yandex.Search
  20. Yandex.Mail
  21. Yandex.Browser
  22. Yandex.Browser. [Chart: daily traffic (GBs)] Web page translations per day: 6 500 000. Text fragment translations per day: 1 000 000.
  23. Internal Services: interface localization, wiki translation, documentation localization.
  24. Interface Localization. Machine-translated texts approved by human translators. Project 1 (short texts), Project 2 (long texts). [Chart] Labels: Uk…, Uk…, En…, En…. Values: 78.4%, 23.2%, 46.6%, 16.0%.
  25. Documentation Localization. [Chart, 0% to 100%] Ranges shown: 10-50%, 20-60%, 25-50%. GOOD: usable segment. EDITABLE: understandable translation that needs to be edited. BAD: unusable translation of segment.
  26. API
  27. API: High Performance
  28. API: Clients. Mobile app developers; language learning services; text analysis technology; trading systems & e-commerce; others.
  29. API: Free and Paid Service. PAID SERVICE for business; FREE ACCESS for small projects.
  30. Thank you!
  31. Farkhat Aminov, aminov@yandex-team.ru, Project Manager, Yandex.Translate. Yandex.Translate: http://translate.yandex.com. API: http://api.yandex.com/translate. Questions: translate-business@support.yandex.com
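
The evaluation slide names OOV (out-of-vocabulary) rate as an additional metric but does not define it. The following is a minimal sketch, assuming the common token-level definition (the share of test tokens never seen in the training data) and plain whitespace tokenization. The function name `oov_rate` and the sample sentences are illustrative and not taken from the talk; the computation used at Yandex may differ.

```python
def oov_rate(train_tokens, test_tokens):
    """Return the fraction of test tokens that do not occur in the training tokens."""
    vocab = set(train_tokens)  # training vocabulary (assumed definition)
    if not test_tokens:
        return 0.0  # avoid division by zero on an empty test set
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


# Illustrative data only: whitespace-tokenized toy sentences.
train = "the forum gathers heads of major companies".split()
test = "the forum gathers political leaders".split()

rate = oov_rate(train, test)
print(f"OOV rate: {rate:.2f}")  # 2 of 5 test tokens are unseen -> 0.40
```

A high OOV rate on a test set for a given language pair would suggest that the training corpus lacks coverage, which is consistent with the "more data" item on the improvement slide.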
