
intl me this, intl me that


What are the problems with translating your web site or application into other languages, and what are the best solutions? This presentation covers several PHP-based approaches to the problem, focusing on the new intl extension as well as other open source tools.

Published in: Technology

  1. intl me this, intl me that. Andrei Zmievski. IPC ~ May 26, 2009 ~ Berlin. Tuesday, May 26, 2009
  2. Who is this guy? Open Source Fellow @ Digg; PHP Core Developer since 1999; architect of the Unicode/i18n support; Release Manager for PHP 6; Twitter: @a; beer lover (and brewer)
  3. Why localize?
  4. One example.
  5. Another reason.
  6. Another reason.
  7. Why localize? English speakers are now a minority on the WWW. Nearly 3 out of 4 participants surveyed by Common Sense Advisory agreed that they were more likely to buy from sites in their own languages than in English. Global consumers will pay more for products with information in their language.
  8. Most important thing...
  9. No assumptions!
  10. No assumptions! English/German is just another language; USA/Germany is just another country; Earth is just another planet (eventually)
  11. i18n: PHP 5.3 or PHP 6; intl extension; consider all data processing and output points
  12. Locale data: Common Locale Data Repository (CLDR); 374 locales: 137 languages and 140 territories; updated regularly; used by the intl extension
  13. Translation: identifying what to translate; checking all sources; obtaining translation; iteration
  14. What to translate. Translatable units, e.g. "Continue" or "There were 5 search results". Approaches: automatic “rippers”; manual markup
  15. Sources: PHP. Anything destined for the output layer: single- and double-quoted strings; heredocs; error/exception messages (if seen by users); 404 pages, anyone?
  16. Sources: PHP. Use output buffering to detect misses; consider templates to enforce separation; don’t use extensions that cannot deal with UTF-8
  17. Sources: JS and CSS. Text; images; position or alignment of elements may change. Modularize locale-dependent code into separate files:
      <script src="/js/common.js" type="text/javascript"></script>
      <script src="/js/locale-<?php echo $LOCALE ?>.js" type="text/javascript"></script>
  18. Sources: DB. Strings are fine, if they will never be displayed to users; consider using constants/identifiers, e.g. not admin or user, but 1 or 2; for things like product titles, keep a separate table with translations and link it against the main one
  19. Sources: external. File-based content; RSS feeds; web services et al.
  20. Obtaining translations: fast, cheap, accurate
  21. Obtaining translations: you. (Maybe) fast and cheap, not accurate. "Not to perambulate the corridors during the hours of repose in the boots of ascension." (sign in an Austrian ski hotel)
  22. Obtaining translations: professionals. (Usually) accurate and fast, not cheap
  23. Obtaining translations: community. (Fairly) accurate and cheap, not fast
  24. Facebook approach: turn translation into a competitive activity; build it into the interface (just another app); validation via voting
  25. Iteration. Catching new units: mark up untranslated strings; use mnemonic identifiers, e.g. MENU.NAV.HELP. Merge/update tools
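One way to picture the "mark up untranslated strings" idea from slide 25 is a lookup helper that wraps any missing unit in visible markers and records its identifier. This is only a sketch: the `t()` function, the `[[...]]` markers, and the catalog contents are hypothetical and do not come from the talk or from any of the tools it covers.

```php
<?php
// Hypothetical catalog: mnemonic identifiers mapped to French strings.
$catalog = array(
    'MENU.NAV.HOME' => 'Accueil',
    'MENU.NAV.HELP' => 'Aide',
);

/**
 * Look up a unit by its mnemonic identifier. A missing unit is returned
 * wrapped in visible markers and recorded in $missing for later export.
 */
function t($id, array $catalog, array &$missing)
{
    if (isset($catalog[$id])) {
        return $catalog[$id];
    }
    $missing[$id] = true;
    return '[[' . $id . ']]';
}

$missing = array();
echo t('MENU.NAV.HELP', $catalog, $missing), "\n";   // translated unit
echo t('MENU.NAV.LOGOUT', $catalog, $missing), "\n"; // marked-up miss
```

After a test pass over the site, the keys of `$missing` are the units that still need to go to the translators.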
  26. Using translations: self-contained pages (masochistic). Standalone per-locale pages with no common root; “quick-n-dirty”; iteration? not so much
  27. Using translations: runtime. Uses translation storage and on-the-fly lookup; usually combined with caching
  28. Using translations: pre-generation (“baking”). Complete per-locale sites generated offline; no runtime lookups; may require runtime operations (sorting, etc.); could increase opcode cache memory requirements
  29. Considerations: fidelity; ease of use; performance; flexibility; portability
  30. Fidelity: UTF-8; don’t use tools that don’t support it
  31. Fidelity. How big should translatable units be? “As large as possible, but not larger.” Avoid the concatenation problem: There are <?php echo $nMesg ?> unread messages in <?php echo $nFolders ?> folders.
  34. Fidelity. Sometimes the largest possible unit is a word. Context is important: chinese (person) vs. chinese (language). Add context as part of the unit: chinese-person or CHINESE.PERSON
  35. Fidelity. Combining translations with runtime data (parametrization): There are %1 unread messages in %2 folders. sprintf() works for simple things; gettext() can help with plurals; MessageFormat + ChoiceFormat is better
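The parametrization point on slide 35 can be illustrated with plain sprintf() and numbered placeholders, which lets a translator reorder the arguments inside a whole-sentence unit. This is a minimal sketch: the `$units` array and the German sentence are illustrative assumptions, not material from the talk.

```php
<?php
// Hypothetical units: the whole sentence is one unit per locale, so a
// translator can reorder the placeholders freely.
$units = array(
    'en' => 'There are %1$d unread messages in %2$d folders.',
    'de' => 'In %2$d Ordnern befinden sich %1$d ungelesene Nachrichten.',
);

$nMesg = 5;
$nFolders = 2;

foreach ($units as $locale => $unit) {
    // The argument order stays fixed in code; only the unit text varies.
    echo $locale, ': ', sprintf($unit, $nMesg, $nFolders), "\n";
}
```

Note that this does nothing about plurals ("1 folders"), which is the gap the slide assigns to gettext() and MessageFormat.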
  36. Ease of use: intuitive tools (or good documentation); transparent formats; translation memory (useful for short, precise matches, not fuzzy; use in testing and first pass, not in production)
  37. Performance. Caching: translation units; translated pages; APC, memcache, etc. Reduce runtime overhead
  38. Flexibility: adding new languages/locales quickly; translation inheritance
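Translation inheritance, as named on slide 38, could be sketched as a lookup that walks a chain of parent locales until some catalog has the unit. The `lookup()` helper, the parent map, and the catalog contents below are hypothetical; the tools discussed in the talk implement inheritance in their own ways.

```php
<?php
// Hypothetical parent links: each locale names the locale it inherits from.
$parents = array('fr_CA' => 'fr', 'fr' => 'generic', 'generic' => null);

// Hypothetical per-locale catalogs; a child stores only what differs.
$catalogs = array(
    'generic' => array('cart' => 'Shopping cart', 'email' => 'E-mail'),
    'fr'      => array('cart' => 'Panier'),
    'fr_CA'   => array('email' => 'Courriel'),
);

/**
 * Return the unit from the nearest locale in the inheritance chain that
 * defines it, or null when no locale in the chain does.
 */
function lookup($id, $locale, array $catalogs, array $parents)
{
    while ($locale !== null) {
        if (isset($catalogs[$locale][$id])) {
            return $catalogs[$locale][$id];
        }
        $locale = isset($parents[$locale]) ? $parents[$locale] : null;
    }
    return null;
}

echo lookup('email', 'fr_CA', $catalogs, $parents), "\n"; // fr_CA's own unit
echo lookup('cart', 'fr_CA', $catalogs, $parents), "\n";  // inherited from fr
```

With this layout, adding a new locale amounts to adding one parent link and translating only the units that differ from the parent.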
  39. Portability: moving between tools; translations, most importantly; XLIFF
  40. Tools: gettext. Developed for C/C++ originally; somewhat obscure format; translations on disk; have to compile translations with every change; proper markup not always possible; POedit is a decent translation editor
  41. Tools: ezTranslation (et al). More of a translation look-up tool; can support various backends for translation storage and caching (QT Linguist format by default); supports parametrized strings; Bork/l33t filters for marking untranslated strings
  42. Tools: template engines. Smarty (for example); 3rd party solutions based on pre- and post-filters; translations in config files or gettext mainly, could be in DB; mark-up approaches vary; parametrized strings are possible (depends on plugin)
  43. Tools: r3. Developed and supported by Yahoo!; very flexible and powerful, but a bit of a learning curve; translations are a subset of “site variations”
  44. Tools: r3. Inheritance everywhere; translations in DB (MySQL or SQLite); has basic GUI for some operations
  45. Tools: intl. Available for PHP 5.3 and PHP 6; access to locale data; formatters/parsers (number, date, time, message, choice, etc.); collation (sorting); more coming
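The formatter and collation features listed on slide 45 might be exercised roughly as follows. This sketch assumes the intl extension is loaded; the `intl_demo()` helper and the sample words are hypothetical. The exact output strings depend on the ICU data bundled with the PHP build, so none are shown.

```php
<?php
/**
 * Format a number and a date, and sort a list of words, under one locale.
 * Requires the intl extension.
 */
function intl_demo($locale, array $words)
{
    $number = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    $date = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::FULL, // date style
        IntlDateFormatter::NONE, // no time part
        'UTC'
    );
    $collator = new Collator($locale);
    $collator->sort($words); // sorts in place using the locale's rules

    return array(
        'number' => $number->format(1234.5),
        'date'   => $date->format(0), // Unix timestamp 0
        'sorted' => $words,
    );
}

if (extension_loaded('intl')) {
    print_r(intl_demo('fr_FR', array('zèbre', 'Zoé', 'école', 'eau')));
} else {
    echo "intl extension not loaded\n";
}
```

Swapping `'fr_FR'` for another locale identifier changes the grouping separators, the date wording, and the sort order without touching the rest of the code.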
  46. r3: setup
      % sudo pear install -f stickleback-[version].tgz
      % sudo pear install -f --alldeps r3-[version].tgz
      % mkdir ~/r3
      % r3 setup setuphome ~/r3
      % export R3HOME=~/r3
      % r3 setup installdb
  47. r3: setup
      % r3 dim product create wine
      % r3 dim intl create generic_intl
      % r3 dim intl create -p generic_intl us
      % r3 dim intl create -p generic_intl fr
      % r3 dim intl create -p us ca
      % r3 dim intl parent ca set fr -d translation
      % r3 dimension intl parent fr unset -f -d translation
      ...
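  48. r3: inheritance (two tree diagrams; parent/child links as implied by the setup commands on the previous slide)
      templates: generic_intl is the parent of fr and us; us is the parent of ca
      translations: generic_intl is the parent of us; fr is the parent of ca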
  49. r3: make a page
      % r3 target create wine/generic_intl/index.php
      % r3 template edit wine/generic_intl/index.php index.php.ros
      ...
      % r3 generate -av
  50. r3: markup
      <r3:trans>The Wine Source</r3:trans>
      % r3 translation list
      % r3 translation set wine/fr 'The Wine Source' 'La Source de Vin'
      % r3 generate wine/fr
  51. r3: translation
      % r3 translation save all fr.xml
      ...
      % r3 translation merge fr.xml
      <file original='wine/fr/generic' source-language='en' target-language='fr' datatype='plaintext'>
      <body>
      <trans-unit id='26' approved='yes'>
      <source>The Wine Source</source>
      <target>La Source de Vin</target>
      </trans-unit>
      ...
  52. r3: compile-time PHP
      test.html.ros (source):
      <div>
      <r3:cphp>
      foreach (range(1, 5) as $i)
      {
          echo $i, '<br/>';
      }
      </r3:cphp>
      </div>
      test.html (generated): the same <div>, containing the numbers 1 to 5, one per line
  53. r3: parameterized strings
      test.php.ros:
      $message = "<r3:trans>You have {0,number} messages as of {1,date,full}.</r3:trans>";
      $args = array(1234, time());
      echo MessageFormatter::formatMessage($LOCALE, $message, $args);
      fr translation: Au {1,date,full} vous avez {0,number} messages.
      fr output: Au mardi 22 juillet 2008 vous avez 1 234 messages.
  54. r3: runtime processing
      $map = array('jp' => 'ja', 'fr' => 'fr', 'us' => 'en_US', 'ca' => 'fr_CA',
                   'ru' => 'ru_RU', 'de' => 'de_DE', 'generic_intl' => 'en_US');
      $ar = array($context->trans('Ivory Coast'),
                  $context->trans('Russia'),
                  $context->trans('USA'));
      $lang = $context->location()->get_lang_attribute();
      $LOCALE = $map[$lang];
      $coll = new Collator($map[$lang]);
      $coll->sort($ar);
      foreach ($ar as $c) {
          print "<li>$c</li>";
      }
  55. Resources: r3; gettext; Smarty (Chapter 12 of the Smarty book); ezTranslation; intl
  56. thank you / спасибо