
Localization and Internationalization

Published in: Technology, Business
  1. 5 — Localization. From Code to Product. gidgreen.com/course
  2. Getting it wrong
  3. Something we should know?
  4. Lecture 5
     •  Countries and languages
     •  Character sets
     •  Unicode
     •  Text localization
     •  Outsourcing translation
     •  Other localization
  5. Population (2011-2012, from Wikipedia)
     Country      Population  World share    Language     Speakers  World share
     China        1,347 M     19.3%          Mandarin     845 M     12.1%
     India        1,210 M     17.3%          Spanish      329 M     4.7%
     USA          313 M       4.5%           English      328 M     4.7%
     Indonesia    238 M       3.4%           Hindi-Urdu   240 M     3.4%
     Brazil       192 M       2.8%           Arabic       221 M     3.2%
     Pakistan     179 M       2.6%           Bengali      181 M     2.6%
     Nigeria      162 M       2.3%           Portuguese   178 M     2.5%
     Russia       143 M       2.0%           Russian      144 M     2.1%
     Bangladesh   142 M       2.0%           Japanese     122 M     1.7%
     Japan        128 M       1.8%           Punjabi      109 M     1.6%
  6. Economic weight (nominal) (2008, from globalization-group.com, IMF)
     Country      GDP         World share    Language     GDP       World share
     USA          $14.4 T     23.7%          English      $21.3 T   34.9%
     Japan        $4.9 T      8.1%           Chinese      $5.2 T    8.4%
     China        $4.3 T      7.1%           Japanese     $4.9 T    8.1%
     Germany      $3.7 T      6.0%           German       $4.4 T    7.2%
     France       $2.9 T      4.7%           Spanish      $4.2 T    6.8%
     UK           $2.7 T      4.4%           French       $4.0 T    6.5%
     Italy        $2.3 T      3.8%           Italian      $2.5 T    4.1%
     Russia       $1.7 T      2.8%           Russian      $2.2 T    3.7%
     Spain        $1.6 T      2.6%           Portuguese   $1.9 T    3.1%
     Brazil       $1.6 T      2.6%           Arabic       $1.9 T    3.1%
  7. Internet users (2011, from internetworldstats.com)
     Country      Users       Penetration    Language     Users     Penetration
     China        485 M       36%            English      565 M     43%
     USA          245 M       78%            Chinese      510 M     37%
     India        100 M       8%             Spanish      165 M     39%
     Japan        99 M        78%            Japanese     99 M      78%
     Brazil       76 M        37%            Portuguese   83 M      32%
     Germany      65 M        80%            German       75 M      80%
     Russia       60 M        43%            Arabic       65 M      19%
     UK           51 M        82%            French       60 M      17%
     France       45 M        70%            Russian      60 M      43%
     Nigeria      44 M        28%            Korean       39 M      55%
  8. Internet penetration
  9. E-commerce volumes (2009, from Everis)
     [Pie chart by country; the pairing of labels and values did not survive extraction.]
     Labels: USA, Japan, China, Germany, France, UK, Italy, Canada, Spain, South Korea, Other
     Values: $123B, $135B, $13B, $15B, $51B, $16B, $19B, $28B, $37B, $28B, $36B
  10. Multilingual countries
     Canada:       English 21M, French 8M
     Switzerland:  German 5.0M, French 1.6M, Italian 0.5M
  11. Language variations
     •  US vs UK English
        –  color | colour
        –  vacation | holiday
        –  Where are you (at)?
     •  European vs Brazilian Portuguese
     •  French
     •  Spanish
  12. Language codes (ISO-639-1, with country suffixes for regional variants)
     ar   Arabic        zh-CN   Chinese (simplified)
     fr   French        zh-TW   Chinese (traditional)
     nl   Dutch         en-GB   English (UK)
     de   German        en-US   English (US)
     he   Hebrew        pt-BR   Portuguese (Brazilian)
     it   Italian       pt-PT   Portuguese (Portugal)
     ja   Japanese      es-AR   Spanish (Argentina)
     pl   Polish        es-CL   Spanish (Chile)
     ru   Russian       es-MX   Spanish (Mexico)
     es   Spanish       es-ES   Spanish (Spain)
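One common way to use codes like these is a fallback lookup: try the exact regional code, then the bare language, then a default. The sketch below is only an illustration in Python; the `AVAILABLE` set, the `pick_locale` name and the choice of default are assumptions, not part of the lecture.

```python
# Hypothetical set of locales the product has translations for.
AVAILABLE = {"en-US", "en-GB", "es", "es-MX", "pt-BR", "zh-CN"}


def pick_locale(requested: str, available=AVAILABLE, default: str = "en-US") -> str:
    """Return the closest available locale for a code such as 'es-AR'."""
    tag = requested.replace("_", "-")          # accept es_MX as well as es-MX
    language, _, region = tag.partition("-")
    language = language.lower()
    exact = f"{language}-{region.upper()}" if region else language
    if exact in available:                      # 1. exact regional match
        return exact
    if language in available:                   # 2. fall back to the bare language
        return language
    return default                              # 3. nothing suitable


print(pick_locale("es-AR"))   # no Argentinian catalog, so generic Spanish is used
```

With this order, a visitor asking for es-AR still gets Spanish rather than the English default.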
  13. Lecture 5: Countries and languages • Character sets • Unicode • Text localization • Outsourcing translation • Other localization
  14. Computer representation
     binary 0 1 0 0 0 0 0 1  =  hex 41 (00 … FF)  =  decimal 65 (0 … 255)  =  A
     .,/?;:’!%abcdefghijklmnopqrstuvwxyz… …BCDEFGHIJKMNOPQRSTUVWXYZ0123456789
  15. US-ASCII (Image from czyborra.com)
  16. ISO-8859-1
  17. Windows-1252
  18. ISO-8859-5
  19. ISO-8859-8
  20. Problems with character sets
     •  Extra metadata
     •  Potential for misdisplay
     •  Mutually exclusive
     •  Little space to grow, e.g. €
     •  Ideographic languages
        –  70,000+ Chinese characters
        –  Multibyte encoding
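The misdisplay and "mutually exclusive" problems can be reproduced in a few lines. This is a minimal Python sketch using the standard codecs; the sample strings and the `fits` helper are made up for illustration.

```python
original = "café"
stored = original.encode("iso-8859-1")    # one byte per character

correct = stored.decode("iso-8859-1")     # right label: text comes back intact
misread = stored.decode("iso-8859-5")     # wrong label: last byte becomes a Cyrillic letter


def fits(text: str, charset: str) -> bool:
    """True if every character of text exists in the given character set."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(correct, misread)
print(fits("€", "iso-8859-1"), fits("€", "cp1252"))            # euro sign only in Windows-1252
print(fits("é и", "iso-8859-1"), fits("é и", "iso-8859-5"))    # no single set holds both letters
```

The bytes never change; only the label used to interpret them does, which is why the extra metadata matters.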
  21. Lecture 5: Countries and languages • Character sets • Unicode • Text localization • Outsourcing translation • Other localization
  22. The Unicode solution
     •  One global character set
        –  Over 110,000 characters
        –  Over 100 alphabets
     •  1,114,112 code points
        –  0…255 compatible with ISO-8859-1
        –  U+0041 = A
     •  Multiple encodings
  23. U+0000 … U+007F
  24. U+0080 … U+00FF
  25. U+0400 … U+047F
  26. U+0590 … U+060F
  27. U+4E00 … U+4E7F
  28. U+2190 … U+220F
  29. U+2800 … U+267F
  30. UTF-16 encoding
     •  2 or 4 bytes per code point
     •  Simple for U+0000…D7FF and E000…FFFF
        –  “Basic Multilingual Plane”
     •  Higher code points use 4 bytes
     •  U+FEFF = byte-order mark
        –  No well-followed default
     •  Windows APIs since Windows 2000
        –  Also .NET, Android, iOS, Mac OS X
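The 2-byte and 4-byte cases can be checked directly. The Python sketch below computes the surrogate pair for a code point above U+FFFF by hand and compares it with the built-in codec; the `surrogate_pair` helper and the sample characters are illustrative choices.

```python
def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


bmp = "A"                # U+0041, inside the Basic Multilingual Plane
astral = "\U0001F600"    # U+1F600, outside it

print(len(bmp.encode("utf-16-be")), len(astral.encode("utf-16-be")))   # 2 and 4 bytes

high, low = surrogate_pair(ord(astral))
by_hand = high.to_bytes(2, "big") + low.to_bytes(2, "big")
print(by_hand == astral.encode("utf-16-be"))

# Byte order changes the bytes, which is why the byte-order mark exists.
print(bmp.encode("utf-16-le"), bmp.encode("utf-16-be"))
```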
  31. UTF-8 encoding
     •  1 to 4 bytes per code point (the original design allowed up to 6)
     •  1 byte for U+0000…007F
        –  Perfect compatibility with ASCII
     •  2 bytes for U+0080…07FF
        –  etc…
     •  Byte order mark allowed
        –  But unnecessary, causes problems
     •  Dominant on web, email
  32. UTF-8 encoding
  33. UTF-8 advantages
     •  Natural compression for English
     •  English works in old tools/APIs
        –  HTML tags unaffected
     •  No shared values between byte types
        –  Easy to synchronize mid-stream
        –  Easy to search by byte value
     •  No zero bytes (good for C)
     •  Byte-sorting = codepoint-sorting
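Several of these properties are easy to verify. A short Python sketch, with sample strings chosen only for illustration:

```python
samples = ["A", "é", "€", "\U0001F600"]
lengths = [len(s.encode("utf-8")) for s in samples]
print(lengths)   # byte count grows with the code point

# ASCII text, such as an HTML tag, is byte-for-byte identical in UTF-8.
tag = "<b>"
same_as_ascii = tag.encode("utf-8") == tag.encode("ascii")

# No zero bytes appear unless the text itself contains U+0000.
has_zero = b"\x00" in "中文 text".encode("utf-8")

# Sorting the encoded bytes gives the same order as sorting by code point.
words = ["zebra", "éclair", "apple", "中文", "Ωmega"]
by_code_point = sorted(words)                                  # Python compares code points
by_bytes = [b.decode("utf-8") for b in sorted(w.encode("utf-8") for w in words)]
print(same_as_ascii, has_zero, by_code_point == by_bytes)
```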
  34. Unicode on the web (Source: googleblog.blogspot.com)
  35. Lecture 5: Countries and languages • Character sets • Unicode • Text localization • Outsourcing translation • Other localization
  36. The original source code
     function Check_Username(username)
       …
       if Username_Taken(username)…
         error="username is taken."
       …
       return error
     end function
  37. And now in Spanish…
     function Check_Username(username)
       …
       if Username_Taken(username)…
         error="username se toma."
       …
       return error
     end function
  38. Internationalized
     function Check_Username(username)
       …
       if Username_Taken(username)…
         error=Get_String("un-taken")
       …
       return error
     end function
  39. Internationalized
     function Check_Username(username)
       …
       if Username_Taken(username)…
         error=Translate("username is taken")
       …
       return error
     end function
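A `Translate` function keyed on the English string can be as small as a dictionary lookup that falls back to the English text. The Python sketch below is one possible shape, not the lecture's implementation; the `CATALOG` contents, the Spanish wording, and the `taken` and `lang` parameters are all assumptions made for illustration.

```python
# Hypothetical catalog: language code -> {English string: translation}.
CATALOG = {
    "es": {"username is taken": "Ese nombre de usuario ya está en uso"},
}


def translate(text: str, lang: str) -> str:
    """Return the translation for lang, or the English text if none exists."""
    return CATALOG.get(lang, {}).get(text, text)


def check_username(username: str, taken: set[str], lang: str) -> str:
    """Return a localized error message, or an empty string if the name is free."""
    if username in taken:
        return translate("username is taken", lang)
    return ""


print(check_username("gid", {"gid"}, "es"))
print(check_username("gid", {"gid"}, "fr"))   # no French catalog: falls back to English
```

The fallback to English is what makes English strings usable as keys: an untranslated message is still readable.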
  40. IDs vs English strings
     IDs                                English strings
     More compact code                  More explicit code
     Enforces sync between languages    English can be changed
     Less error-prone                   Easier for third parties
  41. Concatenation is evil
     You will travel from London to Paris
     print Translate("You will travel from ") + from_city + Translate(" to ") + to_city
     Usted viajará de London a Paris
     Sie wird von London nach Paris reisen
  42. Substitutions
     raw=Translate("You will travel from %from% to %to%")
     raw=replace(raw, "%from%", from_city)
     print replace(raw, "%to%", to_city)
     You will travel from %from% to %to%
     Usted viajará de %from% a %to%
     Sie wird von %from% nach %to% reisen
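The same substitution approach in runnable form, as a Python sketch. The three templates are the ones on the slide; the `TEMPLATES` table, the language keys and the `fill` helper are illustrative names.

```python
# Whole-sentence templates, so each language controls its own word order.
TEMPLATES = {
    "en": "You will travel from %from% to %to%",
    "es": "Usted viajará de %from% a %to%",
    "de": "Sie wird von %from% nach %to% reisen",
}


def fill(template: str, values: dict[str, str]) -> str:
    """Replace each %name% placeholder with its value."""
    for name, value in values.items():
        template = template.replace(f"%{name}%", value)
    return template


trip = {"from": "London", "to": "Paris"}
for lang in ("en", "es", "de"):
    print(fill(TEMPLATES[lang], trip))
```

In the German template the verb follows the second placeholder, which plain concatenation of translated fragments could not express.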
  43. Singular/plural
     You have 3 credits left
     You have 1 credit left
     if (credits is 1)
       c_string=translate("1 credit")
     else
       c_string=replace(translate("%#% credits"), "%#%", credits)
     raw=translate("You have %credits% left")
     print replace(raw, "%credits%", c_string)
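A two-way singular/plural test works for English, but some languages have more than two plural forms, so many toolkits let the translation catalog choose the form. The Python sketch below uses the standard library's `gettext.ngettext` interface as one such approach (a swapped-in technique, not the slide's code). With no catalog installed, `NullTranslations` simply applies the English rule; the `credits_message` function is a hypothetical name.

```python
import gettext

# With no catalog loaded, NullTranslations returns the English strings unchanged.
# A real deployment would load a compiled catalog for the user's language instead.
translations = gettext.NullTranslations()


def credits_message(credits: int) -> str:
    """Pick the singular or plural template for this count, then fill in the number."""
    template = translations.ngettext(
        "You have %d credit left",     # singular form
        "You have %d credits left",    # plural form
        credits,
    )
    return template % credits


for n in (0, 1, 3):
    print(credits_message(n))
```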
  44. Text in images
  45. Width in layouts
     أشكركم على الدفع.
     感谢 的付款。
     +57%!
     Gracias por su pago.
     אנו מודים לך על התשלום.
     Спасибо за ваш платеж.
     Thank you for your payment.
     Vielen Dank für Ihre Bezahlung.
     Σας ευχαριστούμε για την πληρωμή σας.
     Nous vous remercions de votre paiement.
     お支払いしていただきありがとうございます。
  46. LTR / RTL
  47. Lecture 5: Countries and languages • Character sets • Unicode • Text localization • Outsourcing translation • Other localization
  48. Outsourcing translation
     •  Preparing code
     •  Collecting (English) assets
     •  Choosing a provider
     •  Costs and quotes
     •  Glossary
     •  Translation memory
     •  Independent review
  49. Collecting assets
     •  Text files
        –  Simple arrays or resource files
        –  Standard formats, e.g. gettext, XLIFF
     •  HTML files
        –  Risk of accidental markup changes
     •  Graphics files
        –  Originals, not rendered
     •  Think about text expansion
  50. Choosing a provider
     •  Problem: you can’t assess quality
     •  Go by reputation and clients
        –  Examples of previous work
     •  Ask who will actually do it
        –  Native speaker of target language
        –  Subject-specific experience
     •  Consider future language needs
  51. Cost and quotes (Ibidem-translations.com)
     •  Add 15-50% for specialized areas
     •  Clarify how words are counted
     •  Check for extra costs
  52. Glossary
     •  Fixed translation for specific terms
        –  Control over branding
        –  Domain-specific terminology
        –  Consistency
     •  Not-to-be-translated terms
     •  Requires thorough review of product
  53. Glossary (Image from Google Translator Toolkit Help)
  54. Translation memory
     •  Lots of translation is repetitive
        –  Same text in many places
        –  Small changes between versions
     •  Same sentence = same translation
        –  Save time and money
        –  Help ensure consistency
        –  But manual confirmation required
     •  Should be owned by you
  55. Translation memory (Image from kilgray.com screenshots)
  56. Machine translation
  57. Lecture 5: Countries and languages • Character sets • Unicode • Text localization • Outsourcing translation • Other localization
  58. Numbers
     1,234,567.89    Japan, UK, USA
     1 234 567,89    France, Central Europe
     1.234.567,89    Germany, Scandinavia
     1’234’567.89    Switzerland
     123,4567.89     China
     1’234,567.89    Mexico
     12,34,567.89    India
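Most of these formats differ only in the separators and the group sizes, so they can be driven from a small table. The Python sketch below is a hand-rolled illustration for non-negative amounts with two decimals; the `STYLES` table and `format_number` are hypothetical, and the Mexican format with mixed separators is left out. Production code would normally rely on a locale library instead.

```python
from decimal import Decimal

# (group separator, decimal separator, size of the last group, size of the other groups)
STYLES = {
    "USA": (",", ".", 3, 3),
    "France": (" ", ",", 3, 3),
    "Germany": (".", ",", 3, 3),
    "Switzerland": ("’", ".", 3, 3),
    "China": (",", ".", 4, 4),
    "India": (",", ".", 3, 2),
}


def format_number(value: Decimal, country: str) -> str:
    """Format a non-negative amount with two decimals in the given style."""
    group_sep, decimal_sep, last, other = STYLES[country]
    whole, _, fraction = f"{value:.2f}".partition(".")
    groups = [whole[-last:]]            # rightmost group first
    whole = whole[:-last]
    while whole:                        # then peel off the remaining groups
        groups.insert(0, whole[-other:])
        whole = whole[:-other]
    return group_sep.join(groups) + decimal_sep + fraction


amount = Decimal("1234567.89")
for country in STYLES:
    print(country, format_number(amount, country))
```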
  59. Dates and times
     Dates: 7/21/2012, 21/7/2012, 21.7.2012, 2012-07-12, 7. 21. 2012, 7-12-2012
     Times: 15:45, 3.45 PM, 3:45 pm
  60. Time zones (Map from wikipedia.org)
  61. Displaying times online
     •  Store times independent of zone
     •  Options for display
        –  Ask the user for their time zone
        –  Show an explicit time zone
        –  Use “ago” notation
     •  JavaScript to get from browser
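Two of the display options can be sketched in Python with the standard `datetime` module: store the moment in UTC, then either show it with an explicit offset or as an "ago" phrase. The `ago` function, its English wording and the fixed +02:00 offset are illustrative assumptions; the phrases themselves would also need translating.

```python
from datetime import datetime, timedelta, timezone


def ago(then: datetime, now: datetime) -> str:
    """Describe how long ago 'then' was, relative to 'now' (English wording only)."""
    seconds = int((now - then).total_seconds())
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'' if count == 1 else 's'} ago"
    return "just now"


# Stored independent of any user's zone: always UTC.
stored = datetime(2012, 7, 21, 15, 45, tzinfo=timezone.utc)

# Option: show an explicit zone (here a fixed +02:00 offset chosen for the example).
user_zone = timezone(timedelta(hours=2))
explicit = stored.astimezone(user_zone).strftime("%Y-%m-%d %H:%M %z")
print(explicit)

# Option: "ago" notation, which needs no zone at all.
print(ago(stored, stored + timedelta(hours=3)))
```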
  62. Currencies
     •  Biggest traded currencies: $ € ¥ £
        –  But there are almost 200
     •  How to display
        –  Number formatting
        –  Symbols: ₪ ₩ ฿ $
        –  Currency codes: USD EUR JPY GBP CAD AUD
     •  Also: currency conversion
        –  Live feed, e.g. from ECB
  63. Names
     •  Surname can come first
        –  China, Japan, Korea, Hungary
     •  Multiple surnames
        –  José Santos Tavares Melo Silva
     •  Middle names/initials
     •  Double-barrelled names
        –  Sarah-Jane Darlington-Whit
     •  No spaces in CJK
  64. Names
     Form fields: Full Name: | What should we call you? | Family name: | Other/given names:
     •  Or localize based on language
     •  Do you need names at all?
        –  Username or email can be enough
  65. Addresses
     USA:    John Doe / Acme, Inc / Suite 3B-3824 / 294 W Ronson / Dallas TX 75211 / USA
     Japan:  〒100-8994 / 東京都中央区八重洲一丁目5番3号 / 東京中央郵便局
             Tokyo Central Post Office / 1-5-3 Yaesu, Chuo-ku / Tokyo 100-8994 / Japan
     UK:     John Smith / Acme, Ltd / Flat 384 / 33 Walton Road / Birmingham / B26 3QJ / UK
     Spain:  C/Pescadoro, 13, 2°, 3ª / 28331 – Madrid / Spain
  66. Addresses
     •  Single multi-line field
     •  Change in response to country
     •  Generic format
  67. Phone numbers
     UK:      +44 (0) 123-456-7890
     France:  +33 1-23-45-67-89
     China:   +86 10-2345-6789
     USA:     +1 (123) 456-7890 x123
     •  Country selector
     •  Change in response to country
     •  Generic format
  68. Indexing, sorting, searching
     •  Capitalization and accents
        –  Øyvind matches oyvind?
     •  Collation (sort order)
        –  Swedish: a b c … x y z å ä ö
        –  French: cote côte coté côté
     •  CJK (ideographic languages)
        –  No spaces between words
        –  Sort based on stroke count
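Both issues show up in a few lines of Python. The sketch below strips accents with Unicode decomposition for matching and sorts with a hand-written Swedish alphabet; the `fold` and `swedish_key` helpers are illustrative, and real products usually delegate this to a collation library.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase and strip combining accents, for loose matching."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print(fold("Côté") == "cote")        # accents decompose and are removed
print(fold("Øyvind") == "oyvind")    # Ø has no decomposition, so this does not match

# Swedish puts å ä ö after z, in that order, unlike raw code point order.
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"


def swedish_key(word: str) -> list[int]:
    """Sort key: position of each letter in the Swedish alphabet."""
    return [SWEDISH.index(ch) if ch in SWEDISH else len(SWEDISH) for ch in word.lower()]


words = ["zebra", "ärta", "ål", "apa"]
print(sorted(words))                     # code point order puts ä before å
print(sorted(words, key=swedish_key))    # Swedish order puts å before ä
```

The Øyvind case shows why accent stripping alone is not enough: some letters need explicit mapping rules per language.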
  69. Paper sizes
     A4          297 x 210 mm
     US Letter   279 x 216 mm
     US Legal    356 x 216 mm
  70. Domain names
     •  Country-code top-level domains
        –  .fr .de .uk .in .br .jp .cn
     •  Need separate registrar for many
     •  Some countries have restrictions
        –  .com.au requires registered company
        –  .ca requires nationality/residence
        –  Also restricted: .fr .br .cn .ie .jp …
     •  Internationalized domain names
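Internationalized domain names travel through DNS in an ASCII form that starts with `xn--`. As a rough Python illustration, the standard library's `idna` codec (which implements the older IDNA 2003 rules) converts in both directions; the domain used here and the two helper names are made up for the example.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to the ASCII form used in DNS."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert the ASCII form back to Unicode for display."""
    return domain.encode("ascii").decode("idna")


encoded = to_ascii("bücher.example")
print(encoded)                 # ASCII-only form with an xn-- prefix on the first label
print(to_unicode(encoded))     # back to the readable name
```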
  71. And there’s more…
     •  Units of measurement
     •  Colors
     •  Images of people
     •  Calendars
     •  Holidays
     •  Border disputes
     •  Culture
     •  Law
  72. Google in China
     •  2005: Chinese language google.com
     •  2006: google.cn under censorship
     •  2009: China blocks YouTube
     •  2010: Google claims hacking attack
        –  Redirects google.cn to google.com.hk
        –  China blocks it for a day
     •  Today: Baidu 79%, Google 17%
        –  Baidu links to MP3/movie downloads
  73. Getting real
     •  It’s time consuming and costly
     •  Cheap wins in version 1.0
        –  Parameterize + functionize
        –  Use Unicode throughout
        –  Flexible layouts
     •  See where there is demand
     •  Identify most important locales
  74. Getting real
     •  Don’t skimp the details
        –  Needs to look native
     •  Use serious service providers
     •  Prepare for tech support
        –  Machine translation an option?
     •  It will slow development
        –  So wait for product maturity
