5: Localization

From Code to Product
gidgreen.com/course
Getting it wrong




From Code to Product      Lecture 5 - Localization - Slide 2   gidgreen.com/course
Something we should know?




Lecture 5
• Countries and languages
• Character sets
• Unicode
• Text localization
• Outsourcing translation
• Other localization



Population
China                      1,347 M   19.3%                    Mandarin     845 M      12.1%
India                      1,210 M   17.3%                    Spanish      329 M       4.7%
USA                         313 M     4.5%                    English      328 M       4.7%
Indonesia                   238 M     3.4%                    Hindi-Urdu   240 M       3.4%
Brazil                      192 M     2.8%                    Arabic       221 M       3.2%
Pakistan                    179 M     2.6%                    Bengali      181 M       2.6%
Nigeria                     162 M     2.3%                    Portuguese   178 M       2.5%
Russia                      143 M     2.0%                    Russian      144 M       2.1%
Bangladesh                  142 M     2.0%                    Japanese     122 M       1.7%
Japan                       128 M     1.8%                    Punjabi      109 M       1.6%
2011-2012 from Wikipedia


Economic weight (nominal)
USA                              $14.4 T    23.7%                    English      $21.3 T     34.9%
Japan                              $4.9 T    8.1%                    Chinese       $5.2 T      8.4%
China                              $4.3 T    7.1%                    Japanese      $4.9 T      8.1%
Germany                            $3.7 T    6.0%                    German        $4.4 T      7.2%
France                             $2.9 T    4.7%                    Spanish       $4.2 T      6.8%
UK                                 $2.7 T    4.4%                    French        $4.0 T      6.5%
Italy                              $2.3 T    3.8%                    Italian       $2.5 T      4.1%
Russia                             $1.7 T    2.8%                    Russian       $2.2 T      3.7%
Spain                              $1.6 T    2.6%                    Portuguese    $1.9 T      3.1%
Brazil                             $1.6 T    2.6%                    Arabic        $1.9 T      3.1%
2008 from globalization-group.com, IMF


Internet users
China                              485 M     36%                    English      565 M         43%
USA                                245 M     78%                    Chinese      510 M         37%
India                              100 M       8%                   Spanish      165 M         39%
Japan                               99 M     78%                    Japanese      99 M         78%
Brazil                              76 M     37%                    Portuguese    83 M         32%
Germany                             65 M     80%                    German        75 M         80%
Russia                              60 M     43%                    Arabic        65 M         19%
UK                                  51 M     82%                    French        60 M         17%
France                              45 M     70%                    Russian       60 M         43%
Nigeria                             44 M     28%                    Korean        39 M         55%
2011 from internetworldstats.com


Internet penetration




E-commerce volumes
[Chart: e-commerce volume by country. Legend: USA, Japan, China, Germany, France, UK, Italy, Canada, Spain, South Korea, Other. Value labels: $135B, $123B, $51B, $37B, $36B, $28B, $28B, $19B, $16B, $15B, $13B.]
2009 from Everis



Multilingual countries

Canada: English 21M, French 8M
Switzerland: German 5.0M, French 1.6M, Italian 0.5M


Language variations
• US vs UK English
   – color | colour
   – vacation | holiday
   – Where are you (at)?
• European vs Brazilian Portuguese
• French
• Spanish


Language codes (ISO 639-1, with ISO 3166-1 country codes for regional variants)
ar            Arabic                              zh-CN       Chinese (simplified)
fr            French                              zh-TW       Chinese (traditional)
nl            Dutch                               en-GB       English (UK)
de            German                              en-US       English (US)
he            Hebrew                              pt-BR       Portuguese (Brazilian)
it            Italian                             pt-PT       Portuguese (Portugal)
ja            Japanese                            es-AR       Spanish (Argentina)
pl            Polish                              es-CL       Spanish (Chile)
ru            Russian                             es-MX       Spanish (Mexico)
es            Spanish                             es-ES       Spanish (Spain)
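
Codes like these are commonly used to pick the closest available translation for a requested locale. A minimal sketch, assuming a hypothetical set of shipped locales and a hypothetical `pick_locale` helper (neither comes from the lecture):

```python
# Hypothetical set of locales this product ships translations for.
SUPPORTED = {"en-US", "en-GB", "es", "es-MX", "pt-BR", "zh-CN"}

def pick_locale(requested, supported=SUPPORTED, default="en-US"):
    """Return the best supported match for a tag such as 'es-AR'."""
    language, _, country = requested.partition("-")
    language = language.lower()
    tag = f"{language}-{country.upper()}" if country else language
    if tag in supported:        # exact match, e.g. es-MX
        return tag
    if language in supported:   # fall back to the bare language, e.g. es
        return language
    return default              # nothing suitable: use the default

print(pick_locale("es-MX"))  # es-MX
print(pick_locale("es-AR"))  # es
print(pick_locale("ja"))     # en-US
```

Falling back from the regional code to the bare language code is one possible policy; a product might instead prefer a neighboring region's translation.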


Computer representation

  0 1 0 0 0 0 0 1
         00 … 41 … FF
          0 … 65 … 255
               A
   .,/?;:'!%abcdefghijklmnopqrstuvwxyz…               …BCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
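
The same byte can be read as bits, hexadecimal, decimal, or a character. A short sketch of the conversions shown above, using only Python built-ins:

```python
bits = "01000001"
value = int(bits, 2)       # binary string -> integer
print(value)               # 65
print(f"{value:02X}")      # 41 (hexadecimal)
print(chr(value))          # A
print(f"{ord('A'):08b}")   # 01000001 (back to bits)
```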


US-ASCII




                                                            Image from czyborra.com

ISO-8859-1




Windows-1252




ISO-8859-5




ISO-8859-8




Problems with character sets
• Extra metadata
• Potential for misdisplay
• Mutually exclusive
• Little space to grow - e.g. €
• Ideographic languages
   – 70,000+ Chinese characters
   – Multibyte encoding


The Unicode solution
• One global character set
   – Over 110,000 characters
   – Over 100 alphabets
• 1,114,112 code points
   – 0…255 compatible with ISO-8859-1
   – U+0041 = A
• Multiple encodings
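
These figures can be checked from a Python prompt. A small sketch using only the standard library (the variable name `latin1` is just for illustration):

```python
import sys

print(f"U+{ord('A'):04X}")   # U+0041
print(sys.maxunicode + 1)    # 1114112 code points (U+0000 to U+10FFFF)

# Every ISO-8859-1 byte value decodes to the code point with the same number.
latin1 = bytes(range(256)).decode("iso-8859-1")
print(all(ord(ch) == i for i, ch in enumerate(latin1)))  # True
```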


U+0000 … U+007F

U+0080 … U+00FF

U+0400 … U+047F

U+0590 … U+060F

U+4E00 … U+4E7F

U+2190 … U+220F




U+2600 … U+267F




UTF-16 encoding
• 2 or 4 bytes per code point
• Simple for U+0000…U+D7FF and U+E000…U+FFFF
   – “Basic Multilingual Plane”
• Higher code points use 4 bytes
• U+FEFF = byte-order mark
   – No well-followed default
• Windows APIs since Windows 2000
   – Also .NET, Android, iOS, Mac OS X
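
The 2-byte and 4-byte cases, and the byte-order mark, can be seen by encoding a few characters. A minimal sketch; the helper name `utf16_hex` is hypothetical:

```python
def utf16_hex(text):
    """Return the big-endian UTF-16 bytes of `text` as a hex string."""
    return text.encode("utf-16-be").hex(" ")

print(utf16_hex("A"))    # 00 41        (2 bytes, inside the BMP)
print(utf16_hex("€"))    # 20 ac        (2 bytes, inside the BMP)
print(utf16_hex("😀"))   # d8 3d de 00  (4 bytes: surrogate pair for U+1F600)

# The byte-order mark U+FEFF reveals which byte order a stream uses.
print("\ufeff".encode("utf-16-be").hex(" "))  # fe ff
print("\ufeff".encode("utf-16-le").hex(" "))  # ff fe
```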
UTF-8 encoding
• 1 to 4 bytes per code point
• 1 byte for U+0000…U+007F
   – Perfect compatibility with ASCII
• 2 bytes for U+0080…U+07FF
   – etc…
• Byte order mark allowed
   – But unnecessary, causes problems
• Dominant on web, email
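
The byte counts above can be confirmed by encoding one character from each range. A short sketch; `utf8_length` is a hypothetical helper name:

```python
def utf8_length(ch):
    """Number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

for ch in ["A", "é", "€", "😀"]:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {ch.encode('utf-8').hex(' ')}")

# Pure ASCII text is byte-for-byte identical in both encodings.
print("hello".encode("utf-8") == "hello".encode("ascii"))  # True
```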
UTF-8 encoding




UTF-8 advantages
• Natural compression for English
• English works in old tools/APIs
   – HTML tags unaffected
• No shared values between byte types
   – Easy to synchronize mid-stream
   – Easy to search by byte value
• No zero bytes (good for C)
• Byte-sorting = codepoint-sorting
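
Two of these properties are easy to demonstrate. The sketch below compares the two sort orders and resynchronizes after landing in the middle of a character; the word list and the helper `next_char_start` are made up for illustration:

```python
words = ["zebra", "Ärger", "apple", "日本", "😀"]
by_code_point = sorted(words)   # Python compares strings by code point
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
print(by_code_point == by_bytes)   # True

def next_char_start(data, pos):
    """Skip continuation bytes (10xxxxxx) to find where the next character begins."""
    while pos < len(data) and (data[pos] & 0b1100_0000) == 0b1000_0000:
        pos += 1
    return pos

data = "a€b".encode("utf-8")       # bytes: 61 e2 82 ac 62
print(next_char_start(data, 2))    # 4: started inside '€', resynchronized at 'b'
```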
Unicode on the web




Source: googleblog.blogspot.com
The original source code
function Check_Username(username)
  …
  if Username_Taken(username)…
    error="username is taken."
  …
  return error
end function


And now in Spanish…
function Check_Username(username)
  …
  if Username_Taken(username)…
    error="username se toma."
  …
  return error
end function


Internationalized
function Check_Username(username)
  …
  if Username_Taken(username)…
    error=Get_String("un-taken")
  …
  return error
end function


Internationalized
function Check_Username(username)
  …
  if Username_Taken(username)…
    error=Translate("username is taken")
  …
  return error
end function

IDs vs English strings

IDs                          English strings
More compact code            More explicit code
English can be changed       Enforces sync between languages
Less error-prone             Easier for third parties
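
The two lookup styles can be sketched side by side. Everything below is hypothetical: the catalogs, the Spanish wording, and the fallback behavior are assumptions for illustration, not the lecture's implementation.

```python
# Hypothetical Spanish catalogs, one per lookup style.
BY_ID = {"un-taken": "El nombre de usuario ya está en uso."}
BY_ENGLISH = {"username is taken.": "El nombre de usuario ya está en uso."}

# With IDs, the English text lives in its own catalog and can be edited freely.
ENGLISH_BY_ID = {"un-taken": "username is taken."}

def get_string(string_id, catalog=BY_ID):
    """ID style: a missing translation falls back to the English for that ID."""
    return catalog.get(string_id, ENGLISH_BY_ID[string_id])

def translate(english, catalog=BY_ENGLISH):
    """English-string style: a missing translation falls back to the key itself."""
    return catalog.get(english, english)

print(get_string("un-taken"))
print(translate("username is taken."))
print(translate("password is too short."))  # no entry: the English is returned
```

Note that in the English-string style, editing the English text changes the lookup key, so every translation has to be revisited; that is the "enforces sync" property in the comparison above.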

Concatenation is evil
          You will travel from London to Paris

print Translate("You will travel from ") + from_city + Translate(" to ") + to_city

               Usted viajará de London a Paris

       Sie werden von London nach Paris reisen

Substitutions
raw=Translate("You will travel from %from% to %to%")
raw=replace(raw, "%from%", from_city)
print replace(raw, "%to%", to_city)

You will travel from %from% to %to%
Usted viajará de %from% a %to%
Sie werden von %from% nach %to% reisen
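
The same idea in runnable form might look like the sketch below. The `TEMPLATES` table and `travel_message` function are hypothetical, and Python's `{name}` placeholders stand in for the `%name%` markers used above.

```python
# Whole-sentence templates per language; each translation places the
# placeholders wherever its own grammar requires.
TEMPLATES = {
    "en": "You will travel from {origin} to {destination}",
    "es": "Usted viajará de {origin} a {destination}",
    "de": "Sie werden von {origin} nach {destination} reisen",
}

def travel_message(lang, origin, destination):
    """Fill the named placeholders in the template for `lang`."""
    return TEMPLATES[lang].format(origin=origin, destination=destination)

print(travel_message("en", "London", "Paris"))
print(travel_message("de", "London", "Paris"))
```

Because the German verb comes after both city names, no concatenation of translated fragments could produce that sentence; a single template with named slots can.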


Singular/plural
  You have 3 credits left                     You have 1 credit left

if (credits is 1)
   c_string=translate("1 credit")
else
   c_string=replace(translate("%#% credits"), "%#%", credits)

raw=translate("You have %credits% left")
print replace(raw, "%credits%", c_string)
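
Python's standard `gettext` module offers `ngettext` for this case. A minimal sketch, assuming no translation catalog is installed, so the English forms are returned as-is; `credits_message` is a hypothetical helper:

```python
import gettext

def credits_message(credits):
    """Pick the singular or plural template, then substitute the number."""
    template = gettext.ngettext(
        "You have %(n)d credit left",    # singular form
        "You have %(n)d credits left",   # plural form
        credits,
    )
    return template % {"n": credits}

print(credits_message(1))  # You have 1 credit left
print(credits_message(3))  # You have 3 credits left
```

Keeping the whole sentence in one template, rather than splicing a translated fragment into another translated sentence, leaves the translator free to reorder words around the number.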

Text in images




Width in layouts
أشكركم على الدفع.
感谢您的付款。
Gracias por su pago.
אנו מודים לך על התשלום.
Спасибо за ваш платеж.
Thank you for your payment.
Vielen Dank für Ihre Bezahlung.
Σας ευχαριστούμε για την πληρωμή σας.
Nous vous remercions de votre paiement.
お支払いしていただきありがとうございます。
+57%!
LTR / RTL




Outsourcing translation
• Preparing code
• Collecting (English) assets
• Choosing a provider
• Costs and quotes
• Glossary
• Translation memory
• Independent review

Collecting assets
• Text files
   – Simple arrays or resource files
   – Standard formats, e.g. gettext, XLIFF
• HTML files
   – Risk of accidental markup changes
• Graphics files
   – Originals, not rendered
• Think about text expansion

Choosing a provider
• Problem: you can't assess quality
• Go by reputation and clients
   – Examples of previous work
• Ask who will actually do it
   – Native speaker of target language
   – Subject-specific experience
• Consider future language needs


Cost and quotes
Ibidem-translations.com

• Add 15-50% for specialized areas
• Clarify how words are counted
• Check for extra costs

Glossary
• Fixed translation for specific terms
   – Control over branding
   – Domain-specific terminology
   – Consistency
• Not-to-be-translated terms
• Requires thorough review of product



Glossary




                                                 Image from Google Translator Toolkit Help




Translation memory
• Lots of translation is repetitive
   – Same text in many places
   – Small changes between versions
• Same sentence = same translation
   – Save time and money
   – Help ensure consistency
   – But manual confirmation required
• Should be owned by you

Translation memory




                                                                        Image from kilgray.com screenshots
Machine translation




Numbers
1,234,567.89            –    Japan, UK, USA
1 234 567,89            –    France, Central Europe
1.234.567,89            –    Germany, Scandinavia
1'234'567.89            –    Switzerland
 123,4567.89            –    China
1'234,567.89            –    Mexico
12,34,567.89            –    India
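
Most of these conventions differ only in the separators and the group sizes, so one parameterized function can cover them. A minimal sketch for non-negative amounts; `format_number` and its parameters are hypothetical, and the mixed-separator Mexican style is not handled:

```python
from decimal import Decimal

def format_number(value, group_sep=",", decimal_sep=".", groups=(3,)):
    """Format a non-negative Decimal with two decimals.

    `groups` lists digit-group sizes from the right; the last size repeats.
    """
    whole, frac = f"{value:.2f}".split(".")
    parts = []
    i = 0
    while whole:
        size = groups[min(i, len(groups) - 1)]
        parts.append(whole[-size:])   # take the rightmost group
        whole = whole[:-size]
        i += 1
    return group_sep.join(reversed(parts)) + decimal_sep + frac

n = Decimal("1234567.89")
print(format_number(n))                                  # 1,234,567.89
print(format_number(n, group_sep=" ", decimal_sep=","))  # 1 234 567,89
print(format_number(n, group_sep=".", decimal_sep=","))  # 1.234.567,89
print(format_number(n, group_sep="'"))                   # 1'234'567.89
print(format_number(n, groups=(4,)))                     # 123,4567.89
print(format_number(n, groups=(3, 2)))                   # 12,34,567.89
```

In production code a locale database would normally supply these parameters rather than hand-written call sites.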

Date and Times
7/21/2012                                                     15:45
21/7/2012                                                     3.45 PM
21.7.2012                                                     3:45 pm
2012-07-21
7. 21. 2012
7-12-2012



Time zones




Map from wikipedia.org


Displaying times online
• Store times independent of zone
• Options for display
   – Ask the user for their time zone
   – Show an explicit time zone
   – Use “ago” notation
• JavaScript to get from browser
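
On the server side, the first two bullets might be sketched as follows: keep timestamps in UTC, then convert only for display or describe them relative to now. The `time_ago` helper and the fixed `+02:00` offset are illustrative assumptions; a real system would use a time zone database, and the English-only plural handling here would itself need localizing.

```python
from datetime import datetime, timedelta, timezone

def time_ago(then, now):
    """Describe how long before `now` the moment `then` was (both timezone-aware)."""
    seconds = int((now - then).total_seconds())
    if seconds < 60:
        return "just now"
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'' if count == 1 else 's'} ago"

stored = datetime(2012, 7, 21, 12, 45, tzinfo=timezone.utc)   # stored zone-independently
summer_paris = timezone(timedelta(hours=2), "CEST")           # fixed offset, illustration only

print(stored.astimezone(summer_paris).strftime("%H:%M %Z"))   # 14:45 CEST
print(time_ago(stored, stored + timedelta(hours=3)))          # 3 hours ago
print(time_ago(stored, stored + timedelta(minutes=1)))        # 1 minute ago
```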



Currencies
• Biggest traded currencies: $ € ¥ £
   – But there are almost 200
• How to display
   – Number formatting
   – Symbols: ₪ ₩ ฿ $
   – Currency codes: USD EUR JPY GBP CAD AUD
• Also: currency conversion
   – Live feed, e.g. from ECB

Names
• Surname can come first
   – China, Japan, Korea, Hungary
• Multiple surnames
   – José Santos Tavares Melo Silva
• Middle names/initials
• Double-barrelled names
   – Sarah-Jane Darlington-Whit
• No spaces in CJK

Names
Full Name:

What should we call you?

Family name:

Other/given names:

• Or localize based on language
• Do you need names at all?
   – Username or email can be enough

Addresses
     John Doe
     Acme, Inc
     Suite 3B-3824
     294 W Ronson
     Dallas TX 75211
     USA

     〒100-8994
     東京都中央区八重洲一丁目5番3号
     東京中央郵便局

     Tokyo Central Post Office
     1-5-3 Yaesu, Chuo-ku
     Tokyo 100-8994
     Japan

     John Smith
     Acme, Ltd
     Flat 384
     33 Walton Road
     Birmingham
     B26 3QJ
     UK

     C/Pescadoro, 13, 2°, 3ª
     28331 – Madrid
     Spain


Addresses
• Single multi-line field
• Change in response to country
• Generic format




Phone numbers
UK:                    +44 (0) 123-456-7890
France:                +33 1-23-45-67-89
China:                 +86 10-2345-6789
USA:                   +1 (123) 456-7890 x123

• Country selector
• Change in response to country
• Generic format

Indexing, sorting, searching
• Capitalization and accents
   – Øyvind matches oyvind?
• Collation (sort order)
   – Swedish: a b c … x y z å ä ö
   – French: cote côte coté côté
• CJK (ideographic languages)
   – No spaces between words
   – Sort based on stroke count
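
The accent question is less simple than it looks. A common approach is to decompose characters and drop the combining marks, sketched below with the standard `unicodedata` module; the `fold` helper is hypothetical. It handles French accents, but "Ø" has no Unicode decomposition, so the Øyvind case needs an explicit mapping or a collation library.

```python
import unicodedata

def fold(text):
    """Case-fold, decompose, and strip combining accents for loose matching."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold("Côté") == fold("cote"))      # True: the accents decompose and are dropped
print(fold("Øyvind") == fold("oyvind"))  # False: 'ø' does not decompose to 'o'
```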

Paper sizes



A4: 297 x 210 mm
US Letter: 279 x 216 mm
US Legal: 356 x 216 mm




Domain names
• Country-code top-level domains
   – .fr .de .uk .in .br .jp .cn
• Need separate registrar for many
• Some countries have restrictions
   – .com.au requires registered company
   – .ca requires nationality/residence
   – Also restricted: .fr .br .cn .ie .jp …
• Internationalized domain names

And there's more…
• Units of measurement
• Colors
• Images of people
• Calendars
• Holidays
• Border disputes
• Culture
• Law

Google in China
• 2005: Chinese language google.com
• 2006: google.cn under censorship
• 2009: China blocks YouTube
• 2010: Google claims hacking attack
   – Redirects google.cn to google.com.hk
   – China blocks it for a day
• Today: Baidu 79%, Google 17%
   – Baidu links to MP3/movie downloads

Getting real
• It's time consuming and costly
• Cheap wins in version 1.0
   – Parameterize + functionize
   – Use Unicode throughout
   – Flexible layouts
• See where there is demand
• Identify most important locales


Getting real
• Don't skimp on the details
   – Needs to look native
• Use serious service providers
• Prepare for tech support
   – Machine translation an option?
• It will slow development
   – So wait for product maturity


From Code to Product    Lecture 5 ā€” Localizationā€” Slide 74   gidgreen.com/course

More Related Content

Similar to Localization and Internationalization

8 innovation myths in china
8 innovation myths in china8 innovation myths in china
8 innovation myths in chinaIan Hou
Ā 
Agribusiness agriexport
Agribusiness  agriexportAgribusiness  agriexport
Agribusiness agriexportSankara Narayanan
Ā 
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com MIPIMWorld
Ā 
Mobile social network
Mobile social networkMobile social network
Mobile social networkdriver86
Ā 
Mobile social network
Mobile social networkMobile social network
Mobile social networkdriver86
Ā 
Selling Services To Brazil October 18, 2012
Selling Services To Brazil   October 18, 2012Selling Services To Brazil   October 18, 2012
Selling Services To Brazil October 18, 2012James Locke
Ā 

Similar to Localization and Internationalization (7)

8 innovation myths in china
8 innovation myths in china8 innovation myths in china
8 innovation myths in china
Ā 
Agribusiness agriexport
Agribusiness  agriexportAgribusiness  agriexport
Agribusiness agriexport
Ā 
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com
Turkey's REIT sector: an iceberg in Europe- Mete Varas, REIDIN.com
Ā 
2009 03 11 Apresentacao Teleconferencia 4 T09 Eng Final
2009 03 11 Apresentacao Teleconferencia 4 T09 Eng Final2009 03 11 Apresentacao Teleconferencia 4 T09 Eng Final
2009 03 11 Apresentacao Teleconferencia 4 T09 Eng Final
Ā 
Mobile social network
Mobile social networkMobile social network
Mobile social network
Ā 
Mobile social network
Mobile social networkMobile social network
Mobile social network
Ā 
Selling Services To Brazil October 18, 2012
Selling Services To Brazil   October 18, 2012Selling Services To Brazil   October 18, 2012
Selling Services To Brazil October 18, 2012
Ā 

More from gidgreen

The Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - CloudlookThe Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - Cloudlookgidgreen
Ā 
Localization and Internationalization 2013
Localization and Internationalization 2013Localization and Internationalization 2013
Localization and Internationalization 2013gidgreen
Ā 
Analytics and Optimization 2013
Analytics and Optimization 2013Analytics and Optimization 2013
Analytics and Optimization 2013gidgreen
Ā 
Web API Design 2013
Web API Design 2013Web API Design 2013
Web API Design 2013gidgreen
Ā 
Search Engine Visibility 2013
Search Engine Visibility 2013Search Engine Visibility 2013
Search Engine Visibility 2013gidgreen
Ā 
Marketing for Startups 2013
Marketing for Startups 2013Marketing for Startups 2013
Marketing for Startups 2013gidgreen
Ā 
Selling Advertising 2013
Selling Advertising 2013Selling Advertising 2013
Selling Advertising 2013gidgreen
Ā 
Selling Products and Services 2013
Selling Products and Services 2013Selling Products and Services 2013
Selling Products and Services 2013gidgreen
Ā 
User Interface Design 2013
User Interface Design 2013User Interface Design 2013
User Interface Design 2013gidgreen
Ā 
User Interface Principles 2013
User Interface Principles 2013User Interface Principles 2013
User Interface Principles 2013gidgreen
Ā 
The Software Entrepreneurship Process 2013
The Software Entrepreneurship Process 2013The Software Entrepreneurship Process 2013
The Software Entrepreneurship Process 2013gidgreen
Ā 
Introduction to Software Products and Startups 2013
Introduction to Software Products and Startups 2013Introduction to Software Products and Startups 2013
Introduction to Software Products and Startups 2013gidgreen
Ā 
Question2Answer - September 2012
Question2Answer - September 2012Question2Answer - September 2012
Question2Answer - September 2012gidgreen
Ā 
Search Engine Visibility
Search Engine VisibilitySearch Engine Visibility
Search Engine Visibilitygidgreen
Ā 
Marketing for Startups
Marketing for StartupsMarketing for Startups
Marketing for Startupsgidgreen
Ā 
Analytics and Optimization
Analytics and OptimizationAnalytics and Optimization
Analytics and Optimizationgidgreen
Ā 
Selling Products and Services
Selling Products and ServicesSelling Products and Services
Selling Products and Servicesgidgreen
Ā 
Advertising as a Business Model
Advertising as a Business ModelAdvertising as a Business Model
Advertising as a Business Modelgidgreen
Ā 
User Interface Design
User Interface DesignUser Interface Design
User Interface Designgidgreen
Ā 
User Interface Principles
User Interface PrinciplesUser Interface Principles
User Interface Principlesgidgreen
Ā 

More from gidgreen (20)

The Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - CloudlookThe Secret Guide to Cloud Performance - Cloudlook
The Secret Guide to Cloud Performance - Cloudlook
Ā 
Localization and Internationalization 2013
Localization and Internationalization 2013Localization and Internationalization 2013
Localization and Internationalization

  • 1. Lecture 5: Localization (From Code to Product, gidgreen.com/course)
  • 2. Getting it wrong
  • 3. Something we should know?
  • 4. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 5. Population (2011-2012, from Wikipedia)
      Country      Population  Share     Language     Speakers  Share
      China        1,347 M     19.3%     Mandarin     845 M     12.1%
      India        1,210 M     17.3%     Spanish      329 M      4.7%
      USA            313 M      4.5%     English      328 M      4.7%
      Indonesia      238 M      3.4%     Hindi-Urdu   240 M      3.4%
      Brazil         192 M      2.8%     Arabic       221 M      3.2%
      Pakistan       179 M      2.6%     Bengali      181 M      2.6%
      Nigeria        162 M      2.3%     Portuguese   178 M      2.5%
      Russia         143 M      2.0%     Russian      144 M      2.1%
      Bangladesh     142 M      2.0%     Japanese     122 M      1.7%
      Japan          128 M      1.8%     Punjabi      109 M      1.6%
  • 6. Economic weight (nominal; 2008, from globalization-group.com, IMF)
      Country      GDP       Share     Language     GDP       Share
      USA          $14.4 T   23.7%     English      $21.3 T   34.9%
      Japan         $4.9 T    8.1%     Chinese       $5.2 T    8.4%
      China         $4.3 T    7.1%     Japanese      $4.9 T    8.1%
      Germany       $3.7 T    6.0%     German        $4.4 T    7.2%
      France        $2.9 T    4.7%     Spanish       $4.2 T    6.8%
      UK            $2.7 T    4.4%     French        $4.0 T    6.5%
      Italy         $2.3 T    3.8%     Italian       $2.5 T    4.1%
      Russia        $1.7 T    2.8%     Russian       $2.2 T    3.7%
      Spain         $1.6 T    2.6%     Portuguese    $1.9 T    3.1%
      Brazil        $1.6 T    2.6%     Arabic        $1.9 T    3.1%
  • 7. Internet users (2011, from internetworldstats.com)
      Country      Users    Penetration     Language     Users    Penetration
      China        485 M    36%             English      565 M    43%
      USA          245 M    78%             Chinese      510 M    37%
      India        100 M     8%             Spanish      165 M    39%
      Japan         99 M    78%             Japanese      99 M    78%
      Brazil        76 M    37%             Portuguese    83 M    32%
      Germany       65 M    80%             German        75 M    80%
      Russia        60 M    43%             Arabic        65 M    19%
      UK            51 M    82%             French        60 M    17%
      France        45 M    70%             Russian       60 M    43%
      Nigeria       44 M    28%             Korean        39 M    55%
  • 8. Internet penetration
  • 9. E-commerce volumes (2009, from Everis). Pie chart; labels as extracted: USA, Japan, China, Germany, France, UK, Italy, Canada, Spain, South Korea, Other; values as extracted: $123B, $135B, $13B, $15B, $51B, $16B, $19B, $28B, $37B, $28B, $36B (the label-to-value pairing is not recoverable from the text)
  • 10. Multilingual countries: Canada (English 21M, French 8M); Switzerland (German 5.0M, French 1.6M, Italian 0.5M)
  • 11. Language variations: US vs UK English (color | colour; vacation | holiday; Where are you (at)?); European vs Brazilian Portuguese; French; Spanish
  • 12. Language codes (ISO-639-1)
      ar   Arabic       zh-CN  Chinese (simplified)
      fr   French       zh-TW  Chinese (traditional)
      nl   Dutch        en-GB  English (UK)
      de   German       en-US  English (US)
      he   Hebrew       pt-BR  Portuguese (Brazilian)
      it   Italian      pt-PT  Portuguese (Portugal)
      ja   Japanese     es-AR  Spanish (Argentina)
      pl   Polish       es-CL  Spanish (Chile)
      ru   Russian      es-MX  Spanish (Mexico)
      es   Spanish      es-ES  Spanish (Spain)
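One way to use the codes from slide 12 is to fall back from a regional variant to the bare language when no exact translation exists. The sketch below is only an illustration: the function name `resolve_locale`, the `AVAILABLE` set, and the three-step fallback order are assumptions, not something the lecture prescribes.

```python
def resolve_locale(requested, available):
    """Pick the closest available locale tag for a requested one, or None."""
    # Normalize: language in lowercase, region in uppercase (e.g. "pt-BR").
    parts = requested.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None

    # 1. Exact language-region match.
    if region and f"{language}-{region}" in available:
        return f"{language}-{region}"
    # 2. Bare language code.
    if language in available:
        return language
    # 3. Any regional variant of the same language (alphabetically first).
    for tag in sorted(available):
        if tag.split("-")[0] == language:
            return tag
    return None


# Hypothetical set of locales a product has been translated into.
AVAILABLE = {"en-US", "en-GB", "es", "es-MX", "pt-BR", "zh-CN"}

for wanted in ("es-mx", "es-AR", "pt-PT", "ja"):
    print(wanted, "->", resolve_locale(wanted, AVAILABLE))
```

Step 3 is a judgment call: serving Brazilian Portuguese to a user in Portugal may or may not be better than falling back to English.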
  • 13. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 14. Computer representation: the byte 0 1 0 0 0 0 0 1 is hex 41 (range 00 … FF) and decimal 65 (range 0 … 255), which stands for the character A, one of .,/?;:'!%abcdefghijklmnopqrstuvwxyz… …BCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
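The three notations on slide 14 can be checked with a few lines of Python. The helper name `representations` is made up for this sketch.

```python
def representations(ch):
    """Return the binary, hexadecimal and decimal forms of a character's code."""
    code = ord(ch)
    return format(code, "08b"), format(code, "02X"), code


print(representations("A"))  # ('01000001', '41', 65)
```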
  • 15. US-ASCII (image from czyborra.com)
  • 16. ISO-8859-1
  • 17. Windows-1252
  • 18. ISO-8859-5
  • 19. ISO-8859-8
  • 20. Problems with character sets: Extra metadata; Potential for misdisplay; Mutually exclusive; Little space to grow (e.g. €); Ideographic languages (70,000+ Chinese characters; multibyte encoding)
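The misdisplay problem on slide 20 is easy to reproduce: the same byte reads as a different letter under each of the character sets from slides 16 to 19. This is a small sketch using Python's built-in codecs; the helper name `decode_everywhere` is an assumption for illustration.

```python
def decode_everywhere(raw, charsets=("iso8859_1", "iso8859_5", "iso8859_8")):
    """Decode the same bytes under several single-byte character sets."""
    return {name: raw.decode(name) for name in charsets}


# One byte, three different letters (Latin, Cyrillic, Hebrew).
readings = decode_everywhere(b"\xe0")
for name, text in readings.items():
    print(name, text, f"U+{ord(text):04X}")

# "Little space to grow": Windows-1252 has a slot for the euro sign,
# ISO-8859-1 does not.
euro_cp1252 = "€".encode("cp1252")
try:
    "€".encode("iso8859_1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
print(euro_cp1252, euro_in_latin1)

# Ideographic languages need multibyte encodings, e.g. GB2312 for Chinese.
print(len("中".encode("gb2312")), "bytes for one Chinese character")
```

Without the "extra metadata" naming the character set, a receiver has no reliable way to tell which of the three readings was intended.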
  • 21. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 22. The Unicode solution: One global character set (over 110,000 characters; over 100 alphabets); 1,114,112 code points (0…255 compatible with ISO-8859-1; U+0041 = A); Multiple encodings
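The figures on slide 22 can be verified directly. The code point count follows from the range U+0000 to U+10FFFF (17 planes of 65,536), and the ISO-8859-1 compatibility can be tested byte by byte. The variable names below are illustrative only.

```python
MAX_CODE_POINT = 0x10FFFF

# U+0000 through U+10FFFF inclusive: 17 planes of 65,536 code points each.
total_code_points = MAX_CODE_POINT + 1
planes = total_code_points // 0x10000
print(total_code_points, planes)

# Code points 0..255 match ISO-8859-1 byte for byte.
latin1_compatible = all(
    bytes([value]).decode("iso8859_1") == chr(value) for value in range(256)
)
print(latin1_compatible)

# The U+ notation is the code point written in hexadecimal.
print(f"U+{ord('A'):04X}")
```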
  • 23. U+0000 … U+007F
  • 24. U+0080 … U+00FF
  • 25. U+0400 … U+047F
  • 26. U+0590 … U+060F
  • 27. U+4E00 … U+4E7F
  • 28. U+2190 … U+220F
  • 29. U+2600 … U+267F
  • 30. UTF-16 encoding: 2 or 4 bytes per code point; Simple for U+0000…D7FF and E000…FFFF ("Basic Multilingual Plane"); Higher code points use 4 bytes; U+FEFF = byte-order mark (no well-followed default); Windows APIs since Windows 2000 (also .NET, Android, iOS, Mac OS X)
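The "2 or 4 bytes" rule on slide 30 comes from surrogate pairs: a code point above U+FFFF is split into two 16-bit units. The sketch below writes that arithmetic out and compares it with Python's own `utf-16-be` codec. The function names are made up for this example.

```python
def utf16_code_units(code_point):
    """Split a code point into one or two 16-bit UTF-16 code units."""
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not valid code points to encode")
    if code_point < 0x10000:
        return [code_point]  # Basic Multilingual Plane: one unit, 2 bytes
    offset = code_point - 0x10000  # 20 bits remain
    high = 0xD800 + (offset >> 10)  # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return [high, low]  # two units, 4 bytes


def units_via_codec(code_point):
    """Reference result from Python's built-in big-endian UTF-16 codec."""
    raw = chr(code_point).encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


for cp in (0x41, 0x20AC, 0x4E2D, 0x1F600):
    units = utf16_code_units(cp)
    print(f"U+{cp:04X}", [f"{u:04X}" for u in units], units == units_via_codec(cp))

# The byte-order mark U+FEFF reveals which byte order a stream uses.
print("\ufeff".encode("utf-16-le"), "\ufeff".encode("utf-16-be"))
```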
  • 31. UTF-8 encoding: 1 to 4 bytes per code point (the original design allowed up to 6; RFC 3629 caps it at 4); 1 byte for U+0000…007F (perfect compatibility with ASCII); 2 bytes for U+0080…07FF (etc…); Byte order mark allowed (but unnecessary, causes problems); Dominant on web, email
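The byte-length rules on slide 31 can be written out as bit operations. This is a minimal sketch, not production code: the name `utf8_encode` is an assumption, it covers the four lengths needed for code points up to U+10FFFF, and it does not reject surrogate values.

```python
def utf8_encode(code_point):
    """Encode one code point as UTF-8 bytes using the standard bit layout."""
    if code_point < 0x80:  # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:  # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:  # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


for ch in ("A", "é", "€", "\U0001F600"):
    encoded = utf8_encode(ord(ch))
    # Compare against Python's built-in encoder.
    print(f"U+{ord(ch):04X}", encoded.hex(" "), encoded == ch.encode("utf-8"))
```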
  • 32. UTF-8 encoding
  • 33. UTF-8 advantages: Natural compression for English; English works in old tools/APIs (HTML tags unaffected); No shared values between byte types (easy to synchronize mid-stream; easy to search by byte value); No zero bytes (good for C); Byte-sorting = codepoint-sorting
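Several of the properties on slide 33 can be observed directly on encoded bytes. The sample strings and the helper `next_boundary` below are assumptions chosen for the demonstration.

```python
def next_boundary(data, index):
    """Advance from an arbitrary byte offset to the start of the next character.

    Continuation bytes always look like 10xxxxxx, so they can be skipped.
    """
    while index < len(data) and (data[index] & 0xC0) == 0x80:
        index += 1
    return index


# English and HTML markup are byte-identical to ASCII.
ascii_compatible = "<b>Hello</b>".encode("utf-8") == "<b>Hello</b>".encode("ascii")

# No zero bytes appear unless the text itself contains U+0000.
data = "naïve €5".encode("utf-8")
has_zero_byte = 0 in data

# Synchronizing mid-stream: offset 8 falls inside the 3-byte euro sign.
resumed = data[next_boundary(data, 8):].decode("utf-8")

# Sorting by raw bytes gives the same order as sorting by code point.
words = ["zebra", "côte", "Ærø", "日本", "apple", "\U0001F600"]
same_order = sorted(words) == sorted(words, key=lambda w: w.encode("utf-8"))

print(ascii_compatible, has_zero_byte, repr(resumed), same_order)
```

Sorting by code point is not the same as sorting the way a human reader expects; that is a collation question and depends on the language.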
  • 34. Unicode on the web (source: googleblog.blogspot.com)
  • 35. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 36. The original source code
      function Check_Username(username)
        …
        if Username_Taken(username) …
          error="username is taken."
        …
        return error
      end function
  • 37. And now in Spanish…
      function Check_Username(username)
        …
        if Username_Taken(username) …
          error="username se toma."
        …
        return error
      end function
  • 38. Internationalized (string IDs)
      function Check_Username(username)
        …
        if Username_Taken(username) …
          error=Get_String("un-taken")
        …
        return error
      end function
  • 39. Internationalized (English strings)
      function Check_Username(username)
        …
        if Username_Taken(username) …
          error=Translate("username is taken")
        …
        return error
      end function
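The two lookup styles from slides 38 and 39 might be backed by tables like the ones below. This is a sketch under stated assumptions: the table layout, the `lang` parameter, and the fallback rules are illustrative and not taken from the lecture. The Spanish text is the one shown on slide 37.

```python
# Style 1: opaque string IDs (slide 38). Every language needs an entry per ID.
STRINGS_BY_ID = {
    "en": {"un-taken": "username is taken."},
    "es": {"un-taken": "username se toma."},
}


def get_string(string_id, lang):
    """Look up a string by ID, falling back to the English table."""
    table = STRINGS_BY_ID.get(lang, {})
    return table.get(string_id, STRINGS_BY_ID["en"][string_id])


# Style 2: the English text is the key (slide 39). English needs no table.
TRANSLATIONS = {
    "es": {"username is taken": "username se toma"},
}


def translate(text, lang):
    """Translate an English string, returning it unchanged if nothing is found."""
    return TRANSLATIONS.get(lang, {}).get(text, text)


print(get_string("un-taken", "es"))
print(translate("username is taken", "es"))
print(translate("username is taken", "fr"))  # no French table: English shows
```

Python's standard `gettext` module follows the second style, using the source-language string as the lookup key.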
  • 40. IDs vs English strings. IDs: more compact code. English strings: more explicit code. Other points compared on the slide: enforces sync between languages; English can be changed; less error-prone; easier for third parties
  • 41. Concatenation is evil
      Target: You will travel from London to Paris
      print Translate("You will travel from ") + from_city + Translate(" to ") + to_city
      Spanish: Usted viajará de London a Paris
      German: Sie werden von London nach Paris reisen
  • 42. Substitutions
      raw=Translate("You will travel from %from% to %to%")
      raw=replace(raw, "%from%", from_city)
      print replace(raw, "%to%", to_city)
      English: You will travel from %from% to %to%
      Spanish: Usted viajará de %from% a %to%
      German: Sie werden von %from% nach %to% reisen
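The substitution approach on slide 42 can be sketched in Python as follows. The `TEMPLATES` table and the `render` helper are assumptions for illustration. The German template is written with "Sie werden", the formal "you will".

```python
# Whole-sentence templates: each translator decides where the placeholders go.
TEMPLATES = {
    "en": "You will travel from %from% to %to%",
    "es": "Usted viajará de %from% a %to%",
    "de": "Sie werden von %from% nach %to% reisen",
}


def render(lang, values):
    """Fill the %name% placeholders of the template for the given language."""
    text = TEMPLATES[lang]
    for name, value in values.items():
        text = text.replace(f"%{name}%", value)
    return text


trip = {"from": "London", "to": "Paris"}
for lang in TEMPLATES:
    print(render(lang, trip))
```

The German sentence ends with the verb after the second city, which is exactly what the concatenation on slide 41 cannot express.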
  • 43. Singular/plural
      Targets: You have 3 credits left / You have 1 credit left
      if (credits is 1)
        c_string=translate("1 credit")
      else
        c_string=replace(translate("%#% credits"), "%#%", credits)
      raw=translate("You have %credits% left")
      print replace(raw, "%credits%", c_string)
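The one-versus-many test on slide 43 works for English, but some languages have more than two plural forms. The sketch below picks a whole-sentence template by plural category. The function names, the category labels, and the Russian sentences are illustrative assumptions, and the Russian rule covers whole numbers only.

```python
def plural_category(lang, n):
    """Return the plural category of a whole number for a language."""
    if lang == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return "one"  # 1, 21, 31, ...
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"  # 2-4, 22-24, ...
        return "many"  # 0, 5-20, 25-30, ...
    return "one" if n == 1 else "other"  # English-style rule


# One full sentence per category, so translators control the whole phrase.
MESSAGES = {
    "en": {
        "one": "You have %n% credit left",
        "other": "You have %n% credits left",
    },
    "ru": {
        "one": "У вас остался %n% кредит",
        "few": "У вас осталось %n% кредита",
        "many": "У вас осталось %n% кредитов",
    },
}


def credits_message(lang, credits):
    """Build the 'credits left' sentence for a language and a count."""
    template = MESSAGES[lang][plural_category(lang, credits)]
    return template.replace("%n%", str(credits))


for count in (1, 3, 11, 21):
    print(credits_message("en", count), "|", credits_message("ru", count))
```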
  • 44. Text in images
  • 45. Width in layouts (annotation on slide: +57%!)
      أشكركم على الدفع.
      感谢您的付款。
      Gracias por su pago.
      אנו מודים לך על התשלום.
      Спасибо за ваш платеж.
      Thank you for your payment.
      Vielen Dank für Ihre Bezahlung.
      Σας ευχαριστούμε για την πληρωμή σας.
      Nous vous remercions de votre paiement.
      お支払いしていただきありがとうございます。
  • 46. LTR / RTL
  • 47. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 48. Outsourcing translation: Preparing code; Collecting (English) assets; Choosing a provider; Costs and quotes; Glossary; Translation memory; Independent review
  • 49. Collecting assets: Text files (simple arrays or resource files; standard formats, e.g. gettext, XLIFF); HTML files (risk of accidental markup changes); Graphics files (originals, not rendered); Think about text expansion
  • 50. Choosing a provider: Problem: you can't assess quality; Go by reputation and clients (examples of previous work); Ask who will actually do it (native speaker of target language; subject-specific experience); Consider future language needs
  • 51. Cost and quotes (price table from Ibidem-translations.com): Add 15-50% for specialized areas; Clarify how words are counted; Check for extra costs
  • 52. Glossary: Fixed translation for specific terms (control over branding; domain-specific terminology; consistency); Not-to-be-translated terms; Requires thorough review of product
  • 53. Glossary (image from Google Translator Toolkit Help)
  • 54. Translation memory: Lots of translation is repetitive (same text in many places; small changes between versions); Same sentence = same translation (save time and money; help ensure consistency; but manual confirmation required); Should be owned by you
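The "same sentence = same translation" idea on slide 54 amounts to an exact-match lookup before any text is sent out for translation. The class and function names below are hypothetical, and real translation memory tools also offer fuzzy matching, which this sketch omits.

```python
class TranslationMemory:
    """Exact-match store of previously approved sentence translations."""

    def __init__(self):
        self._segments = {}

    def add(self, source, target):
        self._segments[source.strip()] = target

    def lookup(self, source):
        return self._segments.get(source.strip())


def pretranslate(sentences, memory):
    """Split sentences into reused translations and text still to translate."""
    reused, pending = [], []
    for sentence in sentences:
        target = memory.lookup(sentence)
        if target is not None:
            # Reused matches still need manual confirmation in context.
            reused.append((sentence, target))
        elif sentence not in pending:
            pending.append(sentence)  # count repeated new text only once
    return reused, pending


memory = TranslationMemory()
memory.add("Thank you for your payment.", "Gracias por su pago.")

new_version = [
    "Thank you for your payment.",
    "Your username is taken.",
    "Thank you for your payment.",
]
reused, pending = pretranslate(new_version, memory)
words_to_quote = sum(len(sentence.split()) for sentence in pending)
print(len(reused), pending, words_to_quote)
```

Keeping the memory in a plain structure like this is one way to make sure it stays owned by you rather than by the provider.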
  • 55. Translation memory (image from kilgray.com screenshots)
  • 56. Machine translation
  • 57. Lecture 5: Countries and languages; Character sets; Unicode; Text localization; Outsourcing translation; Other localization
  • 58. Numbers
      1,234,567.89   Japan, UK, USA
      1 234 567,89   France, Central Europe
      1.234.567,89   Germany, Scandinavia
      1'234'567.89   Switzerland
      123,4567.89    China
      1'234,567.89   Mexico
      12,34,567.89   India
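Most of the formats on slide 58 differ in only three things: the grouping separator, the decimal separator, and the group sizes. The sketch below parameterizes those. The function name and parameters are assumptions, it handles non-negative values only, and it does not cover the mixed separators shown for Mexico.

```python
def format_number(value, group_sep, decimal_sep,
                  first_group=3, other_groups=3, decimals=2):
    """Format a non-negative number with configurable digit grouping."""
    whole, fraction = f"{value:.{decimals}f}".split(".")
    # The rightmost group may have a different size (e.g. India: 3, then 2s).
    groups = [whole[-first_group:]]
    rest = whole[:-first_group]
    while rest:
        groups.insert(0, rest[-other_groups:])
        rest = rest[:-other_groups]
    return group_sep.join(groups) + decimal_sep + fraction


amount = 1234567.89
print(format_number(amount, ",", "."))         # 1,234,567.89
print(format_number(amount, " ", ","))         # 1 234 567,89
print(format_number(amount, ".", ","))         # 1.234.567,89
print(format_number(amount, "'", "."))         # 1'234'567.89
print(format_number(amount, ",", ".", 4, 4))   # 123,4567.89
print(format_number(amount, ",", ".", 3, 2))   # 12,34,567.89
```

In practice a locale database is usually a better source for these settings than a hand-written table. Python's `locale` module can provide them, but only for locales installed on the host system.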
  • 59. Dates and times: dates written as 7/21/2012, 21/7/2012, 21.7.2012, 2012-07-12, 7. 21. 2012, 7-12-2012; times written as 15:45, 3.45 PM, 3:45 pm
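The variety on slide 59 means one stored moment has to be rendered through per-locale patterns. The sketch below formats a single `datetime` in several of the styles shown. The pattern names are made up for this example, and the slide does not say which country uses which pattern.

```python
from datetime import datetime


def twelve_hour(moment, separator=":", upper=False):
    """Render a time on the 12-hour clock, e.g. 3:45 pm."""
    hour = moment.hour % 12 or 12
    suffix = "pm" if moment.hour >= 12 else "am"
    if upper:
        suffix = suffix.upper()
    return f"{hour}{separator}{moment.minute:02d} {suffix}"


# Each formatter takes a datetime and returns a display string.
DATE_PATTERNS = {
    "month/day/year": lambda d: f"{d.month}/{d.day}/{d.year}",
    "day/month/year": lambda d: f"{d.day}/{d.month}/{d.year}",
    "day.month.year": lambda d: f"{d.day}.{d.month}.{d.year}",
    "iso-8601": lambda d: d.strftime("%Y-%m-%d"),
}

moment = datetime(2012, 7, 21, 15, 45)
for name, pattern in DATE_PATTERNS.items():
    print(name, pattern(moment))
print(moment.strftime("%H:%M"), twelve_hour(moment), twelve_hour(moment, ".", True))
```

Only the ISO 8601 form is unambiguous on its own. A date such as 7-12-2012 cannot be read reliably without knowing the writer's convention.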
  • 60. Time zones (map from wikipedia.org)
  • 61. Displaying times online: Store times independent of zone; Options for display (ask the user for their time zone; show an explicit time zone; use "ago" notation); Javascript to get from browser
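The options on slide 61 can be sketched as follows: keep the stored value in UTC, then either convert it for a known user offset or sidestep zones with "ago" notation. The function names are assumptions, and a fixed hour offset ignores daylight saving time. Named zones, for example via Python's `zoneinfo` module, would handle that.

```python
from datetime import datetime, timedelta, timezone


def to_user_zone(utc_moment, offset_hours):
    """Convert a stored UTC time to a user's fixed UTC offset."""
    return utc_moment.astimezone(timezone(timedelta(hours=offset_hours)))


def ago(then, now):
    """Describe an elapsed interval without mentioning any time zone."""
    seconds = int((now - then).total_seconds())
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        count = seconds // size
        if count >= 1:
            return f"{count} {unit}{'' if count == 1 else 's'} ago"
    return "just now"


# Stored independent of zone: always UTC.
stored = datetime(2012, 7, 21, 12, 45, tzinfo=timezone.utc)

# Option: show an explicit time zone for a user at UTC+3.
print(to_user_zone(stored, 3).isoformat())

# Option: "ago" notation.
print(ago(stored, stored + timedelta(hours=2)))
```

The English strings returned by `ago` would themselves need translation and plural handling in a localized product.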
  • 62. Currencies: Biggest traded currencies: $ € ¥ £ (but there are almost 200); How to display (number formatting; symbols: ₪ ₩ ฿ $; currency codes: USD EUR JPY GBP CAD AUD); Also: currency conversion (live feed, e.g. from ECB)
  • 63. Names: Surname can come first (China, Japan, Korea, Hungary); Multiple surnames (José Santos Tavares Melo Silva); Middle names/initials; Double-barrelled names (Sarah-Jane Darlington-Whit); No spaces in CJK
  • 64. Names: form field labels shown: "Full Name:", "What should we call you?", "Family name:", "Other/given names:"; Or localize based on language; Do you need names at all? (username or email can be enough)
  • 65. Addresses
      USA: John Doe / Acme, Inc / Suite 3B-3824 / 294 W Ronson / Dallas TX 75211 / USA
      Japan: 〒100-8994 / 東京都中央区八重洲一丁目5番3号 / 東京中央郵便局
      Japan (romanized): Tokyo Central Post Office / 1-5-3 Yaesu, Chuo-ku / Tokyo 100-8994 / Japan
      UK: John Smith / Acme, Ltd / Flat 384 / 33 Walton Road / Birmingham / B26 3QJ / UK
      Spain: C/Pescadoro, 13, 2°, 3ª / 28331 – Madrid / Spain
  • 66. Addresses: Single multi-line field; Change in response to country; Generic format
  • 67. Phone numbers: UK: +44 (0) 123-456-7890; France: +33 1-23-45-67-89; China: +86 10-2345-6789; USA: +1 (123) 456-7890 x123. Options: Country selector; Change in response to country; Generic format
  • 68. Indexing, sorting, searching: Capitalization and accents (Øyvind matches oyvind?); Collation (sort order) (Swedish: a b c … x y z å ä ö; French: cote côte coté côté); CJK (ideographic languages) (no spaces between words; sort based on stroke count)
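Both questions on slide 68 can be explored with the standard library. The sketch below folds case and accents for matching, and sorts with a Swedish alphabet order. The helper names, the extra `ø` mapping, and the simplified alphabet are assumptions for illustration, and a full collation library would handle far more cases.

```python
import unicodedata


def fold(text, extra=None):
    """Reduce text to a case- and accent-insensitive form for matching."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks such as the circumflex in ô or the acute in é.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    folded = stripped.casefold()
    if extra:
        folded = folded.translate(str.maketrans(extra))
    return folded


# Accents that decompose are removed by normalization alone.
print(fold("Côté"))
# Ø has no decomposition in Unicode, so it needs an explicit mapping.
print(fold("Øyvind"), fold("Øyvind", {"ø": "o"}))

# Collation: Swedish places å ä ö after z, in that order.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {letter: rank for rank, letter in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word):
    """Sort key ranking letters by their position in the Swedish alphabet."""
    return [SWEDISH_RANK.get(c, len(SWEDISH_ALPHABET)) for c in word.lower()]


letters = ["ö", "z", "a", "å", "ä"]
print(sorted(letters))                   # code point order
print(sorted(letters, key=swedish_key))  # Swedish order
```

Whether Øyvind should match oyvind is a product decision as much as a technical one, which is why the mapping is left as an optional argument here.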
  • 69. Paper sizes: A4 297 x 210 mm; US Letter 279 x 216 mm; US Legal 356 x 216 mm
  • 70. Domain names: Country-code top-level domains (.fr .de .uk .in .br .jp .cn); Need separate registrar for many; Some countries have restrictions (.com.au requires registered company; .ca requires nationality/residence; also restricted: .fr .br .cn .ie .jp …); Internationalized domain names
  • 71. And there's more…: Units of measurement; Colors; Images of people; Calendars; Holidays; Border disputes; Culture; Law
  • 72. Google in China: 2005: Chinese-language google.com; 2006: google.cn under censorship; 2009: China blocks YouTube; 2010: Google claims hacking attack (redirects google.cn to google.com.hk; China blocks it for a day); Today: Baidu 79%, Google 17% (Baidu links to MP3/movie downloads)
  • 73. Getting real: It's time-consuming and costly; Cheap wins in version 1.0 (parameterize + functionize; use Unicode throughout; flexible layouts); See where there is demand; Identify most important locales
  • 74. Getting real: Don't skimp on the details (needs to look native); Use serious service providers; Prepare for tech support (machine translation an option?); It will slow development (so wait for product maturity)