Adam Goucher   I18n And L10n



Slides from my recent presentation on I18N and L10N at GLSEC 2007





Usage Rights

© All Rights Reserved


Presentation Transcript

  • I18N & L10N: a technical primer
    • Adam Goucher, Senior Quality Specialist, Jonah Group
  • Definitions
    • Internationalization  I + 18 chars + N  I18N
      • Your application can accept, store, manipulate, retrieve and display text in the user’s native language
    • Localization  L + 10 chars + N  L10N
      • Your application looks as if it was designed for the locale it is being used in
  • The Problem
    • English is the native language of only ~30% of the Internet’s population.
    • To avoid alienating the other ~70% of your potential customers, you need to worry about I18N and L10N.
  • Don’t worry
    • I18N and L10N are technical problems, not linguistic ones.
    • Programmers and testers know how to solve technical problems.
    • Translation is the linguistic problem.
    • Translators know how to solve linguistic problems.
  • Unicode
    • Unicode [1] provides a unique number:
      • for every character
      • no matter what the platform
      • no matter what the program
      • no matter what the language
    • There are a number of ways (called encodings) to represent a Unicode code point (roughly, a single character) as bytes
      • UTF-8 [2] is a variable-length encoding built from 8-bit units (1 to 4 bytes per code point)
      • UTF-8 is the de facto standard
    • [1]
    • [2]
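To make the variable-length point concrete, here is a minimal Python sketch that reports the code point and the UTF-8 byte length of a few sample characters. The sample characters and the helper name `describe` are chosen for illustration only and are not from the slides.

```python
# Each sample is one Unicode code point, written as an escape so the
# source file itself stays ASCII.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji


def describe(ch):
    """Return (code point in U+XXXX notation, number of UTF-8 bytes)."""
    return "U+%04X" % ord(ch), len(ch.encode("utf-8"))


for ch in SAMPLES:
    code_point, size = describe(ch)
    # ascii() keeps the output printable on consoles that are not UTF-8.
    print(ascii(ch), code_point, size, ch.encode("utf-8").hex(" "))
```

The same code point always maps to the same byte sequence, which is what lets text written on one platform be read back unchanged on another.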
  • Resource Bundles
    • One of the harder things to get right is handling all the string data embedded in your source code.
    • The easiest solution is to use resource bundles (locale-specific collections of string data)
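As a rough illustration of the idea, the sketch below models a resource bundle as one dictionary of key/string pairs per locale, with lookup falling back from a specific locale to its parent and then to a default. The bundle contents, the names `BUNDLES`, `DEFAULT_LOCALE` and `get_string`, and the fallback order are assumptions made for this example; real frameworks load bundles from files and define their own lookup rules.

```python
# Hypothetical bundles: one dict of key -> display string per locale.
BUNDLES = {
    "en": {"greeting": "Hello!", "farewell": "Goodbye!"},
    "fr": {"greeting": "Bonjour !", "farewell": "Au revoir !"},
    "fr_CA": {"greeting": "Salut !"},
}
DEFAULT_LOCALE = "en"


def candidate_locales(locale):
    """Yield the locale, then its parents, then the default: fr_CA, fr, en."""
    parts = locale.split("_")
    while parts:
        yield "_".join(parts)
        parts.pop()
    yield DEFAULT_LOCALE


def get_string(key, locale):
    """Return the string for key from the most specific bundle that has it."""
    for name in candidate_locales(locale):
        bundle = BUNDLES.get(name, {})
        if key in bundle:
            return bundle[key]
    raise KeyError("%r is missing from every bundle for %r" % (key, locale))


print(get_string("greeting", "fr_CA"))  # found in fr_CA
print(get_string("farewell", "fr_CA"))  # falls back to fr
print(get_string("greeting", "de"))     # falls back to the default, en
```

The code only ever refers to keys such as `greeting`; the text a user sees lives entirely in the bundles, so adding a locale means adding data, not changing code.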
  • String Rules
    • Like most tools, resource bundles can make your life difficult if used incorrectly.
    • Do not build display strings by concatenating fragments. Concatenation removes context and fixes the word order, which makes translation harder
    • Include all punctuation in the bundle content; otherwise the translated words can be correct while the punctuation is wrong for the locale
    • Include formatting in bundle content
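The concatenation rule can be sketched as follows. The messages, the German translation, and the names `TEMPLATES`, `concatenated` and `from_bundle` are illustrative assumptions, and plural handling is deliberately ignored to keep the example short.

```python
# Hypothetical bundle entries: each is a whole sentence, punctuation
# included, with named placeholders the translator is free to reorder.
TEMPLATES = {
    "en": {"upload_summary": "{user} uploaded {count} files."},
    "de": {"upload_summary": "{user} hat {count} Dateien hochgeladen."},
}


def concatenated(user, count):
    # Anti-pattern: a translator would only see the fragments " uploaded "
    # and " files", and the English word order is fixed by the code.
    return user + " uploaded " + str(count) + " files" + "."


def from_bundle(locale, user, count):
    # Preferred: the translator sees the full sentence and decides where
    # the placeholders and the punctuation go.
    return TEMPLATES[locale]["upload_summary"].format(user=user, count=count)


print(concatenated("Ana", 3))
print(from_bundle("en", "Ana", 3))
print(from_bundle("de", "Ana", 3))
```

In the German entry the verb is split around the placeholders, something the concatenated version cannot express without changing code for every locale.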
  • Resource Bundle Tests
    • Use LOUD [3] to check that the string rules are followed
    • Resource key in bundle, but not in code
    • Resource key in code, but not in bundle
    • Key present in some locales but missing from others
    • [3]
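The three key checks listed above could be automated along these lines. This is a sketch under stated assumptions: bundles are plain dictionaries, code looks up strings through a hypothetical `get_string("key", ...)` call that a regular expression can find, and `en` is treated as the reference locale. It does not implement LOUD itself.

```python
import re

# Hypothetical sample data for the sketch.
SAMPLE_BUNDLES = {
    "en": {"greeting": "Hello!", "farewell": "Goodbye!"},
    "fr": {"greeting": "Bonjour !", "tagline": "Le slogan"},
}
SAMPLE_SOURCE = (
    'title = get_string("greeting", locale)\n'
    'footer = get_string("copyright", locale)\n'
)


def keys_used_in_code(source):
    """Collect keys passed as string literals to get_string() calls."""
    return set(re.findall(r'get_string\(\s*"([^"]+)"', source))


def audit_bundles(bundles, source, reference="en"):
    """Compare keys used in code, keys in the reference bundle, and keys per locale."""
    used = keys_used_in_code(source)
    defined = set(bundles[reference])
    report = {
        "in_bundle_not_code": sorted(defined - used),
        "in_code_not_bundle": sorted(used - defined),
        "missing_by_locale": {},
        "extra_by_locale": {},
    }
    for locale, bundle in bundles.items():
        if locale == reference:
            continue
        keys = set(bundle)
        report["missing_by_locale"][locale] = sorted(defined - keys)
        report["extra_by_locale"][locale] = sorted(keys - defined)
    return report


print(audit_bundles(SAMPLE_BUNDLES, SAMPLE_SOURCE))
```

A check like this is cheap enough to run on every build, failing the build whenever any of the four lists is non-empty.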
  • Other areas
    • I18N and L10N are a huge topic. Some of the areas not discussed here:
    • Date / Time
    • Numbers
    • Currency
    • Username / Password conventions
    • Postal / Zip Codes
    • Paper size (when printing)
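To give a flavour of why even the "Numbers" item needs attention, the sketch below formats the same value with the grouping and decimal separators of two locales. The `SEPARATORS` table and `format_number` helper are hand-written for this example; a real application would normally rely on locale data from a library rather than maintain such a table.

```python
# Hypothetical, hand-written separator table: (grouping, decimal).
SEPARATORS = {
    "en_US": (",", "."),
    "de_DE": (".", ","),
}


def format_number(value, locale, places=2):
    """Format value with the grouping and decimal separators of locale."""
    group, decimal = SEPARATORS[locale]
    text = "{:,.{p}f}".format(value, p=places)  # e.g. "1,234,567.89"
    # Swap separators through a placeholder so the two replacements
    # cannot interfere with each other.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", group)


print(format_number(1234567.891, "en_US"))
print(format_number(1234567.891, "de_DE"))
```

The same digits read as a very different amount if a German-formatted number is parsed with US rules, which is why parsing needs the same locale awareness as display.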
  • Testing Advice
    • Test your application’s I18N and L10N early to avoid having to re-test everything.
    • Include as many checks as possible during the build process
    • Beta test translations with friendly customers
  • Summary
    • This is a technical problem, not a linguistic one
    • Use UTF-8 everywhere you can
    • Use resource bundles instead of putting literal strings in the code
    • Learn about the nuances of your target locales
    • Test early