Adam Goucher I18n And L10n

Slides from my recent presentation on I18N and L10N at GLSEC 2007

Published in: Technology

  1. I18N & L10N: a technical primer
     Adam Goucher, Senior Quality Specialist, Jonah Group
     http://www.jonahgroup.com
     http://adam.goucher.ca
  2. Definitions
       • Internationalization → I + 18 chars + N → I18N
           • Your application can accept, store, manipulate, retrieve and display text in the user’s native language
       • Localization → L + 10 chars + N → L10N
           • Your application looks as if it was designed for the locale it is being used in
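The abbreviation rule on this slide (first letter, number of letters in between, last letter) can be sketched in a few lines. The function name `numeronym` is made up for this illustration and is not from the slides.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of inner letters + last letter."""
    if len(word) <= 2:
        # Nothing to count between the first and last letter.
        return word.upper()
    return f"{word[0]}{len(word) - 2}{word[-1]}".upper()


print(numeronym("internationalization"))  # I18N
print(numeronym("localization"))          # L10N
```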
  3. The Problem
       • English is the native language of only ~30% of the Internet’s population.
       • To avoid alienating the other 70% of your potential customers, you need to address I18N and L10N.
  4. Don’t worry
       • I18N and L10N are technical problems, not linguistic ones.
       • Programmers and testers know how to solve technical problems.
       • Translation is the linguistic problem.
       • Translators know how to solve linguistic problems.
  5. Unicode
       • Unicode [1] provides a unique number:
           • for every character
           • no matter what the platform
           • no matter what the program
           • no matter what the language
       • There are several ways, called encodings, to represent a Unicode code point (roughly, a single character) as bytes
           • UTF-8 [2] is a variable-length encoding built from 8-bit units
           • UTF-8 is the de facto standard
       • [1] http://www.unicode.org
       • [2] http://en.wikipedia.org/wiki/UTF-8
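To make the difference between a code point and its encoding concrete, here is a small sketch. The sample characters and the helper name `describe` are chosen only for illustration. It prints each character's code point and how many bytes UTF-8 needs for it.

```python
def describe(ch: str) -> tuple[str, int]:
    """Return the code point (as U+XXXX) and the UTF-8 byte length of one character."""
    return f"U+{ord(ch):04X}", len(ch.encode("utf-8"))


# Characters are written as escapes so the source file's own encoding does not matter.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji

for ch in samples:
    code_point, byte_count = describe(ch)
    print(code_point, byte_count)
```

The same abstract number is stored in one to four bytes depending on its value, which is what "variable length" means here.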
  6. Resource Bundles
       • One of the more difficult things to get right is all the string data embedded in your source code.
       • The easiest solution is to move that data into resource bundles (locale-specific collections of strings looked up by key)
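A resource bundle can be as simple as one key-to-string mapping per locale. The sketch below is a minimal in-memory version, not any particular framework's API: the names `BUNDLES`, `DEFAULT_LOCALE` and `get_string`, the keys, and the fallback order (exact locale, then language, then default) are all assumptions made for the example.

```python
# Hypothetical bundles; real projects usually load these from files.
BUNDLES = {
    "en": {"greeting": "Hello.", "farewell": "Goodbye."},
    "fr": {"greeting": "Bonjour."},  # "farewell" not translated yet
}
DEFAULT_LOCALE = "en"


def get_string(key: str, locale: str) -> str:
    """Look a key up in the most specific bundle available, falling back to the default."""
    candidates = [locale, locale.split("_")[0], DEFAULT_LOCALE]
    for name in candidates:
        bundle = BUNDLES.get(name, {})
        if key in bundle:
            return bundle[key]
    raise KeyError(f"no bundle defines {key!r}")


print(get_string("greeting", "fr_CA"))  # found in the "fr" bundle
print(get_string("farewell", "fr_CA"))  # falls back to the "en" bundle
```

The code only ever refers to keys such as `greeting`, so translators can work on the bundle contents without touching the source.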
  7. String Rules
       • Like most tools, resource bundles can make your life difficult if used incorrectly.
       • Do not build display strings by concatenating fragments. This makes translation harder because the translator loses the context of the full sentence
       • Include all punctuation in the bundle content; otherwise the translated words can be correct while the punctuation around them is wrong
       • Include formatting in the bundle content
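The concatenation rule is easiest to see with a sentence whose word order differs between languages. In this sketch the bundle contents, the key `inbox.summary` and the function name are invented for illustration. Each bundle stores the whole sentence, punctuation included, with named placeholders.

```python
# Anti-pattern (shown as a comment only): the word order is frozen in the code,
# and the translator sees fragments without context.
#   text = tr("you_have") + " " + str(count) + " " + tr("new_messages_in") + " " + folder + "."

# Preferred: one complete sentence per locale, punctuation included.
MESSAGES = {
    "en": {"inbox.summary": "You have {count} new messages in {folder}."},
    "de": {"inbox.summary": "Im Ordner {folder} befinden sich {count} neue Nachrichten."},
}


def inbox_summary(locale: str, count: int, folder: str) -> str:
    """Fill the locale's full-sentence template with named values."""
    return MESSAGES[locale]["inbox.summary"].format(count=count, folder=folder)


print(inbox_summary("en", 3, "Inbox"))
print(inbox_summary("de", 3, "Inbox"))
```

Named placeholders let each translation put the values wherever its grammar needs them. Plural forms are a separate problem that this sketch does not attempt to handle.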
  8. Resource Bundle Tests
       • LOUD [3] to check for string rules
       • Resource key in bundle, but not in code
       • Resource key in code, but not in bundle
       • Key present in some locales but missing from others
       • [3] http://adam.goucher.ca/?p=28
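The three key-consistency checks on this slide lend themselves to automation. The following is a rough sketch under stated assumptions: the lookup function is assumed to be called `tr("...")` in the source, `en` is assumed to be the reference locale, and the bundles and source text are toy data. It does not attempt to implement LOUD, which the slide only references.

```python
import re

# Toy data for illustration only.
BUNDLES = {
    "en": {"app.title": "Demo", "login.button": "Log in", "legacy.banner": "Old"},
    "fr": {"app.title": "Démo", "legacy.banner": "Ancien", "fr.only": "Extra"},
}
SOURCE = 'title = tr("app.title")\nlabel = tr("login.button")\nhint = tr("login.hint")\n'

# Assumes keys always appear as a string literal inside tr("...").
KEY_PATTERN = re.compile(r'tr\("([^"]+)"\)')


def audit(bundles: dict, source: str, reference: str = "en") -> dict:
    """Compare keys used in code with keys defined in each locale's bundle."""
    used = set(KEY_PATTERN.findall(source))
    defined = set(bundles[reference])
    others = {loc: set(b) for loc, b in bundles.items() if loc != reference}
    return {
        "in_bundle_not_in_code": sorted(defined - used),
        "in_code_not_in_bundle": sorted(used - defined),
        "missing_from_locale": {loc: sorted(defined - keys) for loc, keys in others.items()},
        "only_in_locale": {loc: sorted(keys - defined) for loc, keys in others.items()},
    }


report = audit(BUNDLES, SOURCE)
for name, finding in report.items():
    print(name, finding)
```

A check like this is cheap enough to run on every build and fail it when any of the lists is non-empty.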
  9. Other areas
       • I18N and L10N together are a huge topic. Some of what has not been discussed:
       • Date / Time
       • Numbers
       • Currency
       • Username / Password conventions
       • Postal / Zip Codes
       • Paper size (when printing)
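As a small taste of the date and number items, the sketch below renders the same values for two locales. The convention table is hand-written for this example and covers only common en_US and de_DE conventions; a real application would take these from a locale library rather than hard-coding them, and the names `CONVENTIONS`, `format_number` and `format_date` are hypothetical.

```python
from datetime import date

# Hand-written conventions, for illustration only.
CONVENTIONS = {
    "en_US": {"group": ",", "decimal": ".", "date": "%m/%d/%Y"},
    "de_DE": {"group": ".", "decimal": ",", "date": "%d.%m.%Y"},
}


def format_number(value: float, locale: str) -> str:
    """Format with two decimals using the locale's grouping and decimal separators."""
    conv = CONVENTIONS[locale]
    text = f"{value:,.2f}"  # always of the form 1,234,567.89
    # Swap separators via a temporary marker so the two replacements do not collide.
    return text.replace(",", "\x00").replace(".", conv["decimal"]).replace("\x00", conv["group"])


def format_date(day: date, locale: str) -> str:
    """Format a date with the locale's numeric day/month/year order."""
    return day.strftime(CONVENTIONS[locale]["date"])


for loc in CONVENTIONS:
    print(loc, format_number(1234567.891, loc), format_date(date(2007, 11, 5), loc))
```

Note that 11/05/2007 and 05.11.2007 are the same day here, which is exactly the kind of ambiguity that makes dates worth testing per locale.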
  10. Testing Advice
       • Test your application’s I18N and L10N early to avoid having to re-test everything.
       • Include as many checks as possible during the build process
       • Beta test translations with friendly customers
  11. Summary
       • This is a technical problem, not a linguistic one
       • Use UTF-8 everywhere you can
       • Use resource bundles instead of putting literal strings in the code
       • Learn about the nuances of your target locales
       • Test early
