2. About Me
Working & living in İstanbul
Coding since ’85, CGI programming since ’94 (Perl, C/C++)
ColdFusion Developer since ’97
Interested in ColdFusion, Flex, AIR, Ruby, Ajax,
Frameworks, i18N, L10N, G11N, Geolocation
Created projects in more than 20 languages!
Strong interest in epistemology
Personal blog: http://blog.demirkapi.net
5. Why i18N (Internationalization)?
English is just another language.
The World Wide Web should be truly world-wide!
Internationalization is important to ensure that users worldwide can
equally benefit from Web technology.
Wide diversity world-wide:
Scripts (Latin, Cyrillic, Hebrew, Tamil, Katakana,...)
Languages (English, German, Turkish, Korean, Japanese,...)
Typographic conventions
Cultural conventions
Political circumstances
Avoid fragmentation of specifications due to localization.
Make sure internationalization is done at the right place.
6. What is i18N (Internationalization)?
Designing an application so that it functions in at least two locales
What is L10N (Localization)?
Process of applying a locale or language "skin" to an
i18N application
What is G11N (Globalization)?
i18N & L10N
What is …?
7. Internationalization & Localization
Globalization
Single character set
Single executable
Single install
Single server serves all clients
in all languages
Localization
Based on globalized software
Adds specific translations and
adaptations for particular
languages and markets
Globalized software can be localized without any code changes
8. Character encoding
A character encoding maps the characters of a character
set to the numbers (and ultimately the bytes) that represent
them on a computer.
EUC-JP (Japanese)
EUC-KR (Korean)
ISO-8859-1 (Western European and English)
SHIFT_JIS (Japanese)
UTF-8 (All Languages)
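The difference between these encodings is easiest to see on real bytes. The following is a minimal Python sketch (Python is used here only for illustration; the sample strings are arbitrary choices, not from the original slides). It encodes the same text with several of the encodings listed above and shows what happens when bytes are read back with the wrong encoding.

```python
# Same characters, different bytes, depending on the encoding.
text = "İstanbul"  # begins with U+0130, which ISO-8859-1 cannot represent

utf8_bytes = text.encode("utf-8")
print(len(text), "characters ->", len(utf8_bytes), "bytes in UTF-8")

# ISO-8859-1 only covers U+0000..U+00FF, so encoding fails for U+0130.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print("Fits in ISO-8859-1:", latin1_ok)

# Reading UTF-8 bytes as ISO-8859-1 silently produces corrupted text.
garbled = utf8_bytes.decode("iso-8859-1")
print("Wrong decoding:", garbled)

# Japanese text: legacy encodings and UTF-8 all work, but bytes differ.
japanese = "日本語"
sjis_bytes = japanese.encode("shift_jis")
eucjp_bytes = japanese.encode("euc_jp")
utf8_japanese = japanese.encode("utf-8")
print(len(sjis_bytes), len(eucjp_bytes), len(utf8_japanese))
```

Only UTF-8 handles both sample strings, which is why the list above marks it as the encoding for all languages.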
9. What is Locale?
The combination of a language code and a country code
en_US: US English (color, $)
en_GB: British English (colour, £)
de_CH: German in Switzerland
tr_TR: Turkish in Turkey
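As an illustration of the language-plus-country structure, here is a small Python sketch that splits a locale identifier into its two parts. The `Locale` type and `parse_locale` function are hypothetical names made up for this example, and the sketch ignores script and variant subtags.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str           # e.g. "en", "de", "tr"
    country: Optional[str]  # e.g. "US", "CH", "TR"; None if absent


def parse_locale(identifier: str) -> Locale:
    """Split identifiers such as "en_US" or the HTTP-style "en-US"."""
    parts = identifier.strip().replace("-", "_").split("_")
    language = parts[0].lower()
    country = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return Locale(language, country)


for sample in ("en_US", "en_GB", "de_CH", "tr_TR", "de"):
    print(sample, "->", parse_locale(sample))
```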
10. Selecting/Detecting Locale
Manual
Load a default locale and let the user switch to other
locales via buttons, select boxes, etc.
Locale Detection
Parsing ‘Accept-Language’ header on HTTP request
Not usable with URLLoader
Capabilities.language property in Flash Player and AIR
Parsing browser and OS language settings via JavaScript
Geolocation based on the client's IP address
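Parsing the Accept-Language header can be sketched as follows. This is a simplified Python illustration, not a complete implementation of the HTTP specification; the function names and the list of supported locales are assumptions made for the example.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return language tags ordered by descending q-value."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed weights as "not acceptable"
        if quality > 0:
            # position keeps the original order for equal weights
            weighted.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_locale(header: str, supported: List[str], default: str) -> str:
    """Pick the best supported locale: exact match first, then language only."""
    for tag in parse_accept_language(header):
        wanted = tag.lower()
        for candidate in supported:
            if candidate.lower().replace("_", "-") == wanted:
                return candidate
        language = wanted.split("-")[0]
        for candidate in supported:
            if candidate.split("_")[0].lower() == language:
                return candidate
    return default


supported_locales = ["en_US", "de_CH", "tr_TR"]  # hypothetical application locales
print(pick_locale("tr-TR,tr;q=0.9,en-US;q=0.8,en;q=0.7", supported_locales, "en_US"))
```

Whatever detection method is used, it is usually worth keeping the manual selection as well, so users can override a wrong guess.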
11. What is Unicode?
Unicode (unicode.org) is a character set covering the
characters and symbols of the world's writing systems.
Unicode provides a unique number for every character.
Except Klingon!
ما هي الشفرة الموحدة "يونِكود"؟
Unicode nedir?
Τι είναι το Unicode
Cos'è Unicode?
유니코드에 대해?
Что такое Unicode?
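The "unique number for every character" is called a code point. The short Python sketch below prints the code point and official Unicode name for a few characters; the sample characters and the `code_point` helper are choices made for this example only.

```python
import unicodedata


def code_point(character: str) -> str:
    """Format a single character's code point in the usual U+XXXX notation."""
    return f"U+{ord(character):04X}"


for character in "Aş한Ж":
    print(character, code_point(character), unicodedata.name(character))
```

The number is the same no matter which platform, program, or language is used; how it is stored as bytes is decided by the encoding (UTF-8, UTF-16, and so on).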
12. Unicode (cont.)
Why do we need to use Unicode?
Avoids data corruption
Single encoding for text in all languages
Makes software globalization possible
Vastly reduces development cost
Vastly reduces maintenance, update and support cost
Switching to Unicode has no disadvantages for single-language
users. On the contrary, it usually offers advantages even for them,
and it offers great advantages for multilingual users.
Encoding: Use Unicode wherever possible for content,
databases, etc. Always declare the encoding of content.
14. Unicode & Files
UTF-8 is the recommended encoding for files.
Use a Unicode capable editor (IDE)
ColdFusion/Flash Builder
Default encoding is UTF-8
Eclipse
Default encoding is Cp1252 (on Windows)
Change it to UTF-8
Window -> Preferences -> General -> Workspace -> Text file encoding
Does not create a BOM
Works with existing files that have a BOM
15. Unicode & Files (cont.)
Dreamweaver
Supports full Unicode when selected
Supports BOM (Byte Order Mark)
Homesite/CFStudio: Never!
http://www.adobe.com/go/tn_19059
Notepad
The best tool!
16. Unicode & Database
Use a robust database with the right settings
MS SQL Server
MySQL Server
PostgreSQL Server
Oracle
Beware of Unicode Support
MySQL 4.1 and up
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci
MS SQL Server
Use Unicode column types: nvarchar, ntext, etc.
18. ColdFusion History
Supports Unicode starting from ColdFusion MX
Use ColdFusion MX
If possible ColdFusion MX 7.x and up
Use an updated JVM
Set the required locale
19. ColdFusion & Databases
Use DSN Settings
MS SQL Server
String Format
Enable High ASCII characters and Unicode for data sources
configured for non-Latin characters
MySQL
Select MySQL 4/5 driver
20. ColdFusion & Files
Use Unicode
If possible with BOM support
ColdFusion MX Detects BOM
If a file starts with a BOM, ColdFusion detects the file's
encoding, and there is no need for extra tags such as
cfprocessingdirective
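To make the idea of BOM detection concrete, here is a minimal Python sketch of how a program can recognize a BOM at the start of a file. It illustrates the general technique only and is not ColdFusion's actual implementation; the `detect_bom` function and the table of signatures are written for this example.

```python
import codecs
from typing import Optional

# UTF-32 signatures are checked before UTF-16 because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOM_SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding announced by a leading BOM, or None if there is none."""
    for signature, encoding in BOM_SIGNATURES:
        if data.startswith(signature):
            return encoding
    return None


with_bom = codecs.BOM_UTF8 + "Merhaba Dünya".encode("utf-8")
without_bom = "Merhaba Dünya".encode("utf-8")
print(detect_bom(with_bom))     # encoding is announced by the file itself
print(detect_bom(without_bom))  # nothing to detect; encoding must be declared
```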
21. ColdFusion & Files (cont.)
ColdFusion MX Templates
If there is no BOM
<cfprocessingdirective pageencoding="utf-8">
must be included in every CFM template.
Placing it only in Application.cfm or Application.cfc does not help,
because the directive applies only to the file that contains it.
22. ColdFusion & Tags & Functions
Beware of ColdFusion Functions & Tags
CFPROCESSINGDIRECTIVE, CFCONTENT, CFFILE, CFHEADER, CFHTTP,
CFHTTPPARAM, CFMAIL, CFMAILPART tags and the SETENCODING,
GETENCODING, TOBASE64, TOSTRING, URLDECODE, and URLENCODEDFORMAT
functions etc.
Use encoding in Tags
<cfmail to="#user#" from="#me#" subject="Unicode Test" charset="utf-8">
<cfmailpart charset="utf-8" type="plain">#mymailasplaintext#</cfmailpart>
<cfmailpart charset="utf-8" type="html">#mymailashtml#</cfmailpart>
</cfmail>
23. HTML & i18N
Multipart POST
<form action="#FOO#" method="post"
enctype="multipart/form-data" accept-charset="utf-8">
HTML/XHTML
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Detect Locale
cgi.HTTP_ACCEPT_LANGUAGE
IP & GeoLocator
24. Code & Language Isolation
MVC Benefits
Java Resource Bundles
JRB General usage with CFCs
IBM ICU4J Library
GNU gettext
Database
XML
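Whichever storage is chosen, the common idea is that code refers to text only by key and the text itself lives outside the code, per locale. The Python sketch below illustrates this with in-memory dictionaries and a fallback chain similar in spirit to how Java resource bundles fall back from a specific locale to a more general one. The bundle contents and function names are hypothetical.

```python
from typing import List

# Hypothetical bundles; in practice these would be loaded from
# .properties files, a database, XML, etc.
BUNDLES = {
    "en": {"login": "Log in", "logout": "Log out"},
    "tr": {"login": "Giriş"},
    "tr_TR": {},
}


def fallback_chain(locale: str, default_locale: str = "en") -> List[str]:
    """Most specific locale first, then its language, then the default."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_")[0])
    if default_locale not in chain:
        chain.append(default_locale)
    return chain


def lookup(key: str, locale: str, default_locale: str = "en") -> str:
    """Return the first translation found along the fallback chain."""
    for candidate in fallback_chain(locale, default_locale):
        value = BUNDLES.get(candidate, {}).get(key)
        if value is not None:
            return value
    return f"[{key}]"  # visible marker for missing translations


print(lookup("login", "tr_TR"))   # found in the "tr" bundle
print(lookup("logout", "tr_TR"))  # falls back to the default bundle
```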
29. Base i18N Methods
Get user’s current locale
getFWLocale()
Get the current user's locale from the locale storage
Set a specific locale
setFWLocale( valid_locale )
Set the current user's locale; internally it uses the assigned
locale storage
getResource(resource, [default], [locale], [values])
Get a language resource for the user's default locale or a
specific locale, optionally doing array/struct/string replacements
via positional or named {} patterns: {1}, {2}, {username}, {lastName}
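The positional and named {} patterns can be illustrated with a short Python sketch. This only demonstrates the replacement idea and is not the framework's implementation; `format_resource` and the sample templates are made up for the example.

```python
import re

PLACEHOLDER = re.compile(r"\{([A-Za-z0-9_]+)\}")


def format_resource(template: str, values) -> str:
    """Fill {1}, {2}, ... from a list or {name} from a dict.

    Placeholders without a matching value are left untouched.
    """
    if isinstance(values, str):
        values = [values]  # a single string acts as the first positional value

    def replace(match):
        name = match.group(1)
        if isinstance(values, dict):
            return str(values.get(name, match.group(0)))
        if name.isdigit():
            index = int(name) - 1  # patterns are 1-based
            if 0 <= index < len(values):
                return str(values[index])
        return match.group(0)

    return PLACEHOLDER.sub(replace, template)


print(format_resource("Welcome {1}, you have {2} new messages", ["Oğuz", 3]))
print(format_resource("Hello {username} {lastName}", {"username": "oguz", "lastName": "Demirkapı"}))
```

Keeping whole sentences as resources and substituting values into them, rather than concatenating fragments in code, lets translators reorder the placeholders to suit each language.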
30. Prc Scope Usage
Prc scope usage
function onRequestStart(event,rc,prc) {
prc.i18n = getPlugin("i18n");
}
A common practice for localized applications is to store a
reference to the i18n plugin object in the prc scope. This gives
easy access throughout the application, so code that relies
heavily on i18n methods does not have to call getPlugin("i18n")
repeatedly.
This is a performance optimization best practice.
34. Using Resource Bundle
Get basic resource based on default or a specific locale
getResource(resource,[default],[locale])
#getResource("login")# // gets the login content for the active locale
Get complex resource
<cfscript>
oRB = getPlugin("resourceBundle");
footerParams = ["Oğuz Demirkapı", "http://blog.demirkapi.net"];
</cfscript>
#oRB.formatRBString(getResource('footerNote'),footerParams)#
38. Content Structure
Same content in every locale
Mirroring sites
Isolated templates and contents
Different content tree in different locales
Roots for trees
Complex tree referencing
39. Database Modeling
Designing Database
Locale definitions
Content-related tables have a flag field that holds the locale.
Lookup tables store an i18N key and get the displayed text from
resource bundle (RB) tools
Tables vs. Fields
Flagging locale-based content with a locale field is a better practice
than creating a separate table for each locale.
Content Types
Isolating localized content
All content should be isolated in specific locales
Saving localized content
A localization manager would be a proper tool
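The "locale flag field instead of separate tables" approach from the Database Modeling slide can be sketched with Python's built-in sqlite3 module. The table name, column names, sample rows, and the fallback to a default locale are all assumptions made for this illustration, not a schema from the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for all locales; the locale column is the "flag field".
conn.execute(
    """
    CREATE TABLE page_content (
        page_key TEXT NOT NULL,
        locale   TEXT NOT NULL,
        title    TEXT NOT NULL,
        PRIMARY KEY (page_key, locale)
    )
    """
)
conn.executemany(
    "INSERT INTO page_content (page_key, locale, title) VALUES (?, ?, ?)",
    [
        ("about", "en_US", "About Me"),
        ("about", "tr_TR", "Hakkımda"),
        ("contact", "en_US", "Contact"),
    ],
)


def get_title(connection, page_key, locale, default_locale="en_US"):
    """Fetch the title for a locale, falling back to the default locale."""
    row = connection.execute(
        """
        SELECT title FROM page_content
        WHERE page_key = ? AND locale IN (?, ?)
        ORDER BY CASE WHEN locale = ? THEN 0 ELSE 1 END
        LIMIT 1
        """,
        (page_key, locale, default_locale, locale),
    ).fetchone()
    return row[0] if row else None


print(get_title(conn, "about", "tr_TR"))    # localized row exists
print(get_title(conn, "contact", "tr_TR"))  # falls back to en_US
```

Adding a new locale then means inserting rows, not changing the schema.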