Making Cents of Yens and Euros: Web 2.0 Internationalization


    Making Cents of Yens and Euros: Web 2.0 Internationalization - Presentation Transcript

    1. Making Cents of Yens and Euros: Web 2.0 Internationalization. Achim Ruopp, Digital Silk Road, http://www.digitalsilkroad.net/
    2. Demo A Currency Converter Application – before and after Web 2.0 Internationalization
    3. Agenda
      • Introduction to Web Internationalization (i18n)
        • Selecting and Persisting User Preferences
        • Locales and Locale Identifiers
        • Unicode
        • Localization – Model and Tools
      • Client-side Scripting
        • Javascript Internationalization
        • Ajax
      • Multi-lingual Syndication
        • RSS
        • Atom
      • International Web Services Design
        • REST
        • SOAP
    4. Intro to Web Internationalization Language and Location
      • Diagram: example language preferences en-US, fr, en;q=0.8, da-DK
    5. Intro to Web Internationalization User Preferences
      • Language
        • HTTP Accept-Language header
        • E.g.: en, fr-CA;q=0.8, fr;q=0.6
        • Language negotiation with the server
      • Locale
        • Cultural preferences for formatting, sorting etc.
        • Infer from Accept-Language header
        • Map IPv4 address to ccTLD (country code top-level domain)
          • Public information accessible through libraries
            • E.g. Perl IP::Country CPAN module
          • Commercial services offer more precision
      • Always provide option to change defaults
      • Store preferences in cookies
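As an illustration of the language negotiation step above, here is a minimal Python sketch that parses an Accept-Language header into an ordered preference list. The function name is hypothetical and not part of the talk's demo code. In the HTTP syntax the weights are written as q= parameters, and a missing weight counts as 1.0.

```python
def parse_accept_language(header):
    """Return (language_tag, quality) pairs, highest quality first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # HTTP default when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        prefs.append((tag, quality))
    # sorted() is stable, so equal weights keep the order the browser sent
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("en, fr-CA;q=0.8, fr;q=0.6"))
# → [('en', 1.0), ('fr-CA', 0.8), ('fr', 0.6)]
```

The result is only a default. As the slide says, the user should still be able to override it, with the choice stored in a cookie.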
    6. Intro to Web Internationalization Internet Language Tags
      • IETF Language Tags (BCP 47)
      • Language[-Language]{0,3}[-Script][-Region][-Variant]*[-Extension]*[-PrivateUse]* (up to three extended language subtags)
      • Examples
        • en-CA: English in Canada
        • zh-Hant-TW: Chinese written in traditional Chinese script used in Taiwan
      • Obsoletes RFC 3066 & RFC 1766
        • Often still used in products/earlier standards
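The subtag structure above can be sketched with a deliberately simplified parser. This is an assumption-laden illustration, not a full BCP 47 implementation: it handles only language, script, region and variant subtags, and ignores extended-language, extension and private-use subtags as well as registry validation. The function name is hypothetical.

```python
import re

# Simplified pattern: language, optional script, optional region, variants.
# Extended-language, extension and private-use subtags are not handled.
LANGTAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def parse_language_tag(tag):
    """Split a tag into subtags using the conventional casing, or return None."""
    match = LANGTAG.match(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return {
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": [v.lower() for v in match.group("variants").split("-") if v],
    }


print(parse_language_tag("zh-Hant-TW"))
print(parse_language_tag("en-CA"))
```

Tags are case-insensitive, so the sketch normalizes to the conventional casing (lowercase language, title-case script, uppercase region) before comparing or storing them.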
    7. Internationalization Changes
    8. Intro to Web Internationalization POSIX Locales
      • Cross-platform API
        • Locale-identifiers can have variations
          • Un*x: en_US
          • Windows: English_United States
        • Results can be platform-dependent
      • Basis for locale functionality in all scripting languages
      • Provides functionality for
        • Number Formatting: 1,000,000.23
        • Date/Time Formatting: 8 Μάρτιος 2007 12:00:00 μμ
        • Sorting
        • String processing (e.g. upper-/lower-casing)
        • Some translated strings like weekdays, yes/no messages
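A rough sketch of the platform dependence described above, using Python's standard locale module (which wraps the POSIX C library functions). The locale names tried here are examples only. Which ones are installed, and what they produce, depends on the machine, so no output is shown.

```python
import locale


def format_amount(value, locale_name):
    """Format value with digit grouping under the named locale.

    Returns None if that locale is not available on this machine.
    """
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.format_string("%.2f", value, grouping=True)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)


# Unix-style and Windows-style identifiers for comparable locales
for name in ("C", "en_US.UTF-8", "English_United States", "de_DE.UTF-8"):
    print(name, "->", format_amount(1000000.23, name))
```

Note that setlocale changes process-wide state, which is awkward in a multi-threaded web server. That is one reason libraries with explicit locale objects, such as ICU, are attractive.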
    9. Intro to Web Internationalization International Components for Unicode
      • IBM Open Source project
      • Extensive locale data and APIs
        • Data vetted as part of Common Locale Data Repository (CLDR) project
      • Java and C++ APIs
      • Wrappers for scripting languages
        • PyICU (Python)
        • ICU4R (Ruby) – abandoned?
        • DIY – difficult because of API complexity and character encoding issues
    10. Intro to Web Internationalization Microsoft Internationalization APIs
      • Windows NLS API
      • Microsoft .NET Framework System.Globalization namespace
      • Similar set of data to ICU
        • Data vetted by Microsoft subsidiaries
      • APIs accessible from all Microsoft programming languages
    11. Intro to Web Internationalization Unicode 5.0
      • Code space by plane (starting code point)
        • 00000: Basic Multilingual Plane
        • 10000: Dead Languages & Math
        • 20000: Han Characters
        • E0000: Language Tags
        • F0000 and 100000: Private Use
      • Basic Multilingual Plane (0000 to FFFF): Alphabets, Punctuation, Asian Languages, Han Characters, Yi, Hangul, Surrogates, Private Use, Legacy/Compatibility
      • 99,024 of 1,114,112 code points (U+0000 to U+10FFFF) defined
    12. Intro to Web Internationalization Unicode Encoding Forms
      • Variable length: UTF-8/UTF-16
      • Fixed length: UTF-32
      • U+2122: ™: Trade Mark Sign
        • UTF-32: 0…00100001 00100010 (0x00002122)
        • UTF-16: 00100001 00100010 (0x2122)
        • UTF-8: 1110 0010 10 000100 10 100010 (0xE2 0x84 0xA2), prefix bits shown apart from payload bits
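The byte values for U+2122 can be checked with a few lines of Python. This is a small verification sketch. The big-endian codec variants are chosen here so that no byte order mark is added.

```python
TRADE_MARK = "\u2122"  # U+2122 TRADE MARK SIGN


def hex_bytes(text, encoding):
    """Return the encoded bytes of text as space-separated hex values."""
    return " ".join(f"0x{byte:02X}" for byte in text.encode(encoding))


for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    print(encoding, hex_bytes(TRADE_MARK, encoding))
```

UTF-8 needs three bytes for this character, UTF-16 one 16-bit code unit, and UTF-32 one 32-bit code unit.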
    13. Intro to Web Internationalization Unicode on the Web
      • XML processors are required to process UTF-8/UTF-16
      • Encoding declaration precedence
        • HTTP Content-Type header charset declaration
        • XML encoding declaration (XHTML)
        • meta charset declaration in (X)HTML
        • link element charset attribute
      • Approx. 4% of pages have encoding errors*
      • No real need for character references
        • Exceptions: <, >, &, "
      • Use styles to control font selection
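The declaration precedence listed above could be approximated as follows. This is a simplified sketch with a hypothetical function name: it covers the HTTP header, the XML declaration and the meta element, skips the link element's charset attribute, and uses regular expressions rather than a real HTML or XML parser.

```python
import re


def detect_charset(content_type, body_start, default="utf-8"):
    """Pick a charset by precedence: HTTP header, XML declaration, meta."""
    # 1. HTTP Content-Type header, e.g. "text/html; charset=ISO-8859-1"
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", content_type or "", re.I)
    if match:
        return match.group(1).lower()
    # 2. XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>
    match = re.match(rb"\s*<\?xml[^>]*encoding\s*=\s*[\"']([\w.:-]+)[\"']", body_start)
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. meta element in (X)HTML, in either the http-equiv or the charset form
    match = re.search(rb"<meta[^>]*charset\s*=\s*[\"']?([\w.:-]+)", body_start, re.I)
    if match:
        return match.group(1).decode("ascii").lower()
    return default


print(detect_charset("text/html; charset=ISO-8859-1", b'<meta charset="utf-8">'))
# → iso-8859-1
```

The example call shows why conflicting declarations cause trouble: the HTTP header wins even when the page itself claims UTF-8.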
    14. Demo A Currency Converter Application – globalized but not localized
    15. Intro to Web Internationalization Localization Recommendations
      • Avoid translatable text in graphics
      • Make sure graphics are culturally neutral
      • Avoid absolute sizing
      • Use HTML flow layout
      • Write complete sentences
    16. Intro to Web Internationalization Localization Model and Tools
      • Text translation
        • Localization formats
          • HTML with template library
            • W3C Internationalization Tag Set (tool support?)
          • GNU gettext/PO
          • XLIFF - XML Localization Interchange File Format
        • Localization tools
          • OmegaT
          • Open Language Tools (Sun)
          • The WordForge Project: Pootle
      • Searchability – Links/Sitemap
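For the GNU gettext/PO route, a minimal Python sketch using the standard gettext module might look like this. The domain name "converter", the "locale" directory and the message text are assumptions made for illustration. They are not taken from the demo application.

```python
import gettext


def load_translator(language, domain="converter", localedir="locale"):
    """Load the compiled catalog for a language, falling back to source text."""
    # fallback=True returns a NullTranslations object (which hands back the
    # source strings) when no compiled .mo catalog exists for the language.
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )


_ = load_translator("de").gettext

# A complete sentence with named placeholders lets translators reorder words.
message = _("%(amount)s %(source)s equals %(result)s %(target)s")
print(message % {"amount": "1", "source": "USD", "result": "1.1785", "target": "CAD"})
```

Translators would work on the PO file with a tool such as those listed above. The compiled catalog is then looked up under locale/de/LC_MESSAGES/converter.mo, following gettext's directory convention.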
    17. Demo A Currency Converter Application – fully internationalized Web 1.0 application
    18. Client-side Scripting Javascript Internationalization
      • ECMAScript edition 3 added a range of internationalization features (1999)
        • Good support for Unicode processing
        • Set of locale-sensitive functions
          • Dependent on host locale (i.e. browser)
        • Set of locale-insensitive functions
        • No number or date/time parsing
      • Javascript libraries with additional internationalization functionality
        • dojo Toolkit (i18n contributed by IBM)
        • Microsoft AJAX Library
    19. Client-side Scripting AJAX Recommendations
      • Late globalization
        • Transmit data in locale-independent form with XMLHttpRequest
        • Might require some creative parsing/UI
      • Early localization
        • Text localization server-side
        • Browsers are missing a message-catalog facility
        • Dynamically created page content is invisible to search engines
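One way to read "transmit data in locale-independent form" is sketched below in Python: the server side emits a JSON payload with ISO currency codes, a plain decimal string and an ISO 8601 timestamp, and leaves formatting to the last step before display. The field names and the USD base currency are assumptions made for this example.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal


def rate_payload(base, quote, rate, as_of):
    """Serialize an exchange rate without any locale-specific formatting."""
    return json.dumps({
        "base": base,        # ISO 4217 currency code
        "quote": quote,      # ISO 4217 currency code
        "rate": str(rate),   # plain decimal: "." separator, no digit grouping
        "asOf": as_of.astimezone(timezone.utc).isoformat(),  # ISO 8601 in UTC
    })


payload = rate_payload("USD", "CAD", Decimal("1.1785"),
                       datetime(2007, 3, 8, 17, 0, tzinfo=timezone.utc))
print(payload)
```

Sending the rate as a string avoids binary floating-point rounding, and the consumer can parse it back into a decimal type before applying locale-specific formatting.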
    20. Demo A Currency Converter Application – dynamic update of exchange amounts using Ajax
    21. Multi-lingual Syndication RSS 2.0
      • Character encoding
        • RSS 2.0 is an XML application
        • XML encoding rules apply
      • Language
        • Element only on channel (feed), not on item
          • Create one channel per language
        • Specified to comply with RFC 1766 language tags
      • Date/Time
        • In standard RFC 822 format (including 4-digit years)
          • E.g. “Wed, 02 Oct 2002 08:00:00 EST”
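RFC 822 dates always use English day and month abbreviations, so they should not be produced with locale-sensitive formatting functions. A short Python sketch using the standard email.utils helpers, which are locale-independent:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# 08:00 at UTC-05:00, the offset that "EST" stands for
published = datetime(2002, 10, 2, 8, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
pub_date = format_datetime(published)
print(pub_date)  # → Wed, 02 Oct 2002 08:00:00 -0500

# Parsing accepts the named zone used in the example above
parsed = parsedate_to_datetime("Wed, 02 Oct 2002 08:00:00 EST")
print(parsed.isoformat())  # → 2002-10-02T08:00:00-05:00
```

Writing the numeric offset rather than a zone name is the safer choice for feeds, since only a handful of zone abbreviations are defined by the format.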
    22. Multi-lingual Syndication Atom Syndication
      • More granular language marking
        • xml:lang can be applied to any human readable text in the format
        • Aggregators need to deal with this
      • Better date/time format: RFC 3339
        • E.g. “2003-12-13T18:30:02-05:00”
      • Acknowledgement: Tim Bray
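A small Python sketch of both points, per-element xml:lang and RFC 3339 timestamps, built with the standard library's ElementTree. The German title text is invented sample data, and the "atom" prefix is just a serialization choice.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "http://www.w3.org/2005/Atom"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("atom", ATOM)

# RFC 3339 timestamps of this shape parse directly (Python 3.7+)
updated = datetime.fromisoformat("2003-12-13T18:30:02-05:00")

# The entry defaults to English; the title overrides it with German.
entry = ET.Element(f"{{{ATOM}}}entry", {XML_LANG: "en"})
title = ET.SubElement(entry, f"{{{ATOM}}}title", {XML_LANG: "de"})
title.text = "Wechselkurse"  # sample text, "exchange rates"
ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated.isoformat()

xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Because xml:lang is inherited by child elements, an aggregator has to track the nearest enclosing value for each piece of text instead of assuming one language per feed.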
    23. Demo A Currency Converter Application – adding a syndication feed with exchange rate information
    24. International Web Services Design Service Patterns
      • Locale Neutral: neutral data formats
        • Request data: CAD; return data: 1.1785
      • Client Influenced: service reacts to client locale, e.g. HTTP Accept-Language
        • Request data: CAD (Accept-Language: de); return data: Kanadischer Dollar
      • Service Determined: service is locale-specific and ignores client preference
        • Example data: 03/08/2007 12:00pm EST
      • Data Driven: service adjusts formatting and language to the locale the data refers to
        • Request data: NOK, CHF; return data: norske kroner, ?
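The Client Influenced pattern might be sketched as below, under stated assumptions: the name catalog is a hypothetical stand-in for real locale data such as CLDR, the function name is invented, and the header is assumed to list languages in preference order (q weights are ignored for brevity).

```python
# Hypothetical catalog of localized currency names; a real service would
# load these from locale data such as CLDR.
CURRENCY_NAMES = {
    "CAD": {"en": "Canadian Dollar", "de": "Kanadischer Dollar"},
}


def currency_name(code, accept_language, default="en"):
    """Choose a currency name based on the client's Accept-Language header."""
    names = CURRENCY_NAMES[code]
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        # Try the full tag first, then progressively shorter prefixes
        # (de-at, then de).
        while tag:
            if tag in names:
                return names[tag]
            tag = tag.rpartition("-")[0]
    return names[default]


print(currency_name("CAD", "de"))  # → Kanadischer Dollar
```

A Locale Neutral service would skip this lookup entirely and return only the code and the number, leaving presentation to the client.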
    25. International Web Services Design REST
      • REST naturally ties into i18n features in HTTP/HTML/XML
        • Locale indicated with HTTP Accept-Language
        • Encoding and language marking in markup
      • Special caution for HTTP GET parameters
        • Locale-independent formatting recommended
        • Text parameters
          • Encode in UTF-8 and escape in URIs
          • IRI (International Resource Identifier) functionality might provide this for you
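Encoding text parameters as UTF-8 and percent-escaping them can be done with Python's urllib.parse, as in the sketch below. The endpoint URL and the parameter names are placeholders, not the demo service's actual interface.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE = "http://example.org/rates"  # placeholder endpoint, not from the talk

# The amount uses locale-independent formatting: "." separator, no grouping.
query = urlencode({
    "from": "CHF",
    "to": "NOK",
    "amount": "1000000.23",
    "label": "Zürich",  # non-ASCII text is UTF-8 encoded, then percent-escaped
})
url = f"{BASE}?{query}"
print(url)

# The receiving side reverses the escaping and decodes UTF-8 again
params = parse_qs(urlsplit(url).query)
print(params["label"][0])

# quote() does the same for a single path segment
print(quote("Zürich"))  # → Z%C3%BCrich
```

urlencode and quote default to UTF-8, which matches the recommendation above. A client that sends the amount as 1.000.000,23 would force the service to guess the locale.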
    26. International Web Services Design SOAP
      • Locale can be communicated in
        • Transport header (e.g. HTTP)
        • SOAP header
        • SOAP message body
      • Beware of automatically generated SOAP interfaces
        • Might be locale-dependent without offering a way to specify the locale
      • Use of XML Schema data types promotes locale-independence
      • Also consider localization of error messages
    27. Demo A Currency Converter Application – exchange rates as a REST web service
    28. Conclusions
      • Unification
        • One code base
      • Customization
        • Localization and adaptation for locales
      • Next step: cross-language “leakage”
        • Provide views in multiple languages to the same (user-generated) data
        • Translate user-generated content
          • Volunteers
          • Machine Translation
    29. Call for Contributions
      • The Perl CGI demo code is available on
        • http://www.digitalsilkroad.net/twiki/CurrencyConverter
      • Add a version in your preferred language
        • Ruby on Rails
        • PHP
        • Python
      • A similar application for ASP.NET is available on
        • http://quickstarts.asp.net/QuickStartv20/aspnet/doc/localization/default.aspx
    30. References
      • W3C Internationalization Activity
        • http://www.w3.org/International/
      • POSIX Locale
        • http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
      • International Components for Unicode
        • http://www-306.ibm.com/software/globalization/icu/
      • Unicode/Common Locale Data Repository
        • http://www.unicode.org/
      • Microsoft Internationalization APIs
        • http://msdn2.microsoft.com/en-us/library/ms776254.aspx
        • http://msdn2.microsoft.com/en-us/library/system.globalization.aspx
    31. References
      • OmegaT
        • http://www.omegat.org/omegat/omegat_en/omegat.html
      • Open Language Tools
        • https://open-language-tools.dev.java.net/
      • The WordForge Project
        • http://www.wordforge.org/drupal/
      • Javascript Internationalization
        • http://www.icu-project.org/docs/papers/internationalization_support_for_javascript.html
      • RSS 2.0
        • http://www.rssboard.org/rss-specification
      • Atom Syndication
        • http://www.atomenabled.org/developers/syndication
      • RSS 1.0
        • http://web.resource.org/rss/1.0/spec
      • W3C Web Services Internationalization Usage Scenarios
        • http://www.w3.org/TR/ws-i18n-scenarios/
    32. Additional Slides
    33. Multi-lingual Syndication RSS 1.0
      • Character encoding
        • RSS 1.0 is an XML application
        • XML encoding rules apply
      • Complies with the RDF (Resource Description Framework) specification
        • Definition of language and date/time formats is left to RDF metadata formats
          • Dublin Core Metadata Element Set
          • Language: RFC1766/ISO639-2
          • Date/Time: ISO 8601 (superset of RFC 3339)
            • Dublin Core also allows time periods to be specified!

    + techdudetechdude, 3 years ago


    One thing hasn’t changed in Web 2.0: users can be more
