I18n with PHP 5.3

Talk by Stas Malyshev of Zend at ZendCon 2009

  • Globalization: formats, names, rules, and algorithms are complex and voluminous, and all of it has to be kept up to date.
  • Strength determines which character properties matter (a vs. à, a vs. A) and which characters matter (spaces, punctuation). Attributes control which case sorts first, which characters count as whitespace or visible, whether to apply normalization, and whether two representations of the same text (e.g. Katakana/Hiragana) compare as different.

Transcript

  • 1. PHP Internationalization with ICU By Stas Malyshev, Zend Technologies
  • 2. What and why?
    • ICU - http://icu-project.org/ (IBM)
    • Unicode
    • CLDR - http://cldr.unicode.org/
  • 3. Intl extension
    • Locale
    • Collator
    • Number & Currency formatter
    • Date & Time formatter
    • Message & Choice formatter
    • Normalizer
    • Graphemes
    • IDN
    • Calendars
    • Resources
  • 4. Intl extension
    • Dual API OO and procedural
    • Same implementation underneath
      • collator_create() == new Collator()
      • numfmt_format() == NumberFormatter::format()
      • locale_get_default() == Locale::getDefault()
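A minimal sketch of the dual API, assuming the intl extension is loaded. The compared strings are illustrative and not from the talk. Both calls reach the same collator implementation, so the results should agree.

```php
<?php
// Object-oriented and procedural handles for the same en_US collator.
$oo = new Collator('en_US');
$proc = collator_create('en_US');

// compare() returns a negative, zero, or positive integer.
$viaObject = $oo->compare('apple', 'banana');
$viaFunction = collator_compare($proc, 'apple', 'banana');

var_dump($viaObject === $viaFunction); // bool(true)
var_dump($viaObject < 0);              // bool(true): "apple" sorts first
```

Which style to use is a matter of taste; the procedural functions take the object as their first argument.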
  • 5. Locale
    • Relies on ICU locales
    • <language>[_<script>]_<country>[_<variant>][@<keywords>]
    • Default locale
      • new Collator(Locale::DEFAULT_LOCALE)
      • Locale::setDefault, Locale::getDefault
      • You can use null
  • 6. Locale
    • Locale pieces
    • getPrimaryLanguage($locale)
    • getScript($locale)
    • getRegion($locale)
    • getAllVariants($locale)
    • getKeywords($locale)
  • 7. Locale
    • Locale display pieces
    • getDisplayName($locale, $in_locale = null)
    • getDisplayLanguage($locale, $in_locale = null)
    • getDisplayScript($locale, $in_locale = null)
    • getDisplayRegion($locale, $in_locale = null)
    • Example:
    • Locale::getDisplayScript("zh-Hant-TW", "en-US") returns "Traditional Chinese"
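The display methods can be sketched as follows, assuming the intl extension is loaded. The locale fr-CA is an illustrative choice, not one used in the talk. The second argument selects the language the name is rendered in.

```php
<?php
$locale = 'fr-CA';

// Name of the locale's language, rendered in English and in German.
$inEnglish = Locale::getDisplayLanguage($locale, 'en');
$inGerman = Locale::getDisplayLanguage($locale, 'de');

// Name of the locale's region, rendered in English.
$regionInEnglish = Locale::getDisplayRegion($locale, 'en');

echo $inEnglish, "\n";       // French
echo $inGerman, "\n";        // Französisch
echo $regionInEnglish, "\n"; // Canada
```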
  • 8. Locale building blocks
    • parseLocale() - returns an array composed of locale subtags
    • composeLocale() - creates a locale ID out of subtags
    • parseLocale('sr-Latn-RS') returns
    • array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')
    • composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
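A short sketch that takes the slide's sr-Latn-RS tag apart and puts it back together, assuming the intl extension is loaded. Note that composeLocale() joins the subtags with underscores.

```php
<?php
$tag = 'sr-Latn-RS';

// Split the tag into named subtags.
$parts = Locale::parseLocale($tag);

// The individual accessors return the same pieces one at a time.
$language = Locale::getPrimaryLanguage($tag);
$script = Locale::getScript($tag);
$region = Locale::getRegion($tag);

// Rebuild a locale ID from the subtag array.
$rebuilt = Locale::composeLocale($parts);

echo "$language / $script / $region\n"; // sr / Latn / RS
echo $rebuilt, "\n";                    // sr_Latn_RS
```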
  • 9. Locale guessing
    • acceptFromHttp – turns an Accept-Language header into a locale
    • lookup – finds the best match for a locale in a list of language tags
    • filterMatches – checks whether a language tag matches a locale range
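One way the three guessing methods might be combined, assuming the intl extension is loaded. The header value and the list of supported languages are hypothetical. The result of acceptFromHttp() depends on the locales available in the installed ICU data, so it is printed without an expected value.

```php
<?php
// Hypothetical Accept-Language header and application language list.
$header = 'fr-CA,fr;q=0.8,en;q=0.5';
$supported = array('en', 'fr', 'de');

// Best ICU locale for the header, according to the installed ICU data.
$preferred = Locale::acceptFromHttp($header);

// lookup() shortens fr-CA until it finds a tag in the list, else uses the default 'en'.
$best = Locale::lookup($supported, 'fr-CA', false, 'en');

// filterMatches() tests whether a tag falls within a language range.
$matches = Locale::filterMatches('fr-CA', 'fr', false);
$noMatch = Locale::filterMatches('fr-CA', 'de', false);

echo $preferred, "\n";
echo $best, "\n";   // fr
var_dump($matches); // bool(true)
var_dump($noMatch); // bool(false)
```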
  • 10. Collator
    • Comparing, sorting strings
    • Collation level (strength)
    • All ICU collator attributes
      • Numeric collation
      • Ignoring punctuation
    • Not yet: custom “tailoring” rules
  • 11. Collator
    • $coll = new Collator("fr_CA");
    • if ($coll->compare("côte", "coté") < 0) {
    •     echo "less";
    • } else {
    •     echo "greater";
    • }
    • Result: côte < coté
  • 12. Collator
    • $strings = array("cote", "côte", "Côte", "coté", "Coté", "côté", "Côté", "coter");
    • $coll = new Collator("fr_CA");
    • $coll->sort($strings);
    • Result: cote côte Côte coté Coté côté Côté coter
    • Sorting methods: sort($array, $flags), asort($array, $flags), sortWithSortKeys($array)
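The strength and attribute settings mentioned on the first Collator slide can be sketched like this, assuming the intl extension is loaded. The file names and the en_US locale are illustrative choices, not from the talk.

```php
<?php
$coll = new Collator('en_US');
$files = array('file10', 'file2', 'file1');

// Default collation compares digit by digit.
$coll->sort($files);
$plain = implode(' ', $files);

// Numeric collation treats runs of digits as numbers.
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$coll->sort($files);
$numeric = implode(' ', $files);

// Primary strength ignores accent and case differences.
$primary = new Collator('en_US');
$primary->setStrength(Collator::PRIMARY);
$same = $primary->compare('cote', 'Côté');

echo $plain, "\n";   // file1 file10 file2
echo $numeric, "\n"; // file1 file2 file10
var_dump($same);     // int(0)
```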
  • 13. NumberFormatter
    • Formatting and parsing
    • Numbers and currency
    • numfmt_create($locale, $style, $pattern = null)
    • Styles: NumberFormatter::PATTERN_DECIMAL, NumberFormatter::DECIMAL, NumberFormatter::CURRENCY, NumberFormatter::PERCENT, NumberFormatter::SCIENTIFIC, NumberFormatter::SPELLOUT, NumberFormatter::ORDINAL, NumberFormatter::DURATION
  • 14. NumberFormatter
    • Formatting
    • $fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
    • echo $fmt->format(1234); // result is 1,234
    • $fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    • echo $fmt->format(1234); // result is 1'234
  • 15. NumberFormatter
    • Parsing
    • $fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
    • $num = '1.234,567 min';
    • $pos = 0;
    • $fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos); // result is 1234.567, $pos = 9
    • $fmt->parse($num, NumberFormatter::TYPE_INT32); // result is 1234
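A few of the other styles from the list above, sketched under the assumption that the intl extension is loaded. The values and the en_US locale are illustrative, not from the talk.

```php
<?php
// Spell a number out in words.
$spell = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$words = $spell->format(42);

// Format a fraction as a percentage.
$percent = new NumberFormatter('en_US', NumberFormatter::PERCENT);
$share = $percent->format(0.25);

// Format an amount in a given ISO 4217 currency.
$money = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$price = $money->formatCurrency(1234.5, 'USD');

echo $words, "\n"; // forty-two
echo $share, "\n"; // 25%
echo $price, "\n"; // $1,234.50
```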
  • 16. MessageFormatter
    • Formatting and parsing whole messages, including data inside
    • Also allows choice between things printed:
      • 0≤are no files|1≤is one file|1<are many files
  • 17. MessageFormatter
    • $fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
    • echo $fmt->format(array(4560, 123, 4560 / 123));
    • $fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
    • echo $fmt->format(array(4560, 123, 4560 / 123));
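The choice format from the previous slide might be used as sketched below, assuming the intl extension is loaded. The surrounding sentence is illustrative, and `#` is used here as the ASCII spelling of the slide's `≤`.

```php
<?php
// One pattern covers the zero, one, and many cases.
$pattern = 'There {0,choice,0#are no files|1#is one file|1<are {0,number} files}.';
$fmt = new MessageFormatter('en_US', $pattern);

$none = $fmt->format(array(0));
$one = $fmt->format(array(1));
$many = $fmt->format(array(5));

echo $none, "\n"; // There are no files.
echo $one, "\n";  // There is one file.
echo $many, "\n"; // There are 5 files.
```

Choice ranges are purely numeric, so languages with more complex plural rules are usually better served by ICU's plural format.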
  • 18. IntlDateFormatter
    • Allows using locale-dependent canned patterns
    • Full, long, medium, and short date & time
      • Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST
      • Long: January 12, 1952 or 3:30:32pm
      • Medium: Jan 12, 1952
      • Short: 12/13/52 or 3:30pm
    • Also allows free-form patterns
      • "yyyy.MM.dd G 'at' HH:mm:ss vvvv"
      • 1996.07.10 AD at 15:08:56 Pacific Time
  • 19. IntlDateFormatter
    • $fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    • echo $fmt->format(0); // Wednesday, December 31, 1969 4:00:00 PM PT
    • $fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    • echo $fmt->format(0); // Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00
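The free-form pattern option can be sketched as follows, assuming the intl extension is loaded. The pattern is the slide's example without the time zone field, applied to timestamp 0 as in the code above.

```php
<?php
// NONE for both styles: the explicit pattern in the last argument takes over.
$fmt = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::NONE,
    IntlDateFormatter::NONE,
    'America/Los_Angeles',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss"
);

// Format the Unix epoch, then parse the text back into a timestamp.
$text = $fmt->format(0);
$back = $fmt->parse($text);

echo $text, "\n"; // 1969.12.31 AD at 16:00:00
var_dump($back);  // the original timestamp, 0
```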
  • 20. Normalizer
    • Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD
    • normalize(), isNormalized()
    • $combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
    • $chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
    • echo urlencode($chars); // %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
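The reverse direction, decomposing with NFD and checking a form with isNormalized(), might look like this, assuming the intl extension is loaded. It reuses the same character as the slide.

```php
<?php
// U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE, precomposed (2 bytes in UTF-8).
$composed = "\xC3\x85";

// NFD splits it into 'A' plus U+030A COMBINING RING ABOVE (3 bytes).
$decomposed = Normalizer::normalize($composed, Normalizer::FORM_D);

// The decomposed string is in NFD but not in NFC.
$isNfd = Normalizer::isNormalized($decomposed, Normalizer::FORM_D);
$isNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C);

echo bin2hex($decomposed), "\n"; // 41cc8a
var_dump($isNfd);                // bool(true)
var_dump($isNfc);                // bool(false)
```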
  • 21. Grapheme functions
    • Graphemes are multi-character entities, like a letter plus accent mark(s)
    • Same as the string functions, but operating on grapheme units
    • grapheme_strlen, grapheme_substr, grapheme_strpos, grapheme_strstr
    • Extraction function (grapheme_extract): extract to fill a limited buffer, but always keep graphemes whole
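A small sketch contrasting byte-based and grapheme-based functions, assuming the intl extension is loaded. The sample word is illustrative: "résumé" written with combining accents, so each é is a letter plus a combining mark.

```php
<?php
// "résumé" with each é written as 'e' + U+0301 COMBINING ACUTE ACCENT.
$word = "re\xCC\x81sume\xCC\x81";

$bytes = strlen($word);              // counts bytes
$graphemes = grapheme_strlen($word); // counts user-visible characters

// The first two graphemes: "r" and "é" (4 bytes in total).
$firstTwo = grapheme_substr($word, 0, 2);

// Position of "s", counted in graphemes.
$pos = grapheme_strpos($word, 's');

// Fill a 3-byte buffer without splitting a grapheme: only "r" fits.
$fits = grapheme_extract($word, 3, GRAPHEME_EXTR_MAXBYTES);

echo $bytes, ' ', $graphemes, "\n"; // 10 6
echo bin2hex($firstTwo), "\n";      // 7265cc81
echo $pos, "\n";                    // 2
echo $fits, "\n";                   // r
```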
  • 22. IDN
    • עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org
    • русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org
    • idn_to_ascii
    • idn_to_utf8
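A round trip through both IDN functions can be sketched as follows, assuming the intl extension is loaded and built against an ICU version that provides the UTS #46 variant. Passing the variant explicitly is a choice made here so the call behaves the same across PHP versions; it is not something the slides mention.

```php
<?php
$unicode = 'русский.idn.icann.org';

// Unicode domain name to its ASCII (Punycode) form.
// The slide lists xn--h1acbxfam.idn.icann.org for this name.
$ascii = idn_to_ascii($unicode, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

// And back from ASCII to Unicode.
$roundTrip = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

echo $ascii, "\n";
var_dump($roundTrip === $unicode); // bool(true)
```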
  • 23. TODO
    • ResourceHandler
    • Transliteration
    • StringSearch
    • Tighter integration with other modules in 6.0
  • 24. Thanks!
    • http://php.net/intl for further information.