I18n with PHP 5.3
 


Talk by Stas Malyshev of Zend at ZendCon 2009

  • Globalization: formats, names, rules, and algorithms bring complexity and volume, and all of it has to be kept up to date.
  • Collation strength determines which character properties matter (a vs. à, a vs. A) and which characters matter (space, punctuation). Attributes control which case sorts first, which characters count as space or visible, whether normalization is used, and whether two representations of the same sound (e.g. Katakana/Hiragana) are treated as different.

I18n with PHP 5.3 Presentation Transcript

  • PHP Internationalization with ICU By Stas Malyshev, Zend Technologies
  • What and why?
    • ICU - http://icu-project.org/ (IBM)
    • Unicode
    • CLDR - http://cldr.unicode.org/
  • Intl extension
    • Locale
    • Collator
    • Number & Currency formatter
    • Date & Time formatter
    • Message & Choice formatter
    • Normalizer
    • Graphemes
    • IDN
    • Calendars
    • Resources
  • Intl extension
    • Dual API OO and procedural
    • Same implementation underneath
      • collator_create() == new Collator()
      • numfmt_format() == NumberFormatter::format()
      • locale_get_default() == Locale::getDefault()
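As a minimal sketch of the dual API, assuming the intl extension is loaded, the same comparison can be written in both styles. The sample strings and variable names are illustrative and not taken from the talk.

```php
<?php
// Object-oriented style.
$coll = new Collator('en_US');
$ooResult = $coll->compare('apple', 'banana');

// Procedural style: the collator object is passed as the first argument.
$coll2 = collator_create('en_US');
$procResult = collator_compare($coll2, 'apple', 'banana');

// Both styles share one implementation, so the results agree.
var_dump($ooResult === $procResult);
```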
  • Locale
    • Relies on ICU locales
    • <language>[_<script>]_<country>[_<variant>][@<keywords>]
    • Default locale
      • new Collator(Locale::DEFAULT_LOCALE)
      • Locale::setDefault(), Locale::getDefault()
      • You can also pass null to mean the default locale
  • Locale
    • Locale pieces
    • getPrimaryLanguage($locale)
    • getScript($locale)
    • getRegion($locale)
    • getVariant($locale)
    • getKeywords($locale)
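A short sketch of pulling a locale ID apart with these static methods, assuming the intl extension is available. The locale IDs are sample values chosen for illustration.

```php
<?php
$locale = 'sr_Latn_RS';

$language = Locale::getPrimaryLanguage($locale); // "sr"
$script   = Locale::getScript($locale);          // "Latn"
$region   = Locale::getRegion($locale);          // "RS"

// Keywords follow the "@" sign and come back as an associative array.
$keywords = Locale::getKeywords('de_DE@collation=phonebook');

echo "$language $script $region\n";
print_r($keywords);
```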
  • Locale
    • Locale display pieces
    • getDisplayName($locale, $in_locale = null)
    • getDisplayLanguage($locale, $in_locale = null)
    • getDisplayScript($locale, $in_locale = null)
    • getDisplayRegion($locale, $in_locale = null)
    • Example:
    • getDisplayScript("zh-Hant-TW", "en-US") returns "Traditional Chinese"
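The display methods can be sketched as follows, assuming the intl extension is loaded. The display strings come from the CLDR data bundled with ICU, so the exact wording of longer names can differ between ICU versions; only the two simplest results are annotated here.

```php
<?php
// Name of a language, rendered in English.
$languageName = Locale::getDisplayLanguage('fr', 'en');        // "French"

// Name of the region part of a locale, rendered in English.
$regionName = Locale::getDisplayRegion('sr_Latn_RS', 'en');    // "Serbia"

// Full display name; wording depends on the ICU/CLDR version in use.
$fullName = Locale::getDisplayName('sr_Latn_RS', 'en');

echo "$languageName / $regionName / $fullName\n";
```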
  • Locale building blocks
    • parseLocale() - returns an array composed of locale subtags
    • composeLocale() - creates a locale ID out of subtags
    • parseLocale('sr-Latn-RS') returns
    • array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')
    • composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
  • Locale guessing
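    • acceptFromHttp() - picks the best locale from an Accept-Language header
    • lookup() - finds the closest match for a locale in a list of language tags
    • filterMatches() - checks whether a language tag matches a locale range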
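A rough sketch of the three guessing helpers, assuming the intl extension is loaded. The header value and the list of supported languages are made-up sample data, and the exact form of the returned locale strings may vary with the ICU version.

```php
<?php
// Accept-Language header as a browser might send it (sample value).
$header = 'fr-CA,fr;q=0.8,en;q=0.5';
$best = Locale::acceptFromHttp($header);

// Does the language tag fall within the given range?
$matches = Locale::filterMatches('de-DE-1996', 'de-DE'); // true
$noMatch = Locale::filterMatches('fr-FR', 'de-DE');      // false

// Pick the closest supported tag for the requested locale,
// falling back to 'en' when nothing matches.
$supported = array('de', 'en', 'fr');
$chosen = Locale::lookup($supported, 'fr-CA', false, 'en');

var_dump($best, $matches, $noMatch, $chosen);
```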
  • Collator
    • Comparing, sorting strings
    • Collation level (strength)
    • All ICU collator attributes
      • Numeric collation
      • Ignoring punctuation
    • Not yet: custom “tailoring” rules
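The strength and attribute settings listed above can be sketched like this, assuming the intl extension is loaded. The sample strings are illustrative only.

```php
<?php
$coll = new Collator('en_US');

// Strength: at PRIMARY level only base letters matter, so case is ignored.
$coll->setStrength(Collator::PRIMARY);
$caseIgnored = $coll->compare('a', 'A'); // 0

// Numeric collation: runs of digits compare by numeric value.
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$files = array('file10', 'file2', 'file1');
$coll->sort($files); // file1, file2, file10

// Ignoring punctuation: "shifted" alternate handling moves punctuation
// out of the levels compared at this strength.
$coll->setAttribute(Collator::ALTERNATE_HANDLING, Collator::SHIFTED);
$punctuationIgnored = $coll->compare('black-bird', 'blackbird'); // 0

print_r($files);
```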
  • Collator
    • $coll = new Collator("fr_CA");
    • if ($coll->compare("côte", "coté") < 0) {
    •     echo "less ";
    • } else {
    •     echo "greater ";
    • }
    • // Result: côte < coté
  • Collator
    • $strings = array("cote", "côte", "Côte", "coté", "Coté", "côté", "Côté", "coter");
    • $coll = new Collator("fr_CA");
    • $coll->sort($strings);
    • // Result: cote côte Côte coté Coté côté Côté coter
    • Sorting methods: sort($array, $flags), asort($array, $flags), sortWithSortKeys($array)
  • NumberFormatter
    • Formatting and parsing
    • Numbers and currency
    • numfmt_create($locale, $style, $pattern = null)
    • Styles: NumberFormatter::PATTERN_DECIMAL, NumberFormatter::DECIMAL, NumberFormatter::CURRENCY, NumberFormatter::PERCENT, NumberFormatter::SCIENTIFIC, NumberFormatter::SPELLOUT, NumberFormatter::ORDINAL, NumberFormatter::DURATION
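A few of the styles can be sketched as follows, assuming the intl extension is loaded. The numbers and the '#,##0.00' pattern are sample values, not taken from the talk.

```php
<?php
// Spell a number out in words.
$spell = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$words = $spell->format(42); // "forty-two"

// Format a fraction as a percentage.
$percent = new NumberFormatter('en_US', NumberFormatter::PERCENT);
$share = $percent->format(0.25); // "25%"

// Format an amount in a given currency.
$money = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$price = $money->formatCurrency(1234.5, 'USD'); // "$1,234.50"

// PATTERN_DECIMAL takes an explicit pattern as the third argument.
$custom = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');
$fixed = $custom->format(1234.5); // "1,234.50"

echo "$words | $share | $price | $fixed\n";
```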
  • NumberFormatter
    • Formatting
    • $fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
    • echo $fmt->format(1234); // result is 1,234
    • $fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    • echo $fmt->format(1234); // result is 1'234
  • NumberFormatter
    • Parsing
    • $fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
    • $num = '1.234,567 min';
    • $fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos); // result is 1234.567, $pos = 9
    • $fmt->parse($num, NumberFormatter::TYPE_INT32); // result is 1234
  • MessageFormatter
    • Formatting and parsing whole messages, including data inside
    • Also allows choice between things printed:
      • 0≤are no files|1≤is one file|1<are many files
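The parsing direction mentioned above can be sketched as follows, assuming the intl extension is loaded: the same pattern that formats a message can also pull the numbers back out of a formatted string. The pattern and input text are sample values.

```php
<?php
$fmt = new MessageFormatter(
    'en_US',
    '{0,number,integer} monkeys on {1,number,integer} trees'
);

// Format: placeholders are filled from the argument array.
$message = $fmt->format(array(4560, 123)); // "4,560 monkeys on 123 trees"

// Parse: the values are recovered from the formatted text.
$values = $fmt->parse('4,560 monkeys on 123 trees');

echo $message, "\n";
print_r($values);
```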
  • MessageFormatter
    • $fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
    • echo $fmt->format(array(4560, 123, 4560 / 123));
    • $fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
    • echo $fmt->format(array(4560, 123, 4560 / 123));
  • IntlDateFormatter
    • Allows using locale-dependent canned patterns
    • Short, medium, long date & time
      • Long: Tuesday, April 12, 1952 AD or 3:30:42pm PST
      • Medium: January 12, 1952 or 3:30:32pm
      • Short: 12/13/52 or 3:30pm
    • Also allows free-form patterns
      • "yyyy.MM.dd G 'at' HH:mm:ss vvvv"
      • 1996.07.10 AD at 15:08:56 Pacific Time
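Both kinds of pattern can be sketched as follows, assuming the intl extension is loaded. The timestamp 0 and the time zone are sample values, and the free-form pattern here leaves out the zone field so that the result does not depend on how a given ICU version names the zone.

```php
<?php
// Canned pattern: short date, no time.
$short = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::SHORT,
    IntlDateFormatter::NONE,
    'America/Los_Angeles'
);
$shortDate = $short->format(0); // "12/31/69"

// Free-form pattern passed as the sixth constructor argument.
$custom = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::NONE,
    IntlDateFormatter::NONE,
    'America/Los_Angeles',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss"
);
$stamp = $custom->format(0); // "1969.12.31 AD at 16:00:00"

echo $shortDate, "\n", $stamp, "\n";
```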
  • IntlDateFormatter
    • $fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    • echo $fmt->format(0); // Wednesday, December 31, 1969 4:00:00 PM PT
    • $fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    • echo $fmt->format(0); // Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00
  • Normalizer
    • Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD
    • normalize(), isNormalized()
    • $combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
    • $chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
    • echo urlencode($chars); // %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
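As a small sketch of isNormalized() alongside normalize(), assuming the intl extension is loaded, the check below reports whether a string is already in a given form. The variable names are illustrative.

```php
<?php
// "A" followed by COMBINING RING ABOVE (U+030A): a decomposed sequence.
$decomposed = "A\xCC\x8A";

// Compose it into the single code point U+00C5.
$composed = Normalizer::normalize($decomposed, Normalizer::FORM_C);

$beforeIsNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C); // false
$afterIsNfc  = Normalizer::isNormalized($composed, Normalizer::FORM_C);   // true

var_dump($beforeIsNfc, $afterIsNfc, strlen($decomposed), strlen($composed));
```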
  • Grapheme functions
    • Graphemes are multi-char entities, like letter + accent mark(s)
    • Same as string functions, but operate on grapheme units
    • grapheme_strlen(), grapheme_substr(), grapheme_strpos(), grapheme_strstr()
    • Extraction function grapheme_extract() – extracts text to fill a limited buffer, but always keeps graphemes whole
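The difference between byte-oriented and grapheme-oriented functions can be sketched as follows, assuming the intl extension is loaded. The sample string, written with byte escapes, is "ab", then "c" with a combining ring above, then "d".

```php
<?php
// 6 bytes, 5 code points, 4 graphemes.
$text = "abc\xCC\x8Ad";

$bytes     = strlen($text);          // 6
$graphemes = grapheme_strlen($text); // 4

// The third grapheme is "c" plus its combining mark (3 bytes).
$third = grapheme_substr($text, 2, 1);

// Extract at most 4 bytes without splitting a grapheme: the 3-byte
// grapheme does not fit after "ab", so only "ab" is returned.
$fits = grapheme_extract($text, 4, GRAPHEME_EXTR_MAXBYTES);

var_dump($bytes, $graphemes, bin2hex($third), $fits);
```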
  • IDN
    • עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org
    • русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org
    • idn_to_ascii
    • idn_to_utf8
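A round trip through the two IDN functions can be sketched as follows. This assumes a PHP and ICU build that provides the INTL_IDNA_VARIANT_UTS46 constant, which is newer than the PHP 5.3 release the talk covers; the domain name is a sample value written with byte escapes so the example does not depend on file encoding.

```php
<?php
// "bücher.example" in UTF-8.
$unicodeName = "b\xC3\xBCcher.example";

// Unicode to ASCII-compatible (Punycode) form.
$ascii = idn_to_ascii($unicodeName, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
// "xn--bcher-kva.example"

// And back to the Unicode form.
$roundTrip = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

var_dump($ascii, $roundTrip === $unicodeName);
```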
  • TODO
    • ResourceHandler
    • Transliteration
    • StringSearch
    • Tighter integration with other modules in PHP 6.0
  • Thanks!
    • See http://php.net/intl for further information.