PHP Internationalization with ICU By Stas Malyshev, Zend Technologies
I18n with PHP 5.3

Talk by Stas Malyshev of Zend at ZendCon 2009

Published in: Technology
  • Globalization: formats, names, rules, algorithms. The difficulty is their complexity and volume, and keeping it all up to date.
  • Strength determines which character properties matter (a vs. à, a vs. A) and which characters matter (space, punctuation). Attributes control which case sorts first, which characters are considered space/visible, whether to use normalization, and whether two representations of the same text (e.g. Katakana/Hiragana) are treated as different.
  • Transcript of "I18n with PHP 5.3"

    1. PHP Internationalization with ICU, by Stas Malyshev, Zend Technologies
    2. What and why?
       • ICU - http://icu-project.org/ (IBM)
       • Unicode
       • CLDR - http://cldr.unicode.org/
    3. Intl extension
       • Locale
       • Collator
       • Number & Currency formatter
       • Date & Time formatter
       • Message & Choice formatter
       • Normalizer
       • Graphemes
       • IDN
       • Calendars
       • Resources
    4. Intl extension
       • Dual API: OO and procedural
       • Same implementation underneath
           • collator_create() == new Collator()
           • numfmt_format() == NumberFormatter::format()
           • locale_get_default() == Locale::getDefault()
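The equivalence of the two API styles can be sketched as follows. This is a minimal illustration that assumes the intl extension is loaded; the variable names and the sample number are made up for the example.

```php
<?php
// Assumes the intl extension is available.
$locale = 'en_US';
$number = 1234567.891;

// Object-oriented style.
$ooFormatter = new NumberFormatter($locale, NumberFormatter::DECIMAL);
$ooResult = $ooFormatter->format($number);

// Procedural style: numfmt_create() returns the same kind of formatter object.
$procFormatter = numfmt_create($locale, NumberFormatter::DECIMAL);
$procResult = numfmt_format($procFormatter, $number);

// Both calls go through the same ICU formatter, so the strings are identical.
var_dump($ooResult === $procResult);
```

Which style to use is a matter of taste; mixing them on the same object also works, since the procedural functions simply take the object as their first argument.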
    5. Locale
       • Relies on ICU locales
       • <language>[_<script>]_<country>[_<variant>][@<keywords>]
       • Default locale
           • new Collator(Locale::DEFAULT_LOCALE)
           • Locale::setDefault, Locale::getDefault
           • You can use null
    6. Locale
       • Locale pieces
       • getPrimaryLanguage($locale)
       • getScript($locale)
       • getRegion($locale)
       • getVariant($locale)
       • getKeywords($locale)
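A short sketch of pulling a locale ID apart with the accessors listed above. The locale IDs are sample values chosen for illustration, and the intl extension is assumed to be loaded.

```php
<?php
// Sample locale IDs, chosen only for illustration.
$locale = 'sr_Latn_RS';

$language = Locale::getPrimaryLanguage($locale); // language subtag
$script   = Locale::getScript($locale);          // script subtag
$region   = Locale::getRegion($locale);          // region subtag

// Keywords follow the "@" sign and come back as an associative array.
$keywords = Locale::getKeywords('de_DE@collation=phonebook');

printf("%s / %s / %s\n", $language, $script, $region);
print_r($keywords);
```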
    7. Locale
       • Locale display pieces
       • getDisplayName($locale, $in_locale = null)
       • getDisplayLanguage($locale, $in_locale = null)
       • getDisplayScript($locale, $in_locale = null)
       • getDisplayRegion($locale, $in_locale = null)
       • Example: getDisplayScript("zh-Hant-TW", "en-US") returns the English name of the script (given on the slide as “Traditional Chinese”)
    8. Locale building blocks
       • parseLocale() - returns an array composed of locale subtags
       • composeLocale() - creates a locale ID out of subtags
       • parseLocale('sr-Latn-RS') returns array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')
       • composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
    9. Locale guessing
       • acceptFromHttp – Accept-Language header to locale
       • lookup – find the best match in a list
       • filterMatches – does a language tag match a locale?
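The three guessing helpers might be combined roughly like this. The header string and the list of supported locales are hypothetical, and the locale that acceptFromHttp picks depends on which locales the installed ICU knows about.

```php
<?php
// Hypothetical Accept-Language header, as a browser might send it.
$header = 'fr-CA,fr;q=0.8,en;q=0.5';

// Pick the best ICU locale for the header.
$preferred = Locale::acceptFromHttp($header);

// Hypothetical list of locales the application has translations for.
$supported = array('en', 'fr', 'de');

// Find the closest supported entry, falling back to 'en' when nothing matches.
$best = Locale::lookup($supported, $preferred, false, 'en');

// Check whether a language tag falls under a broader locale range.
$matches = Locale::filterMatches('fr-CA', 'fr', false);

var_dump($preferred, $best, $matches);
```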
    10. Collator
        • Comparing, sorting strings
        • Collation level (strength)
        • All ICU collator attributes
            • Numeric collation
            • Ignoring punctuation
        • Not yet: custom “tailoring” rules
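A sketch of how strength and attributes change comparison results. The strings are sample data, and the intl extension is assumed to be loaded.

```php
<?php
$coll = new Collator('en_US');

// Primary strength: only base letters matter, accents and case are ignored.
$coll->setStrength(Collator::PRIMARY);
$primary = $coll->compare('cote', 'Côte');

// Tertiary strength: accents and case differences are significant again.
$coll->setStrength(Collator::TERTIARY);
$tertiary = $coll->compare('cote', 'Côte');

// Numeric collation: digit runs are compared by numeric value.
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$files = array('file10', 'file9', 'file1');
$coll->sort($files);

// Ignoring punctuation: "shifted" alternate handling makes it ignorable.
$coll->setAttribute(Collator::ALTERNATE_HANDLING, Collator::SHIFTED);
$punctuation = $coll->compare('black-bird', 'blackbird');

var_dump($primary, $tertiary, $files, $punctuation);
```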
    11. Collator
            $coll = new Collator("fr_CA");
            if ($coll->compare("côte", "coté") < 0) {
                echo "less ";
            } else {
                echo "greater ";
            }
        Result: côte < coté
    12. Collator
            $strings = array("cote", "côte", "Côte", "coté", "Coté", "côté", "Côté", "coter");
            $coll = new Collator("fr_CA");
            $coll->sort($strings);
        Sorted order: cote côte Côte coté Coté côté Côté coter
        Sorting methods:
        • sort($array, $flags)
        • asort($array, $flags)
        • sortWithSortKeys($array)
    13. NumberFormatter
        • Formatting and parsing
        • Numbers and currency
        • numfmt_create($locale, $style, $pattern = null)
        • Styles:
            • NumberFormatter::PATTERN_DECIMAL
            • NumberFormatter::DECIMAL
            • NumberFormatter::CURRENCY
            • NumberFormatter::PERCENT
            • NumberFormatter::SCIENTIFIC
            • NumberFormatter::SPELLOUT
            • NumberFormatter::ORDINAL
            • NumberFormatter::DURATION
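A few of the styles listed above, sketched with sample values. The amounts and the pattern string are illustrative, and the exact output of locale-driven styles can vary with the ICU version.

```php
<?php
$amount = 1234.5;

// CURRENCY style: formatCurrency() takes the ISO 4217 currency code.
$currency = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$asCurrency = $currency->formatCurrency($amount, 'USD');

// SPELLOUT style: writes the number in words.
$spellout = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$asWords = $spellout->format(42);

// PATTERN_DECIMAL style: the third argument is an ICU decimal pattern.
$pattern = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');
$asPattern = $pattern->format($amount);

echo $asCurrency, "\n", $asWords, "\n", $asPattern, "\n";
```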
    14. NumberFormatter
        • Formatting
            $fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
            echo $fmt->format(1234); // result is 1,234
            $fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
            echo $fmt->format(1234); // result is 1'234
    15. NumberFormatter
        • Parsing
            $fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
            $num = '1.234,567 min';
            $pos = 0; // parsing starts here; updated to where parsing stopped
            $fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos); // result is 1234.567, $pos = 9
            $fmt->parse($num, NumberFormatter::TYPE_INT32); // result is 1234
    16. MessageFormatter
        • Formatting and parsing whole messages, including data inside
        • Also allows choice between things printed:
            • 0≤are no files|1≤is one file|1<are many files
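The choice syntax can be embedded in a message pattern along these lines. This is a sketch: the surrounding sentence is made up, and "#" is used as the ASCII equivalent of "≤" in the choice limits.

```php
<?php
// Choice argument: the limit that matches the number selects the text.
$pattern = 'There {0,choice,0#are no files|1#is one file|1<are many files}.';
$fmt = new MessageFormatter('en_US', $pattern);

$none = $fmt->format(array(0));
$one  = $fmt->format(array(1));
$many = $fmt->format(array(5));

echo $none, "\n", $one, "\n", $many, "\n";
```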
    17. MessageFormatter
            $fmt = new MessageFormatter("en_US",
                "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
            echo $fmt->format(array(4560, 123, 4560 / 123));
            $fmt = new MessageFormatter("de",
                "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
            echo $fmt->format(array(4560, 123, 4560 / 123));
    18. IntlDateFormatter
        • Allows using locale-dependent canned patterns
        • Short, medium, long, and full date & time
            • Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST
            • Long: January 12, 1952 or 3:30:32pm
            • Short: 12/13/52 or 3:30pm
        • Also allows free-form patterns
            • "yyyy.MM.dd G 'at' HH:mm:ss vvvv"
            • 1996.07.10 AD at 15:08:56 Pacific Time
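The free-form pattern from the slide can be passed as the last constructor argument. This is a sketch that assumes the intl extension is loaded; the timestamp 0 is just a convenient sample, and the time zone name produced by "vvvv" depends on the ICU data.

```php
<?php
// The date/time styles are ignored once an explicit pattern is supplied.
$fmt = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::SHORT,
    IntlDateFormatter::SHORT,
    'America/Los_Angeles',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss vvvv"
);

// Timestamp 0 is 1970-01-01 00:00:00 UTC, i.e. 16:00 the previous day in Los Angeles.
$formatted = $fmt->format(0);
echo $formatted, "\n";
```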
    19. IntlDateFormatter
            $fmt = new IntlDateFormatter("en_US",
                IntlDateFormatter::FULL, IntlDateFormatter::FULL,
                'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
            echo $fmt->format(0); // Wednesday, December 31, 1969 4:00:00 PM PT
            $fmt = new IntlDateFormatter("de-DE",
                IntlDateFormatter::FULL, IntlDateFormatter::FULL,
                'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
            echo $fmt->format(0); // Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00
    20. Normalizer
        • Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD
        • normalize(), isNormalized()
            $combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
            $chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
            echo urlencode($chars); // %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
    21. Grapheme functions
        • Graphemes are multi-char entities, like letter + accent mark(s)
        • Same as string functions, but operate on grapheme units
        • grapheme_strlen, grapheme_substr, grapheme_strpos, grapheme_strstr
        • Extraction function (grapheme_extract) – extract to fill a limited buffer, but always keep graphemes whole
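The difference between byte-based and grapheme-based functions can be sketched as follows. The sample string is made up: two ASCII letters followed by "a" plus a combining ring above, which together form a single grapheme.

```php
<?php
// "bc" followed by 'a' + COMBINING RING ABOVE (U+030A, two bytes in UTF-8).
$text = "bc" . "a\xCC\x8A";

$bytes     = strlen($text);           // counts bytes
$graphemes = grapheme_strlen($text);  // counts grapheme units

// Take the third grapheme: the letter together with its combining mark.
$last = grapheme_substr($text, 2, 1);

// Extract at most 3 bytes without splitting a grapheme: the third byte would
// separate 'a' from its combining mark, so only the whole graphemes that fit
// are returned.
$chunk = grapheme_extract($text, 3, GRAPHEME_EXTR_MAXBYTES);

var_dump($bytes, $graphemes, bin2hex($last), $chunk);
```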
    22. IDN
        • עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org
        • русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org
        • idn_to_ascii
        • idn_to_utf8
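A round trip through the two IDN functions, using one of the domains from the slide. Passing INTL_IDNA_VARIANT_UTS46 explicitly is an assumption made here for newer PHP versions; it is not part of the slide, and that constant did not exist yet in PHP 5.3.

```php
<?php
$unicodeDomain = 'русский.idn.icann.org';

// Unicode to ASCII (Punycode) form. The variant argument is an assumption
// for recent PHP versions, not something the slide specifies.
$ascii = idn_to_ascii($unicodeDomain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

// And back from the ASCII form to UTF-8.
$roundTrip = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

echo $ascii, "\n", $roundTrip, "\n";
```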
    23. TODO
        • ResourceHandler
        • Transliteration
        • StringSearch
        • Tighter integration with other modules in 6.0
    24. Thanks!
        • See http://php.net/intl for further information.