I18n with PHP 5.3

Talk by Stas Malyshev of Zend at ZendCon 2009

  • Globalization covers formats, names, rules, and algorithms. The difficulty is their complexity and volume, and keeping it all up to date.
  • Collator strength determines which character properties matter (a vs. à, a vs. A) and which characters matter (space, punctuation). Attributes control which case sorts first, which characters are treated as space or as visible, whether normalization is used, and whether two representations of the same text (e.g. Katakana/Hiragana) compare as different.

Transcript

  • 1. PHP Internationalization with ICU By Stas Malyshev, Zend Technologies
  • 2. What and why?
    • ICU - http://icu-project.org/ (IBM)
    • Unicode
    • CLDR - http://cldr.unicode.org/
  • 3. Intl extension
    • Locale
    • Collator
    • Number & Currency formatter
    • Date & Time formatter
    • Message & Choice formatter
    • Normalizer
    • Graphemes
    • IDN
    • Calendars
    • Resources
  • 4. Intl extension
    • Dual API: OO and procedural
    • Same implementation underneath
      • collator_create() == new Collator()
      • numfmt_format() == NumberFormatter::format()
      • locale_get_default() == Locale::getDefault()
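The equivalence of the two APIs can be checked with a short sketch. It assumes the intl extension is loaded; the locale and the number are illustrative choices, not taken from the talk.

```php
<?php
// Assumption: ext/intl is available. 'en_US' and 1234.5 are arbitrary examples.
$oo   = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$proc = numfmt_create('en_US', NumberFormatter::DECIMAL);

// Object-oriented call and procedural call on equivalent formatters.
$viaMethod   = $oo->format(1234.5);
$viaFunction = numfmt_format($proc, 1234.5);

// Both paths go through the same implementation, so the strings match.
var_dump($viaMethod === $viaFunction);
```

The procedural functions take the formatter object as their first argument, so the two styles can be mixed freely within one code base.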
  • 5. Locale
    • Relies on ICU locales
    • <language>[_<script>]_<country>[_<variant>][@<keywords>]
    • Default locale
      • new Collator(Locale::DEFAULT_LOCALE)
      • Locale::setDefault, Locale::getDefault
      • You can use null
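A minimal sketch of working with the default locale follows. The locale 'de_DE' is an arbitrary example, and the sketch passes the default explicitly rather than relying on null, which is a stylistic assumption on my part.

```php
<?php
// Assumption: ext/intl is available. 'de_DE' is only an example locale.
Locale::setDefault('de_DE');
$default = Locale::getDefault();

// Passing the default explicitly keeps the intent visible in the code.
$fmt = new NumberFormatter($default, NumberFormatter::DECIMAL);
$formatted = $fmt->format(1234.5);

echo $default, "\n";    // de_DE
echo $formatted, "\n";  // 1.234,5
```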
  • 6. Locale
    • Locale pieces
    • getPrimaryLanguage($locale)
    • getScript($locale)
    • getRegion($locale)
    • getAllVariants($locale)
    • getKeywords($locale)
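The accessors listed on this slide can be sketched as follows. The locale IDs are illustrative; note that in the shipped extension the variants are read with Locale::getAllVariants(), which returns an array.

```php
<?php
// Assumption: ext/intl is available. Both locale IDs are made-up examples.
$locale = 'sr_Latn_RS@currency=RSD';

$language = Locale::getPrimaryLanguage($locale); // 'sr'
$script   = Locale::getScript($locale);          // 'Latn'
$region   = Locale::getRegion($locale);          // 'RS'
$keywords = Locale::getKeywords($locale);        // array('currency' => 'RSD')

// Variants come back as a list, since a locale may carry several.
$variants = Locale::getAllVariants('sl_IT_NEDIS');

printf("%s %s %s\n", $language, $script, $region);
print_r($keywords);
print_r($variants);
```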
  • 7. Locale
    • Locale display pieces
    • getDisplayName($locale, $in_locale = null)
    • getDisplayLanguage($locale, $in_locale = null)
    • getDisplayScript($locale, $in_locale = null)
    • getDisplayRegion($locale, $in_locale = null)
    • Example:
    • getDisplayScript("zh-Hant-TW", "en-US") returns "Traditional Chinese"
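As a rough sketch of the display functions, the snippet below asks for the name of one locale in two different display languages. The locales are illustrative, and the exact wording of longer names depends on the ICU data version installed, so only the short, stable names are annotated.

```php
<?php
// Assumption: ext/intl is available. Locale IDs are examples only.
$locale = 'sr-Latn-RS';

$languageInEnglish = Locale::getDisplayLanguage($locale, 'en'); // Serbian
$regionInEnglish   = Locale::getDisplayRegion($locale, 'en');   // Serbia
$germanInFrench    = Locale::getDisplayLanguage('de', 'fr');    // allemand

// The full display name combines language, script, and region;
// its exact formatting comes from the installed ICU data.
$fullName = Locale::getDisplayName($locale, 'en');

echo $languageInEnglish, ' / ', $regionInEnglish, ' / ', $germanInFrench, "\n";
echo $fullName, "\n";
```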
  • 8. Locale building blocks
    • parseLocale() - returns array composed of locale subtags
    • composeLocale() - creates locale ID out of subtags
    • parseLocale('sr-Latn-RS') returns
    • array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')
    • composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
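One use of these building blocks is to change a single subtag and rebuild the ID. The sketch below swaps the region; the replacement region 'ME' is a hypothetical choice for illustration.

```php
<?php
// Assumption: ext/intl is available. 'ME' is an arbitrary replacement region.
$tags = Locale::parseLocale('sr-Latn-RS');

// Modify one subtag and compose a new locale ID from the array.
$tags['region'] = 'ME';
$id = Locale::composeLocale($tags);

// Parse again to confirm the round trip kept the other subtags.
$check = Locale::parseLocale($id);

echo $id, "\n"; // sr_Latn_ME
```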
  • 9. Locale guessing
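    • acceptFromHttp – picks the best locale from an HTTP Accept-Language header
    • lookup – finds the best match for a locale in a list of language tags
    • filterMatches – checks whether a language tag matches a locale range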
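The three functions can be combined into a simple locale negotiation, sketched below. The header value and the list of supported tags are hypothetical; a real application would read the header from $_SERVER['HTTP_ACCEPT_LANGUAGE'].

```php
<?php
// Assumption: ext/intl is available. Header and supported list are made up.
$header    = 'fr-CA,fr;q=0.8,en;q=0.5';
$supported = array('en', 'fr', 'de');

// Best locale according to the browser's preferences.
$preferred = Locale::acceptFromHttp($header);

// Lookup shortens the requested locale until a supported tag matches,
// falling back to 'en' if nothing does.
$best = Locale::lookup($supported, $preferred, false, 'en');

// filterMatches tests one tag against one range.
$frMatches = Locale::filterMatches('fr-CA', 'fr', false);
$deMatches = Locale::filterMatches('fr-CA', 'de', false);

echo $preferred, ' -> ', $best, "\n";
var_dump($frMatches, $deMatches);
```

Lookup truncates the requested locale, not the supported tags, so listing general tags such as 'fr' gives regional requests something to fall back to.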
  • 10. Collator
    • Comparing, sorting strings
    • Collation level (strength)
    • All ICU collator attributes
      • Numeric collation
      • Ignoring punctuation
    • Not yet: custom “tailoring” rules
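The strength setting and the attributes mentioned above can be sketched as follows. The sample strings are illustrative, and separate collators are used so that one setting does not affect the next comparison.

```php
<?php
// Assumption: ext/intl is available. All sample strings are examples.

// Strength: at PRIMARY level, accents and case are ignored.
$primary = new Collator('en_US');
$primary->setStrength(Collator::PRIMARY);
$sameAtPrimary = $primary->compare('cote', 'Côte'); // 0

// Numeric collation: digit runs compare by numeric value.
$numeric = new Collator('en_US');
$numeric->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$files = array('file10', 'file2', 'file1');
$numeric->sort($files); // file1, file2, file10

// Ignoring punctuation: "shifted" alternate handling skips it
// at the first three comparison levels.
$shifted = new Collator('en_US');
$shifted->setAttribute(Collator::ALTERNATE_HANDLING, Collator::SHIFTED);
$samePunct = $shifted->compare('black-bird', 'blackbird'); // 0

var_dump($sameAtPrimary, $files, $samePunct);
```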
  • 11. Collator
    • $coll = new Collator("fr_CA");
    • if ($coll->compare("côte", "coté") < 0) {
    •     echo "less ";
    • } else {
    •     echo "greater ";
    • }
    • Result: côte < coté
  • 12. Collator
    • $strings = array("cote", "côte", "Côte", "coté", "Coté", "côté", "Côté", "coter");
    • $coll = new Collator("fr_CA");
    • $coll->sort($strings);
    • Result: cote côte Côte coté Coté côté Côté coter
    • Sorting methods: sort($array, $flags), asort($array, $flags), sortWithSortKeys($array)
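To illustrate asort(), the sketch below contrasts PHP's built-in byte-order asort() with the collator's version on a keyed array. The data is made up; the point is that keys are preserved while values are ordered by locale rules.

```php
<?php
// Assumption: ext/intl is available. The array contents are examples.
$byBytes    = array('b' => 'pear', 'a' => 'apple', 'c' => 'Banana');
$byCollator = $byBytes;

// Plain asort() compares bytes, so the capital "B" sorts first.
asort($byBytes);

// Collator::asort() keeps the keys but orders values linguistically.
$coll = new Collator('en_US');
$coll->asort($byCollator);

print_r(array_keys($byBytes));    // c, a, b
print_r(array_keys($byCollator)); // a, c, b
```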
  • 13. NumberFormatter
    • Formatting and parsing
    • Numbers and currency
    • numfmt_create($locale, $style, $pattern = null)
    • Styles: NumberFormatter::PATTERN_DECIMAL, NumberFormatter::DECIMAL, NumberFormatter::CURRENCY, NumberFormatter::PERCENT, NumberFormatter::SCIENTIFIC, NumberFormatter::SPELLOUT, NumberFormatter::ORDINAL, NumberFormatter::DURATION
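A few of these styles are sketched below for the en_US locale. The values and the pattern string are illustrative; output for other locales and other styles depends on the installed ICU data.

```php
<?php
// Assumption: ext/intl is available. Numbers and pattern are examples.
$spell = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$words = $spell->format(42);                    // forty-two

$pct = new NumberFormatter('en_US', NumberFormatter::PERCENT);
$percent = $pct->format(0.25);                  // 25%

$cur = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$price = $cur->formatCurrency(1234.5, 'USD');   // $1,234.50

// PATTERN_DECIMAL takes an explicit pattern as the third argument.
$pat = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');
$fixed = $pat->format(1234.5);                  // 1,234.50

echo implode("\n", array($words, $percent, $price, $fixed)), "\n";
```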
  • 14. NumberFormatter
    • Formatting
    $fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
    echo $fmt->format(1234); // result is 1,234
    $fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $fmt->format(1234); // result is 1'234
  • 15. NumberFormatter
    • Parsing
    $fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
    $num = '1.234,567 min';
    $pos = 0; // offset where parsing starts; updated to where it stopped
    $fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos); // result is 1234.567, $pos = 9
    $fmt->parse($num, NumberFormatter::TYPE_INT32); // result is 1234
  • 16. MessageFormatter
    • Formatting and parsing whole messages, including data inside
    • Also allows choice between things printed:
      • 0≤are no files|1≤is one file|1<are many files
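The choice pattern above can be embedded in a message as sketched here. The surrounding sentence is my own illustration, and the sketch uses '#', the ASCII form that ICU accepts in place of '≤'.

```php
<?php
// Assumption: ext/intl is available. The sentence around the choice
// argument is illustrative; '#' stands in for the slide's '≤'.
$pattern = 'There {0,choice,0#are no files|1#is one file|1<are many files}.';

$none = MessageFormatter::formatMessage('en_US', $pattern, array(0));
$one  = MessageFormatter::formatMessage('en_US', $pattern, array(1));
$many = MessageFormatter::formatMessage('en_US', $pattern, array(5));

echo $none, "\n"; // There are no files.
echo $one, "\n";  // There is one file.
echo $many, "\n"; // There are many files.
```

Each limit selects the branch for values from that limit up to the next one, so 0 covers "none", exactly 1 covers "one", and anything greater than 1 falls into the last branch.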
  • 17. MessageFormatter
    $fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
    echo $fmt->format(array(4560, 123, 4560 / 123));
    $fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
    echo $fmt->format(array(4560, 123, 4560 / 123));
  • 18. IntlDateFormatter
    • Allows using locale-dependent canned patterns
    • Short, medium, long, and full date & time styles
      • Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST
      • Long: January 12, 1952 or 3:30:32pm
      • Medium: Jan 12, 1952
      • Short: 12/13/52 or 3:30pm
    • Also allows free-form patterns
      • "yyyy.MM.dd G 'at' HH:mm:ss vvvv"
      • 1996.07.10 AD at 15:08:56 Pacific Time
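A free-form pattern is passed as the last constructor argument, as in the sketch below. The time zone field is left out of the pattern here because its text varies with the ICU data version; the timestamp 0 and the zone are illustrative.

```php
<?php
// Assumption: ext/intl is available. Timestamp and zone are examples.
$fmt = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::NONE,   // the date style is replaced by the pattern
    IntlDateFormatter::NONE,   // likewise for the time style
    'America/Los_Angeles',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd G 'at' HH:mm:ss"
);

$text = $fmt->format(0);       // Unix epoch, shown in Los Angeles time
echo $text, "\n";              // 1969.12.31 AD at 16:00:00

// The same pattern also drives parsing back to a timestamp.
$timestamp = $fmt->parse($text);
var_dump($timestamp);
```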
  • 19. IntlDateFormatter
    $fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    echo $fmt->format(0); // Wednesday, December 31, 1969 4:00:00 PM PT
    $fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
    echo $fmt->format(0); // Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00
  • 20. Normalizer
    • Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD
    • normalize(), isNormalized()
    $combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
    $chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
    echo urlencode($chars); // %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
  • 21. Grapheme functions
    • Graphemes are multi-char entities, like letter + accent mark(s)
    • Same as string functions, but operate on grapheme units
    • grapheme_strlen, grapheme_substr, grapheme_strpos, grapheme_strstr
    • Extraction function (grapheme_extract) – extract to fill a limited buffer, but always keep graphemes whole
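The difference between byte-oriented and grapheme-oriented functions can be sketched with a short string that contains one combining mark. The sample string and the 3-byte buffer limit are illustrative.

```php
<?php
// Assumption: ext/intl is available. The string is 'a', then 'b' followed
// by COMBINING RING ABOVE (U+030A, two bytes in UTF-8), then 'c'.
$text = "ab\xCC\x8Ac";

$bytes     = strlen($text);                // 5 bytes
$graphemes = grapheme_strlen($text);       // 3 grapheme units

// The second unit is the letter together with its combining mark.
$second = grapheme_substr($text, 1, 1);    // "b\xCC\x8A"
$posOfC = grapheme_strpos($text, 'c');     // 2

// With a 3-byte limit only 'a' fits; 'b' plus its mark needs 3 more bytes
// and is left out whole rather than being cut in the middle.
$fits = grapheme_extract($text, 3, GRAPHEME_EXTR_MAXBYTES);

var_dump($bytes, $graphemes, $posOfC, $fits);
```

A plain substr($text, 0, 3) would instead return the first three bytes and split the combining mark, leaving invalid UTF-8.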
  • 22. IDN
    • עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org
    • русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org
    • idn_to_ascii
    • idn_to_utf8
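A round trip through both functions can be sketched as below, using the Russian test domain from the slide. Passing INTL_IDNA_VARIANT_UTS46 explicitly is an assumption on my part: that constant was added after PHP 5.3, and it avoids the deprecated IDNA 2003 default on newer PHP versions.

```php
<?php
// Assumption: ext/intl is available and defines INTL_IDNA_VARIANT_UTS46
// (added after the PHP 5.3 release covered in the talk).
$domain = 'русский.idn.icann.org';

// Unicode -> ASCII-compatible (Punycode) form.
// The slide gives xn--h1acbxfam.idn.icann.org for this name.
$ascii = idn_to_ascii($domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

// ASCII-compatible form -> Unicode again.
$back = idn_to_utf8($ascii, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

echo $ascii, "\n";
echo $back, "\n";
```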
  • 23. TODO
    • ResourceHandler
    • Transliteration
    • StringSearch
    • Tighter integration with other modules in 6.0
  • 24. Thanks!
    • http://php.net/intl for further information.