Handling multibyte CSV files in PHP


Handling multibyte CSV files in PHP using fgetcsv() and setlocale()

Published in: Technology
  • Note that PHP 5.6 deprecates several mbstring and iconv configuration directives related to character encodings (including mbstring.internal_encoding) in favour of default_charset: http://php.net/manual/en/migration56.deprecated.php



  1. Handling multibyte CSV files in PHP using fgetcsv() and setlocale(), by Daniel Rhodes of Warp Asylum (www.warpasylum.co.uk)
  2. fgetcsv()
       • fgetcsv() “Gets line from file pointer and parse for CSV fields”
       • There is no mb_fgetcsv()!
       • But “Locale setting is taken into account” by fgetcsv()
       • Let's see what we can do...
  3. A case study
     Let's say that we've got the following all on UTF-8:
       • mbstring.internal_encoding
       • Our database
       • Our PHP files
       • Our HTML output
     But we need to process the following CSV file, which is in EUC-JP encoding...
  4. The CSV file
  5. The CSV file source
  6. The first attempt
       • Well, let's try to process it as we would a “normal” non-multibyte CSV file...
  7. The first attempt
  8. Let's try again
       • Doing nothing clearly didn't work!
       • Let's try setting mbstring.internal_encoding to that of our CSV file, EUC-JP...
  9. Let's try again
  10. Using setlocale()
       • mbstring.internal_encoding clearly didn't work!
       • Let's try using the setlocale() function...
  11. Using setlocale()
  12. Success!
       • setlocale() is the key!
       • Just out of curiosity, let's remove the Japanese directive settings for mbstring and see if it still works...
  13. Out of curiosity...
  14. Solution
       • setlocale() is the key
       • Set the locale to match the encoding of the CSV file
       • Note that setlocale() doesn't permanently affect the system locale setting
       • mbstring settings are not important
       • mbstring itself is not actually needed!
  15. That's all folks!
     I'll leave you with some things to think about:
       • The Locale class (from PHP 5.3) might also be of use
       • The locale passed to setlocale() must be supported by the OS of the PHP server
       • Quoting string fields in the source CSV file may help enormously, so much so that setlocale() is not needed!
       • Questions welcome at daniel.rhodes@warpasylum.co.uk
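
The code on the slides did not survive in this transcript, so what follows is only a minimal sketch of the setlocale() approach described above, not the author's original code. It assumes PHP 7.4 or later. The function name readMultibyteCsv() and its parameters are hypothetical. The conversion of each field to UTF-8 with mb_convert_encoding() is an addition for the all-UTF-8 case study; the slides point out that mbstring is not needed for the parsing itself.

```php
<?php
// Hypothetical helper (not from the slides): parse a CSV file whose bytes are
// in $csvEncoding and return its rows with every field converted to UTF-8.
function readMultibyteCsv(string $path, array $localeCandidates, string $csvEncoding): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Passing '0' only queries the current LC_CTYPE so it can be restored later.
    $previousLocale = setlocale(LC_CTYPE, '0');

    try {
        // setlocale() tries each candidate in order and returns false when the
        // OS supports none of them.
        if (setlocale(LC_CTYPE, $localeCandidates) === false) {
            throw new RuntimeException('None of the candidate locales is installed');
        }

        $rows = [];
        while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($fields === [null]) {
                continue; // fgetcsv() reports a blank line as [null]
            }
            // Assumption: the rest of the application expects UTF-8 strings.
            $rows[] = array_map(
                static fn ($field) => mb_convert_encoding((string) $field, 'UTF-8', $csvEncoding),
                $fields
            );
        }
        return $rows;
    } finally {
        // Always release the file and put the previous locale back.
        fclose($handle);
        setlocale(LC_CTYPE, $previousLocale);
    }
}
```

A call might look like readMultibyteCsv('data.csv', ['ja_JP.eucJP', 'ja_JP.EUC-JP'], 'EUC-JP'). The file name and both locale names are assumptions, since locale names vary between operating systems. Restoring the previous locale in the finally block keeps the change limited to the parsing step.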
