ぐだ生 Introduction to Java, Part 3 (Character Encodings) (Keynote version)
Published in: Technology

  • Transcript

    • 1. Java (Unicode), 2011-04-16, twitter: @zaki50
    • 2. Who am I • YAMAZAKI Makoto (twitter: @zaki50) • Android • StickyShortcut • Java
    • 3. • Character set (CharacterSet) and encoding (Encoding) • • UTF-16 • UTF-8
    • 4. • Unicode 6.0 11•
    • 5. • US ASCII • Shift JIS • JIS X 0208 • UCS-2 • UCS-4
    • 6. • Unicode • UTF-8 • UTF-16 • UTF-32 • ... ( : US ASCII, Shift JIS)
    • 7. Unicode
    • 8. Unicode • • Xerox, Microsoft, Apple, Sun Microsystems, HP, JUST System: The Unicode Consortium • ISO/IEC 10646
    • 9. Unicode, ISO/IEC 10646 • • Unicode
    • 10. Unicode• The Unicode Consortium ( http://www.unicode.org/ )••
    • 11. • Unicode • •
    • 12. ( , Ligature) • • ふ (U+3075) + ゚ (U+309A) = ぷ • ( ぷ (U+3077) )
    • 13. Unicode• Unicode Unicode
    • 14. • NFC (Normalization Form C) • NFD (Normalization Form D) • NFKC (Normalization Form KC) • NFKD (Normalization Form KD) • C (Composition) / D (Decomposition), K (Compatibility)
    • 15. • NFC• MacOS X (HFS+) NFD• NFKC, NFKD
    • 16. Unicode• C(Composition)• D(Decomposition)
    • 17. Unicode• K(Compatibility) 1 : (U+3000) (U+0020) ( ) (5 )
    • 18. Unicode K KCD
    • 19. Eclipse
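
The four normalization forms on slides 14 to 18 can be tried from Java with the standard `java.text.Normalizer` class. The following is a minimal sketch, not taken from the slides: the class name `NormalizationSketch` and its helper methods are hypothetical, and the sample strings reuse the code points the slides mention (U+3075 + U+309A, U+3077, U+3000, U+0020).

```java
import java.text.Normalizer;

// Hypothetical demo class; only java.text.Normalizer is a real JDK API here.
class NormalizationSketch {

    /** Renders each UTF-16 code unit of s as U+XXXX, separated by spaces. */
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Thin wrapper so the forms can be compared side by side. */
    static String normalize(String s, Normalizer.Form form) {
        return Normalizer.normalize(s, form);
    }

    public static void main(String[] args) {
        String decomposed = "\u3075\u309A"; // base letter + combining semi-voiced sound mark
        String precomposed = "\u3077";      // the same letter as a single code point
        String wideSpace = "\u3000";        // ideographic space

        // C composes, D decomposes (canonical equivalence).
        System.out.println(codeUnits(normalize(decomposed, Normalizer.Form.NFC)));  // U+3077
        System.out.println(codeUnits(normalize(precomposed, Normalizer.Form.NFD))); // U+3075 U+309A

        // K additionally folds compatibility characters.
        System.out.println(codeUnits(normalize(wideSpace, Normalizer.Form.NFC)));   // U+3000
        System.out.println(codeUnits(normalize(wideSpace, Normalizer.Form.NFKC)));  // U+0020

        // Strings that render identically do not compare equal until normalized.
        System.out.println(decomposed.equals(precomposed)); // false
    }
}
```

Normalizing both sides to the same form before calling `equals` is one way to compare such strings; which form to pick depends on whether compatibility folding is acceptable for the data.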
    • 20. UTF-16
    • 21. UTF-16• • Java String • Windows
    • 22. CodePoint             code point bits                           UTF-16
          U+0000-U+FFFF         xxxx xxxx xxxx xxxx                       xxxx xxxx xxxx xxxx
          U+010000-U+0010FFFF   0000 0000 000u uuuu xxxx xxxx xxxx xxxx   1101 10ww wwxx xxxx 1101 11xx xxxx xxxx
          x, u, w ∈ {0, 1}; wwww = uuuuu - 1
    • 23. UTF-16• 2• 2
    • 24. • 1 16 16bit * 2 • 1 16bit 16 20bit • U+D800-U+DFFF (11bit ) • 0xD800-0xDBFF (high surrogates), 0xDC00-0xDFFF (low surrogates)
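
The bit layout on slide 22 and the surrogate ranges on slide 24 can be checked with a few lines of Java. This is a sketch under stated assumptions: `SurrogateSketch`, `toSurrogates`, and `toCodePoint` are hypothetical names, and U+20B9F is just a sample supplementary code point chosen for illustration. The result is compared against the JDK's own `Character.toChars` and `Character.toCodePoint`.

```java
import java.util.Arrays;

// Hypothetical demo class that re-implements the surrogate arithmetic by hand.
class SurrogateSketch {

    /** Splits a supplementary code point (U+10000..U+10FFFF) into two UTF-16 code units. */
    static char[] toSurrogates(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int v = codePoint - 0x10000;               // 20 bits; this subtraction is the "wwww = uuuuu - 1" step
        char high = (char) (0xD800 | (v >>> 10));  // 1101 10ww wwxx xxxx
        char low = (char) (0xDC00 | (v & 0x3FF));  // 1101 11xx xxxx xxxx
        return new char[] {high, low};
    }

    /** Inverse operation: recombines a high and a low surrogate. */
    static int toCodePoint(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        int cp = 0x20B9F; // sample code point outside the BMP (assumption for illustration)
        char[] pair = toSurrogates(cp);

        System.out.printf("U+%X -> %04X %04X%n", cp, (int) pair[0], (int) pair[1]); // U+20B9F -> D842 DF9F
        System.out.println(Arrays.equals(pair, Character.toChars(cp)));             // true
        System.out.println(toCodePoint(pair[0], pair[1]) == cp);                    // true
    }
}
```

In real code the JDK methods are the better choice; the hand-written version is only there to make the bit pattern visible.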
    • 25. • UTF-16: one unit is 2 bytes, so there are 2 possible byte orders
          U+3000 = 0x3000 → 0x30, 0x00 (UTF-16BE)
          U+3000 = 0x3000 → 0x00, 0x30 (UTF-16LE)
    • 26. BOM (byte order mark) • U+FEFF • U+FFFE • U+FEFF ZERO WIDTH NO-BREAK SPACE
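
The byte orders on slide 25 and the BOM on slide 26 can be observed with Java's standard charsets. The sketch below is illustrative only: `ByteOrderSketch` and `hex` are hypothetical names, while `StandardCharsets.UTF_16BE`, `UTF_16LE`, and `UTF_16` are real JDK constants. When encoding, Java's plain `UTF-16` charset writes a big-endian BOM; when decoding, it uses a leading BOM to pick the byte order.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing byte order and BOM handling for U+3000.
class ByteOrderSketch {

    /** Formats bytes as two-digit uppercase hex values separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u3000";

        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // 30 00
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16LE))); // 00 30
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16)));   // FE FF 30 00

        // Bytes FF FE announce little-endian order; the decoder consumes the BOM.
        byte[] littleEndianWithBom = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x30};
        String decoded = new String(littleEndianWithBom, StandardCharsets.UTF_16);
        System.out.println(decoded.equals(s)); // true
    }
}
```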
    • 27. UTF-8
    • 28. CodePoint               byte sequence (bits)                   payload bits
          U+00-U+7F               0xxx xxxx                              7 bits
          U+0080-U+07FF           110y yyyx 10xx xxxx                    11 bits
          U+0800-U+FFFF           1110 yyyy 10yx xxxx (10xx xxxx) * 1    16 bits
          U+010000-U+1FFFFF       1111 0yyy 10yy xxxx (10xx xxxx) * 2    21 bits
          U+00200000-U+03FFFFFF   1111 10yy 10yy yxxx (10xx xxxx) * 3    26 bits
          U+04000000-U+7FFFFFFF   1111 110y 10yy yyxx (10xx xxxx) * 4    31 bits
          x, y ∈ {0, 1}; at least one y bit is 1
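
The first four rows of the table on slide 28 translate directly into shifts and masks. The following hand-written encoder is a sketch, not the JDK's implementation: `Utf8Sketch` and `encode` are hypothetical names, and the 5- and 6-byte rows are left out on purpose because Java code points stop at U+10FFFF. The output is compared against `String.getBytes(StandardCharsets.UTF_8)`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: a hand-rolled UTF-8 encoder for one code point (1 to 4 bytes).
class Utf8Sketch {

    static byte[] encode(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("not an encodable code point");
        }
        if (cp <= 0x7F) {          // 0xxx xxxx
            return new byte[] {(byte) cp};
        }
        if (cp <= 0x7FF) {         // 110y yyyx 10xx xxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F))};
        }
        if (cp <= 0xFFFF) {        // 1110 yyyy 10yx xxxx 10xx xxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F))};
        }
        return new byte[] {        // 1111 0yyy 10yy xxxx 10xx xxxx 10xx xxxx
            (byte) (0xF0 | (cp >> 18)),
            (byte) (0x80 | ((cp >> 12) & 0x3F)),
            (byte) (0x80 | ((cp >> 6) & 0x3F)),
            (byte) (0x80 | (cp & 0x3F))};
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x3042, 0x20B9F}; // one sample per byte length (illustrative)
        for (int cp : samples) {
            byte[] expected = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            System.out.printf("U+%04X: %d byte(s), matches JDK: %b%n",
                    cp, encode(cp).length, Arrays.equals(encode(cp), expected));
        }
    }
}
```

Always choosing the shortest applicable row is what the "at least one y bit is 1" condition expresses: a longer form whose y bits are all zero would be an overlong encoding.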
    • 29. UTF-8• US ASCII US ASCII• • 0x80-0xbf 2 ,• 2
    • 30. • •
    • 31. • UTF-8 1• BOM
    • 32. BOM(byte order mark)• 0xEF, 0xBB, 0xBF• byte order mark• UTF-8
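
The UTF-8 BOM from slide 32 matters in Java because the standard UTF-8 decoder does not drop it: the three bytes 0xEF, 0xBB, 0xBF come back as a leading U+FEFF character. The sketch below shows this and one simple way to strip it; `Utf8BomSketch`, `decodeUtf8`, and `stripBom` are hypothetical helper names, not JDK APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for handling a UTF-8 BOM by hand.
class Utf8BomSketch {

    /** Decodes bytes as UTF-8 exactly as the JDK does (a leading BOM is kept). */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Removes a single leading U+FEFF if present. */
    static String stripBom(String s) {
        return s.startsWith("\uFEFF") ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61}; // BOM followed by "a"

        String decoded = decodeUtf8(withBom);
        System.out.println(decoded.length());                // 2
        System.out.println(decoded.charAt(0) == '\uFEFF');   // true
        System.out.println(stripBom(decoded).equals("a"));   // true
    }
}
```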
    • 33. Encoding   bytes per character   representable code points
          UTF-8      1-6 (1-4)             2^31 (2^21)
          UTF-16     2 or 4                2^16 - 2*2^10 + 2^20
          UTF-32     4                     2^16 + 2^20
    • 34. • (CharacterSet) (Encoding)• 1• 1 (UTF-16 ) 2
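
The point on slide 34 that one character may occupy two UTF-16 units is visible in Java through the difference between `String.length()` and `String.codePointCount()`. This is a minimal sketch: `LengthSketch` and `sample` are hypothetical names, and the sample string (one ASCII letter, one BMP character, one supplementary character) is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing code units, code points, and encoded bytes.
class LengthSketch {

    /** "a" + U+3042 + U+20B9F: 1, 1, and 2 UTF-16 code units respectively. */
    static String sample() {
        return "a\u3042" + new String(Character.toChars(0x20B9F));
    }

    public static void main(String[] args) {
        String s = sample();

        System.out.println(s.length());                                    // 4 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length()));               // 3 (code points)
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);     // 8 (1 + 3 + 4 bytes)
        System.out.println(s.getBytes(StandardCharsets.UTF_16BE).length);  // 8 (4 units * 2 bytes)

        // Iterate by code point rather than by char to avoid splitting surrogate pairs.
        s.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```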
    • 35. • The Unicode Consortium (http://www.unicode.org/) • Unicode 6.0.0 (http://www.unicode.org/versions/Unicode6.0.0/) • Wikipedia • Unicode (http://ja.wikipedia.org/wiki/Unicode) • UTF-8 (http://ja.wikipedia.org/wiki/UTF-8) • UTF-16 (http://ja.wikipedia.org/wiki/UTF-16) • Unicode (http://homepage1.nifty.com/nomenclator/unicode/normalization.htm)
